Validity Definitions & Descriptions


Author David Towler

Information describing the various types of validity used in testing.

For a test to be useful and, in most cases, legal, it must be "valid". In other words, can it be shown that the test measures what it claims to measure, and that it does so consistently, accurately and reliably over time? Good tests are validated in a variety of ways. Descriptions and explanations of several types of validity appear below:

Criterion-Related Validation of an Aptitude Test

Criterion-related validation studies typically include the following steps (Crocker & Algina, 1986):

1. Identification of an appropriate criterion and a method for measuring it. For a mechanical aptitude test, the criterion might be subsequent job performance data.

2. Identification of an appropriate sample of examinees representative of those for whom the test will eventually be used.

3. Administration of the test and recording of each examinee’s score.

4. Measurement of performance on the criterion for each examinee.

5. Determination of the strength of the relationship between test scores and criterion performance.
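Step 5 is commonly carried out by computing a correlation coefficient between the paired test and criterion scores. The following is a minimal sketch, assuming a Pearson correlation is the chosen measure of relationship (the source does not name a specific statistic). The function name `pearson_r` and the sample scores are hypothetical and serve only as illustration.

```python
from math import sqrt


def pearson_r(test_scores, criterion_scores):
    """Pearson correlation between paired test and criterion scores."""
    if len(test_scores) != len(criterion_scores):
        raise ValueError("each examinee needs one test score and one criterion score")
    n = len(test_scores)
    if n < 2:
        raise ValueError("at least two examinees are required")

    mean_x = sum(test_scores) / n
    mean_y = sum(criterion_scores) / n

    # Deviations of each score from its own mean
    dx = [x - mean_x for x in test_scores]
    dy = [y - mean_y for y in criterion_scores]

    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("scores must vary for a correlation to be defined")

    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical data: six examinees, aptitude test score and
# a later job performance rating for each (made up for illustration).
aptitude = [52, 61, 70, 48, 66, 75]
performance = [3.1, 3.4, 4.2, 2.8, 3.9, 4.5]

validity_coefficient = pearson_r(aptitude, performance)
print(f"validity coefficient: {validity_coefficient:.2f}")
```

A coefficient near +1 indicates that higher test scores go with higher criterion performance, a coefficient near 0 indicates little linear relationship, and a negative coefficient indicates an inverse relationship. With a sample this small the value is only illustrative; a real study would use the representative sample described in step 2.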

There are two types of criterion-related validation: predictive and concurrent.

Predictive Validation Study

Predictive validity refers to the extent to which test scores predict criterion measurements that will be made at some point in the future. In the case of validating a paper-and-pencil mechanical aptitude test, applicants take the test but their test scores are not used to make hiring decisions. Over time and after a sufficient number of job applicants have been tested, job performance data is gathered and correlated with mechanical aptitude test scores. A positive correlation would be evidence of the predictive validity of the mechanical aptitude test.

Concurrent Validation Study

Concurrent validity refers to the relationship between test scores and criterion measures made at the time the test was given. In this case, a sufficient number of current job incumbents take the paper-and-pencil aptitude test. At approximately the same time, job performance data is collected on the incumbents and correlated with their mechanical aptitude test scores. A positive correlation would be evidence of the concurrent validity of the mechanical aptitude test.

References

Crocker, L., & Algina, J. (1986). Introduction to classical & modern test theory. Florida: Harcourt Brace Jovanovich College Publishers.

Face Validity

Face validity is very closely related to content validity. Content validity depends on a theoretical basis for judging whether a test assesses all domains of a certain criterion (e.g. does assessing addition skills yield a good measure of mathematical skills? To answer this, you have to know which different kinds of arithmetic skills mathematical skills include). Face validity, by contrast, relates to whether a test appears to be a good measure or not. This judgment is made on the "face" of the test, so it can also be made by an amateur.

Content Validity

This is a non-statistical type of validity that involves “the systematic examination of the test content to determine whether it covers a representative sample of the behaviour domain to be measured” (Anastasi & Urbina, 1997, p. 114). Content validity is built into a test through careful selection of which items to include (Anastasi & Urbina, 1997). Items are chosen so that they comply with the test specification, which is drawn up through a thorough examination of the subject domain. Foxcroft et al. (2004, p. 49) note that the content validity of a test can be improved by using a panel of experts to review the test specifications and the selection of items. The experts can review the items and comment on whether they cover a representative sample of the behaviour domain.

References

Validity (statistics) - From Wikipedia, the free encyclopedia

Re-printable with permission.