2. Validity and Reliability
by Demetrice Gipson
1. Validity If a test is to be used in any kind of decision making, or indeed if the test information is to have any use at all, there must be some way to identify the types of evidence that indicate the test is valid for the purpose it is being used for.
2. Content Validity Evidence It is established by inspecting test questions to see whether they correspond to what the user decides the test should cover. A test with good content validity evidence matches or fits the instructional objectives.
3. Concurrent criterion-related validity evidence Concurrent criterion-related validity evidence deals with measures that can be administered at the same time as the measure to be validated. It is determined by administering both the new test and the established test to a group of respondents, then finding the correlation between the two sets of test scores.
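In practice, the concurrent validity coefficient is simply the Pearson correlation between the two sets of scores. Here is a minimal sketch in Python; the score lists are hypothetical, and statistics.correlation requires Python 3.10 or later.

```python
# Minimal sketch: concurrent criterion-related validity estimated by
# correlating scores on a new test with scores on an established test
# taken by the same respondents. All scores are hypothetical.
from statistics import correlation  # Pearson's r; Python 3.10+

new_test    = [72, 85, 90, 64, 78, 88, 70, 95]  # scores on the new test
established = [70, 88, 85, 60, 75, 90, 68, 93]  # scores on the established test

r = correlation(new_test, established)
print(f"Concurrent validity coefficient: r = {r:.2f}")
```

A coefficient near 1.0 indicates that the new test ranks respondents much as the established test does; a coefficient near zero provides no concurrent evidence.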
4. Criterion-Related Validity Evidence Scores from a test are correlated with an external criterion. There are two types of criterion-related validity evidence: concurrent and predictive.
5. Predictive validity evidence Predictive validity evidence refers to how well the test predicts some future behavior of the examinees. It is particularly useful and important for aptitude tests, which attempt to predict how well test takers will do in some future setting.
6. Construct Validity Evidence A test has construct validity evidence if its relationship to other information corresponds well with some theory, that is, a logical explanation or rationale that can account for the interrelationships among a set of variables. Any information that lets you know whether results from the test correspond to what you would expect tells you something about the construct validity evidence for the test.
7. Reliability The reliability of a test refers to the consistency with which it yields the same rank for individuals who take the test more than once.
8. Alternate Forms or Equivalence If two equivalent forms of a test are available, they can be used to obtain an estimate of the reliability of the scores from the test. Both forms are administered to the same group of students, and the correlation between the two sets of scores is determined.
9. Test-Retest or Stability This is a method of estimating reliability that is exactly what its name implies. The test is given twice to the same group, and the correlation between the first set of scores and the second set of scores is determined.
10. Internal Consistency If a test measures a single basic concept, its items ought to be correlated with each other, and the test ought to be internally consistent. It is reasonable to assume that people who get one item right will be more likely to get other, similar items right.
11. Split-half methods Each item is assigned to one half of the test or the other; the total score for each student on each half is then determined, and the correlation between the two sets of half scores is computed. Because this correlation describes a test only half as long as the original, it is usually adjusted upward with the Spearman-Brown formula, as in the sketch below.
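A minimal sketch of the split-half computation in Python, assuming a hypothetical matrix of right/wrong item responses (rows are students, columns are items) and an odd/even assignment of items to the two halves:

```python
# Minimal sketch of the split-half method: odd-numbered items form one
# half and even-numbered items the other; each student's two half scores
# are correlated, and the Spearman-Brown formula steps the half-test
# correlation up to an estimate for the full-length test.
from statistics import correlation  # Pearson's r; Python 3.10+

# Hypothetical responses: rows = students, columns = items (1 = right, 0 = wrong).
responses = [
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 1],
    [1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 1],
]

odd_half  = [sum(row[0::2]) for row in responses]  # items 1, 3, 5
even_half = [sum(row[1::2]) for row in responses]  # items 2, 4, 6

r_half = correlation(odd_half, even_half)

# The correlation describes a test only half as long as the original,
# so it is adjusted upward with the Spearman-Brown formula.
r_full = (2 * r_half) / (1 + r_half)
print(f"half-test r = {r_half:.2f}, Spearman-Brown estimate = {r_full:.2f}")
```

The odd/even split is only one convention; any assignment that makes the two halves comparable in content and difficulty will do.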
12. Kuder-Richardson methods These methods measure the extent to which items within one form of the test have as much in common with one another as do the items in that one form with corresponding items in an equivalent form.
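The best known of these is Kuder-Richardson formula 20 (KR-20), which estimates internal consistency from a single administration of a test whose items are scored right (1) or wrong (0). A minimal sketch in Python, assuming a hypothetical response matrix and the population-variance convention for the total-score variance:

```python
# Minimal sketch of KR-20:
#   KR-20 = (k / (k - 1)) * (1 - sum(p * q) / var_total)
# where k is the number of items, p is the proportion of students
# answering an item correctly, q = 1 - p, and var_total is the
# variance of the students' total scores.
from statistics import pvariance

# Hypothetical responses: rows = students, columns = items (1 = right, 0 = wrong).
responses = [
    [1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0],
    [1, 1, 0, 0, 1],
]

n = len(responses)                        # number of students
k = len(responses[0])                     # number of items
totals = [sum(row) for row in responses]  # each student's total score
var_total = pvariance(totals)             # population variance of totals

# Sum of item variances p * q across all items.
pq_sum = 0.0
for j in range(k):
    p = sum(row[j] for row in responses) / n
    pq_sum += p * (1 - p)

kr20 = (k / (k - 1)) * (1 - pq_sum / var_total)
print(f"KR-20 reliability estimate: {kr20:.2f}")
```

Unlike the split-half approach, KR-20 does not depend on how the items happen to be divided, since it is based on the variance of every item.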