# Validity and Reliability

## 1. Reliability: the consistency with which a test yields the same rank for individuals who take the test more than once (p. 338).

### 1.1. Test-Retest: a method of estimating reliability. A test is given twice and the correlation between the first set of scores and the second set of scores is determined.

1.1.1. **The main problem with Test-Retest is that some form of memory or experience is involved when the test is taken the second time.**

1.1.2. The interval between testing must be considered: the longer the interval between tests, the lower the correlation between the two sets of scores tends to be, and so the lower the reliability estimate.
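The test-retest estimate described above is a correlation between two sets of scores from the same students. A minimal sketch, assuming a Pearson correlation and made-up scores for five students (the function name `pearson_r` and all data are hypothetical, not from the text):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores: the same five students, tested twice
first_administration = [78, 85, 62, 90, 71]
second_administration = [80, 83, 65, 92, 70]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 3))
```

A value near 1 means students kept roughly the same rank on both administrations. The sketch does not handle the case where every student earns the same score, since the correlation is undefined there.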

### 1.2. Alternate Forms (Equivalence): a method of estimating the reliability of test scores by using two equivalent forms of the same test.

1.2.1. Both forms of a test are given to students and the correlation between the two sets of scores is determined.

### 1.3. Internal Consistency: a method of estimating reliability by checking whether the items in a test are correlated with one another. *If a student does well on one item, then he or she should do well on other, similar items.*

1.3.1. Split halves: the process of splitting a test into two equal halves and finding the correlation between scores on the two halves.

1.3.1.1. **Only works when items of different levels of difficulty are randomly spread throughout a test**

1.3.2. Odd-Even Reliability: test items are divided into two halves by placing all odd-numbered items into one half and all even-numbered items into the other half; scores on the two halves are then correlated.
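The odd-even procedure can be sketched as follows, assuming items scored 1 (correct) or 0 (incorrect) and a Pearson correlation between the half-test totals. The response matrix and function names are hypothetical. The final step applies the Spearman-Brown formula, which these notes do not mention; it is a standard adjustment because a half-length test tends to understate the reliability of the full test.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def odd_even_halves(item_scores):
    """Return (odd_total, even_total) for one student; items are numbered from 1."""
    odd_total = sum(item_scores[0::2])   # items 1, 3, 5, ...
    even_total = sum(item_scores[1::2])  # items 2, 4, 6, ...
    return odd_total, even_total

# Hypothetical data: one row per student, one column per item
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 1, 0, 1, 0],
]

halves = [odd_even_halves(row) for row in responses]
odd_totals = [odd for odd, _ in halves]
even_totals = [even for _, even in halves]

# Correlation between the two half-tests
half_r = pearson_r(odd_totals, even_totals)

# Spearman-Brown adjustment to estimate full-length reliability
full_r = 2 * half_r / (1 + half_r)

print(round(half_r, 3), round(full_r, 3))
```

Splitting by odd and even item numbers spreads early and late items across both halves, which matters when items get harder toward the end of the test.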

## 2. Validity: achieved when a test measures what it's supposed to measure (p. 326).

### 2.1. Content Validity: Established by inspecting test questions to see whether or not they match learning objectives.

2.1.1. **Easy to see in achievement tests, but difficult to see in personality or aptitude tests**

2.1.2. Gives information about whether a test looks valid, but does not indicate whether the reading level is too high or if items are poorly written.

### 2.2. Criterion-Related: a method of correlating scores from a test with an external criterion.

2.2.1. Concurrent: deals with measures that can be given at the same time as the original measure to be validated. **A test developer who designs a short screening test that measures IQ might show that the test is highly correlated with the Binet IV and establish concurrent criterion-related validity evidence for the test.**

2.2.2. Yields a numeric value (a correlation coefficient) called a validity coefficient.

2.2.3. Is determined by giving the new test and the established test to test-takers, then finding the correlation between the two sets of scores.
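The validity coefficient described above can be sketched as a correlation between scores on the new test and scores on the established test. This version computes the Pearson correlation as the mean product of z-scores; the function name `validity_coefficient` and all scores are hypothetical and do not come from any real instrument.

```python
from statistics import mean, pstdev

def validity_coefficient(new_scores, criterion_scores):
    """Pearson correlation computed as the mean product of z-scores."""
    if len(new_scores) != len(criterion_scores):
        raise ValueError("each test-taker needs a score on both tests")
    mean_new, sd_new = mean(new_scores), pstdev(new_scores)
    mean_crit, sd_crit = mean(criterion_scores), pstdev(criterion_scores)
    # Standardize each score, then average the paired products
    products = [
        ((n - mean_new) / sd_new) * ((c - mean_crit) / sd_crit)
        for n, c in zip(new_scores, criterion_scores)
    ]
    return mean(products)

# Hypothetical scores for six test-takers who took both tests
short_screening_test = [12, 15, 9, 18, 11, 14]
established_test = [98, 110, 90, 125, 95, 105]

r_validity = validity_coefficient(short_screening_test, established_test)
print(round(r_validity, 3))
```

A coefficient close to 1 would count as evidence that the short test ranks test-takers much as the established test does; how high is high enough depends on the intended use of the test.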

### 2.3. Construct Validity: used if a test is being created to measure something not previously measured, or not measured well, and no criterion exists to compare the test to.

2.3.1. **A test has construct validity if its relationship to other information corresponds well with some theory**

2.3.2. Any information that shows whether results from a test correspond to what is expected says something about the construct validity evidence for that test.