Validity and Reliability in Assessment


1. References

1.1. Borich, G. D., & Kubiszyn, T. (2013). Educational testing & measurement (10th ed.). John Wiley & Sons.

2. Types of Validity

2.1. Content Validity

2.1.1. Content validity is the simplest form of validity in assessment. It is established by making sure that the content of the test questions matches the objectives the test is supposed to measure. If the items align closely with those objectives, the test has high content validity.

2.2. Concurrent Criterion-Related Validity

2.2.1. This type of validity involves comparing a new assessment with an established one that already has a reputation for measuring the same thing well, such as the SAT. When using concurrent validity, it is important to administer both the new test and the established test at the same time so that outside factors cannot affect the comparison. A high correlation between the two sets of scores supports the validity of the new test.
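The comparison described above comes down to correlating two sets of paired scores. A minimal sketch in Python follows; the `pearson_r` helper and the score lists are hypothetical, illustrative values, not data from the text:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical scores for five students on a new test and on an
# established test (e.g., the SAT), taken at the same time.
new_test = [72, 85, 90, 65, 78]
established = [70, 88, 92, 60, 80]

# A validity coefficient near 1.0 means students who did well on
# the established test also did well on the new one.
validity_coefficient = pearson_r(new_test, established)
print(round(validity_coefficient, 3))
```

A coefficient this close to 1.0 would be strong concurrent evidence; in practice, coefficients well below that are common and must be judged against the stakes of the test.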

2.3. Construct Validity Evidence

2.3.1. This type of validity evidence is gathered by checking whether test results behave the way a theory about the underlying construct predicts they should. In other words, if you apply a rationale or logical prediction to the results of the assessment, and the results match what that rationale led you to expect, the assessment shows construct validity.

3. Ways to Estimate Reliability

3.1. Test-Retest Reliability

3.1.1. This means of estimating a test's reliability requires the test to be administered twice. The sets of scores from the first and second administrations are compared, and the correlation between them gives the estimate of reliability. The test-retest method can be effective, but it has a known weakness: if the time span between tests is too short, students may recall information or questions from the first administration, which helps them the second time. If the time span is too long, other factors, such as additional instruction or information, may improve their scores on the second administration.

3.2. Alternative Form Reliability

3.2.1. This method of estimating reliability requires two different forms of a test designed to measure exactly the same skills and knowledge. Both forms are administered to the same group of students at the same time, and the two sets of scores are correlated. The main drawback of this method is the time and effort required to create two quality tests. It is often used in standardized testing, where it helps test-makers determine the reliability and quality of new test questions.

3.3. Internal Consistency Reliability

3.3.1. This method of estimating reliability is most effective for tests that consistently measure only one concept. It is applied by splitting the test items into two groups and correlating students' scores on the two halves. When the items are split into the first half of the test and the second half, it is called "split-halves" reliability; when they are split into odd-numbered and even-numbered questions, it is called "odd-even" reliability. The choice of split depends on how the items are arranged: if the questions grow harder from front to back, an odd-even split keeps the two halves comparable in difficulty.
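The odd-even split described above can be sketched as follows. The item scores below are hypothetical, and the final step applies the standard Spearman-Brown correction (not named in the text) to project the half-test correlation up to full test length:

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two paired score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) *
                      sum((b - my) ** 2 for b in y))

# Hypothetical item scores (1 = correct, 0 = incorrect) for four
# students on an eight-item test measuring a single concept.
students = [
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 0],
]

# Odd-even split: total each student's odd-numbered items and
# even-numbered items, then correlate the two half-test scores.
odd_totals = [sum(items[0::2]) for items in students]
even_totals = [sum(items[1::2]) for items in students]
half_r = pearson_r(odd_totals, even_totals)

# Spearman-Brown correction: estimates the reliability of the
# full-length test from the correlation between its two halves.
full_r = 2 * half_r / (1 + half_r)
print(round(half_r, 3), round(full_r, 3))
```

The correction is needed because a half-length test is inherently less reliable than the full test, so the raw half-test correlation understates the full test's reliability.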