Reliability & Validity


1. Validity Evidence

1.1. Why Evaluate Tests?

1.1.1. "A test may be used for more than one purpose and with people who have different characteristics, and the test may be more or less valid, reliable, or accurate when used for different purposes and with different persons."

1.1.2. "We can define validity, reliability and accuracy as follows:"

1.1.2.1. a. Validity: Does the test measure what it is supposed to measure?

1.1.2.2. b. Reliability: Does the test yield the same or similar score ranking consistently?

1.1.2.3. c. Accuracy: Does the test score fairly closely approximate an individual's true level of ability, skill, or aptitude?

1.1.2.3.1. Reference: (Kubiszyn & Borich, 2010, p. 329)

1.2. Types of Validity Evidence

1.2.1. 1. Content Validity Evidence: "established by inspecting test questions to see whether they correspond to what the user decides should be covered by the test." (Kubiszyn & Borich, 2010, p. 330).

1.2.2. 2. Criterion-Related Validity Evidence: "scores from a test are correlated with an external criterion." There are two types of criterion-related validity evidence:

1.2.2.1. a. Concurrent: "deals with measures that can be administered at the same time as the measure to be validated."

1.2.2.2. b. Predictive: "determined by administering the test to a group of subjects, then measuring the subjects on whatever the test is supposed to predict after a period of time has elapsed."

1.2.3. 3. Construct Validity Evidence: "A test has construct validity evidence if its relationship to other information corresponds well with some theory." (Kubiszyn & Borich, 2010, p. 332)

1.2.3.1. REVIEW:

1.2.3.2. What have the authors been saying?

1.2.3.3. 1. Is the test valid for the intended purpose?

1.2.3.4. 2. Does the test measure what it is supposed to measure?

1.2.3.5. 3. Does the test do the job it was designed to do?

1.2.3.5.1. Reference: (Kubiszyn & Borich, 2010, p. 333)

2. Interpreting Validity Coefficients

2.1. "Validity coefficients enable us to estimate the extent to which a test measures what it is supposed to measure." (Kubiszyn & Borich, 2010, p. 334).

2.2. The Role of Validity Coefficients

2.2.1. a. Content Validity Evidence: "established by comparing test items with instructional objectives (e.g., with the aid of a test blueprint) to determine whether the items match or measure the objectives." (Kubiszyn & Borich, 2010, p. 334).

2.2.2. b. Concurrent and Predictive Validity Evidence: "require the correlation of a predictor or concurrent measure with a criterion measure." (Kubiszyn & Borich, 2010, p. 334).
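
As a minimal sketch of what "correlation of a predictor ... with a criterion measure" can look like in practice, the Python below computes a Pearson correlation between two score lists. The data, the variable names, and the choice of Pearson's r are assumptions made for illustration; they do not come from Kubiszyn & Borich.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: scores on the test being validated (predictor)
# and a criterion measure collected for the same six students.
test_scores = [52, 61, 70, 74, 83, 90]
criterion = [2.1, 2.8, 2.6, 3.2, 3.5, 3.9]

validity_coefficient = pearson_r(test_scores, criterion)
print(round(validity_coefficient, 2))
```

If the criterion is collected at about the same time as the test, the result would be read as concurrent validity evidence; if it is collected after a period of time has elapsed, as predictive validity evidence.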

3. Principles of Validity

3.1. 1. Principle 1: Concurrent validity coefficients are generally higher than predictive validity coefficients. This does not mean, however, that the test with the higher validity coefficient is more suitable for a given purpose.

3.2. 2. Principle 2: Group variability affects the size of the validity coefficient. Higher validity coefficients are derived from heterogeneous groups than from homogeneous groups.

3.3. 3. Principle 3: The relevance and reliability of the criterion should be considered in the interpretation of validity coefficients.

3.3.1. Reference: (Kubiszyn & Borich, 2010, pp. 335-337).
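
Principle 2 can be illustrated with a small numeric sketch. The scores below are made up for this example, and Pearson's r is assumed as the coefficient. The same test and criterion are correlated first across a wide-ranging (heterogeneous) group and then across only the students whose test scores fall in a narrow band (a more homogeneous group).

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical test scores and criterion scores for ten students.
test = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
criterion = [2, 1, 4, 3, 6, 5, 8, 7, 10, 9]

# Heterogeneous group: all ten students.
r_heterogeneous = pearson_r(test, criterion)

# Homogeneous group: only students with test scores from 5 to 8.
pairs = [(t, c) for t, c in zip(test, criterion) if 5 <= t <= 8]
r_homogeneous = pearson_r([t for t, _ in pairs], [c for _, c in pairs])

print(round(r_heterogeneous, 2), round(r_homogeneous, 2))
```

In this made-up data the coefficient drops when the group is restricted, even though the relationship between test and criterion is unchanged.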

4. Summary of Validity

5. Methods of Estimating Reliability

5.1. There are several ways to estimate the reliability of scores from a test. Three basic methods are most often used: test-retest, alternate forms, and internal consistency (of which the split-half and Kuder-Richardson methods are variants).

5.2. a. Test-retest or Stability: "a method of estimating reliability that is exactly what its name implies. The test is given twice and the correlation between the first set of scores and the second set of scores is determined." (Kubiszyn & Borich, 2010, p. 341).

5.2.1. Example: Early in the school year, my daughter, who is in the fourth grade, took a practice standardized test and did well on it. Recently, her teacher gave her the same practice test again, and her scores were similar to those from the first administration. As I understand it, her teacher did not review the tested material in the time between the two administrations.

5.2.1.1. Problem with Test-retest.

5.2.1.2. a. "The main problem with test-retest reliability data is that there is usually some memory or experience involved the second time the test is taken." (Kubiszyn & Borich, 2010, p. 342).

5.3. b. Alternate Forms or Equivalence: "If there are two equivalent forms of a test, these forms can be used to obtain an estimate of the reliability of the scores from the test. Both forms are administered to a group of students, and the correlation between the two sets of scores is determined." (Kubiszyn & Borich, 2010, p. 343).

5.4. c. Internal Consistency: "items ought to be correlated with each other, and the test ought to be internally consistent. One approach to determining a test's internal consistency, called split halves, involves splitting the test into two equivalent halves and determining the correlation between them." (Kubiszyn & Borich, 2010, p. 343).

5.5. d. Split-half Method: "each item is assigned to one half or the other. Then, the total score for each student on each half is determined and the correlation between the two total scores for both halves is computed." (Kubiszyn & Borich, 2010, pp. 343-344).

5.6. e. Kuder-Richardson Method: "The strength of this estimate of reliability depends on the extent to which the entire test represents a single, fairly consistent measure of a concept." (Kubiszyn & Borich, 2010, p. 344).
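
The two internal consistency estimates above can be sketched in Python on a small, made-up matrix of right/wrong item responses. Several choices here are assumptions for illustration and are not taken from the source: the odd/even assignment of items to halves, the use of Pearson's r, and the use of the KR-20 formula as the Kuder-Richardson estimate.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def kr20(rows):
    """KR-20 for 0/1 item scores: (k/(k-1)) * (1 - sum(p*q) / variance of totals)."""
    n = len(rows)
    k = len(rows[0])
    totals = [sum(row) for row in rows]
    mean_total = sum(totals) / n
    # Population variance of the students' total scores.
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    sum_pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in rows) / n  # proportion answering item i correctly
        sum_pq += p * (1 - p)
    return (k / (k - 1)) * (1 - sum_pq / var_total)

# Hypothetical responses: one row per student, one column per item (1 = correct).
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

# Split-half: items 1, 3, 5 form one half and items 2, 4, 6 form the other.
odd_totals = [sum(row[0::2]) for row in responses]
even_totals = [sum(row[1::2]) for row in responses]
split_half_r = pearson_r(odd_totals, even_totals)

kr20_estimate = kr20(responses)
print(round(split_half_r, 2), round(kr20_estimate, 2))
```

Both estimates come from a single administration of the test, which is why they avoid the memory and practice effects that affect test-retest data.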

6. Problems with Internal Consistency Estimates

6.1. Internal consistency techniques are useful and popular measures of reliability because they involve only one test administration and are free from memory and practice effects. However, there are some problems with these methods.

6.2. a. "First, one should only be used if the entire test consists of similar items measuring a single concept." (Kubiszyn & Borich, 2010, p. 345).

6.3. b. "A second problem is that measures of internal consistency yield inflated estimates of reliability when used with speeded tests. A speeded test consists entirely of easy or relatively easy item tasks with a strict time limit." (Kubiszyn & Borich, 2010, p. 346).

7. Principles for Interpreting Reliability Coefficients

7.1. Principle 1: Group variability affects the size of the reliability coefficient. Higher coefficients result from heterogeneous groups than from homogeneous groups.

7.2. Principle 2: Scoring reliability limits test score reliability. If tests are scored unreliably, error is introduced that will limit the reliability of the test scores.

7.3. Principle 3: All other factors being equal, the more items included in a test, the higher the reliability of the scores.

7.4. Principle 4: Reliability of test scores tends to decrease as tests become too easy or too difficult.

7.4.1. Reference: (Kubiszyn & Borich, 2010, pp. 346-348).
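
Principle 3 is often quantified with the Spearman-Brown formula. The outline above does not name this formula, so treat what follows as a standard illustration rather than the authors' own presentation. It assumes the added items are comparable in quality to the existing ones. Here r is the reliability of the current test and n is the factor by which the test is lengthened.

```latex
r_{\text{new}} = \frac{n\,r}{1 + (n - 1)\,r}
\qquad\text{e.g. } r = 0.60,\ n = 2:\quad
r_{\text{new}} = \frac{2(0.60)}{1 + (2 - 1)(0.60)} = \frac{1.20}{1.60} = 0.75
```

Under these assumptions, doubling the number of items raises the estimated reliability from 0.60 to 0.75.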

8. Summary of Reliability

9. Latanya Coleman

10. REFERENCE: Kubiszyn, T., & Borich, G. (2010). Educational testing & measurement: Classroom application and practice (9th ed.). Hoboken, NJ: John Wiley & Sons, Inc.