Validity & Reliability

1. Important to Learning & Assessment

1.1. Validity

1.1.1. "Does the test measure what it is supposed to measure?" (Kubiszyn & Borich, 2010, p. 329)

1.2. Reliability

1.2.1. "Does the test yield the same or similar score rankings (all other factors being equal) consistently?" (2010, p. 329)

1.3. Accuracy

1.3.1. "Does the test score fairly closely approximate an individual's true level of ability, skill, or aptitude?" (2010, p. 329)

2. If it can be demonstrated that a test measures what it is supposed to measure, then it has validity evidence.

2.1. Content Validity Evidence

2.1.1. The simplest kind of validity evidence to gather

2.1.2. In the classroom testing context, it answers the question "Does the test measure the instructional objectives?" (Kubiszyn & Borich, 2010, p. 330)

2.1.3. Does not yield numerical evidence; it yields a logical judgment instead.

2.1.4. The three types of evidence that follow, in contrast, assume that some criterion exists external to the test that can be used to anchor or validate the test (2010, p. 332).

2.2. Criterion-Related Validity Evidence

2.2.1. Concurrent Validity Evidence

2.2.1.1. Deals with measures that can be administered at the same time as the measure to be validated (2010, p. 330).

2.2.2. Predictive Validity Evidence

2.2.2.1. Refers to how well the test predicts some future behavior of the examinees (2010, p. 331).

2.2.2.1.1. Yields numerical indices of validity (correlation coefficients); see the sketch below.
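Criterion-related validity evidence is ordinarily reported as a correlation coefficient between test scores and the criterion measure: a criterion collected at the same time for concurrent evidence, or collected later for predictive evidence. A minimal sketch in Python, using hypothetical data and variable names (illustrative, not from the textbook):

import numpy as np

# Hypothetical data: an aptitude test given in the fall (the predictor)
# and end-of-year course grades (the criterion) for the same ten students.
test_scores = np.array([72, 85, 90, 60, 78, 95, 66, 88, 74, 81])
course_grades = np.array([70, 82, 94, 58, 75, 91, 70, 85, 72, 79])

# The validity coefficient is the Pearson correlation between the test
# and the criterion; values near 1.0 indicate strong criterion-related
# validity evidence.
r = np.corrcoef(test_scores, course_grades)[0, 1]
print(f"criterion-related validity coefficient r = {r:.2f}")

The same computation serves both concurrent and predictive validity evidence; only the timing of the criterion measure differs.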

2.3. Construct Validity Evidence

2.3.1. A test has construct validity evidence if its relationship to other data corresponds soundly with some theory, that is, a logical explanation that accounts for the interrelationships among a set of variables (2010, p. 331).

2.3.2. It differs from concurrent validity evidence because no recognized second measure of the attribute in question is available, and from predictive validity evidence because no measure of future behavior is available.

3. Week 5 Discussion 2

4. by: Amy Dodson

4.1. 05-09-2012

5. VALIDITY

6. Reference: Kubiszyn, T., & Borich, G. (2010). Educational testing & measurement: Classroom application and practice (9th ed.). Hoboken, NJ: John Wiley & Sons. ISBN 978-0-470-52281-3

7. A test is considered reliable if it consistently yields the same or similar score rankings for individuals who take it more than once, assuming that what is being measured has not changed.

8. Test–Retest or Stability

8.1. A method of estimating reliability: a test is administered twice, and “the correlation between the first set of scores and the second set of scores is determined” (Kubiszyn & Borich, 2010, p. 341).

8.2. The major problem with this method is memory: individuals may remember items from the first administration when they take the test a second time, which inflates the correlation.

8.3. The longer the time interval between administrations, the more the correlation reflects not only the reliability of the test but also unknown changes in the students on the attribute being measured (2010, p. 342). A minimal computation sketch follows.
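A sketch of the test–retest estimate in Python, assuming hypothetical scores for ten students tested twice (the data are illustrative):

import numpy as np
from scipy import stats

# Hypothetical scores for the same ten students on two administrations
# of the same test, two weeks apart.
first = np.array([55, 62, 70, 48, 80, 66, 73, 59, 85, 51])
second = np.array([58, 60, 72, 50, 78, 68, 70, 61, 84, 49])

# The stability estimate is the correlation between the two score sets.
r, _ = stats.pearsonr(first, second)

# A rank correlation checks directly whether the score rankings
# stayed consistent across administrations.
rho, _ = stats.spearmanr(first, second)

print(f"test-retest r = {r:.2f}, rank consistency rho = {rho:.2f}")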

9. Alternate Forms or Equivalence

9.1. Achieved by giving two alternate, or equivalent, forms of a test to the same group and then correlating the two sets of scores.

9.2. The time interval between the two tests should be as brief as possible (see the sketch below).
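Computationally this is the same correlation as test–retest, only between Form A and Form B rather than between two administrations of one form. A minimal sketch with hypothetical scores:

import numpy as np

# Hypothetical scores for the same eight students on two equivalent
# forms of a test, administered back to back.
form_a = np.array([40, 35, 48, 29, 44, 38, 50, 33])
form_b = np.array([42, 33, 47, 31, 45, 36, 48, 35])

# The equivalence estimate is the correlation between the two forms.
r_forms = np.corrcoef(form_a, form_b)[0, 1]
print(f"alternate-forms r = {r_forms:.2f}")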

10. Internal Consistency

10.1. One can estimate the reliability of a test's scores from a single administration by using the internal consistency method.

10.2. If a test is internally consistent, its items should correlate with one another.

10.3. Split-halves (or odd-even) reliability method

10.3.1. Divide a test into halves and correlate the halves with one another. Because these correlations are based on half tests, the obtained correlations underestimate the reliability of the whole test (2010, p. 349).

10.3.2. The Spearman–Brown prophecy formula is used to correct these estimates to what they would be if they were based on the whole test (2010, p. 349); see the sketch below.
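A minimal sketch of the split-half procedure with the Spearman–Brown correction, r_whole = 2 * r_half / (1 + r_half), assuming a small hypothetical matrix of items scored 1 (correct) or 0 (incorrect):

import numpy as np

# Hypothetical item responses: rows are students, columns are items.
items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
])

# Odd-even split: score the odd-numbered and even-numbered items separately.
odd_half = items[:, 0::2].sum(axis=1)
even_half = items[:, 1::2].sum(axis=1)

# Correlation between the two half-test scores (this underestimates the
# reliability of the full-length test).
r_half = np.corrcoef(odd_half, even_half)[0, 1]

# Spearman-Brown prophecy formula: projects the half-test correlation
# up to the reliability expected of the whole test.
r_full = (2 * r_half) / (1 + r_half)

print(f"half-test r = {r_half:.2f}, corrected whole-test r = {r_full:.2f}")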

10.4. Kuder–Richardson methods

10.4.1. Measures the extent to which items within one form of the test have as much in common with one another as do the items in that one form with corresponding items in an equivalent form (2010, p. 344).

10.4.2. The strength of this estimate of reliability is dependent upon the extent to which the whole test “represents a single, fairly consistent measure of a concept” (2010, p. 344).
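One widely used Kuder–Richardson estimate is KR-20. A minimal sketch, reusing the same kind of hypothetical 0/1 item matrix (the data are illustrative, and the population-variance convention shown is one common choice):

import numpy as np

# Hypothetical item responses: rows are students, columns are items
# scored 1 (correct) or 0 (incorrect).
items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [0, 1, 0, 0, 1, 0, 0, 1],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1, 0, 1, 1],
    [1, 1, 1, 0, 1, 1, 0, 1],
])

k = items.shape[1]                   # number of items
p = items.mean(axis=0)               # proportion passing each item
q = 1 - p                            # proportion failing each item
total_var = items.sum(axis=1).var()  # variance of total scores (ddof=0,
                                     # matching the p*q item variances)

# KR-20: internal consistency estimate for dichotomously scored items.
kr20 = (k / (k - 1)) * (1 - (p * q).sum() / total_var)
print(f"KR-20 = {kr20:.2f}")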

11. RELIABILITY