Various Types of Validity & Reliability

This mind map represents the various types of validity and reliability, and explains why they are important in learning and assessment.

For learning and assessments


"Does the test measure what it is supposed to measure?" (Kubiszyn & Borich, 2010, p. 329).


"Does the test yield the same or similar score rankings (all other factors being equal) consistently?" (Kubiszyn & Borich, 2010, p. 329).


"Does the test score fairly closely approximate an individual’s true level of ability, skill, or aptitude?" (Kubiszyn & Borich, 2010, p. 329).


Kubiszyn, T., & Borich, G. (2010). Educational testing & measurement: Classroom application and practice (9th ed.). Hoboken, NJ: John Wiley & Sons, Inc.

by: Lorylynn Reyes


Test Retest or Stability

“Test–retest estimates of reliability are obtained by administering the same test twice to the same group of individuals, with a small time interval between testing, and correlating the scores. The longer the time interval, the lower test–retest estimates will be” (Kubiszyn & Borich, 2010, p. 349). Reliability is shown when the repeated administration of the test yields similar scores.
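The "correlating the scores" step can be sketched in a few lines of Python. This is only an illustration: the five pairs of scores are invented, and the helper name pearson_r is hypothetical rather than taken from Kubiszyn and Borich. It computes the ordinary Pearson correlation between the two administrations.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products of deviations from each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each list
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


# Hypothetical scores for the same five students on two administrations
first_administration = [78, 85, 62, 90, 71]
second_administration = [80, 83, 65, 92, 70]

test_retest_r = pearson_r(first_administration, second_administration)
print(round(test_retest_r, 2))
```

A coefficient close to 1.0 means the students kept roughly the same rank order across the two administrations. The same calculation applies to alternate-form estimates, with the two lists holding scores on the two forms.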

Alternate-form estimates of reliability

“Alternate-form estimates of reliability are obtained by administering two alternate or equivalent forms of a test to the same group and correlating their scores. The time interval between testings is as short as possible” (Kubiszyn & Borich, 2010, p. 349). A high correlation shows that the two forms can be used interchangeably.

Internal consistency estimates of reliability

“Internal consistency estimates of reliability fall into two general categories: split-half or odd–even estimates and item–total correlations, such as the Kuder–Richardson (KR) procedure. These estimates should be used only when the test measures a single or unitary trait” (Kubiszyn & Borich, 2010, p. 349). These estimates show whether the items on a single test administration measure the same trait consistently with one another.

Split-half and odd–even estimates
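An odd–even estimate can be sketched as follows. The response matrix is invented (1 = correct, 0 = incorrect), and the function names are hypothetical. The last step applies the Spearman–Brown formula, which this mind map does not mention; it is included here as an assumption because a correlation between two half-length tests is commonly adjusted in this way to estimate the reliability of the full-length test.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def odd_even_reliability(item_scores):
    """item_scores holds one list of 0/1 item scores per student."""
    odd_totals = [sum(row[0::2]) for row in item_scores]   # items 1, 3, 5, ...
    even_totals = [sum(row[1::2]) for row in item_scores]  # items 2, 4, 6, ...
    half_r = pearson_r(odd_totals, even_totals)
    # Spearman-Brown adjustment (an addition, not from the source text)
    full_r = 2 * half_r / (1 + half_r)
    return half_r, full_r


# Hypothetical responses: six students, six items
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

half_r, full_r = odd_even_reliability(responses)
print(round(half_r, 2), round(full_r, 2))
```

Splitting by odd and even item numbers, rather than first half versus second half, keeps both halves similar when items get harder toward the end of the test.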

Kuder–Richardson methods
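As a rough illustration, the sketch below computes KR-20, one of the Kuder–Richardson formulas, for items scored right or wrong. The choice of KR-20 rather than another KR formula is an assumption, the data are invented, and the function name kr20 is hypothetical. The formula used is KR-20 = (k / (k - 1)) * (1 - sum(p * q) / variance of total scores), where k is the number of items, p is the proportion of students answering an item correctly, and q = 1 - p.

```python
def kr20(item_scores):
    """KR-20 for dichotomous items; one list of 0/1 scores per student."""
    n_students = len(item_scores)
    k = len(item_scores[0])  # number of items

    # Sum of p * q over items, where p is the proportion answering correctly
    sum_pq = 0.0
    for item in range(k):
        p = sum(row[item] for row in item_scores) / n_students
        sum_pq += p * (1 - p)

    # Population variance of the students' total scores
    totals = [sum(row) for row in item_scores]
    mean_total = sum(totals) / n_students
    var_total = sum((t - mean_total) ** 2 for t in totals) / n_students

    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical responses: six students, six items (1 = correct, 0 = incorrect)
responses = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 1, 0, 1, 1],
    [1, 0, 1, 1, 0, 0],
    [0, 1, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0],
]

kr20_estimate = kr20(responses)
print(round(kr20_estimate, 2))
```

Unlike a split-half estimate, this uses every item at once, so the result does not depend on how the test happens to be divided.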


Content Validity Evidence

“Content validity evidence is assessed by systematically comparing a test item with instructional objectives to see if they match. Content validity evidence does not yield a numerical estimate of validity” (Kubiszyn & Borich, 2010, p. 339). Educational goals should shape the curriculum, and the content of the test must match what was taught. A poor test fails to match the instructional and educational goals of the class.

Criterion-Related Evidence

“In establishing criterion-related validity evidence, scores from a test are correlated with an external criterion” (Kubiszyn & Borich, 2010, p. 330).

Concurrent validity evidence

Predictive validity evidence

Construct validity evidence

“Construct validity evidence is determined by finding whether test results correspond with scores on other variables as predicted by some rationale or theory” (Kubiszyn & Borich, 2010, p. 340). Test scores should relate to other variables, such as the amount of instruction a student has received or the student's aptitude, in the way theory predicts. This helps an educator see the relationship between instruction and test results.