Discussion 2 week 4

Validity and Reliability

1. Interpreting Reliability Coefficients- The following principles prove useful as guidelines in evaluating the reliability of scores from a test.

1.1. PRINCIPLE 1: Group variability affects the size of the reliability coefficient. Heterogeneous groups yield higher coefficients than homogeneous groups do.

1.2. PRINCIPLE 2: Scoring reliability limits test score reliability. If tests are scored unreliably, error is introduced that will limit the reliability of the test scores.

1.3. PRINCIPLE 3: All other factors being equal, the more items included in a test, the higher the reliability of the scores (see the sketch after this list of principles).

1.4. PRINCIPLE 4: Reliability of test scores tends to decrease as tests become too easy or too difficult.
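
Principle 3 can be made concrete with the Spearman-Brown prophecy formula, which projects the reliability of a test whose length is changed by a factor k from its current reliability r: r_k = kr / (1 + (k - 1)r). Below is a minimal Python sketch; the function name and the starting reliability of 0.60 are illustrative assumptions, not values from this discussion.

```python
# Spearman-Brown prophecy formula: projected reliability when a test's
# length is multiplied by k, given its current reliability r.
def spearman_brown(r: float, k: float) -> float:
    return (k * r) / (1 + (k - 1) * r)

r_original = 0.60  # hypothetical reliability of the original test
for k in (0.5, 1, 2, 3):
    print(f"length x{k}: projected reliability = {spearman_brown(r_original, k):.2f}")
```

Doubling the length of a test with r = 0.60 projects to 0.75, while halving it drops the projection to about 0.38, consistent with Principle 3.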

2. Reliability- The consistency of measurements. If a test item is reliable, it can be correlated with other items to collectively measure a construct or content mastery.

2.1. Methods of Estimating

2.1.1. Test-Retest or Stability - Give the same test twice to the same group, with any time interval between the two administrations.

2.1.2. Alternate Forms or Equivalence - Give two forms of the test (similar in content, difficulty level, arrangement, type of assessment, etc.) to the same group in close succession.

2.1.3. Internal Consistency - If a test measures a single basic concept, its items ought to be correlated with each other, so reliability can be estimated from a single administration.

2.1.3.1. Split-half method- Involves splitting the test into two equivalent halves and determining the correlation between them. This can be done by assigning all items in the first half of the test to one form and all items in the second half of the test to the other form; however, that approach is appropriate only when items of varying difficulty are randomly spread across the test (otherwise an odd-even split is used). Because each half is only half the length of the full test, the half-test correlation is typically stepped up with the Spearman-Brown formula.

2.1.3.2. Kuder-Richardson Methods- These methods measure the extent to which items within one form of the test have as much in common with one another as do the items in that one form with corresponding items in an equivalent form. The strength of this estimate of reliability depends on the extent to which the entire test represents a single, fairly consistent measure of a concept. (Both internal-consistency methods are sketched in the code below.)
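
Both internal-consistency estimates above can be computed from a single test administration. The plain-Python sketch below applies an odd-even split-half estimate (stepped up with the Spearman-Brown correction) and Kuder-Richardson formula 20 (KR-20) to a matrix of right/wrong (1/0) item responses. The six-examinee, four-item score matrix and the function names are invented for illustration.

```python
def pearson(x, y):
    """Pearson correlation between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def split_half(rows):
    """Odd-even split-half reliability, corrected with Spearman-Brown."""
    odd = [sum(person[0::2]) for person in rows]   # items 1, 3, ...
    even = [sum(person[1::2]) for person in rows]  # items 2, 4, ...
    r_half = pearson(odd, even)
    return 2 * r_half / (1 + r_half)  # step up to full-length estimate

def kr20(rows):
    """Kuder-Richardson formula 20 for dichotomous (0/1) items."""
    n_items, n_people = len(rows[0]), len(rows)
    totals = [sum(person) for person in rows]
    mean_t = sum(totals) / n_people
    var_t = sum((t - mean_t) ** 2 for t in totals) / n_people
    pq = 0.0
    for j in range(n_items):
        p = sum(person[j] for person in rows) / n_people  # item difficulty
        pq += p * (1 - p)
    return (n_items / (n_items - 1)) * (1 - pq / var_t)

# Hypothetical responses: rows are examinees, columns are items (1 = correct).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
]
print(f"Split-half (corrected): {split_half(scores):.2f}")
print(f"KR-20:                  {kr20(scores):.2f}")
```

KR-20 is the special case of Cronbach's alpha for dichotomously scored (right/wrong) items; like the split-half estimate, it requires only one form and one administration of the test.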

3. VALIDITY- Denotes the extent to which an instrument is measuring what it is supposed to measure.

3.1. Content Validity Evidence- Whether the individual items of a test represent what you actually want to assess. A test can sometimes look valid but measure something entirely different than what is intended, such as guessing ability, reading level, or skills that may have been acquired before instruction. Content validity evidence is, therefore, more a minimum requirement for a useful test than it is a guarantee of a good test.

3.2. Criterion-Related Validity Evidence- A method for assessing the validity of an instrument by comparing its scores with another criterion already known to be a measure of the same trait or skill. Criterion-related validity is usually expressed as a correlation between the test in question and the criterion measure; this correlation coefficient is referred to as a validity coefficient (a small computational sketch appears at the end of this section).

3.2.1. Concurrent Criterion-Related Validity Evidence- The extent to which a procedure correlates with the CURRENT behavior of subjects.

3.2.2. Predictive Validity Evidence- The extent to which a procedure allows accurate predictions about a subject’s FUTURE behavior.

3.3. Construct Validity Evidence- The extent to which a test measures a theoretical construct or attribute. A test’s construct validity is often assessed by its convergent and discriminant validity. "Many different kinds of theories can be used to help determine the construct validity evidence of a test. For instance, if a test is supposed to be a test of arithmetic computation skills, you would expect scores on it to improve after intensive coaching in arithmetic, but not after intensive coaching in a foreign language."
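
As a concrete illustration of a validity coefficient, the sketch below correlates scores on a hypothetical new test with scores on an established criterion measure for the same examinees. Both score lists are invented, and statistics.correlation requires Python 3.10 or later.

```python
from statistics import correlation

new_test = [78, 85, 62, 90, 71, 68, 88, 75]   # hypothetical new test scores
criterion = [74, 88, 65, 93, 70, 72, 84, 77]  # hypothetical established measure

# If both sets of scores are collected at about the same time, this is
# concurrent evidence; if the criterion is measured later (e.g., end-of-year
# grades), the same coefficient serves as predictive evidence.
print(f"Validity coefficient: {correlation(new_test, criterion):.2f}")
```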