Validity and Reliability

1. Validity... When testing for validity, assessors must check the content of the test and compare it with their learning outcomes and objectives. Testing for validity is important because, as a teacher, you should ensure that your students receive fair and accurate tests whose questions meet your expected learning outcomes.

1.1. Content ...The content validity evidence for a test is established by inspecting test questions to see whether they correspond to what the user decides should be covered by the test. The assessor should ensure that the questions on the test relate to what was studied and to the learning outcomes the students are expected to meet.

1.2. Criterion...For criterion-related validity evidence, scores from a test are correlated with an external criterion. The external criterion should be pertinent to the original test; e.g. comparing the scores from a month's worth of vocab quizzes to a final vocab test at the end of the month. A student who did well on the weekly quizzes should also do well on the final vocab test, so the two sets of scores should show a strong positive correlation.

1.2.1. Predictive validity evidence refers to how well the test predicts some future behavior of the examinees. Many universities prefer this type of validity evidence because they can use the scores as a basis for predicting academic success in higher education. Some examples of predictive tests are the SAT, ACT, and GRE. To test predictive validity, you simply have to wait and see how close the predictions were to reality.

1.2.2. Concurrent criterion-related....validity evidence deals with measures that can be administered at the same time as the measure to be validated. The assessor should compare the test with an already established test that has been validated over time. They should administer both tests to their students and then correlate the two sets of scores. The correlation yields a numeric value (a coefficient) indicating how closely the new test agrees with the established one.
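A minimal sketch of how such a criterion-related correlation (predictive or concurrent) might be computed; the student scores, the 0-100 scale, and the ordering are all invented for illustration:

```python
# Hypothetical sketch: correlating a new test with an established criterion measure.
# Score lists and scale are assumptions, not real data.
import numpy as np

new_test = np.array([88, 72, 95, 61, 79, 84])    # scores on the test being validated
criterion = np.array([85, 70, 97, 58, 75, 88])   # scores on the established criterion

# Pearson correlation coefficient between the two score sets (ranges from -1 to +1).
r = np.corrcoef(new_test, criterion)[0, 1]
print(f"criterion-related validity coefficient: r = {r:.2f}")
```

A coefficient near +1 would suggest the new test ranks students much like the criterion does; a value near 0 would suggest little relationship.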

1.3. Construct ... A test has construct validity evidence if its relationship to other information corresponds well with some theory. The test scores should be compared with what the assessors expect the results to be. As an example, in a university language arts gen. ed. class, the professor would expect the English majors to receive higher scores than the Science majors. This type of validity evidence has no second statistical criterion; it rests purely on a theory of expected results.
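A minimal sketch of checking whether the scores line up with that theory; the group labels and scores here are hypothetical:

```python
# Hypothetical sketch: do English majors outscore Science majors, as theory predicts?
# Group membership and scores are invented for illustration.
import numpy as np

english_majors = np.array([91, 85, 88, 94, 82])
science_majors = np.array([74, 80, 69, 77, 72])

print(f"English majors mean: {english_majors.mean():.1f}")
print(f"Science majors mean: {science_majors.mean():.1f}")
# Construct validity evidence is supported when the observed difference
# runs in the direction the theory predicts.
print("matches theory:", english_majors.mean() > science_majors.mean())
```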

1.4. FACE...Face validity is concerned with how a measure or procedure appears. It is literally an examination of the face value of a procedure: how it looks, whether it appears to be a worthwhile test, whether the design makes sense, and whether it seems likely to work reliably. No outside theories are used in conjunction with face validity, only the opinions of the assessor or other observers.

2. RELIABILITY..."Does the test yield the same or similar score rankings (all other factors being equal) consistently?" After repeated trials, the data should be examined to ensure that there are no quantitative anomalies that could suggest an unreliable or inconsistent outcome. Ensuring the reliability of tests is important because a teacher should only administer tests that have been proven to do what they are intended to do: give an accurate and fair portrayal of students' academic learning and progress throughout the course.

2.1. TEST-RETEST OR STABILITY..."is a method of estimating reliability that is exactly what its name implies" (Kubiszyn & Borich, 2013). The test is given twice, and the correlation between the first set of scores and the second set of scores is determined. The examinees take the test two times, and the two sets of scores are correlated with each other. A problem with this method is that memory of the first administration can skew the second set of scores, unless the test-takers manage to forget the test entirely between the two administration periods.
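A minimal sketch of that correlation, with invented scores; the same computation works for the alternate-forms method below by swapping in scores from the second form:

```python
# Hypothetical sketch: test-retest reliability as the correlation between two
# administrations of the same test. Scores are invented for illustration.
from scipy.stats import pearsonr

first_administration = [78, 92, 65, 88, 73, 81]
second_administration = [80, 90, 68, 85, 70, 83]

# pearsonr returns the correlation coefficient and a p-value; the coefficient
# is the reliability estimate.
r, p_value = pearsonr(first_administration, second_administration)
print(f"test-retest reliability estimate: r = {r:.2f}")
```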

2.2. ALTERNATE FORMS OR EQUIVALENCE....If there are two equivalent forms of a test, these forms can be used to obtain an estimate of the reliability of the scores from the test. This is similar to the test-retest method, but because the same test is not given twice, it eliminates the problem of skewed results on the second administration. The two forms are taken and the scores correlated; however, this method does require the assessor to construct two good tests. Potentially a lot more work.

2.3. INTERNAL CONSISTENCY ..."If the test in question is designed to measure a single basic concept, it is reasonable to assume that people who get one item right will be more likely to get other, similar items right" (Kubiszyn & Borich, 2013). The test should be consistent within itself. If a test has many questions related to the same topics or subjects, then a student who answers one of these questions correctly should have a higher probability of correctly answering other questions on similar topics. There are a couple of different methods for estimating internal consistency.

2.3.1. SPLIT-HALF METHOD..."to find the split-half (or odd-even) reliability, each item is assigned to one half or the other" (Kubiszyn & Borich, 2013). This method of estimating internal consistency splits the test in half, generally by odd and even item numbers. Each half of the test is scored separately, and the two halves are then correlated with each other to get the final reliability estimate.
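A minimal sketch of the odd-even split, assuming a students-by-items matrix of 0/1 item scores (the data are invented); the last step applies the standard Spearman-Brown correction to adjust for the halved test length:

```python
# Hypothetical sketch: split-half (odd-even) reliability on an invented
# students-by-items matrix (1 = correct, 0 = incorrect).
import numpy as np

items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 0, 1, 0],
])

odd_half = items[:, 0::2].sum(axis=1)    # scores on items 1, 3, 5, 7
even_half = items[:, 1::2].sum(axis=1)   # scores on items 2, 4, 6, 8

r_half = np.corrcoef(odd_half, even_half)[0, 1]   # correlation between the halves
r_full = (2 * r_half) / (1 + r_half)              # Spearman-Brown step-up to full length
print(f"half-test correlation: {r_half:.2f}, full-test estimate: {r_full:.2f}")
```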

2.3.2. KUDER-RICHARDSON METHODS ...These methods measure the extent to which items within one form of the test have as much in common with one another as do the items in that one form with corresponding items in an equivalent form. This is a data-heavy way of checking reliability, computed from a single administration of the test: every question from every student has to be analyzed to reach a statistical conclusion. At the end of all the calculations you are left with a number that tells you how consistent the test questions are with one another, i.e. the internal-consistency reliability of the whole test.
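A minimal sketch of one of these formulas, KR-20, using the same kind of invented 0/1 item-score matrix as above:

```python
# Hypothetical sketch: KR-20 on an invented students-by-items matrix of 0/1 scores.
# KR-20 = (k / (k - 1)) * (1 - sum(p_i * q_i) / variance of total scores),
# where p_i is the proportion answering item i correctly and q_i = 1 - p_i.
import numpy as np

items = np.array([
    [1, 1, 0, 1, 1, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 1, 0, 0, 0, 1],
    [1, 1, 0, 1, 1, 0, 1, 0],
])

k = items.shape[1]                 # number of items
p = items.mean(axis=0)             # proportion correct for each item
q = 1 - p
total_scores = items.sum(axis=1)   # each student's total score
kr20 = (k / (k - 1)) * (1 - (p * q).sum() / total_scores.var())
print(f"KR-20 reliability estimate: {kr20:.2f}")
```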

2.4. INTERRATER..."Interrater reliability is the extent to which two or more individuals (coders or raters) agree." This method of addressing reliability examines the way we rate assessees, especially on more subjective tasks such as oral exams. The raters each assess the test-taker and then confer about their scores, coming to some sort of agreement on the best score or the best way to rate it.
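A minimal sketch of one simple way to quantify this, percent agreement between two raters; the ratings are invented for illustration:

```python
# Hypothetical sketch: percent agreement between two raters scoring the same
# set of oral exams. Ratings are invented, not real data.
rater_a = ["pass", "pass", "fail", "pass", "fail", "pass"]
rater_b = ["pass", "fail", "fail", "pass", "fail", "pass"]

agreements = sum(a == b for a, b in zip(rater_a, rater_b))
percent_agreement = agreements / len(rater_a)
print(f"interrater agreement: {percent_agreement:.0%}")   # 5 of 6 -> 83%
```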