
Classification accuracy
Accuracy for Identifying Students At-Risk for Word Reading Difficulties
In addition to providing teachers with information on students' phonics mastery, Star Phonics can be used to identify students considered "at risk" for reading difficulties, including characteristics of dyslexia, and thus requiring additional instruction and diagnostic assessment. For this purpose, correlation coefficients are of less interest than classification accuracy statistics, such as overall classification accuracy and the area under the curve.
Area under the ROC curve (AUC) is a summary measure of diagnostic accuracy. The National Center on Intensive Intervention has set an AUC of 0.80 or higher as indicating convincing evidence that an assessment can accurately distinguish between students with satisfactory and unsatisfactory reading performance.
To evaluate classification accuracy, students' scores on Star Phonics were compared to their performance on the Woodcock Johnson Reading Mastery Test (WJRM). Using the WJRM cut score as the criterion for identifying "at risk" performance, Star Phonics developers calculated AUC for Star Phonics. Coefficients ranged from 0.823 to 0.969 (average 0.92), demonstrating high agreement between the results of these two measures (see Table 3). This indicates that Star Phonics does a very good job of discriminating between students who performed satisfactorily and unsatisfactorily on the Woodcock Johnson assessment.
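To make the AUC statistic concrete, the short sketch below shows how such a coefficient can be computed from paired screener and criterion scores. The data, the cut score, and the use of scikit-learn are illustrative assumptions only; they are not the actual Star Phonics or WJRM analyses.

```python
# A minimal sketch of an AUC calculation for a screening measure,
# using hypothetical data rather than the actual Star Phonics / WJRM results.
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical paired scores for a small sample of students.
screener_scores = np.array([12, 18, 38, 31, 35, 29, 44, 52, 57, 63])      # screener (predictor)
criterion_scores = np.array([70, 74, 82, 88, 85, 93, 95, 101, 104, 110])  # criterion measure

# Hypothetical cut score on the criterion: students scoring below it
# are treated as "at risk" (unsatisfactory performance).
CRITERION_CUT = 90
at_risk = (criterion_scores < CRITERION_CUT).astype(int)

# AUC answers: if one at-risk and one not-at-risk student are picked at
# random, how often does the screener rank the at-risk student as riskier?
# roc_auc_score expects higher scores to indicate the positive (at-risk)
# class, so the screener scores are negated (lower score = higher risk).
auc = roc_auc_score(at_risk, -screener_scores)
print(f"AUC = {auc:.3f}")  # about 0.88 with these made-up numbers; 0.80+ is the benchmark
```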
Validity
Test validity has long been described as the degree to which a test measures what it is intended to measure. A more current description is that a test is valid to the extent that there are evidentiary data to support specific claims as to what the test measures, the interpretation of its scores, and the uses for which it is recommended or applied. Evidence of test validity is often indirect and incremental, consisting of a variety of findings that, in the aggregate, are consistent with the theory that the test measures the intended construct(s) or is suitable for its intended uses and interpretations of its scores. Determining the validity of a test involves the use of data and other information both internal and external to the test instrument itself. Star Phonics assessments meet validity expectations on all counts.
Criterion validity (or criterion-related validity) measures how well one measure predicts an outcome on another measure. A test has this type of validity if it is useful for predicting performance or behavior in another situation (past, present, or future). The first measure is sometimes called the predictor variable or the estimator. The second measure is called the criterion variable, provided it is known to be a valid tool for measuring similar outcomes or skills. Star Phonics was evaluated for concurrent validity, in which the predictor and criterion data are collected at the same time (see Table 4).
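For context, a concurrent validity coefficient of this kind is simply the correlation between the predictor and the criterion when both are administered in the same testing window. The sketch below shows one way such a coefficient can be computed; the data and variable names are hypothetical and are not drawn from the Star Phonics studies.

```python
# A minimal sketch of a concurrent validity coefficient: the Pearson
# correlation between predictor and criterion scores gathered in the
# same testing window. Data are hypothetical, not Star Phonics results.
from scipy.stats import pearsonr

predictor_scores = [12, 18, 25, 31, 35, 40, 44, 52, 57, 63]     # e.g., a phonics screener
criterion_scores = [70, 74, 82, 88, 85, 93, 95, 101, 104, 110]  # e.g., an established reading test

r, p_value = pearsonr(predictor_scores, criterion_scores)
print(f"concurrent validity coefficient r = {r:.2f} (p = {p_value:.4f})")
```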