Applied Psychological Measurement, Vol. 30, No. 6, 493-508 (2006)
DOI: 10.1177/0146621605287423
Equating Scores From Adaptive to Linear Tests
Wim J. van der Linden
University of Twente, the Netherlands
Two local methods for observed-score equating are applied to the problem of equating an adaptive test to a linear test. In an empirical study, the methods were evaluated against a method based on the test characteristic function (TCF) of the linear test and against traditional equipercentile equating applied to the ability estimates on the adaptive test for a population of test takers. The two local methods were generally best. Surprisingly, the TCF method performed slightly worse than the equipercentile method. Both the TCF and equipercentile methods showed strong bias and uniformly large inaccuracy, but the TCF method suffered from extra error due to the lower asymptote of the test characteristic function. It is argued that the poorer performance of these two methods is a consequence of the fact that they use a single equating transformation for an entire population of test takers and therefore have to compromise between the individual score distributions.
Key Words: computerized adaptive testing (CAT); equipercentile equating; local equating; score reporting; reference tests; test characteristic function equating
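
As a rough illustration of the lower asymptote mentioned in the abstract, the sketch below evaluates a test characteristic function under the assumption of a three-parameter logistic (3PL) item response model. The abstract does not name the model used in the study, and the item parameters and the function name `tcf` are hypothetical, chosen only for illustration. Under this assumption the TCF never drops below the sum of the guessing parameters, so number-correct scores below that value have no corresponding ability estimate.

```python
import math

def tcf(theta, items):
    """Expected number-correct score at ability theta under an assumed 3PL model.

    items is a list of (a, b, c) tuples: discrimination, difficulty, guessing.
    """
    return sum(
        c + (1.0 - c) / (1.0 + math.exp(-a * (theta - b)))
        for a, b, c in items
    )

# Hypothetical item parameters (not taken from the article).
items = [(1.0, -1.0, 0.20), (1.2, 0.0, 0.25), (0.8, 1.0, 0.20), (1.5, 0.5, 0.15)]

# The TCF approaches the sum of the guessing parameters as theta decreases,
# and the number of items as theta increases.
lower_asymptote = sum(c for _, _, c in items)
upper_asymptote = float(len(items))

for theta in (-4.0, -2.0, 0.0, 2.0, 4.0):
    print(f"theta={theta:+.1f}  expected score={tcf(theta, items):.3f}")
print(f"lower asymptote={lower_asymptote:.2f}, upper asymptote={upper_asymptote:.2f}")
```

With these illustrative parameters, any observed score on the linear test below the lower asymptote falls outside the range of the TCF, which is one way the extra error described above could arise.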
