Applied Psychological Measurement, Vol. 32, No. 4,
334-347 (2008)
DOI: 10.1177/0146621607300854
© 2008 SAGE Publications
Comparison of Parametric and Nonparametric Bootstrap Methods for Estimating Random Error in Equipercentile Equating
Zhongmin Cui
Zhongmin.Cui{at}act.org
Michael J. Kolen
University of Iowa
This article considers two methods of estimating standard errors of equipercentile equating: the parametric bootstrap method and the nonparametric bootstrap method. Using a simulation study, these two methods are compared under three sample sizes (300, 1,000, and 3,000), for two test content areas (the Iowa Tests of Basic Skills Maps and Diagrams and the ACT English), for two test lengths (24 items and 75 items), and for different parametric models (polynomial log-linear models with fitted degrees of C = 2 through 10). One thousand bootstrap samples were used to estimate standard errors of equating. The parametric bootstrap method was found to estimate standard errors of equating more accurately than the nonparametric bootstrap method in most of the situations examined.
Key Words: equating; standard errors; bootstrap; polynomial log-linear
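
To make the two procedures named in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. Several things here are assumptions made for illustration: the function names (`equipercentile`, `fit_loglinear`, `bootstrap_se`), the simplified equipercentile function (scores treated as uniform within ±0.5 of each integer), the rescaling of scores to [-1, 1] before forming polynomial terms, the choice to equate each resample's raw frequencies without re-smoothing, and the synthetic binomial score data, which stand in for real test data. The nonparametric bootstrap resamples from the observed score frequencies; the parametric bootstrap resamples from a fitted polynomial log-linear distribution of degree C.

```python
import numpy as np

def equipercentile(f, g):
    """Equate each integer Form X score to the Form Y scale.
    f, g: relative frequencies of scores 0..K on Forms X and Y (simplified sketch)."""
    F, G = np.cumsum(f), np.cumsum(g)
    p = F - 0.5 * f  # percentile rank (as a proportion) at each integer X score
    # Smallest Y score whose cumulative proportion exceeds p (clipped to the top score).
    y_star = np.minimum(np.searchsorted(G, p, side="right"), len(g) - 1)
    G_below = np.where(y_star > 0, G[y_star - 1], 0.0)
    g_at = g[y_star]
    safe_g = np.where(g_at > 0, g_at, 1.0)  # avoid dividing by an empty cell
    frac = np.where(g_at > 0, (p - G_below) / safe_g, 0.5)
    return y_star - 0.5 + np.clip(frac, 0.0, 1.0)

def fit_loglinear(counts, degree):
    """Fit log(expected count) = polynomial of the given degree in the score,
    by iteratively reweighted least squares (Poisson log-linear model)."""
    x = np.linspace(-1.0, 1.0, len(counts))  # assumption: rescaled for stability
    X = np.vander(x, degree + 1, increasing=True)
    mu = counts + 0.5  # conventional starting values
    for _ in range(50):
        z = np.log(mu) + (counts - mu) / mu  # working response
        beta = np.linalg.solve(X.T @ (mu[:, None] * X), X.T @ (mu * z))
        mu = np.exp(X @ beta)
    return mu / mu.sum()

def bootstrap_se(x_counts, y_counts, n_boot, rng, degree=None):
    """Bootstrap standard errors of the equated scores.
    degree=None -> nonparametric; integer degree -> parametric (log-linear)."""
    nx, ny = int(x_counts.sum()), int(y_counts.sum())
    if degree is None:
        # Resampling examinees with replacement is a multinomial draw
        # from the observed relative frequencies.
        px, py = x_counts / nx, y_counts / ny
    else:
        px, py = fit_loglinear(x_counts, degree), fit_loglinear(y_counts, degree)
    reps = np.empty((n_boot, len(px)))
    for b in range(n_boot):
        fx = rng.multinomial(nx, px) / nx
        fy = rng.multinomial(ny, py) / ny
        reps[b] = equipercentile(fx, fy)
    return reps.std(axis=0, ddof=1)

# Synthetic data (an assumption): two 24-item forms, 300 examinees each.
rng = np.random.default_rng(0)
x_counts = np.bincount(rng.binomial(24, 0.60, 300), minlength=25).astype(float)
y_counts = np.bincount(rng.binomial(24, 0.55, 300), minlength=25).astype(float)

se_nonparametric = bootstrap_se(x_counts, y_counts, 1000, rng)
se_parametric = bootstrap_se(x_counts, y_counts, 1000, rng, degree=4)
print(np.round(se_nonparametric, 3))
print(np.round(se_parametric, 3))
```

Comparing the two printed vectors score by score shows where the methods diverge on a given data set. Judging which is more accurate requires a known population and repeated sampling, as in the simulation design the abstract describes.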
