Applied Psychological Measurement, Vol. 32, No. 1,
81-97 (2008)
DOI: 10.1177/0146621607311580
Anchor Test Type and Population Invariance: An Exploration Across Subpopulations and Test Administrations
Neil J. Dorans
Educational Testing Service, ndorans{at}ets.org
Jinghua Liu
Educational Testing Service
Shelby Hammond
Rural Vermont, Montpelier
This exploratory study builds on research spanning three decades. Petersen, Marco, and Stewart (1982) conducted a major empirical investigation of the efficacy of different equating methods. The studies reported in Dorans (1990) examined how different equating methods performed across samples selected in different ways. Recent population sensitivity studies have examined whether equating methods yield comparable results across subpopulations. The current study confirms earlier research and clarifies the role of population invariance studies in assessing equating results. A content-appropriate anchor produced solid equating results when ability differences were small, but results from different methods diverged when ability differences were large. A content-inappropriate anchor did not produce sound score equatings, yet it yielded a strong degree of invariance. Lack of population invariance of equating results can be taken as evidence that a linking is not an equating. The existence of invariance does not mean, however, that equating has been achieved.
Key Words: linear equating; population invariance; anchor test; selection variable
References
- Angoff, W. H. (1971). Scales, norms and equivalent scores. In R. L. Thorndike (Ed.), Educational measurement (2nd ed., pp. 508-600). Washington, DC: American Council on Education.
- Dorans, N. J. (Ed.). (1990). Selecting samples for equating: To match or not to match [Special issue]. Applied Measurement in Education, 3(1).
- Dorans, N. J. (Ed.). (2004a). Assessing the population sensitivity of equating functions [Special issue]. Journal of Educational Measurement, 41(1).
- Dorans, N. J. (2004b). Using subpopulation invariance to assess test score equity. Journal of Educational Measurement, 41, 43-68.
- Dorans, N. J., Liu, J., & Hammond, S. (2006). The role of the anchor test in achieving population invariance across subpopulations and test administrations. In A. A. von Davier & M. Liu (Eds.), Population invariance of test equating and linking: Theory extension and applications across exams (pp. 131-160). Princeton, NJ: Educational Testing Service.
- Dorans, N. J., & Holland, P. W. (2000). Population invariance and equatability of tests: Basic theory and the linear case. Journal of Educational Measurement, 37, 281-306.
- Eignor, D. R., Stocking, M. L., & Cook, L. L. (1990). Simulation results of the effects on linear and curvilinear observed- and true-score equating procedures of matching on a fallible criterion. Applied Measurement in Education, 3, 37-52.
- Holland, P. W. (2004). Three methods of linear equating for the NEAT design. Unpublished manuscript.
- Kolen, M. J. (1990). Does matching in equating work? A discussion. Applied Measurement in Education, 3, 97-104.
- Kolen, M. J. (2004). Population invariance in equating: Concept and history. Journal of Educational Measurement, 41, 3-14.
- Kolen, M. J., & Brennan, R. L. (2004). Test equating, scaling, and linking: Methods and practices (2nd ed.). New York: Springer-Verlag.
- Lawrence, I. M., & Dorans, N. J. (1988). A comparison of observed score and true score equating methods for representative samples and samples matched on an anchor test (ETS Research Rep. No. RR-88-23). Princeton, NJ: Educational Testing Service.
- Lawrence, I. M., & Dorans, N. J. (1990). A comparison of several equating methods for representative samples and samples matched on an anchor test. Applied Measurement in Education, 3, 19-36.
- Liu, M., & Holland, P. W. (2008). Exploring Population Sensitivity of Linking Functions Across Three Law School Admission Test Administrations. Applied Psychological Measurement, 32, 27-44.
- Livingston, S. A., Dorans, N. J., & Wright, N. K. (1990). What combination of sampling and equating methods works best? Applied Measurement in Education, 3, 73-95.
- Lord, F. M. (1980). Application of item response theory to practical testing problems. Hillsdale, NJ: Lawrence Erlbaum.
- Petersen, N. S., Marco, G. L., & Stewart, E. E. (1982). A test of the adequacy of linear score equating models. In P. W. Holland & D. R. Rubin (Eds.), Test equating. New York: Academic Press.
- Pommerich, M., & Dorans, N. J. (Eds.). (2004). Concordance [Special issue]. Applied Psychological Measurement, 28(4).
- Schmitt, A. P., Cook, L. L., Dorans, N. J., & Eignor, D. R. (1990). The sensitivity of equating results to different sampling strategies. Applied Measurement in Education, 3, 53-71.
- Skaggs, G. (1990). To match or not to match samples on ability for equating: A discussion of five articles. Applied Measurement in Education, 3, 105-113.
- von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004a). The chain and post-stratification methods for observed-score equating and their relationship to population invariance. Journal of Educational Measurement, 41, 15-32.
- von Davier, A. A., Holland, P. W., & Thayer, D. T. (2004b). The kernel method of test equating. New York: Springer-Verlag.
- von Davier, A. A., & Wilson, C. (2008). Investigating the Population Sensitivity Assumption of Item Response Theory True-Score Equating Across Two Subgroups of Examinees and Two Test Formats. Applied Psychological Measurement, 32, 11-26.
- Wright, N. K., & Dorans, N. J. (1993). Using the selection variable for matching or equating (ETS Research Rep. No. RR-93-04). Princeton, NJ: Educational Testing Service.
- Yang, W.-L. (2004). Sensitivity of linkings between AP multiple-choice scores and composite scores to geographical region: An illustration of checking for population invariance. Journal of Educational Measurement, 41, 33-41.
- Yang, W.-L., & Gao, R. (2008). Invariance of Score Linkings Across Gender Groups for Forms of a Testlet-Based College-Level Examination Program Examination. Applied Psychological Measurement, 32, 45-61.
- Yi, Q., Harris, D. J., & Gao, X. (2008). Invariance of Equating Functions Across Different Subgroups of Examinees Taking a Science Achievement Test. Applied Psychological Measurement, 32, 62-80.
- Yin, P., Brennan, R. L., & Kolen, M. J. (2004). Concordance between ACT and ITED scores from different populations. Applied Psychological Measurement, 28, 274-289.
