Applied Psychological Measurement

 

Applied Psychological Measurement, Vol. 22, No. 2, 131-143 (1998)
DOI: 10.1177/01466216980222003

A Comparison of Linking and Concurrent Calibration Under Item Response Theory

Seock-Ho Kim

University of Georgia

Allan S. Cohen

University of Wisconsin at Madison

Applications of item response theory (IRT) to practical testing problems, including equating, differential item functioning, and computerized adaptive testing, require a common metric for item parameter estimates. This study compared three methods for developing a common metric under IRT: (1) linking separate calibration runs using equating coefficients from the characteristic curve method, (2) concurrent calibration based on marginal maximum a posteriori estimation, and (3) concurrent calibration based on marginal maximum likelihood estimation. For smaller numbers of common items, linking using the characteristic curve method yielded smaller root mean square differences for both item discrimination and difficulty parameters. For larger numbers of common items, the three methods yielded similar results.
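The idea of linking separate calibrations can be illustrated with a small sketch. The abstract does not specify the IRT model, the optimizer, or how the root mean square differences were computed, so everything below is an assumption made for illustration: a two-parameter logistic model, a criterion that matches test characteristic curves over the common items, a coarse grid search for the equating coefficients A and B, and hypothetical function names throughout. It relies only on the standard result that under the metric transformation theta* = A*theta + B, discrimination becomes a/A and difficulty becomes A*b + B.

```python
import math

def p_2pl(theta, a, b):
    """Two-parameter logistic item response function (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def to_base_metric(items, A, B):
    """Rescale (a, b) pairs under theta* = A*theta + B: a* = a/A, b* = A*b + B."""
    return [(a / A, A * b + B) for a, b in items]

def tcc(items, theta):
    """Test characteristic curve: expected number-correct score at theta."""
    return sum(p_2pl(theta, a, b) for a, b in items)

def tcc_criterion(base_items, new_items, A, B, thetas):
    """Mean squared gap between the base-form TCC and the rescaled new-form TCC."""
    transformed = to_base_metric(new_items, A, B)
    gaps = [(tcc(base_items, t) - tcc(transformed, t)) ** 2 for t in thetas]
    return sum(gaps) / len(gaps)

def link_by_grid_search(base_items, new_items, thetas):
    """Coarse grid search for (A, B); a real analysis would use a numerical optimizer."""
    best = None
    for i in range(31):              # A from 0.50 to 2.00 in steps of 0.05
        A = 0.5 + 0.05 * i
        for j in range(81):          # B from -2.00 to 2.00 in steps of 0.05
            B = -2.0 + 0.05 * j
            loss = tcc_criterion(base_items, new_items, A, B, thetas)
            if best is None or loss < best[0]:
                best = (loss, A, B)
    return best[1], best[2]

def rmsd(estimates, targets):
    """Root mean square difference between two equal-length sequences."""
    squared = [(e - t) ** 2 for e, t in zip(estimates, targets)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical common-item parameters (a, b) on the base metric.
base_common = [(1.0, -1.0), (0.8, 0.0), (1.5, 0.5), (1.2, 1.0), (0.6, -0.5)]

# The same items expressed on a second metric, built from known coefficients
# so that the search has a well-defined answer to recover.
true_A, true_B = 1.2, 0.5
new_common = [(a * true_A, (b - true_B) / true_A) for a, b in base_common]

thetas = [-3.0 + 0.5 * k for k in range(13)]
A_hat, B_hat = link_by_grid_search(base_common, new_common, thetas)

linked = to_base_metric(new_common, A_hat, B_hat)
rmsd_a = rmsd([a for a, _ in linked], [a for a, _ in base_common])
rmsd_b = rmsd([b for _, b in linked], [b for _, b in base_common])
```

Because the second set of parameters here is an exact transformation of the first, the recovered coefficients should sit at the generating values and both RMSD figures should be close to zero. With estimated parameters from real calibration runs, sampling error would leave a nonzero difference, which is the kind of quantity the study compares across methods.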




This article has been cited by other articles:

S. Kim and M. J. Kolen
Effects on Scale Linking of Different Definitions of Criterion Functions for the IRT Characteristic Curve Methods
Journal of Educational and Behavioral Statistics, December 1, 2007; 32(4): 371-397.

A. Notenboom and P. Reitsma
Investigating the Dimensions of Spelling Ability
Educational and Psychological Measurement, December 1, 2003; 63(6): 1039-1059.

B. A. Hanson and A. A. Beguin
Obtaining a Common Scale for Item Response Theory Item Parameters Using Separate Versus Concurrent Estimation in the Common-Item Equating Design
Applied Psychological Measurement, March 1, 2002; 26(1): 3-24.

S.-H. Kim and A. S. Cohen
A Comparison of Linking and Concurrent Calibration Under the Graded Response Model
Applied Psychological Measurement, March 1, 2002; 26(1): 25-41.