Developing a Common Metric in Item Response Theory
Martha L. Stocking
Educational Testing Service
Frederic M. Lord
Educational Testing Service
A common problem arises when independent estimates of item parameters from two separate data sets must be expressed in the same metric. This problem is frequently confronted in studies of horizontal and vertical equating and in studies of item bias. This paper discusses a number of methods for finding the appropriate transformation from one metric to another metric and presents a new method. Data are given comparing this new method with a current method, and recommendations are made.
Applied Psychological Measurement, Vol. 7, No. 2, 201-210 (1983)
DOI: 10.1177/014662168300700208
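
As a rough illustration of what "expressing item parameters in the same metric" involves, the sketch below assumes the usual linear indeterminacy of logistic IRT models: if ability is rescaled as theta* = A * theta + B, then difficulty becomes b* = A * b + B, discrimination becomes a* = a / A, and the lower asymptote c is unchanged. The constants A and B are estimated here with the simple mean/sigma approach applied to difficulty estimates of common items. This is only one well-known way to obtain such constants and is not claimed to be either of the methods the paper compares. The function names and the data values are hypothetical.

```python
from statistics import mean, pstdev


def mean_sigma_constants(b_from, b_to):
    """Estimate A and B such that b_to is approximately A * b_from + B.

    Uses the mean/sigma approach: A is the ratio of the standard
    deviations of the common-item difficulties, and B aligns the means.
    """
    A = pstdev(b_to) / pstdev(b_from)
    B = mean(b_to) - A * mean(b_from)
    return A, B


def rescale_item(a, b, c, A, B):
    """Re-express one item's (a, b, c) parameters on the target metric."""
    return a / A, A * b + B, c


# Hypothetical difficulty estimates for the same three items,
# calibrated separately in two data sets.
b_first = [-1.0, 0.0, 1.0]
b_second = [-1.5, 0.5, 2.5]

A, B = mean_sigma_constants(b_first, b_second)

# Move an item from the first calibration onto the second metric.
a_star, b_star, c_star = rescale_item(1.2, 0.0, 0.2, A, B)
```

With real data the two sets of estimates will not be exact linear functions of each other, so the choice of criterion for A and B matters; that choice is the subject of the paper.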
