Applied Psychological Measurement, Vol. 5, No. 2,
159-173 (1981)
DOI: 10.1177/014662168100500202
Item Bias in a Test of Reading Comprehension
Robert L. Linn
University of Illinois at Champaign/Urbana
Michael V. Levine
University of Illinois at Champaign/Urbana
C. Nicholas Hastings
University of Illinois at Champaign/Urbana
James L. Wardrop
University of Illinois at Champaign/Urbana
The possibility that certain features of items on a reading comprehension test may lead to biased estimates of the reading achievement of particular subgroups of students was investigated. Eight nonoverlapping subgroups of students were defined by the combinations of three factors: student grade level (fifth or sixth), income level of the neighborhood in which the school was located (low and middle or above), and race of the student (black or white). Estimates of student ability and item parameters were obtained separately for each of the eight subgroups using the three-parameter logistic model. Bias indices were computed based on differences in item characteristic curves for pairs of subgroups. A criterion for labeling an item as biased was developed using the distribution of bias indices for subgroups of the same race that differed only in income level or grade level. Using this criterion, three items were consistently identified as biased in four independent comparisons of subgroups of black and white students. Comparisons of content and format characteristics of items that were identified as biased with those that were not, or between items biased in different directions, did not lead to the identification of any systematic content differences. The study did provide strong support for the viability of the estimation procedure; item characteristics estimated with samples from different populations were very similar. Some suggestions for improvements in methodology are offered.
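The procedure summarized in the abstract can be illustrated with a minimal sketch. The three-parameter logistic curve below is the standard form of that model; everything else is an assumption made for illustration and is not taken from the article: the unsigned-area index, the ability range of -3 to 3, the scaling constant D = 1.7, the rule that the cutoff is the largest same-race (baseline) index, and the function names `icc_3pl`, `area_bias_index`, and `flag_biased`. The sketch also assumes the item parameters from the two subgroups have already been placed on a common scale.

```python
import math


def icc_3pl(theta, a, b, c, D=1.7):
    """Three-parameter logistic item characteristic curve.

    a = discrimination, b = difficulty, c = lower asymptote (guessing).
    D = 1.7 is the conventional scaling constant (an assumption here).
    """
    return c + (1.0 - c) / (1.0 + math.exp(-D * a * (theta - b)))


def area_bias_index(params_g1, params_g2, lo=-3.0, hi=3.0, steps=600):
    """Unsigned area between two subgroups' curves for one item.

    Hypothetical index: integrates |P1(theta) - P2(theta)| over [lo, hi]
    with the midpoint rule. The article's own index may differ.
    """
    width = (hi - lo) / steps
    total = 0.0
    for k in range(steps):
        theta = lo + (k + 0.5) * width
        total += abs(icc_3pl(theta, *params_g1) - icc_3pl(theta, *params_g2))
    return total * width


def flag_biased(cross_race_indices, baseline_indices):
    """Return positions of items whose index exceeds the baseline cutoff.

    Assumed rule: the cutoff is the largest index observed in
    comparisons of same-race subgroups.
    """
    cutoff = max(baseline_indices)
    return [i for i, v in enumerate(cross_race_indices) if v > cutoff]


if __name__ == "__main__":
    # Made-up (a, b, c) values for one item in two subgroups.
    item_group1 = (1.0, 0.0, 0.2)
    item_group2 = (1.0, 0.5, 0.2)
    print(area_bias_index(item_group1, item_group2))
```

Under these assumptions, an item is flagged only when its cross-race index is larger than any index produced by comparisons in which race is held constant, which mirrors the logic of using same-race comparisons as the reference distribution.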
Contract No. US-NIE-C-400-76-0116. We thank William Tierre for his help with the data preparation and analysis.
