Applied Psychological Measurement
Adjustments for Rater Effects in Performance Assessment

Walter M. Houston

American College Testing

Mark R. Raymond

American College Testing

Joseph C. Svec

American College Testing

Alternative methods to correct for rater leniency/stringency effects (i.e., rater bias) in performance ratings were investigated. Rater bias effects are of concern when candidates are evaluated by different raters. The three correction methods evaluated were ordinary least squares (OLS), weighted least squares (WLS), and imputation of the missing data (IMPUTE). In addition, the usual procedure of averaging the observed ratings was investigated. Data were simulated from an essentially τ-equivalent measurement model, with true scores and error scores normally distributed. The variables manipulated in the simulations were method of correction (OLS, WLS, IMPUTE, averaging the observed ratings), amount of missing data (50% missing, 75% missing), rater bias (low, high), and number of examinees or candidates (N = 50, N = 100). The accuracy of the methods in estimating true scores was assessed based on the square root of the average squared difference between the estimated and known true scores. The three correction methods consistently outperformed the procedure of averaging the observed ratings. IMPUTE was superior to the least squares methods.
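
The OLS adjustment and the accuracy criterion described above can be illustrated with a minimal sketch. It is not the authors' implementation: the additive model (rating = true score + rater effect + error), the choice of six raters with three ratings per examinee (50% missing), the equally spaced rater effects that sum to zero, and all function and parameter names are assumptions made for illustration only.

```python
import numpy as np


def rmse(estimated, true):
    """Square root of the average squared difference between two score vectors."""
    estimated = np.asarray(estimated, dtype=float)
    true = np.asarray(true, dtype=float)
    return float(np.sqrt(np.mean((estimated - true) ** 2)))


def simulate_ratings(n_examinees=100, n_raters=6, ratings_per_examinee=3,
                     bias_spread=1.0, error_sd=0.5, seed=0):
    """Simulate an incomplete rating design (all settings are assumptions)."""
    rng = np.random.default_rng(seed)
    true_scores = rng.normal(0.0, 1.0, n_examinees)
    # Assumed rater effects: equally spaced and centered on zero.
    rater_bias = bias_spread * np.linspace(-1.0, 1.0, n_raters)
    ratings = np.full((n_examinees, n_raters), np.nan)  # NaN marks a missing rating
    for i in range(n_examinees):
        raters = rng.choice(n_raters, size=ratings_per_examinee, replace=False)
        errors = rng.normal(0.0, error_sd, ratings_per_examinee)
        ratings[i, raters] = true_scores[i] + rater_bias[raters] + errors
    return ratings, true_scores, rater_bias


def ols_adjusted_scores(ratings):
    """Least squares fit of examinee and rater effects to the observed ratings."""
    n_examinees, n_raters = ratings.shape
    rows, cols = np.nonzero(~np.isnan(ratings))
    obs = np.arange(rows.size)
    # One dummy column per examinee followed by one dummy column per rater.
    design = np.zeros((rows.size, n_examinees + n_raters))
    design[obs, rows] = 1.0
    design[obs, n_examinees + cols] = 1.0
    coef, *_ = np.linalg.lstsq(design, ratings[rows, cols], rcond=None)
    scores, bias = coef[:n_examinees], coef[n_examinees:]
    # The model is identified only up to a constant shift, so the rater
    # effects are constrained to sum to zero (assumes a connected design).
    shift = bias.mean()
    return scores + shift, bias - shift


ratings, true_scores, rater_bias = simulate_ratings()
adjusted, estimated_bias = ols_adjusted_scores(ratings)
raw_average = np.nanmean(ratings, axis=1)  # the usual unadjusted procedure

print("RMSE, averaged observed ratings:", round(rmse(raw_average, true_scores), 3))
print("RMSE, OLS-adjusted scores:      ", round(rmse(adjusted, true_scores), 3))
```

Under these assumed settings, the adjusted scores should track the known true scores more closely than the raw averages, because the estimated rater effects are removed before scoring. The sketch covers only the OLS method; WLS and IMPUTE are not shown.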

Index terms: EM algorithm, incomplete data, incomplete rating designs, least squares adjustments, performance assessment, rater calibration.

Applied Psychological Measurement, Vol. 15, No. 4, 409-421 (1991)
DOI: 10.1177/014662169101500411

This article has been cited by other articles:

J. R. Boulet, J. R. Gimpel, D. J. Dowling, and M. Finley
Assessing the Ability of Medical Students to Perform Osteopathic Manipulative Treatment Techniques
J Am Osteopath Assoc, May 1, 2004; 104(5): 203 - 211.