Applied Psychological Measurement

 

Applied Psychological Measurement, Vol. 11, No. 4, 329-354 (1987)
DOI: 10.1177/014662168701100401


Reviews

Methodology Review: Clustering Methods

Glenn W. Milligan

Ohio State University

Martha C. Cooper

Ohio State University

A review of clustering methodology is presented, with emphasis on algorithm performance and the resulting implications for applied research. After an overview of the clustering literature, the clustering process is discussed within a seven-step framework. The four major types of clustering methods can be characterized as hierarchical, partitioning, overlapping, and ordination algorithms. The validation of such algorithms refers to the problem of determining the ability of the methods to recover cluster configurations which are known to exist in the data. Validation approaches include mathematical derivations, analyses of empirical datasets, and monte carlo simulation methods. Next, interpretation and inference procedures in cluster analysis are discussed. Inference procedures involve testing for significant cluster structure and the problem of determining the number of clusters in the data. The paper concludes with two sets of recommendations. One set deals with topics in clustering that would benefit from continued research into the methodology. The other set offers recommendations for applied analyses within the framework of the clustering process.

