addK

Evaluate additional numbers of clusters

    Description

    updatedEvaluation = addK(evaluation,klist) returns a clustering evaluation object updatedEvaluation, which contains the evaluation data in the clustering evaluation object evaluation and additional evaluation data for the proposed number of clusters specified in klist.

    Examples

    Create a clustering evaluation object using evalclusters, and then use addK to evaluate additional numbers of clusters.

    Load the fisheriris data set. The data contains length and width measurements from the sepals and petals of three species of iris flowers.

    load fisheriris

    Cluster the flower measurement data using kmeans, and use the Calinski-Harabasz criterion to evaluate proposed solutions for 1 to 5 clusters.

    evaluation = evalclusters(meas,"kmeans","CalinskiHarabasz","KList",1:5)
    evaluation = 
      CalinskiHarabaszEvaluation with properties:
    
        NumObservations: 150
             InspectedK: [1 2 3 4 5]
        CriterionValues: [NaN 513.9245 561.6278 530.4871 456.1279]
               OptimalK: 3
    
    
    

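    The clustering evaluation object evaluation contains data on each proposed clustering solution. The criterion value for one cluster is NaN because the Calinski-Harabasz criterion is not defined for a single cluster. The returned value of OptimalK indicates that the optimal solution is three clusters.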

    Evaluate proposed solutions for 6 to 10 clusters using the same criterion. Add these evaluations to the original clustering evaluation object.

    evaluation = addK(evaluation,6:10)
    evaluation = 
      CalinskiHarabaszEvaluation with properties:
    
        NumObservations: 150
             InspectedK: [1 2 3 4 5 6 7 8 9 10]
        CriterionValues: [NaN 513.9245 561.6278 530.4871 456.1279 469.5068 449.6410 435.8182 413.3837 386.5571]
               OptimalK: 3
    
    
    

    The updated values for InspectedK and CriterionValues show that evaluation now contains results for proposed solutions with 1 to 10 clusters. The OptimalK value is still 3, indicating that the optimal solution is still three clusters.
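    To check this result, you can plot the criterion values and look up the best value directly from the object properties. The following is a minimal sketch that repeats the steps above. The call to rng is an assumption added only to make the kmeans results repeatable, and the variable names idx and bestK are illustrative.

```matlab
load fisheriris
rng("default")   % assumption: fix the random seed so that kmeans is repeatable

evaluation = evalclusters(meas,"kmeans","CalinskiHarabasz","KList",1:5);
evaluation = addK(evaluation,6:10);

% Plot the criterion values against the number of clusters.
plot(evaluation)

% Find the number of clusters with the largest criterion value.
% By default, max skips the NaN entry stored for one cluster.
[~,idx] = max(evaluation.CriterionValues);
bestK = evaluation.InspectedK(idx)
```

    For the Calinski-Harabasz criterion, a larger value indicates a better solution, so bestK is expected to match evaluation.OptimalK.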

    Input Arguments

    evaluation: Clustering evaluation data, specified as a CalinskiHarabaszEvaluation, DaviesBouldinEvaluation, GapEvaluation, or SilhouetteEvaluation clustering evaluation object. Create a clustering evaluation object by using evalclusters.

    klist: Additional numbers of clusters to evaluate, specified as a vector of positive integers. If any values in klist overlap with clustering solutions already evaluated in the evaluation object, then addK ignores the overlapping values.

    Data Types: single | double
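    The handling of overlapping values can be illustrated with a short sketch. It assumes the fisheriris data used in the example on this page. The call to rng and the variable names updated, inspected, hasDuplicates, and coversAll are illustrative additions.

```matlab
load fisheriris
rng("default")   % assumption: seed fixed only for repeatability

evaluation = evalclusters(meas,"kmeans","CalinskiHarabasz","KList",1:5);

% The values 4 and 5 are already evaluated, so only 6, 7, and 8 are new.
updated = addK(evaluation,4:8);

inspected = updated.InspectedK;

% Check that no number of clusters appears twice and that 1 to 8 are all present.
hasDuplicates = numel(unique(inspected)) < numel(inspected)
coversAll = all(ismember(1:8,inspected))
```

    Because addK ignores the overlapping values, you can pass a convenient range such as 4:8 without first removing the values that are already evaluated.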

    Output Arguments

    updatedEvaluation: Updated clustering evaluation data, returned as a CalinskiHarabaszEvaluation, DaviesBouldinEvaluation, GapEvaluation, or SilhouetteEvaluation clustering evaluation object. updatedEvaluation contains data on the proposed clustering solutions included in evaluation and data on the additional proposed numbers of clusters specified in klist.

    For all clustering evaluation objects, addK updates the InspectedK and CriterionValues properties to include the proposed clustering solutions specified in klist and their corresponding criterion values. If the software finds a new optimal number of clusters and optimal clustering solution, then addK also updates the OptimalK and OptimalY properties.
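    The following sketch shows one way to observe these property updates. It starts from a deliberately narrow range of clusters and then adds more, so the optimal number of clusters can change. The starting range 1:2, the call to rng, and the variable names kBefore, kAfter, and numAssigned are assumptions made for illustration.

```matlab
load fisheriris
rng("default")   % assumption: seed fixed only for repeatability

% Start with a narrow range that does not include three clusters.
evaluation = evalclusters(meas,"kmeans","CalinskiHarabasz","KList",1:2);
kBefore = evaluation.OptimalK

% Add more proposed numbers of clusters and inspect OptimalK again.
evaluation = addK(evaluation,3:5);
kAfter = evaluation.OptimalK

% OptimalY holds one cluster index per observation for the optimal solution.
numAssigned = numel(evaluation.OptimalY)
```

    Comparing kBefore and kAfter shows whether the additional evaluations changed the optimal solution. Because kmeans uses random initialization, the exact values can vary if you omit the call to rng.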

    For certain clustering evaluation objects, addK updates these additional property values:

    • LogW, ExpectedLogW, StdLogW, and SE (for gap criterion evaluation objects)

    • ClusterSilhouettes (for silhouette criterion evaluation objects)
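    As a minimal sketch of these criterion-specific updates, the following code evaluates the fisheriris data with the gap and silhouette criteria, calls addK, and then compares the size of each listed property with the number of inspected solutions. The "B",20 setting (fewer reference data sets than the default) and the call to rng are assumptions chosen to keep the sketch fast and repeatable. The variable names are illustrative.

```matlab
load fisheriris
rng("default")   % assumption: seed fixed only for repeatability

% Gap criterion: evaluate 1 to 3 clusters, then add 4 and 5.
gapEval = evalclusters(meas,"kmeans","gap","KList",1:3,"B",20);
gapEval = addK(gapEval,4:5);
numK = numel(gapEval.InspectedK);
gapLengths = [numel(gapEval.LogW) numel(gapEval.ExpectedLogW) ...
              numel(gapEval.StdLogW) numel(gapEval.SE)]

% Silhouette criterion: evaluate 2 and 3 clusters, then add 4 and 5.
silEval = evalclusters(meas,"kmeans","silhouette","KList",2:3);
silEval = addK(silEval,4:5);
numSil = numel(silEval.ClusterSilhouettes)
```

    Each entry of gapLengths, and the value of numSil, is expected to equal the number of values in the corresponding InspectedK property.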

    Version History

    Introduced in R2014a