
pcaComponent

Pipeline component for principal component analysis (PCA)

Since R2026a

    Description

    pcaComponent is a pipeline component that performs principal component analysis (PCA). The pipeline component uses the functionality of the pca function during the learn phase to compute the principal component coefficients and variable means. During the run phase, the component transforms new data using the coefficients and mean values.
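
    The two phases described above can be sketched with the pca function directly. This is a minimal illustration under stated assumptions, not the component's implementation: the random data, the variable names (Xtrain, Xnew, scoresNew), and the explicit projection formula are chosen only for this example.

```matlab
rng(0)                                  % reproducible random data
Xtrain = randn(100,4);                  % data seen during the learn phase
Xnew   = randn(10,4);                   % data seen during the run phase

% Learn phase: estimate the coefficients and the variable means.
[coeff,scoreTrain,~,~,~,mu] = pca(Xtrain);

% Run phase: center the new data with the learned means, then project.
scoresNew = (Xnew - mu)*coeff;

% Sanity check: the same formula reproduces the training scores.
maxDiff = max(abs((Xtrain - mu)*coeff - scoreTrain),[],"all");
```

    The new data is centered with the means learned from the training data, not with its own means, so that both data sets are expressed in the same coordinate system.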

    Creation

    Description

    component = pcaComponent creates a pipeline component for principal component analysis (PCA).

    component = pcaComponent(Name=Value) sets writable Properties using one or more name-value arguments. For example, you can specify the principal component algorithm, number of components, and variance explained by selected components.


    Properties


    Structural Parameters

    The software sets structural parameters when you create the component. You cannot modify structural parameters after creating the component.

    UseWeights

    This property is read-only after the component is created.

    Observation weights flag, specified as 0 (false) or 1 (true). If UseWeights is true, the component adds a third input "Weights" to the Inputs component property, and a third input tag 3 to the InputTags component property.

    Example: c = pcaComponent(UseWeights=1)

    Data Types: logical

    Learn Parameters

    The software sets learn parameters when you create the component. You can modify learn parameters using dot notation any time before you use the learn object function. Any unset learn parameters use the corresponding default values.

    Algorithm

    Principal component algorithm, specified as one of the following values.

    Value    Description
    "svd"    Singular value decomposition (SVD) of the input data X.
    "eig"    Eigenvalue decomposition (EIG) of the covariance matrix. The EIG algorithm is faster than SVD when the number of observations, n, exceeds the number of variables, p, but is less accurate because the condition number of the covariance is the square of the condition number of X.
    "als"    Alternating least squares (ALS) algorithm. This algorithm finds the best rank-k approximation by factoring X into an n-by-k left factor matrix, L, and a p-by-k right factor matrix, R, where k is the number of principal components. The factorization uses an iterative method starting with random initial values. ALS is designed to handle missing values. If you specify "als", the component ignores the value of MissingBehavior.

    Example: c = pcaComponent(Algorithm="eig")

    Example: c.Algorithm = "als"

    Data Types: char | string
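
    The relationship between the "svd" and "eig" algorithms can be checked with the pca function, which accepts the same Algorithm values. The following is a sketch on synthetic data (the data and all variable names are assumptions for illustration). It compares the coefficients from both algorithms and verifies the statement about condition numbers.

```matlab
rng(1)
X = randn(200,5)*diag([5 3 2 1 0.5]);   % n = 200 observations, p = 5 variables

coeffSvd = pca(X,Algorithm="svd");
coeffEig = pca(X,Algorithm="eig");

% Both algorithms return the same coefficients up to sign and rounding error.
maxDiff = max(abs(abs(coeffSvd) - abs(coeffEig)),[],"all");

% The covariance matrix has the squared condition number of the centered data.
Xc       = X - mean(X);
condData = cond(Xc);
condCov  = cond(cov(X));
relGap   = abs(condCov - condData^2)/condData^2;
```

    On well-conditioned data such as this, the two algorithms agree closely. The accuracy difference matters mainly when the columns of X are nearly collinear.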

    Centered

    Indicator for centering columns, specified as 1 (true) or 0 (false).

    The component centers the data by subtracting the column means before computing the singular value decomposition or eigenvalue decomposition. The component omits NaN values when it computes the means.

    Example: c = pcaComponent(Centered=false)

    Example: c.Centered = 1

    Data Types: logical
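
    The effect of centering can be illustrated with the pca function, which has a Centered argument of the same name. This sketch uses synthetic data with nonzero column means; the data and variable names are assumptions made for the example.

```matlab
rng(2)
X = randn(50,3) + [10 20 30];           % columns with nonzero means

[coeffC,~,~,~,~,muC] = pca(X);                  % centering is the pca default
[coeffU,~,~,~,~,muU] = pca(X,Centered=false);   % no centering

% With centering, mu holds the column means. Without it, mu is all zeros.
meanGap = max(abs(muC - mean(X)));
```

    Without centering, the leading principal component tends to point toward the mean of the data rather than along the direction of greatest spread, so the two coefficient matrices generally differ.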

    MissingBehavior

    Action to take for missing values, specified as one of the following values.

    Value         Description
    "complete"    If an observation contains at least one missing value, it is not used in calculations.
    "pairwise"    The component computes the (i,j) element of the covariance matrix using the rows with no NaN values in columns i or j of the first data argument of learn. This option applies only when Algorithm is "eig". If you do not specify the algorithm or specify "svd" as the algorithm, the component sets Algorithm to "eig" during the learn phase.
    "all"         The predictor data is expected to have no missing values. The component uses all of the data and terminates execution if any NaN value is found.

    Example: c = pcaComponent(MissingBehavior="all")

    Example: c.MissingBehavior = "pairwise"

    Data Types: char | string
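
    The "complete" and "all" behaviors can be sketched with the Rows argument of the pca function. That MissingBehavior corresponds to Rows is an assumption made for this illustration; this page does not state the mapping. The data and variable names are also hypothetical.

```matlab
rng(3)
X = randn(30,3);
X(4,2) = NaN;                            % one observation with a missing value

% "complete": rows that contain NaN are left out of the calculation.
coeffComplete = pca(X,Rows="complete");
coeffManual   = pca(X(~any(isnan(X),2),:));
maxDiff = max(abs(coeffComplete - coeffManual),[],"all");

% "all": pca expects no missing values and stops with an error.
try
    pca(X,Rows="all");
    stoppedOnNaN = false;
catch
    stoppedOnNaN = true;
end
```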

    NumComponents

    Number of principal components requested, specified as a scalar integer k that satisfies 0 < k ≤ p, where p is the number of variables in the first data argument of learn.

    The default value of NumComponents is p.

    If you specify both NumComponents and VarianceExplained, the component uses the value that results in the fewest principal components.

    Example: c = pcaComponent(NumComponents=4)

    Example: c.NumComponents = 3

    Data Types: single | double

    VarianceExplained

    Variance explained by the selected principal components, specified as a positive numeric scalar in the range [0,1].

    If you specify both VarianceExplained and NumComponents, the component uses the value that results in the fewest principal components.

    Example: c = pcaComponent(VarianceExplained=0.95)

    Example: c.VarianceExplained = 0.8

    Data Types: single | double
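
    The interaction between NumComponents and VarianceExplained can be sketched as follows, using the pca function on the Fisher iris measurements. The selection rule shown here (the smallest number of components whose cumulative explained variance reaches the target) is an assumption about how the variance criterion is evaluated. The variable names kRequested, targetVariance, and kVariance are hypothetical.

```matlab
load fisheriris                          % provides the 150-by-4 matrix meas
kRequested     = 3;                      % plays the role of NumComponents
targetVariance = 0.95;                   % plays the role of VarianceExplained

[coeff,~,~,~,explained] = pca(meas);     % explained is in percent

% Smallest number of components whose cumulative variance reaches the target.
kVariance = find(cumsum(explained)/100 >= targetVariance,1);

% Use whichever criterion gives fewer components.
k = min(kRequested,kVariance);
coeffSelected = coeff(:,1:k);
```

    Here the variance criterion is the stricter of the two, so fewer than the three requested components are kept.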

    Component Properties

    The software sets component properties when you create the component. You can modify the component properties (excluding HasLearnables and HasLearned) using dot notation at any time. You cannot modify the HasLearnables and HasLearned properties directly.

    Name

    Component identifier, specified as a character vector or string scalar.

    Example: c = pcaComponent(Name="PCAComponent")

    Example: c.Name = "PrincipalComponents"

    Data Types: char | string

    Inputs

    Names of the input ports, specified as a character vector, string array, or cell array of character vectors.

    Example: c = pcaComponent(Inputs="Data1")

    Example: c.Inputs = "X"

    Data Types: char | string | cell

    Outputs

    Names of the output ports, specified as a character vector, string array, or cell array of character vectors.

    Example: c = pcaComponent(Outputs="newX")

    Example: c.Outputs = "X"

    Data Types: char | string | cell

    InputTags

    Tags that enable the automatic connection of the component inputs with other components or pipelines, specified as a nonnegative integer vector. If you specify InputTags, then the number of tags must match the number of inputs in Inputs.

    Example: c = pcaComponent(InputTags=2)

    Example: c.InputTags = 1

    Data Types: single | double

    OutputTags

    Tags that enable the automatic connection of the component outputs with other components or pipelines, specified as a nonnegative integer vector. If you specify OutputTags, then the number of tags must match the number of outputs in Outputs.

    Example: c = pcaComponent(OutputTags=0)

    Example: c.OutputTags = 1

    Data Types: single | double

    HasLearnables

    This property is read-only.

    Indicator for the learnables, returned as 1 (true). A value of 1 indicates that the component contains Learnables.

    Data Types: logical

    HasLearned

    This property is read-only.

    Indicator showing the learning status of the component, returned as 0 (false) or 1 (true). A value of 1 indicates that the learn object function has been applied to the component and the Learnables are nonempty.

    Data Types: logical

    Learnables

    The software sets learnables when you use the learn object function. You cannot modify learnables directly.

    Mu

    This property is read-only.

    Estimated means of the variables in the first data argument of learn, returned as a numeric row vector. When Centered is 0 (false), the component does not compute the means and Mu is a vector of zeros.

    Data Types: single | double

    Coefficients

    Principal component coefficients, returned as a numeric matrix. Each column of Coefficients contains the coefficients for one principal component. The columns are arranged in descending order by principal component variance.

    Data Types: single | double

    UsedVariables

    This property is read-only.

    Names of the variables used by the component to compute principal components, returned as a string array. The variables correspond to columns in the first data argument of learn.

    Data Types: string

    Object Functions

    learn       Initialize and evaluate pipeline or component
    run         Execute pipeline or component for inference after learning
    reset       Reset pipeline or component
    series      Connect components in series to create pipeline
    parallel    Connect components or pipelines in parallel to create pipeline
    view        View diagram of pipeline inputs, outputs, components, and connections

    Examples


    Create a pcaComponent pipeline component. Request three components.

    component = pcaComponent(NumComponents=3)
    component = 
      pcaComponent with properties:
    
                 Name: "PCA"
               Inputs: "DataIn"
            InputTags: 1
              Outputs: "DataOut"
           OutputTags: 1
    
       
    Learnables (HasLearned = false)
                   Mu: []
         Coefficients: []
        UsedVariables: []
    
       
    Structural Parameters (locked)
           UseWeights: 0
    
       
    Learn Parameters (unlocked)
        NumComponents: 3
    
    
    Show all parameters
    

    component is a pcaComponent object that contains three learnables: Mu, Coefficients, and UsedVariables. These properties remain empty until you pass data to the component during the learn phase.

    Read the fisheriris data set into a table. Store the predictor and response data in the tables X and Y, respectively.

    fisheriris = readtable("fisheriris.csv");
    X = fisheriris(:,1:end-1);
    Y = fisheriris(:,end);

    Use the learn object function to perform principal component analysis.

    component = learn(component,X)
    component = 
      pcaComponent with properties:
    
                 Name: "PCA"
               Inputs: "DataIn"
            InputTags: 1
              Outputs: "DataOut"
           OutputTags: 1
    
       
    Learnables (HasLearned = true)
                   Mu: [5.8433 3.0573 3.7580 1.1993]
         Coefficients: [4×3 double]
        UsedVariables: ["SepalLength"    "SepalWidth"    "PetalLength"    "PetalWidth"]
    
       
    Structural Parameters (locked)
           UseWeights: 0
    
       
    Learn Parameters (locked)
        NumComponents: 3
    
    
    Show all parameters
    

    Note that the HasLearned property is set to true and Mu, Coefficients, and UsedVariables are nonempty.

    Find the PCA coefficients. Each column contains the coefficients for one principal component.

    coefficients = component.Coefficients
    coefficients =
    
        0.3614    0.6566   -0.5820
       -0.0845    0.7302    0.5979
        0.8567   -0.1734    0.0762
        0.3583   -0.0755    0.5458

    Version History

    Introduced in R2026a
