featureSelectionClassificationNCAComponent
Pipeline component for performing feature selection using neighborhood component analysis (NCA) for classification
Since R2026a
Description
featureSelectionClassificationNCAComponent is a pipeline component that performs
feature selection using neighborhood component analysis (NCA) for classification. The pipeline
component uses the functionality of the fscnca function during the learn phase to identify important predictors in the
data. During the run phase, the component selects the same predictors from a new data
set.
Creation
Syntax
component = featureSelectionClassificationNCAComponent
component = featureSelectionClassificationNCAComponent(Name=Value)
Description
component = featureSelectionClassificationNCAComponent creates a pipeline component for feature selection using an NCA feature selection model.
Use the component when creating a pipeline for classification.
component = featureSelectionClassificationNCAComponent(Name=Value) sets writable Properties using one or more
name-value arguments. For example, you can specify the regularization parameter, solver
type, and method used for model fitting.
Properties
Structural Parameters
The software sets structural parameters when you create the component. You cannot modify structural parameters after creating the component.
This property is read-only after the component is created.
Observation weights flag, specified as 0 (false)
or 1 (true). If UseWeights is
true, the component adds a third input "Weights" to the
Inputs component property, and a third input tag
3 to the InputTags component
property.
Example: c = featureSelectionClassificationNCAComponent(UseWeights=1)
Data Types: logical
Learn Parameters
The software sets learn parameters when you create the component. You can modify learn
parameters using dot notation any time before you use the learn object
function. Any unset learn parameters use the corresponding default values.
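As a minimal sketch of this workflow, the following code creates a component and then adjusts learn parameters with dot notation before any call to learn. It uses only property names documented on this page; the specific values are illustrative assumptions, not recommendations.

```matlab
% Create the component with one learn parameter set at construction.
c = featureSelectionClassificationNCAComponent(Solver="sgd");

% Adjust further learn parameters with dot notation before calling learn.
c.PassLimit = 3;        % illustrative value
c.MiniBatchSize = 20;   % illustrative value

% Learn parameters that are not set explicitly keep their default values.
```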
Method for fitting the model, specified as one of the following:
- "exact": Performs fitting using all of the data.
- "none": No fitting. Use this option to evaluate the generalization error of the NCA model using the initial feature weights.
- "average": Divides the data into partitions (subsets), fits each partition using the "exact" method, and returns the average of the feature weights.
Example: c =
featureSelectionClassificationNCAComponent(FitMethod="none")
Example: c.FitMethod = "average"
Data Types: char | string
Relative convergence tolerance on the gradient norm, specified as a positive real scalar.
This property is valid only when Solver is
"lbfgs".
Example: c =
featureSelectionClassificationNCAComponent(GradientTolerance=2e-6)
Example: c.GradientTolerance = 1e-5
Data Types: single | double
Size of the history buffer for Hessian approximation, specified as a positive
integer. At each iteration, the component uses the most recent
HessianHistorySize iterations to build an approximation of the
inverse Hessian.
This property is valid only when Solver is
"lbfgs".
Example: c =
featureSelectionClassificationNCAComponent(HessianHistorySize=20)
Example: c.HessianHistorySize = 10
Data Types: single | double
Initial learning rate for the "sgd" solver, specified as a
positive real scalar or "auto".
When Solver is
"sgd", the learning rate decays over iterations starting with the
value specified for InitialLearningRate.
When you specify "auto", the initial learning rate is
determined using experiments on small subsets of data. Use the NumTuningIterations property to specify the number of iterations for
automatically tuning the initial learning rate. Use the TuningSubsetSize property to specify the number of observations to use
for automatically tuning the initial learning rate.
For solver type "minibatch-lbfgs", you can set
InitialLearningRate to a very high value. In this case, the
function applies LBFGS to each mini-batch separately with initial feature weights from
the previous mini-batch.
Example: c =
featureSelectionClassificationNCAComponent(InitialLearningRate=0.9)
Example: c.InitialLearningRate = "auto"
Data Types: single | double | char | string
Initial step size, specified as a positive real scalar or
"auto".
This property is valid only when Solver is
"lbfgs".
Example: c =
featureSelectionClassificationNCAComponent(InitialStepSize=0.1)
Example: c.InitialStepSize = "auto"
Data Types: single | double | char | string
Maximum number of iterations, specified as a positive integer.
Each pass through a batch is an iteration. Each pass through all of the data is an epoch. If the data is divided into k mini-batches, then every epoch is equivalent to k iterations.
If Solver is
"sgd", the default value is 10000. If
Solver is "lbfgs" or
"minibatch-lbfgs", the default value is
1000.
Example: c =
featureSelectionClassificationNCAComponent(IterationLimit=250)
Example: c.IterationLimit = 1000
Data Types: single | double
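The relationship between iterations and epochs can be checked with a short calculation. This is only an arithmetic sketch: the values of n, batchSize, and numEpochs are assumptions chosen for illustration, and the use of ceil for a data set that does not divide evenly into batches is also an assumption.

```matlab
% Illustrative values (assumptions, not defaults from this page).
n = 150;            % number of observations
batchSize = 25;     % observations per mini-batch
numEpochs = 3;      % passes through all of the data

% Number of mini-batches per epoch. ceil is an assumption for the case
% where n is not a multiple of batchSize; here the division is exact.
k = ceil(n/batchSize);

% Every epoch is equivalent to k iterations.
numIterations = numEpochs*k;

fprintf("%d mini-batches per epoch, %d iterations in %d epochs\n", ...
    k, numIterations, numEpochs);
```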
Regularization parameter to prevent overfitting, specified as a nonnegative scalar.
As the number of observations increases, the chance of overfitting decreases and the required amount of regularization also decreases.
The default value is 1/n, where
n is the number of observations in the first data argument of
learn.
Example: c =
featureSelectionClassificationNCAComponent(Lambda=0.002)
Example: c.Lambda = 0.01
Data Types: single | double
Width of the kernel, specified as a positive real scalar.
A length scale value of 1 is sensible when all predictors are
on the same scale. If the predictors are of very different magnitudes, then consider
standardizing the predictor values using the Standardize
property.
Example: c =
featureSelectionClassificationNCAComponent(LengthScale=1.5)
Example: c.LengthScale = 1.25
Data Types: single | double
Line search method, specified as one of the following:
- "weakwolfe": Weak Wolfe line search
- "strongwolfe": Strong Wolfe line search
- "backtracking": Backtracking line search
This property is valid only when Solver is
"lbfgs".
Example: c =
featureSelectionClassificationNCAComponent(LineSearchMethod="strongwolfe")
Example: c.LineSearchMethod = "backtracking"
Data Types: char | string
Loss function, specified as "classiferror" or a function
handle.
When you specify "classiferror", the component uses the
misclassification error for computing the objective function.
To specify a custom loss function, use function handle notation. The function must
have the form L = lossfun(Yu,Yv), where Yu is a
u-by-1 vector, Yv is a
v-by-1 vector, and L is a
u-by-v matrix of loss values.
Example: c =
featureSelectionClassificationNCAComponent(LossFun=@lossfun)
Example: c.LossFun = "classiferror"
Data Types: char | string | function_handle
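The following sketch shows one function that satisfies the required form L = lossfun(Yu,Yv). It is a hypothetical 0-1 loss written as an anonymous function, and it assumes that the labels arrive as numeric column vectors; the page does not state how the component encodes the labels it passes to the loss function.

```matlab
% Hypothetical custom loss: 0 for matching labels, 1 otherwise.
% Yu is u-by-1 and Yv is v-by-1. Implicit expansion of Yu against the
% row vector Yv.' produces a u-by-v matrix, so L(i,j) compares Yu(i)
% with Yv(j).
lossfun = @(Yu,Yv) double(Yu ~= Yv.');

% Check the output shape on small numeric label vectors (assumed encoding).
Yu = [1; 2; 3];
Yv = [1; 3];
L = lossfun(Yu,Yv);
disp(size(L))   % 3-by-2, that is, u-by-v
```

Under those assumptions, the handle can then be supplied through the LossFun property, for example c.LossFun = lossfun.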
Maximum number of line search iterations, specified as a positive integer.
This property is valid only when Solver is
"lbfgs".
Example: c =
featureSelectionClassificationNCAComponent(MaxLineSearchIterations=25)
Example: c.MaxLineSearchIterations = 15
Data Types: single | double
Max weight fraction for selecting features, specified as a numeric scalar in the range (0,1].
If you do not specify the NumFeatures
or MaxWeightFraction value, the software selects all features.
You cannot specify both NumFeatures and
MaxWeightFraction.
Example: c =
featureSelectionClassificationNCAComponent(MaxWeightFraction=0.5)
Example: c.MaxWeightFraction = 0.75
Data Types: single | double
Maximum number of iterations per mini-batch LBFGS step, specified as a positive integer.
This property is valid only when Solver is
"minibatch-lbfgs".
Example: c =
featureSelectionClassificationNCAComponent(MiniBatchLBFGSIterations=15)
Example: c.MiniBatchLBFGSIterations = 20
Data Types: single | double
Number of observations to use in each batch, specified as a positive integer
between 1 and n, where n is
the number of observations in the first data argument of
learn.
This property is valid only when Solver is
"sgd".
The default value is min(10,n).
Example: c =
featureSelectionClassificationNCAComponent(MiniBatchSize=25)
Example: c.MiniBatchSize = 20
Data Types: single | double
Number of features (predictors) to select, specified as a positive integer scalar.
If you do not specify the NumFeatures or MaxWeightFraction value, the software selects all features. You cannot
specify both NumFeatures and
MaxWeightFraction.
Example: c =
featureSelectionClassificationNCAComponent(NumFeatures=5)
Example: c.NumFeatures = 10
Data Types: single | double
Number of tuning iterations, specified as a positive integer.
This property is valid only when Solver is
"sgd" and InitialLearningRate is "auto".
Example: c =
featureSelectionClassificationNCAComponent(NumTuningIterations=15)
Example: c.NumTuningIterations = 25
Data Types: single | double
Maximum number of passes, specified as a positive integer. Each pass through all of the data is called an epoch.
This property is valid only when Solver is
"sgd".
Example: c =
featureSelectionClassificationNCAComponent(PassLimit=10)
Example: c.PassLimit = 3
Data Types: single | double
Prior probabilities for each class, specified as a value in this table.
| Value | Description |
|---|---|
| "empirical" | The class prior probabilities are the class relative frequencies. The class relative frequencies are determined by the second data argument of learn. |
| "uniform" | All class prior probabilities are equal to 1/K, where K is the number of classes. |
| structure | A structure |
Example: c =
featureSelectionClassificationNCAComponent(Prior="uniform")
Example: c.Prior = "empirical"
Data Types: char | string | struct
Solver type for estimating feature weights, specified as one of the following:
- "lbfgs": Limited memory Broyden–Fletcher–Goldfarb–Shanno (LBFGS) algorithm
- "sgd": Stochastic gradient descent (SGD) algorithm
- "minibatch-lbfgs": Stochastic gradient descent with LBFGS algorithm applied to mini-batches
The default value is "sgd" when
n>1000, where n is the
number of observations in the first data argument of learn.
Otherwise, the default value is "lbfgs".
Example: c =
featureSelectionClassificationNCAComponent(Solver="sgd")
Example: c.Solver = "lbfgs"
Data Types: char | string
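The default rule described above, together with the IterationLimit default that depends on the solver, can be written out as follows. This is a sketch of the documented rule for a given n, not the component's internal code; the value of n is an assumption.

```matlab
n = 1500;   % illustrative number of observations

% Documented default: "sgd" when n > 1000, otherwise "lbfgs".
if n > 1000
    solver = "sgd";
else
    solver = "lbfgs";
end

% Documented IterationLimit default for the chosen solver.
if solver == "sgd"
    iterationLimit = 10000;
else
    iterationLimit = 1000;   % applies to "lbfgs" and "minibatch-lbfgs"
end

fprintf("Solver: %s, IterationLimit: %d\n", solver, iterationLimit);
```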
Indicator for standardizing the predictor data, specified as 0
(false) or 1 (true).
Example: c =
featureSelectionClassificationNCAComponent(Standardize=true)
Example: c.Standardize = false
Data Types: logical
Convergence tolerance on the step size, specified as a positive real scalar.
This property is valid only when Solver is
"sgd" or "lbfgs".
The "lbfgs" solver uses an absolute step tolerance, and the
"sgd" solver uses a relative step tolerance.
Example: c =
featureSelectionClassificationNCAComponent(StepTolerance=5e-6)
Example: c.StepTolerance = 1e-5
Data Types: single | double
Number of observations to use for tuning the initial learning rate, specified as a
positive integer value from 1 to n, where
n is the number of observations in the first data argument of
learn.
This property is valid only when Solver is
"sgd" and InitialLearningRate is "auto".
The default value is min(100,n).
Example: c =
featureSelectionClassificationNCAComponent(TuningSubsetSize=25)
Example: c.TuningSubsetSize = 50
Data Types: single | double
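Several learn parameters on this page have defaults that depend on n, the number of observations in the first data argument of learn. The sketch below evaluates those documented expressions for an assumed n so the resulting values are easy to see; it does not query the component itself.

```matlab
n = 150;   % illustrative number of observations

defaultLambda = 1/n;                    % Lambda
defaultMiniBatchSize = min(10,n);       % MiniBatchSize
defaultTuningSubsetSize = min(100,n);   % TuningSubsetSize

fprintf("Lambda = %.4g, MiniBatchSize = %d, TuningSubsetSize = %d\n", ...
    defaultLambda, defaultMiniBatchSize, defaultTuningSubsetSize);
```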
Component Properties
The software sets component properties when you create the component. You can modify the
component properties (excluding HasLearnables and
HasLearned) using dot notation at any time. You cannot modify the
HasLearnables and HasLearned properties
directly.
Component identifier, specified as a character vector or string scalar.
Example: c =
featureSelectionClassificationNCAComponent(Name="FeatureSelector")
Example: c.Name = "NCASelector"
Data Types: char | string
Names of the input ports, specified as a character vector, string array, or cell
array of character vectors. If UseWeights is true, the component adds the input port
"Weights" to Inputs.
Example: c =
featureSelectionClassificationNCAComponent(Inputs=["Data1","Data2"])
Example: c.Inputs = ["X1","Y1"]
Data Types: char | string | cell
Names of the output ports, specified as a character vector, string array, or cell array of character vectors.
Example: c =
featureSelectionClassificationNCAComponent(Outputs=["newX","importance"])
Example: c.Outputs = ["X","S"]
Data Types: char | string | cell
Tags that enable the automatic connection of the component inputs with other
components or pipelines, specified as a nonnegative integer vector. If you specify
InputTags, then the number of tags must match the number of
inputs in Inputs. If
UseWeights is true, the software adds a third input tag to
InputTags.
Example: c = featureSelectionClassificationNCAComponent(InputTags=[1 0])
Example: c.InputTags = [1 2]
Data Types: single | double
Tags that enable the automatic connection of the component outputs with other
components or pipelines, specified as a nonnegative integer vector. If you specify
OutputTags, then the number of tags must match the number of
outputs in Outputs.
Example: c = featureSelectionClassificationNCAComponent(OutputTags=[1 0])
Example: c.OutputTags = [1 2]
Data Types: single | double
This property is read-only.
Indicator for the learnables (HasLearnables), returned as 1
(true). A value of 1 indicates that the
component contains Learnables.
Data Types: logical
This property is read-only.
Indicator showing the learning status of the component (HasLearned), returned as
0 (false) or 1
(true). A value of 1 indicates that the
learn object function has been applied to the component and the
Learnables are nonempty.
Data Types: logical
Learnables
The software sets learnables when you use the learn object
function. You cannot modify learnables directly.
This property is read-only.
Neighborhood component analysis model for classification (Model), returned as a FeatureSelectionNCAClassification model object.
This property is read-only.
Names of the features selected by the component (SelectedVariables), returned as a string array. The
features correspond to columns in the first data argument of
learn.
Data Types: string
This property is read-only.
Names of the variables used by the component to select features (UsedVariables), returned as a string
array. The variables correspond to columns in the first data argument of
learn.
Data Types: string
Object Functions
| Function | Description |
|---|---|
| learn | Initialize and evaluate pipeline or component |
| run | Execute pipeline or component for inference after learning |
| reset | Reset pipeline or component |
| series | Connect components in series to create pipeline |
| parallel | Connect components or pipelines in parallel to create pipeline |
| view | View diagram of pipeline inputs, outputs, components, and connections |
Examples
Create a featureSelectionClassificationNCAComponent pipeline
component. Specify to select 3 features.
component = featureSelectionClassificationNCAComponent(NumFeatures=3)
component =
featureSelectionClassificationNCAComponent with properties:
Name: "FeatureSelectionClassificationNCA"
Inputs: ["X" "Y"]
InputTags: [1 2]
Outputs: ["XSelected" "Scores"]
OutputTags: [1 NaN]
Learnables (HasLearned = false)
Model: []
SelectedVariables: []
UsedVariables: []
Structural Parameters (locked)
UseWeights: 0
Learn Parameters (unlocked)
NumFeatures: 3
Show all parameters
component is a
featureSelectionClassificationNCAComponent object that contains three
learnables: Model, SelectedVariables, and
UsedVariables. These properties remain empty until you pass data
to the component during the learn phase.
Read the fisheriris data set into a table. Store the predictor
and response data in the tables X and Y, respectively.
fisheriris = readtable("fisheriris.csv");
X = fisheriris(:,1:end-1);
Y = fisheriris(:,end);
Use the learn object function to select features from the
predictor data X.
component = learn(component,X,Y)
component =
featureSelectionClassificationNCAComponent with properties:
Name: "FeatureSelectionClassificationNCA"
Inputs: ["X" "Y"]
InputTags: [1 2]
Outputs: ["XSelected" "Scores"]
OutputTags: [1 NaN]
Learnables (HasLearned = true)
Model: [1×1 FeatureSelectionNCAClassification]
SelectedVariables: ["PetalLength" "PetalWidth" "SepalWidth"]
UsedVariables: ["SepalLength" "SepalWidth" "PetalLength" "PetalWidth"]
Structural Parameters (locked)
UseWeights: 0
Learn Parameters (locked)
NumFeatures: 3
Show all parameters
Note that the HasLearned property is set to
true and Model,
SelectedVariables, and UsedVariables are
nonempty.
Find the names of the selected features.
names = component.SelectedVariables
names =
1×3 string array
"PetalLength" "PetalWidth" "SepalWidth"
Version History
Introduced in R2026a