
# fillmissing

Fill missing values

## Syntax

``F = fillmissing(A,'constant',v)``
``F = fillmissing(A,method)``
``F = fillmissing(A,movmethod,window)``
``F = fillmissing(___,dim)``
``F = fillmissing(___,Name,Value)``
``[F,TF] = fillmissing(___)``

## Description


`F = fillmissing(A,'constant',v)` fills missing entries of an array or table with the constant value `v`. If `A` is a matrix or multidimensional array, then `v` can be either a scalar or a vector. When `v` is a vector, each element specifies the fill value in the corresponding column of `A`. If `A` is a table or timetable, then `v` can also be a cell array.

Missing values are defined according to the data type of `A`:

• `NaN`: `double`, `single`, `duration`, and `calendarDuration`

• `NaT`: `datetime`

• `<missing>`: `string`

• `<undefined>`: `categorical`

• `' '`: `char`

• `{''}`: `cell` of character arrays

If `A` is a table, then the data type of each column defines the missing value for that column.
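
As a minimal sketch of the vector form of `v` (the matrix and the fill values below are made up for illustration), each column receives its own constant:

```matlab
% Hypothetical data: a 3-by-2 matrix with one missing entry per column.
A = [1 NaN; NaN 4; 5 6];

% Fill missing entries in column 1 with 10 and in column 2 with 20.
F = fillmissing(A,'constant',[10 20]);
% F = [1 20; 10 4; 5 6]
```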


`F = fillmissing(A,method)` fills missing entries using the method specified by `method`, which can be one of the following:

• `'previous'`: previous non-missing value

• `'next'`: next non-missing value

• `'nearest'`: nearest non-missing value

• `'linear'`: linear interpolation of neighboring, non-missing values (numeric, `duration`, and `datetime` data types only)

• `'spline'`: piecewise cubic spline interpolation (numeric, `duration`, and `datetime` data types only)

• `'pchip'`: shape-preserving piecewise cubic spline interpolation (numeric, `duration`, and `datetime` data types only)
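
To see how the simpler methods differ, the following sketch applies four of them to the same vector. The data is invented for illustration and chosen so that no missing entry is equidistant from two neighbors, which keeps the `'nearest'` result unambiguous.

```matlab
% Hypothetical vector with two gaps of two missing values each.
A = [1 NaN NaN 4 NaN NaN 7];

Fprev = fillmissing(A,'previous');  % [1 1 1 4 4 4 7]
Fnext = fillmissing(A,'next');      % [1 4 4 4 7 7 7]
Fnear = fillmissing(A,'nearest');   % [1 1 4 4 4 7 7]
Flin  = fillmissing(A,'linear');    % [1 2 3 4 5 6 7], up to round-off
```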


`F = fillmissing(A,movmethod,window)` fills missing entries using a moving window mean or median with window length `window`. For example, `fillmissing(A,'movmean',5)` fills data with a moving average using a window length of 5.


`F = fillmissing(___,dim)` specifies the dimension of `A` to operate along. By default, `fillmissing` operates along the first dimension whose size does not equal 1. For example, if `A` is a matrix, then `fillmissing(A,'linear',2)` operates across the columns of `A`, filling missing data row by row.


`F = fillmissing(___,Name,Value)` specifies additional parameters for filling missing values using one or more name-value pair arguments. For example, if `t` is a vector of time values, then `fillmissing(A,'linear','SamplePoints',t)` interpolates the data in `A` relative to the times in `t`.


`[F,TF] = fillmissing(___)` also returns a logical array corresponding to the entries of `A` that were filled.

## Examples


Create a vector that contains `NaN` values and replace each `NaN` with the previous non-missing value.

```
A = [1 3 NaN 4 NaN NaN 5];
F = fillmissing(A,'previous')
```
```
F = 1×7

     1     3     3     4     4     4     5
```

Use interpolation to replace `NaN` values in non-uniformly sampled data.

Define a vector of non-uniform sample points and evaluate the sine function over the points.

```
x = [-4*pi:0.1:0, 0.1:0.2:4*pi];
A = sin(x);
```

Inject `NaN` values into `A`.

`A(A < 0.75 & A > 0.5) = NaN;`

Fill the missing data using linear interpolation, and return the filled vector `F` and the logical vector `TF`. The value 1 (`true`) in entries of `TF` corresponds to the values of `F` that were filled.

`[F,TF] = fillmissing(A,'linear','SamplePoints',x);`

Plot the original data and filled data.

```
plot(x,A,'.', x(TF),F(TF),'o')
xlabel('x');
ylabel('sin(x)')
legend('Original Data','Filled Missing Data')
```

Use a moving median to fill missing numeric data.

Create a vector of sample points `x` and a vector of data `A` that contains missing values.

```
x = linspace(0,10,200);
A = sin(x) + 0.5*(rand(size(x))-0.5);
A([1:10 randi([1 length(x)],1,50)]) = NaN;
```

Replace `NaN` values in `A` using a moving median with a window of length 10, and plot both the original data and the filled data.

```
F = fillmissing(A,'movmedian',10);
plot(x,F,'r.-',x,A,'b.-')
legend('Filled Missing Data','Original Data')
```

Create a matrix with missing entries and fill across the columns (second dimension) one row at a time using linear interpolation. For each row, fill leading and trailing missing values with the nearest non-missing value in that row.

```
A = [NaN NaN 5 3 NaN 5 7 NaN 9 NaN;
     8 9 NaN 1 4 5 NaN 5 NaN 5;
     NaN 4 9 8 7 2 4 1 1 NaN]
```
```
A = 3×10

   NaN   NaN     5     3   NaN     5     7   NaN     9   NaN
     8     9   NaN     1     4     5   NaN     5   NaN     5
   NaN     4     9     8     7     2     4     1     1   NaN
```
```
F = fillmissing(A,'linear',2,'EndValues','nearest')
```
```
F = 3×10

     5     5     5     3     4     5     7     8     9     9
     8     9     5     1     4     5     5     5     5     5
     4     4     9     8     7     2     4     1     1     1
```

Fill missing values for table variables with different data types.

Create a table whose variables include `categorical`, `double`, and `char` data types.

```
A = table(categorical({'Sunny';'Cloudy';''}),[66;NaN;54],{'';'N';'Y'},[37;39;NaN],...
    'VariableNames',{'Description' 'Temperature' 'Rain' 'Humidity'})
```
```
A=3×4 table
    Description    Temperature    Rain    Humidity
    ___________    ___________    ____    ________

    Sunny               66        ''         37
    Cloudy             NaN        'N'        39
    <undefined>         54        'Y'       NaN
```

Replace all missing entries with the value from the previous entry. Since there is no previous element in the `Rain` variable, the missing character vector is not replaced.

`F = fillmissing(A,'previous')`
```
F=3×4 table
    Description    Temperature    Rain    Humidity
    ___________    ___________    ____    ________

    Sunny              66         ''         37
    Cloudy             66         'N'        39
    Cloudy             54         'Y'        39
```

Replace the `NaN` values from the `Temperature` and `Humidity` variables in `A` with 0.

`F = fillmissing(A,'constant',0,'DataVariables',{'Temperature','Humidity'})`
```
F=3×4 table
    Description    Temperature    Rain    Humidity
    ___________    ___________    ____    ________

    Sunny              66         ''         37
    Cloudy              0         'N'        39
    <undefined>        54         'Y'         0
```

Alternatively, use the `isnumeric` function to identify the numeric variables to operate on.

`F = fillmissing(A,'constant',0,'DataVariables',@isnumeric)`
```
F=3×4 table
    Description    Temperature    Rain    Humidity
    ___________    ___________    ____    ________

    Sunny              66         ''         37
    Cloudy              0         'N'        39
    <undefined>        54         'Y'         0
```

## Input Arguments


Input data, specified as a vector, matrix, multidimensional array, table, or timetable.

If `A` is a timetable, then only table values are filled. If the associated vector of row times contains a `NaT` or `NaN` value, then `fillmissing` produces an error. Row times must be unique and listed in ascending order.

Data Types: `double` | `single` | `int8` | `int16` | `int32` | `int64` | `uint8` | `uint16` | `uint32` | `uint64` | `logical` | `char` | `string` | `cell` | `table` | `timetable` | `categorical` | `datetime` | `duration` | `calendarDuration`

Fill constant, specified as a scalar, vector, or cell array. `v` can be a vector when `A` is a matrix or multidimensional array. `v` can be a cell array when `A` is a table or timetable.

Data Types: `double` | `single` | `int8` | `int16` | `int32` | `int64` | `uint8` | `uint16` | `uint32` | `uint64` | `logical` | `char` | `cell` | `categorical` | `datetime` | `duration`

Fill method `method`, specified as one of the following:

| Method | Description |
| --- | --- |
| `'previous'` | previous non-missing value |
| `'next'` | next non-missing value |
| `'nearest'` | nearest non-missing value |
| `'linear'` | linear interpolation of neighboring, non-missing values (numeric, `duration`, and `datetime` data types only) |
| `'spline'` | piecewise cubic spline interpolation (numeric, `duration`, and `datetime` data types only) |
| `'pchip'` | shape-preserving piecewise cubic spline interpolation (numeric, `duration`, and `datetime` data types only) |
| `'makima'` | modified Akima cubic Hermite interpolation (numeric, `duration`, and `datetime` data types only) |

Moving method `movmethod` to fill missing data, specified as one of the following:

| Method | Description |
| --- | --- |
| `'movmean'` | Moving average over a window of length `window` (numeric data types only) |
| `'movmedian'` | Moving median over a window of length `window` (numeric data types only) |

Window length, specified as a positive integer scalar, a two-element vector of positive integers, a positive duration scalar, or a two-element vector of positive durations.

When `window` is a positive integer scalar, the window is centered about the current element and contains `window-1` neighboring elements. If `window` is even, the window is centered about the current and previous elements. When `window` is a two-element vector of positive integers `[b f]`, the window contains the current element, `b` elements backward, and `f` elements forward.

When `A` is a timetable or `'SamplePoints'` is specified as a `datetime` or `duration` vector, `window` must be of type `duration`, and the windows are computed relative to the sample points.

Data Types: `double` | `single` | `int8` | `int16` | `int32` | `int64` | `uint8` | `uint16` | `uint32` | `uint64` | `duration`
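
The window forms described above can be sketched as follows. The data and dates are invented for illustration, and the expected values in the comments assume that missing entries inside a window are ignored when the mean is computed.

```matlab
% Hypothetical vector with a single missing entry in the middle.
A = [1 2 NaN 4 5];

% Centered window of length 3: the gap is filled with mean([2 4]).
F1 = fillmissing(A,'movmean',3);        % [1 2 3 4 5]

% Window [b f] = [2 0]: current element plus two elements backward, mean([1 2]).
F2 = fillmissing(A,'movmean',[2 0]);    % [1 2 1.5 4 5]

% Window [b f] = [0 2]: current element plus two elements forward, mean([4 5]).
F3 = fillmissing(A,'movmean',[0 2]);    % [1 2 4.5 4 5]

% Duration window measured against datetime sample points (daily, assumed).
t = datetime(2020,1,1) + days(0:4);
F4 = fillmissing(A,'movmean',days(3),'SamplePoints',t);  % [1 2 3 4 5]
```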

Dimension to operate along, specified as a positive integer scalar. If no value is specified, then the default is the first array dimension whose size does not equal 1.

When `A` is a table or timetable, `dim` is not supported. `fillmissing` operates along each table or timetable variable separately.

Consider a two-dimensional input array, `A`.

• If `dim=1`, then `fillmissing` fills `A` column by column.

• If `dim=2`, then `fillmissing` fills `A` row by row.

Data Types: `double` | `single` | `int8` | `int16` | `int32` | `int64` | `uint8` | `uint16` | `uint32` | `uint64`
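
A small sketch of the two directions, using an invented 2-by-3 matrix and the `'previous'` method. Entries with no previous non-missing value along the chosen dimension stay missing.

```matlab
% Hypothetical matrix with missing entries in both rows.
A = [1 NaN 3; NaN 5 NaN];

% dim = 1: work down each column.
F1 = fillmissing(A,'previous',1);   % [1 NaN 3; 1 5 3]

% dim = 2: work along each row.
F2 = fillmissing(A,'previous',2);   % [1 1 3; NaN 5 5]
```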

### Name-Value Pair Arguments

Specify optional comma-separated pairs of `Name,Value` arguments. `Name` is the argument name and `Value` is the corresponding value. `Name` must appear inside quotes. You can specify several name and value pair arguments in any order as `Name1,Value1,...,NameN,ValueN`.

Example: `fillmissing(A,'DataVariables',{'Temperature','Altitude'})` fills only the columns corresponding to the `Temperature` and `Altitude` variables of an input table

Method for handling endpoints, specified as the comma-separated pair consisting of `'EndValues'` and one of `'extrap'`, `'previous'`, `'next'`, `'nearest'`, `'none'`, or a constant scalar value. The endpoint fill method handles leading and trailing missing values based on the following definitions:

| Method | Description |
| --- | --- |
| `'extrap'` | same as `method` |
| `'previous'` | previous non-missing value |
| `'next'` | next non-missing value |
| `'nearest'` | nearest non-missing value |
| `'none'` | no fill value |
| scalar | constant value (numeric, `duration`, and `datetime` data types only) |

Data Types: `double` | `single` | `int8` | `int16` | `int32` | `int64` | `uint8` | `uint16` | `uint32` | `uint64` | `logical` | `datetime` | `duration`
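
The following sketch contrasts several `'EndValues'` settings on one invented vector whose first and last entries are missing. The interior gap is always filled by linear interpolation, and only the endpoints change.

```matlab
% Hypothetical vector with leading, interior, and trailing missing values.
A = [NaN 2 NaN 4 NaN];

% Default ('extrap'): endpoints use the same method, here linear extrapolation.
Fextrap  = fillmissing(A,'linear');                        % [1 2 3 4 5]

% Endpoints take the nearest non-missing value.
Fnearest = fillmissing(A,'linear','EndValues','nearest');  % [2 2 3 4 4]

% Endpoints are left unfilled.
Fnone    = fillmissing(A,'linear','EndValues','none');     % [NaN 2 3 4 NaN]

% Endpoints are set to a constant scalar.
Fzero    = fillmissing(A,'linear','EndValues',0);          % [0 2 3 4 0]
```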

Sample points for fill method, specified as the comma-separated pair consisting of `'SamplePoints'` and a vector. The sample points represent the location of the data in `A`, and must be sorted and contain unique elements. Sample points do not need to be uniformly sampled. If `A` is a timetable, then the default sample points vector is the vector of row times. Otherwise, the default vector is `[1 2 3 ...]`.

Moving windows are defined relative to the sample points. For example, if `t` is a vector of times corresponding to the input data, then `fillmissing(rand(1,10),'movmean',3,'SamplePoints',t)` has a window that represents the time interval between `t(i)-1.5` and `t(i)+1.5`.

When the sample points vector has data type `datetime` or `duration`, then the moving window length must have type `duration`.

This name-value pair is not supported when the input data is a timetable.

Data Types: `double` | `single` | `datetime` | `duration`
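
To illustrate the effect of non-uniform sample points on interpolation, this sketch fills the same gap with and without `'SamplePoints'`. The data and the locations in `x` are assumptions made for the example.

```matlab
% Hypothetical data with one missing value between 0 and 8.
A = [0 NaN 8];

% Default sample points [1 2 3]: the gap sits halfway between its neighbors.
Fdefault = fillmissing(A,'linear');                    % [0 4 8]

% Non-uniform sample points: the gap sits a quarter of the way from 0 to 4.
x = [0 1 4];
Fsampled = fillmissing(A,'linear','SamplePoints',x);   % [0 2 8]
```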

Table variables to fill, specified as the comma-separated pair consisting of `'DataVariables'` and a variable name, a cell array of variable names, a numeric vector, a logical vector, or a function handle. The `'DataVariables'` value indicates which columns of the input table to fill, and can be one of the following:

• A character vector specifying a single table variable name

• A cell array of character vectors where each element is a table variable name

• A vector of table variable indices

• A logical vector whose elements each correspond to a table variable, where `true` includes the corresponding variable and `false` excludes it

• A function handle that returns a logical scalar, such as `@isnumeric`

Example: `'Age'`

Example: `{'Height','Weight'}`

Example: `@iscategorical`

Data Types: `char` | `cell` | `single` | `double` | `int8` | `int16` | `int32` | `int64` | `uint8` | `uint16` | `uint32` | `uint64` | `logical` | `function_handle`
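
Besides names and function handles, the list above allows variable indices and logical vectors. A minimal sketch with an invented two-variable table:

```matlab
% Hypothetical table with one missing value in each variable.
T = table([1;NaN;3],[NaN;5;6],'VariableNames',{'a','b'});

% Variable index: fill only the first variable.
F1 = fillmissing(T,'constant',0,'DataVariables',1);
% F1.a = [1;0;3], F1.b = [NaN;5;6]

% Logical vector: fill only the second variable.
F2 = fillmissing(T,'constant',0,'DataVariables',[false true]);
% F2.a = [1;NaN;3], F2.b = [0;5;6]
```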

Known missing indicator, specified as the comma-separated pair consisting of `'MissingLocations'` and a logical vector, matrix, or multidimensional array of the same size as `A`. The indicator elements can be `true` to indicate a missing value in the corresponding location of `A` or `false` otherwise.

Data Types: `logical`
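
One possible use of `'MissingLocations'` is data that marks missing entries with a sentinel value instead of `NaN`. In this sketch the sentinel `-999` and the variable name `isSentinel` are assumptions chosen for illustration.

```matlab
% Hypothetical data in which -999 marks a missing measurement.
A = [1 -999 3 4 -999];

% Logical mask, same size as A, flagging the entries to treat as missing.
isSentinel = (A == -999);

% Fill the flagged entries with the previous non-missing value.
F = fillmissing(A,'previous','MissingLocations',isSentinel);
% F = [1 1 3 4 4]
```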

## Output Arguments


Filled data, returned as a vector, matrix, multidimensional array, table, or timetable. `F` is the same size as `A`.

Data Types: `double` | `single` | `int8` | `int16` | `int32` | `int64` | `uint8` | `uint16` | `uint32` | `uint64` | `logical` | `char` | `string` | `cell` | `table` | `timetable` | `categorical` | `datetime` | `duration` | `calendarDuration`

Filled data indicator, returned as a vector, matrix, or multidimensional array. `TF` is a logical array where 1 (`true`) corresponds to entries in `F` that were filled and 0 (`false`) corresponds to unchanged entries. `TF` is the same size as `A` and `F`.

Data Types: `logical`
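
As a short sketch of how `TF` might be used (the data is invented), the indicator can count the filled entries, while `ismissing` reveals entries that the chosen method could not fill:

```matlab
% Hypothetical vector whose first entry has no previous value to copy.
A = [NaN 2 NaN 4];

[F,TF] = fillmissing(A,'previous');
% F  = [NaN 2 2 4]
% TF = [0 0 1 0]

numFilled    = nnz(TF);        % 1 entry was filled
stillMissing = ismissing(F);   % [1 0 0 0]: the leading NaN remains
```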
