Percentile Plot

In the SimBiology Model Analyzer app, you can visualize time course data and its corresponding statistics using a percentile plot. The plot shows curves of summary statistics (percentiles or mean and standard deviation) over time. You can also view the raw data along with summary statistics.

You can choose between two methods to aggregate time-varying data and compute summary statistics. For details, see Interpolation Method and Time Point Binning Method.

To show a percentile plot, select a data source that contains time courses in the Browser pane, then click percentile in the Plot section on the Home tab.

Display Options

Each response in a percentile plot has three display options, which you can configure.

Percentiles — Shows the percentile curves. By default, the plot shows 5th and 95th percentiles. Scan programs with more than 40 samples use percentile plots as default plots. You can change this default cutoff in Preferences > Programs > Plots. This is the default display type for simulation data.
Mean — Shows the mean and standard deviation of response data at each time point. This is the default display type for experimental data.
Raw Data — Shows the original response data points at each time point.

Responses section with menu showing the three display

Percentiles Options

The Percentiles section provides the following options to configure the percentile curves.

Option	Description
Show percentiles (%)	Percentiles to plot. Enter one or more nonnegative numbers between 0 and 100 or any MATLAB^® expression that results in a nonnegative number or vector of values between 0 and 100 in ascending order.
Show median	Logical flag to show or hide the median line in the plot.
Display style	Display format to show lines, shading, or both.

Mean Options

The Mean section provides the following options to configure the plots.

Option	Description
Show mean	Logical flag to show the mean response value at each time point. The plot uses the marker `o` to represent mean values.
Show standard deviation	Logical flag to show ±1 standard deviation of the response value at each time point. The plot shows error bars to indicate the standard deviations.
Show min/max	Logical flag to show the derived minimum and maximum response values at each time point. To calculate these values, the app first interpolates all the time courses to a common time vector and then calculates the statistics, such as min, max, mean, and standard deviation, at each time point on the common time vector across all interpolated time courses. Hence the interpolated maximum and minimum values at a given time point shown in the percentile plot may not match those values of the raw data exactly. The plot uses the marker `*` to indicate the minimum and maximum values.
Display style	Display format to show lines, markers, or both.

Data Options

For each data source in the percentile plot, you can select the data aggregation method and related options in the corresponding data section of the Property Editor pane. The app provides two data aggregation methods: interpolation and binning. It is recommended to use interpolation for densely recorded data, such as simulated model responses. Use time point binning for sparsely recorded data, such as experimental data.

If you are plotting multiple responses or slicing the data by a covariate or parameter, the app performs data aggregation independently for each response or data slice. Different responses can have different automatic interpolation or binning results.

Each data aggregation method has its own set of options. You can change these options for each data source independently from other data sources.

Interpolation

If you select interpolation as the data aggregation method, the app interpolates all time courses onto a common time vector. The summary statistics are calculated on the interpolated data at every time point in the common time vector. For details, see Interpolation Method. You can also specify a custom common time vector and the interpolation method used for data aggregation. The next table summarizes the available options.

Option Description

Option	Description
Time vector	Common time vector onto which all time courses are interpolated before calculating the summary statistics. The default option `auto` option uses a common time vector which is a vector of equidistant points between the minimum and maximum time points specified in the data. Alternatively, specify the time vector as a sequence of numbers or MATLAB expression that evaluates to a vector of strictly increasing numbers greater than or equal to zero.
Interpolation method	Method used to interpolate time courses onto a common time vector. The app calls `interp1` with the specified method (default is `linear`). For simulation data, the app treats multiple response values at the same time point as a discontinuity and performs piecewise interpolation between such time points. For experimental data, the app treats these response values at the same time point as repeated measurements and uses the mean of all measurements at the same time point.
Show raw data fraction (%)	Percentage of raw (original) time courses or data shown in the plot. Enter a nonnegative integer between 0 and 100.

Time vector

Common time vector onto which all time courses are interpolated before calculating the summary statistics.

The default option auto option uses a common time vector which is a vector of equidistant points between the minimum and maximum time points specified in the data.

Alternatively, specify the time vector as a sequence of numbers or MATLAB expression that evaluates to a vector of strictly increasing numbers greater than or equal to zero.

Interpolation method

Method used to interpolate time courses onto a common time vector. The app calls interp1 with the specified method (default is linear). For simulation data, the app treats multiple response values at the same time point as a discontinuity and performs piecewise interpolation between such time points. For experimental data, the app treats these response values at the same time point as repeated measurements and uses the mean of all measurements at the same time point.

Show raw data fraction (%) Percentage of raw (original) time courses or data shown in the plot. Enter a nonnegative integer between 0 and 100.

For details, see Interpolation Method.

Binning

If you select binning as the data aggregation method, the app clusters the data points into different bins based on their time values. It calculates summary statistics for data within each bin and plots the statistics at the centroid of each bin, which is the mean of time values of data within that bin. For details, see Time Point Binning Method. The next table summarizes the available options.

Option	Description
Binning method	`auto` — The app scans over several possible numbers of bins and calculates the `kmeans` (Statistics and Machine Learning Toolbox) clustering solution for each of these values. It then determines the optimal numbers of bins by selecting the solution that minimizes the Davies-Bouldin Criterion (Statistics and Machine Learning Toolbox). `specify number of bins` — As an alternative to `auto`, you can specify a custom number of bins to use for the k-means algorithm. `specify bin edges` — Specify the exact edges or boundaries for each bin to avoid the `kmeans` clustering.
Show bin edges	Logical flag to display vertical lines that indicate the bin boundaries.
Show raw data fraction (%)	Percentage of raw (original) time courses or data shown in the plot. Enter an integer between 0 and 100.

For details, see Time Point Binning Method.

Interpolation Method

When you are using interpolation as the data aggregation method, the app calculates summary statistics using the following steps.

The app calculates a common time vector as a vector of equidistant time points between the minimum and maximum time points in the data across all the groups (or runs) in each data slice or obtains the time vector by using the code specified in the Time vector option.
It then interpolates the response time course for each group or run onto the common time vector using interp1 with the method specified in the Interpolation method option. For simulation data, the app treats multiple response values at the same time point as a discontinuity and performs piecewise interpolation between such time points. For experimental data, the app treats data at the same time point as repeated measurements and uses the mean of all measurements at those time points.
The app then calculates the corresponding statistics, such as percentiles, mean, max, standard deviation, for each time point in the common time vector across all groups for that time point in the interpolated time courses.
Note
Because of interpolation, calculated maximum and minimum values might be different than those values from the original data.
It then generates a plot using the calculated statistics against the common time vector according to the Display style option.

Time Point Binning Method

When you are using binning as the data aggregation method, the app calculates summary statistics using the following steps.

The app partitions the data into n bins using only the time values for each data point. It does not consider any similarities in measurement values. By default, the data is binned using the kmeans (Statistics and Machine Learning Toolbox) algorithm, and you can also specify a custom number of bins or specific bin edges (or boundaries).
The app calculates summary statistics for each bin.
It obtains the common time vector by calculating the mean time value for each bin.
The app then generates a plot using the computed statistics from step 2 against the mean time value for each bin from step 3 according to the Display style option.