plotEmpiricalCDF

Visualize empirical cumulative distribution function (ecdf) of a variable specified for drift detection

    Description

    plotEmpiricalCDF(DDiagnostics) plots the ecdf values of the baseline and target data for the continuous variable with the lowest p-value. If the data contains no continuous variables, then plotEmpiricalCDF does not generate a plot and issues a warning.

    If you set the value of EstimatePValues to false in the call to detectdrift, then plotEmpiricalCDF displays NaN for the p-value and the drift status.

    plotEmpiricalCDF(DDiagnostics,Variable=variable) plots the ecdf for the variable specified by variable.

    plotEmpiricalCDF(ax,___) plots into the axes specified by ax instead of the current axes (gca).

    St = plotEmpiricalCDF(___) plots the ecdf and returns an array of Stair objects in St. Use St to inspect and adjust the properties of the plotted ecdf lines. To learn more about the properties of the Stair object, see Stair Properties.
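
As a minimal sketch of the EstimatePValues behavior described above, the following code reuses the synthetic data from the examples and skips the p-value estimation. The variable is named explicitly here on the assumption that, without p-values, there is no lowest p-value to select by; the description does not state what the default choice is in that case.

```matlab
rng('default') % For reproducibility
baseline = [normrnd(0,1,100,1),wblrnd(1.1,1,100,1),betarnd(1,2,100,1)];
target = [normrnd(0,1,100,1),wblrnd(1.2,2,100,1),betarnd(1.7,2.8,100,1)];

% Skip the permutation-based p-value estimation
DDiagnostics = detectdrift(baseline,target,EstimatePValues=false);

% Name the variable explicitly (assumption: no default can be chosen by
% p-value here). The plot annotation is expected to show NaN for the
% p-value and the drift status.
figure
plotEmpiricalCDF(DDiagnostics,Variable="x2")
```

The ecdf curves themselves do not depend on the p-values, so the plot is still useful for a visual comparison when you skip the estimation.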

    Examples

    Generate baseline and target data with three variables, where the distribution parameters of the second and third variables change for target data.

    rng('default') % For reproducibility
    baseline = [normrnd(0,1,100,1),wblrnd(1.1,1,100,1),betarnd(1,2,100,1)];
    target = [normrnd(0,1,100,1),wblrnd(1.2,2,100,1),betarnd(1.7,2.8,100,1)];

    Perform permutation testing for all variables to check for any drift between the baseline and target data.

    DDiagnostics = detectdrift(baseline,target)
    DDiagnostics = 
      DriftDiagnostics
    
                  VariableNames: ["x1"    "x2"    "x3"]
           CategoricalVariables: []
                    DriftStatus: ["Stable"    "Drift"    "Warning"]
                        PValues: [0.3850 0.0050 0.0910]
            ConfidenceIntervals: [2x3 double]
        MultipleTestDriftStatus: "Drift"
                 DriftThreshold: 0.0500
               WarningThreshold: 0.1000
    
    
      Properties, Methods
    
    

    Plot the ecdf for the variable with the lowest p-value.

    plotEmpiricalCDF(DDiagnostics)

    Figure contains an axes object. The axes object with title ECDF for x2 contains 2 objects of type stair. These objects represent Baseline, Target.

    plotEmpiricalCDF by default plots the ecdf of the baseline and target data for the variable with the lowest p-value, which, in this case, is variable x2. You can see the difference between the two empirical cumulative distribution functions. It also displays the p-value and the drift status for variable x2.
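
To make clear what the plot represents, the following sketch draws comparable curves by hand with the ecdf and stairs functions. It is an illustration only, not the implementation of plotEmpiricalCDF, and it omits the p-value and drift status annotation. The variable names fBase, xBase, fTarget, and xTarget are chosen for this sketch.

```matlab
rng('default') % For reproducibility
baseline = [normrnd(0,1,100,1),wblrnd(1.1,1,100,1),betarnd(1,2,100,1)];
target = [normrnd(0,1,100,1),wblrnd(1.2,2,100,1),betarnd(1.7,2.8,100,1)];

% Empirical CDF of the second variable in each data set
[fBase,xBase] = ecdf(baseline(:,2));
[fTarget,xTarget] = ecdf(target(:,2));

% Draw both step functions on the same axes
figure
stairs(xBase,fBase)
hold on
stairs(xTarget,fTarget)
hold off
legend("Baseline","Target",Location="southeast")
xlabel("x2")
ylabel("Cumulative probability")
title("Manual ECDF for x2")
```

Each curve rises from 0 to 1 in steps at the observed data values, so a horizontal or vertical gap between the two curves indicates a difference between the baseline and target distributions.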

    Generate baseline and target data with three variables, where the distribution parameters of the second and third variables change for target data.

    rng('default') % For reproducibility
    baseline = [normrnd(0,1,100,1),wblrnd(1.1,1,100,1),betarnd(1,2,100,1)];
    target = [normrnd(0,1,100,1),wblrnd(1.2,2,100,1),betarnd(1.7,2.8,100,1)];

    Perform permutation testing for all variables to check for any drift between the baseline and target data.

    DDiagnostics = detectdrift(baseline,target)
    DDiagnostics = 
      DriftDiagnostics
    
                  VariableNames: ["x1"    "x2"    "x3"]
           CategoricalVariables: []
                    DriftStatus: ["Stable"    "Drift"    "Warning"]
                        PValues: [0.3850 0.0050 0.0910]
            ConfidenceIntervals: [2x3 double]
        MultipleTestDriftStatus: "Drift"
                 DriftThreshold: 0.0500
               WarningThreshold: 0.1000
    
    
      Properties, Methods
    
    

    Plot the ecdf for the third variable.

    plotEmpiricalCDF(DDiagnostics,Variable="x3")

    Figure contains an axes object. The axes object with title ECDF for x3 contains 2 objects of type stair. These objects represent Baseline, Target.

    plotEmpiricalCDF plots the ecdf for baseline and target data. It also displays the estimated p-value and the drift status for the same variable.
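
The Variable argument also accepts an integer index. The following sketch assumes that the index refers to the position of the variable in the VariableNames property, so that Variable=3 selects x3 for this data.

```matlab
rng('default') % For reproducibility
baseline = [normrnd(0,1,100,1),wblrnd(1.1,1,100,1),betarnd(1,2,100,1)];
target = [normrnd(0,1,100,1),wblrnd(1.2,2,100,1),betarnd(1.7,2.8,100,1)];
DDiagnostics = detectdrift(baseline,target);

% Select the variable by position instead of by name
% (assumption: the index follows the order of DDiagnostics.VariableNames)
figure
plotEmpiricalCDF(DDiagnostics,Variable=3)
```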

    Load the sample data.

    load humanactivity

    For details on the data set, enter Description at the command line. Assign the first 250 observations of columns 10 through 15 as baseline data and the next 250 observations as target data.

    baseline = feat(1:250,10:15);
    target = feat(251:500,10:15);

    Test for drift on all variables.

    DDiagnostics = detectdrift(baseline,target)
    DDiagnostics = 
      DriftDiagnostics
    
                  VariableNames: ["x1"    "x2"    "x3"    "x4"    "x5"    "x6"]
           CategoricalVariables: []
                    DriftStatus: ["Drift"    "Stable"    "Stable"    ...    ]
                        PValues: [1.0000e-03 0.5080 0.2370 1.0000e-03 0.5370 ... ]
            ConfidenceIntervals: [2x6 double]
        MultipleTestDriftStatus: "Drift"
                 DriftThreshold: 0.0500
               WarningThreshold: 0.1000
    
    
      Properties, Methods
    
    

    The drift statuses for variables x4 and x6 are Drift and Warning, respectively. Plot the ecdf values for x4 and x6 in a tiled layout.

    tiledlayout(1,2);
    ax1 = nexttile;
    plotEmpiricalCDF(DDiagnostics,ax1,Variable="x4")
    ax2 = nexttile;
    plotEmpiricalCDF(DDiagnostics,ax2,Variable="x6")

    Figure contains 2 axes objects. Axes object 1 with title ECDF for x4 contains 2 objects of type stair. These objects represent Baseline, Target. Axes object 2 with title ECDF for x6 contains 2 objects of type stair. These objects represent Baseline, Target.

    The ecdfs of the baseline and target data differ more for variable x4, for which detectdrift detected drift, than for variable x6.
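
The output argument syntax can be used to restyle the plotted lines. The following sketch returns the Stair objects for the synthetic data used in the first two examples and changes two documented Stair properties, LineWidth and LineStyle. The order of the objects in St is not stated in this page, so the code avoids relying on which element is the baseline and which is the target.

```matlab
rng('default') % For reproducibility
baseline = [normrnd(0,1,100,1),wblrnd(1.1,1,100,1),betarnd(1,2,100,1)];
target = [normrnd(0,1,100,1),wblrnd(1.2,2,100,1),betarnd(1.7,2.8,100,1)];
DDiagnostics = detectdrift(baseline,target);

% Plot the ecdf for x2 and keep the handles to the Stair objects
figure
St = plotEmpiricalCDF(DDiagnostics,Variable="x2");

% Thicken every line, then dash the last one so the two curves remain
% distinguishable in grayscale
set(St,"LineWidth",1.5)
St(end).LineStyle = "--";
```

Setting properties on the returned handles changes the existing plot in place, so you do not need to call plotEmpiricalCDF again.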

    Input Arguments

    DDiagnostics: Diagnostics of the permutation testing for drift detection, specified as a DriftDiagnostics object returned by detectdrift.

    Variable: Variable for which to visualize the ecdf, specified as a string, a character vector, or an integer index.

    Example: Variable="x3"

    Example: Variable=3

    Data Types: single | double | char | string

    ax: Axes for plotEmpiricalCDF to plot into, specified as an Axes or UIAxes object. If you do not specify ax, then plotEmpiricalCDF plots into the current axes. For more information on creating an axes object, see axes and uiaxes.

    Version History

    Introduced in R2022a