Screen Risk Factors
Remove risk factors from data in Modelscape
Use the Modelscape™ Screen Risk Factors task to automatically remove risk factors from a data table based on their predictive power relative to a binary response variable. Feature selection is an important step in the development of a statistical model. Input data can have hundreds or thousands of variables, and discarding some variables often improves model interpretability, training times, and other important attributes. The task automatically generates MATLAB® code for your live script. This task requires the Modelscape for MATLAB support package.
Using this task, you can:
Inspect summary statistics and histograms for variables in a data table.
Use customizable screening criteria to analyze the predictive power of variables.
Remove variables from a data table and record the corresponding reason for exclusion.
Record reasons for including variables in a data table.
Export the resulting subtables to MATLAB desktop.
For general information about Live Editor tasks, see Add Interactive Tasks to a Live Script.
Open the Screen Risk Factors Task
To add the Screen Risk Factors task to a live script in the MATLAB Editor:
On the Live Editor tab, select Task > Screen Risk Factors.
In a code block in the script, type a relevant keyword, then select Screen Risk Factors from the suggested command completions.
Input table — Table of input data to inspect
table of input data containing variables to inspect
Input table must be a MATLAB table or a timetable. The columns of Input table contain the variables recorded for each data point, for example, Residence Status or Customer ID.
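As a minimal sketch of a suitable input, the following code builds a small table in base MATLAB and prints its summary statistics. The variable names and values are hypothetical and chosen only for illustration; the code does not use the Modelscape task itself.

```matlab
% Hypothetical example data: each column is a variable, each row a data point.
CustomerID = (1:6)';
ResidenceStatus = categorical(["Owner"; "Tenant"; "Owner"; "Other"; "Tenant"; "Owner"]);
Income = [52000; 31000; 78000; 45000; 29000; 66000];
Default = [0; 1; 0; 0; 1; 0];   % binary response (assumed name)

inputTable = table(CustomerID, ResidenceStatus, Income, Default);

% Confirm that the data is a table or a timetable.
isValidInput = istable(inputTable) || istimetable(inputTable);

% Print per-variable summary statistics to the Command Window.
summary(inputTable)
```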
Response variable — Binary variable in table
binary variable to use for prediction
Response variable must be a binary variable in the input table. The task evaluates the risk factors in the input data table based on their power to predict this response variable.
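Before selecting a response variable, it can help to confirm that the column takes exactly two distinct values. The following is a minimal sketch in base MATLAB; the table contents and the name Default are assumptions made for illustration.

```matlab
% Hypothetical data with an assumed binary response named Default.
inputTable = table([0; 1; 0; 0; 1; 0], [52; 31; 78; 45; 29; 66], ...
    'VariableNames', {'Default', 'Income'});

responseName = 'Default';
response = inputTable.(responseName);

% Ignore missing entries, then count the distinct values that remain.
levels = unique(response(~ismissing(response)));
isBinary = numel(levels) == 2;
```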
Criteria — Screening criteria to apply to input variables
Criteria must be an object containing the criteria against which to screen the input variables. You can use the predefined criteria or customize your own screening criteria. For more details, see Screen Risk Factors by Custom Criteria.
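To illustrate the general idea of screening by predictive power, the sketch below flags numeric variables whose absolute correlation with the binary response falls below a cutoff. This is a stand-in written in base MATLAB, not the criteria that the task applies; the data, the correlation measure, and the threshold value are all assumptions.

```matlab
% Hypothetical data: Income relates strongly to Default, Noise only weakly.
Income  = [52; 31; 78; 45; 29; 66; 30; 70];
Noise   = [1; 1; 2; 2; 1; 1; 2; 2];
Default = [0; 1; 0; 0; 1; 0; 1; 0];
T = table(Income, Noise, Default);

% Every variable except the response is a candidate risk factor.
predictors = setdiff(T.Properties.VariableNames, {'Default'}, 'stable');
threshold = 0.5;   % hypothetical cutoff, not a Modelscape default

% Score each candidate by its absolute correlation with the response.
score = zeros(1, numel(predictors));
for k = 1:numel(predictors)
    R = corrcoef(T.(predictors{k}), T.Default);
    score(k) = abs(R(1, 2));
end

% Variables scoring below the cutoff are marked for exclusion.
toExclude = predictors(score < threshold);
```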
Filtered table — Display table of filtered variables
check box to display subtable without excluded variables
Select the Filtered table check box to display the subtable that remains after the excluded variables are removed. The filtered table contains the columns from the Input table without the variables that you mark for exclusion.
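The relationship between the input table and the filtered table can be sketched with the base MATLAB function removevars. The table and the list of excluded variables here are hypothetical, and the code that the task generates may differ.

```matlab
% Hypothetical input table.
inputTable = table((1:4)', [52; 31; 78; 45], [0; 1; 0; 0], ...
    'VariableNames', {'CustomerID', 'Income', 'Default'});

% Variables marked for exclusion (assumed for illustration).
excludedVars = {'CustomerID'};

% The filtered table keeps every column except the excluded ones.
filteredTable = removevars(inputTable, excludedVars);
```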
Preview summary tables — Display tables of summary
check box to display two tables with summaries of variables and progress
Select the Preview summary tables check box to display two tables of additional information about the feature selection process. The exclusionSummaryPreview table includes all the data of the input table together with the exclusion flags and comments that you record in the task. The progressSummaryPreview table shows the total number of variables that are present, excluded, included, and commented on.
Introduced in R2021b