Test Metrics in Modelscape
This example shows how to implement various test metrics in MATLAB® using Modelscape™.
Write Test Metrics
The basic building block of the Modelscape metrics framework is the mrm.data.validation.TestMetric class. This class defines the following properties:
Name: a human-readable name for the test metric.
ShortName: a concise name for accessing metrics in MetricsHandler objects. This name must be a valid MATLAB property name.
Value: the value(s) carried by the metric. The value can be a scalar or a row vector of doubles.
Keys: an n-by-m array of strings that parametrizes the values of the metric, where m is the length of Value. The keys default to an empty string.
KeyNames: a vector of strings whose length equals the height of Keys. It defaults to "Key".
Diagnostics: a free-form struct carrying any diagnostics related to the calculation of the metric.
Any subclass of TestMetric must implement a constructor and a compute method to fill in these values.
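As an illustrative sketch, a minimal subclass might look like the following. The class name, the compute signature, and the way the properties are assigned are assumptions made for illustration; consult the Modelscape reference documentation for the exact interface.

```matlab
classdef MeanErrorMetric < mrm.data.validation.TestMetric
    % Hypothetical scalar metric: the mean difference between predicted
    % and observed values. Signatures are illustrative, not definitive.
    methods
        function this = MeanErrorMetric()
            this.Name = "Mean Error";
            this.ShortName = "MeanError"; % must be a valid MATLAB property name
        end

        function this = compute(this, observed, predicted)
            this.Value = mean(predicted - observed); % scalar, so Keys keep their default ""
            this.Diagnostics = struct("NumObservations", numel(observed));
        end
    end
end
```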
For example, the Modelscape statistical parity difference (SPD) metric for bias detection has the Name "Statistical Parity Difference" and the ShortName "StatisticalParityDifference". The following table shows how the Keys and KeyNames are arranged: "SensitiveAttribute" and "Group" are the KeyNames, and the first two columns hold the attribute-group combinations that make up the Keys. The ShortName appears as the third header, and the third column of the table carries the Value of the metric.
The base class has the following overridable methods:
ComparisonValue(this): use this method to change the value against which thresholds are compared. For example, in statistical hypothesis testing, this should return the p-value associated with the computed statistic.
(this): returns by default a table as shown above for the SPD metric.
project(this): returns a restriction of a (non-scalar) metric to a subset of keys. Extend the default implementation in a subclass to cover any diagnostic or auxiliary data carried by the subclass objects.
Write Metrics With Visualizations
To write test metrics equipped with visualizations, inherit the metrics from mrm.data.validation.TestMetricWithVisualization. This class adds to the TestMetric base class the requirement to implement a visualization method with the signature fig = visualize(this, options). options allows any name-value arguments that may be useful for the given metric. For example, visualize the StatisticalParityDifference metric for a particular sensitive attribute:
spdFig = visualize(spdMetric, "SensitiveAttribute","ResStatus");
Write Metrics Projecting onto Selected Keys
The visualization above shows the SPD metrics for the ResStatus attribute only. This plot uses the project method of the TestMetric class, which restricts a metric to selected keys. For a metric with N key names, project accepts an array of up to N strings as the Keys argument. The output restricts the metric to those keys where the first key matches the first element of the array, the second key matches the second element, and so on.
spdResStatus = project(spdMetric, "Keys", "ResStatus")
Specifying both keys yields a scalar metric:
spdTenant = project(spdMetric, "Keys", ["ResStatus", "Tenant"])
The base class implementation of project does not handle diagnostics or other auxiliary data carried by a subclass. If necessary, implement this in the subclass using the secondary keySelection output of the base class method.
Write Summarizable Metrics
Summary metrics reveal a different aspect of non-scalar metrics. For the SPD metric, the "summary" SPD value is the value, across all the attribute-group pairs, with the largest deviation from zero, the completely unbiased value.
spdSummary = summary(spdMetric)
Make a given TestMetric class summarizable by inheriting from the mrm.data.validation.TestMetricWithSummaryValue class and implementing the abstract summary method. This method returns a metric of the same type with a singleton Value. The meaning of the summary value, if one exists, depends on the metric, so there is no default implementation for this method. However, the protected summaryCore method in TestMetricWithSummaryValue may be helpful.
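For a metric like SPD, a summary implementation might select the value farthest from zero. The sketch below is a hypothetical illustration: it assumes the Keys columns line up with the Value entries as described above, and that project can restrict the metric to a single key combination; the shipping summaryCore helper may do this differently.

```matlab
function summaryMetric = summary(this)
    % Hypothetical summary: keep the value with the largest deviation
    % from zero, mirroring the behavior described for the SPD metric.
    [~, idx] = max(abs(this.Value));
    keysForIdx = this.Keys(:, idx).';   % key combination for that value
    summaryMetric = project(this, "Keys", keysForIdx);
end
```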
Write Test Thresholds
Test metrics are often compared against thresholds to qualitatively assess the inputs. For example, a model validator might require that the area under the ROC curve be at least 0.8 for the model to be deemed acceptable, that values under 0.7 be treated as red flags, and that values between 0.7 and 0.8 require a closer look.
Use the Modelscape class mrm.data.validation.TestThresholds to implement these thresholds. Encode the threshold values and the corresponding classification labels into a TestThresholds object:
aurocThresholds = mrm.data.validation.TestThresholds([0.7, 0.8], ["Fail", "Undecided", "Pass"]);
These thresholds and labels govern the output of the status method of TestThresholds. For example, status(aurocThresholds, 0.72) returns a status of "Undecided", and the accompanying Comment indicates the interval to which the given input belongs.
To implement other thresholding regimes, with different narrative strings as Comments or different diagnostics, write subclasses of mrm.data.validation.TestThresholdsBase. Implement the status method of the subclass to populate the Diagnostics properties as required.
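A custom regime might look like the sketch below. The class name, the narrative strings, and the exact shape of the status output (shown here as a struct) are assumptions for illustration; the shipping TestThresholdsBase interface may differ.

```matlab
classdef AUROCThresholds < mrm.data.validation.TestThresholdsBase
    % Hypothetical thresholds subclass with bespoke narrative comments.
    methods
        function s = status(~, value)
            % Assumed output shape; the real base class may prescribe another.
            if value < 0.7
                s = struct("Status", "Fail", ...
                    "Comment", "AUROC below 0.7: red flag");
            elseif value < 0.8
                s = struct("Status", "Undecided", ...
                    "Comment", "AUROC between 0.7 and 0.8: closer look required");
            else
                s = struct("Status", "Pass", ...
                    "Comment", "AUROC at least 0.8");
            end
            s.Diagnostics = struct("Input", value);
        end
    end
end
```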
Write Statistical Hypothesis Tests
In some cases, notably in statistical hypothesis testing, the relevant quantity to compare against test thresholds is the associated p-value (under some relevant null hypothesis). In these cases, use the test metric class to override the
ComparisonValue method and return the p-value instead of the
Value of the metric. For an example, see the Modelscape implementation of the Augmented Dickey-Fuller test.
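Such an override might look like the following sketch, which assumes the metric stores its p-value in the Diagnostics struct; the class name and the pValue field are hypothetical, and the constructor and compute method are omitted for brevity.

```matlab
classdef HypothesisTestMetric < mrm.data.validation.TestMetric
    % Hypothetical sketch: constructor and compute omitted. Assumes the
    % computed p-value is stored in Diagnostics.pValue by compute.
    methods
        function v = ComparisonValue(this)
            % Compare thresholds against the p-value rather than the
            % raw test statistic stored in Value.
            v = this.Diagnostics.pValue;
        end
    end
end
```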
Set the thresholds against which to compare the p-values.
adfThreshold = mrm.data.validation.PValueThreshold(0.05)
The resulting object returns a status of "Reject" for p-values less than 0.05 and "Accept" otherwise.