## Wallaroo SDK Essentials Guide: Assays Management

How to create and manage Wallaroo Assays through the Wallaroo SDK

## Model Insights and Interactive Analysis Introduction

Wallaroo provides the ability to perform interactive analysis so organizations can explore the data from a pipeline and learn how the data is behaving. With this information and the knowledge of your particular business use case you can then choose appropriate thresholds for persistent automatic assays as desired.

• IMPORTANT NOTE: Model insights operates over time and is difficult to demo in a notebook without pre-canned data. We assume you have an active pipeline that has been running and making predictions over time and show you the code you may use to analyze your pipeline.

Assays are monitoring tasks that compare a model’s predictions, or the data coming into the model, against an established baseline. Changes in the distribution of this data can indicate model drift or a change in the environment that the model was trained for. This can signal whether a model needs to be retrained or whether the environment data should be analyzed for accuracy or other needs.

### Assay Details

Assays contain the following attributes:

| Attribute | Default | Description |
|---|---|---|
| Name | | The name of the assay. Assay names must be unique. |
| Baseline Data | | Data that is known to be “typical” (typically distributed) and can be used to determine whether the distribution of new data has changed. |
| Schedule | Every 24 hours at 1 AM | New assays are configured to run a new analysis every 24 hours starting at the end of the baseline period. This period can be configured through the SDK. |
| Group Results | Daily | Groups assay results into groups based on either Daily (the default), Weekly, or Monthly. |
| Metric | PSI | Population Stability Index (PSI) is an entropy-based measure of the difference between distributions. Maximum Difference of Bins measures the maximum difference between the baseline and current distributions (as estimated using the bins). Sum of the Difference of Bins sums up the difference of occurrences in each bin between the baseline and current distributions. |
| Threshold | 0.1 | The threshold for deciding whether the difference between distributions, as evaluated by the above metric, is large (the distributions are different) or small (the distributions are similar). The default of 0.1 is generally a good threshold when using PSI as the metric. |
| Number of Bins | 5 | Sets the number of bins that will be used to partition the baseline data for comparison against how future data falls into these bins. By default, the binning scheme is percentile (quantile) based. The binning scheme can be configured (see Bin Mode, below). Note that the total number of bins includes the set number plus the left_outlier and the right_outlier, so the total number of bins is the set number + 2. |
| Bin Mode | Quantile | Sets the binning scheme. Quantile binning defines the bins using percentile ranges (each bin holds the same percentage of the baseline data). Equal binning defines the bins using equally spaced data value ranges, like a histogram. Custom allows users to set the range of values for each bin, with the Left Outlier always starting at Min (below the minimum values detected from the baseline) and the Right Outlier always ending at Max (above the maximum values detected from the baseline). |
| Bin Weight | Equally Weighted | The bin weights can be set either to Equally Weighted (the default), where each bin is weighted equally, or to Custom, where the bin weights can be adjusted depending on which are considered more important for detecting model drift. |
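
To make the three metrics in the table concrete, the following is a minimal sketch in plain Python that computes them from two lists of bin densities. It does not use the Wallaroo SDK. The function names, the epsilon floor for empty bins, and the use of absolute differences are assumptions made for illustration, and the SDK's own implementation may scale or handle edge cases differently.

```python
import math

def psi(baseline, window, eps=1e-6):
    """Population Stability Index: sum of (w - b) * ln(w / b) over bins."""
    total = 0.0
    for b, w in zip(baseline, window):
        # Assumption: floor empty bins at eps so the logarithm is defined.
        b, w = max(b, eps), max(w, eps)
        total += (w - b) * math.log(w / b)
    return total

def sum_diff(baseline, window):
    """Sum of the absolute per-bin differences."""
    return sum(abs(w - b) for b, w in zip(baseline, window))

def max_diff(baseline, window):
    """Largest absolute per-bin difference."""
    return max(abs(w - b) for b, w in zip(baseline, window))

baseline = [0.2, 0.2, 0.2, 0.2, 0.2]  # five quantile bins hold 20% each
window = [0.1, 0.2, 0.2, 0.2, 0.3]    # hypothetical new data, shifted right

print(round(psi(baseline, window), 4))
print(round(sum_diff(baseline, window), 4))
print(round(max_diff(baseline, window), 4))
```

In this hypothetical case the PSI works out to 0.1 × ln(3), roughly 0.11, which would sit just above the default threshold of 0.1.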

## Manage Assays via the Wallaroo SDK

### List Assays

Assays are listed through the client.list_assays method, which returns a List object.

The following example shows how to list assays:

wl.list_assays()

| name | active | status | warning_threshold | alert_threshold | pipeline_name |
|---|---|---|---|---|---|
| Sample Assay 03 | True | {"run_at": "2022-12-13T21:08:12.289359005+00:00", "num_ok": 16, "num_warnings": 0, "num_alerts": 14} | None | 0.1 | housepricepipe |
| Sample Assay 02 | True | {"run_at": "2022-12-13T17:34:31.148302668+00:00", "num_ok": 15, "num_warnings": 0, "num_alerts": 14} | None | 0.1 | housepricepipe |
| Sample Assay 01 | True | {"run_at": "2022-12-13T17:30:18.779095344+00:00", "num_ok": 16, "num_warnings": 0, "num_alerts": 14} | None | 0.1 | housepricepipe |

### Build Assay Via the Wallaroo SDK

Assays are built with the Wallaroo client.build_assay(assayName, pipeline, modelName, baselineStart, baselineEnd) method, which returns a wallaroo.assay_config.AssayBuilder object. The method requires the following parameters:

| Parameter | Type | Description |
|---|---|---|
| assayName | String | The human friendly name of the created assay. |
| pipeline | Wallaroo.pipeline | The pipeline the assay is assigned to. |
| modelName | String | The model to perform the assay on. |
| baselineStart | DateTime | When to start the baseline period. |
| baselineEnd | DateTime | When to end the baseline period. |

When called, this method polls the pipeline between the baseline start and end to establish which values are considered normal outputs for the specified model.

By default, assays run a new analysis every 24 hours starting at the end of the baseline period, using a 24 hour observation window.

In this example, an assay will be created named example assay and stored into the variable assay_builder.

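import datetime

# Baseline period: inference data from this span defines "normal" output.
baseline_start = datetime.datetime.fromisoformat('2022-01-01T00:00:00+00:00')
baseline_end = datetime.datetime.fromisoformat('2022-01-02T00:00:00+00:00')
# Last day of data to analyze; later examples pass it to add_run_until.
last_day = datetime.datetime.fromisoformat('2022-02-01T00:00:00+00:00')

# wl is the Wallaroo client; pipeline and model_name are assumed to be defined already.
assay_name = "example assay"
assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end)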


### Schedule Assay

By default, assays are scheduled to run every 24 hours starting immediately after the baseline period ends. This scheduled period is referred to as the assay window and has the following properties:

• width: The period of data included in the analysis. By default this is 24 hours.
• interval: How often the analysis is run (every 5 minutes, every 24 hours, etc). By default this is the window width.
• start: When the analysis should start. By default this is at the end of the baseline period.

These are adjusted through the assay builder's window_builder method, which provides the following methods:

• add_width: Sets the width of the window.
• add_interval: Sets how often the analysis is run.

In this example, the assay will be set to run an analysis every 12 hours on the previous 24 hours of data:

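assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)

# Analyze 24 hours of data every 12 hours.
# The hours= keyword form is an assumption; check the SDK reference for the exact signature.
assay_builder.window_builder().add_width(hours=24).add_interval(hours=12)

assay_config = assay_builder.build()

assay_results = assay_config.interactive_run()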
print(f"Generated {len(assay_results)} analyses")

Generated 59 analyses
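
To illustrate how width and interval interact, here is a small standalone sketch that lists the analysis windows between a start time and an end time. It does not call the Wallaroo SDK. The helper name assay_windows is hypothetical, and two details are assumptions: each window covers the data that follows its start time, and a final partial window is dropped. The SDK may align windows differently.

```python
from datetime import datetime, timedelta, timezone

def assay_windows(start, run_until, width, interval=None):
    """Yield (window_start, window_end) pairs for each scheduled analysis."""
    interval = interval or width  # by default the interval equals the width
    current = start
    # Assumption: only windows that fit entirely before run_until are kept.
    while current + width <= run_until:
        yield current, current + width
        current += interval

baseline_end = datetime(2022, 1, 2, tzinfo=timezone.utc)
run_until = datetime(2022, 1, 5, tzinfo=timezone.utc)

# Default schedule: 24 hour windows, one every 24 hours.
daily = list(assay_windows(baseline_end, run_until, timedelta(hours=24)))
# Overlapping schedule: 24 hour windows, one every 12 hours.
overlapping = list(assay_windows(baseline_end, run_until,
                                 timedelta(hours=24), timedelta(hours=12)))

print(len(daily), len(overlapping))  # 3 5
```

With an interval shorter than the width, consecutive windows overlap, so each record is analyzed more than once.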


### Perform Interactive Baseline

Interactive baselines can be run against an assay to summarize the values that are established in the baseline. This is done by calling interactive_baseline_run() on the built assay configuration (assay_builder.build()); calling baseline_stats() on the result reports the following:

| Field | Type | Description |
|---|---|---|
| count | Integer | The number of records evaluated. |
| min | Float | The minimum value found. |
| max | Float | The maximum value found. |
| mean | Float | The mean value derived from the values evaluated. |
| median | Float | The median value derived from the values evaluated. |
| std | Float | The standard deviation from the values evaluated. |
| start | DateTime | The start date for the records to evaluate. |
| end | DateTime | The end date for the records to evaluate. |

In this example, an interactive baseline will be run against a new assay, and the results displayed:

baseline_run = assay_builder.build().interactive_baseline_run()
baseline_run.baseline_stats()

Baseline
count                   1813
min                    11.95
max                    15.08
mean                   12.95
median                 12.91
std                     0.46
start   2022-01-01T00:00:00Z
end     2022-01-02T00:00:00Z
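
The numeric fields in this output are ordinary summary statistics. As a rough sketch of what they mean, the snippet below computes the same fields for a short list of hypothetical values using only the Python standard library. The helper name summarize and the choice of the sample standard deviation are assumptions; the guide does not say which form the SDK uses.

```python
import statistics

def summarize(values):
    """Return the summary fields reported for a baseline."""
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        # Assumption: sample standard deviation (n - 1 in the denominator).
        "std": statistics.stdev(values),
    }

# Hypothetical model outputs collected during a baseline period.
values = [12.1, 12.8, 12.9, 13.0, 13.7]
stats = summarize(values)

for field, value in stats.items():
    print(f"{field:<8}{value:>8.2f}")
```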


### Display Assay Graphs

Histogram, kernel density estimate (KDE), and empirical cumulative distribution (ECDF) charts can be generated from an assay to provide a visual representation of the values evaluated and where they fit within the established baseline.

These methods are part of the AssayBuilder object and are as follows:

| Method | Description |
|---|---|
| baseline_histogram() | Creates a histogram chart from the assay baseline. |
| baseline_kde() | Creates a kernel density estimate (KDE) chart from the assay baseline. |
| baseline_ecdf() | Creates an empirical cumulative distribution (ECDF) chart from the assay baseline. |

In this example, each of the three different charts will be generated from an assay:

assay_builder.baseline_histogram()

assay_builder.baseline_kde()

assay_builder.baseline_ecdf()
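
For readers unfamiliar with the ECDF chart, the sketch below shows the quantity it plots: for each distinct value, the fraction of baseline records at or below that value. This is a standalone illustration with hypothetical data, not the SDK's plotting code, and the function name ecdf is an assumption.

```python
from bisect import bisect_right

def ecdf(values):
    """Return (value, fraction of data <= value) for each distinct value."""
    ordered = sorted(values)
    n = len(ordered)
    return [(v, bisect_right(ordered, v) / n) for v in sorted(set(ordered))]

# Hypothetical baseline outputs; 12.9 appears twice.
points = ecdf([12.9, 12.1, 13.0, 12.9])
print(points)  # [(12.1, 0.25), (12.9, 0.75), (13.0, 1.0)]
```

The curve always ends at 1.0, which makes it convenient for comparing two distributions that contain different numbers of records.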


### Run Interactive Assay

Instead of waiting for the next scheduled run, users can run an assay interactively through the wallaroo.assay_config.interactive_run method. The assay configuration is usually created through the wallaroo.client.build_assay method, which returns a wallaroo.assay_config.AssayBuilder object.

The following example creates the AssayBuilder object, builds the assay configuration, and then runs an interactive assay.

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end)
# Build the assay configuration, then run it interactively.
assay_config = assay_builder.build()
assay_results = assay_config.interactive_run()

# Convert the results to a DataFrame for display.
assay_df = assay_results.to_dataframe()
assay_df

| | assay_id | name | iopath | score | start | min | max | mean | median | std | warning_threshold | alert_threshold | status |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | None | | output dense_2 0 | 0.00 | 2023-01-02T00:00:00+00:00 | 12.05 | 14.71 | 12.97 | 12.90 | 0.48 | None | 0.25 | Ok |
| 1 | None | | output dense_2 0 | 0.09 | 2023-01-03T00:00:00+00:00 | 12.04 | 14.65 | 12.96 | 12.93 | 0.41 | None | 0.25 | Ok |
| 2 | None | | output dense_2 0 | 0.04 | 2023-01-04T00:00:00+00:00 | 11.87 | 14.02 | 12.98 | 12.95 | 0.46 | None | 0.25 | Ok |
| 3 | None | | output dense_2 0 | 0.06 | 2023-01-05T00:00:00+00:00 | 11.92 | 14.46 | 12.93 | 12.87 | 0.46 | None | 0.25 | Ok |
| 4 | None | | output dense_2 0 | 0.02 | 2023-01-06T00:00:00+00:00 | 12.02 | 14.15 | 12.95 | 12.90 | 0.43 | None | 0.25 | Ok |
| 5 | None | | output dense_2 0 | 0.03 | 2023-01-07T00:00:00+00:00 | 12.18 | 14.58 | 12.96 | 12.93 | 0.44 | None | 0.25 | Ok |
| 6 | None | | output dense_2 0 | 0.02 | 2023-01-08T00:00:00+00:00 | 12.01 | 14.60 | 12.92 | 12.90 | 0.46 | None | 0.25 | Ok |
| 7 | None | | output dense_2 0 | 0.04 | 2023-01-09T00:00:00+00:00 | 12.01 | 14.40 | 13.00 | 12.97 | 0.45 | None | 0.25 | Ok |
| 8 | None | | output dense_2 0 | 0.06 | 2023-01-10T00:00:00+00:00 | 11.99 | 14.79 | 12.94 | 12.91 | 0.46 | None | 0.25 | Ok |
| 9 | None | | output dense_2 0 | 0.02 | 2023-01-11T00:00:00+00:00 | 11.90 | 14.66 | 12.91 | 12.88 | 0.45 | None | 0.25 | Ok |
| 10 | None | | output dense_2 0 | 0.02 | 2023-01-12T00:00:00+00:00 | 11.96 | 14.82 | 12.94 | 12.90 | 0.46 | None | 0.25 | Ok |
| 11 | None | | output dense_2 0 | 0.03 | 2023-01-13T00:00:00+00:00 | 12.07 | 14.61 | 12.96 | 12.93 | 0.47 | None | 0.25 | Ok |
| 12 | None | | output dense_2 0 | 0.15 | 2023-01-14T00:00:00+00:00 | 12.00 | 14.20 | 13.06 | 13.03 | 0.43 | None | 0.25 | Ok |
| 13 | None | | output dense_2 0 | 2.92 | 2023-01-15T00:00:00+00:00 | 12.74 | 15.62 | 14.00 | 14.01 | 0.57 | None | 0.25 | Alert |
| 14 | None | | output dense_2 0 | 7.89 | 2023-01-16T00:00:00+00:00 | 14.64 | 17.19 | 15.91 | 15.87 | 0.63 | None | 0.25 | Alert |
| 15 | None | | output dense_2 0 | 8.87 | 2023-01-17T00:00:00+00:00 | 16.60 | 19.23 | 17.94 | 17.94 | 0.63 | None | 0.25 | Alert |
| 16 | None | | output dense_2 0 | 8.87 | 2023-01-18T00:00:00+00:00 | 18.67 | 21.29 | 20.01 | 20.04 | 0.64 | None | 0.25 | Alert |
| 17 | None | | output dense_2 0 | 8.87 | 2023-01-19T00:00:00+00:00 | 20.72 | 23.57 | 22.17 | 22.18 | 0.65 | None | 0.25 | Alert |
| 18 | None | | output dense_2 0 | 8.87 | 2023-01-20T00:00:00+00:00 | 23.04 | 25.72 | 24.32 | 24.33 | 0.66 | None | 0.25 | Alert |
| 19 | None | | output dense_2 0 | 8.87 | 2023-01-21T00:00:00+00:00 | 25.06 | 27.67 | 26.48 | 26.49 | 0.63 | None | 0.25 | Alert |
| 20 | None | | output dense_2 0 | 8.87 | 2023-01-22T00:00:00+00:00 | 27.21 | 29.89 | 28.63 | 28.58 | 0.65 | None | 0.25 | Alert |
| 21 | None | | output dense_2 0 | 8.87 | 2023-01-23T00:00:00+00:00 | 29.36 | 32.18 | 30.82 | 30.80 | 0.67 | None | 0.25 | Alert |
| 22 | None | | output dense_2 0 | 8.87 | 2023-01-24T00:00:00+00:00 | 31.56 | 34.35 | 32.98 | 32.98 | 0.65 | None | 0.25 | Alert |
| 23 | None | | output dense_2 0 | 8.87 | 2023-01-25T00:00:00+00:00 | 33.68 | 36.44 | 35.14 | 35.14 | 0.66 | None | 0.25 | Alert |
| 24 | None | | output dense_2 0 | 8.87 | 2023-01-26T00:00:00+00:00 | 35.93 | 38.51 | 37.31 | 37.33 | 0.65 | None | 0.25 | Alert |
| 25 | None | | output dense_2 0 | 3.69 | 2023-01-27T00:00:00+00:00 | 12.06 | 39.91 | 29.29 | 38.65 | 12.66 | None | 0.25 | Alert |
| 26 | None | | output dense_2 0 | 0.05 | 2023-01-28T00:00:00+00:00 | 11.87 | 13.88 | 12.92 | 12.90 | 0.38 | None | 0.25 | Ok |
| 27 | None | | output dense_2 0 | 0.10 | 2023-01-29T00:00:00+00:00 | 12.02 | 14.36 | 12.98 | 12.96 | 0.38 | None | 0.25 | Ok |
| 28 | None | | output dense_2 0 | 0.11 | 2023-01-30T00:00:00+00:00 | 11.99 | 14.44 | 12.89 | 12.88 | 0.37 | None | 0.25 | Ok |
| 29 | None | | output dense_2 0 | 0.01 | 2023-01-31T00:00:00+00:00 | 12.00 | 14.64 | 12.92 | 12.89 | 0.40 | None | 0.25 | Ok |
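
The status column in these results follows from comparing each score with the thresholds. A minimal sketch of that comparison is shown below. The function name assay_status, the "Warning" label, and the use of a strict greater-than comparison are assumptions for illustration; only the Ok and Alert outcomes appear in the results above.

```python
def assay_status(score, alert_threshold, warning_threshold=None):
    """Label one analysis by comparing its score with the thresholds."""
    # Assumption: a score strictly above a threshold triggers that level.
    if score > alert_threshold:
        return "Alert"
    if warning_threshold is not None and score > warning_threshold:
        return "Warning"  # hypothetical label; no warning rows appear above
    return "Ok"

# Scores taken from rows 12 and 13 of the results above (alert_threshold 0.25).
print(assay_status(0.15, 0.25))  # Ok
print(assay_status(2.92, 0.25))  # Alert
```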

### Bins

As defined under Assay Details, bins can be adjusted by number of bins, bin mode, and bin weight.

#### Number of Bins

The number of bins can be changed from the default of 5 through the wallaroo.assay_config.summarizer_builder.add_num_bins method. Note that the total number of bins includes the set bins plus the left_outlier and right_outlier bins, so the total number of bins is the set number of bins + 2.

The following example changes the number of bins to 10 in an assay, then displays the assay results in a chart with 12 bins in total (10 manually set, 1 left_outlier, 1 right_outlier).

assay_builder = wl.build_assay("Test Assay", pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
# Use 10 bins instead of the default 5 (summarizer_builder is assumed to be an attribute of the builder).
assay_builder.summarizer_builder.add_num_bins(10)
assay_results = assay_builder.build().interactive_run()
display(assay_results[1].compare_bins())
assay_results[1].chart()

| | b_edges | b_edge_names | b_aggregated_values | b_aggregation | w_edges | w_edge_names | w_aggregated_values | w_aggregation | diff_in_pcts |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 11.95 | left_outlier | 0.00 | Density | 11.95 | left_outlier | 0.00 | Density | 0.00 |
| 1 | 12.40 | q_10 | 0.10 | Density | 12.40 | e_1.24e1 | 0.10 | Density | 0.00 |
| 2 | 12.56 | q_20 | 0.10 | Density | 12.56 | e_1.26e1 | 0.09 | Density | -0.01 |
| 3 | 12.70 | q_30 | 0.10 | Density | 12.70 | e_1.27e1 | 0.09 | Density | -0.01 |
| 4 | 12.81 | q_40 | 0.10 | Density | 12.81 | e_1.28e1 | 0.10 | Density | 0.00 |
| 5 | 12.91 | q_50 | 0.10 | Density | 12.91 | e_1.29e1 | 0.12 | Density | 0.02 |
| 6 | 13.01 | q_60 | 0.10 | Density | 13.01 | e_1.30e1 | 0.08 | Density | -0.02 |
| 7 | 13.15 | q_70 | 0.10 | Density | 13.15 | e_1.31e1 | 0.12 | Density | 0.02 |
| 8 | 13.31 | q_80 | 0.10 | Density | 13.31 | e_1.33e1 | 0.09 | Density | -0.01 |
| 9 | 13.56 | q_90 | 0.10 | Density | 13.56 | e_1.36e1 | 0.11 | Density | 0.01 |
| 10 | 15.08 | q_100 | 0.10 | Density | 15.08 | e_1.51e1 | 0.09 | Density | -0.01 |
| 11 | NaN | right_outlier | 0.00 | Density | NaN | right_outlier | 0.00 | Density | 0.00 |
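
The "set number + 2" rule can be seen by counting values against a list of bin edges. In the sketch below, which is standalone Python rather than SDK code, values below the first edge go to the left_outlier bin and values above the last edge go to the right_outlier bin. The function name bin_counts is hypothetical, and treating each inner bin as closed on its upper edge is an assumption about edge handling.

```python
from bisect import bisect_left

def bin_counts(values, edges):
    """Count values per bin: len(edges) - 1 inner bins plus two outlier bins."""
    counts = [0] * (len(edges) + 1)
    for v in values:
        if v < edges[0]:
            counts[0] += 1    # left_outlier: below the lowest edge
        elif v > edges[-1]:
            counts[-1] += 1   # right_outlier: above the highest edge
        else:
            # Inner bin i covers (edges[i-1], edges[i]]; the minimum goes in bin 1.
            counts[max(1, bisect_left(edges, v))] += 1
    return counts

edges = [12.0, 13.0, 14.0, 15.0]  # 4 edges define 3 inner bins
values = [11.5, 12.0, 12.5, 13.0, 13.5, 14.2, 15.0, 15.5]  # hypothetical data

counts = bin_counts(values, edges)
print(counts)       # [1, 3, 1, 2, 1]
print(len(counts))  # 5 bins: 3 inner bins + 2 outlier bins
```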

#### Bin Mode

Assays support the following binning modes:

• BinMode.QUANTILE (Default): Defines the bins using percentile ranges (each bin holds the same percentage of the baseline data).
• BinMode.EQUAL: Defines the bins using equally spaced data value ranges, like a histogram.
• BinMode.PROVIDED (Custom): Allows users to set the range of values for each bin, with the Left Outlier always starting at Min (below the minimum values detected from the baseline) and the Right Outlier always ending at Max (above the maximum values detected from the baseline). When using BinMode.PROVIDED the edges are passed as an array value.
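
The difference between quantile and equal binning is easiest to see on skewed data. The following standalone sketch computes both sets of edges for a small hypothetical sample with one large value, using only the Python standard library. The helper names and the "inclusive" quantile method are assumptions; the SDK may compute its percentiles differently.

```python
import statistics

def equal_edges(values, num_bins):
    """Edges that split the value range into equally wide bins."""
    lo, hi = min(values), max(values)
    step = (hi - lo) / num_bins
    return [lo + i * step for i in range(num_bins + 1)]

def quantile_edges(values, num_bins):
    """Edges chosen so each bin holds about the same share of the data."""
    inner = statistics.quantiles(values, n=num_bins, method="inclusive")
    return [min(values)] + inner + [max(values)]

# Hypothetical baseline with a single large value.
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 100]

e_edges = equal_edges(values, 5)
q_edges = quantile_edges(values, 5)
print(e_edges)
print(q_edges)

# Count how many values land in the first bin under each scheme.
print(sum(1 for v in values if v <= e_edges[1]))  # 9
print(sum(1 for v in values if v <= q_edges[1]))  # 2
```

With equal binning, nine of the ten values fall into the first bin, while quantile binning spreads them evenly. This is one reason quantile binning is a reasonable default for a baseline.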

Bin modes are set through the wallaroo.assay_config.summarizer_builder.add_bin_mode method.

The following examples will demonstrate changing the bin mode to equal, then setting custom provided values.

import random
import string
from wallaroo.assay_config import BinMode  # import path assumed from the module named above

prefix = ''.join(random.choice(string.ascii_lowercase) for i in range(4))

assay_name = f"{prefix}example assay"

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
assay_builder.summarizer_builder.add_bin_mode(BinMode.EQUAL)  # equally spaced bins
assay_results = assay_builder.build().interactive_run()
display(assay_results[0].compare_bins())
assay_results[0].chart()

| | b_edges | b_edge_names | b_aggregated_values | b_aggregation | w_edges | w_edge_names | w_aggregated_values | w_aggregation | diff_in_pcts |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 12.00 | left_outlier | 0.00 | Density | 12.00 | left_outlier | 0.00 | Density | 0.00 |
| 1 | 12.60 | p_1.26e1 | 0.24 | Density | 12.60 | e_1.26e1 | 0.24 | Density | 0.00 |
| 2 | 13.19 | p_1.32e1 | 0.49 | Density | 13.19 | e_1.32e1 | 0.48 | Density | -0.02 |
| 3 | 13.78 | p_1.38e1 | 0.22 | Density | 13.78 | e_1.38e1 | 0.22 | Density | -0.00 |
| 4 | 14.38 | p_1.44e1 | 0.04 | Density | 14.38 | e_1.44e1 | 0.06 | Density | 0.02 |
| 5 | 14.97 | p_1.50e1 | 0.01 | Density | 14.97 | e_1.50e1 | 0.01 | Density | 0.00 |
| 6 | NaN | right_outlier | 0.00 | Density | NaN | right_outlier | 0.00 | Density | 0.00 |

baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Equal
aggregation = Density
metric = PSI
weighted = False
score = 0.011074287819376092
scores = [0.0, 7.3591419975306595e-06, 0.000773779195360713, 8.538514991838585e-05, 0.010207597078872246, 1.6725322721660374e-07, 0.0]
index = None

edges = [11.0, 12.0, 13.0, 14.0, 15.0, 16.0]
assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
# Pass the custom edges along with BinMode.PROVIDED (argument order is an assumption).
assay_builder.summarizer_builder.add_bin_mode(BinMode.PROVIDED, edges)
assay_results = assay_builder.build().interactive_run()
display(assay_results[0].compare_bins())
assay_results[0].chart()

| | b_edges | b_edge_names | b_aggregated_values | b_aggregation | w_edges | w_edge_names | w_aggregated_values | w_aggregation | diff_in_pcts |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 11.00 | left_outlier | 0.00 | Density | 11.00 | left_outlier | 0.00 | Density | 0.00 |
| 1 | 12.00 | e_1.20e1 | 0.00 | Density | 12.00 | e_1.20e1 | 0.00 | Density | 0.00 |
| 2 | 13.00 | e_1.30e1 | 0.62 | Density | 13.00 | e_1.30e1 | 0.59 | Density | -0.03 |
| 3 | 14.00 | e_1.40e1 | 0.36 | Density | 14.00 | e_1.40e1 | 0.35 | Density | -0.00 |
| 4 | 15.00 | e_1.50e1 | 0.02 | Density | 15.00 | e_1.50e1 | 0.06 | Density | 0.03 |
| 5 | 16.00 | e_1.60e1 | 0.00 | Density | 16.00 | e_1.60e1 | 0.00 | Density | 0.00 |
| 6 | NaN | right_outlier | 0.00 | Density | NaN | right_outlier | 0.00 | Density | 0.00 |

baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Provided
aggregation = Density
metric = PSI
weighted = False
score = 0.0321620386600679
scores = [0.0, 0.0, 0.0014576920813015586, 3.549754401142936e-05, 0.030668849034754912, 0.0, 0.0]
index = None


#### Bin Weights

Bin weights can be adjusted so that bins with more importance are given more prominence in the final assay score. This is done through the wallaroo.assay_config.summarizer_builder.add_bin_weights method, where the weights are assigned as array values matching the bins.

The following example has 10 bins (12 total including the left_outlier and the right_outlier bins), with weights of 0 assigned to the first six bins and 1 to the last six, and shows the resulting score from these weights.

weights = [0] * 6
weights.extend([1] * 6)
print("Using weights: ", weights)
assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
# 10 bins plus the two outlier bins gives the 12 bins the weights apply to.
assay_builder.summarizer_builder.add_num_bins(10)
assay_builder.summarizer_builder.add_bin_weights(weights)
assay_results = assay_builder.build().interactive_run()
display(assay_results[1].compare_bins())
assay_results[1].chart()

Using weights:  [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

| | b_edges | b_edge_names | b_aggregated_values | b_aggregation | w_edges | w_edge_names | w_aggregated_values | w_aggregation | diff_in_pcts |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 12.00 | left_outlier | 0.00 | Density | 12.00 | left_outlier | 0.00 | Density | 0.00 |
| 1 | 12.41 | q_10 | 0.10 | Density | 12.41 | e_1.24e1 | 0.09 | Density | -0.00 |
| 2 | 12.55 | q_20 | 0.10 | Density | 12.55 | e_1.26e1 | 0.04 | Density | -0.05 |
| 3 | 12.72 | q_30 | 0.10 | Density | 12.72 | e_1.27e1 | 0.14 | Density | 0.03 |
| 4 | 12.81 | q_40 | 0.10 | Density | 12.81 | e_1.28e1 | 0.05 | Density | -0.05 |
| 5 | 12.88 | q_50 | 0.10 | Density | 12.88 | e_1.29e1 | 0.12 | Density | 0.02 |
| 6 | 12.98 | q_60 | 0.10 | Density | 12.98 | e_1.30e1 | 0.09 | Density | -0.01 |
| 7 | 13.15 | q_70 | 0.10 | Density | 13.15 | e_1.32e1 | 0.18 | Density | 0.08 |
| 8 | 13.33 | q_80 | 0.10 | Density | 13.33 | e_1.33e1 | 0.14 | Density | 0.03 |
| 9 | 13.47 | q_90 | 0.10 | Density | 13.47 | e_1.35e1 | 0.07 | Density | -0.03 |
| 10 | 14.97 | q_100 | 0.10 | Density | 14.97 | e_1.50e1 | 0.08 | Density | -0.02 |
| 11 | NaN | right_outlier | 0.00 | Density | NaN | right_outlier | 0.00 | Density | 0.00 |

baseline mean = 12.940910643273655
window mean = 12.956829186961135
baseline median = 12.884286880493164
window median = 12.929338455200195
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = True
score = 0.012600694309416988
scores = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.00019654033061397393, 0.00850384373737565, 0.0015735766052488358, 0.0014437605903522511, 0.000882973045826275, 0.0]
index = None
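
As a rough illustration of the idea behind bin weights, the sketch below combines per-bin scores with a list of weights so that zero-weighted bins contribute nothing to the total. It is standalone Python with hypothetical numbers. The function name weighted_score is an assumption, and so is scaling the weights to sum to 1; this guide does not state how the SDK normalizes weights.

```python
def weighted_score(bin_scores, weights):
    """Combine per-bin scores using weights scaled to sum to 1."""
    if len(bin_scores) != len(weights):
        raise ValueError("one weight is needed per bin, outlier bins included")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    # Assumption: weights are normalized before being applied.
    return sum(s * w / total for s, w in zip(bin_scores, weights))

# Hypothetical unweighted per-bin scores for 4 bins.
bin_scores = [0.02, 0.01, 0.03, 0.04]

equal = weighted_score(bin_scores, [1, 1, 1, 1])
upper_only = weighted_score(bin_scores, [0, 0, 1, 1])
print(equal, upper_only)
```

Setting the lower bins to 0 means that drift confined to those bins no longer moves the score, which is the behavior visible in the sample output above, where the first six entries of scores are 0.0.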



### Metrics

The metric score is a distance or dissimilarity measure: the larger it is, the less similar the two distributions are. The following metrics are supported:

• PSI: Population Stability Index
• SumDiff: The sum of differences
• MaxDiff: The maximum of differences.

The following code samples show each metric in use.

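assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
# Assumed API: summarizer_builder.add_metric, with Metric imported from wallaroo.assay_config.
assay_builder.summarizer_builder.add_metric(Metric.PSI)
assay_results = assay_builder.build().interactive_run()
assay_results[0].chart()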

baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = False
score = 0.0029273068646199748
scores = [0.0, 0.000514261205558409, 0.0002139202456922972, 0.0012617897456473992, 0.0002139202456922972, 0.0007234154220295724, 0.0]
index = None

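assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
# Assumed API: summarizer_builder.add_metric, with Metric imported from wallaroo.assay_config.
assay_builder.summarizer_builder.add_metric(Metric.SUMDIFF)
assay_results = assay_builder.build().interactive_run()
assay_results[0].chart()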

baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Density
metric = SumDiff
weighted = False
score = 0.025438649748041997
scores = [0.0, 0.009956893934794486, 0.006648048084512165, 0.01548175581324751, 0.006648048084512165, 0.012142553579017668, 0.0]
index = None

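assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
# Assumed API: summarizer_builder.add_metric, with Metric imported from wallaroo.assay_config.
assay_builder.summarizer_builder.add_metric(Metric.MAXDIFF)
assay_results = assay_builder.build().interactive_run()
assay_results[0].chart()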

baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Density
metric = MaxDiff
weighted = False
score = 0.01548175581324751
scores = [0.0, 0.009956893934794486, 0.006648048084512165, 0.01548175581324751, 0.006648048084512165, 0.012142553579017668, 0.0]
index = 3
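
The relationship between the per-bin scores list and the overall score in the three sample outputs above can be checked with a few lines of arithmetic. The sketch below copies the numbers from those outputs and recombines them. It is an observation about this particular sample output, not a documented property of the SDK, and the variable names are chosen for illustration.

```python
import math

# Per-bin scores copied from the PSI sample output above.
psi_bins = [0.0, 0.000514261205558409, 0.0002139202456922972,
            0.0012617897456473992, 0.0002139202456922972,
            0.0007234154220295724, 0.0]
# Per-bin scores copied from the SumDiff and MaxDiff sample outputs (identical lists).
diff_bins = [0.0, 0.009956893934794486, 0.006648048084512165,
             0.01548175581324751, 0.006648048084512165,
             0.012142553579017668, 0.0]

psi_score = sum(psi_bins)                       # matches the reported PSI score
sumdiff_score = sum(diff_bins) / 2              # matches the reported SumDiff score
maxdiff_score = max(diff_bins)                  # matches the reported MaxDiff score
maxdiff_index = diff_bins.index(maxdiff_score)  # matches the reported index

print(math.isclose(psi_score, 0.0029273068646199748))
print(math.isclose(sumdiff_score, 0.025438649748041997))
print(maxdiff_score == 0.01548175581324751, maxdiff_index)
```

In this sample the PSI score is the sum of the per-bin values, the SumDiff score is half the sum of the per-bin differences, and the MaxDiff score is the largest per-bin difference, with index pointing at the bin where it occurs.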


### Aggregation Options

Bins can be aggregated in two ways. Aggregation.DENSITY (the default) is histogram style: it counts the number or percentage of values that fall in each bin. Aggregation.CUMULATIVE is empirical cumulative distribution function (ECDF) style: it keeps a cumulative count of the values or percentages that fall in each bin.

assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
# Assumed API: summarizer_builder.add_aggregation, with Aggregation imported from wallaroo.assay_config.
assay_builder.summarizer_builder.add_aggregation(Aggregation.DENSITY)
assay_results = assay_builder.build().interactive_run()
assay_results[0].chart()

baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Density
metric = PSI
weighted = False
score = 0.0029273068646199748
scores = [0.0, 0.000514261205558409, 0.0002139202456922972, 0.0012617897456473992, 0.0002139202456922972, 0.0007234154220295724, 0.0]
index = None

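assay_builder = wl.build_assay(assay_name, pipeline, model_name, baseline_start, baseline_end).add_run_until(last_day)
# Assumed API: summarizer_builder.add_aggregation, with Aggregation imported from wallaroo.assay_config.
assay_builder.summarizer_builder.add_aggregation(Aggregation.CUMULATIVE)
assay_results = assay_builder.build().interactive_run()
assay_results[0].chart()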

baseline mean = 12.940910643273655
window mean = 12.969964654406132
baseline median = 12.884286880493164
window median = 12.899214744567873
bin_mode = Quantile
aggregation = Cumulative
metric = PSI
weighted = False
score = 0.04419889502762442
scores = [0.0, 0.009956893934794486, 0.0033088458502823492, 0.01879060166352986, 0.012142553579017725, 0.0, 0.0]
index = None
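
The two aggregation styles can be sketched in a few lines of standalone Python. Given hypothetical per-bin counts, density divides each count by the total, and cumulative keeps a running sum of those densities. The function names are assumptions for illustration and do not correspond to SDK calls.

```python
from itertools import accumulate

def density(counts):
    """Share of all values that falls in each bin (histogram style)."""
    total = sum(counts)
    return [c / total for c in counts]

def cumulative(counts):
    """Running total of the per-bin shares (ECDF style)."""
    return list(accumulate(density(counts)))

# Hypothetical counts: left_outlier, four inner bins, right_outlier.
counts = [0, 20, 50, 20, 10, 0]

print([round(d, 2) for d in density(counts)])     # [0.0, 0.2, 0.5, 0.2, 0.1, 0.0]
print([round(c, 2) for c in cumulative(counts)])  # [0.0, 0.2, 0.7, 0.9, 1.0, 1.0]
```

Because a cumulative value carries forward everything to its left, a shift in one bin also changes the values of the bins after it, which is why the same data can score differently under the two styles.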