.

.

Wallaroo SDK Essentials Guide

Reference Guide for the most essential Wallaroo SDK Commands

The following guides detail how to use the Wallaroo SDK. These include detailed instructions on classes, methods, and code examples.

Wallaroo JupyterHub Python Libraries

When using the Wallaroo SDK, it is recommended that the Python modules used are the same as those used in the Wallaroo JupyterHub environments to ensure maximum compatibility. When installing modules in the Wallaroo JupyterHub environments, do not override the following modules or versions, as that may impact how the JupyterHub environments performance.

appdirs == 1.4.4
gql == 3.4.0
ipython == 7.24.1
matplotlib == 3.5.0
numpy == 1.22.3
orjson == 3.8.0
pandas == 1.3.4
pyarrow == 9.0.0
PyJWT == 2.4.0
python_dateutil == 2.8.2
PyYAML == 6.0
requests == 2.25.1
scipy == 1.8.0
seaborn == 0.11.2
tenacity == 8.0.1
# Required by gql?
requests_toolbelt>=0.9.1<1
# Required by the autogenerated ML Ops client
httpx >= 0.15.4<0.24.0
attrs >= 21.3.0
# These are documented as part of the autogenerated ML Ops requirements
# python = ^3.7
# python-dateutil = ^2.8.0

Model and Framework Support

Supported Models

The following frameworks are supported. Frameworks fall under either Native or Containerized runtimes in the Wallaroo engine. For more details, see the specific framework what runtime a specific model framework runs in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

Please note the following.

Wallaroo natively supports Open Neural Network Exchange (ONNX) models into the Wallaroo engine.

ParameterDescription
Web Sitehttps://onnx.ai/
Supported LibrariesSee table below.
FrameworkFramework.ONNX aka onnx
RuntimeNative aka onnx

The following ONNX versions models are supported:

Wallaroo VersionONNX VersionONNX IR VersionONNX OPset VersionONNX ML Opset Version
2023.2.1 (July 2023)1.12.18173
2023.2 (May 2023)1.12.18173
2023.1 (March 2023)1.12.18173
2022.4 (December 2022)1.12.18173
After April 2022 until release 2022.4 (December 2022)1.10.*7152
Before April 20221.6.*7132

For the most recent release of Wallaroo 2023.2.1, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inference performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in fields returned for an inference. In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a 2,2,3 tensor, where the shape of each element is (3,), and each element has 2 rows.

t = [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]

In this example each element has a shape of (2,). Tensors with elements of different shapes, known as ragged tensors, are not supported. For example:

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
])

**INVALID SHAPE**
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df
 ts
0[2.35, 5.75, 19.2][Bob, Nancy, Wani]
1[5.55, 7.2, 15.7][Jason, Rita, Phoebe]
ParameterDescription
Web Sitehttps://www.tensorflow.org/
Supported Librariestensorflow==2.9.1
FrameworkFramework.TENSORFLOW aka tensorflow
RuntimeNative aka tensorflow
Supported File TypesSavedModel format as .zip file

TensorFlow File Format

TensorFlow models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

ML models that meet the Tensorflow and SavedModel format will run as Wallaroo Native runtimes by default.

See the SavedModel guide for full details.

ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.PYTHON aka python
RuntimeNative aka python

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will be receiving the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then it will be the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from a ML Model who’s output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns a DataFrame with that element as the value of the column output.

def wallaroo_json(data: pd.DataFrame):
    print(data)
    return [{"output": [data["dense_2"].to_list()[0][0]]}]

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, this results the output field as the out.output field in the Wallaroo inference result.

 timein.tensorout.outputcheck_failures
02023-06-20 20:23:28.395[0.6878518042, 0.1760734021, -0.869514083, 0.3..[12.886651039123535]0
ParameterDescription
Web Sitehttps://huggingface.co/models
Supported Libraries
  • transformers==4.27.0
  • diffusers==0.14.0
  • accelerate==0.18.0
  • torchvision==0.14.1
  • torch==1.13.1
FrameworksThe following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
RuntimeContainerized aka tensorflow / mlflow

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING-FACE-IMAGE-TO-TEXT
  • Framework.HUGGING-FACE-TEXT-CLASSIFICATION
  • Framework.HUGGING-FACE-SUMMARIZATION
  • Framework.HUGGING-FACE-TRANSLATION

Additional inputs added to these Hugging Face pipelines will be added as key/pair value arguments to the model’s generate method. If the argument is not required, then the model will default to the values coded in the original Hugging Face model’s source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

Wallaroo FrameworkReference
Framework.HUGGING-FACE-FEATURE-EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-TO-TEXT

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-QUESTION-ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-STABLE-DIFFUSION-TEXT-2-IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SUMMARIZATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-CLASSIFICATION

Schemas

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TRANSLATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SENTIMENT-ANALYSISHugging Face Sentiment Analysis
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-GENERATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])
ParameterDescription
Web Sitehttps://pytorch.org/
Supported Libraries
  • torch==1.13.1
  • torchvision==0.14.1
FrameworkFramework.PYTORCH aka pytorch
Supported File Typespt ot pth in TorchScript format
RuntimeContainerized aka mlflow

Sci-kit Learn aka SKLearn.

ParameterDescription
Web Sitehttps://scikit-learn.org/stable/index.html
Supported Libraries
  • scikit-learn==1.2.2
FrameworkFramework.SKLEARN aka sklearn
RuntimeContainerized aka tensorflow / mlflow

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

SKLearn Schema Outputs

Outputs for SKLearn that are meant to be predictions or probabilities when output by the model are labeled in the output schema for the model when uploaded to Wallaroo. For example, a model that outputs either 1 or 0 as its output would have the output schema as follows:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)
 timein.inputsout.predictionscheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries
  • tensorflow==2.8.0
  • keras==1.1.0
FrameworkFramework.KERAS aka keras
Supported File TypesSavedModel format as .zip file and HDF5 format
RuntimeContainerized aka mlflow

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 for Tensorflow Keras models.

ParameterDescription
Web Sitehttps://xgboost.ai/
Supported Librariesxgboost==1.7.4
FrameworkFramework.XGBOOST aka xgboost
Supported File Typespickle (XGB files are not supported.)
RuntimeContainerized aka tensorflow / mlflow

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])
pipeline.infer(dataframe)
 timein.inputsout.outputcheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.CUSTOM aka custom
RuntimeContainerized aka mlflow

Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

ArtifactTypeDescription
Python scripts aka .py files with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilderPython ScriptExtend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there is no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder - any qualified class name is sufficient as long as these two classes are extended as defined below.
requirements.txtPython requirements fileThis sets the Python libraries used for the arbitrary python model. These libraries should be targeted for Python 3.8 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This insures that the script and methods will function exactly the same as during the model creation process.
Other artifactsFilesOther models, files, and other artifacts used in support of this model.

For example, the if the arbitrary python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input some input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects
ObjectTypeDescription
model Optional[Any]An optional list of models that match the supported frameworks from wallaroo.framework.Framework included in the arbitrary python script. Note that this is optional - no models are actually required. A BYOP can refer to a specific model(s) used, be used for data processing and reshaping for later pipeline steps, or other needs.
mac.inference.Inference Methods
MethodReturnsDescription
expected_model_types (Required)SetReturns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects.
_predict (input_data: mac.types.InferenceData) (Required)mac.types.InferenceDataThe entry point for the Wallaroo inference with the following input and output parameters that are defined when the model is updated.
  • mac.types.InferenceData: The input InferenceData is a dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format.
  • mac.types.InferenceData: The output is a dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format.
The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData.
raise_error_if_model_is_not_assignedN/AError when expected_model_types is not set.
raise_error_if_model_is_wrong_typeN/AError when the model does not match the expected_model_types.

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods
MethodReturnsDescription
create(config mac.config.inference.CustomInferenceConfig) (Required)The custom Inference instance.Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary python .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact vgg_clustering\feature_extractor.h5 would be set with config.model_path \ feature_extractor.h5. The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types.
inferencecustom Inference instance.Returns the instantiated custom Inference object created from the create method.

Arbitrary Python Runtime

Arbitrary Python always run in the containerized model runtime.

ParameterDescription
Web Sitehttps://mlflow.org
Supported Librariesmlflow==1.30.0
RuntimeContainerized aka mlflow

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

List Wallaroo Frameworks

Wallaroo frameworks are listed from the Wallaroo.Framework class. The following demonstrates listing all available supported frameworks.

from wallaroo.framework import Framework

[e.value for e in Framework]

    ['onnx',
    'tensorflow',
    'python',
    'keras',
    'sklearn',
    'pytorch',
    'xgboost',
    'hugging-face-feature-extraction',
    'hugging-face-image-classification',
    'hugging-face-image-segmentation',
    'hugging-face-image-to-text',
    'hugging-face-object-detection',
    'hugging-face-question-answering',
    'hugging-face-stable-diffusion-text-2-img',
    'hugging-face-summarization',
    'hugging-face-text-classification',
    'hugging-face-translation',
    'hugging-face-zero-shot-classification',
    'hugging-face-zero-shot-image-classification',
    'hugging-face-zero-shot-object-detection',
    'hugging-face-sentiment-analysis',
    'hugging-face-text-generation']

Supported Data Types

The following data types are supported for transporting data to and from Wallaroo in the following run times:

  • ONNX
  • TensorFlow
  • MLFlow

Data Type Conditions

The following conditions apply to data types used in inference requests.

  • None or Null data types are not submitted. All fields must have submitted values that match their data type. For example, if the schema expects a float value, then some value of type float must be submitted and can not be None or Null. If a schema expects a string value, then some value of type string must be submitted, etc.
  • datetime data types must be converted to string.
  • ONNX models support multiple inputs only of the same data type.
RuntimeBFloat16*Float16Float32Float64
ONNXXX
TensorFlowXXX
MLFlowXXX
  • * (Brain Float 16, represented internally as a f32)

RuntimeInt8Int16Int32Int64
ONNXXXXX
TensorFlowXXXX
MLFlowXXXX
RuntimeUint8Uint16Uint32Uint64
ONNXXXXX
TensorFlowXXXX
MLFlowXXXX
RuntimeBooleanUtf8 (String)Complex 64Complex 128FixedSizeList*
ONNXX
TensorXXX
MLFlowXXX
  • * Fixed sized lists of any of the previously supported data types.

1 - Wallaroo SDK Essentials Guide: Client Connection

How to connect to a Wallaroo instance through the Wallaroo SDK

Users connect to a Wallaroo instance with the Wallaroo Client class. This connection can be made from within the Wallaroo instance, or external from the Wallaroo instance via the Wallaroo SDK.

The following methods are supported in connecting to the Wallaroo instance:

  • Connect from Within the Wallaroo Instance: Connect within the JupyterHub service or other method within the Kubernetes cluster hosting the Wallaroo instance. This requires confirming the connections with the Wallaroo instance through a browser link.
  • Connect from Outside the Wallaroo Instance: Connect via the Wallaroo SDK via an external connection to the Kubernetes cluster hosting the Wallaroo instance. This requires confirming the connections with the Wallaroo instance through a browser link.
  • Automated Connection: Connect to the Wallaroo instance by providing the username and password directly into the request. This bypasses confirming the connections with the Wallaroo instance through a browser link.

Once run, the wallaroo.Client command provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Depending on the configuration of the Wallaroo instance, the user will either be presented with a login request to the Wallaroo instance or be authenticated through a broker such as Google, Github, etc. To use the broker, select it from the list under the username/password login forms. For more information on Wallaroo authentication configurations, see the Wallaroo Authentication Configuration Guides.

Wallaroo Login

Once authenticated, the user will verify adding the device the user is establishing the connection from. Once both steps are complete, then the connection is granted.

Device Registration

Connect from Within the Wallaroo Instance

Users who connect from within their Wallaroo instance’s Kubernetes environment, such as through the Wallaroo provided JupyterHub service, will be authenticated with the Wallaroo Client() method.

The first step in using Wallaroo is creating a connection. To connect to your Wallaroo environment:

  1. Import the wallaroo library:

    import wallaroo
    
  2. Open a connection to the Wallaroo environment with the wallaroo.Client() command and save it to a variable.

    In this example, the Wallaroo connection is saved to the variable wl.

    wl = wallaroo.Client()
    
  3. A verification URL will be displayed. Enter it into your browser and grant access to the SDK client.

    Wallaroo Confirm Connection
  4. Once this is complete, you will be able to continue with your Wallaroo commands.

    Wallaroo Connection Example

Connect from Outside the Wallaroo Instance

Users who have installed the Wallaroo SDK from an external location, such as their own JupyterHub service, Google Workbench, or other services can connect via Single-Sign On (SSO). This is accomplished using the wallaroo.Client(api_endpoint, auth_endpoint, auth_type command="sso") command that connects to the Wallaroo instance services. For more information on the DNS names of Wallaroo services, see the DNS Integration Guide.

Before performing this step, verify that SSO is enabled for the specific service. For more information, see the Wallaroo Authentication Configuration Guide.

The Client method takes the following parameters:

  • api_endpoint (String): The URL to the Wallaroo instance API service.
  • auth_endpoint (String): The URL to the Wallaroo instance Keycloak service.
  • auth_type command (String): The authorization type. In this case, SSO.

Once run, the wallaroo.Client command provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. This connection is stored into a variable that can be referenced later.

In this example, a connection will be made to the Wallaroo instance shadowy-unicorn-5555.wallaroo.ai through SSO authentication.

import wallaroo
from wallaroo.object import EntityNotFoundError

# SSO login through keycloak

wl = wallaroo.Client(api_endpoint="https://shadowy-unicorn-5555.api.wallaroo.ai", 
                    auth_endpoint="https://shadowy-unicorn-5555.keycloak.wallaroo.ai", 
                    auth_type="sso")
Please log into the following URL in a web browser:

    https://shadowy-unicorn-5555.keycloak.wallaroo.example.com/auth/realms/master/device?user_code=LGZP-FIQX

Login successful!

Automated Connection

Users can connect either internally or externally without confirming the connection via a browser link using the wallaroo.Client(api_endpoint, auth_endpoint, auth_type="user_password") command for external connections, and wallaroo.Client(auth_type="user_password") with internal connections.

The auth_type="user_password" parameter requires either the environment parameter WALLAROO_SDK_CREDENTIALS with the following settings:

{
    "username": "{Connecting User's Username}", 
    "password": "{Connecting User's Password}", 
    "email": "{Connecting User's Email Address}"
}

The other option is to provide the environment variables WALLAROO_USER and WALLAROO_PASSWORD:

WALLAROO_USER={Connecting User's Username}
WALLAROO_PASSWORD={Connecting User's Password}

In typical installations, the username and email settings will both be the user’s email address.

For example, if the username is steve, the password is hello and the email is steve@ex.co then the ``WALLAROO_SDK_CREDENTIALS` can be set in the following ways:

# Import via file
os.environ["WALLAROO_SDK_CREDENTIALS"] = 'creds.json'
wl = wallaroo.Client(auth_type="user_password")

The other method:

# Set directly
os.environ["WALLAROO_USER"] = 'username@company.com'
os.environ["WALLAROO_PASSWORD"] = 'password'
wl = wallaroo.Client(auth_type="user_password")

For automated connections, using the environment options tied into a specific file with minimum access is recommended.

The following example shows connecting to a remote Wallaroo instance via the auth_type="user_password" parameter with the credentials stored in the creds.json file using the format above:

wallarooPrefix = "wallaroo"
wallarooSuffix = "example.com"

os.environ["WALLAROO_SDK_CREDENTIALS"] = 'creds.json'

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}.api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}.keycloak.{wallarooSuffix}", 
                    auth_type="user_password")

2 - Wallaroo SDK Essentials Guide: Data Connections Management

How to create and manage Wallaroo Data Connections through the Wallaroo SDK

Wallaroo Data Connections are provided to establish connections to external data stores for requesting or submitting information. They provide a source of truth for data source connection information to enable repeatability and access control within your ecosystem. The actual implementation of data connections are managed through other means, such as Wallaroo Pipeline Orchestrators and Tasks, where the libraries and other tools used for the data connection can be stored.

Wallaroo Data Connections have the following properties:

  • Available across the Wallaroo Instance: Data Connections are created at the Wallaroo instance level.
  • Tied to a Wallaroo Workspace: Data Connections, like pipeline and models, are tied to a workspace. This allows organizations to limit data connection access by restricting users to specific workspaces.
  • Support different types: Data Connections support various types of connections such as ODBC, Kafka, etc.

Create Data Connection

Data Connections are created through the Wallaroo Client create_connection(name, type, details) method.

ParameterTypeDescription
namestring (Required)The name of the connection. Names must be unique. Attempting to create a connection with the same name as an existing connection will cause an error.
typestring (Required)The user defined type of connection.
detailsDict (Required)User defined configuration details for the data connection. These can be {'username':'dataperson', 'password':'datapassword', 'port': 3339}, or {'token':'abcde123==', 'host':'example.com', 'port:1234'}, or other user defined combinations.

The SDK allows the data connections to be fully defined and stored for later use. This allows whatever type of data connection the organization uses to be defined, then applied to a workspace for other Wallaroo users to integrate into their code.

When data connections are displayed via the Wallaroo Client list_connections, the details field is removed to not show sensitive information by default.

wl.create_connection("houseprice_arrow_table", 
                  "HTTPFILE", 
                  {'host':'https://github.com/WallarooLabs/Wallaroo_Tutorials/tree/main/wallaroo-testing-tutorials/houseprice-saga/data/xtest-1k.arrow?raw=true'}
                  )
FieldValue
Namehouseprice_arrow_table
Connection TypeHTTPFILE
Details*****
Created At2023-05-04T17:52:32.249322+00:00

Get Connection By Name

The Wallaroo client get_connection(name) method retrieves the connection with the Connection name matching the name parameter.

ParameterTypeDescription
namestring (Required)The name of the connection.

In the following example, the connection name external_inference_connection will be retrieved and stored into the variable inference_source_connection.

inference_source_connection = wl.get_connection(name="external_inference_connection")
display(inference_source_connection)
FieldValue
Nameexternal_inference_connection
Connection TypeHTTPFILE
Details*****
Created At2023-05-08T20:10:07.914083+00:00

List Data Connections

The Wallaroo Client list_connections() method lists all connections for the Wallaroo instance. When data connections are displayed via the Wallaroo Client list_connections, the details field is removed to not show sensitive information by default.

wl.list_connections()
nameconnection typedetailscreated at
houseprice_arrow_tableHTTPFILE*****2023-05-04T17:52:32.249322+00:00

Add Data Connection to Workspace

The method Workspace add_connection(connection_name) adds a Data Connection to a workspace, and takes the following parameters.

ParameterTypeDescription
namestring (Required)The name of the Data Connection

Connection Details

The Connection method details() retrieves a the connection details() as a dict.

display(connection.details())

{'host': 'https://github.com/WallarooLabs/Wallaroo_Tutorials/tree/main/wallaroo-testing-tutorials/houseprice-saga/data/xtest-1k.arrow?raw=true'}

Remove Connection from Workspace

The Workspace method remove_connection(connection_name) removes the connection from the workspace, but does not delete the connection from the Wallaroo instance. This method takes the following parameters.

ParameterTypeDescription
nameString (Required)The name of the connection to be removed from the workspace.

Delete Connection

The Connection method delete_connection() removes the connection from the Wallaroo instance.

Before deleting a connection, it must be removed from all workspaces that it is attached to.

3 - Wallaroo SDK Essentials Guide: Workspace Management

How to create and use Wallaroo Workspaces through the Wallaroo SDK

Workspace Management

Workspaces are used to segment groups of models into separate environments. This allows different users to either manage or have access to each workspace, controlling the models and pipelines assigned to the workspace.

Workspace Naming Requirements

Workspace names map onto Kubernetes objects, and must be DNS compliant. Workspace names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Create a Workspace

Workspaces can be created either through the Wallaroo Dashboard or through the Wallaroo SDK.

  • IMPORTANT NOTICE

    Workspace names are not forced to be unique. You can have 50 workspaces all named my-amazing-workspace, which can cause confusion in determining which workspace to use.

    It is recommended that organizations agree on a naming convention and select the workspace to use rather than creating a new one each time.

To create a workspace, use the create_workspace("{WORKSPACE NAME}") command through an established Wallaroo connection and store the workspace settings into a new variable. Once the new workspace is created, the user who created the workspace is assigned as its owner. The following template is an example:

{New Workspace Variable} = {Wallaroo Connection}.create_workspace("{New Workspace Name}")

For example, if the connection is stored in the variable wl and the new workspace will be named imdb, then the command to store it in the new_workspace variable would be:

new_workspace = wl.create_workspace("imdb-workspace")

List Workspaces

The command list_workspaces() displays the workspaces that are part of the current Wallaroo connection. The following details are returned as an array:

ParameterTypeDescription
NameStringThe name of the workspace. Note that workspace names are not unique.
Created AtDateTimeThe date and time the workspace was created.
UsersArray[Users]A list of all users assigned to this workspace.
ModelsIntegerThe number of models uploaded to the workspace.
PipelinesIntegerThe number of pipelines in the environment.

For example, for the Wallaroo connection wl the following workspaces are returned:

wl.list_workspaces()
Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

Get Current Workspace

The command get_current_workspace displays the current workspace used for the Wallaroo connection. The following information is returned by default:

ParameterTypeDescription
nameStringThe name of the current workspace.
idIntegerThe ID of the current workspace.
archivedBoolWhether the workspace is archived or not.
created_byStringThe identifier code for the user that created the workspace.
created_atDateTimeWhen the timestamp for when workspace was created.
modelsArray[Models]The models that are uploaded to this workspace.
pipelinesArray[Pipelines]The pipelines created for the workspace.

For example, the following will display the current workspace for the wl connection that contains a single pipeline and multiple models:

wl.get_current_workspace()
{'name': 'imdb-workspace', 'id': 6, 'archived': False, 'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8', 'created_at': '2022-03-30T17: 09: 23.960406+00: 00', 'models': [
        {'name': 'embedder-o', 'version': '6dbe5524-7bc3-4ff3-8ca8-d454b2cbd0e4', 'file_name': 'embedder.onnx', 'last_update_time': datetime.datetime(2022,
            3,
            30,
            17,
            34,
            18,
            321105, tzinfo=tzutc())
        },
        {'name': 'smodel-o', 'version': '6eb7f824-3d77-417f-9169-6a301d20d842', 'file_name': 'sentiment_model.onnx', 'last_update_time': datetime.datetime(2022,
            3,
            30,
            17,
            34,
            18,
            783485, tzinfo=tzutc())
        }
    ], 'pipelines': [
        {'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022,
            3,
            30,
            17,
            34,
            19,
            318819, tzinfo=tzutc()), 'definition': '[]'
        }
    ]
}

Set the Current Workspace

The current workspace can be set through set_current_workspace for the Wallaroo connection through the following call, and returns the workspace details as a JSON object:

{Wallaroo Connection}.set_current_workspace({Workspace Object})

Set Current Workspace from a New Workspace

The following example creates the workspace imdb-workspace through the Wallaroo connection stored in the variable wl, then sets it as the current workspace:

new_workspace = wl.create_workspace("imdb-workspace")
wl.set_current_workspace(new_workspace)
{'name': 'imdb-workspace', 'id': 7, 'archived': False, 'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8', 'created_at': '2022-03-30T17:43:09.405038+00:00', 'models': [], 'pipelines': []}

Set the Current Workspace an Existing Workspace

To set the current workspace from an established workspace, the easiest method is to use list_workspaces() then set the current workspace as the array value displayed. For example, from the following list_workspaces() command the 3rd workspace element demandcurve-workspace can be assigned as the current workspace:

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

wl.set_current_workspace(wl.list_workspaces()[2])

{'name': 'demandcurve-workspace', 'id': 3, 'archived': False, 'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8', 'created_at': '2022-03-29T20:21:32.732178+00:00', 'models': [{'name': 'demandcurve', 'version': '4f5193fc-9c18-4851-8489-42e61d095588', 'file_name': 'demand_curve_v1.onnx', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 21, 32, 822812, tzinfo=tzutc())}, {'name': 'preprocess', 'version': '159b9e99-edb6-4c5e-8336-63bc6000623e', 'file_name': 'preprocess.py', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 21, 32, 984117, tzinfo=tzutc())}, {'name': 'postprocess', 'version': '77ee154c-d64c-49dd-985a-96f4c2931b6e', 'file_name': 'postprocess.py', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 21, 33, 119037, tzinfo=tzutc())}], 'pipelines': [{'name': 'demand-curve-pipeline', 'create_time': datetime.datetime(2022, 3, 29, 20, 21, 33, 264321, tzinfo=tzutc()), 'definition': '[]'}]}

Add a User to a Workspace

Users are added to the workspace via their email address through the wallaroo.workspace.Workspace.add_user({email address}) command. The email address must be assigned to a current user in the Wallaroo platform before they can be assigned to the workspace.

For example, the following workspace imdb-workspace has the user steve@ex.co. We will add the user john@ex.co to this workspace:

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

current_workspace = wl.list_workspaces()[3]

current_workspace.add_user("john@ex.co")

{'name': 'imdb-workspace', 'id': 4, 'archived': False, 'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8', 'created_at': '2022-03-29T20:23:08.742676+00:00', 'models': [{'name': 'embedder-o', 'version': '23a33c3d-68e6-4bdb-a8bc-32ea846908ee', 'file_name': 'embedder.onnx', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 23, 8, 833716, tzinfo=tzutc())}, {'name': 'smodel-o', 'version': '2c298aa9-be9d-482d-8188-e3564bdbab43', 'file_name': 'sentiment_model.onnx', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 23, 9, 49881, tzinfo=tzutc())}], 'pipelines': [{'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 29, 20, 23, 28, 518946, tzinfo=tzutc()), 'definition': '[]'}]}

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co', 'john@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

Remove a User to a Workspace

Removing a user from a workspace is performed through the wallaroo.workspace.Workspace.remove_user({email address}) command, where the {email address} matches a user in the workspace.

In the following example, the user john@ex.co is removed from the workspace imdb-workspace.

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co', 'john@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

current_workspace = wl.list_workspaces()[3]

current_workspace.remove_user("john@ex.co")

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

Add a Workspace Owner

To update the owner of workspace, or promote an existing user of a workspace to the owner of workspace, use the wallaroo.workspace.Workspace.add_owner({email address}) command. The email address must be assigned to a current user in the Wallaroo platform before they can be assigned as the owner to the workspace.

The following example shows assigning the user john@ex.co as an owner to the workspace imdb-workspace:

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

current_workspace = wl.list_workspaces()[3]

current_workspace.add_owner("john@ex.co")

{'name': 'imdb-workspace', 'id': 4, 'archived': False, 'created_by': '45e6b641-fe57-4fb2-83d2-2c2bd201efe8', 'created_at': '2022-03-29T20:23:08.742676+00:00', 'models': [{'name': 'embedder-o', 'version': '23a33c3d-68e6-4bdb-a8bc-32ea846908ee', 'file_name': 'embedder.onnx', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 23, 8, 833716, tzinfo=tzutc())}, {'name': 'smodel-o', 'version': '2c298aa9-be9d-482d-8188-e3564bdbab43', 'file_name': 'sentiment_model.onnx', 'last_update_time': datetime.datetime(2022, 3, 29, 20, 23, 9, 49881, tzinfo=tzutc())}], 'pipelines': [{'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 29, 20, 23, 28, 518946, tzinfo=tzutc()), 'definition': '[]'}]}

wl.list_workspaces()

Name Created At Users Models Pipelines
aloha-workspace 2022-03-29 20:15:38 ['steve@ex.co'] 1 1
ccfraud-workspace 2022-03-29 20:20:55 ['steve@ex.co'] 1 1
demandcurve-workspace 2022-03-29 20:21:32 ['steve@ex.co'] 3 1
imdb-workspace 2022-03-29 20:23:08 ['steve@ex.co', 'john@ex.co'] 2 1
aloha-workspace 2022-03-29 20:33:54 ['steve@ex.co'] 1 1
imdb-workspace 2022-03-30 17:09:23 ['steve@ex.co'] 2 1
imdb-workspace 2022-03-30 17:43:09 ['steve@ex.co'] 0 0

4 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations

How to create and manage Wallaroo Models Uploads through the Wallaroo SDK

Models are uploaded or registered to a Wallaroo workspace depending on the model framework and version.

Wallaroo Engine Runtimes

Pipeline deployment configurations provide two runtimes to run models in the Wallaroo engine:

  • Native Runtimes: Models that are deployed “as is” with the Wallaroo engine. These are:

    • ONNX
    • Python step
    • Tensorflow 2.9.1 in SavedModel format
  • Containerized Runtimes: Containerized models such as MLFlow or Arbitrary Python. These are run in the Wallaroo engine in their containerized form.

  • Non-Native Runtimes: Models that when uploaded are either converted to a native Wallaroo runtime, or are containerized so they can be run in the Wallaroo engine. When uploaded, Wallaroo will attempt to convert it to a native runtime. If it can not be converted, then it will be packed in a Wallaroo containerized model based on its framework type.

    Pipeline Deployment Configurations

Pipeline configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space.

This model will always run in the native runtime space.

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build()

Models and Runtimes

Supported Models

The following frameworks are supported. Frameworks fall under either Native or Containerized runtimes in the Wallaroo engine. For more details, see the specific framework what runtime a specific model framework runs in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

Please note the following.

Wallaroo natively supports Open Neural Network Exchange (ONNX) models into the Wallaroo engine.

ParameterDescription
Web Sitehttps://onnx.ai/
Supported LibrariesSee table below.
FrameworkFramework.ONNX aka onnx
RuntimeNative aka onnx

The following ONNX versions models are supported:

Wallaroo VersionONNX VersionONNX IR VersionONNX OPset VersionONNX ML Opset Version
2023.2.1 (July 2023)1.12.18173
2023.2 (May 2023)1.12.18173
2023.1 (March 2023)1.12.18173
2022.4 (December 2022)1.12.18173
After April 2022 until release 2022.4 (December 2022)1.10.*7152
Before April 20221.6.*7132

For the most recent release of Wallaroo 2023.2.1, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inference performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in fields returned for an inference. In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a 2,2,3 tensor, where the shape of each element is (3,), and each element has 2 rows.

t = [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]

In this example each element has a shape of (2,). Tensors with elements of different shapes, known as ragged tensors, are not supported. For example:

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
])

**INVALID SHAPE**
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df
 ts
0[2.35, 5.75, 19.2][Bob, Nancy, Wani]
1[5.55, 7.2, 15.7][Jason, Rita, Phoebe]
ParameterDescription
Web Sitehttps://www.tensorflow.org/
Supported Librariestensorflow==2.9.1
FrameworkFramework.TENSORFLOW aka tensorflow
RuntimeNative aka tensorflow
Supported File TypesSavedModel format as .zip file

TensorFlow File Format

TensorFlow models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

ML models that meet the Tensorflow and SavedModel format will run as Wallaroo Native runtimes by default.

See the SavedModel guide for full details.

ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.PYTHON aka python
RuntimeNative aka python

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will be receiving the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then it will be the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from a ML Model who’s output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns a DataFrame with that element as the value of the column output.

def wallaroo_json(data: pd.DataFrame):
    print(data)
    return [{"output": [data["dense_2"].to_list()[0][0]]}]

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, this results the output field as the out.output field in the Wallaroo inference result.

 timein.tensorout.outputcheck_failures
02023-06-20 20:23:28.395[0.6878518042, 0.1760734021, -0.869514083, 0.3..[12.886651039123535]0
ParameterDescription
Web Sitehttps://huggingface.co/models
Supported Libraries
  • transformers==4.27.0
  • diffusers==0.14.0
  • accelerate==0.18.0
  • torchvision==0.14.1
  • torch==1.13.1
FrameworksThe following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
RuntimeContainerized aka tensorflow / mlflow

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING-FACE-IMAGE-TO-TEXT
  • Framework.HUGGING-FACE-TEXT-CLASSIFICATION
  • Framework.HUGGING-FACE-SUMMARIZATION
  • Framework.HUGGING-FACE-TRANSLATION

Additional inputs added to these Hugging Face pipelines will be added as key/pair value arguments to the model’s generate method. If the argument is not required, then the model will default to the values coded in the original Hugging Face model’s source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

Wallaroo FrameworkReference
Framework.HUGGING-FACE-FEATURE-EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-TO-TEXT

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-QUESTION-ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-STABLE-DIFFUSION-TEXT-2-IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SUMMARIZATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-CLASSIFICATION

Schemas

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TRANSLATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SENTIMENT-ANALYSISHugging Face Sentiment Analysis
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-GENERATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])
ParameterDescription
Web Sitehttps://pytorch.org/
Supported Libraries
  • torch==1.13.1
  • torchvision==0.14.1
FrameworkFramework.PYTORCH aka pytorch
Supported File Typespt ot pth in TorchScript format
RuntimeContainerized aka mlflow

Sci-kit Learn aka SKLearn.

ParameterDescription
Web Sitehttps://scikit-learn.org/stable/index.html
Supported Libraries
  • scikit-learn==1.2.2
FrameworkFramework.SKLEARN aka sklearn
RuntimeContainerized aka tensorflow / mlflow

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

SKLearn Schema Outputs

Outputs for SKLearn that are meant to be predictions or probabilities when output by the model are labeled in the output schema for the model when uploaded to Wallaroo. For example, a model that outputs either 1 or 0 as its output would have the output schema as follows:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)
 timein.inputsout.predictionscheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries
  • tensorflow==2.8.0
  • keras==1.1.0
FrameworkFramework.KERAS aka keras
Supported File TypesSavedModel format as .zip file and HDF5 format
RuntimeContainerized aka mlflow

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 for Tensorflow Keras models.

ParameterDescription
Web Sitehttps://xgboost.ai/
Supported Librariesxgboost==1.7.4
FrameworkFramework.XGBOOST aka xgboost
Supported File Typespickle (XGB files are not supported.)
RuntimeContainerized aka tensorflow / mlflow

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])
pipeline.infer(dataframe)
 timein.inputsout.outputcheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.CUSTOM aka custom
RuntimeContainerized aka mlflow

Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

ArtifactTypeDescription
Python scripts aka .py files with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilderPython ScriptExtend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there is no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder - any qualified class name is sufficient as long as these two classes are extended as defined below.
requirements.txtPython requirements fileThis sets the Python libraries used for the arbitrary python model. These libraries should be targeted for Python 3.8 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This insures that the script and methods will function exactly the same as during the model creation process.
Other artifactsFilesOther models, files, and other artifacts used in support of this model.

For example, the if the arbitrary python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input some input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects
ObjectTypeDescription
model Optional[Any]An optional list of models that match the supported frameworks from wallaroo.framework.Framework included in the arbitrary python script. Note that this is optional - no models are actually required. A BYOP can refer to a specific model(s) used, be used for data processing and reshaping for later pipeline steps, or other needs.
mac.inference.Inference Methods
MethodReturnsDescription
expected_model_types (Required)SetReturns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects.
_predict (input_data: mac.types.InferenceData) (Required)mac.types.InferenceDataThe entry point for the Wallaroo inference with the following input and output parameters that are defined when the model is updated.
  • mac.types.InferenceData: The input InferenceData is a dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format.
  • mac.types.InferenceData: The output is a dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format.
The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData.
raise_error_if_model_is_not_assignedN/AError when expected_model_types is not set.
raise_error_if_model_is_wrong_typeN/AError when the model does not match the expected_model_types.

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods
MethodReturnsDescription
create(config mac.config.inference.CustomInferenceConfig) (Required)The custom Inference instance.Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary python .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact vgg_clustering\feature_extractor.h5 would be set with config.model_path \ feature_extractor.h5. The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types.
inferencecustom Inference instance.Returns the instantiated custom Inference object created from the create method.

Arbitrary Python Runtime

Arbitrary Python always run in the containerized model runtime.

ParameterDescription
Web Sitehttps://mlflow.org
Supported Librariesmlflow==1.30.0
RuntimeContainerized aka mlflow

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

List Wallaroo Frameworks

Wallaroo frameworks are listed from the Wallaroo.Framework class. The following demonstrates listing all available supported frameworks.

from wallaroo.framework import Framework

[e.value for e in Framework]

    ['onnx',
    'tensorflow',
    'python',
    'keras',
    'sklearn',
    'pytorch',
    'xgboost',
    'hugging-face-feature-extraction',
    'hugging-face-image-classification',
    'hugging-face-image-segmentation',
    'hugging-face-image-to-text',
    'hugging-face-object-detection',
    'hugging-face-question-answering',
    'hugging-face-stable-diffusion-text-2-img',
    'hugging-face-summarization',
    'hugging-face-text-classification',
    'hugging-face-translation',
    'hugging-face-zero-shot-classification',
    'hugging-face-zero-shot-image-classification',
    'hugging-face-zero-shot-object-detection',
    'hugging-face-sentiment-analysis',
    'hugging-face-text-generation']

4.1 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: ONNX

How to upload and use ONNX ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Wallaroo natively supports Open Neural Network Exchange (ONNX) models into the Wallaroo engine.

ParameterDescription
Web Sitehttps://onnx.ai/
Supported LibrariesSee table below.
FrameworkFramework.ONNX aka onnx
RuntimeNative aka onnx

The following ONNX versions models are supported:

Wallaroo VersionONNX VersionONNX IR VersionONNX OPset VersionONNX ML Opset Version
2023.2.1 (July 2023)1.12.18173
2023.2 (May 2023)1.12.18173
2023.1 (March 2023)1.12.18173
2022.4 (December 2022)1.12.18173
After April 2022 until release 2022.4 (December 2022)1.10.*7152
Before April 20221.6.*7132

For the most recent release of Wallaroo 2023.2.1, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inference performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in fields returned for an inference. In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a 2,2,3 tensor, where the shape of each element is (3,), and each element has 2 rows.

t = [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]

In this example each element has a shape of (2,). Tensors with elements of different shapes, known as ragged tensors, are not supported. For example:

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
])

**INVALID SHAPE**
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df
 ts
0[2.35, 5.75, 19.2][Bob, Nancy, Wani]
1[5.55, 7.2, 15.7][Jason, Rita, Phoebe]

Upload ONNX Model to Wallaroo

Open Neural Network eXchange(ONNX) is the default model runtime supported by Wallaroo. ONNX models are uploaded to the current workspace through the Wallaroo Client upload_model(name, path, framework, input_schema, output_schema).configure(options). When uploading a default ML Model that matches the default Wallaroo runtime, the configure(options) can be left empty or the framework onnx specified.

Uploading ONNX Models

ONNX models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload ONNX Model Parameters

The following parameters are required for ONNX models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a ONNX model to Wallaroo.

For ONNX models, the input_schema and output_schema are not required so are not listed here.

ParameterTypeDescription
namestring (Required)The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
pathstring (Required)The path to the model file being uploaded.
frameworkstring (Required)Set as the Framework.ONNX.
input_schemapyarrow.lib.Schema (Optional)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Optional)The output schema in Apache Arrow schema format.
convert_waitbool (Optional) (Default: True)Not required for native runtimes.
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload ONNX Model Return

The following is returned with a successful model upload and conversion.

FieldTypeDescription
namestringThe name of the model.
versionstringThe model version as a unique UUID.
file_namestringThe file name of the model as stored in Wallaroo.
SHAstringThe hash value of the model file.
StatusstringThe status of the model. Values include:
image_pathstringThe image used to deploy the model in the Wallaroo engine.
last_update_timeDateTimeWhen the model was last updated.

For example:

model_name = "embedder-o"
model_path = "./embedder.onnx"

embedder = wl.upload_model(model_name, model_path, Framework=Framework.ONNX).configure("onnx")

ONNX Conversion Tips

When converting from one ML model type to an ONNX ML model, the input and output fields should be specified so users anticipate the exact field names used in their code. This prevents conversion naming formats from creating unintended names, and sets consistent field names that can be relied upon in future code updates.

The following example shows naming the input and output names when converting from a PyTorch model to an ONNX model. Note that the input fields are set to data, and the output fields are set to output_names = ["bounding-box", "classification","confidence"].

input_names = ["data"]
output_names = ["bounding-box", "classification","confidence"]
torch.onnx.export(model,
                    tensor,
                    pytorchModelPath+'.onnx',
                    input_names=input_names,
                    output_names=output_names,
                    opset_version=17,
                    )

See the documentation for the specific ML model being converting from to ONNX for complete details.

Pipeline Deployment Configurations

Pipeline configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space.

This model will always run in the native runtime space.

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build()

4.2 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Arbitrary Python

How to upload and use Containerized MLFlow with Wallaroo

Arbitrary Python or BYOP (Bring Your Own Predict) allows organizations to use Python scripts and supporting libraries as it’s own model. Similar to using a Python step, arbitrary python is an even more robust and flexible tool for working with ML Models in Wallaroo pipelines.

ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.CUSTOM aka custom
RuntimeContainerized aka mlflow

Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

ArtifactTypeDescription
Python scripts aka .py files with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilderPython ScriptExtend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there is no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder - any qualified class name is sufficient as long as these two classes are extended as defined below.
requirements.txtPython requirements fileThis sets the Python libraries used for the arbitrary python model. These libraries should be targeted for Python 3.8 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This insures that the script and methods will function exactly the same as during the model creation process.
Other artifactsFilesOther models, files, and other artifacts used in support of this model.

For example, the if the arbitrary python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input some input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects
ObjectTypeDescription
model Optional[Any]An optional list of models that match the supported frameworks from wallaroo.framework.Framework included in the arbitrary python script. Note that this is optional - no models are actually required. A BYOP can refer to a specific model(s) used, be used for data processing and reshaping for later pipeline steps, or other needs.
mac.inference.Inference Methods
MethodReturnsDescription
expected_model_types (Required)SetReturns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects.
_predict (input_data: mac.types.InferenceData) (Required)mac.types.InferenceDataThe entry point for the Wallaroo inference with the following input and output parameters that are defined when the model is updated.
  • mac.types.InferenceData: The input InferenceData is a dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format.
  • mac.types.InferenceData: The output is a dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format.
The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData.
raise_error_if_model_is_not_assignedN/AError when expected_model_types is not set.
raise_error_if_model_is_wrong_typeN/AError when the model does not match the expected_model_types.

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods
MethodReturnsDescription
create(config mac.config.inference.CustomInferenceConfig) (Required)The custom Inference instance.Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary python .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact vgg_clustering\feature_extractor.h5 would be set with config.model_path \ feature_extractor.h5. The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types.
inferencecustom Inference instance.Returns the instantiated custom Inference object created from the create method.

Arbitrary Python Runtime

Arbitrary Python always run in the containerized model runtime.

Upload Arbitrary Python Model

Arbitrary Python models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload Arbitrary Python Model Parameters

The following parameters are required for Arbitrary Python models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a Arbitrary Python model to Wallaroo.

ParameterTypeDescription
namestring (Required)The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
pathstring (Required)The path to the model file being uploaded.
frameworkstring (Upload Method Optional, Arbitrary Python model Required)Set as Framework.CUSTOM.
input_schemapyarrow.lib.Schema (Upload Method Optional, Arbitrary Python model Required)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Upload Method Optional, Arbitrary Python model Required)The output schema in Apache Arrow schema format.
convert_waitbool (Upload Method Optional, Arbitrary Python model Optional) (Default: True)
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload Arbitrary Python Model Return

The following is returned with a successful model upload and conversion.

FieldTypeDescription
namestringThe name of the model.
versionstringThe model version as a unique UUID.
file_namestringThe file name of the model as stored in Wallaroo.
image_pathstringThe image used to deploy the model in the Wallaroo engine.
last_update_timeDateTimeWhen the model was last updated.

Upload Arbitrary Python Model Example

The following example is of uploading a Arbitrary Python VGG16 Clustering ML Model to a Wallaroo instance.

Arbitrary Python Script Example

The following is an example script that fulfills the requirements for a Wallaroo Arbitrary Python Model, and would be saved as custom_inference.py.

"""This module features an example implementation of a custom Inference and its
corresponding InferenceBuilder."""

import pathlib
import pickle
from typing import Any, Set

import tensorflow as tf
from mac.config.inference import CustomInferenceConfig
from mac.inference import Inference
from mac.inference.creation import InferenceBuilder
from mac.types import InferenceData
from sklearn.cluster import KMeans


class ImageClustering(Inference):
    """Inference class for image clustering, that uses
    a pre-trained VGG16 model on cifar10 as a feature extractor
    and performs clustering on a trained KMeans model.

    Attributes:
        - feature_extractor: The embedding model we will use
        as a feature extractor (i.e. a trained VGG16).
        - expected_model_types: A set of model instance types that are expected by this inference.
        - model: The model on which the inference is calculated.
    """

    def __init__(self, feature_extractor: tf.keras.Model):
        self.feature_extractor = feature_extractor
        super().__init__()

    @property
    def expected_model_types(self) -> Set[Any]:
        return {KMeans}

    @Inference.model.setter  # type: ignore
    def model(self, model) -> None:
        """Sets the model on which the inference is calculated.

        :param model: A model instance on which the inference is calculated.

        :raises TypeError: If the model is not an instance of expected_model_types
            (i.e. KMeans).
        """
        self._raise_error_if_model_is_wrong_type(model) # this will make sure an error will be raised if the model is of wrong type
        self._model = model

    def _predict(self, input_data: InferenceData) -> InferenceData:
        """Calculates the inference on the given input data.
        This is the core function that each subclass needs to implement
        in order to calculate the inference.

        :param input_data: The input data on which the inference is calculated.
        It is of type InferenceData, meaning it comes as a dictionary of numpy
        arrays.

        :raises InferenceDataValidationError: If the input data is not valid.
        Ideally, every subclass should raise this error if the input data is not valid.

        :return: The output of the model, that is a dictionary of numpy arrays.
        """

        # input_data maps to the input_schema we have defined
        # with PyArrow, coming as a dictionary of numpy arrays
        inputs = input_data["images"]

        # Forward inputs to the models
        embeddings = self.feature_extractor(inputs)
        predictions = self.model.predict(embeddings.numpy())

        # Return predictions as dictionary of numpy arrays
        return {"predictions": predictions}


class ImageClusteringBuilder(InferenceBuilder):
    """InferenceBuilder subclass for ImageClustering, that loads
    a pre-trained VGG16 model on cifar10 as a feature extractor
    and a trained KMeans model, and creates an ImageClustering object."""

    @property
    def inference(self) -> ImageClustering:
        return ImageClustering

    def create(self, config: CustomInferenceConfig) -> ImageClustering:
        """Creates an Inference subclass and assigns a model and additionally
        needed attributes to it.

        :param config: Custom inference configuration. In particular, we're
        interested in `config.model_path` that is a pathlib.Path object
        pointing to the folder where the model artifacts are saved.
        Every artifact we need to load from this folder has to be
        relative to `config.model_path`.

        :return: A custom Inference instance.
        """
        feature_extractor = self._load_feature_extractor(
            config.model_path / "feature_extractor.h5"
        )
        inference = self.inference(feature_extractor)
        model = self._load_model(config.model_path / "kmeans.pkl")
        inference.model = model

        return inference

    def _load_feature_extractor(
        self, file_path: pathlib.Path
    ) -> tf.keras.Model:
        return tf.keras.models.load_model(file_path)

    def _load_model(self, file_path: pathlib.Path) -> KMeans:
        with open(file_path.as_posix(), "rb") as fp:
            model = pickle.load(fp)
        return model

The following is the requirements.txt file that would be included in the arbitrary python ZIP file. It is highly recommended to use the same requirements.txt file for setting the libraries and versions used to create the model in the arbitrary python ZIP file.

tensorflow==2.8.0
scikit-learn==1.2.2

Upload Upload Arbitrary Python Example

The following example demonstrates uploading the arbitrary python model as vgg_clustering.zip with the following input and output schemas defined.

input_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=32
        ),
        list_size=32
    )),
])

output_schema = pa.schema([
    pa.field('predictions', pa.int64()),
])

model = wl.upload_model(
                        'vgg16-clustering', 
                        'vgg16_clustering.zip', 
                        framework=Framework.CUSTOM, 
                        input_schema=input_schema, 
                        output_schema=output_schema, 
                        convert_wait=True
                    )
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion..Converting.................Ready.

{
    'name': 'vgg16-clustering', 
    'version': '819114a4-c1a4-43b4-b66b-a486e05a867f', 
    'file_name': 'model-auto-conversion_BYOP_vgg16_clustering.zip', 
    'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3443', 
    'last_update_time': datetime.datetime(2023, 6, 28, 16, 54, 38, 299848, tzinfo=tzutc())
}

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This model will always run in the containerized runtime space.

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "__timeout=188 --workers=1"}
                    )
                    .build()

4.3 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Containerized MLFlow

How to upload and use Containerized MLFlow with Wallaroo
ParameterDescription
Web Sitehttps://mlflow.org
Supported Librariesmlflow==1.30.0
RuntimeContainerized aka mlflow

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Register a Containerized MLFlow Model

Containerized MLFlow models are not uploaded, but registered from a container registry service. This is performed through the Wallaroo Client .register_model_image(name, image).configure(options) method. For the options, the following must be defined:

  • runtime: Set as mlflow.
  • input_schema: The input schema from the Apache Arrow pyarrow.lib.Schema format.
  • output_schema: The output schema from the Apache Arrow pyarrow.lib.Schema format.

For example:

sm_input_schema = pa.schema([
  pa.field('temp', pa.float32()),
  pa.field('holiday', pa.uint8()),
  pa.field('workingday', pa.uint8()),
  pa.field('windspeed', pa.float32())
])

sm_output_schema = pa.schema([
    pa.field('predicted_mean', pa.float32())
])

statsmodelUrl = "ghcr.io/wallaroolabs/wallaroo_tutorials/mlflow-statsmodels-example:2023.1"

sm_model = wl.register_model_image(
    name=f"{prefix}statmodels",
    image=f"{statsmodelUrl}"
    ).configure("mlflow", 
            input_schema=sm_input_schema, 
            output_schema=sm_output_schema
    )

MLFlow Data Formats

When using containerized MLFlow models with Wallaroo, the inputs and outputs must be named. For example, the following output:

[-12.045839810372835]

Would need to be wrapped with the data values named:

[{"prediction": -12.045839810372835}]

A short sample code for wrapping data may be:

output_df = pd.DataFrame(prediction, columns=["prediction"])
return output_df

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This model will always run in the containerized runtime space.

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "__timeout=188 --workers=1"}
                    )
                    .build()

4.4 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Model Registry Services

How to upload and use Registry ML Models with Wallaroo

Wallaroo users can register their trained machine learning models from a model registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

This guide details how to add ML Models from a model registry service into a Wallaroo instance.

Artifact Requirements

Models are uploaded to the Wallaroo instance as the specific artifact - the “file” or other data that represents the file itself. This must comply with the Wallaroo model requirements framework and version or it will not be deployed. Note that for models that fall outside of the supported model types, they can be registered to a Wallaroo workspace as MLFlow 1.30.0 containerized models.

Supported Models

The following frameworks are supported. Frameworks fall under either Native or Containerized runtimes in the Wallaroo engine. For more details, see the specific framework what runtime a specific model framework runs in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

Please note the following.

Wallaroo natively supports Open Neural Network Exchange (ONNX) models into the Wallaroo engine.

ParameterDescription
Web Sitehttps://onnx.ai/
Supported LibrariesSee table below.
FrameworkFramework.ONNX aka onnx
RuntimeNative aka onnx

The following ONNX versions models are supported:

Wallaroo VersionONNX VersionONNX IR VersionONNX OPset VersionONNX ML Opset Version
2023.2.1 (July 2023)1.12.18173
2023.2 (May 2023)1.12.18173
2023.1 (March 2023)1.12.18173
2022.4 (December 2022)1.12.18173
After April 2022 until release 2022.4 (December 2022)1.10.*7152
Before April 20221.6.*7132

For the most recent release of Wallaroo 2023.2.1, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the native runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inference performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in fields returned for an inference. In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a 2,2,3 tensor, where the shape of each element is (3,), and each element has 2 rows.

t = [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]

In this example each element has a shape of (2,). Tensors with elements of different shapes, known as ragged tensors, are not supported. For example:

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
])

**INVALID SHAPE**
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df
 ts
0[2.35, 5.75, 19.2][Bob, Nancy, Wani]
1[5.55, 7.2, 15.7][Jason, Rita, Phoebe]
ParameterDescription
Web Sitehttps://www.tensorflow.org/
Supported Librariestensorflow==2.9.1
FrameworkFramework.TENSORFLOW aka tensorflow
RuntimeNative aka tensorflow
Supported File TypesSavedModel format as .zip file

TensorFlow File Format

TensorFlow models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

ML models that meet the Tensorflow and SavedModel format will run as Wallaroo Native runtimes by default.

See the SavedModel guide for full details.

ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.PYTHON aka python
RuntimeNative aka python

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will be receiving the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then it will be the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from a ML Model who’s output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns a DataFrame with that element as the value of the column output.

def wallaroo_json(data: pd.DataFrame):
    print(data)
    return [{"output": [data["dense_2"].to_list()[0][0]]}]

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, this results the output field as the out.output field in the Wallaroo inference result.

 timein.tensorout.outputcheck_failures
02023-06-20 20:23:28.395[0.6878518042, 0.1760734021, -0.869514083, 0.3..[12.886651039123535]0
ParameterDescription
Web Sitehttps://huggingface.co/models
Supported Libraries
  • transformers==4.27.0
  • diffusers==0.14.0
  • accelerate==0.18.0
  • torchvision==0.14.1
  • torch==1.13.1
FrameworksThe following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
RuntimeContainerized aka tensorflow / mlflow

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING-FACE-IMAGE-TO-TEXT
  • Framework.HUGGING-FACE-TEXT-CLASSIFICATION
  • Framework.HUGGING-FACE-SUMMARIZATION
  • Framework.HUGGING-FACE-TRANSLATION

Additional inputs added to these Hugging Face pipelines will be added as key/pair value arguments to the model’s generate method. If the argument is not required, then the model will default to the values coded in the original Hugging Face model’s source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

Wallaroo FrameworkReference
Framework.HUGGING-FACE-FEATURE-EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-TO-TEXT

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-QUESTION-ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-STABLE-DIFFUSION-TEXT-2-IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SUMMARIZATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-CLASSIFICATION

Schemas

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TRANSLATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SENTIMENT-ANALYSISHugging Face Sentiment Analysis
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-GENERATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])
ParameterDescription
Web Sitehttps://pytorch.org/
Supported Libraries
  • torch==1.13.1
  • torchvision==0.14.1
FrameworkFramework.PYTORCH aka pytorch
Supported File Typespt ot pth in TorchScript format
RuntimeContainerized aka mlflow

Sci-kit Learn aka SKLearn.

ParameterDescription
Web Sitehttps://scikit-learn.org/stable/index.html
Supported Libraries
  • scikit-learn==1.2.2
FrameworkFramework.SKLEARN aka sklearn
RuntimeContainerized aka tensorflow / mlflow

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

SKLearn Schema Outputs

Outputs for SKLearn that are meant to be predictions or probabilities when output by the model are labeled in the output schema for the model when uploaded to Wallaroo. For example, a model that outputs either 1 or 0 as its output would have the output schema as follows:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)
 timein.inputsout.predictionscheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries
  • tensorflow==2.8.0
  • keras==1.1.0
FrameworkFramework.KERAS aka keras
Supported File TypesSavedModel format as .zip file and HDF5 format
RuntimeContainerized aka mlflow

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 for Tensorflow Keras models.

ParameterDescription
Web Sitehttps://xgboost.ai/
Supported Librariesxgboost==1.7.4
FrameworkFramework.XGBOOST aka xgboost
Supported File Typespickle (XGB files are not supported.)
RuntimeContainerized aka tensorflow / mlflow

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])
pipeline.infer(dataframe)
 timein.inputsout.outputcheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.CUSTOM aka custom
RuntimeContainerized aka mlflow

Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

ArtifactTypeDescription
Python scripts aka .py files with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilderPython ScriptExtend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there is no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder - any qualified class name is sufficient as long as these two classes are extended as defined below.
requirements.txtPython requirements fileThis sets the Python libraries used for the arbitrary python model. These libraries should be targeted for Python 3.8 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This insures that the script and methods will function exactly the same as during the model creation process.
Other artifactsFilesOther models, files, and other artifacts used in support of this model.

For example, the if the arbitrary python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input some input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects
ObjectTypeDescription
model Optional[Any]An optional list of models that match the supported frameworks from wallaroo.framework.Framework included in the arbitrary python script. Note that this is optional - no models are actually required. A BYOP can refer to a specific model(s) used, be used for data processing and reshaping for later pipeline steps, or other needs.
mac.inference.Inference Methods
MethodReturnsDescription
expected_model_types (Required)SetReturns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects.
_predict (input_data: mac.types.InferenceData) (Required)mac.types.InferenceDataThe entry point for the Wallaroo inference with the following input and output parameters that are defined when the model is updated.
  • mac.types.InferenceData: The input InferenceData is a dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format.
  • mac.types.InferenceData: The output is a dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format.
The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData.
raise_error_if_model_is_not_assignedN/AError when expected_model_types is not set.
raise_error_if_model_is_wrong_typeN/AError when the model does not match the expected_model_types.

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods
MethodReturnsDescription
create(config mac.config.inference.CustomInferenceConfig) (Required)The custom Inference instance.Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary python .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact vgg_clustering\feature_extractor.h5 would be set with config.model_path \ feature_extractor.h5. The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types.
inferencecustom Inference instance.Returns the instantiated custom Inference object created from the create method.

Arbitrary Python Runtime

Arbitrary Python always run in the containerized model runtime.

ParameterDescription
Web Sitehttps://mlflow.org
Supported Librariesmlflow==1.30.0
RuntimeContainerized aka mlflow

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

List Wallaroo Frameworks

Wallaroo frameworks are listed from the Wallaroo.Framework class. The following demonstrates listing all available supported frameworks.

from wallaroo.framework import Framework

[e.value for e in Framework]

    ['onnx',
    'tensorflow',
    'python',
    'keras',
    'sklearn',
    'pytorch',
    'xgboost',
    'hugging-face-feature-extraction',
    'hugging-face-image-classification',
    'hugging-face-image-segmentation',
    'hugging-face-image-to-text',
    'hugging-face-object-detection',
    'hugging-face-question-answering',
    'hugging-face-stable-diffusion-text-2-img',
    'hugging-face-summarization',
    'hugging-face-text-classification',
    'hugging-face-translation',
    'hugging-face-zero-shot-classification',
    'hugging-face-zero-shot-image-classification',
    'hugging-face-zero-shot-object-detection',
    'hugging-face-sentiment-analysis',
    'hugging-face-text-generation']

Registry Services Roles

Registry service use in Wallaroo typically falls under the following roles.

RoleRecommended ActionsDescription
DevOps EngineerCreate Model RegistryCreate the model (AKA artifact) registry service
 Retrieve Model Registry TokensGenerate the model registry service credentials.
MLOps EngineerConnect Model Registry to WallarooAdd the Registry Service URL and credentials into a Wallaroo instance for use by other users and scripts.
 Add Wallaroo Registry Service to WorkspaceAdd the registry service configuration to a Wallaroo workspace for use by workspace users.
Data ScientistList Registries in a WorkspaceList registries available from a workspace.
 List Models in RegistryList available models in a model registry.
 List Model Versions of Registered ModelList versions of a registry stored model.
 List Model Version ArtifactsRetrieve the artifacts (usually files) for a model stored in a model registry.
 Upload Model from RegistryUpload a model and artifacts stored in a model registry into a Wallaroo workspace.

Model Registry Operations

The following links to guides and information on setting up a model registry (also known as an artifact registry).

Create Model Registry

See Model serving with Azure Databricks for setting up a model registry service using Azure Databricks.

The following steps create an Access Token used to authenticate to an Azure Databricks Model Registry.

  1. Log into the Azure Databricks workspace.
  2. From the upper right corner access the User Settings.
  3. From the Access tokens, select Generate new token.
  4. Specify any token description and lifetime. Once complete, select Generate.
  5. Copy the token and store in a secure place. Once the Generate New Token module is closed, the token will not be retrievable.
Retrieve Azure Databricks User Token

The MLflow Model Registry provides a method of setting up a model registry service. Full details can be found at the MLflow Registry Guides.

A generic MLFlow model registry requires no token.

Wallaroo Registry Operations

  • Connect Model Registry to Wallaroo: This details the link and connection information to a existing MLFlow registry service. Note that this does not create a MLFlow registry service, but adds the connection and credentials to Wallaroo to allow that MLFlow registry service to be used by other entities in the Wallaroo instance.
  • Add a Registry to a Workspace: Add the created Wallaroo Model Registry so make it available to other workspace members.
  • Remove a Registry from a Workspace: Remove the link between a Wallaroo Model Registry and a Wallaroo workspace.

Connect Model Registry to Wallaroo

MLFlow Registry connection information is added to a Wallaroo instance through the Wallaroo.Client.create_model_registry method.

Connect Model Registry to Wallaroo Parameters

ParameterTypeDescription
namestring (Required)The name of the MLFlow Registry service.
tokenstring (Required)The authentication token used to authenticate to the MLFlow Registry.
urlstring (Required)The URL of the MLFlow registry service.

Connect Model Registry to Wallaroo Return

The following is returned when a MLFlow Registry is successfully created.

FieldTypeDescription
NamestringThe name of the MLFlow Registry service.
URLstringThe URL for connecting to the service.
WorkspacesList[string]The name of all workspaces this registry was added to.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

Note that the token is not displayed for security reasons.

Connect Model Registry to Wallaroo Example

The following example creates a Wallaroo MLFlow Registry with the name ExampleNotebook stored in a sample Azure DataBricks environment.

wl.create_model_registry(name="ExampleNotebook", 
                        token="abcdefg-3", 
                        url="https://abcd-123489.456.azuredatabricks.net")
FieldValue
NameExampleNotebook
URLhttps://abcd-123489.456.azuredatabricks.net
Workspacessample.user@wallaroo.ai - Default Workspace
Created At2023-27-Jun 13:57:26
Updated At2023-27-Jun 13:57:26

Add Registry to Workspace

Registries are assigned to a Wallaroo workspace with the Wallaroo.registry.add_registry_to_workspace method. This allows members of the workspace to access the registry connection. A registry can be associated with one or more workspaces.

Add Registry to Workspace Parameters

ParameterTypeDescription
namestring (Required)The numerical identifier of the workspace.

Add Registry to Workspace Returns

The following is returned when a MLFlow Registry is successfully added to a workspace.

FieldTypeDescription
NamestringThe name of the MLFlow Registry service.
URLstringThe URL for connecting to the service.
WorkspacesList[string]The name of all workspaces this registry was added to.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

Example

registry.add_registry_to_workspace(workspace_id=workspace_id)
FieldValue
NameExampleNotebook
URLhttps://abcd-123489.456.azuredatabricks.net
Workspacessample.user@wallaroo.ai - Default Workspace
Created At2023-27-Jun 13:57:26
Updated At2023-27-Jun 13:57:26

Remove Registry from Workspace

Registries are removed from a Wallaroo workspace with the Registry remove_registry_from_workspace method.

Remove Registry from Workspace Parameters

ParameterTypeDescription
workspace_idInteger (Required)The numerical identifier of the workspace.

Remove Registry from Workspace Return

FieldTypeDescription
NamestringThe name of the MLFlow Registry service.
URLstringThe URL for connecting to the service.
WorkspacesList(string)A list of workspaces by name that still contain the registry.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

Remove Registry from Workspace Example

registry.remove_registry_from_workspace(workspace_id=workspace_id)
FieldValue
NameJeffRegistry45
URLhttps://sample.registry.azuredatabricks.net
Workspacesjohn.hummel@wallaroo.ai - Default Workspace
Created At2023-17-Jul 17:56:52
Updated At2023-17-Jul 17:56:52

Wallaroo Registry Model Operations

  • List Registries in a Workspace: List the available registries in the current workspace.
  • List Models: List Models in a Registry
  • Upload Model: Upload a version of a ML Model from the Registry to a Wallaroo workspace.
  • List Model Versions: List the versions of a particular model.
  • Remove Registry from Workspace: Remove a specific Registry configuration from a specific workspace.

List Registries in a Workspace

Registries associated with a workspace are listed with the Wallaroo.Client.list_model_registries() method. This lists all registries associated with the current workspace.

List Registries in a Workspace Parameters

None

List Registries in a Workspace Returns

A List of Registries with the following fields.

FieldTypeDescription
NamestringThe name of the MLFlow Registry service.
URLstringThe URL for connecting to the service.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

List Registries in a Workspace Example

wl.list_model_registries()
nameregistry urlcreated atupdated at
gibhttps://sampleregistry.wallaroo.ai2023-27-Jun 03:22:462023-27-Jun 03:22:46
ExampleNotebookhttps://sampleregistry.wallaroo.ai2023-27-Jun 13:57:262023-27-Jun 13:57:26

List Models in a Registry

A List of models available to the Wallaroo instance through the MLFlow Registry is performed with the Wallaroo.Registry.list_models() method.

List Models in a Registry Parameters

None

List Models in a Registry Returns

A List of models with the following fields.

FieldTypeDescription
NamestringThe name of the model.
Registry UserstringThe user account that is tied to the registry service for this model.
VersionsintThe number of versions for the model, starting at 0.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

List Models in a Registry Example

registry.list_models()
NameRegistry UserVersionsCreated AtUpdated At
testmodelsample.user@wallaroo.ai02023-16-Jun 14:38:422023-16-Jun 14:38:42
testmodel2sample.user@wallaroo.ai02023-16-Jun 14:41:042023-16-Jun 14:41:04
wine_qualitysample.user@wallaroo.ai22023-16-Jun 15:05:532023-16-Jun 15:09:57

Retrieve Specific Model Details from the Registry

Model details are retrieved by assigning a MLFlow Registry Model to an object with the Wallaroo.Registry.list_models(), then specifying the element in the list to save it to a Registered Model object.

The following will return the most recent model added to the MLFlow Registry service.

mlflow_model = registry.list_models()[-1]
mlflow_model
FieldTypeDescription
NamestringThe name of the model.
Registry UserstringThe user account that is tied to the registry service for this model.
VersionsintThe number of versions for the model, starting at 0.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

List Model Versions of Registered Model

MLFlow registries can contain multiple versions of a ML Model. These are listed and are listed with the Registered Model versions attribute. The versions are listed in reverse order of insertion, with the most recent model version in position 0.

List Model Versions of Registered Model Parameters

None

List Model Versions of Registered Model Returns

A List of the Registered Model Versions with the following fields.

FieldTypeDescription
NamestringThe name of the model.
VersionintThe version number. The higher numbers are the most recent.
DescriptionstringThe registered model’s description from the MLFlow Registry service.

List Model Versions of Registered Model Example

The following will return the most recent model added to the MLFlow Registry service and list its versions.

mlflow_model = registry.list_models()[-1]
mlflow_model.versions
NameVersionDescription
wine_quality2None
wine_quality1None

List Model Version Artifacts

Artifacts belonging to a MLFlow registry model are listed with the Model Version list_artifacts() method. This returns all artifacts for the model.

List Model Version Artifacts Parameters

None

List Model Version Artifacts Returns

A List of artifacts with the following fields.

FieldTypeDescription
file_namestringThe name assigned to the artifact.
file_sizestringThe size of the artifact in bytes.
full_pathstringThe path of the artifact. This will be used to upload the artifact to Wallaroo.

List Model Version Artifacts Example

The following will list the artifacts in a single registry model.

single_registry_model.versions[0].list_artifacts()
File NameFile SizeFull Path
MLmodel546Bhttps://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/MLmodel
conda.yaml182Bhttps://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/conda.yaml
model.pkl1429Bhttps://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl
python_env.yaml122Bhttps://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/python_env.yaml
requirements.txt73Bhttps://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/requirements.txt

Upload a Model from a Registry

Models uploaded to the Wallaroo workspace are uploaded from a MLFlow Registry with the Wallaroo.Registry.upload method.

Upload a Model from a Registry Parameters

ParameterTypeDescription
namestring (Required)The name to assign the model once uploaded. Model names are unique within a workspace. Models assigned the same name as an existing model will be uploaded as a new model version.
pathstring (Required)The full path to the model artifact in the registry.
frameworkstring (Required)The Wallaroo model Framework. See Model Uploads and Registrations Supported Frameworks
input_schemapyarrow.lib.Schema (Required for non-native runtimes)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Required for non-native runtimes)The output schema in Apache Arrow schema format.

Upload a Model from a Registry Returns

The registry model details as follows.

FieldTypeDescription
NamestringThe name of the model.
VersionstringThe version registered in the Wallaroo instance in UUID format.
File NamestringThe file name associated with the ML Model in the Wallaroo instance.
SHAstringThe models hash value.
StatusstringThe status of the model from the following list.
  • pending_conversion: The model is uploaded to Wallaroo and is ready to convert.
  • converting: The model is being converted into a Wallaroo supported runtime.
  • ready
  • : The model is ready and available for use.
  • error: The model conversion has failed. Check error messages and verify the model is the correct version and framework.
Image PathstringThe image used for the containerization of the model.
Updated AtDateTimeWhen the model was last updated.

Upload a Model from a Registry Example

The following will retrieve the most recent uploaded model and upload it with the XGBOOST framework into the current Wallaroo workspace.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32(), list_size=4))
])

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

model = registry.upload_model(
  name="sklearnonnx", 
  path="https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl", 
  framework=Framework.SKLEARN,
  input_schema=input_schema,
  output_schema=output_schema)
  
Namesklearnonnx
Version63bd932d-320d-4084-b972-0cfe1a943f5a
File Namemodel.pkl
SHA970da8c178e85dfcbb69fab7bad0fb58cd0c2378d27b0b12cc03a288655aa28d
Statuspending_conversion
ImagePathNone
Updated At2023-05-Jul 19:14:49

Retrieve Model Status

The model status is retrieved with the Model status() method.

Retrieve Model Status Parameters

None

Retrieve Model Status Returns

FieldTypeDescription
statusstringThe current status of the uploaded model.
  • pending_conversion: The model is uploaded to Wallaroo and is ready to convert.
  • converting: The model is being converted into a Wallaroo supported runtime.
  • ready
  • : The model is ready and available for use.
  • error: The model conversion has failed. Check error messages and verify the model is the correct version and framework.

Retrieve Model Status Returns Example

The following demonstrates checking the status in the for loop until the model shows either ready or error.

import time
while model.status() != "ready" and model.status() != "error":
    print(model.status())
    time.sleep(3)
print(model.status())

converting
converting
ready

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

For example, uploading an runtime model to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following containerized model after conversion is allocated to the containerized runtime as follows:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build()

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "__timeout=188 --workers=1"}
                    )
                    .build()

4.5 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Python Models

How to upload and use Python Models as Wallaroo Pipeline Steps

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Python scripts are uploaded to Wallaroo and and treated like an ML Models in Pipeline steps. These will be referred to as Python steps.

Python steps can include:

  • Preprocessing steps to prepare the data received to be handed to ML Model deployed as another Pipeline step.
  • Postprocessing steps to take data output by a ML Model as part of a Pipeline step, and prepare the data to be received by some other data store or entity.
  • A model contained within a Python script.

In all of these, the requirements for uploading a Python step as a ML Model in Wallaroo are the same.

ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.PYTHON aka python
RuntimeNative aka python

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will be receiving the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then it will be the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from a ML Model who’s output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns a DataFrame with that element as the value of the column output.

def wallaroo_json(data: pd.DataFrame):
    print(data)
    return [{"output": [data["dense_2"].to_list()[0][0]]}]

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, this results the output field as the out.output field in the Wallaroo inference result.

 timein.tensorout.outputcheck_failures
02023-06-20 20:23:28.395[0.6878518042, 0.1760734021, -0.869514083, 0.3..[12.886651039123535]0

Upload Python Models

Python step models are uploaded to Wallaroo through the Wallaroo Client upload_model(name, path, framework).configure(options).

Upload Python Model Parameters

ParameterTypeDescription
namestring (Required)The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
pathstring (Required)The path to the model file being uploaded.
frameworkstring (Required)Set as the Framework.Python.
input_schemapyarrow.lib.Schema (Optional)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Optional)The output schema in Apache Arrow schema format.
convert_waitbool (Optional) (Default: True)Not required for native runtimes.
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Upload Python Models Example

The following example is of uploading a Python step ML Model to a Wallaroo instance.

import pyarrow as pa
input_schema = pa.schema([
    pa.field('dense_2', pa.list_(pa.float64()))
])
output_schema = pa.schema([
    pa.field('output', pa.list_(pa.float64()))
])

from wallaroo.framework import Framework
step = client.upload_model("python-step", 
                           "./step.py", 
                           framework=Framework.PYTHON)
                           .configure('python', 
                                       input_schema=input_schema,
                                       output_schema=output_schema
                        )

4.6 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: PyTorch

How to upload and use PyTorch ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Wallaroo supports PyTorch models by containerizing the model and running as an image.

ParameterDescription
Web Sitehttps://pytorch.org/
Supported Libraries
  • torch==1.13.1
  • torchvision==0.14.1
FrameworkFramework.PYTORCH aka pytorch
Supported File Typespt ot pth in TorchScript format
RuntimeContainerized aka mlflow

Uploading PyTorch Models

PyTorch models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload PyTorch Model Parameters

The following parameters are required for PyTorch models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a PyTorch model to Wallaroo.

ParameterTypeDescription
namestring (Required)The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
pathstring (Required)The path to the model file being uploaded.
frameworkstring (Upload Method Optional, PyTorch model Required)Set as the Framework.PyTorch.
input_schemapyarrow.lib.Schema (Upload Method Optional, PyTorch model Required)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Upload Method Optional, PyTorch model Required)The output schema in Apache Arrow schema format.
convert_waitbool (Upload Method Optional, PyTorch model Optional) (Default: True)
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload PyTorch Model Return

The following is returned with a successful model upload and conversion.

FieldTypeDescription
namestringThe name of the model.
versionstringThe model version as a unique UUID.
file_namestringThe file name of the model as stored in Wallaroo.
image_pathstringThe image used to deploy the model in the Wallaroo engine.
last_update_timeDateTimeWhen the model was last updated.

Upload PyTorch Model Example

The following example is of uploading a PyTorch ML Model to a Wallaroo instance.

input_schema = pa.schema(
    [
        pa.field('input', pa.list_(pa.float64(), list_size=10))
    ]
)

output_schema = pa.schema(
[
    pa.field('output', pa.list_(pa.float64(), list_size=1))
]
)

model = wl.upload_model('pt-single-io-model', 
                        "./models/model-auto-conversion_pytorch_single_io_model.pt", 
                        framework=Framework.PYTORCH, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion..Converting...........Ready.
{
    'name': 'pt-single-io-model', 
    'version': '8f91dee1-79e0-449b-9a59-0e93ba4a1ba9', 
    'file_name': 'model-auto-conversion_pytorch_single_io_model.pt', 
    'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3397', 
    'last_update_time': datetime.datetime(2023, 6, 23, 2, 8, 56, 669565, tzinfo=tzutc())
}

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

For example, uploading an runtime model to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following containerized model after conversion is allocated to the containerized runtime as follows:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build()

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "__timeout=188 --workers=1"}
                    )
                    .build()

4.7 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: SKLearn

How to upload and use SKLearn ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Wallaroo supports SKLearn models by containerizing the model and running as an image.

Sci-kit Learn aka SKLearn.

ParameterDescription
Web Sitehttps://scikit-learn.org/stable/index.html
Supported Libraries
  • scikit-learn==1.2.2
FrameworkFramework.SKLEARN aka sklearn
RuntimeContainerized aka tensorflow / mlflow

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

SKLearn Schema Outputs

Outputs for SKLearn that are meant to be predictions or probabilities when output by the model are labeled in the output schema for the model when uploaded to Wallaroo. For example, a model that outputs either 1 or 0 as its output would have the output schema as follows:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)
 timein.inputsout.predictionscheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00

Uploading SKLearn Models

SKLearn models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload SKLearn Model Parameters

The following parameters are required for SKLearn models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a SKLearn model to Wallaroo.

ParameterTypeDescription
namestring (Required)The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
pathstring (Required)The path to the model file being uploaded.
frameworkstring (Upload Method Optional, SKLearn model Required)Set as the Framework.SKLEARN.
input_schemapyarrow.lib.Schema (Upload Method Optional, SKLearn model Required)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Upload Method Optional, SKLearn model Required)The output schema in Apache Arrow schema format.
convert_waitbool (Upload Method Optional, SKLearn model Optional) (Default: True)
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload SKLearn Model Return

The following is returned with a successful model upload and conversion.

FieldTypeDescription
namestringThe name of the model.
versionstringThe model version as a unique UUID.
file_namestringThe file name of the model as stored in Wallaroo.
image_pathstringThe image used to deploy the model in the Wallaroo engine.
last_update_timeDateTimeWhen the model was last updated.

Upload SKLearn Model Example

The following example is of uploading a pickled SKLearn ML Model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=10))
])

output_schema = pa.schema([
    pa.field('predictions', pa.float64())
])

model = wl.upload_model(
                        'sklearn-linear-regression', 
                        'models/model-auto-conversion_sklearn_linreg_diabetes.pkl', 
                        framework=Framework.SKLEARN, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                        )
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion..Converting.Pending conversion..Converting.......Ready.

{
    'name': 'sklearn-linear-regression', 
    'version': 'e84809eb-0992-457e-95cb-cc3d20c792db', 
    'file_name': 'model-auto-conversion_sklearn_linreg_diabetes.pkl', 
    'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3443', 
    'last_update_time': datetime.datetime(2023, 6, 27, 22, 17, 52, 967054, tzinfo=tzutc())
}
 agesexbmibps1s2s3s4s5s6
00.0380760.0506800.0616960.021872-0.044223-0.034821-0.043401-0.0025920.019907-0.017646
1-0.001882-0.044642-0.051474-0.026328-0.008449-0.0191630.074412-0.039493-0.068332-0.092204
 inputs
0[0.0380759064, 0.0506801187, 0.0616962065, 0.0…
1[-0.0018820165, -0.0446416365, -0.051474061200…
 timein.inputsout.predictionscheck_failures
02023-07-05 15:43:33.065[0.0380759064, 0.0506801187, 0.0616962065, 0.0…206.1166770
12023-07-05 15:43:33.065[-0.0018820165, -0.0446416365, -0.0514740612, …68.0710330

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

For example, uploading an runtime model to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following containerized model after conversion is allocated to the containerized runtime as follows:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build()

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "__timeout=188 --workers=1"}
                    )
                    .build()

4.8 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Hugging Face

How to upload and use Hugging Face ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Wallaroo supports Hugging Face models by containerizing the model and running as an image.

ParameterDescription
Web Sitehttps://huggingface.co/models
Supported Libraries
  • transformers==4.27.0
  • diffusers==0.14.0
  • accelerate==0.18.0
  • torchvision==0.14.1
  • torch==1.13.1
FrameworksThe following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
RuntimeContainerized aka tensorflow / mlflow

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING-FACE-IMAGE-TO-TEXT
  • Framework.HUGGING-FACE-TEXT-CLASSIFICATION
  • Framework.HUGGING-FACE-SUMMARIZATION
  • Framework.HUGGING-FACE-TRANSLATION

Additional inputs added to these Hugging Face pipelines will be added as key/pair value arguments to the model’s generate method. If the argument is not required, then the model will default to the values coded in the original Hugging Face model’s source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

Wallaroo FrameworkReference
Framework.HUGGING-FACE-FEATURE-EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-IMAGE-TO-TEXT

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-QUESTION-ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-STABLE-DIFFUSION-TEXT-2-IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SUMMARIZATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-CLASSIFICATION

Schemas

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TRANSLATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-IMAGE-CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-ZERO-SHOT-OBJECT-DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING-FACE-SENTIMENT-ANALYSISHugging Face Sentiment Analysis
Wallaroo FrameworkReference
Framework.HUGGING-FACE-TEXT-GENERATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])

Uploading Hugging Face Models

Hugging Face models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload Hugging Face Model Parameters

The following parameters are required for Hugging Face models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a Hugging Face model to Wallaroo.

ParameterTypeDescription
namestring (Required)The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
pathstring (Required)The path to the model file being uploaded.
frameworkstring (Upload Method Optional, Hugging Face model Required)Set as the framework - see the list above for all supported Hugging Face frameworks.
input_schemapyarrow.lib.Schema (Upload Method Optional, Hugging Face model Required)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Upload Method Optional, Hugging Face model Required)The output schema in Apache Arrow schema format.
convert_waitbool (Upload Method Optional, Hugging Face model Optional) (Default: True)
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload Hugging Face Model Return

The following is returned with a successful model upload and conversion.

FieldTypeDescription
namestringThe name of the model.
versionstringThe model version as a unique UUID.
file_namestringThe file name of the model as stored in Wallaroo.
image_pathstringThe image used to deploy the model in the Wallaroo engine.
last_update_timeDateTimeWhen the model was last updated.

Upload Hugging Face Model Example

The following example is of uploading a Hugging Face Zero Shot Classification ML Model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
])

model = wl.upload_model(f"hugging-face-zero-model",
                        './models/model-auto-conversion_hugging-face_dummy-pipelines_zero-shot-classification-pipeline.zip', 
                        framework=Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION, 
                        input_schema=input_schema,
                        output_schema=output_schema)

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

For example, uploading an runtime model to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following containerized model after conversion is allocated to the containerized runtime as follows:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build()

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "__timeout=188 --workers=1"}
                    )
                    .build()

4.9 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: TensorFlow

How to upload and use TensorFlow ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Wallaroo supports TensorFlow models by containerizing the model and running as an image.

ParameterDescription
Web Sitehttps://www.tensorflow.org/
Supported Librariestensorflow==2.9.1
FrameworkFramework.TENSORFLOW aka tensorflow
RuntimeNative aka tensorflow
Supported File TypesSavedModel format as .zip file

TensorFlow File Format

TensorFlow models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

ML models that meet the Tensorflow and SavedModel format will run as Wallaroo Native runtimes by default.

See the SavedModel guide for full details.

Uploading TensorFlow Models

TensorFlow models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload TensorFlow Model Parameters

The following parameters are required for TensorFlow models. Tensorflow models are native runtimes in Wallaroo, so the input_schema and output_schema parameters are optional.

ParameterTypeDescription
namestring (Required)The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
pathstring (Required)The path to the model file being uploaded.
frameworkstring (Required)Set as the Framework.TENSORFLOW.
input_schemapyarrow.lib.Schema (Optional)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Optional)The output schema in Apache Arrow schema format.
convert_waitbool (Optional) (Default: True)Not required for native runtimes.
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload TensorFlow Model Return

For example, the following example is of uploading a TensorFlow ML Model to a Wallaroo instance.

from wallaroo.framework import Framework
model = wl.upload_model(model_name, 
                        model_file_name,
                        framework=Framework.TENSORFLOW
                        )

Pipeline Deployment Configurations

Pipeline configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space.

This model will always run in the native runtime space.

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build()

4.10 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: TensorFlow Keras

How to upload and use TensorFlow Keras ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Wallaroo supports TensorFlow/Keras models by containerizing the model and running as an image.

ParameterDescription
Web Sitehttps://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries
  • tensorflow==2.8.0
  • keras==1.1.0
FrameworkFramework.KERAS aka keras
Supported File TypesSavedModel format as .zip file and HDF5 format
RuntimeContainerized aka mlflow

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 for Tensorflow Keras models.

Uploading TensorFlow Models

TensorFlow Keras models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload TensorFlow Model Parameters

The following parameters are required for TensorFlow keras models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a TensorFlow Keras model to Wallaroo.

ParameterTypeDescription
namestring (Required)The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
pathstring (Required)The path to the model file being uploaded.
frameworkstring (Upload Method Optional, TensorFlow keras model Required)Set as the Framework.KERAS.
input_schemapyarrow.lib.Schema (Upload Method Optional, TensorFlow Keras model Required)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Upload Method Optional, TensorFlow Keras model Required)The output schema in Apache Arrow schema format.
convert_waitbool (Upload Method Optional, TensorFlow model Optional) (Default: True)
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload TensorFlow Model Return

For example, the following example is of uploading a PyTorch ML Model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('input', 
             pa.list_(pa.float64(), 
             list_size=10)
             )
        ]
    )

output_schema = pa.schema([
    pa.field('output', 
             pa.list_(pa.float64(), 
             list_size=32)
            )
        ]
    )

model = wl.upload_model('mac-keras-single-io-example', 
                        './models/single_io_keras_sequential_model.h5',
                        framework=Framework.KERAS, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                        )

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

For example, uploading an runtime model to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following containerized model after conversion is allocated to the containerized runtime as follows:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build()

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "__timeout=188 --workers=1"}
                    )
                    .build()

4.11 - Wallaroo SDK Essentials Guide: Model Uploads and Registrations: XGBoost

How to upload and use XGBoost ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Wallaroo supports XGBoost models by containerizing the model and running as an image.

ParameterDescription
Web Sitehttps://xgboost.ai/
Supported Librariesxgboost==1.7.4
FrameworkFramework.XGBOOST aka xgboost
Supported File Typespickle (XGB files are not supported.)
RuntimeContainerized aka tensorflow / mlflow

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])
pipeline.infer(dataframe)
 timein.inputsout.outputcheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00

Uploading XGBoost Models

XGBoost models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload XGBoost Model Parameters

The following parameters are required for XGBoost models. Note that while some fields are considered as optional for the upload_model method, they are required for proper uploading of a XGBoost model to Wallaroo.

ParameterTypeDescription
namestring (Required)The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model.
pathstring (Required)The path to the model file being uploaded.
frameworkstring (Upload Method Optional, SKLearn model Required)Set as the Framework.XGBOOST.
input_schemapyarrow.lib.Schema (Upload Method Optional, SKLearn model Required)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Upload Method Optional, SKLearn model Required)The output schema in Apache Arrow schema format.
convert_waitbool (Upload Method Optional, SKLearn model Optional) (Default: True)
  • True: Waits in the script for the model conversion completion.
  • False: Proceeds with the script without waiting for the model conversion process to display complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.

Upload XGBoost Model Return

The following is returned with a successful model upload and conversion.

FieldTypeDescription
namestringThe name of the model.
versionstringThe model version as a unique UUID.
file_namestringThe file name of the model as stored in Wallaroo.
image_pathstringThe image used to deploy the model in the Wallaroo engine.
last_update_timeDateTimeWhen the model was last updated.

Upload XGBoost Model Example

The following example is of uploading a PyTorch ML Model to a Wallaroo instance.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

output_schema = pa.schema([
    pa.field('output', pa.float64())
])

output_schema = pa.schema([
    pa.field('output', pa.float64())
])

model = wl.upload_model(f"{prefix}", 
                        'models/model-auto-conversion_xgboost_xgb_ranker_model.pkl', 
                        framework=Framework.XGBOOST, 
                        input_schema=input_schema, output_schema=output_schema
                        )
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion...Converting..Pending conversion.Converting.........Ready.

{
    'name': 'xgb-ranker', 
    'version': 'c53c6a84-9f56-41c6-bb2f-049ef6b067e8', 
    'file_name': 'model-auto-conversion_xgboost_xgb_ranker_model.pkl', 
    'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3367', 
    'last_update_time': datetime.datetime(2023, 6, 16, 18, 51, 15, 27969, tzinfo=tzutc())
}

data = pd.read_json('data/test-xgboost-classification-data.json')
display(data)

dataframe = pd.DataFrame({"inputs": data[:2].values.tolist()})
display(dataframe)

results = pipeline.infer(dataframe)
display(results)
 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2
 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]
 timein.inputsout.outputcheck_failures
02023-07-05 16:15:55.802[5.1, 3.5, 1.4, 0.2]0.00
12023-07-05 16:15:55.802[4.9, 3.0, 1.4, 0.2]0.00

Model Status

Pipeline Deployment Configurations

Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.

Once uploaded, the Model method config().runtime() will display which space the model is in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment

For example, uploading an runtime model to a Wallaroo workspace would return the following config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'

For example, the following containerized model after conversion is allocated to the containerized runtime as follows:

model = wl.upload_model(model_name, model_file_name, 
                        framework=framework, 
                        input_schema=input_schema, 
                        output_schema=output_schema
                       )
model.config().runtime()
'mlflow'

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .build()

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with other environmental variables for the containerized model. Note that for containerized models, resources must be allocated per specific model.

deployment_config = DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "__timeout=188 --workers=1"}
                    )
                    .build()

5 - Wallaroo SDK Essentials Guide: User Management

How to create and manage Wallaroo Users through the Wallaroo SDK

Managing Workspace Users

Users are managed via their email address, and can be assigned to a workspace as either the owner or a user.

List All Users

list_users() returns an array of all users registered in the connected Wallaroo platform with the following values:

ParameterTypeDescription
idStringThe unique identifier for the user.
emailStringThe unique email identifier for the user.
usernameStringThe unique username of the user.

For example, listing all users in the Wallaroo connection returns the following:

wl.list_users()
[User({"id": "7b8b4f7d-de27-420f-9cd0-892546cb0f82", "email": "test@test.com", "username": "admin"),
 User({"id": "45e6b641-fe57-4fb2-83d2-2c2bd201efe8", "email": "steve@ex.co", "username": "steve")]

Get User by Email

The get_user_by_email({email}) command finds the user who’s email address matches the submitted {email} field. If no email address matches then the return will be null.

For example, the user steve with the email address steve@ex.co returns the following:

wl.get_user_by_email("steve@ex.co")

User({"id": "45e6b641-fe57-4fb2-83d2-2c2bd201efe8", "email": "steve@ex.co", "username": "steve")

Activate and Deactivate Users

To remove a user’s access to the Wallaroo instance, use the Wallaroo Client deactivate_user("{User Email Address}) method, replacing the {User Email Address} with the email address of the user to deactivate.

To activate a user, use the Wallaroo Client active_user("{User Email Address}) method, replacing the {User Email Address} with the email address of the user to activate.

This feature impacts Wallaroo Community’s license count. Wallaroo Community only allows a total of 5 users per Wallaroo Community instance. Deactivated users does not count to this total - this allows organizations to add users, then activate/deactivate them as needed to stay under the total number of licensed users count.

Wallaroo Enterprise has no limits on the number of users who can be added or active in a Wallaroo instance.

In this example, the user testuser@wallaroo.ai will be deactivated then reactivated.

wl.list_users()

[User({"id": "0528f34c-2725-489f-b97b-da0cde02cbd9", "email": "testuser@wallaroo.ai", "username": "testuser@wallaroo.ai"),
 User({"id": "3927b9d3-c279-442c-a3ac-78ba1d2b14d8", "email": "john.hummel+signuptest@wallaroo.ai", "username": "john.hummel+signuptest@wallaroo.ai")]

wl.deactivate_user("testuser@wallaroo.ai")

wl.activate_user("testuser@wallaroo.ai")

Troubleshooting

When a new user logs in for the first time, they get an error when uploading a model or issues when they attempt to log in. How do I correct that?

When a new registered user attempts to upload a model, they may see the following error:

TransportQueryError: 
{'extensions': 
    {'path': 
        '$.selectionSet.insert_workspace_one.args.object[0]', 'code': 'not-supported'
    }, 
    'message': 
        'cannot proceed to insert array relations since insert to table "workspace" affects zero rows'

Or if they log into the Wallaroo Dashboard, they may see a Page not found error.

This is caused when a user has been registered without an appropriate email address. See the user guides here on inviting a user, or the Wallaroo Enterprise User Management on how to log into the Keycloak service and update users. Verify that the username and email address are both the same, and they are valid confirmed email addresses for the user.

6 - Wallaroo SDK Essentials Guide: Pipelines

The classes and methods for managing Wallaroo pipelines and configurations.

6.1 - Wallaroo SDK Essentials Guide: Pipeline Management

How to create and manage Wallaroo Pipelines through the Wallaroo SDK

Pipelines are the method of taking submitting data and processing that data through the models. Each pipeline can have one or more steps that submit the data from the previous step to the next one. Information can be submitted to a pipeline as a file, or through the pipeline’s URL.

A pipeline’s metrics can be viewed through the Wallaroo Dashboard Pipeline Details and Metrics page.

Pipeline Naming Requirements

Pipeline names map onto Kubernetes objects, and must be DNS compliant. Pipeline names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.

Create a Pipeline

New pipelines are created in the current workspace.

To create a new pipeline, use the Wallaroo Client build_pipeline("{Pipeline Name}") command.

The following example creates a new pipeline imdb-pipeline through a Wallaroo Client connection wl:

imdb_pipeline = wl.build_pipeline("imdb-pipeline")

imdb_pipeline.status()
{'status': 'Pipeline imdb-pipeline is not deployed'}

List All Pipelines

The Wallaroo Client method list_pipelines() lists all pipelines in a Wallaroo Instance.

The following example lists all pipelines in the wl Wallaroo Client connection:

wl.list_pipelines()

[{'name': 'ccfraud-pipeline', 'create_time': datetime.datetime(2022, 4, 12, 17, 55, 41, 944976, tzinfo=tzutc()), 'definition': '[]'}]

Select an Existing Pipeline

Rather than creating a new pipeline each time, an existing pipeline can be selected by using the list_pipelines() command and assigning one of the array members to a variable.

The following example sets the pipeline ccfraud-pipeline to the variable current_pipeline:

wl.list_pipelines()

[{'name': 'ccfraud-pipeline', 'create_time': datetime.datetime(2022, 4, 12, 17, 55, 41, 944976, tzinfo=tzutc()), 'definition': '[]'}]

current_pipeline = wl.list_pipelines()[0]

current_pipeline.status()

{'status': 'Running',
 'details': None,
 'engines': [{'ip': '10.244.5.4',
   'name': 'engine-7fcc7df596-hvlxb',
   'status': 'Running',
   'reason': None,
   'pipeline_statuses': {'pipelines': [{'id': 'ccfraud-pipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'ccfraud-model',
      'version': '4624e8a8-1414-4408-8b40-e03da4b5cb68',
      'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.1.24',
   'name': 'engine-lb-85846c64f8-mtq9p',
   'status': 'Running',
   'reason': None}]}

Pipeline Steps

Once a pipeline has been created, or during its creation process, a pipeline step can be added. The pipeline step refers to the model that will perform an inference off of the data submitted to it. Each time a step is added, it is added to the pipeline’s models array.

Pipeline steps are not saved until the pipeline is deployed. Until then, pipeline steps are stored in local memory as a potential pipeline configuration until the pipeline is deployed.

Add a Step to a Pipeline

A pipeline step is added through the pipeline add_model_step({Model}) command.

In the following example, two models uploaded to the workspace are added as pipeline step:

imdb_pipeline.add_model_step(embedder)
imdb_pipeline.add_model_step(smodel)

imdb_pipeline.status()

{'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}, {'ModelInference': {'models': [{'name': 'smodel-o', 'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19', 'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650'}]}}]"}

Replace a Pipeline Step

The model specified in a pipeline step can be replaced with the pipeline method replace_with_model_step(index, model).

The following parameters are used for replacing a pipeline step:

ParameterDefault ValuePurpose
indexnullThe pipeline step to be replaced. Pipeline steps follow array numbering, where the first step is 0, etc.
modelnullThe new model to be used in the pipeline step.

In the following example, a deployed pipeline will have the initial model step replaced with a new one. A status of the pipeline will be displayed after deployment and after the pipeline swap to show the model has been replaced from ccfraudoriginal to ccfraudreplacement, each with their own versions.

pipeline.deploy()

pipeline.status()

{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.2.145',
   'name': 'engine-75bfd7dc9d-7p9qk',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'hotswappipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'ccfraudoriginal',
      'version': '3a03dc94-716e-46bb-84c8-91bc99ceb2c3',
      'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.2.144',
   'name': 'engine-lb-55dcdff64c-vf74s',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}

pipeline.replace_with_model_step(0, replacement_model).deploy()

pipeline.status()

{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.2.153',
   'name': 'engine-96486c95d-zfchr',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'hotswappipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'ccfraudreplacement',
      'version': '714efd19-5c83-42a8-aece-24b4ba530925',
      'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.2.154',
   'name': 'engine-lb-55dcdff64c-9np9k',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}
Pre and Post Processing Steps

A Pipeline Step can be more than models - they can also be pre processing and post processing steps. For example, the Demand Curve Tutorial has both a pre and post processing steps that are added to the pipeline. The preprocessing step uses the following code:

import numpy
import pandas

import json

# add interaction terms for the model
def actual_preprocess(pdata):
    pd = pdata.copy()
    # convert boolean cust_known to 0/1
    pd.cust_known = numpy.where(pd.cust_known, 1, 0)
    # interact UnitPrice and cust_known
    pd['UnitPriceXcust_known'] = pd.UnitPrice * pd.cust_known
    return pd.loc[:, ['UnitPrice', 'cust_known', 'UnitPriceXcust_known']]

# If the data is a json string, call this wrapper instead
# Expected input:
# a dictionary with fields 'colnames', 'data'

# test that the code works here
def wallaroo_json(data):
    obj = json.loads(data)
    pdata = pandas.DataFrame(obj['query'],
                             columns=obj['colnames'])
    pprocessed = actual_preprocess(pdata)
    
   # return a dictionary, with the fields the model expect
    return {
       'tensor_fields': ['model_input'],
       'model_input': pprocessed.to_numpy().tolist()
    }

It is added as a Python module by uploading it as a model:

# load the preprocess module
module_pre = wl.upload_model("preprocess", "./preprocess.py").configure('python')

And then added to the pipeline as a step:

# now make a pipeline
demandcurve_pipeline = (wl.build_pipeline("demand-curve-pipeline")
                        .add_model_step(module_pre)
                        .add_model_step(demand_curve_model)
                        .add_model_step(module_post))

Remove a Pipeline Step

To remove a step from the pipeline, use the Pipeline remove_step(index) command, where the index is the array index for the pipeline’s steps.

In the following example the pipeline imdb_pipeline will have the step with the model smodel-o removed.

imdb_pipeline.status

<bound method Pipeline.status of {'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}, {'ModelInference': {'models': [{'name': 'smodel-o', 'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19', 'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650'}]}}]"}>

imdb_pipeline.remove_step(1)
{'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}]"}

Clear All Pipeline Steps

The Pipeline clear() method removes all pipeline steps from a pipeline. Note that pipeline steps are not saved until the pipeline is deployed.

Manage Pipeline Deployment Configuration

For full details on pipeline deployment configurations, see Wallaroo SDK Essentials Guide: Pipeline Deployment Configuration.

Deploy a Pipeline

When a pipeline step is added or removed, the pipeline must be deployed through the pipeline deploy(). This allocates resources to the pipeline from the Kubernetes environment and make it available to submit information to perform inferences. This process typically takes 45 seconds.

Once complete, the pipeline status() command will show 'status':'Running'.

Pipeline deployments can be modified to enable auto-scaling to allow pipelines to allocate more or fewer resources based on need by setting the pipeline’s This will then be applied to the deployment of the pipelineccfraudPipelineby specifying it'sdeployment_config` optional parameter. If this optional parameter is not passed, then the deployment will defer to default values. For more information, see Manage Pipeline Deployment Configuration.

In the following example, the pipeline imdb-pipeline that contains two steps will be deployed with default deployment configuration:

imdb_pipeline.status

<bound method Pipeline.status of {'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}, {'ModelInference': {'models': [{'name': 'smodel-o', 'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19', 'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650'}]}}]"}>

imdb_pipeline.deploy()
Waiting for deployment - this will take up to 45s ...... ok

imdb_pipeline.status()

{'status': 'Running',
 'details': None,
 'engines': [{'ip': '10.12.1.65',
   'name': 'engine-778b65459-f9mt5',
   'status': 'Running',
   'reason': None,
   'pipeline_statuses': {'pipelines': [{'id': 'imdb-pipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'embedder-o',
      'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d',
      'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4',
      'status': 'Running'},
     {'name': 'smodel-o',
      'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19',
      'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.12.1.66',
   'name': 'engine-lb-85846c64f8-ggg2t',
   'status': 'Running',
   'reason': None}]}

Troubleshooting Pipeline Deployment

If you deploy more pipelines than your environment can handle, or if you deploy more pipelines than your license allows, you may see an error like the following:


LimitError: You have reached a license limit in your Wallaroo instance. In order to add additional resources, you can remove some of your existing resources. If you have any questions contact us at community@wallaroo.ai: MAX_PIPELINES_LIMIT_EXCEEDED

Undeploy any unnecessary pipelines either through the SDK or through the Wallaroo Pipeline Dashboard, then attempt to redeploy the pipeline in question again.

Undeploy a Pipeline

When a pipeline is not currently needed, it can be undeployed and its resources turned back to the Kubernetes environment. To undeploy a pipeline, use the pipeline undeploy() command.

In this example, the aloha_pipeline will be undeployed:

aloha_pipeline.undeploy()

{'name': 'aloha-test-demo', 'create_time': datetime.datetime(2022, 3, 29, 20, 34, 3, 960957, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'aloha-2', 'version': 'a8e8abdc-c22f-416c-a13c-5fe162357430', 'sha': 'fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520'}]}}]"}

Get Pipeline Status

The pipeline status() command shows the current status, models, and other information on a pipeline.

The following example shows the pipeline imdb_pipeline status before and after it is deployed:

imdb_pipeline.status

<bound method Pipeline.status of {'name': 'imdb-pipeline', 'create_time': datetime.datetime(2022, 3, 30, 21, 21, 31, 127756, tzinfo=tzutc()), 'definition': "[{'ModelInference': {'models': [{'name': 'embedder-o', 'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d', 'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4'}]}}, {'ModelInference': {'models': [{'name': 'smodel-o', 'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19', 'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650'}]}}]"}>

imdb_pipeline.deploy()
Waiting for deployment - this will take up to 45s ...... ok

imdb_pipeline.status()

{'status': 'Running',
 'details': None,
 'engines': [{'ip': '10.12.1.65',
   'name': 'engine-778b65459-f9mt5',
   'status': 'Running',
   'reason': None,
   'pipeline_statuses': {'pipelines': [{'id': 'imdb-pipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'embedder-o',
      'version': '1c16d21d-fe4c-4081-98bc-65fefa465f7d',
      'sha': 'd083fd87fa84451904f71ab8b9adfa88580beb92ca77c046800f79780a20b7e4',
      'status': 'Running'},
     {'name': 'smodel-o',
      'version': '8d311ba3-c336-48d3-99cd-85d95baa6f19',
      'sha': '3473ea8700fbf1a1a8bfb112554a0dde8aab36758030dcde94a9357a83fd5650',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.12.1.66',
   'name': 'engine-lb-85846c64f8-ggg2t',
   'status': 'Running',
   'reason': None}]}

Anomaly Testing

Anomaly detection allows organizations to set validation parameters. A validation is added to a pipeline to test data based on a specific expression. If the expression is returned as False, this is detected as an anomaly and added to the InferenceResult object’s check_failures array and the pipeline logs.

Anomaly detection consists of the following steps:

  • Set a validation: Add a validation to a pipeline that, when returned False, adds an entry to the InferenceResult object’s check_failures attribute with the expression that caused the failure.
  • Display anomalies: Anomalies detected through a Pipeline’s validation attribute are displayed either through the InferenceResult object’s check_failures attribute, or through the pipeline’s logs.

Set A Validation

Validations are added to a pipeline through the wallaroo.pipeline add_validation method. The following parameters are required:

ParameterTypeDescription
nameString (Required)The name of the validation
validationwallaroo.checks.Expression (Required)The validation expression that adds the result InferenceResult object’s check_failures attribute when expression result is False. The validation checks the expression against both the data value and the data type.

Validation expressions take the format value Expression, with the expression being in the form of a :py:Expression:. For example, if the model housing_model is part of the pipeline steps, then a validation expression may be housing_model.outputs[0][0] < 100.0: If the output of the housing_model inference is less than 100, then the validation is True and no action is taken. Any values over 100, the validation is False which triggers adding the anomaly to the InferenceResult object’s check_failures attribute.

Note that multiple validations can be created to allow for multiple anomalies detection.

In the following example, a validation is added to the pipeline to detect housing prices that are below 100 (represented as $100 million), and trigger an anomaly for values above that level. When an inference is performed that triggers a validation failure, the results are displayed in the InferenceResult object’s check_failures attribute.

p = wl.build_pipeline('anomaly-housing-pipeline')
p = p.add_model_step(housing_model)
p = p.add_validation('price too high', housing_model.outputs[0][0] < 100.0)
pipeline = p.deploy()

test_input = {"dense_16_input":[[0.02675675, 0.0, 0.02677953, 0.0, 0.0010046, 0.00951931, 0.14795322, 0.0027145,  2, 0.98536841, 0.02988655, 0.04031725, 0.04298041]]}
response_trigger = pipeline.infer(test_input)
print("\n")
print(response_trigger)

[InferenceResult({'check_failures': [{'False': {'expr': 'anomaly-housing.outputs[0][0] < 100'}}],
 'elapsed': 15110549,
 'model_name': 'anomaly-housing',
 'model_version': 'c3cf1577-6666-48d3-b85c-5d4a6e6567ea',
 'original_data': {'dense_16_input': [[0.02675675,
                                       0.0,
                                       0.02677953,
                                       0.0,
                                       0.0010046,
                                       0.00951931,
                                       0.14795322,
                                       0.0027145,
                                       2,
                                       0.98536841,
                                       0.02988655,
                                       0.04031725,
                                       0.04298041]]},
 'outputs': [{'Float': {'data': [350.46990966796875], 'dim': [1, 1], 'v': 1}}],
 'pipeline_name': 'anomaly-housing-model',
 'time': 1651257043312})]

Display Anomalies

Anomalies detected through a Pipeline’s validation attribute are displayed either through the InferenceResult object’s check_failures attribute, or through the pipeline’s logs.

To display an anomaly through the InferenceResult object, display the check_failures attribute.

In the following example, the an InferenceResult where the validation failed will display the failure in the check_failures attribute:

test_input = {"dense_16_input":[[0.02675675, 0.0, 0.02677953, 0.0, 0.0010046, 0.00951931, 0.14795322, 0.0027145,  2, 0.98536841, 0.02988655, 0.04031725, 0.04298041]]}
response_trigger = pipeline.infer(test_input)
print("\n")
print(response_trigger)

[InferenceResult({'check_failures': [{'False': {'expr': 'anomaly-housing-model.outputs[0][0] < '
                                       '100'}}],
 'elapsed': 12196540,
 'model_name': 'anomaly-housing-model',
 'model_version': 'a3b1c29f-c827-4aad-817d-485de464d59b',
 'original_data': {'dense_16_input': [[0.02675675,
                                       0.0,
                                       0.02677953,
                                       0.0,
                                       0.0010046,
                                       0.00951931,
                                       0.14795322,
                                       0.0027145,
                                       2,
                                       0.98536841,
                                       0.02988655,
                                       0.04031725,
                                       0.04298041]]},
 'outputs': [{'Float': {'data': [350.46990966796875], 'dim': [1, 1], 'v': 1}}],
 'pipeline_name': 'anomaly-housing-pipeline',
 'shadow_data': {},
 'time': 1667416852255})]

The other methods is to use the pipeline.logs() method with the parameter valid=False, isolating the logs where the validation was returned as False.

In this example, a set of logs where the validation returned as False will be displayed:

pipeline.logs(valid=False)
TimestampOutputInputAnomalies
2022-02-Nov 19:20:52[array([[350.46990967]])][[0.02675675, 0.0, 0.02677953, 0.0, 0.0010046, 0.00951931, 0.14795322, 0.0027145, 2, 0.98536841, 0.02988655, 0.04031725, 0.04298041]]1

A/B Testing

A/B testing is a method that provides the ability to test competing ML models for performance, accuracy or other useful benchmarks. Different models are added to the same pipeline steps as follows:

  • Control or Champion model: The model currently used for inferences.
  • Challenger model(s): The model or set of models compared to the challenger model.

A/B testing splits a portion of the inference requests between the champion model and the one or more challengers through the add_random_split method. This method splits the inferences submitted to the model through a randomly weighted step.

Each model receives inputs that are approximately proportional to the weight it is assigned. For example, with two models having weights 1 and 1, each will receive roughly equal amounts of inference inputs. If the weights were changed to 1 and 2, the models would receive roughly 33% and 66% respectively instead.

When choosing the model to use, a random number between 0.0 and 1.0 is generated. The weighted inputs are mapped to that range, and the random input is then used to select the model to use. For example, for the two-models equal-weight case, a random key of 0.4 would route to the first model, 0.6 would route to the second.

Add Random Split

A random split step can be added to a pipeline through the add_random_split method.

The following parameters are used when adding a random split step to a pipeline:

ParameterTypeDescription
champion_weightFloat (Required)The weight for the champion model.
champion_modelWallaroo.Model (Required)The uploaded champion model.
challenger_weightFloat (Required)The weight of the challenger model.
challenger_modelWallaroo.Model (Required)The uploaded challenger model.
hash_keyString(Optional)A key used instead of a random number for model selection. This must be between 0.0 and 1.0.

Note that multiple challenger models with different weights can be added as the random split step.

add_random_split([(champion_weight, champion_model), (challenger_weight, challenger_model),  (challenger_weight2, challenger_model2),...], hash_key)

In this example, a pipeline will be built with a 2:1 weighted ratio between the champion and a single challenger model.

pipeline = (wl.build_pipeline("randomsplitpipeline-demo")
            .add_random_split([(2, control), (1, challenger)]))

The results for a series of single are displayed to show the random weighted split between the two models in action:

results = []
results.append(experiment_pipeline.infer_from_file("data/data-1.json"))
results.append(experiment_pipeline.infer_from_file("data/data-1.json"))
results.append(experiment_pipeline.infer_from_file("data/data-1.json"))
results.append(experiment_pipeline.infer_from_file("data/data-1.json"))
results.append(experiment_pipeline.infer_from_file("data/data-1.json"))

for result in results:
    print(result[0].model())
    print(result[0].data())

('aloha-control', 'ff81f634-8fb4-4a62-b873-93b02eb86ab4')
[array([[0.00151959]]), array([[0.98291481]]), array([[0.01209957]]), array([[4.75912966e-05]]), array([[2.02893716e-05]]), array([[0.00031977]]), array([[0.01102928]]), array([[0.99756402]]), array([[0.01034162]]), array([[0.00803896]]), array([[0.01615506]]), array([[0.00623623]]), array([[0.00099858]]), array([[1.79337805e-26]]), array([[1.38899512e-27]])]

('aloha-control', 'ff81f634-8fb4-4a62-b873-93b02eb86ab4')
[array([[0.00151959]]), array([[0.98291481]]), array([[0.01209957]]), array([[4.75912966e-05]]), array([[2.02893716e-05]]), array([[0.00031977]]), array([[0.01102928]]), array([[0.99756402]]), array([[0.01034162]]), array([[0.00803896]]), array([[0.01615506]]), array([[0.00623623]]), array([[0.00099858]]), array([[1.79337805e-26]]), array([[1.38899512e-27]])]

('aloha-challenger', '87fdfe08-170e-4231-a0b9-543728d6fc57')
[array([[0.00151959]]), array([[0.98291481]]), array([[0.01209957]]), array([[4.75912966e-05]]), array([[2.02893716e-05]]), array([[0.00031977]]), array([[0.01102928]]), array([[0.99756402]]), array([[0.01034162]]), array([[0.00803896]]), array([[0.01615506]]), array([[0.00623623]]), array([[0.00099858]]), array([[1.79337805e-26]]), array([[1.38899512e-27]])]

('aloha-challenger', '87fdfe08-170e-4231-a0b9-543728d6fc57')
[array([[0.00151959]]), array([[0.98291481]]), array([[0.01209957]]), array([[4.75912966e-05]]), array([[2.02893716e-05]]), array([[0.00031977]]), array([[0.01102928]]), array([[0.99756402]]), array([[0.01034162]]), array([[0.00803896]]), array([[0.01615506]]), array([[0.00623623]]), array([[0.00099858]]), array([[1.79337805e-26]]), array([[1.38899512e-27]])]

('aloha-challenger', '87fdfe08-170e-4231-a0b9-543728d6fc57')
[array([[0.00151959]]), array([[0.98291481]]), array([[0.01209957]]), array([[4.75912966e-05]]), array([[2.02893716e-05]]), array([[0.00031977]]), array([[0.01102928]]), array([[0.99756402]]), array([[0.01034162]]), array([[0.00803896]]), array([[0.01615506]]), array([[0.00623623]]), array([[0.00099858]]), array([[1.79337805e-26]]), array([[1.38899512e-27]])]

Replace With Random Split

If a pipeline already had steps as detailed in Add a Step to a Pipeline, this step can be replaced with a random split with the replace_with_random_split method.

The following parameters are used when adding a random split step to a pipeline:

ParameterTypeDescription
indexInteger (Required)The pipeline step being replaced.
champion_weightFloat (Required)The weight for the champion model.
champion_modelWallaroo.Model (Required)The uploaded champion model.
**challenger_weightFloat (Required)The weight of the challenger model.
challenger_modelWallaroo.Model (Required)The uploaded challenger model.
hash_keyString(Optional)A key used instead of a random number for model selection. This must be between 0.0 and 1.0.

Note that one or more challenger models can be added for the random split step:

replace_with_random_split(index, [(champion_weight, champion_model), (challenger_weight, challenger_model)], (challenger_weight2, challenger_model2),...], hash_key)

A/B Testing Logs

A/B Testing logs entries contain the model used for the inferences in the column out._model_split.

logs = experiment_pipeline.logs(limit=5)
display(logs.loc[:,['time', 'out._model_split', 'out.main']])
timeout._model_splitout.main
02023-03-03 19:08:35.653[{“name”:“aloha-control”,“version”:“89389786-0c17-4214-938c-aa22dd28359f”,“sha”:“fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520”}][0.9999754]
12023-03-03 19:08:35.702[{“name”:“aloha-challenger”,“version”:“3acd3835-be72-42c4-bcae-84368f416998”,“sha”:“223d26869d24976942f53ccb40b432e8b7c39f9ffcf1f719f3929d7595bceaf3”}][0.9999727]
22023-03-03 19:08:35.753[{“name”:“aloha-challenger”,“version”:“3acd3835-be72-42c4-bcae-84368f416998”,“sha”:“223d26869d24976942f53ccb40b432e8b7c39f9ffcf1f719f3929d7595bceaf3”}][0.6606688]
32023-03-03 19:08:35.799[{“name”:“aloha-control”,“version”:“89389786-0c17-4214-938c-aa22dd28359f”,“sha”:“fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520”}][0.9998954]
42023-03-03 19:08:35.846[{“name”:“aloha-control”,“version”:“89389786-0c17-4214-938c-aa22dd28359f”,“sha”:“fd998cd5e4964bbbb4f8d29d245a8ac67df81b62be767afbceb96a03d1a01520”}][0.99999803]

Pipeline Shadow Deployments

Wallaroo provides a method of testing the same data against two different models or sets of models at the same time through shadow deployments otherwise known as parallel deployments or A/B test. This allows data to be submitted to a pipeline with inferences running on several different sets of models. Typically this is performed on a model that is known to provide accurate results - the champion - and a model or set of models that is being tested to see if it provides more accurate or faster responses depending on the criteria known as the challenger(s). Multiple challengers can be tested against a single champion to determine which is “better” based on the organization’s criteria.

As described in the Wallaroo blog post The What, Why, and How of Model A/B Testing:

In data science, A/B tests can also be used to choose between two models in production, by measuring which model performs better in the real world. In this formulation, the control is often an existing model that is currently in production, sometimes called the champion. The treatment is a new model being considered to replace the old one. This new model is sometimes called the challenger….
Keep in mind that in machine learning, the terms experiments and trials also often refer to the process of finding a training configuration that works best for the problem at hand (this is sometimes called hyperparameter optimization).

When a shadow deployment is created, only the inference from the champion is returned in the InferenceResult Object data, while the result data for the shadow deployments is stored in the InferenceResult Object shadow_data.

Create Shadow Deployment

Create a parallel or shadow deployment for a pipeline with the pipeline.add_shadow_deploy(champion, challengers[]) method, where the champion is a Wallaroo Model object, and challengers[] is one or more Wallaroo Model objects.

Each inference request sent to the pipeline is sent to all the models. The prediction from the champion is returned by the pipeline, while the predictions from the challengers are not part of the standard output, but are kept stored in the shadow_data attribute and in the logs for later comparison.

In this example, a shadow deployment is created with the champion versus two challenger models.

champion = wl.upload_model(champion_model_name, champion_model_file).configure()
model2 = wl.upload_model(shadow_model_01_name, shadow_model_01_file).configure()
model3 = wl.upload_model(shadow_model_02_name, shadow_model_02_file).configure()
   
pipeline.add_shadow_deploy(champion, [model2, model3])
pipeline.deploy()
  
namecc-shadow
created2022-08-04 20:06:55.102203+00:00
last_updated2022-08-04 20:37:28.785947+00:00
deployedTrue
tags
stepsccfraud-lstm

Shadow Deploy Outputs

Model outputs are listed by column based on the model’s outputs. The output data is set by the term out, followed by the name of the model. For the default model, this is out.{variable_name}, while the shadow deployed models are in the format out_{model name}.variable, where {model name} is the name of the shadow deployed model.

sample_data_file = './smoke_test.df.json'
response = pipeline.infer_from_file(sample_data_file)
timein.tensorout.dense_1check_failuresout_ccfraudrf.variableout_ccfraudxgb.variable
02023-03-03 17:35:28.859[1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756][0.0014974177]0[1.0][0.0005066991]

Retrieve Shadow Deployment Logs

Shadow deploy results are part of the Pipeline.logs() method. The output data is set by the term out, followed by the name of the model. For the default model, this is out.dense_1, while the shadow deployed models are in the format out_{model name}.variable, where {model name} is the name of the shadow deployed model.

logs = pipeline.logs()
display(logs)
timein.tensorout.dense_1check_failuresout_ccfraudrf.variableout_ccfraudxgb.variable
02023-03-03 17:35:28.859[1.0678324729, 0.2177810266, -1.7115145262, 0.682285721, 1.0138553067, -0.4335000013, 0.7395859437, -0.2882839595, -0.447262688, 0.5146124988, 0.3791316964, 0.5190619748, -0.4904593222, 1.1656456469, -0.9776307444, -0.6322198963, -0.6891477694, 0.1783317857, 0.1397992467, -0.3554220649, 0.4394217877, 1.4588397512, -0.3886829615, 0.4353492889, 1.7420053483, -0.4434654615, -0.1515747891, -0.2668451725, -1.4549617756][0.0014974177]0[1.0][0.0005066991]

Get Pipeline URL Endpoint

The Pipeline URL Endpoint or the Pipeline Deploy URL is used to submit data to a pipeline to use for an inference. This is done through the pipeline _deployment._url() method.

In this example, the pipeline URL endpoint for the pipeline ccfraud_pipeline will be displayed:

ccfraud_pipeline._deployment._url()

'http://engine-lb.ccfraud-pipeline-1:29502/pipelines/ccfraud-pipeline'

6.2 - Wallaroo SDK Essentials Guide: Pipeline Deployment Configuration

Details on pipeline configurations and settings

Pipeline deployments configurations allow tailoring of a pipeline’s resources to match an organization’s and model’s requirements. Pipelines may require more memory, CPU cores, or GPUs to run to run all its steps efficiently. Pipeline deployment configurations also allow for multiple replicas of a model in a pipeline to provide scalability.

Create Pipeline Configuration

Setting a pipeline deployment configuration follows this process:

  1. Pipeline deployment configurations are created through the wallaroo ‘deployment_config.DeploymentConfigBuilder()](https://docs.wallaroo.ai/20230201/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-reference-guide/deployment_config/#DeploymentConfigBuilder) class.
  2. Once the configuration options are set the pipeline deployment configuration is set with the deployment_config.build() method.
  3. The pipeline deployment configuration is then applied when the pipeline is deployed.

The following example shows a pipeline deployment configuration with 1 replica, 1 cpu, and 2Gi of memory set to be allocated to the pipeline.

deployment_config = wallaroo.DeploymentConfigBuilder()
                    .replica_count(1)
                    .cpus(1)
                    .memory("2Gi")
                    .build()

pipeline.deploy(deployment_config = deployment_config)

Pipeline resources can be configured with autoscaling. Autoscaling allows the user to define how many engines a pipeline starts with, the minimum amount of engines a pipeline uses, and the maximum amount of engines a pipeline can scale to. The pipeline scales up and down based on the average CPU utilization across the engines in a given pipeline as the user’s workload increases and decreases.

Pipeline Resource Configurations

Pipeline deployment configurations deal with two major components:

  • Native Runtimes: Models that are deployed “as is” with the Wallaroo engine (Onnx, etc).
  • Containerized Runtimes: Models that are packaged into a container then deployed as a container with the Wallaroo engine (MLFlow, etc).

These configurations can be mixed - both native runtimes and containerized runtimes deployed to the same pipeline, with resources allocated to each runtimes in different configurations.

The following resources configurations are available through the wallaroo.deployment_config object.

GPU and CPU Allocation

CPUs are allocated in fractions of total CPU power similar to the Kubernetes CPU definitions. cpus(0.25), cpus(1.0), etc are valid values.

GPUs can only be allocated by entire integer units from the GPU enabled nodepools. gpus(1), gpus(2), etc are valid values, while gpus(0.25) are not.

Organizations should be aware of how many GPUs are allocated to the cluster. If all GPUs are already allocated to other pipelines, or if there are not enough GPUs to fulfill the request, the pipeline deployment will fail and return an error message.

GPU Support

Wallaroo 2023.2.1 and above supports Kubernetes nodepools with Nvidia Cuda GPUs.

See the Create GPU Nodepools for Kubernetes Clusters guide for instructions on adding GPU enabled nodepools to a Kubernetes cluster.

Native Runtime Configuration Methods

MethodParametersDescriptionEnterprise Only Feature
replica_count(count: int)The number of replicas of the pipeline to deploy. This allows for multiple deployments of the same models to be deployed to increase inferences through parallelization.
replica_autoscale_min_max(maximum: int, minimum: int = 0)Provides replicas to be scaled from 0 to some maximum number of replicas. This allows pipelines to spin up additional replicas as more resources are required, then spin them back down to save on resources and costs.
autoscale_cpu_utilization(cpu_utilization_percentage: int)Sets the average CPU percentage metric for when to load or unload another replica.
disable_autoscaleDisables autoscaling in the deployment configuration. 
cpus(core_count: float)Sets the number or fraction of CPUs to use for the pipeline, for example: 0.25, 1, 1.5, etc. The units are similar to the Kubernetes CPU definitions. 
gpus(core_count: int)Sets the number of GPUs to allocate for native runtimes. GPUs are only allocated in whole units, not as fractions. Organizations should be aware of the total number of GPUs available to the cluster, and monitor which pipeline deployment configurations have gpus allocated to ensure they do not run out. If there are not enough gpus to allocate to a pipeline deployment configuration, and error message will be deployed when the pipeline is deployed. If gpus is called, then the deployment_label must be called and match the GPU Nodepool for the Wallaroo Cluster hosting the Wallaroo instance.
memorymemory_spec: strSets the amount of RAM to allocate the pipeline. The memory_spec string is in the format “{size as number}{unit value}”. The accepted unit values are:
  • KiB (for KiloBytes)
  • MiB (for MegaBytes)
  • GiB (for GigaBytes)
  • TiB (for TeraBytes)
The values are similar to the Kubernetes memory resource units format.
 
lb_cpus(core_count: float)Sets the number or fraction of CPUs to use for the pipeline’s load balancer, for example: 0.25, 1, 1.5, etc. The units, similar to the Kubernetes CPU definitions. 
lb_memorymemory_spec: strSets the amount of RAM to allocate the pipeline’s load balancer. The memory_spec string is in the format “{size as number}{unit value}”. The accepted unit values are:
  • KiB (for KiloBytes)
  • MiB (for MegaBytes)
  • GiB (for GigaBytes)
  • TiB (for TeraBytes)
The values are similar to the Kubernetes memory resource units format.
 
deployment_label Label used for Kubernetes labels. Required if gpus are set and must match the GPU nodepool label.

Containerized Runtime Configuration Methods

MethodParametersDescriptionEnterprise Only Feature
sidekick_cpus(model: wallaroo.model.Model, core_count: float)Sets the number of CPUs to be used for the model’s sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment. The parameters are as follows:
  • Model model: The sidekick model to configure.
  • float core_count: Number of CPU cores to use in this sidekick.
 
sidekick_memory(model: wallaroo.model.Model, memory_spec: str)Sets the memory available to for the model’s sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment. The parameters are as follows:
  • Model model: The sidekick model to configure.
  • memory_spec: The amount of memory to allocated as memory unit values. The accepted unit values are:
    • KiB (for KiloBytes)
    • MiB (for MegaBytes)
    • GiB (for GigaBytes)
    • TiB (for TeraBytes)
    The values are similar to the Kubernetes memory resource units format.
 
sidekick_env(model: wallaroo.model.Model, environment: Dict[str, str])Environment variables submitted to the model’s sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment. These are used specifically for containerized models that have environment variables that effect their performance. 
sidekick_gpus(model: wallaroo.model.Model, core_count: int)Sets the number of GPUs to allocate for containerized runtimes. GPUs are only allocated in whole units, not as fractions. Organizations should be aware of the total number of GPUs available to the cluster, and monitor which pipeline deployment configurations have gpus allocated to ensure they do not run out. If there are not enough gpus to allocate to a pipeline deployment configuration, and error message will be deployed when the pipeline is deployed. If called, then the deployment_label must be called and match the GPU Nodepool for the Wallaroo Cluster hosting the Wallaroo instance.

Examples

Native Runtime Deployment

The following will set native runtime deployment to one quarter of a CPU with 1 Gi of Ram:

deployment_config = DeploymentConfigBuilder() \
    .cpus(0.25).memory('1Gi') \
    .build()

This example sets the replica count to 1, then sets the auto-scale to vary between 2 to 5 replicas depending on need, with 1 CPU and 1 GI RAM allocated per replica.

deploy_config = (wallaroo.DeploymentConfigBuilder()
                        .replica_count(1)
                        .replica_autoscale_min_max(minimum=2, maximum=5)
                        .cpus(1)
                        .memory("1Gi")
                        .build()
                    )

The following configuration allocates 1 GPU to the pipeline for native runtimes.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .gpus(1)
                    .deployment_label('doc-gpu-label:true')
                    .build()

Containerized Runtime Deployment

The following configuration allocates 0.25 CPU and 1Gi RAM to the containerized runtime sm_model, and passes that runtime environmental variables used for timeout settings.

deployment_config = DeploymentConfigBuilder()
                    .sidekick_cpus(sm_model, 0.25)
                    .sidekick_memory(sm_model, '1Gi')
                    .sidekick_env(sm_model, 
                        {"GUNICORN_CMD_ARGS":
                        "__timeout=188 --workers=1"}
                    )
                    .build()

This example shows allocating 1 GPU to the containerized runtime model sm_model.

deployment_config = DeploymentConfigBuilder()
                    .sidekick_gpus(sm_model, 1)
                    .deployment_label('doc-gpu-label:true')
                    .sidekick_memory(sm_model, '1Gi')
                    .build()

Mixed Environments

The following configuration allocates 1 gpu to the pipeline for native runtimes, then another gpu to the containerized runtime sm_model for a total of 2 gpus allocated to the pipeline: one gpu for native runtimes, another gpu for the containerized runtime model sm_model.

deployment_config = DeploymentConfigBuilder()
                    .cpus(0.25)
                    .memory('1Gi')
                    .gpus(1)
                    .sidekick_gpus(sm_model, 1)
                    .deployment_label('doc-gpu-label:true')
                    .build()

6.3 - Wallaroo SDK Essentials Guide: Pipeline Log Management

How to create and manage Wallaroo Pipelines through the Wallaroo SDK

Pipeline have their own set of log files that are retrieved and analyzed as needed with the either through:

  • The Pipeline logs method (returns either a DataFrame or Apache Arrow).
  • The Pipeline export_logs method (saves either a DataFrame file in JSON format, or an Apache Arrow file).

Get Pipeline Logs

Pipeline logs are retrieved through the Pipeline logs method. By default, logs are returned as a DataFrame in reverse chronological order of insertion, with the most recent files displayed first.

Pipeline logs are segmented by pipeline versions. For example, if a new model step is added to a pipeline, a model swapped out of a pipeline step, etc - this generated a new pipeline version. log method requests will return logs based on the parameter that match the pipeline version. To request logs of a specific pipeline version, specify the start_datetime and end_datetime parameters based on the pipeline version logs requested.

This command takes the following parameters.

ParameterTypeDescription
limitInt (Optional) (Default: 100)Limits how many log records to display. If there are more pipeline logs than are being displayed, the Warning message Pipeline log record limit exceeded will be displayed. For example, if 100 log files were requested and there are a total of 1,000, the warning message will be displayed.
start_datetime and end_datetimeDateTime (Optional)Limits logs to all logs between the start_datetime and end_datetime DateTime parameters. These comply with the Python datetime library for formats such as:
  • datetime.datetime.now()
  • datetime.datetime(2023, 3, 28, 14, 25, 51, 660058, tzinfo=tzutc()) (March 28, 2023 14:25:51:660058 UTC time zone)

Both parameters must be provided. Submitting a logs() request with only start_datetime or end_datetime will generate an exception.
If start_datetime and end_datetime are provided as parameters even with any other parameter, then the records are returned in chronological order, with the oldest record displayed first.
datasetList[String] (OPTIONAL)The datasets to be returned. The datasets available are:
  • *: Default. This translates to ["time", "in", "out", "check_failures"].
  • time: The DateTime of the inference request.
  • in: All inputs listed as in_{variable_name}.
  • out: All outputs listed as out_variable_name.
  • check_failures: Flags whether an Anomaly or Validation Check was triggered. 0 indicates no checks weretriggers, 1 or greater indicates a check was triggered.
  • meta: Returns metadata. IMPORTANT NOTE: See Metadata RequestsRestrictions for specifications on how this dataset can be used with otherdatasets.
    • Returns in the metadata.elapsed field:
      • A list of time in nanoseconds for:
        • The time to serialize the input.
        • How long each step took.
    • Returns in the metadata.last_model field:
      • A dict with each Python step as:
        • model_name: The name of the model in the pipeline step.
        • model_sha : The sha hash of the model in the pipeline step.
    • Returns in the metadata.pipeline_version field:
      • The pipeline version as a UUID value.
  • metadata.elapsed: IMPORTANT NOTE: See Metadata Requests Restrictionsfor specifications on how this dataset can be used with other datasets.
    • Returns in the metadata.elapsed field:
      • A list of time in nanoseconds for:
        • The time to serialize the input.
        • How long each step took.
dataset_excludeList[String] (OPTIONAL)Exclude specified datasets.
dataset_separatorSequence[[String], string] (OPTIONAL)If set to “.”, return dataset will be flattened.
arrowBoolean (Optional) (Default: False)If arrow is set to True, then the logs are returned as an Apache Arrow table. If arrow=False, then the logs are returned as a pandas DataFrame.

All of the parameters can be used together, but start_datetime and end_datetime must be combined; if one is used, then so must the other. If start_datetime and end_datetime are used with any other parameter, then the log results are in chronological order of record insertion.

Log requests are limited to around 100k in size. For requests greater than 100k in size, use the Pipeline export_logs() method.

Logs include the following standard datasets:

ParameterTypeDescription
timeDateTimeThe DateTime the inference request was made.
in.{variable} The input(s) for the inference request. Each input is listed as in.{variable_name}. For example, in.text_input, in.square_foot, in.number_of_rooms, etc.
out The outputs(s) for the inference request, based on the ML model’s outputs. Each output is listed as out.{variable_name}. For example, out.maximum_offer_price, out.minimum_asking_price, out.trade_in_value, etc.
check_failuresIntHow many validation checks were triggered by the inference. For more information, see Anomaly Testing
out_{model_name}.{variable} Only returned when using Pipeline Shadow Deployments. For each model in the shadow deploy step, their output is listed in the format out_{model_name}.{variable}. For example, out_shadow_model_xgb.maximum_offer_price, out_shadow_model_xgb.minimum_asking_price, out_shadow_model_xgb.trade_in_value, etc.
out._model_split Only returned when using A/B Testing, used to display the model_name, model_version, and model_sha of the model used for the inference.

In this example, the last 50 logs to the pipeline mainpipeline between two sample dates. In this case, all of the time column fields are the same since the inference request was sent as a batch.

logs = mainpipeline.logs(start_datetime=date_start, end_datetime=date_end)

display(len(logs))
display(logs)

538
 timein.tensorout.variablecheck_failures
02023-04-24 18:09:33.970[4.0, 2.5, 2900.0, 5505.0, 2.0, 0.0, 0.0, 3.0, 8.0, 2900.0, 0.0, 47.6063, -122.02, 2970.0, 5251.0, 12.0, 0.0, 0.0][718013.75]0
12023-04-24 18:09:33.970[2.0, 2.5, 2170.0, 6361.0, 1.0, 0.0, 2.0, 3.0, 8.0, 2170.0, 0.0, 47.7109, -122.017, 2310.0, 7419.0, 6.0, 0.0, 0.0][615094.56]0
22023-04-24 18:09:33.970[3.0, 2.5, 1300.0, 812.0, 2.0, 0.0, 0.0, 3.0, 8.0, 880.0, 420.0, 47.5893, -122.317, 1300.0, 824.0, 6.0, 0.0, 0.0][448627.72]0
32023-04-24 18:09:33.970[4.0, 2.5, 2500.0, 8540.0, 2.0, 0.0, 0.0, 3.0, 9.0, 2500.0, 0.0, 47.5759, -121.994, 2560.0, 8475.0, 24.0, 0.0, 0.0][758714.2]0
42023-04-24 18:09:33.970[3.0, 1.75, 2200.0, 11520.0, 1.0, 0.0, 0.0, 4.0, 7.0, 2200.0, 0.0, 47.7659, -122.341, 1690.0, 8038.0, 62.0, 0.0, 0.0][513264.7]0
5332023-04-24 18:09:33.970[3.0, 2.5, 1750.0, 7208.0, 2.0, 0.0, 0.0, 3.0, 8.0, 1750.0, 0.0, 47.4315, -122.192, 2050.0, 7524.0, 20.0, 0.0, 0.0][311909.6]0
5342023-04-24 18:09:33.970[5.0, 1.75, 2330.0, 6450.0, 1.0, 0.0, 1.0, 3.0, 8.0, 1330.0, 1000.0, 47.4959, -122.367, 2330.0, 8258.0, 57.0, 0.0, 0.0][448720.28]0
5352023-04-24 18:09:33.970[4.0, 3.5, 4460.0, 16271.0, 2.0, 0.0, 2.0, 3.0, 11.0, 4460.0, 0.0, 47.5862, -121.97, 4540.0, 17122.0, 13.0, 0.0, 0.0][1208638.0]0
5362023-04-24 18:09:33.970[3.0, 2.75, 3010.0, 1842.0, 2.0, 0.0, 0.0, 3.0, 9.0, 3010.0, 0.0, 47.5836, -121.994, 2950.0, 4200.0, 3.0, 0.0, 0.0][795841.06]0
5372023-04-24 18:09:33.970[2.0, 1.5, 1780.0, 4750.0, 1.0, 0.0, 0.0, 4.0, 7.0, 1080.0, 700.0, 47.6859, -122.395, 1690.0, 5962.0, 67.0, 0.0, 0.0][558463.3]0

538 rows × 4 columns

Metadata Requests Restrictions

The following restrictions are in place when requesting the datasets metadata or metadata.elapsed.

Standard Pipeline Steps

For the following Pipeline steps, metadata or metadata.elapsed must be requested with the * parameter. For example:

result = mainpipeline.infer(normal_input, dataset=["*", "metadata.elapsed"])

Effected pipeline steps:

  • add_model_step
  • replace_with_model_step

Testing Pipeline Steps

For the following Pipeline steps, meta or metadata.elapsed can not be included with the * parameter. For example:

result = mainpipeline.infer(normal_input, dataset=["metadata.elapsed"])

Effected pipeline steps:

  • add_random_split
  • replace_with_random_split
  • add_shadow_deploy
  • replace_with_shadow_deploy

Export Pipeline Logs as File

The Pipeline method export_logs returns the Pipeline records as either by default pandas records in Newline Delimited JSON (NDJSON) format, or an Apache Arrow table files.

The output files are by default stores in the current working directory ./logs with the default prefix as the {pipeline name}-1, {pipeline name}-2, etc.

The suffix by default will be json for pandas records in Newline Delimited JSON (NDJSON) format files. Logs are segmented by pipeline version across the limit, data_size_limit, or start_datetime and end_datetime parameters.

By default, logs are returned as a pandas record in NDJSON in reverse chronological order of insertion, with the most recent log insertions displayed first.

Pipeline logs are segmented by pipeline versions. For example, if a new model step is added to a pipeline, a model swapped out of a pipeline step, etc - this generated a new pipeline version.

This command takes the following parameters.

ParameterTypeDescription
directoryString (Optional) (Default: logs)Logs are exported to a file from current working directory to directory.
file_prefixString (Optional) (Default: The name of the pipeline)The name of the exported files. By default, this will the name of the pipeline and is segmented by pipeline version between the limits or the start and end period. For example: ’logpipeline-1.json`, etc.
data_size_limitString (Optional) (Default: 100MB)The maximum size for the exported data in bytes. Note that file size is approximate to the request; a request of 10MiB may return 10.3MB of data. The fields are in the format “{size as number} {unit value}”, and can include a space so “10 MiB” and “10MiB” are the same. The accepted unit values are:
  • KiB (for KiloBytes)
  • MiB (for MegaBytes)
  • GiB (for GigaBytes)
  • TiB (for TeraBytes)
limitInt (Optional) (Default: 100)Limits how many log records to display. Defaults to 100. If there are more pipeline logs than are being displayed, the Warning message Pipeline log record limit exceeded will be displayed. For example, if 100 log files were requested and there are a total of 1,000, the warning message will be displayed.
start_datetime and end_datetimeDateTime (Optional)Limits logs to all logs between the start_datetime and end_datetime DateTime parameters. These comply with the Python datetime library for formats such as:
  • datetime.datetime.now()
  • datetime.datetime(2023, 3, 28, 14, 25, 51, 660058, tzinfo=tzutc()) (March 28, 2023 14:25:51:660058 UTC time zone)

Both parameters must be provided. Submitting a logs() request with only start_datetime or end_datetime will generate an exception.
If start_datetime and end_datetime are provided as parameters even with any other parameter, then the records are returned in chronological order, with the oldest record displayed first.
filenameString (Required)The file name to save the log file to. The requesting user must have write access to the file location. The requesting user must have write permission to the file location, and the target directory for the file must already exist. For example: If the file is set to /var/wallaroo/logs/pipeline.json, then the directory /var/wallaroo/logs must already exist. Otherwise file names are only limited by standard file naming rules for the target environment.
datasetList (OPTIONAL)The datasets to be returned. The datasets available are:
  • *: Default. This translates to ["time", "in", "out", "check_failures"].
  • time: The DateTime of the inference request.
  • in: All inputs listed as in_{variable_name}.
  • out: All outputs listed as out_variable_name.
  • check_failures: Flags whether an Anomaly or Validation Check was triggered. 0 indicates no checks were triggered, 1 or greater indicates a check was triggered.
  • meta: Returns metadata. IMPORTANT NOTE: See Metadata RequestsRestrictions for specifications on how this dataset can be used with other datasets.
    • Returns in the metadata.elapsed field:
      • A list of time in nanoseconds for:
        • The time to serialize the input.
        • How long each step took.
    • Returns in the metadata.last_model field:
      • A dict with each Python step as:
        • model_name: The name of the model in the pipeline step.
        • model_sha : The sha hash of the model in the pipeline step.
    • Returns in the metadata.pipeline_version field:
      • The pipeline version as a UUID value.
  • metadata.elapsed: IMPORTANT NOTE: See Metadata Requests Restrictionsfor specifications on how this dataset can be used with other datasets.
    • Returns in the metadata.elapsed field:
      • A list of time in nanoseconds for:
        • The time to serialize the input.
        • How long each step took.
dataset_excludeList[String] (OPTIONAL)Exclude specified datasets.
dataset_separatorSequence[[String], string] (OPTIONAL)If set to “.”, return dataset will be flattened.
arrowBoolean (Optional)Defaults to False. If arrow=True, then the logs are returned as an Apache Arrow table. If arrow=False, then the logs are returned as pandas record in NDJSON that can be imported into a pandas DataFrame.

All of the parameters can be used together, but start_datetime and end_datetime must be combined; if one is used, then so must the other. If start_datetime and end_datetime are used with any other parameter, then the log results are in chronological order of record insertion.

File sizes are limited to around 10 MB in size. If the requested log file is greater than 10 MB, a Warning will be displayed indicating the end date of the log file downloaded so the request can be adjusted to capture the requested log files.

In this example, the log files are saved as both Pandas DataFrame and Apache Arrow.

# Save the DataFrame version of the log file

mainpipeline.export_logs()
display(os.listdir('./logs'))

mainpipeline.export_logs(arrow=True)
display(os.listdir('./logs'))

    Warning: There are more logs available. Please set a larger limit to export more data.
    

    ['pipeline-logs-1.json']

    Warning: There are more logs available. Please set a larger limit to export more data.
    

    ['pipeline-logs-1.arrow', 'pipeline-logs-1.json']

Pipeline Log Storage

Pipeline logs have a set allocation of storage space and data requirements.

Pipeline Log Storage Warnings

To prevent storage and performance issues, inference result data may be dropped from pipeline logs by the following standards:

  • Columns are progressively removed from the row starting with the largest input data size and working to the smallest, then the same for outputs.

For example, Computer Vision ML Models typically have large inputs and output values - a single pandas DataFrame inference request may be over 13 MB in size, and the inference results nearly as large. To prevent pipeline log storage issues, the input may be dropped from the pipeline logs, and if additional space is needed, the inference outputs would follow. The time column is preserved.

If a pipeline has dropped columns for space purposes, this will be displayed when a log request is made with the following warning, with {columns} replaced with the dropped columns.

The inference log is above the allowable limit and the following columns may have been suppressed for various rows in the logs: {columns}. To review the dropped columns for an individual inferences suppressed data, include dataset=["metadata"] in the log request.

Review Dropped Columns

To review what columns are dropped from pipeline logs for storage reasons, include the dataset metadata in the request to view the column metadata.dropped. This metadata field displays a List of any columns dropped from the pipeline logs.

For example:

metadatalogs = mainpipeline.logs(dataset=["time", "metadata"])
 timemetadata.dropped
02023-07-0615:47:03.673
12023-07-0615:47:03.673
22023-07-0615:47:03.673
32023-07-0615:47:03.673
42023-07-0615:47:03.673
952023-07-0615:47:03.673
962023-07-0615:47:03.673
972023-07-0615:47:03.673
982023-07-0615:47:03.673
992023-07-0615:47:03.673

Suppressed Data Elements

Data elements that do not fit the supported data types below, such as None or Null values, are not supported in pipeline logs. When present, undefined data will be written in the place of the null value, typically zeroes. Any null list values will present an empty list.

7 - Wallaroo SDK Essentials Guide: ML Workload Orchestration

How to create and manage ML Workload Orchestration through the Wallaroo SDK

Wallaroo provides ML Workload Orchestrations and Tasks to automate processes in a Wallaroo instance. For example:

  • Deploy a pipeline, retrieve data through a data connector, submit the data for inferences, undeploy the pipeline
  • Replace a model with a new version
  • Retrieve shadow deployed inference results and submit them to a database

Orchestration Flow

ML Workload Orchestration flow works within 3 tiers:

TierDescription
ML Workload OrchestrationUser created custom instructions that provide automated processes that follow the same steps every time without error. Orchestrations contain the instructions to be performed, uploaded as a .ZIP file with the instructions, requirements, and artifacts.
TaskInstructions on when to run an Orchestration as a scheduled Task. Tasks can be Run Once, where is creates a single Task Run, or Run Scheduled, where a Task Run is created on a regular schedule based on the Kubernetes cronjob specifications. If a Task is Run Scheduled, it will create a new Task Run every time the schedule parameters are met until the Task is killed.
Task RunThe execution of an task. These validate business operations are successful identify any unsuccessful task runs. If the Task is Run Once, then only one Task Run is generated. If the Task is a Run Scheduled task, then a new Task Run will be created each time the schedule parameters are met, with each Task Run having its own results and logs.
Wallaroo Components

One example may be of making donuts.

  • The ML Workload Orchestration is the recipe.
  • The Task is the order to make the donuts. It might be Run Once, so only one set of donuts are made, or Run Scheduled, so donuts are made every 2nd Sunday at 6 AM. If Run Scheduled, the donuts are made every time the schedule hits until the order is cancelled (aka killed).
  • The Task Run are the donuts with their own receipt of creation (logs, etc).

Orchestration Requirements

Orchestrations are uploaded to the Wallaroo instance as a ZIP file with the following requirements:

ParameterTypeDescription
User Code(Required) Python script as .py filesIf main.py exists, then that will be used as the task entrypoint. Otherwise, the first main.py found in any subdirectory will be used as the entrypoint. If no main.py is found, the orchestration will not be accepted.
Python Library Requirements(Optional) requirements.txt file in the requirements file format.A standard Python requirements.txt for any dependencies to be provided in the task environment. The Wallaroo SDK will already be present and should not be included in the requirements.txt. Multiple requirements.txt files are not allowed.
Other artifacts Other artifacts such as files, data, or code to support the orchestration.

Zip Instructions

In a terminal with the zip command, assemble artifacts as above and then create the archive. The zip command is included by default with the Wallaroo JupyterHub service.

zip commands take the following format, with {zipfilename}.zip as the zip file to save the artifacts to, and each file thereafter as the files to add to the archive.

zip {zipfilename}.zip file1, file2, file3....

For example, the following command will add the files main.py and requirements.txt into the file hello.zip.

$ zip hello.zip main.py requirements.txt 
  adding: main.py (deflated 47%)
  adding: requirements.txt (deflated 52%)

Example requirements.txt file

dbt-bigquery==1.4.3
dbt-core==1.4.5
dbt-extractor==0.4.1
dbt-postgres==1.4.5
google-api-core==2.8.2
google-auth==2.11.0
google-auth-oauthlib==0.4.6
google-cloud-bigquery==3.3.2
google-cloud-bigquery-storage==2.15.0
google-cloud-core==2.3.2
google-cloud-storage==2.5.0
google-crc32c==1.5.0
google-pasta==0.2.0
google-resumable-media==2.3.3
googleapis-common-protos==1.56.4

Orchestration Recommendations

The following recommendations will make using Wallaroo orchestrations.

  • The version of Python used should match the same version as in the Wallaroo JupyterHub service.
  • The same version of the Wallaroo SDK should match the server. For a 2023.2.1 Wallaroo instance, use the Wallaroo SDK version 2023.2.1.
  • Specify the version of pip dependencies.
  • The wallaroo.Client constructor auth_type argument is ignored. Using wallaroo.Client() is sufficient.
  • The following methods will assist with orchestrations:
    • wallaroo.in_task() : Returns True if the code is running within an orchestration task.
    • wallaroo.task_args(): Returns a Dict of invocation-specific arguments passed to the run_ calls.
  • Orchestrations will be run in the same way as running within the Wallaroo JupyterHub service, from the version of Python libraries (unless specifically overridden by the requirements.txt setting, which is not recommended), and running in the virtualized directory /home/jovyan/.

Orchestration Code Samples

The following demonstres using the wallaroo.in_task() and wallaroo.task_args() methods within an Orchestration. This sample code uses wallaroo.in_task() to verify whether or not the script is running as a Wallaroo Task. If true, it will gather the wallaroo.task_args() and use them to set the workspace and pipeline. If False, then it sets the pipeline and workspace manually.

# get the arguments
wl = wallaroo.Client()

# if true, get the arguments passed to the task
if wl.in_task():
  arguments = wl.task_args()
  
  # arguments is a key/value pair, set the workspace and pipeline name
  workspace_name = arguments['workspace_name']
  pipeline_name = arguments['pipeline_name']
  
# False:  We're not in a Task, so set the pipeline manually
else:
  workspace_name="bigqueryworkspace"
  pipeline_name="bigquerypipeline"

Orchestration Methods

The following methods are provided for creating and listing orchestrations.

Create Orchestration

An orchestration is created through the Wallaroo Client upload_orchestration(path) with the following parameters.

For the uploads, either the path to the .zip file is required, or bytes_buffer with name are required. path can not be used with bytes_buffer and name, and vice versa.

ParameterTypeDescription
pathString (Optional)The path to the .zip file that contains the orchestration package. Can not be use with bytes_buffer and name are used.
file_nameString (Optional)The file name to give to the zip file when uploaded.
bytes_buffer[bytes] (Optional)The .zip file object to be uploaded. Can not be used with path. Note that if the zip file is uploaded as from the bytes_buffer parameter and file_name is not included, then the file name in the Wallaroo orchestrations list will be -.
nameString (Optional)Sets the name of the byte uploaded zip file.

List Orchestrations

All orchestrations for a Wallaroo instances are listed via the Wallaroo Client list_orchestrations() method. It returns an array with the following.

ParameterTypeDescription
idStringThe UUID identifier for the orchestration.
last run statusStringThe last reported status the task. Valid values are:
  • packaging: The orchestration has been upload and is being prepared.
  • ready: The orchestration is available to be used as a task.
shaStringThe sha value of the uploaded orchestration.
nameStringThe name of the orchestration
filenameStringThe name of the uploaded orchestration file.
created atDateTimeThe date and time the orchestration was uploaded to the Wallaroo instance.
updated atDateTimeThe date and time a new version of the orchestration was uploaded.
wl.list_orchestrations()
idnamestatusfilenameshacreated atupdated at
0f90e606-09f8-409b-a306-cb04ec4c011acomprehensive samplereadyrem