Wallaroo MLOps API Essentials Guide: Model Registry

How to use the Wallaroo API for Model Registry aka Artifact Registries

Wallaroo users can register their trained machine learning models from a model registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

This guide details how to add ML Models from a model registry service into a Wallaroo instance.

Artifact Requirements

Models are uploaded to the Wallaroo instance as the specific artifact - the “file” or other data that represents the file itself. This must comply with the Wallaroo model requirements framework and version or it will not be deployed. Note that for models that fall outside of the supported model types, they can be registered to a Wallaroo workspace as MLFlow 1.30.0 containerized models.

Supported Models

The following frameworks are supported. Frameworks fall under either Native or Containerized runtimes in the Wallaroo engine. For more details, see the specific framework what runtime a specific model framework runs in.

Runtime DisplayModel Runtime SpacePipeline Configuration
tensorflowNativeNative Runtime Configuration Methods
onnxNativeNative Runtime Configuration Methods
pythonNativeNative Runtime Configuration Methods
mlflowContainerizedContainerized Runtime Deployment
flightContainerizedContainerized Runtime Deployment

Please note the following.

Wallaroo ONNX Requirements

Wallaroo natively supports Open Neural Network Exchange (ONNX) models into the Wallaroo engine.

ParameterDescription
Web Sitehttps://onnx.ai/
Supported LibrariesSee table below.
FrameworkFramework.ONNX aka onnx
RuntimeNative aka onnx

The following ONNX versions models are supported:

Wallaroo VersionONNX VersionONNX IR VersionONNX OPset VersionONNX ML Opset Version
2023.4.0 (October 2023)1.12.18173
2023.2.1 (July 2023)1.12.18173

For the most recent release of Wallaroo 2023.4.0, the following native runtimes are supported:

  • If converting another ML Model to ONNX (PyTorch, XGBoost, etc) using the onnxconverter-common library, the supported DEFAULT_OPSET_NUMBER is 17.

Using different versions or settings outside of these specifications may result in inference issues and other unexpected behavior.

ONNX models always run in the Wallaroo Native Runtime space.

Data Schemas

ONNX models deployed to Wallaroo have the following data requirements.

  • Equal rows constraint: The number of input rows and output rows must match.
  • All inputs are tensors: The inputs are tensor arrays with the same shape.
  • Data Type Consistency: Data types within each tensor are of the same type.

Equal Rows Constraint

Inference performed through ONNX models are assumed to be in batch format, where each input row corresponds to an output row. This is reflected in the in fields returned for an inference. In the following example, each input row for an inference is related directly to the inference output.

df = pd.read_json('./data/cc_data_1k.df.json')
display(df.head())

result = ccfraud_pipeline.infer(df.head())
display(result)

INPUT

 tensor
0[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
1[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
2[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
3[-1.0603297501, 2.3544967095000002, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192000001, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526000001, 1.9870535692, 0.7005485718000001, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439]
4[0.5817662108, 0.09788155100000001, 0.1546819424, 0.4754101949, -0.19788623060000002, -0.45043448540000003, 0.016654044700000002, -0.0256070551, 0.0920561602, -0.2783917153, 0.059329944100000004, -0.0196585416, -0.4225083157, -0.12175388770000001, 1.5473094894000001, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355000001, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.10867738980000001, 0.2547179311]

OUTPUT

 timein.tensorout.dense_1check_failures
02023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
12023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
22023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
32023-11-17 20:34:17.005[-1.0603297501, 2.3544967095, -3.5638788326, 5.1387348926, -1.2308457019, -0.7687824608, -3.5881228109, 1.8880837663, -3.2789674274, -3.9563254554, 4.0993439118, -5.6539176395, -0.8775733373, -9.131571192, -0.6093537873, -3.7480276773, -5.0309125017, -0.8748149526, 1.9870535692, 0.7005485718, 0.9204422758, -0.1041491809, 0.3229564351, -0.7418141657, 0.0384120159, 1.0993439146, 1.2603409756, -0.1466244739, -1.4463212439][0.99300325]0
42023-11-17 20:34:17.005[0.5817662108, 0.097881551, 0.1546819424, 0.4754101949, -0.1978862306, -0.4504344854, 0.0166540447, -0.0256070551, 0.0920561602, -0.2783917153, 0.0593299441, -0.0196585416, -0.4225083157, -0.1217538877, 1.5473094894, 0.2391622864, 0.3553974881, -0.7685165301, -0.7000849355, -0.1190043285, -0.3450517133, -1.1065114108, 0.2523411195, 0.0209441826, 0.2199267436, 0.2540689265, -0.0450225094, 0.1086773898, 0.2547179311][0.0010916889]0

All Inputs Are Tensors

All inputs into an ONNX model must be tensors. This requires that the shape of each element is the same. For example, the following is a proper input:

t [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]
Standard tensor array

Another example is a 2,2,3 tensor, where the shape of each element is (3,), and each element has 2 rows.

t = [
        [2.35, 5.75, 19.2],
        [3.72, 8.55, 10.5]
    ],
    [
        [5.55, 7.2, 15.7],
        [9.6, 8.2, 2.3]
    ]

In this example each element has a shape of (2,). Tensors with elements of different shapes, known as ragged tensors, are not supported. For example:

t = [
    [2.35, 5.75],
    [3.72, 8.55, 10.5],
    [5.55, 97.2]
])

**INVALID SHAPE**
Ragged tensor array - unsupported

For models that require ragged tensor or other shapes, see other data formatting options such as Bring Your Own Predict models.

Data Type Consistency

All inputs into an ONNX model must have the same internal data type. For example, the following is valid because all of the data types within each element are float32.

t = [
    [2.35, 5.75],
    [3.72, 8.55],
    [5.55, 97.2]
]

The following is invalid, as it mixes floats and strings in each element:

t = [
    [2.35, "Bob"],
    [3.72, "Nancy"],
    [5.55, "Wani"]
]

The following inputs are valid, as each data type is consistent within the elements.

df = pd.DataFrame({
    "t": [
        [2.35, 5.75, 19.2],
        [5.55, 7.2, 15.7],
    ],
    "s": [
        ["Bob", "Nancy", "Wani"],
        ["Jason", "Rita", "Phoebe"]
    ]
})
df
 ts
0[2.35, 5.75, 19.2][Bob, Nancy, Wani]
1[5.55, 7.2, 15.7][Jason, Rita, Phoebe]
ParameterDescription
Web Sitehttps://www.tensorflow.org/
Supported Librariestensorflow==2.9.3
FrameworkFramework.TENSORFLOW aka tensorflow
RuntimeNative aka tensorflow
Supported File TypesSavedModel format as .zip file

TensorFlow File Format

TensorFlow models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

ML models that meet the Tensorflow and SavedModel format will run as Wallaroo Native runtimes by default.

See the SavedModel guide for full details.

ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.PYTHON aka python
RuntimeNative aka python

Python models uploaded to Wallaroo are executed as a native runtime.

Note that Python models - aka “Python steps” - are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

This is contrasted with Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Python Models Requirements

Python models uploaded to Wallaroo are Python scripts that must include the wallaroo_json method as the entry point for the Wallaroo engine to use it as a Pipeline step.

This method receives the results of the previous Pipeline step, and its return value will be used in the next Pipeline step.

If the Python model is the first step in the pipeline, then it will be receiving the inference request data (for example: a preprocessing step). If it is the last step in the pipeline, then it will be the data returned from the inference request.

In the example below, the Python model is used as a post processing step for another ML model. The Python model expects to receive data from a ML Model who’s output is a DataFrame with the column dense_2. It then extracts the values of that column as a list, selects the first element, and returns a DataFrame with that element as the value of the column output.

def wallaroo_json(data: pd.DataFrame):
    print(data)
    return [{"output": [data["dense_2"].to_list()[0][0]]}]

In line with other Wallaroo inference results, the outputs of a Python step that returns a pandas DataFrame or Arrow Table will be listed in the out. metadata, with all inference outputs listed as out.{variable 1}, out.{variable 2}, etc. In the example above, this results the output field as the out.output field in the Wallaroo inference result.

 timein.tensorout.outputcheck_failures
02023-06-20 20:23:28.395[0.6878518042, 0.1760734021, -0.869514083, 0.3..[12.886651039123535]0
ParameterDescription
Web Sitehttps://huggingface.co/models
Supported Libraries
  • transformers==4.34.1
  • diffusers==0.14.0
  • accelerate==0.23.0
  • torchvision==0.14.1
  • torch==1.13.1
FrameworksThe following Hugging Face pipelines are supported by Wallaroo.
  • Framework.HUGGING_FACE_FEATURE_EXTRACTION aka hugging-face-feature-extraction
  • Framework.HUGGING_FACE_IMAGE_CLASSIFICATION aka hugging-face-image-classification
  • Framework.HUGGING_FACE_IMAGE_SEGMENTATION aka hugging-face-image-segmentation
  • Framework.HUGGING_FACE_IMAGE_TO_TEXT aka hugging-face-image-to-text
  • Framework.HUGGING_FACE_OBJECT_DETECTION aka hugging-face-object-detection
  • Framework.HUGGING_FACE_QUESTION_ANSWERING aka hugging-face-question-answering
  • Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG aka hugging-face-stable-diffusion-text-2-img
  • Framework.HUGGING_FACE_SUMMARIZATION aka hugging-face-summarization
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION aka hugging-face-text-classification
  • Framework.HUGGING_FACE_TRANSLATION aka hugging-face-translation
  • Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION aka hugging-face-zero-shot-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION aka hugging-face-zero-shot-image-classification
  • Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION aka hugging-face-zero-shot-object-detection
  • Framework.HUGGING_FACE_SENTIMENT_ANALYSIS aka hugging-face-sentiment-analysis
  • Framework.HUGGING_FACE_TEXT_GENERATION aka hugging-face-text-generation
  • Framework.HUGGING_FACE_AUTOMATIC_SPEECH_RECOGNITION aka hugging-face-automatic-speech-recognition
RuntimeContainerized flight

During the model upload process, the Wallaroo instance will attempt to convert the model to a Native Wallaroo Runtime. If unsuccessful based , it will create a Wallaroo Containerized Runtime for the model. See the model deployment section for details on how to configure pipeline resources based on the model’s runtime.

Hugging Face Schemas

Input and output schemas for each Hugging Face pipeline are defined below. Note that adding additional inputs not specified below will raise errors, except for the following:

  • Framework.HUGGING_FACE_IMAGE_TO_TEXT
  • Framework.HUGGING_FACE_TEXT_CLASSIFICATION
  • Framework.HUGGING_FACE_SUMMARIZATION
  • Framework.HUGGING_FACE_TRANSLATION

Additional inputs added to these Hugging Face pipelines will be added as key/pair value arguments to the model’s generate method. If the argument is not required, then the model will default to the values coded in the original Hugging Face model’s source code.

See the Hugging Face Pipeline documentation for more details on each pipeline and framework.

Wallaroo FrameworkReference
Framework.HUGGING_FACE_FEATURE_EXTRACTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string())
])
output_schema = pa.schema([
    pa.field('output', pa.list_(
        pa.list_(
            pa.float64(),
            list_size=128
        ),
    ))
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_IMAGE_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    pa.field('top_k', pa.int64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)),
    pa.field('label', pa.list_(pa.string(), list_size=2)),
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_IMAGE_SEGMENTATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
    pa.field('mask_threshold', pa.float64()),
    pa.field('overlap_mask_area_threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('mask', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=100
                ),
                list_size=100
            ),
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_IMAGE_TO_TEXT

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.list_( #required
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=100
        ),
        list_size=100
    )),
    # pa.field('max_new_tokens', pa.int64()),  # optional
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string())),
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_OBJECT_DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('threshold', pa.float64()),
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())),
    pa.field('label', pa.list_(pa.string())),
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_QUESTION_ANSWERING

Schemas:

input_schema = pa.schema([
    pa.field('question', pa.string()),
    pa.field('context', pa.string()),
    pa.field('top_k', pa.int64()),
    pa.field('doc_stride', pa.int64()),
    pa.field('max_answer_len', pa.int64()),
    pa.field('max_seq_len', pa.int64()),
    pa.field('max_question_len', pa.int64()),
    pa.field('handle_impossible_answer', pa.bool_()),
    pa.field('align_to_words', pa.bool_()),
])

output_schema = pa.schema([
    pa.field('score', pa.float64()),
    pa.field('start', pa.int64()),
    pa.field('end', pa.int64()),
    pa.field('answer', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_STABLE_DIFFUSION_TEXT_2_IMG

Schemas:

input_schema = pa.schema([
    pa.field('prompt', pa.string()),
    pa.field('height', pa.int64()),
    pa.field('width', pa.int64()),
    pa.field('num_inference_steps', pa.int64()), # optional
    pa.field('guidance_scale', pa.float64()), # optional
    pa.field('negative_prompt', pa.string()), # optional
    pa.field('num_images_per_prompt', pa.string()), # optional
    pa.field('eta', pa.float64()) # optional
])

output_schema = pa.schema([
    pa.field('images', pa.list_(
        pa.list_(
            pa.list_(
                pa.int64(),
                list_size=3
            ),
            list_size=128
        ),
        list_size=128
    )),
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_SUMMARIZATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_text', pa.bool_()),
    pa.field('return_tensors', pa.bool_()),
    pa.field('clean_up_tokenization_spaces', pa.bool_()),
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('summary_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_TEXT_CLASSIFICATION

Schemas

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('top_k', pa.int64()), # optional
    pa.field('function_to_apply', pa.string()), # optional
])

output_schema = pa.schema([
    pa.field('label', pa.list_(pa.string(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # list with a number of items same as top_k, list_size can be skipped but may lead in worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_TRANSLATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('src_lang', pa.string()), # optional
    pa.field('tgt_lang', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('translation_text', pa.string()),
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_ZERO_SHOT_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', pa.string()), # required
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
    pa.field('multi_label', pa.bool_()), # optional
])

output_schema = pa.schema([
    pa.field('sequence', pa.string()),
    pa.field('scores', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
    pa.field('labels', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels, list_size can be skipped by may result in slightly worse performance
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_ZERO_SHOT_IMAGE_CLASSIFICATION

Schemas:

input_schema = pa.schema([
    pa.field('inputs', # required
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=100
            ),
        list_size=100
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=2)), # required
    pa.field('hypothesis_template', pa.string()), # optional
]) 

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64(), list_size=2)), # same as number of candidate labels
    pa.field('label', pa.list_(pa.string(), list_size=2)), # same as number of candidate labels
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_ZERO_SHOT_OBJECT_DETECTION

Schemas:

input_schema = pa.schema([
    pa.field('images', 
        pa.list_(
            pa.list_(
                pa.list_(
                    pa.int64(),
                    list_size=3
                ),
                list_size=640
            ),
        list_size=480
    )),
    pa.field('candidate_labels', pa.list_(pa.string(), list_size=3)),
    pa.field('threshold', pa.float64()),
    # pa.field('top_k', pa.int64()), # we want the model to return exactly the number of predictions, we shouldn't specify this
])

output_schema = pa.schema([
    pa.field('score', pa.list_(pa.float64())), # variable output, depending on detected objects
    pa.field('label', pa.list_(pa.string())), # variable output, depending on detected objects
    pa.field('box', 
        pa.list_( # dynamic output, i.e. dynamic number of boxes per input image, each sublist contains the 4 box coordinates 
            pa.list_(
                    pa.int64(),
                    list_size=4
                ),
            ),
    ),
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_SENTIMENT_ANALYSISHugging Face Sentiment Analysis
Wallaroo FrameworkReference
Framework.HUGGING_FACE_TEXT_GENERATION

Any parameter that is not part of the required inputs list will be forwarded to the model as a key/pair value to the underlying models generate method. If the additional input is not supported by the model, an error will be returned.

input_schema = pa.schema([
    pa.field('inputs', pa.string()),
    pa.field('return_tensors', pa.bool_()), # optional
    pa.field('return_text', pa.bool_()), # optional
    pa.field('return_full_text', pa.bool_()), # optional
    pa.field('clean_up_tokenization_spaces', pa.bool_()), # optional
    pa.field('prefix', pa.string()), # optional
    pa.field('handle_long_generation', pa.string()), # optional
    # pa.field('extra_field', pa.int64()), # every extra field you specify will be forwarded as a key/value pair
])

output_schema = pa.schema([
    pa.field('generated_text', pa.list_(pa.string(), list_size=1))
])
Wallaroo FrameworkReference
Framework.HUGGING_FACE_AUTOMATIC_SPEECH_RECOGNITION

Sample input and output schema.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32())), # required: the audio stored in numpy arrays of shape (num_samples,) and data type `float32`
    pa.field('return_timestamps', pa.string()) # optional: return start & end times for each predicted chunk
]) 

output_schema = pa.schema([
    pa.field('text', pa.string()), # required: the output text corresponding to the audio input
    pa.field('chunks', pa.list_(pa.struct([('text', pa.string()), ('timestamp', pa.list_(pa.float32()))]))), # required (if `return_timestamps` is set), start & end times for each predicted chunk
])
ParameterDescription
Web Sitehttps://pytorch.org/
Supported Libraries
  • torch==1.13.1
  • torchvision==0.14.1
FrameworkFramework.PYTORCH aka pytorch
Supported File Typespt ot pth in TorchScript format
Runtimeonnx/flight
  • IMPORTANT NOTE: The PyTorch model must be in TorchScript format. scripting (i.e. torch.jit.script() is always recommended over tracing (i.e. torch.jit.trace()). From the PyTorch documentation: “Scripting preserves dynamic control flow and is valid for inputs of different sizes.” For more details, see TorchScript-based ONNX Exporter: Tracing vs Scripting.

During the model upload process, the Wallaroo instance will attempt to convert the model to a Native Wallaroo Runtime. If unsuccessful based , it will create a Wallaroo Containerized Runtime for the model. See the model deployment section for details on how to configure pipeline resources based on the model’s runtime.

  • IMPORTANT CONFIGURATION NOTE: For PyTorch input schemas, the floats must be pyarrow.float32() for the PyTorch model to be converted to the Native Wallaroo Runtime during the upload process.

Sci-kit Learn aka SKLearn.

ParameterDescription
Web Sitehttps://scikit-learn.org/stable/index.html
Supported Libraries
  • scikit-learn==1.3.0
FrameworkFramework.SKLEARN aka sklearn
Runtimeonnx / flight

During the model upload process, the Wallaroo instance will attempt to convert the model to a Native Wallaroo Runtime. If unsuccessful based , it will create a Wallaroo Containerized Runtime for the model. See the model deployment section for details on how to configure pipeline resources based on the model’s runtime.

SKLearn Schema Inputs

SKLearn schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an SKLearn model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

SKLearn Schema Outputs

Outputs for SKLearn that are meant to be predictions or probabilities when output by the model are labeled in the output schema for the model when uploaded to Wallaroo. For example, a model that outputs either 1 or 0 as its output would have the output schema as follows:

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

When used in Wallaroo, the inference result is contained in the out metadata as out.predictions.

pipeline.infer(dataframe)
 timein.inputsout.predictionscheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.tensorflow.org/api_docs/python/tf/keras/Model
Supported Libraries
  • tensorflow==2.9.3
  • keras==2.9.0
FrameworkFramework.KERAS aka keras
Supported File TypesSavedModel format as .zip file and HDF5 format
Runtimeonnx/flight

During the model upload process, the Wallaroo instance will attempt to convert the model to a Native Wallaroo Runtime. If unsuccessful based , it will create a Wallaroo Containerized Runtime for the model. See the model deployment section for details on how to configure pipeline resources based on the model’s runtime.

TensorFlow Keras SavedModel Format

TensorFlow Keras SavedModel models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:

├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index

This is compressed into the .zip file alohacnnlstm.zip with the following command:

zip -r alohacnnlstm.zip alohacnnlstm/

See the SavedModel guide for full details.

TensorFlow Keras H5 Format

Wallaroo supports the H5 for Tensorflow Keras models.

ParameterDescription
Web Sitehttps://xgboost.ai/
Supported Libraries
  • scikit-learn==1.3.0
  • xgboost==1.7.4
FrameworkFramework.XGBOOST aka xgboost
Supported File Typespickle (XGB files are not supported.)
Runtimeonnx / flight

During the model upload process, the Wallaroo instance will attempt to convert the model to a Native Wallaroo Runtime. If unsuccessful based , it will create a Wallaroo Containerized Runtime for the model. See the model deployment section for details on how to configure pipeline resources based on the model’s runtime.

XGBoost Schema Inputs

XGBoost schema follows a different format than other models. To prevent inputs from being out of order, the inputs should be submitted in a single row in the order the model is trained to accept, with all of the data types being the same. If a model is originally trained to accept inputs of different data types, it will need to be retrained to only accept one data type for each column - typically pa.float64() is a good choice.

For example, the following DataFrame has 4 columns, each column a float.

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

For submission to an XGBoost model, the data input schema will be a single array with 4 float values.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=4))
])

When submitting as an inference, the DataFrame is converted to rows with the column data expressed as a single array. The data must be in the same order as the model expects, which is why the data is submitted as a single array rather than JSON labeled columns: this insures that the data is submitted in the exact order as the model is trained to accept.

Original DataFrame:

 sepal length (cm)sepal width (cm)petal length (cm)petal width (cm)
05.13.51.40.2
14.93.01.40.2

Converted DataFrame:

 inputs
0[5.1, 3.5, 1.4, 0.2]
1[4.9, 3.0, 1.4, 0.2]

XGBoost Schema Outputs

Outputs for XGBoost are labeled based on the trained model outputs. For this example, the output is simply a single output listed as output. In the Wallaroo inference result, it is grouped with the metadata out as out.output.

output_schema = pa.schema([
    pa.field('output', pa.int32())
])
pipeline.infer(dataframe)
 timein.inputsout.outputcheck_failures
02023-07-05 15:11:29.776[5.1, 3.5, 1.4, 0.2]00
12023-07-05 15:11:29.776[4.9, 3.0, 1.4, 0.2]00
ParameterDescription
Web Sitehttps://www.python.org/
Supported Librariespython==3.8
FrameworkFramework.CUSTOM aka custom
RuntimeContainerized aka flight

Arbitrary Python models, also known as Bring Your Own Predict (BYOP) allow for custom model deployments with supporting scripts and artifacts. These are used with pre-trained models (PyTorch, Tensorflow, etc) along with whatever supporting artifacts they require. Supporting artifacts can include other Python modules, model files, etc. These are zipped with all scripts, artifacts, and a requirements.txt file that indicates what other Python models need to be imported that are outside of the typical Wallaroo platform.

Contrast this with Wallaroo Python models - aka “Python steps”. These are standalone python scripts that use the python libraries natively supported by the Wallaroo platform. These are used for either simple model deployment (such as ARIMA Statsmodels), or data formatting such as the postprocessing steps. A Wallaroo Python model will be composed of one Python script that matches the Wallaroo requirements.

Arbitrary Python File Requirements

Arbitrary Python (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:

ArtifactTypeDescription
Python scripts aka .py files with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilderPython ScriptExtend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Arbitrary Python Script Requirements. Note that there is no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder - any qualified class name is sufficient as long as these two classes are extended as defined below.
requirements.txtPython requirements fileThis sets the Python libraries used for the arbitrary python model. These libraries should be targeted for Python 3.8 compliance. These requirements and the versions of libraries should be exactly the same between creating the model and deploying it in Wallaroo. This insures that the script and methods will function exactly the same as during the model creation process.
Other artifactsFilesOther models, files, and other artifacts used in support of this model.

For example, the if the arbitrary python model will be known as vgg_clustering, the contents may be in the following structure, with vgg_clustering as the storage directory:

vgg_clustering\
    feature_extractor.h5
    kmeans.pkl
    custom_inference.py
    requirements.txt

Note the inclusion of the custom_inference.py file. This file name is not required - any Python script or scripts that extend the classes listed above are sufficient. This Python script could have been named vgg_custom_model.py or any other name as long as it includes the extension of the classes listed above.

The sample arbitrary python model file is created with the command zip -r vgg_clustering.zip vgg_clustering/.

Wallaroo Arbitrary Python uses the Wallaroo SDK mac module, included in the Wallaroo SDK 2023.2.1 and above. See the Wallaroo SDK Install Guides for instructions on installing the Wallaroo SDK.

Arbitrary Python Script Requirements

The entry point of the arbitrary python model is any python script that extends the following classes. These are included with the Wallaroo SDK. The required methods that must be overridden are specified in each section below.

  • mac.inference.Inference interface serves model inferences based on submitted input some input. Its purpose is to serve inferences for any supported arbitrary model framework (e.g. scikit, keras etc.).

    classDiagram
        class Inference {
            <<Abstract>>
            +model Optional[Any]
            +expected_model_types()* Set
            +predict(input_data: InferenceData)*  InferenceData
            -raise_error_if_model_is_not_assigned() None
            -raise_error_if_model_is_wrong_type() None
        }
  • mac.inference.creation.InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to to the Inference object.

    classDiagram
        class InferenceBuilder {
            +create(config InferenceConfig) * Inference
            -inference()* Any
        }

mac.inference.Inference

mac.inference.Inference Objects
ObjectTypeDescription
model (Required)[Any]One or more objects that match the expected_model_types. This can be a ML Model (for inference use), a string (for data conversion), etc. See Arbitrary Python Examples for examples.
mac.inference.Inference Methods
MethodReturnsDescription
expected_model_types (Required)SetReturns a Set of models expected for the inference as defined by the developer. Typically this is a set of one. Wallaroo checks the expected model types to verify that the model submitted through the InferenceBuilder method matches what this Inference class expects.
_predict (input_data: mac.types.InferenceData) (Required)mac.types.InferenceDataThe entry point for the Wallaroo inference with the following input and output parameters that are defined when the model is updated.
  • mac.types.InferenceData: The input InferenceData is a Dictionary of numpy arrays derived from the input_schema detailed when the model is uploaded, defined in PyArrow.Schema format.
  • mac.types.InferenceData: The output is a Dictionary of numpy arrays as defined by the output parameters defined in PyArrow.Schema format.
The InferenceDataValidationError exception is raised when the input data does not match mac.types.InferenceData.
raise_error_if_model_is_not_assignedN/AError when a model is not set to Inference.
raise_error_if_model_is_wrong_typeN/AError when the model does not match the expected_model_types.

The example, the expected_model_types can be defined for the KMeans model.

from sklearn.cluster import KMeans

class SampleClass(mac.inference.Inference):
    @property
    def expected_model_types(self) -> Set[Any]:
        return {KMeans}

mac.inference.creation.InferenceBuilder

InferenceBuilder builds a concrete Inference, i.e. instantiates an Inference object, loads the appropriate model and assigns the model to the Inference.

classDiagram
    class InferenceBuilder {
        +create(config InferenceConfig) * Inference
        -inference()* Any
    }

Each model that is included requires its own InferenceBuilder. InferenceBuilder loads one model, then submits it to the Inference class when created. The Inference class checks this class against its expected_model_types() Set.

mac.inference.creation.InferenceBuilder Methods
MethodReturnsDescription
create(config mac.config.inference.CustomInferenceConfig) (Required)The custom Inference instance.Creates an Inference subclass, then assigns a model and attributes. The CustomInferenceConfig is used to retrieve the config.model_path, which is a pathlib.Path object pointing to the folder where the model artifacts are saved. Every artifact loaded must be relative to config.model_path. This is set when the arbitrary python .zip file is uploaded and the environment for running it in Wallaroo is set. For example: loading the artifact vgg_clustering\feature_extractor.h5 would be set with config.model_path \ feature_extractor.h5. The model loaded must match an existing module. For our example, this is from sklearn.cluster import KMeans, and this must match the Inference expected_model_types.
inferencecustom Inference instance.Returns the instantiated custom Inference object created from the create method.

Arbitrary Python Runtime

Arbitrary Python always run in the containerized model runtime.

ParameterDescription
Web Sitehttps://mlflow.org
Supported Librariesmlflow==1.3.0
RuntimeContainerized aka mlflow

For models that do not fall under the supported model frameworks, organizations can use containerized MLFlow ML Models.

This guide details how to add ML Models from a model registry service into Wallaroo.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

Wallaroo users can register their trained MLFlow ML Models from a containerized model container registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

As of this time, Wallaroo only supports MLFlow 1.30.0 containerized models. For information on how to containerize an MLFlow model, see the MLFlow Documentation.

Wallaroo supports both public and private containerized model registries. See the Wallaroo Private Containerized Model Container Registry Guide for details on how to configure a Wallaroo instance with a private model registry.

List Wallaroo Frameworks

Wallaroo frameworks are listed from the Wallaroo.Framework class. The following demonstrates listing all available supported frameworks.

from wallaroo.framework import Framework

[e.value for e in Framework]

    ['onnx',
    'tensorflow',
    'python',
    'keras',
    'sklearn',
    'pytorch',
    'xgboost',
    'hugging-face-feature-extraction',
    'hugging-face-image-classification',
    'hugging-face-image-segmentation',
    'hugging-face-image-to-text',
    'hugging-face-object-detection',
    'hugging-face-question-answering',
    'hugging-face-stable-diffusion-text-2-img',
    'hugging-face-summarization',
    'hugging-face-text-classification',
    'hugging-face-translation',
    'hugging-face-zero-shot-classification',
    'hugging-face-zero-shot-image-classification',
    'hugging-face-zero-shot-object-detection',
    'hugging-face-sentiment-analysis',
    'hugging-face-text-generation']

Registry Services Roles

Registry service use in Wallaroo typically falls under the following roles.

RoleRecommended ActionsDescription
DevOps EngineerCreate Model RegistryCreate the model (AKA artifact) registry service
 Retrieve Model Registry TokensGenerate the model registry service credentials.
MLOps EngineerConnect Model Registry to WallarooAdd the Registry Service URL and credentials into a Wallaroo instance for use by other users and scripts.
 Add Wallaroo Registry Service to WorkspaceAdd the registry service configuration to a Wallaroo workspace for use by workspace users.
 Get Registry DetailsRetrieve the connection details for a Wallaroo registry.
 Remove Wallaroo Registry from a WorkspaceAdd the registry service configuration to a Wallaroo workspace for use by workspace users.
Data ScientistList Models in RegistryList available models in a model registry.
 List Model Version ArtifactsRetrieve the artifacts (usually files) for a model stored in a model registry.
 Upload Model from RegistryUpload a model and artifacts stored in a model registry into a Wallaroo workspace.

Model Registry Operations

The following links to guides and information on setting up a model registry (also known as an artifact registry).

Create Model Registry

See Model serving with Azure Databricks for setting up a model registry service using Azure Databricks.

The following steps create an Access Token used to authenticate to an Azure Databricks Model Registry.

  1. Log into the Azure Databricks workspace.
  2. From the upper right corner access the User Settings.
  3. From the Access tokens, select Generate new token.
  4. Specify any token description and lifetime. Once complete, select Generate.
  5. Copy the token and store in a secure place. Once the Generate New Token module is closed, the token will not be retrievable.
Retrieve Azure Databricks User Token

The MLflow Model Registry provides a method of setting up a model registry service. Full details can be found at the MLflow Registry Quick Start Guide.

A generic MLFlow model registry requires no token.

Wallaroo Registry Operations

  • Connect Model Registry to Wallaroo: This details the link and connection information to a existing MLFlow registry service. Note that this does not create a MLFlow registry service, but adds the connection and credentials to Wallaroo to allow that MLFlow registry service to be used by other entities in the Wallaroo instance.
  • Add a Registry to a Workspace: Add the created Wallaroo Model Registry so make it available to other workspace members.
  • Remove a Registry from a Workspace: Remove the link between a Wallaroo Model Registry and a Wallaroo workspace.

Connect Model Registry to Wallaroo

MLFlow Registry connection information is added to a Wallaroo instance through the following endpoint.

  • REQUEST URL
    • v1/api/models/create_registry
  • PARAMETERS
    • workspace_id (Integer Required): The numerical ID of the workspace to create the registry in.
    • name (String Required): The name for the registry. Registry names are not unique.
    • url (String Required): The full URL of the registry service. For example: https://registry.wallaroo.ai
    • token (String Required): The authentication token used by the registry service.
  • RETURNS
    • id (String): The UUID of the registry.
    • workspace_id: The numerical ID of the workspace the registry was connected to.

Connect Model Registry to Wallaroo Example

The following registry will be added to the workspace with the id 1.

import requests
token = "abcdefg"

x = requests.post("https://{APIURL}/v1/api/models/create_registry", 
                    json={
                        "workspace_id": 1, 
                        "name": "sample registry", 
                        "url": "https://registry.wallaroo.ai", 
                        "token": token
                        }, 
                        headers=wl.auth.auth_header()
                    )

{'id': '98f9ca1d-c4e7-4d70-8df4-05c25a64be29', 'workspace_id': 1}

Add Wallaroo Registry Service to Workspace

Registries are assigned to a Wallaroo workspace with the following endpoint. This allows members of the workspace to access the registry connection. A registry can be associated with one or more workspaces.

  • REQUEST URL
    • v1/api/models//v1/api/models/attach_registry_to_workspace
  • PARAMETERS
    • workspace_id (Integer Required): The numerical ID of the workspace to create the registry in.
    • registry_id (String Required): The ID of the registry in UUID format.
  • RETURNS
    • id (String): The UUID of the registry.
    • workspace_id: The numerical ID of the workspace the registry was connected to.

Add Wallaroo Registry Service to Workspace Example

import requests
token = "abcdefg"

x = requests.post("https://{APIURL}/v1/api/models/attach_registry_to_workspace", 
                    json={
                      'id': '98f9ca1d-c4e7-4d70-8df4-05c25a64be29', 
                      'workspace_id': 1},
                        headers=wl.auth.auth_header()
                    )

{
  "registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "workspace_id": 0
}

Remove Wallaroo Registry from a Workspace

Registries are removed from a registry with the following endpoint. This does not remove the registry connection information from the Wallaroo instance, merely removes the association between the registry and that particular workspace.

  • REQUEST URL
    • v1/api/models//v1/api/models/remove_registry_from_workspace
  • PARAMETERS
    • workspace_id (Integer Required): The numerical ID of the workspace to remove the registry from.
    • registry_id (String Required): The ID of the registry in UUID format.
  • RETURNS
    • id (String): The UUID of the registry.
    • workspace_id: The numerical ID of the workspace the registry was connected to.

Remove Wallaroo Registry from a Workspace Example

import requests
token = "abcdefg"

x = requests.post("https://{APIURL}/v1/api/models/remove_registry_from_workspace", 
                    json={
                      'id': '98f9ca1d-c4e7-4d70-8df4-05c25a64be29', 
                      'workspace_id': 1
                      },
                        headers=wl.auth.auth_header()
                    )

{
  "registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "workspace_id": 0
}

Get Registry Details

  • REQUEST URL
    • v1/api/models/get_registry
  • PARAMETERS
    • id (String Required): The registry ID in UUID format.
  • RETURNS
    • id (String): The UUID of the registry.
    • name (String Required): The name for the registry. Registry names are not unique.
    • url (String Required): The full URL of the registry service. For example: https://registry.wallaroo.ai
    • token (String Required): The authentication token used by the registry service.
    • created_at (String Required): The creation date in DateTime format.
    • updated_at (String Required): The updated date in DateTime format.

Get Registry Details Example

The following example demonstrates retrieving the registry with id 98f9ca1d-c4e7-4d70-8df4-05c25a64be29.

import requests
x = requests.post("https://{APIURL}/v1/api/models/get_registry", 
                    json={
                        "registry_id": '98f9ca1d-c4e7-4d70-8df4-05c25a64be29'
                        }, 
                    headers=wl.auth.auth_header()
                )
{
    'id': '98f9ca1d-c4e7-4d70-8df4-05c25a64be29',
    'name': 'sample registry',
    'url': 'https://registry.wallaroo.ai',
    'token': 'dapi67c8c0b04606f730e78b7ae5e3221015-3',
    'created_at': '2023-06-23T15:37:38.38427+00:00',
    'updated_at': '2023-06-23T15:37:38.38427+00:00'
}

Wallaroo Registry Model Operations

List Registries in Workspace

A list of registries associates with a specific workspace are retrieved from the following endpoint.

  • REQUEST URL
    • POST /v1/api/models/list_registries
  • PARAMETERS
    • workspace-id (String Required): The numerical id of the workspace.
  • RETURNS
    • A list of registries with the following fields.
      • created_at (String): The UTC DateTime stamp of when the registry was created.
      • id (String): The id of the Wallaroo registry in UUID format.
      • name (String): The assigned name of the registry.
      • token (String): The authentication token to the model registry.
      • updated_at (String): The UTC DateTime stamp of when the registry was last updated.
      • url (String): The URL to the registry service.

List Registries in Workspace Example

import requests
x = requests.post("{APIURL}v1/api/models/list_registries", 
                    json={
                      "workspace_id": 1
                      }, 
                      headers=wl.auth.auth_header()
                      )
x.json()

[
  {
    "created_at": "2023-07-11T15:21:24.403Z",
    "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
    "name": "sample registry",
    "token": "string",
    "updated_at": "2023-07-11T15:21:24.403Z",
    "url": "string"
  }
]

List Models in a Registry

A List of models available to the Wallaroo instance through the MLFlow Registry is performed with the following endpoint.

  • REQUEST URL
    • v1/api/models/list_registry_models
  • PARAMETERS
    • id (String Required): The registry ID in UUID format.
  • RETURNS
    • next_page_token (String): Used to retrieve the next List of models. If None, there are no additional models to list after this request.
    • registered_models (List): List of Models based on the mlflow.entities.model_registry.ModelVersion specification.
      • name (String): The name of the model.
      • user_id (String): The ID of the user.
      • latest_versions (List): The list of other versions of the model. If there are no other versions of the model, this will be None. Each version has the following fields.
        • name (String): The name of the artifact.
        • description (String): The description of the artifact.
        • version (Integer): The version number. Other versions may be removed from the registry, so it is possible for a version to be higher than 1 even if there are no other versions still stored in the registry.
        • status (String): The current status of the model.
        • run_id (String): The run id from the MLFlow service that created the model.
        • run_link (String): Link to the run from the MLFlow service that created the model.
        • source (String): The URL for the specific version artifact on this registry service. For example: 'dbfs:/databricks/mlflow-tracking/123456/abcdefg/artifacts/random_forest_model'.
        • current_stage (String): The current stage of the model version.
        • creation_timestamp (Integer): Creation timestamp in milliseconds since the Unix epoch.
        • last_updated_timestamp (Integer): Last updated timestamp in milliseconds since the Unix epoch.

List Models in a Registry Example

import requests
x = requests.post("{APIURL}v1/api/models/list_registry_models", 
                    json={
                      "registry_id": id
                      }, 
                      headers=wl.auth.auth_header()
                      )
x.json()

{
    'next_page_token': None,
    'registered_models': [
        {
            'name': 'testmodel', 
            'user_id': 'sample.usersj@wallaroo.ai', 
            'latest_versions': None, 
            'creation_timestamp': 1686940722329, 
            'last_updated_timestamp': 1686940722329
        },
        {
            'name': 'testmodel2', 
            'user_id': 'sample.user@wallaroo.ai', 
            'latest_versions': None, 
            'creation_timestamp': 1686940864528, 
            'last_updated_timestamp': 1686940864528
        },
        {
            'name': 'wine_quality', 
            'user_id': 'sample.user@wallaroo.ai', 
            'latest_versions': [
                {
                    'name': 'wine_quality', 
                    'description': None, 
                    'version': '1', 
                    'status': 'READY', 
                    'run_id': 'abcdefg', 
                    'run_link': None, 
                    'source': 'dbfs:/databricks/mlflow-tracking/abcdefg/abcdefg/artifacts/random_forest_model', 
                    'current_stage': 'Archived', 
                    'creation_timestamp': 1686942353367, 
                    'last_updated_timestamp': 1686942597509
                },
                {
                    'name': 'wine_quality', 
                    'description': None, 
                    'version': '2', 
                    'status': 'READY', 
                    'run_id': 'abcdefg', 
                    'run_link': None, 
                    'source': 'dbfs:/databricks/mlflow-tracking/abcdefg/abcdefg/artifacts/model', 
                    'current_stage': 'Production', 
                    'creation_timestamp': 1686942576120, 
                    'last_updated_timestamp': 1686942597646
                }
            ], 
            'creation_timestamp': 1686942353127, 
            'last_updated_timestamp': 1686942597646
        }
    ]
}

List Model Version Artifacts

The artifacts of a specific model version are retrieved through the following endpoint.

  • REQUEST URL
    • v1/api/models/list_registry_model_version_artifacts
  • PARAMETERS
    • registry_id (String Required): The registry ID in UUID format.
    • name (String Required): The name of the model to retrieve artifacts from.
    • version (String Required): The version of the model to retrieve artifacts from.
  • RETURNS
    • List of artifacts with the following fields.
      • file_size (Integer): The size of the file in bytes.
      • full_path (String): The full path to the artifact. For example: https://adb-5939996465837398.18.azuredatabricks.net/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl
      • is_dir (Boolean): Whether the artifact is a directory or file.
      • modification_time (Integer): Last modification timestamp in milliseconds since the Unix epoch
      • path (String): Relative path to the artifact within the registry. For example: /databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl

List Model Version Artifacts Example

import requests
x = requests.post("{APIURL}v1/api/models//list_registry_model_version_artifacts", 
                    json={
                      "name": "wine_quality",
                      "registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
                      "version": "2"
                      }, 
                      headers=wl.auth.auth_header()
                      )
x.json()

[
  {
    "file_size": 156456,
    "full_path": "https://adb-5939996465837398.18.azuredatabricks.net/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl",
    "is_dir": false,
    "modification_time": 1686942597646,
    "path": "/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl"
  }
]

Upload Model from Registry

Models are uploaded from a model registry configured in Wallaroo through the following endpoint. The specific artifact that is the model to be deployed is the item to upload to the Wallaroo workspace. Models must comply with Wallaroo model framework and versions as defined in Artifact Requirements.

  • REQUEST URL
    • v1/api/models/upload_from_registry
  • PARAMETERS
    • registry_id (String Required): The registry ID in UUID format.
    • name (String Required): The name to assign the model in Wallaroo. Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must be ASCII alpha-numeric characters or dash (-) only. . and _ are not allowed.
    • path (String Required): URL of the model to upload as specified in List Models in a Registry source field.
    • visibility (String Required): Whether the model is public or private.
    • workspace_id (Integer Required): The numerical id of the workspace to upload the model to.
    • conversion (String Required): The conversion parameters that include the following:
      • framework (String Required): The framework of the model being uploaded. See the list of supported models for more details.
      • python_version (String Required): The version of Python required for model.
      • requirements (String Required): Required libraries. Can be [] if the requirements are default Wallaroo JupyterHub libraries.
      • input_schema (String Optional): The input schema from the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
      • output_schema (String Optional): The output schema from the Apache Arrow pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
  • RETURNS
    • model_id (String): The numerical id of the model uploaded

Upload Model from Registry Example

import requests
x = requests.post("{APIURL}v1/api/models/upload_from_registry", json={
    "registry_id": id, 
    "name": "uploaded-model-name",
    "path": "<DBFS URL from list_artifacts() call here>",
    "visibility": "public",
    "workspace_id": 1,
    "conversion": {
        "framework": "sklearn",
        "requirements": [],
    },
    "input_schema": "<base64-encoded input schema here>",
    "output_schema": "<base64-encoded output schema here>"
}, headers=wl.auth.auth_header())
x.json()

{'model_id': 34}