Packaging and Deployment

How to upload and deploy your Wallaroo models.

Organizations upload or register ML models to their Wallaroo Ops instance. From there, they can:

Deploy models through a Wallaroo pipeline.
Compare model versions against each other for performance, accuracy, or other criteria.
Upload new versions of models.

A model or Machine Learning (ML) model is an algorithm developed using historical datasets (also known as training data) to generate a specific set of insights. Trained models can operate on future datasets (non-training sets) and offer predictions (also known as inferences). Inferences help inform decisions based on the similarity between historical data and future data.
Some examples of using an ML model are:
Approving credit card transactions based on fraud predictions.
Recommending a specific therapy to a patient based on diagnosis predictions.
Recommending a specific product to purchase in an e-commerce experience based on the consumer's likely interest in it, their predicted shopping budget, and the projected revenue from that consumer.

In Wallaroo, a model is the object that results from converting a model file artifact. A model file (e.g., a .zip or .onnx file) is typically produced by training a model outside of Wallaroo. Uploading the model file so that it can run in a given Wallaroo runtime (ONNX, TensorFlow, etc.) results in a Wallaroo model object. Model artifacts imported to Wallaroo may include other files related to the model, such as preprocessing files, postprocessing files, training sets, and notebooks.

ML models uploaded or registered in a Wallaroo Ops instance create a new model version. A model version is the version of the model object in Wallaroo. A model version update happens when a new model file (artifact) is uploaded under the same model object name.
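The versioning rule above can be illustrated with a small, self-contained sketch. This is not the Wallaroo SDK: `ModelRegistry`, `ModelVersion`, and `upload` are hypothetical names for a toy in-memory registry, used only to show that uploading a new artifact under an existing model name produces a new version rather than a new model.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ModelVersion:
    """One immutable version of a named model (hypothetical type)."""
    name: str
    version: int
    sha256: str  # fingerprint of the uploaded artifact


@dataclass
class ModelRegistry:
    """Toy in-memory registry; an illustration, not the Wallaroo API."""
    _versions: dict = field(default_factory=dict)

    def upload(self, name: str, artifact: bytes) -> ModelVersion:
        # Uploading under an existing name appends a new version
        # to that model's history instead of creating a new model.
        history = self._versions.setdefault(name, [])
        digest = hashlib.sha256(artifact).hexdigest()
        model_version = ModelVersion(name, len(history) + 1, digest)
        history.append(model_version)
        return model_version

    def versions(self, name: str) -> list:
        # Return a copy so callers cannot mutate the stored history.
        return list(self._versions.get(name, []))
```

In this sketch, two uploads under the name `"fraud-detector"` yield versions 1 and 2 of the same model, each tied to its own artifact fingerprint.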

When working with model deployments, it is the model version that is deployed and used in the Wallaroo Inference Engine to produce results.

The typical actions for Wallaroo model management are:

Create or access a workspace.
Upload or access a model and retrieve the specific model version.
If necessary, add any additional model version configuration.

Once a model is uploaded, it is deployed within a Wallaroo pipeline. When a pipeline is deployed, resources from the cluster are allocated for use by the pipeline and any models that are part of it.

Deployment follows this process:

Add models to a pipeline as model steps.
Set the pipeline deployment configuration to allocate CPUs, GPUs, and RAM, and to configure autoscaling, replicas, and other settings.
Deploy the pipeline.
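The three steps above can be sketched as a minimal, self-contained model of the workflow. The `Pipeline` and `DeploymentConfig` classes and their fields are hypothetical stand-ins written for this illustration; they are not the Wallaroo SDK, and the resource values are arbitrary.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class DeploymentConfig:
    """Resources requested for a deployment (hypothetical fields)."""
    cpus: float = 1.0
    gpus: int = 0
    memory: str = "1Gi"
    replicas: int = 1


@dataclass
class Pipeline:
    """Toy pipeline: an ordered list of model steps plus a config."""
    name: str
    steps: list = field(default_factory=list)
    config: Optional[DeploymentConfig] = None
    deployed: bool = False

    def add_model_step(self, model_version: str) -> "Pipeline":
        # Step 1: add a model version to the pipeline as a step.
        self.steps.append(model_version)
        return self

    def deploy(self, config: DeploymentConfig) -> "Pipeline":
        # Steps 2 and 3: attach the resource configuration, then deploy.
        if not self.steps:
            raise ValueError("a pipeline needs at least one model step")
        self.config = config
        self.deployed = True
        return self


pipeline = Pipeline("fraud-pipeline").add_model_step("fraud-detector:v2")
pipeline.deploy(DeploymentConfig(cpus=0.5, memory="512Mi", replicas=2))
```

Keeping the configuration in its own immutable object mirrors the separation described above: the model steps define what runs, while the deployment configuration defines the resources it runs with.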

Once the pipeline is deployed, inference requests are submitted to the pipeline.

Model Upload

How to upload models to a Wallaroo Ops instance.

Model Deploy

How to deploy ML models for inference requests.