Wallaroo SDK Essentials Guide: Model Uploads and Registrations: PyTorch

How to upload and use PyTorch ML Models with Wallaroo

Model Naming Requirements

Model names map onto Kubernetes objects and must be DNS compliant. Model names may contain only ASCII alphanumeric characters and dashes (-). Periods (.) and underscores (_) are not allowed.
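A name can be checked before upload with a short helper. This is a minimal sketch of the character rule stated above only. `is_valid_model_name` is a hypothetical helper, not part of the Wallaroo SDK, and it does not check any further DNS constraints (such as length or letter case) that a Wallaroo instance may enforce.

```python
import re

# Hypothetical helper (not part of the Wallaroo SDK): checks only the rule
# stated above, i.e. ASCII letters, digits, and dashes.
_MODEL_NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")


def is_valid_model_name(name: str) -> bool:
    """Return True if name contains only ASCII alphanumerics and dashes."""
    return _MODEL_NAME_PATTERN.fullmatch(name) is not None


print(is_valid_model_name("pt-single-io-model"))  # True
print(is_valid_model_name("pt_single.io_model"))  # False
```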

Wallaroo supports PyTorch models by containerizing the model and running it as an image.

Parameter | Description
Web Site | https://pytorch.org/
Supported Libraries | torch==1.13.1, torchvision==0.14.1
Framework | Framework.PYTORCH aka pytorch
Supported File Types | pt or pth in TorchScript format
Runtime | Containerized aka mlflow
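
Because only TorchScript files are accepted, a model defined as a regular torch.nn.Module has to be exported first. A minimal sketch using torch.jit.script follows. The `SingleIOModel` class and the output file name are hypothetical, chosen only for illustration, and the 10-in, 1-out shape is an arbitrary choice.

```python
import torch


class SingleIOModel(torch.nn.Module):
    """Hypothetical example model: 10 input features, 1 output value."""

    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(10, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.linear(x)


model = SingleIOModel().eval()

# Compile to TorchScript and save to a .pt file (illustrative file name).
scripted = torch.jit.script(model)
scripted.save("example_single_io_model.pt")

# Reload to confirm the file is a loadable TorchScript module.
reloaded = torch.jit.load("example_single_io_model.pt")
output = reloaded(torch.zeros(1, 10))
print(tuple(output.shape))  # (1, 1)
```

A file saved with plain torch.save(model, ...) is a pickled Python object rather than TorchScript, so the export step above is the part that matters.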

Uploading PyTorch Models

PyTorch models are uploaded to Wallaroo through the Wallaroo Client upload_model method.

Upload PyTorch Model Parameters

The following parameters are required for PyTorch models. Some of these fields are optional for the upload_model method in general, but they are required to upload a PyTorch model to Wallaroo correctly.

Parameter | Type | Description
name | string | (Required) The name of the model. Model names are unique per workspace. A model uploaded with an existing name is assigned as a new version of that model.
path | string | (Required) The path to the model file being uploaded.
framework | string | (Upload Method Optional, PyTorch model Required) Set as Framework.PYTORCH.
input_schema | pyarrow.lib.Schema | (Upload Method Optional, PyTorch model Required) The input schema in Apache Arrow schema format.
output_schema | pyarrow.lib.Schema | (Upload Method Optional, PyTorch model Required) The output schema in Apache Arrow schema format.
convert_wait | bool | (Upload Method Optional, PyTorch model Optional) (Default: True)
  • True: Waits in the script for the model conversion to complete.
  • False: Proceeds with the script without waiting for the model conversion to complete.

Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.
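
When convert_wait=False, the script has to decide for itself when the model is ready. A minimal polling sketch with a timeout follows. `wait_until_ready` is a hypothetical helper, and `get_status` is a hypothetical callable standing in for whatever status lookup your SDK version provides. The 'Ready' status string and the 600-second default are assumptions modeled on the conversion messages and the 10-minute bound mentioned in this guide.

```python
import time
from typing import Callable


def wait_until_ready(get_status: Callable[[], str],
                     timeout_s: float = 600.0,
                     interval_s: float = 5.0) -> bool:
    """Poll get_status() until it returns 'Ready' or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "Ready":
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)


# Stand-in status source for demonstration only: reports
# 'Pending conversion', then 'Converting', then 'Ready'.
_statuses = iter(["Pending conversion", "Converting", "Ready"])
ready = wait_until_ready(lambda: next(_statuses), timeout_s=5.0, interval_s=0.01)
print(ready)  # True
```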

Upload PyTorch Model Return

The following is returned with a successful model upload and conversion.

Field | Type | Description
name | string | The name of the model.
version | string | The model version as a unique UUID.
file_name | string | The file name of the model as stored in Wallaroo.
image_path | string | The image used to deploy the model in the Wallaroo engine.
last_update_time | DateTime | When the model was last updated.
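
These fields can be sanity-checked once they are in hand. The sketch below assumes the returned values have been copied into a plain dict. The dict and `check_upload_result` are illustrative, not SDK objects, and the sample values are the ones shown in the upload example in this guide, with the standard library's UTC timezone substituted for tzutc().

```python
import datetime
import uuid

EXPECTED_FIELDS = ("name", "version", "file_name", "image_path", "last_update_time")


def check_upload_result(result: dict) -> uuid.UUID:
    """Check that the expected keys are present and return version as a UUID."""
    for key in EXPECTED_FIELDS:
        if key not in result:
            raise KeyError(f"missing field: {key}")
    # uuid.UUID raises ValueError if the string is not a well-formed UUID.
    return uuid.UUID(result["version"])


# Illustrative dict built from the values in this guide's upload example.
sample = {
    "name": "pt-single-io-model",
    "version": "8f91dee1-79e0-449b-9a59-0e93ba4a1ba9",
    "file_name": "model-auto-conversion_pytorch_single_io_model.pt",
    "image_path": "proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3397",
    "last_update_time": datetime.datetime(2023, 6, 23, 2, 8, 56, 669565,
                                          tzinfo=datetime.timezone.utc),
}
version = check_upload_result(sample)
print(version)  # 8f91dee1-79e0-449b-9a59-0e93ba4a1ba9
```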

Upload PyTorch Model Example

The following example uploads a PyTorch ML model to a Wallaroo instance.

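import pyarrow as pa
from wallaroo.framework import Framework

# wl is assumed to be a connected Wallaroo client, e.g. wl = wallaroo.Client()
input_schema = pa.schema([
    pa.field('input', pa.list_(pa.float64(), list_size=10))
])

output_schema = pa.schema([
    pa.field('output', pa.list_(pa.float64(), list_size=1))
])

# Assumption: the model file sits in the working directory; adjust the path as needed.
model = wl.upload_model('pt-single-io-model',
                        './model-auto-conversion_pytorch_single_io_model.pt',
                        framework=Framework.PYTORCH,
                        input_schema=input_schema,
                        output_schema=output_schema)
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion..Converting...........Ready.
{
    'name': 'pt-single-io-model', 
    'version': '8f91dee1-79e0-449b-9a59-0e93ba4a1ba9', 
    'file_name': 'model-auto-conversion_pytorch_single_io_model.pt', 
    'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3397', 
    'last_update_time': datetime.datetime(2023, 6, 23, 2, 8, 56, 669565, tzinfo=tzutc())
}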

Pipeline Deployment Configurations

Pipeline deployment configurations depend on whether the model is converted to the Native Runtime space or the Containerized Model Runtime space. Wallaroo determines this when the model is uploaded, based on the model's size, complexity, and other factors.

After upload, the model's config().runtime() method reports which runtime space the model is in.

Runtime Display | Model Runtime Space | Pipeline Configuration
tensorflow | Native | Native Runtime Configuration Methods
onnx | Native | Native Runtime Configuration Methods
python | Native | Native Runtime Configuration Methods
mlflow | Containerized | Containerized Runtime Deployment
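
The table above can be expressed as a small lookup so that a script can decide which configuration style applies. This is a sketch only: `RUNTIME_SPACE` and `runtime_space` are hypothetical names, and the mapping covers just the four runtime values listed in the table.

```python
# Mapping taken from the table above: runtime display value -> runtime space.
RUNTIME_SPACE = {
    "tensorflow": "Native",
    "onnx": "Native",
    "python": "Native",
    "mlflow": "Containerized",
}


def runtime_space(runtime: str) -> str:
    """Return 'Native' or 'Containerized' for a config().runtime() value."""
    try:
        return RUNTIME_SPACE[runtime]
    except KeyError:
        raise ValueError(f"unrecognized runtime: {runtime!r}") from None


print(runtime_space("onnx"))    # Native
print(runtime_space("mlflow"))  # Containerized
```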

For example, uploading a Native Runtime model (here, an ONNX model) to a Wallaroo workspace returns the following from config().runtime():

ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()

'onnx'

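For example, after conversion a containerized model is allocated to the Containerized Runtime, and config().runtime() returns the following:

# Keyword arguments follow the PyTorch upload parameters listed earlier in this guide.
model = wl.upload_model(model_name, model_file_name, framework=Framework.PYTORCH,
                        input_schema=input_schema, output_schema=output_schema).configure()
model.config().runtime()

'mlflow'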

Native Runtime Pipeline Deployment Configuration Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.

from wallaroo.deployment_config import DeploymentConfigBuilder
deployment_config = DeploymentConfigBuilder().cpus(0.25).memory('1Gi').build()

Containerized Runtime Deployment Example

The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with environment variables for that model. Note that for containerized models, resources must be allocated per model.

deployment_config = (
    DeploymentConfigBuilder()
    .sidekick_cpus(sm_model, 0.25)
    .sidekick_memory(sm_model, '1Gi')
    # Assumption: the variable name is GUNICORN_CMD_ARGS, the standard Gunicorn variable for such flags.
    .sidekick_env(sm_model, {"GUNICORN_CMD_ARGS": "__timeout=188 --workers=1"})
    .build()
)
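
Because the two runtime spaces use different builder methods, a script can branch on the config().runtime() value when assembling the deployment configuration. The following is a sketch only. `build_deployment_config` and `RecordingBuilder` are hypothetical names, and `RecordingBuilder` is a stand-in that merely records calls so the example runs without a Wallaroo instance. With the real SDK a DeploymentConfigBuilder() would be passed instead, on the assumption that it exposes cpus, memory, sidekick_cpus, sidekick_memory, and build.

```python
class RecordingBuilder:
    """Stand-in for a deployment config builder; records chained calls."""

    def __init__(self):
        self.calls = []

    def _record(self, *call):
        self.calls.append(call)
        return self  # return self so calls can be chained

    def cpus(self, count):
        return self._record("cpus", count)

    def memory(self, spec):
        return self._record("memory", spec)

    def sidekick_cpus(self, model, count):
        return self._record("sidekick_cpus", model, count)

    def sidekick_memory(self, model, spec):
        return self._record("sidekick_memory", model, spec)

    def build(self):
        return list(self.calls)


def build_deployment_config(builder, model, runtime, cpus=0.25, memory="1Gi"):
    """Allocate resources per model for 'mlflow', pipeline-wide otherwise."""
    if runtime == "mlflow":
        builder = builder.sidekick_cpus(model, cpus).sidekick_memory(model, memory)
    else:
        builder = builder.cpus(cpus).memory(memory)
    return builder.build()


native = build_deployment_config(RecordingBuilder(), "ccfraud-model", "onnx")
containerized = build_deployment_config(RecordingBuilder(), "pt-single-io-model", "mlflow")
print(native)         # [('cpus', 0.25), ('memory', '1Gi')]
print(containerized)  # [('sidekick_cpus', 'pt-single-io-model', 0.25), ('sidekick_memory', 'pt-single-io-model', '1Gi')]
```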