Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Model Registry Services

How to upload and use Registry ML Models with Wallaroo

Table of Contents

Wallaroo users can register their trained machine learning models from a model registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

This guide details how to add ML Models from a model registry service into a Wallaroo instance.

Artifact Requirements

Models are uploaded to the Wallaroo instance as the specific artifact - the “file” or other data that represents the file itself. This must comply with the Wallaroo model requirements framework and version or it will not be deployed. Note that for models that fall outside of the supported model types, they can be registered to a Wallaroo workspace as MLFlow 1.30.0 containerized models.

Registry Services Roles

Registry service use in Wallaroo typically falls under the following roles.

RoleRecommended ActionsDescription
DevOps EngineerCreate Model RegistryCreate the model (AKA artifact) registry service
 Retrieve Model Registry TokensGenerate the model registry service credentials.
MLOps EngineerConnect Model Registry to WallarooAdd the Registry Service URL and credentials into a Wallaroo instance for use by other users and scripts.
 Add Wallaroo Registry Service to WorkspaceAdd the registry service configuration to a Wallaroo workspace for use by workspace users.
Data ScientistList Registries in a WorkspaceList registries available from a workspace.
 List Models in RegistryList available models in a model registry.
 List Model Versions of Registered ModelList versions of a registry stored model.
 List Model Version ArtifactsRetrieve the artifacts (usually files) for a model stored in a model registry.
 Upload Model from RegistryUpload a model and artifacts stored in a model registry into a Wallaroo workspace.

Model Registry Operations

The following links to guides and information on setting up a model registry (also known as an artifact registry).

Create Model Registry

See Model serving with Azure Databricks for setting up a model registry service using Azure Databricks.

The following steps create an Access Token used to authenticate to an Azure Databricks Model Registry.

  1. Log into the Azure Databricks workspace.
  2. From the upper right corner access the User Settings.
  3. From the Access tokens, select Generate new token.
  4. Specify any token description and lifetime. Once complete, select Generate.
  5. Copy the token and store in a secure place. Once the Generate New Token module is closed, the token will not be retrievable.
Retrieve Azure Databricks User Token

The MLflow Model Registry provides a method of setting up a model registry service. Full details can be found at the MLflow Registry Guides.

A generic MLFlow model registry requires no token.

Wallaroo Registry Operations

  • Connect Model Registry to Wallaroo: This details the link and connection information to a existing MLFlow registry service. Note that this does not create a MLFlow registry service, but adds the connection and credentials to Wallaroo to allow that MLFlow registry service to be used by other entities in the Wallaroo instance.
  • Add a Registry to a Workspace: Add the created Wallaroo Model Registry so make it available to other workspace members.
  • Remove a Registry from a Workspace: Remove the link between a Wallaroo Model Registry and a Wallaroo workspace.

Connect Model Registry to Wallaroo

MLFlow Registry connection information is added to a Wallaroo instance through the Wallaroo.Client.create_model_registry method.

Connect Model Registry to Wallaroo Parameters

ParameterTypeDescription
namestring (Required)The name of the MLFlow Registry service.
tokenstring (Required)The authentication token used to authenticate to the MLFlow Registry.
urlstring (Required)The URL of the MLFlow registry service.

Connect Model Registry to Wallaroo Return

The following is returned when a MLFlow Registry is successfully created.

FieldTypeDescription
NameStringThe name of the MLFlow Registry service.
URLstringThe URL for connecting to the service.
WorkspacesList[string]The name of all workspaces this registry was added to.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

Note that the token is not displayed for security reasons.

Connect Model Registry to Wallaroo Example

The following example creates a Wallaroo MLFlow Registry with the name ExampleNotebook stored in a sample Azure DataBricks environment.

wl.create_model_registry(name="ExampleNotebook", 
                        token="abcdefg-3", 
                        url="https://abcd-123489.456.azuredatabricks.net")
FieldValue
NameExampleNotebook
URLhttps://abcd-123489.456.azuredatabricks.net
Workspacessample.user@wallaroo.ai - Default Workspace
Created At2023-27-Jun 13:57:26
Updated At2023-27-Jun 13:57:26

Add Registry to Workspace

Registries are assigned to a Wallaroo workspace with the Wallaroo.registry.add_registry_to_workspace method. This allows members of the workspace to access the registry connection. A registry can be associated with one or more workspaces.

Add Registry to Workspace Parameters

ParameterTypeDescription
namestring (Required)The numerical identifier of the workspace.

Add Registry to Workspace Returns

The following is returned when a MLFlow Registry is successfully added to a workspace.

FieldTypeDescription
NameStringThe name of the MLFlow Registry service.
URLstringThe URL for connecting to the service.
WorkspacesList[string]The name of all workspaces this registry was added to.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

Example

registry.add_registry_to_workspace(workspace_id=workspace_id)
FieldValue
NameExampleNotebook
URLhttps://abcd-123489.456.azuredatabricks.net
Workspacessample.user@wallaroo.ai - Default Workspace
Created At2023-27-Jun 13:57:26
Updated At2023-27-Jun 13:57:26

Remove Registry from Workspace

Registries are removed from a Wallaroo workspace with the Registry remove_registry_from_workspace method.

Remove Registry from Workspace Parameters

ParameterTypeDescription
workspace_idInteger (Required)The numerical identifier of the workspace.

Remove Registry from Workspace Return

FieldTypeDescription
NameStringThe name of the MLFlow Registry service.
URLstringThe URL for connecting to the service.
WorkspacesList[String]A list of workspaces by name that still contain the registry.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

Remove Registry from Workspace Example

registry.remove_registry_from_workspace(workspace_id=workspace_id)
FieldValue
NameJeffRegistry45
URLhttps://sample.registry.azuredatabricks.net
Workspacesjohn.hummel@wallaroo.ai - Default Workspace
Created At2023-17-Jul 17:56:52
Updated At2023-17-Jul 17:56:52

Wallaroo Registry Model Operations

  • List Registries in a Workspace: List the available registries in the current workspace.
  • List Models: List Models in a Registry
  • Upload Model: Upload a version of a ML Model from the Registry to a Wallaroo workspace.
  • List Model Versions: List the versions of a particular model.
  • Remove Registry from Workspace: Remove a specific Registry configuration from a specific workspace.

List Registries in a Workspace

Registries associated with a workspace are listed with the Wallaroo.Client.list_model_registries() method. This lists all registries associated with the current workspace.

List Registries in a Workspace Parameters

None

List Registries in a Workspace Returns

A List of Registries with the following fields.

FieldTypeDescription
NameStringThe name of the MLFlow Registry service.
URLstringThe URL for connecting to the service.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

List Registries in a Workspace Example

wl.list_model_registries()
nameregistry urlcreated atupdated at
gibhttps://sampleregistry.wallaroo.ai2023-27-Jun 03:22:462023-27-Jun 03:22:46
ExampleNotebookhttps://sampleregistry.wallaroo.ai2023-27-Jun 13:57:262023-27-Jun 13:57:26

List Models in a Registry

A List of models available to the Wallaroo instance through the MLFlow Registry is performed with the Wallaroo.Registry.list_models() method.

List Models in a Registry Parameters

None

List Models in a Registry Returns

A List of models with the following fields.

FieldTypeDescription
NameStringThe name of the model.
Registry UserstringThe user account that is tied to the registry service for this model.
VersionsintThe number of versions for the model, starting at 0.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

List Models in a Registry Example

registry.list_models()
NameRegistry UserVersionsCreated AtUpdated At
testmodelsample.user@wallaroo.ai02023-16-Jun 14:38:422023-16-Jun 14:38:42
testmodel2sample.user@wallaroo.ai02023-16-Jun 14:41:042023-16-Jun 14:41:04
wine_qualitysample.user@wallaroo.ai22023-16-Jun 15:05:532023-16-Jun 15:09:57

Retrieve Specific Model Details from the Registry

Model details are retrieved by assigning a MLFlow Registry Model to an object with the Wallaroo.Registry.list_models(), then specifying the element in the list to save it to a Registered Model object.

The following will return the most recent model added to the MLFlow Registry service.

mlflow_model = registry.list_models()[-1]
mlflow_model
FieldTypeDescription
NameStringThe name of the model.
Registry UserstringThe user account that is tied to the registry service for this model.
VersionsintThe number of versions for the model, starting at 0.
Created AtDateTimeWhen the registry was added to the Wallaroo instance.
Updated AtDateTimeWhen the registry was last updated.

List Model Versions of Registered Model

MLFlow registries can contain multiple versions of a ML Model. These are listed and are listed with the Registered Model versions attribute. The versions are listed in reverse order of insertion, with the most recent model version in position 0.

List Model Versions of Registered Model Parameters

None

List Model Versions of Registered Model Returns

A List of the Registered Model Versions with the following fields.

FieldTypeDescription
NameStringThe name of the model.
VersionintThe version number. The higher numbers are the most recent.
DescriptionStringThe registered model’s description from the MLFlow Registry service.

List Model Versions of Registered Model Example

The following will return the most recent model added to the MLFlow Registry service and list its versions.

mlflow_model = registry.list_models()[-1]
mlflow_model.versions
NameVersionDescription
wine_quality2None
wine_quality1None

List Model Version Artifacts

Artifacts belonging to a MLFlow registry model are listed with the Model Version list_artifacts() method. This returns all artifacts for the model.

List Model Version Artifacts Parameters

None

List Model Version Artifacts Returns

A List of artifacts with the following fields.

FieldTypeDescription
file_nameStringThe name assigned to the artifact.
file_sizeStringThe size of the artifact in bytes.
full_pathStringThe path of the artifact. This will be used to upload the artifact to Wallaroo.

List Model Version Artifacts Example

The following will list the artifacts in a single registry model.

single_registry_model.versions[0].list_artifacts()
File NameFile SizeFull Path
MLmodel546Bhttps://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/MLmodel
conda.yaml182Bhttps://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/conda.yaml
model.pkl1429Bhttps://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl
python_env.yaml122Bhttps://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/python_env.yaml
requirements.txt73Bhttps://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/requirements.txt

Upload a Model from a Registry

Models uploaded to the Wallaroo workspace are uploaded from a MLFlow Registry with the Wallaroo.Registry.upload method.

Upload a Model from a Registry Parameters

ParameterTypeDescription
namestring (Required)The name to assign the model once uploaded. Model names are unique within a workspace. Models assigned the same name as an existing model will be uploaded as a new model version.
pathstring (Required)The full path to the model artifact in the registry.
frameworkstring (Required)The Wallaroo model Framework. See Model Uploads and Registrations Supported Frameworks
input_schemapyarrow.lib.Schema (Required for non-native runtimes)The input schema in Apache Arrow schema format.
output_schemapyarrow.lib.Schema (Required for non-native runtimes)The output schema in Apache Arrow schema format.

Upload a Model from a Registry Returns

The registry model details as follows.

FieldTypeDescription
NameStringThe name of the model.
VersionstringThe version registered in the Wallaroo instance in UUID format.
File NamestringThe file name associated with the ML Model in the Wallaroo instance.
SHAstringThe models hash value.
StatusstringThe status of the model from the following list.
  • pending_conversion: The model is uploaded to Wallaroo and is ready to convert.
  • converting: The model is being converted into a Wallaroo supported runtime.
  • ready
  • : The model is ready and available for use.
  • error: The model conversion has failed. Check error messages and verify the model is the correct version and framework.
Image PathstringThe image used for the containerization of the model.
Updated AtDateTimeWhen the model was last updated.

Upload a Model from a Registry Example

The following will retrieve the most recent uploaded model and upload it with the XGBOOST framework into the current Wallaroo workspace.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32(), list_size=4))
])

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

model = registry.upload_model(
  name="sklearnonnx", 
  path="https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl", 
  framework=Framework.SKLEARN,
  input_schema=input_schema,
  output_schema=output_schema)
  
Namesklearnonnx
Version63bd932d-320d-4084-b972-0cfe1a943f5a
File Namemodel.pkl
SHA970da8c178e85dfcbb69fab7bad0fb58cd0c2378d27b0b12cc03a288655aa28d
Statuspending_conversion
ImagePathNone
Updated At2023-05-Jul 19:14:49

Retrieve Model Status

The model status is retrieved with the Model status() method.

Retrieve Model Status Parameters

None

Retrieve Model Status Returns

FieldTypeDescription
statusStringThe current status of the uploaded model.
  • pending_conversion: The model is uploaded to Wallaroo and is ready to convert.
  • converting: The model is being converted into a Wallaroo supported runtime.
  • ready
  • : The model is ready and available for use.
  • error: The model conversion has failed. Check error messages and verify the model is the correct version and framework.

Retrieve Model Status Returns Example

The following demonstrates checking the status in the for loop until the model shows either ready or error.

import time
while model.status() != "ready" and model.status() != "error":
    print(model.status())
    time.sleep(3)
print(model.status())

converting
converting
ready