Wallaroo SDK Essentials Guide: Model Uploads and Registrations: Model Registry Services

How to upload and use Registry ML Models with Wallaroo

Wallaroo users can register their trained machine learning models from a model registry into their Wallaroo instance and perform inferences with it through a Wallaroo pipeline.

This guide details how to add ML Models from a model registry service into a Wallaroo instance.

Artifact Requirements

Models are uploaded to the Wallaroo instance as the specific artifact - the “file” or other data that represents the file itself. This must comply with the Wallaroo model requirements framework and version or it will not be deployed. Note that for models that fall outside of the supported model types, they can be registered to a Wallaroo workspace as MLFlow 1.30.0 containerized models.

Registry Services Roles

Registry service use in Wallaroo typically falls under the following roles.

Role	Recommended Actions	Description
DevOps Engineer	Create Model Registry	Create the model (AKA artifact) registry service
	Retrieve Model Registry Tokens	Generate the model registry service credentials.
MLOps Engineer	Connect Model Registry to Wallaroo	Add the Registry Service URL and credentials into a Wallaroo instance for use by other users and scripts.
	Add Wallaroo Registry Service to Workspace	Add the registry service configuration to a Wallaroo workspace for use by workspace users.
Data Scientist	List Registries in a Workspace	List registries available from a workspace.
	List Models in Registry	List available models in a model registry.
	List Model Versions of Registered Model	List versions of a registry stored model.
	List Model Version Artifacts	Retrieve the artifacts (usually files) for a model stored in a model registry.
	Upload Model from Registry	Upload a model and artifacts stored in a model registry into a Wallaroo workspace.

Model Registry Operations

The following links to guides and information on setting up a model registry (also known as an artifact registry).

Create Model Registry

See Model serving with Azure Databricks for setting up a model registry service using Azure Databricks.

The following steps create an Access Token used to authenticate to an Azure Databricks Model Registry.

Log into the Azure Databricks workspace.
From the upper right corner access the User Settings.
From the Access tokens, select Generate new token.
Specify any token description and lifetime. Once complete, select Generate.
Copy the token and store in a secure place. Once the Generate New Token module is closed, the token will not be retrievable.

The MLflow Model Registry provides a method of setting up a model registry service. Full details can be found at the MLflow Registry Guides.

A generic MLFlow model registry requires no token.

Wallaroo Registry Operations

Connect Model Registry to Wallaroo: This details the link and connection information to a existing MLFlow registry service. Note that this does not create a MLFlow registry service, but adds the connection and credentials to Wallaroo to allow that MLFlow registry service to be used by other entities in the Wallaroo instance.
Add a Registry to a Workspace: Add the created Wallaroo Model Registry so make it available to other workspace members.
Remove a Registry from a Workspace: Remove the link between a Wallaroo Model Registry and a Wallaroo workspace.

Connect Model Registry to Wallaroo

MLFlow Registry connection information is added to a Wallaroo instance through the Wallaroo.Client.create_model_registry method.

Connect Model Registry to Wallaroo Parameters

Parameter	Type	Description
`name`	string (Required)	The name of the MLFlow Registry service.
`token`	string (Required)	The authentication token used to authenticate to the MLFlow Registry.
`url`	string (Required)	The URL of the MLFlow registry service.

Connect Model Registry to Wallaroo Return

The following is returned when a MLFlow Registry is successfully created.

Field	Type	Description
`Name`	String	The name of the MLFlow Registry service.
`URL`	string	The URL for connecting to the service.
`Workspaces`	List[string]	The name of all workspaces this registry was added to.
`Created At`	DateTime	When the registry was added to the Wallaroo instance.
`Updated At`	DateTime	When the registry was last updated.

Note that the token is not displayed for security reasons.

Connect Model Registry to Wallaroo Example

The following example creates a Wallaroo MLFlow Registry with the name ExampleNotebook stored in a sample Azure DataBricks environment.

wl.create_model_registry(name="ExampleNotebook", 
                        token="abcdefg-3", 
                        url="https://abcd-123489.456.azuredatabricks.net")

Field	Value
Name	ExampleNotebook
URL	https://abcd-123489.456.azuredatabricks.net
Workspaces	sample.user@wallaroo.ai - Default Workspace
Created At	2023-27-Jun 13:57:26
Updated At	2023-27-Jun 13:57:26

Add Registry to Workspace

Registries are assigned to a Wallaroo workspace with the Wallaroo.registry.add_registry_to_workspace method. This allows members of the workspace to access the registry connection. A registry can be associated with one or more workspaces.

Add Registry to Workspace Parameters

Parameter	Type	Description
`name`	string (Required)	The numerical identifier of the workspace.

Add Registry to Workspace Returns

The following is returned when a MLFlow Registry is successfully added to a workspace.

Field	Type	Description
`Name`	String	The name of the MLFlow Registry service.
`URL`	string	The URL for connecting to the service.
`Workspaces`	List[string]	The name of all workspaces this registry was added to.
`Created At`	DateTime	When the registry was added to the Wallaroo instance.
`Updated At`	DateTime	When the registry was last updated.

Example

registry.add_registry_to_workspace(workspace_id=workspace_id)

Field	Value
Name	ExampleNotebook
URL	https://abcd-123489.456.azuredatabricks.net
Workspaces	sample.user@wallaroo.ai - Default Workspace
Created At	2023-27-Jun 13:57:26
Updated At	2023-27-Jun 13:57:26

Remove Registry from Workspace

Registries are removed from a Wallaroo workspace with the Registry remove_registry_from_workspace method.

Remove Registry from Workspace Parameters

Parameter	Type	Description
`workspace_id`	Integer (Required)	The numerical identifier of the workspace.

Remove Registry from Workspace Return

Field	Type	Description
`Name`	String	The name of the MLFlow Registry service.
`URL`	string	The URL for connecting to the service.
`Workspaces`	List[String]	A list of workspaces by name that still contain the registry.
`Created At`	DateTime	When the registry was added to the Wallaroo instance.
`Updated At`	DateTime	When the registry was last updated.

Remove Registry from Workspace Example

registry.remove_registry_from_workspace(workspace_id=workspace_id)

Field	Value
Name	JeffRegistry45
URL	https://sample.registry.azuredatabricks.net
Workspaces	john.hummel@wallaroo.ai - Default Workspace
Created At	2023-17-Jul 17:56:52
Updated At	2023-17-Jul 17:56:52

Wallaroo Registry Model Operations

List Registries in a Workspace: List the available registries in the current workspace.
List Models: List Models in a Registry
Upload Model: Upload a version of a ML Model from the Registry to a Wallaroo workspace.
List Model Versions: List the versions of a particular model.
Remove Registry from Workspace: Remove a specific Registry configuration from a specific workspace.

List Registries in a Workspace

Registries associated with a workspace are listed with the Wallaroo.Client.list_model_registries() method. This lists all registries associated with the current workspace.

List Registries in a Workspace Parameters

None

List Registries in a Workspace Returns

A List of Registries with the following fields.

Field	Type	Description
`Name`	String	The name of the MLFlow Registry service.
`URL`	string	The URL for connecting to the service.
`Created At`	DateTime	When the registry was added to the Wallaroo instance.
`Updated At`	DateTime	When the registry was last updated.

List Registries in a Workspace Example

wl.list_model_registries()

name	registry url	created at	updated at
gib	https://sampleregistry.wallaroo.ai	2023-27-Jun 03:22:46	2023-27-Jun 03:22:46
ExampleNotebook	https://sampleregistry.wallaroo.ai	2023-27-Jun 13:57:26	2023-27-Jun 13:57:26

List Models in a Registry

A List of models available to the Wallaroo instance through the MLFlow Registry is performed with the Wallaroo.Registry.list_models() method.

List Models in a Registry Parameters

None

List Models in a Registry Returns

A List of models with the following fields.

Field	Type	Description
`Name`	String	The name of the model.
`Registry User`	string	The user account that is tied to the registry service for this model.
`Versions`	int	The number of versions for the model, starting at 0.
`Created At`	DateTime	When the registry was added to the Wallaroo instance.
`Updated At`	DateTime	When the registry was last updated.

List Models in a Registry Example

registry.list_models()

Name	Registry User	Versions	Created At	Updated At
testmodel	sample.user@wallaroo.ai	0	2023-16-Jun 14:38:42	2023-16-Jun 14:38:42
testmodel2	sample.user@wallaroo.ai	0	2023-16-Jun 14:41:04	2023-16-Jun 14:41:04
wine_quality	sample.user@wallaroo.ai	2	2023-16-Jun 15:05:53	2023-16-Jun 15:09:57

Retrieve Specific Model Details from the Registry

Model details are retrieved by assigning a MLFlow Registry Model to an object with the Wallaroo.Registry.list_models(), then specifying the element in the list to save it to a Registered Model object.

The following will return the most recent model added to the MLFlow Registry service.

mlflow_model = registry.list_models()[-1]
mlflow_model

Field	Type	Description
`Name`	String	The name of the model.
`Registry User`	string	The user account that is tied to the registry service for this model.
`Versions`	int	The number of versions for the model, starting at 0.
`Created At`	DateTime	When the registry was added to the Wallaroo instance.
`Updated At`	DateTime	When the registry was last updated.

List Model Versions of Registered Model

MLFlow registries can contain multiple versions of a ML Model. These are listed and are listed with the Registered Model versions attribute. The versions are listed in reverse order of insertion, with the most recent model version in position 0.

List Model Versions of Registered Model Parameters

None

List Model Versions of Registered Model Returns

A List of the Registered Model Versions with the following fields.

Field	Type	Description
`Name`	String	The name of the model.
`Version`	int	The version number. The higher numbers are the most recent.
`Description`	String	The registered model’s description from the MLFlow Registry service.

List Model Versions of Registered Model Example

The following will return the most recent model added to the MLFlow Registry service and list its versions.

mlflow_model = registry.list_models()[-1]
mlflow_model.versions

Name	Version	Description
wine_quality	2	None
wine_quality	1	None

List Model Version Artifacts

Artifacts belonging to a MLFlow registry model are listed with the Model Version list_artifacts() method. This returns all artifacts for the model.

List Model Version Artifacts Parameters

None

List Model Version Artifacts Returns

A List of artifacts with the following fields.

Field	Type	Description
`file_name`	String	The name assigned to the artifact.
`file_size`	String	The size of the artifact in bytes.
`full_path`	String	The path of the artifact. This will be used to upload the artifact to Wallaroo.

List Model Version Artifacts Example

The following will list the artifacts in a single registry model.

single_registry_model.versions[0].list_artifacts()

File Name	File Size	Full Path
MLmodel	546B	https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/MLmodel
conda.yaml	182B	https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/conda.yaml
model.pkl	1429B	https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl
python_env.yaml	122B	https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/python_env.yaml
requirements.txt	73B	https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/requirements.txt

Upload a Model from a Registry

Models uploaded to the Wallaroo workspace are uploaded from a MLFlow Registry with the Wallaroo.Registry.upload method.

Upload a Model from a Registry Parameters

Parameter	Type	Description
`name`	string (Required)	The name to assign the model once uploaded. Model names are unique within a workspace. Models assigned the same name as an existing model will be uploaded as a new model version.
`path`	string (Required)	The full path to the model artifact in the registry.
`framework`	string (Required)	The Wallaroo model `Framework`. See Model Uploads and Registrations Supported Frameworks
`input_schema`	`pyarrow.lib.Schema` (Required for non-native runtimes)	The input schema in Apache Arrow schema format.
`output_schema`	`pyarrow.lib.Schema` (Required for non-native runtimes)	The output schema in Apache Arrow schema format.

Upload a Model from a Registry Returns

The registry model details as follows.

Field	Type	Description
`Name`	String	The name of the model.
`Version`	string	The version registered in the Wallaroo instance in UUID format.
`File Name`	string	The file name associated with the ML Model in the Wallaroo instance.
`SHA`	string	The models hash value.
`Status`	string	The status of the model from the following list. `pending_conversion`: The model is uploaded to Wallaroo and is ready to convert. `converting`: The model is being converted into a Wallaroo supported runtime. `ready` : The model is ready and available for use. `error`: The model conversion has failed. Check error messages and verify the model is the correct version and framework.
`Image Path`	string	The image used for the containerization of the model.
`Updated At`	DateTime	When the model was last updated.

Upload a Model from a Registry Example

The following will retrieve the most recent uploaded model and upload it with the XGBOOST framework into the current Wallaroo workspace.

input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float32(), list_size=4))
])

output_schema = pa.schema([
    pa.field('predictions', pa.int32())
])

model = registry.upload_model(
  name="sklearnonnx", 
  path="https://sampleregistry.wallaroo.ai/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl", 
  framework=Framework.SKLEARN,
  input_schema=input_schema,
  output_schema=output_schema)


Name	sklearnonnx
Version	63bd932d-320d-4084-b972-0cfe1a943f5a
File Name	model.pkl
SHA	970da8c178e85dfcbb69fab7bad0fb58cd0c2378d27b0b12cc03a288655aa28d
Status	pending_conversion
ImagePath	None
Updated At	2023-05-Jul 19:14:49

Retrieve Model Status

The model status is retrieved with the Model status() method.

Retrieve Model Status Parameters

None

Retrieve Model Status Returns

Field	Type	Description
status	String	The current status of the uploaded model. `pending_conversion`: The model is uploaded to Wallaroo and is ready to convert. `converting`: The model is being converted into a Wallaroo supported runtime. `ready` : The model is ready and available for use. `error`: The model conversion has failed. Check error messages and verify the model is the correct version and framework.

Retrieve Model Status Returns Example

The following demonstrates checking the status in the for loop until the model shows either ready or error.

import time
while model.status() != "ready" and model.status() != "error":
    print(model.status())
    time.sleep(3)
print(model.status())

converting
converting
ready