Wallaroo users can register trained machine learning models from a model registry into their Wallaroo instance and perform inferences with them through a Wallaroo pipeline.
This guide details how to add ML Models from a model registry service into a Wallaroo instance.
Models are uploaded to the Wallaroo instance as a specific artifact: the “file” or other data that represents the model itself. The artifact must comply with the Wallaroo model requirements for framework and version, or it will not be deployed. Models that fall outside the supported model types can be registered to a Wallaroo workspace as MLFlow 1.30.0 containerized models.
Registry service use in Wallaroo typically falls under the following roles.
Role | Recommended Actions | Description |
---|---|---|
DevOps Engineer | Create Model Registry | Create the model (AKA artifact) registry service. |
DevOps Engineer | Retrieve Model Registry Tokens | Generate the model registry service credentials. |
MLOps Engineer | Connect Model Registry to Wallaroo | Add the Registry Service URL and credentials into a Wallaroo instance for use by other users and scripts. |
MLOps Engineer | Add Wallaroo Registry Service to Workspace | Add the registry service configuration to a Wallaroo workspace for use by workspace users. |
MLOps Engineer | Get Registry Details | Retrieve the connection details for a Wallaroo registry. |
MLOps Engineer | Remove Wallaroo Registry from a Workspace | Remove the registry service configuration from a Wallaroo workspace. |
Data Scientist | List Models in Registry | List available models in a model registry. |
Data Scientist | List Model Version Artifacts | Retrieve the artifacts (usually files) for a model stored in a model registry. |
Data Scientist | Upload Model from Registry | Upload a model and artifacts stored in a model registry into a Wallaroo workspace. |
The following guides provide information on setting up a model registry (also known as an artifact registry).
See Model serving with Azure Databricks for setting up a model registry service using Azure Databricks.
The following steps create an Access Token used to authenticate to an Azure Databricks Model Registry.
The MLflow Model Registry provides a method of setting up a model registry service. Full details can be found at the MLflow Registry Guides.
A generic MLFlow model registry requires no token.
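MLFlow Registry connection information is added to a Wallaroo instance through the following endpoint.
/v1/api/models/create_registry
The request includes the workspace id, a name for the registry, the registry URL (for example https://registry.wallaroo.ai), and the registry access token.
The following example adds a registry to the workspace with the id 1.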
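import requests

# APIURL is the hostname of the Wallaroo API service; wl is an authenticated
# Wallaroo SDK client, for example wl = wallaroo.Client().
token = "abcdefg"  # access token issued by the registry service

x = requests.post(
    f"https://{APIURL}/v1/api/models/create_registry",
    json={
        "workspace_id": 1,
        "name": "sample registry",
        "url": "https://registry.wallaroo.ai",
        "token": token,
    },
    headers=wl.auth.auth_header(),
)
x.json()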
{'id': '98f9ca1d-c4e7-4d70-8df4-05c25a64be29', 'workspace_id': 1}
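The request body above can be assembled and checked before it is sent. The following is a minimal sketch, not part of the Wallaroo SDK: the helper name `build_create_registry_payload` and its client-side checks are assumptions made for illustration, and the endpoint may apply different or additional validation.

```python
from urllib.parse import urlparse


def build_create_registry_payload(workspace_id, name, url, token):
    """Return the JSON body for create_registry (hypothetical helper).

    The checks below are client-side conveniences only; they are assumptions,
    not rules documented for the endpoint.
    """
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"registry url must be an http(s) URL, got {url!r}")
    if not name.strip():
        raise ValueError("registry name must not be empty")
    return {
        "workspace_id": int(workspace_id),
        "name": name,
        "url": url,
        "token": token,
    }


payload = build_create_registry_payload(
    1, "sample registry", "https://registry.wallaroo.ai", "abcdefg"
)
```

The resulting `payload` can be passed as the `json=` argument of the `requests.post` call shown earlier.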
Registries are assigned to a Wallaroo workspace with the following endpoint. This allows members of the workspace to access the registry connection. A registry can be associated with one or more workspaces.
/v1/api/models/attach_registry_to_workspace
import requests

# Attach the registry created earlier to the workspace with the id 1.
x = requests.post(
    f"https://{APIURL}/v1/api/models/attach_registry_to_workspace",
    json={
        "id": "98f9ca1d-c4e7-4d70-8df4-05c25a64be29",
        "workspace_id": 1,
    },
    headers=wl.auth.auth_header(),
)
x.json()
{
"registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"workspace_id": 0
}
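Because a registry can be associated with more than one workspace, the same request can be repeated once per workspace. The sketch below illustrates that loop under stated assumptions: `attach_registry_to_workspaces` and `make_post` are hypothetical helpers, and `post` is any callable that takes a path and a JSON body and returns the decoded response.

```python
def attach_registry_to_workspaces(post, registry_id, workspace_ids):
    """Attach one registry to several workspaces (hypothetical helper).

    `post` takes (path, json_body) and returns the decoded JSON response;
    it stands in for an authenticated requests.post call.
    """
    results = {}
    for workspace_id in workspace_ids:
        body = {"id": registry_id, "workspace_id": workspace_id}
        results[workspace_id] = post(
            "/v1/api/models/attach_registry_to_workspace", body
        )
    return results


def make_post(api_host, headers):
    """Wrap requests.post so that it matches the `post` signature used above."""
    import requests  # imported here so the helper above works without requests

    def post(path, body):
        response = requests.post(
            f"https://{api_host}{path}", json=body, headers=headers
        )
        response.raise_for_status()  # raise on HTTP errors before decoding
        return response.json()

    return post
```

With an authenticated client, this could be used as `attach_registry_to_workspaces(make_post(APIURL, wl.auth.auth_header()), registry_id, [1, 2])`.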
Registries are removed from a workspace with the following endpoint. This does not remove the registry connection information from the Wallaroo instance; it only removes the association between the registry and that particular workspace.
/v1/api/models/remove_registry_from_workspace
import requests

# Remove the association between the registry and the workspace with the id 1.
x = requests.post(
    f"https://{APIURL}/v1/api/models/remove_registry_from_workspace",
    json={
        "id": "98f9ca1d-c4e7-4d70-8df4-05c25a64be29",
        "workspace_id": 1,
    },
    headers=wl.auth.auth_header(),
)
x.json()
{
"registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"workspace_id": 0
}
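The connection details for a registry are retrieved through the following endpoint.
/v1/api/models/get_registry
The response includes the registry name, its URL (for example https://registry.wallaroo.ai), its access token, and its creation and update timestamps.
The following example demonstrates retrieving the registry with the id 98f9ca1d-c4e7-4d70-8df4-05c25a64be29.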
import requests

x = requests.post(
    f"https://{APIURL}/v1/api/models/get_registry",
    json={"registry_id": "98f9ca1d-c4e7-4d70-8df4-05c25a64be29"},
    headers=wl.auth.auth_header(),
)
x.json()
{
'id': '98f9ca1d-c4e7-4d70-8df4-05c25a64be29',
'name': 'sample registry',
'url': 'https://registry.wallaroo.ai',
'token': 'dapi67c8c0b04606f730e78b7ae5e3221015-3',
'created_at': '2023-06-23T15:37:38.38427+00:00',
'updated_at': '2023-06-23T15:37:38.38427+00:00'
}
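The response above returns the registry token in clear text, so it may be worth masking it before the details are printed or logged. The following sketch shows one way to do that; `redact_registry` is a hypothetical helper and the choice to keep the first four characters visible is an assumption, not a Wallaroo convention.

```python
def redact_registry(details, visible=4):
    """Return a copy of a get_registry response with the token masked."""
    redacted = dict(details)  # shallow copy; the original response is untouched
    token = redacted.get("token")
    if token:
        if len(token) <= visible:
            # Too short to show a prefix safely, so mask everything.
            redacted["token"] = "*" * len(token)
        else:
            redacted["token"] = token[:visible] + "*" * (len(token) - visible)
    return redacted


registry = {
    "id": "98f9ca1d-c4e7-4d70-8df4-05c25a64be29",
    "name": "sample registry",
    "url": "https://registry.wallaroo.ai",
    "token": "dapi67c8c0b04606f730e78b7ae5e3221015-3",
}
safe_to_log = redact_registry(registry)
```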
The registries associated with a specific workspace are retrieved through the following endpoint.
/v1/api/models/list_registries
import requests

# List the registries attached to the workspace with the id 1.
x = requests.post(
    f"https://{APIURL}/v1/api/models/list_registries",
    json={"workspace_id": 1},
    headers=wl.auth.auth_header(),
)
x.json()
[
{
"created_at": "2023-07-11T15:21:24.403Z",
"id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
"name": "sample registry",
"token": "string",
"updated_at": "2023-07-11T15:21:24.403Z",
"url": "string"
}
]
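Later requests identify a registry by its id, so a common next step is to look up the id by name in the list returned above. A minimal sketch follows; `find_registry_id` is a hypothetical helper, and since this guide does not state whether registry names are unique, the sketch treats duplicate names as an error.

```python
def find_registry_id(registries, name):
    """Return the id of the registry called `name` in a list_registries response."""
    matches = [r["id"] for r in registries if r.get("name") == name]
    if not matches:
        raise LookupError(f"no registry named {name!r}")
    if len(matches) > 1:
        # Name uniqueness is not documented here, so refuse to guess.
        raise LookupError(f"{len(matches)} registries are named {name!r}")
    return matches[0]


registries = [
    {
        "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
        "name": "sample registry",
        "url": "https://registry.wallaroo.ai",
    }
]
registry_id = find_registry_id(registries, "sample registry")
```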
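A list of the models available to the Wallaroo instance through the MLFlow registry is retrieved through the following endpoint.
/v1/api/models/list_registry_models
The request takes the registry_id of the registry to query. In the response, if next_page_token is None, there are no additional models to list after this request. For each registered model, latest_versions is either None or a list of versions. Each version includes a source field with the location of the model, for example 'dbfs:/databricks/mlflow-tracking/123456/abcdefg/artifacts/random_forest_model'.
import requests

# registry_id is the id of a registry attached to the workspace.
x = requests.post(
    f"https://{APIURL}/v1/api/models/list_registry_models",
    json={"registry_id": registry_id},
    headers=wl.auth.auth_header(),
)
x.json()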
{
'next_page_token': None,
'registered_models': [
{
'name': 'testmodel',
'user_id': 'sample.usersj@wallaroo.ai',
'latest_versions': None,
'creation_timestamp': 1686940722329,
'last_updated_timestamp': 1686940722329
},
{
'name': 'testmodel2',
'user_id': 'sample.user@wallaroo.ai',
'latest_versions': None,
'creation_timestamp': 1686940864528,
'last_updated_timestamp': 1686940864528
},
{
'name': 'wine_quality',
'user_id': 'sample.user@wallaroo.ai',
'latest_versions': [
{
'name': 'wine_quality',
'description': None,
'version': '1',
'status': 'READY',
'run_id': 'abcdefg',
'run_link': None,
'source': 'dbfs:/databricks/mlflow-tracking/abcdefg/abcdefg/artifacts/random_forest_model',
'current_stage': 'Archived',
'creation_timestamp': 1686942353367,
'last_updated_timestamp': 1686942597509
},
{
'name': 'wine_quality',
'description': None,
'version': '2',
'status': 'READY',
'run_id': 'abcdefg',
'run_link': None,
'source': 'dbfs:/databricks/mlflow-tracking/abcdefg/abcdefg/artifacts/model',
'current_stage': 'Production',
'creation_timestamp': 1686942576120,
'last_updated_timestamp': 1686942597646
}
],
'creation_timestamp': 1686942353127,
'last_updated_timestamp': 1686942597646
}
]
}
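The response above nests versions under each registered model, and latest_versions can be None. The sketch below shows one way to pull out the version in a given stage and to read the timestamps; `find_version` and `to_datetime` are hypothetical helpers, and treating the timestamps as milliseconds since the Unix epoch is an assumption based on the sample values.

```python
from datetime import datetime, timezone


def find_version(listing, model_name, stage="Production"):
    """Return the version of `model_name` in `stage`, or None if there is none."""
    for model in listing.get("registered_models") or []:
        if model.get("name") != model_name:
            continue
        # latest_versions may be None for models without registered versions.
        for version in model.get("latest_versions") or []:
            if version.get("current_stage") == stage:
                return version
    return None


def to_datetime(timestamp_ms):
    """Convert an epoch-milliseconds timestamp (assumed unit) to a UTC datetime."""
    return datetime.fromtimestamp(timestamp_ms / 1000, tz=timezone.utc)


listing = {
    "next_page_token": None,
    "registered_models": [
        {"name": "testmodel", "latest_versions": None},
        {
            "name": "wine_quality",
            "latest_versions": [
                {"version": "1", "current_stage": "Archived",
                 "source": "dbfs:/databricks/mlflow-tracking/abcdefg/abcdefg/artifacts/random_forest_model",
                 "creation_timestamp": 1686942353367},
                {"version": "2", "current_stage": "Production",
                 "source": "dbfs:/databricks/mlflow-tracking/abcdefg/abcdefg/artifacts/model",
                 "creation_timestamp": 1686942576120},
            ],
        },
    ],
}
production = find_version(listing, "wine_quality")
created = to_datetime(production["creation_timestamp"])
```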
The artifacts of a specific model version are retrieved through the following endpoint.
/v1/api/models/list_registry_model_version_artifacts
The request takes the registry_id, the model name, and the model version. Each artifact in the response includes its full_path (for example https://adb-5939996465837398.18.azuredatabricks.net/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl) and its path within the registry (for example /databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl).
import requests

# List the artifacts of version 2 of the wine_quality model.
x = requests.post(
    f"https://{APIURL}/v1/api/models/list_registry_model_version_artifacts",
    json={
        "name": "wine_quality",
        "registry_id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
        "version": "2",
    },
    headers=wl.auth.auth_header(),
)
x.json()
[
{
"file_size": 156456,
"full_path": "https://adb-5939996465837398.18.azuredatabricks.net/api/2.0/dbfs/read?path=/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl",
"is_dir": false,
"modification_time": 1686942597646,
"path": "/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl"
}
]
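An artifact listing can contain directories as well as files, so the model file usually has to be picked out before uploading. The following sketch filters the response above; `model_files` is a hypothetical helper, and selecting by the `.pkl` suffix is an assumption that only suits models saved in that format.

```python
def model_files(artifacts, suffixes=(".pkl",)):
    """Return the artifacts that are files and whose path ends with a given suffix."""
    wanted = tuple(suffixes)
    return [
        artifact
        for artifact in artifacts
        if not artifact.get("is_dir") and artifact.get("path", "").endswith(wanted)
    ]


artifacts = [
    {"is_dir": True, "file_size": 0,
     "path": "/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2"},
    {"is_dir": False, "file_size": 156456,
     "path": "/databricks/mlflow-registry/9f38797c1dbf4e7eb229c4011f0f1f18/models/testmodel2/model.pkl"},
]
candidates = model_files(artifacts)
```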
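Models are uploaded from a model registry configured in Wallaroo through the following endpoint. The specific artifact that is the model to be deployed is the item to upload to the Wallaroo workspace. Models must comply with the Wallaroo model framework and version requirements as defined in Artifact Requirements.
/v1/api/models/upload_from_registry
The request takes the following parameters.
- registry_id: The id of the registry to upload the model from.
- name: The name to assign to the model in Wallaroo. The characters . and _ are not allowed.
- path: The location of the model artifact to upload, as given in the source field.
- visibility: Either public or private.
- workspace_id: The id of the workspace to upload the model to.
- conversion: The framework of the model and its requirements. The requirements list is [] if the requirements are the default Wallaroo JupyterHub libraries.
- input_schema: The input schema in pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
- output_schema: The output schema in pyarrow.lib.Schema format, encoded with base64.b64encode. Only required for non-native runtime models.
import requests

# registry_id is the id of a registry attached to the workspace.
x = requests.post(
    f"https://{APIURL}/v1/api/models/upload_from_registry",
    json={
        "registry_id": registry_id,
        "name": "uploaded-model-name",
        "path": "<DBFS URL from list_artifacts() call here>",
        "visibility": "public",
        "workspace_id": 1,
        "conversion": {
            "framework": "sklearn",
            "requirements": [],
        },
        "input_schema": "<base64-encoded input schema here>",
        "output_schema": "<base64-encoded output schema here>",
    },
    headers=wl.auth.auth_header(),
)
x.json()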
{'model_id': 34}
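The upload request has several constraints that are easy to get wrong: the model name may not contain `.` or `_`, visibility is either public or private, and the schemas are base64-encoded. The sketch below assembles the body with those checks. `encode_schema` and `build_upload_payload` are hypothetical helpers rather than Wallaroo SDK functions; `encode_schema` assumes an object with pyarrow's `Schema.serialize()` interface, and leaving the schema fields out when no schema is given is an assumption based on their being required only for non-native runtime models.

```python
import base64


def encode_schema(schema):
    """Base64-encode a pyarrow.lib.Schema as text that can be placed in JSON."""
    # Schema.serialize() returns a pyarrow Buffer; to_pybytes() copies it to bytes.
    return base64.b64encode(schema.serialize().to_pybytes()).decode("utf8")


def build_upload_payload(registry_id, name, path, workspace_id, framework,
                         visibility="private", requirements=(),
                         input_schema=None, output_schema=None):
    """Return the JSON body for upload_from_registry (hypothetical helper)."""
    if "." in name or "_" in name:
        raise ValueError(f"model name {name!r} must not contain '.' or '_'")
    if visibility not in ("public", "private"):
        raise ValueError("visibility must be 'public' or 'private'")
    payload = {
        "registry_id": registry_id,
        "name": name,
        "path": path,
        "visibility": visibility,
        "workspace_id": workspace_id,
        "conversion": {"framework": framework, "requirements": list(requirements)},
    }
    # Schemas are only needed for non-native runtime models, so they are optional.
    if input_schema is not None:
        payload["input_schema"] = encode_schema(input_schema)
    if output_schema is not None:
        payload["output_schema"] = encode_schema(output_schema)
    return payload
```

With pyarrow installed, a schema built with `pyarrow.schema([...])` can be passed directly as `input_schema` or `output_schema`.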