This tutorial and the assets can be downloaded as part of the Wallaroo Tutorials repository.
Wallaroo SDK Tag Tutorial
The following tutorial demonstrates how to use Wallaroo Tags. Tags are applied to either model versions or pipelines. This allows organizations to track different versions of models, and to search for which pipelines have been used for specific purposes, such as testing versus production use.
The following will be demonstrated:
- List all tags in a Wallaroo instance.
- List all tags applied to a model.
- List all tags applied to a pipeline.
- Apply a tag to a model.
- Remove a tag from a model.
- Apply a tag to a pipeline.
- Remove a tag from a pipeline.
- Search for a model version by a tag.
- Search for a pipeline by a tag.
This demonstration provides the following through the Wallaroo Tutorials GitHub repository:
- models/ccfraud.onnx: a sample model used as part of the Wallaroo 101 Tutorials.
Prerequisites
- A deployed Wallaroo instance
- The following Python libraries installed:
  - os
  - string
  - random
  - wallaroo: The Wallaroo SDK. Included with the Wallaroo JupyterHub service by default.
Steps
The following steps connect to a Wallaroo instance and demonstrate how to use tags with models and pipelines.
Load Libraries
The first step is to load the libraries used to connect and use a Wallaroo instance.
import wallaroo
from wallaroo.object import EntityNotFoundError
import pandas as pd
# used to display dataframe information without truncating
from IPython.display import display
pd.set_option('display.max_colwidth', None)
Connect to the Wallaroo Instance
The next step is to connect to Wallaroo through the Wallaroo client. The Python library is included in the Wallaroo install and available through the JupyterHub interface provided with your Wallaroo environment.
This is accomplished using the wallaroo.Client() command, which provides a URL to grant the SDK permission to your specific Wallaroo environment. When the URL is displayed, enter it into a browser and confirm permissions. Store the connection in a variable that can be referenced later.
If logging into the Wallaroo instance through the internal JupyterHub service, use wl = wallaroo.Client(). For more information on Wallaroo Client settings, see the Client Connection guide.
# Client connection from local Wallaroo instance
wl = wallaroo.Client()
Set Variables
The following variables are used to create or connect to an existing workspace and pipeline. The model name and model file are set as well. Adjust as required for your organization’s needs.
The methods get_workspace and get_pipeline are used to either create a new workspace and pipeline based on the variables below, or connect to an existing workspace and pipeline with the same name. Once complete, the workspace will be set as the current workspace where pipelines and models are used.
To allow this tutorial to be run multiple times or by multiple users in the same Wallaroo instance, a random four-character prefix will be added to the workspace, pipeline, and model names.
import string
import random
# make a random 4 character prefix
prefix= ''.join(random.choice(string.ascii_lowercase) for i in range(4))
workspace_name = f'{prefix}tagtestworkspace'
pipeline_name = f'{prefix}tagtestpipeline'
model_name = f'{prefix}tagtestmodel'
model_file_name = './models/ccfraud.onnx'
def get_workspace(name):
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline
workspace = get_workspace(workspace_name)
wl.set_current_workspace(workspace)
{'name': 'rehqtagtestworkspace', 'id': 24, 'archived': False, 'created_by': '028c8b48-c39b-4578-9110-0b5bdd3824da', 'created_at': '2023-05-17T21:56:18.63721+00:00', 'models': [], 'pipelines': []}
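The get_workspace helper follows a get-or-create pattern: calling it twice with the same name should reuse the workspace rather than create a duplicate. The following is a minimal sketch of that behavior, not the Wallaroo SDK itself. StubWorkspace and StubClient are hypothetical stand-ins that mimic only the list_workspaces, create_workspace, and name calls used in this tutorial, so the example runs without a Wallaroo instance. The client is passed in as a parameter here instead of being read from the global wl.

```python
class StubWorkspace:
    """Hypothetical stand-in for a workspace object; only name() is mimicked."""
    def __init__(self, name):
        self._name = name

    def name(self):
        return self._name


class StubClient:
    """Hypothetical stand-in for the client; keeps workspaces in memory."""
    def __init__(self):
        self._workspaces = []

    def list_workspaces(self):
        return list(self._workspaces)

    def create_workspace(self, name):
        ws = StubWorkspace(name)
        self._workspaces.append(ws)
        return ws


def get_workspace(client, name):
    """Return the workspace called `name`, creating it only if it is missing."""
    for ws in client.list_workspaces():
        if ws.name() == name:
            return ws
    return client.create_workspace(name)


client = StubClient()
first = get_workspace(client, "demotagtestworkspace")
second = get_workspace(client, "demotagtestworkspace")

print(first is second)                # the second call reuses the workspace
print(len(client.list_workspaces()))  # only one workspace was created
```

With a real connection, the same function body should work with the wl client from the earlier step in place of the stub.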
Upload Model and Create Pipeline
The tagtest_model and tagtest_pipeline will be created (or connected to, if they already exist) based on the variables set earlier.
tagtest_model = wl.upload_model(model_name, model_file_name, framework=wallaroo.framework.Framework.ONNX).configure()
tagtest_model
{'name': 'rehqtagtestmodel', 'version': '53febe9a-bb4b-4a01-a6a2-a17f943d6652', 'file_name': 'ccfraud.onnx', 'image_path': None, 'last_update_time': datetime.datetime(2023, 5, 17, 21, 56, 20, 208454, tzinfo=tzutc())}
tagtest_pipeline = get_pipeline(pipeline_name)
tagtest_pipeline
name | rehqtagtestpipeline |
---|---|
created | 2023-05-17 21:56:21.405556+00:00 |
last_updated | 2023-05-17 21:56:21.405556+00:00 |
deployed | (none) |
tags | |
versions | e259f6db-8ce2-45f1-b2d7-a719fde3b18f |
steps |
List Pipeline and Model Tags
This tutorial assumes that no tags currently exist, but that can be verified through the Wallaroo client list_pipelines and list_models commands. For this demonstration, it is recommended to use unique tags to verify each example.
wl.list_pipelines()
name | created | last_updated | deployed | tags | versions | steps |
---|---|---|---|---|---|---|
rehqtagtestpipeline | 2023-17-May 21:56:21 | 2023-17-May 21:56:21 | (unknown) | | e259f6db-8ce2-45f1-b2d7-a719fde3b18f | |
osysapiinferenceexamplepipeline | 2023-17-May 21:54:56 | 2023-17-May 21:54:56 | False | | 8f244f23-73f9-4af2-a95e-2a03214dca63 | osysccfraud |
fvqusdkinferenceexamplepipeline | 2023-17-May 21:53:14 | 2023-17-May 21:53:15 | False | | a987e13f-ffbe-4826-a6f5-9fd8de9f47fa, 0966d243-ce76-4132-aa69-0d287ae9a572 | fvquccfraud |
gobtedgepipelineexample | 2023-17-May 21:50:13 | 2023-17-May 21:51:06 | False | | dc0238e7-f3e3-4579-9a63-24902cb3e3bd, 5cf788a6-50ff-471f-a3ee-4bfdc24def34, 9efda57b-c18b-4ebb-9681-33647e7d7e66 | gobtalohamodel |
logpipeline | 2023-17-May 21:41:06 | 2023-17-May 21:46:51 | False | | 66fb765b-d46c-4472-9976-dba2eac5b8ce, 328b2b59-7a57-403b-abd5-70708a67674e, 18eb212d-0af5-4c0b-8bdb-3abbc4907a3e, c39b5215-0535-4006-a26a-d78b1866435b | logcontrol |
btffhotswappipeline | 2023-17-May 21:37:16 | 2023-17-May 21:37:39 | False | | 438796a3-e320-4a51-9e64-35eb32d57b49, 4fc11650-1003-43c2-bd3a-96b9cdacbb6d, e4b8d7ca-00fa-4e31-8671-3d0a3bf4c16e, 3c5f951b-e815-4bc7-93bf-84de3d46718d | btffhousingmodelcontrol |
qjjoccfraudpipeline | 2023-17-May 21:32:06 | 2023-17-May 21:32:08 | False | | 89b634d6-f538-4ac6-98a2-fbb9883fdeb6, c0f8551d-cefe-49c8-8701-c2a307c0ad99 | qjjoccfraudmodel |
housing-pipe | 2023-17-May 21:26:56 | 2023-17-May 21:29:05 | False | | 34e75a0c-01bd-4ca2-a6e8-ebdd25473aab, b7dbd380-e48c-487c-8f23-398a2ba558c3, 5ea6f182-5764-4377-9f83-d363e349ef32 | preprocess |
xgboost-regression-autoconvert-pipeline | 2023-17-May 21:21:56 | 2023-17-May 21:21:59 | False | | f5337089-2756-469a-871a-1cb9e3416847, 324433ae-db9a-4d43-9563-ff76df59953d | xgb-regression-model |
xgboost-classification-autoconvert-pipeline | 2023-17-May 21:21:19 | 2023-17-May 21:21:22 | False | | 5f7bb0cc-f60d-4cee-8425-c5e85331ae2f, bbe4dce4-f62a-4f4f-a45c-aebbfce23304 | xgb-class-model |
statsmodelpipeline | 2023-17-May 21:19:52 | 2023-17-May 21:19:55 | False | | 4af264e3-f427-4b02-b5ad-4f6690b0ee06, 5456dd2a-3167-4b3c-ad3a-85544292a230 | bikedaymodel |
isoletpipeline | 2023-17-May 21:17:33 | 2023-17-May 21:17:44 | False | | c129b33c-cefc-4873-ad2c-d186fe2b8228, 145b768e-79f2-44fd-ab6b-14d675501b83 | isolettest |
externalkerasautoconvertpipeline | 2023-17-May 21:13:27 | 2023-17-May 21:13:30 | False | | 7be0dd01-ef82-4335-b60d-6f1cd5287e5b, 3948e0dc-d591-4ff5-a48f-b8d17195a806 | externalsimple-sentiment-model |
gcpsdkpipeline | 2023-17-May 21:03:44 | 2023-17-May 21:03:49 | False | | 6398cafc-50c4-49e3-9499-6025b7808245, 7c043d3c-c894-4ae9-9ec1-c35518130b90 | gcpsdkmodel |
databricksazuresdkpipeline | 2023-17-May 21:02:55 | 2023-17-May 21:02:59 | False | | f125dc67-f690-4011-986a-8f6a9a23c48a, 8c4a15b4-2ef0-4da1-8e2d-38088fde8c56 | ccfraudmodel |
azuremlsdkpipeline | 2023-17-May 21:01:46 | 2023-17-May 21:01:51 | False | | 28a7a5aa-5359-4320-842b-bad84258f7e4, e011272d-c22c-4b2d-ab9f-b17c60099434 | azuremlsdkmodel |
copiedmodelpipeline | 2023-17-May 20:54:01 | 2023-17-May 20:54:01 | (unknown) | | bcf5994f-1729-4036-a910-00b662946801 | |
pipelinemodels | 2023-17-May 20:52:06 | 2023-17-May 20:52:06 | False | | 55f45c16-591e-4a16-8082-3ab6d843b484 | apimodel |
pipelinenomodel | 2023-17-May 20:52:04 | 2023-17-May 20:52:04 | (unknown) | | a6dd2cee-58d6-4d24-9e25-f531dbbb95ad | |
sdkquickpipeline | 2023-17-May 20:43:38 | 2023-17-May 20:46:02 | False | | 961c909d-f5ae-472a-b8ae-1e6a00fbc36e, bf7c2146-ed14-430b-bf96-1e8b1047eb2e, 2bd5c838-f7cc-4f48-91ea-28a9ce0f7ed8, d72c468a-a0e2-4189-aa7a-4e27127a2f2b | sdkquickmodel |
housepricepipe | 2023-17-May 20:41:50 | 2023-17-May 20:41:50 | False | | 4d9dfb3b-c9ae-402a-96fc-20ae0a2b2279, fc68f5f2-7bbf-435e-b434-e0c89c28c6a9 | housepricemodel |
wl.list_models()
Name | # of Versions | Owner ID | Last Updated | Created At |
---|---|---|---|---|
rehqtagtestmodel | 1 | "" | 2023-05-17 21:56:20.208454+00:00 | 2023-05-17 21:56:20.208454+00:00 |
Create Tag
Tags are created with the Wallaroo client command create_tag(String tagname). This creates the tag and makes it available for use.
The tag will be saved to the variable currentTag to be used in the rest of these examples.
# Now we create our tag
currentTag = wl.create_tag("My Great Tag")
List Tags
Tags are listed with the Wallaroo client command list_tags(), which shows all tags and what models and pipelines they have been assigned to. Note that if a tag has not been assigned, it will not be displayed.
# List all tags
wl.list_tags()
(no tags)
Assign Tag to a Model
Tags are assigned to a model through the Wallaroo Tag add_to_model(model_id) command, where model_id is the model’s numerical ID. The tag is applied to the most current version of the model.
For this example, the currentTag will be applied to the tagtest_model. All tags will then be listed to show it has been assigned to this model.
# add tag to model
currentTag.add_to_model(tagtest_model.id())
{'model_id': 29, 'tag_id': 1}
# list all tags to verify
wl.list_tags()
id | tag | models | pipelines |
---|---|---|---|
1 | My Great Tag | [('rehqtagtestmodel', ['53febe9a-bb4b-4a01-a6a2-a17f943d6652'])] | [] |
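The models and pipelines columns in the list_tags output pair each name with a list of version IDs. The following is a minimal sketch of reading that structure, assuming it is available as a plain Python list of (name, versions) tuples exactly as displayed above. How to extract it from the SDK's return object is not covered here, and versions_by_name is a hypothetical helper name.

```python
def versions_by_name(entries):
    """Map each name to its list of version IDs, merging repeated names."""
    result = {}
    for name, versions in entries:
        result.setdefault(name, []).extend(versions)
    return result


# Value copied from the `models` column of the list_tags output above.
models_column = [('rehqtagtestmodel', ['53febe9a-bb4b-4a01-a6a2-a17f943d6652'])]

tagged_models = versions_by_name(models_column)
for name, versions in tagged_models.items():
    print(f"{name}: {len(versions)} tagged version(s)")
```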
Search Models by Tag
Model versions can be searched via tags using the Wallaroo Client method search_models(search_term), where search_term is a string value. All model versions containing the tag will be displayed. In this example, we will use the text from our tag to list all models that have the text from currentTag in them.
# Search models by tag
wl.search_models('My Great Tag')
name | version | file_name | image_path | last_update_time |
---|---|---|---|---|
rehqtagtestmodel | 53febe9a-bb4b-4a01-a6a2-a17f943d6652 | ccfraud.onnx | None | 2023-05-17 21:56:20.208454+00:00 |
Remove Tag from Model
Tags are removed from models using the Wallaroo Tag remove_from_model(model_id) command.
In this example, the currentTag will be removed from tagtest_model. A list of all tags will be shown with the list_tags command, followed by searching the models for the tag to verify it has been removed.
# remove tag from model
currentTag.remove_from_model(tagtest_model.id())
{'model_id': 29, 'tag_id': 1}
# list all tags to verify it has been removed from `tagtest_model`.
wl.list_tags()
(no tags)
# search models for currentTag to verify it has been removed from `tagtest_model`.
wl.search_models('My Great Tag')
(no model versions)
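The visual checks above can also be expressed as a programmatic check. The following is a minimal sketch, assuming the result of search_models can be iterated and that each item exposes a name() method; neither assumption is confirmed by this tutorial. model_has_tag, FakeModel, and FakeSearchClient are hypothetical names, and the fake classes exist only so the example runs without a Wallaroo instance.

```python
def model_has_tag(client, tag_text, model_name):
    """Return True if `model_name` appears in the search results for `tag_text`."""
    return any(m.name() == model_name for m in client.search_models(tag_text))


class FakeModel:
    """Hypothetical stand-in for a model version; only name() is mimicked."""
    def __init__(self, name):
        self._name = name

    def name(self):
        return self._name


class FakeSearchClient:
    """Hypothetical stand-in client backed by a dict of tag text -> model names."""
    def __init__(self, tagged):
        self._tagged = tagged

    def search_models(self, search_term):
        return [FakeModel(n) for n in self._tagged.get(search_term, [])]


# State before and after the tag is removed from the model.
before_removal = FakeSearchClient({"My Great Tag": ["rehqtagtestmodel"]})
after_removal = FakeSearchClient({})

print(model_has_tag(before_removal, "My Great Tag", "rehqtagtestmodel"))
print(model_has_tag(after_removal, "My Great Tag", "rehqtagtestmodel"))
```

If those assumptions hold for your SDK version, passing the real wl client in place of the fake one would give the same kind of True or False answer.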
Add Tag to Pipeline
Tags are added to a pipeline through the Wallaroo Tag add_to_pipeline(pipeline_id) method, where pipeline_id is the pipeline’s integer id.
For this example, we will add currentTag to tagtest_pipeline, then verify it has been added through the list_tags and list_pipelines commands.
# add this tag to the pipeline
currentTag.add_to_pipeline(tagtest_pipeline.id())
{'pipeline_pk_id': 45, 'tag_pk_id': 1}
# list tags to verify it was added to tagtest_pipeline
wl.list_tags()
id | tag | models | pipelines |
---|---|---|---|
1 | My Great Tag | [] | [('rehqtagtestpipeline', ['e259f6db-8ce2-45f1-b2d7-a719fde3b18f'])] |
# get all of the pipelines to show the tag was added to tagtest_pipeline
wl.list_pipelines()
name | created | last_updated | deployed | tags | versions | steps |
---|---|---|---|---|---|---|
rehqtagtestpipeline | 2023-17-May 21:56:21 | 2023-17-May 21:56:21 | (unknown) | My Great Tag | e259f6db-8ce2-45f1-b2d7-a719fde3b18f | |
osysapiinferenceexamplepipeline | 2023-17-May 21:54:56 | 2023-17-May 21:54:56 | False | | 8f244f23-73f9-4af2-a95e-2a03214dca63 | osysccfraud |
fvqusdkinferenceexamplepipeline | 2023-17-May 21:53:14 | 2023-17-May 21:53:15 | False | | a987e13f-ffbe-4826-a6f5-9fd8de9f47fa, 0966d243-ce76-4132-aa69-0d287ae9a572 | fvquccfraud |
gobtedgepipelineexample | 2023-17-May 21:50:13 | 2023-17-May 21:51:06 | False | | dc0238e7-f3e3-4579-9a63-24902cb3e3bd, 5cf788a6-50ff-471f-a3ee-4bfdc24def34, 9efda57b-c18b-4ebb-9681-33647e7d7e66 | gobtalohamodel |
logpipeline | 2023-17-May 21:41:06 | 2023-17-May 21:46:51 | False | | 66fb765b-d46c-4472-9976-dba2eac5b8ce, 328b2b59-7a57-403b-abd5-70708a67674e, 18eb212d-0af5-4c0b-8bdb-3abbc4907a3e, c39b5215-0535-4006-a26a-d78b1866435b | logcontrol |
btffhotswappipeline | 2023-17-May 21:37:16 | 2023-17-May 21:37:39 | False | | 438796a3-e320-4a51-9e64-35eb32d57b49, 4fc11650-1003-43c2-bd3a-96b9cdacbb6d, e4b8d7ca-00fa-4e31-8671-3d0a3bf4c16e, 3c5f951b-e815-4bc7-93bf-84de3d46718d | btffhousingmodelcontrol |
qjjoccfraudpipeline | 2023-17-May 21:32:06 | 2023-17-May 21:32:08 | False | | 89b634d6-f538-4ac6-98a2-fbb9883fdeb6, c0f8551d-cefe-49c8-8701-c2a307c0ad99 | qjjoccfraudmodel |
housing-pipe | 2023-17-May 21:26:56 | 2023-17-May 21:29:05 | False | | 34e75a0c-01bd-4ca2-a6e8-ebdd25473aab, b7dbd380-e48c-487c-8f23-398a2ba558c3, 5ea6f182-5764-4377-9f83-d363e349ef32 | preprocess |
xgboost-regression-autoconvert-pipeline | 2023-17-May 21:21:56 | 2023-17-May 21:21:59 | False | | f5337089-2756-469a-871a-1cb9e3416847, 324433ae-db9a-4d43-9563-ff76df59953d | xgb-regression-model |
xgboost-classification-autoconvert-pipeline | 2023-17-May 21:21:19 | 2023-17-May 21:21:22 | False | | 5f7bb0cc-f60d-4cee-8425-c5e85331ae2f, bbe4dce4-f62a-4f4f-a45c-aebbfce23304 | xgb-class-model |
statsmodelpipeline | 2023-17-May 21:19:52 | 2023-17-May 21:19:55 | False | | 4af264e3-f427-4b02-b5ad-4f6690b0ee06, 5456dd2a-3167-4b3c-ad3a-85544292a230 | bikedaymodel |
isoletpipeline | 2023-17-May 21:17:33 | 2023-17-May 21:17:44 | False | | c129b33c-cefc-4873-ad2c-d186fe2b8228, 145b768e-79f2-44fd-ab6b-14d675501b83 | isolettest |
externalkerasautoconvertpipeline | 2023-17-May 21:13:27 | 2023-17-May 21:13:30 | False | | 7be0dd01-ef82-4335-b60d-6f1cd5287e5b, 3948e0dc-d591-4ff5-a48f-b8d17195a806 | externalsimple-sentiment-model |
gcpsdkpipeline | 2023-17-May 21:03:44 | 2023-17-May 21:03:49 | False | | 6398cafc-50c4-49e3-9499-6025b7808245, 7c043d3c-c894-4ae9-9ec1-c35518130b90 | gcpsdkmodel |
databricksazuresdkpipeline | 2023-17-May 21:02:55 | 2023-17-May 21:02:59 | False | | f125dc67-f690-4011-986a-8f6a9a23c48a, 8c4a15b4-2ef0-4da1-8e2d-38088fde8c56 | ccfraudmodel |
azuremlsdkpipeline | 2023-17-May 21:01:46 | 2023-17-May 21:01:51 | False | | 28a7a5aa-5359-4320-842b-bad84258f7e4, e011272d-c22c-4b2d-ab9f-b17c60099434 | azuremlsdkmodel |
copiedmodelpipeline | 2023-17-May 20:54:01 | 2023-17-May 20:54:01 | (unknown) | | bcf5994f-1729-4036-a910-00b662946801 | |
pipelinemodels | 2023-17-May 20:52:06 | 2023-17-May 20:52:06 | False | | 55f45c16-591e-4a16-8082-3ab6d843b484 | apimodel |
pipelinenomodel | 2023-17-May 20:52:04 | 2023-17-May 20:52:04 | (unknown) | | a6dd2cee-58d6-4d24-9e25-f531dbbb95ad | |
sdkquickpipeline | 2023-17-May 20:43:38 | 2023-17-May 20:46:02 | False | | 961c909d-f5ae-472a-b8ae-1e6a00fbc36e, bf7c2146-ed14-430b-bf96-1e8b1047eb2e, 2bd5c838-f7cc-4f48-91ea-28a9ce0f7ed8, d72c468a-a0e2-4189-aa7a-4e27127a2f2b | sdkquickmodel |
housepricepipe | 2023-17-May 20:41:50 | 2023-17-May 20:41:50 | False | | 4d9dfb3b-c9ae-402a-96fc-20ae0a2b2279, fc68f5f2-7bbf-435e-b434-e0c89c28c6a9 | housepricemodel |
Search Pipelines by Tag
Pipelines can be searched through the Wallaroo Client search_pipelines(search_term) method, where search_term is a string value for tags assigned to the pipelines.
In this example, the text “My Great Tag” that corresponds to currentTag will be searched for and displayed.
wl.search_pipelines('My Great Tag')
name | version | creation_time | last_updated_time | deployed | tags | steps |
---|---|---|---|---|---|---|
rehqtagtestpipeline | e259f6db-8ce2-45f1-b2d7-a719fde3b18f | 2023-17-May 21:56:21 | 2023-17-May 21:56:21 | (unknown) | My Great Tag |
Remove Tag from Pipeline
Tags are removed from a pipeline with the Wallaroo Tag remove_from_pipeline(pipeline_id) command, where pipeline_id is the integer value of the pipeline’s id.
For this example, currentTag will be removed from tagtest_pipeline. This will be verified through the list_tags and search_pipelines commands.
# remove tag from pipeline
currentTag.remove_from_pipeline(tagtest_pipeline.id())
{'pipeline_pk_id': 45, 'tag_pk_id': 1}
wl.list_tags()
(no tags)
# verify the tag was removed from tagtest_pipeline
wl.search_pipelines('My Great Tag')
(no pipelines)
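When a tag is only needed for a short time, such as marking a pipeline during a test run, the add and remove steps can be paired so the tag is removed even if an error occurs in between. The following is a minimal sketch of that idea using Python's contextlib. It relies only on the add_to_pipeline and remove_from_pipeline calls shown in this tutorial. temporarily_tagged is a hypothetical helper, and RecordingTag is a hypothetical stand-in that records calls instead of contacting Wallaroo.

```python
from contextlib import contextmanager


@contextmanager
def temporarily_tagged(tag, pipeline_id):
    """Apply `tag` to the pipeline for the duration of the block, then remove it."""
    tag.add_to_pipeline(pipeline_id)
    try:
        yield tag
    finally:
        # Runs even if the body of the `with` block raises.
        tag.remove_from_pipeline(pipeline_id)


class RecordingTag:
    """Hypothetical stand-in for a tag; records calls for demonstration only."""
    def __init__(self):
        self.calls = []

    def add_to_pipeline(self, pipeline_id):
        self.calls.append(("add", pipeline_id))

    def remove_from_pipeline(self, pipeline_id):
        self.calls.append(("remove", pipeline_id))


demo_tag = RecordingTag()
with temporarily_tagged(demo_tag, 45):
    pass  # searches or tests that rely on the tag would run here

print(demo_tag.calls)
```

With a live connection, currentTag and tagtest_pipeline.id() from the steps above would take the place of demo_tag and 45.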