The Wallaroo.AI Cheat Sheet
The following tables provide a quick reference to common Wallaroo commands for the busy systems administrator and developer.
Installation and Install Administration
The following commands require administrative access to the Wallaroo installation and the kubectl command.
| Request | Command | Description | 
|---|---|---|
| Launch Kots Admin Dashboard | kubectl kots admin-console --namespace wallaroo | Launches the Kots Administrative Dashboard. Replace the namespace wallaroo with the namespace Wallaroo is installed in. | 
| Get Keycloak Admin Password | kubectl -n wallaroo get secret keycloak-admin-secret -o go-template='{{.data.KEYCLOAK_ADMIN_PASSWORD | base64decode }}{{"\n"}}' | Displays the Keycloak admin password. Replace -n wallaroo with the namespace Wallaroo is installed in. | 
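The go-template in the Keycloak command applies base64decode because Kubernetes stores secret values base64-encoded. The following is a minimal Python sketch of that decoding step only, assuming the encoded value has already been retrieved by other means. The function name decode_secret_value is hypothetical and is not part of Wallaroo or kubectl.

```python
import base64


def decode_secret_value(encoded: str) -> str:
    """Decode one base64-encoded value taken from a Kubernetes Secret's data map."""
    return base64.b64decode(encoded).decode("utf-8")


if __name__ == "__main__":
    # Sample value for illustration only; this is not a real password.
    sample = base64.b64encode(b"example-password").decode("ascii")
    print(decode_secret_value(sample))
```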
Wallaroo SDK
The following commands are used via the Wallaroo SDK. Each example assumes the Wallaroo Client variable is saved to wl, for example:
import wallaroo
wl = wallaroo.Client()
The commands and parameters below are not comprehensive; they cover the most common parameters. For a comprehensive list, see the reference links in the Description fields.
Connection
| Request | Example | Description | Command | Params | 
|---|---|---|---|---|
| Connect Client | Within the Wallaroo JupyterHub service: wl = wallaroo.Client(). From an external SDK session: wl = wallaroo.Client(api_endpoint="https://example.wallaroo.ai", auth_type="sso"). For more details, see Wallaroo SDK Essentials Guide: Client Connection. | Connects the Wallaroo client to the Wallaroo Ops instance. | wallaroo.Client | |
Workspaces
| Request | Example | Description | Command | Params | 
|---|---|---|---|---|
| Create Workspace | workspace = wl.create_workspace("example-workspace") | Creates the workspace if the workspace name does not already exist. For more details, see Workspace Management. | wallaroo.Client.create_workspace | |
| Get Workspace | workspace = wl.get_workspace(name="metric-retrieval-tutorial", create_if_not_exist=True) | Retrieves the workspace by name if it exists; if create_if_not_exist is True, creates the workspace if it does not already exist. For more details, see Workspace Management. | wallaroo.Client.get_workspace | |
| Set Current Workspace | wl.set_current_workspace(workspace) | Sets the current SDK session workspace to the specified workspace. From this point, all model uploads, pipeline builds, etc. take place in this workspace by default until the end of the SDK session. For more details, see Workspace Management. | wallaroo.Client.set_current_workspace | |
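The workspace calls in the table above are often used together at the start of a session. The following is a minimal sketch of that pattern, assuming wl is an already connected Wallaroo client as described earlier. The helper name use_workspace is hypothetical; only get_workspace and set_current_workspace come from the table.

```python
def use_workspace(wl, name: str):
    """Retrieve the named workspace, creating it if needed, and make it current.

    `wl` is assumed to be a connected wallaroo.Client instance.
    """
    # create_if_not_exist=True avoids a separate create_workspace call.
    workspace = wl.get_workspace(name=name, create_if_not_exist=True)
    # Later model uploads and pipeline builds default to this workspace.
    wl.set_current_workspace(workspace)
    return workspace
```

For example, workspace = use_workspace(wl, "example-workspace") could replace the separate get and set calls.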
Models Upload
| Request | Example | Description | Command | Params | 
|---|---|---|---|---|
| Upload Model | model = wl.upload_model(name=model_name, path=model_file_name, framework=wallaroo.framework.Framework.ONNX) | Uploads the model to the current workspace. Uploading a model with the same name as an existing model creates a new model version. Returns a reference to the model version just uploaded. For more details, see Model Upload. | wallaroo.client.Client.upload_model | |
| Get Model Version from Workspace | model = wl.get_model("sample-model") | Retrieves the most recent model version matching the name, or the model version matching the name and version, in the current workspace. For more details, see Model Management. | wallaroo.Client.get_model | |
Pipeline Deployment
| Request | Example | Description | Command | Params | 
|---|---|---|---|---|
| Build New Pipeline | pipeline = wl.build_pipeline("sample-pipeline") | Creates a new pipeline in the current workspace. If a new pipeline is built with the same name, a new pipeline version is created. Pipeline versions are not stored in Wallaroo until deployment. For more details, see Create a Pipeline. | wallaroo.Client.build_pipeline | |
| Get Pipeline from Workspace | pipeline = wl.get_pipeline(name="sample-pipeline") | Retrieves the latest pipeline version by name or the specified pipeline version matching the name and version. For more details, see Get Pipeline. | wallaroo.Client.get_pipeline | |
| Get Pipeline Status | pipeline.status() | Displays the current status including deployment status, models, etc. For more details, see Get Pipeline Status. | wallaroo.pipeline.Pipeline.status() | |
| Add Pipeline Step | pipeline.add_model_step(sample_model) | Sets the model version as a step for the pipeline. Pipeline steps are not saved in Wallaroo until the pipeline is deployed. For more details, see Add a Step to a Pipeline. | wallaroo.pipeline.Pipeline.add_model_step | |
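The build and add-step calls above can be combined when a pipeline chains several models. The following is a minimal sketch, assuming wl is a connected Wallaroo client and model_versions holds model versions returned by upload_model or get_model. The helper name build_pipeline_with_steps is hypothetical.

```python
def build_pipeline_with_steps(wl, pipeline_name: str, model_versions):
    """Build a pipeline and add each model version as a step, in order.

    `wl` is assumed to be a connected wallaroo.Client instance.
    """
    pipeline = wl.build_pipeline(pipeline_name)
    for model_version in model_versions:
        # Steps run in the order they are added.
        pipeline.add_model_step(model_version)
    # Per the table above, the steps are only saved once the pipeline is deployed.
    return pipeline
```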
Deployment Configuration Options
Deployment configuration determines what resources are allocated from the cluster for the model’s use. The main consideration is whether all models run in the Wallaroo Native Runtime, or whether one or more are Containerized Runtime models. For more details on supported models, runtimes and hardware options, see Deployment Configuration with the Wallaroo SDK.
For these examples the DeploymentConfigBuilder is imported directly into the Python script for ease of use:
from wallaroo.deployment_config import DeploymentConfigBuilder
- Native Runtime Deployment Configuration: All models run on Wallaroo Native Runtime.
from wallaroo.deployment_config import DeploymentConfigBuilder

# Parentheses let the chained builder calls span multiple lines.
deploy_config = (
    DeploymentConfigBuilder()
    .cpus(4)
    .memory('3Gi')
    .replica_autoscale_min_max(minimum=0, maximum=5)
    .build()
)
- Containerized Runtime Deployment Configuration: A model is deployed to the Wallaroo Containerized Runtime with GPU support, with no models deployed to the Wallaroo Native Runtime. Note that for Containerized Runtimes:
  - The term sidekick is included to specify the model(s) deployed in the Wallaroo Containerized Runtime and the specific model the resources are applied to. For this example, no GPUs are applied to the Wallaroo Native Runtime.
  - When deploying with GPUs, the parameter deployment_label must be included, whether in the Wallaroo Native Runtime or the Wallaroo Containerized Runtime.
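from wallaroo.deployment_config import DeploymentConfigBuilder

# `model` is the model version deployed in the Containerized Runtime,
# for example the value returned by wl.upload_model(...).
deploy_config = (
    DeploymentConfigBuilder()
    .replica_count(5)
    .cpus(0.25)
    .memory('1Gi')
    .gpus(0)  # no GPUs for the Wallaroo Native Runtime
    .deployment_label('doc-gpu-label:true')
    .sidekick_gpus(model, 1)
    .sidekick_cpus(model, 4)
    .sidekick_memory(model, '3Gi')
    .build()
)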
Pipeline Deploy and Undeploy Options
| Request | Example | Description | Command | Params | 
|---|---|---|---|---|
| Deploy | pipeline.deploy(deployment_config=deploy_config) | Deploys the pipeline with the set pipeline steps and the resources as defined by the deployment configuration. The optional wait_for_status parameter allows for asynchronous deployment without waiting for the results. Deployment results are checked from the pipeline status. For more details, see Deploy a Pipeline. | wallaroo.pipeline.Pipeline.deploy() | |
| Undeploy Pipeline | pipeline.undeploy() | Undeploys the pipeline and returns the resources back to the cluster. For more details, see Undeploy a Pipeline. | wallaroo.pipeline.Pipeline.undeploy() | |
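Because a deployed pipeline holds cluster resources until it is undeployed, it can help to pair the two calls so the undeploy always runs. The following is a minimal sketch using a Python context manager, assuming pipeline and deploy_config were created as shown earlier. The name deployed is hypothetical; only deploy and undeploy come from the table above.

```python
from contextlib import contextmanager


@contextmanager
def deployed(pipeline, deploy_config):
    """Deploy the pipeline for the duration of a `with` block, then undeploy it."""
    pipeline.deploy(deployment_config=deploy_config)
    try:
        yield pipeline
    finally:
        # Return the resources to the cluster even if the block raised an error.
        pipeline.undeploy()
```

A possible use is `with deployed(pipeline, deploy_config) as p: p.status()`, which undeploys the pipeline once the block exits.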
Pipeline Inference
| Request | Example | Description | Command | Params | 
|---|---|---|---|---|
| Infer from Object | inference_results = pipeline.infer(dataframe_input, timeout=300) | Performs an inference on the submitted data through each model set as a pipeline step: the first model accepts the submitted input, and each subsequent model receives the output of the previous model. For more details, see Inference. | wallaroo.pipeline.Pipeline.infer | |
| Infer from File | inference_results = pipeline.infer_from_file("./data/sample_input.json") | Submits the file as an inference request through each model set as a pipeline step: the first model accepts the submitted input, and each subsequent model receives the output of the previous model. For more details, see Inference from File. | wallaroo.pipeline.Pipeline.infer_from_file | |
| Get Pipeline Inference Logs and Metadata | pipeline.logs(start_datetime=task_start, end_datetime=task_end) | Returns the inference results as a pandas DataFrame in reverse chronological order. For more details, see Get Pipeline Logs. | wallaroo.pipeline.Pipeline.logs() | |
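The logs example above takes a start and an end datetime. The following is a minimal sketch of building such a window with the Python standard library. The helper name recent_window is hypothetical, and the use of timezone-aware UTC datetimes is an assumption; check the Get Pipeline Logs reference for the datetime format your Wallaroo version expects.

```python
import datetime


def recent_window(minutes: int):
    """Return (start, end) datetimes covering the last `minutes` minutes."""
    # Assumption: timezone-aware UTC values are used for the window bounds.
    end = datetime.datetime.now(datetime.timezone.utc)
    start = end - datetime.timedelta(minutes=minutes)
    return start, end
```

The result could then be passed along as task_start, task_end = recent_window(15) followed by pipeline.logs(start_datetime=task_start, end_datetime=task_end).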
Troubleshooting
Some of the following commands require administrative access to the Wallaroo installation and the kubectl command.
Retrieve Pipeline System Logs
The following steps are used to view the Kubernetes pod log during Pipeline deployment and execution. DevOps engineers modify the deployed Pipeline in Kubernetes to allow for trace level logs and display them as needed.
Set Log Level to Trace
Each Wallaroo pipeline is deployed to its own Kubernetes namespace in the format {pipeline-name}-{id}, so additional log data can be retrieved directly from the Kubernetes deployment.
- From the terminal prompt, determine the namespace for the deployed pipeline with the command kubectl get namespaces. The namespace is in the format {pipeline-name}-{id}. For example, the namespace for the pipeline python-step-demo-pipeline is python-step-demo-pipeline-48:
  kubectl get namespaces
  NAME                           STATUS        AGE
  default                        Active        153d
  edge-pipeline-6                Terminating   22m
  kube-node-lease                Active        153d
  kube-public                    Active        153d
  kube-system                    Active        153d
  python-step-demo-pipeline-48   Active        53s
  velero                         Active        140d
  wallaroo                       Active        4d2h
- Edit the engine deployment for the pipeline with the command kubectl edit deploy engine -n {pipeline-namespace}. For example, using the pipeline deployed as python-step-demo-pipeline-48, that command is:
  kubectl edit deploy engine -n python-step-demo-pipeline-48
- Within the deploy engine config, scroll to spec.template.spec.containers.env where name=RUST_LOG. The easiest way is to search by pressing / and typing in RUST_LOG. This is how it looks by default:
  env:
  - name: RUST_LOG
    value: debug,fitzroy::sink=info
  - name: RUST_BACKTRACE
    value: full
  - name: MINIO_NAMESPACE
    value: wallaroo
- Update the RUST_LOG value from debug to trace. This is done by placing the cursor at debug, typing i (for insert), and replacing debug with trace. For more on editing, see the instructions for the vi editor. It will look like this:
  env:
  - name: RUST_LOG
    value: trace,fitzroy::sink=info
  - name: RUST_BACKTRACE
    value: full
  - name: MINIO_NAMESPACE
    value: wallaroo
- When finished, press ESC to leave insert mode, then enter :wq to save and exit the editor. Wait for the engine pod to reset. Use the command kubectl get pods -n {pipeline_namespace} to view the pods. The engine pod(s) will reset. For example:
  kubectl get pods -n python-step-demo-pipeline-48
  NAME                         READY   STATUS        RESTARTS   AGE
  engine-85cb7b6bd8-bqf6v      1/1     Running       0          17s
  engine-d4cfb84c8-fs89h       1/1     Terminating   0          2m55s
  engine-lb-584f54c899-qj55z   1/1     Running       0          2m55s
  helm-runner-s5t5k            0/1     Completed     0          2m56s
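The first step above relies on spotting the {pipeline-name}-{id} namespace by eye. The following is a minimal Python sketch that picks it out of saved kubectl get namespaces output. The function name find_pipeline_namespaces is hypothetical, and the sketch assumes the id is numeric, as in the python-step-demo-pipeline-48 example.

```python
import re


def find_pipeline_namespaces(kubectl_output: str, pipeline_name: str):
    """Return namespaces named {pipeline_name}-{id} from `kubectl get namespaces` text."""
    # Assumption: the id suffix is made of digits, as in the example above.
    pattern = re.compile(rf"^{re.escape(pipeline_name)}-\d+$")
    matches = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # The namespace name is the first column; the header row never matches.
        if fields and pattern.match(fields[0]):
            matches.append(fields[0])
    return matches
```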
View Kubernetes Pipeline Logs
- Now we can view the logs with kubectl logs --follow -n {pipeline_namespace} {engine_pod}. With the pipeline and pods above:
  kubectl logs --follow -n python-step-demo-pipeline-48 engine-85cb7b6bd8-bqf6v
  Here’s the standard output:
  {"msg":"deregistering event source from poller","level":"TRCE","ts":"2023-09-12T16:42:52.660761783Z","file_line":"/usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/mio-0.8.8/src/poll.rs:662"}
  {"msg":"registering event source with poller: token=Token(419430405), interests=READABLE | WRITABLE","level":"TRCE","ts":"2023-09-12T16:43:02.659772451Z","file_line":"/usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/mio-0.8.8/src/poll.rs:531"}
  {"msg":"deregistering event source from poller","level":"TRCE","ts":"2023-09-12T16:43:02.661156669Z","file_line":"/usr/local/cargo/registry/src/github.com-1ecc6299db9ec823/mio-0.8.8/src/poll.rs:662"}
  Here’s the output when an inference is run for the following Python step model. The outputs of the print statement appear in the lines beginning with {"msg":"Python: stdout:
  import pandas as pd
  # take a dataframe output of the house price model, and reformat the `dense_2`
  # column as `output`
  def wallaroo_json(data: pd.DataFrame):
      print(data)
      return [{"output": [data["dense_2"].to_list()[0][0]]}]
  The resulting log entries:
  {"msg":"beginning inference for python-step","level":"TRCE","ts":"2023-09-12T16:43:25.176094863Z","file_line":"src/model/manager.rs:416"}
  {"msg":"input: {\"_wallaroo_pandas\":[{\"dense_2\":[12.886651]}]}","level":"TRCE","ts":"2023-09-12T16:43:25.176198964Z","file_line":"src/runtimes/python/pipeline_session.rs:92"}
  {"msg":"Python: stdout: dense_2","level":"INFO","ts":"2023-09-12T16:43:25.265281832Z","file_line":"src/runtimes/python/process.rs:196"}
  {"msg":"Python: stdout: 0 [12.886651]","level":"INFO","ts":"2023-09-12T16:43:25.265315833Z","file_line":"src/runtimes/python/process.rs:196"}
  {"msg":"Python result receiver: Got result from python","level":"INFO","ts":"2023-09-12T16:43:25.265321733Z","file_line":"src/runtimes/python/session_wrapper.rs:79"}
  {"msg":"output: [{\"output\":[12.886651]}]","level":"TRCE","ts":"2023-09-12T16:43:25.265327433Z","file_line":"src/runtimes/python/pipeline_session.rs:103"}
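Since each engine log entry is a JSON object on its own line, the Python step output can be separated from the trace noise with a short script. The following is a minimal sketch, assuming the log text was saved from kubectl logs. The function name python_stdout_lines is hypothetical; the "Python: stdout: " prefix is taken from the sample entries above.

```python
import json

STDOUT_PREFIX = "Python: stdout: "


def python_stdout_lines(log_text: str):
    """Return the text printed by Python steps, extracted from engine log lines."""
    printed = []
    for line in log_text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            # Skip anything that is not a JSON log entry.
            continue
        if not isinstance(entry, dict):
            continue
        message = entry.get("msg", "")
        if isinstance(message, str) and message.startswith(STDOUT_PREFIX):
            printed.append(message[len(STDOUT_PREFIX):])
    return printed
```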
Get Orchestration Task Logs
The list of tasks in the Wallaroo instance is retrieved through the Wallaroo Client list_tasks() method. The task list returned is based on the workspaces the user is a member of and the filter parameters, in reverse chronological order.
Admin users have unrestricted access to all workspaces. For more details, see Wallaroo Enterprise User Management.
The following demonstrates retrieving the task list, and how to assign a task to a variable.
wl.list_tasks()
| id | name | last run status | type | active | schedule | created at | updated at | workspace id | workspace name | 
|---|---|---|---|---|---|---|---|---|---|
| e44070f4-2638-4778-9d87-b13d457181ec | simpletaskdemo | running | Temporary Run | True | - | 2024-16-Jul 19:31:38 | 2024-16-Jul 19:31:44 | 30 | simpleorchestrationworkspace2 | 
| 5cd594fe-36fd-4db5-9000-8b090a8fa9e3 | simple_inference_schedule | running | Scheduled Run | True | */5 * * * * | 2024-16-Jul 19:18:07 | 2024-16-Jul 19:18:08 | 28 | simpleorchestrationworkspace | 
| 2de50c93-dbe3-45af-ae9d-657540275405 | simpletaskdemo | success | Temporary Run | True | - | 2024-16-Jul 19:15:47 | 2024-16-Jul 19:17:39 | 28 | simpleorchestrationworkspace | 
| 01e13d2e-a402-4b43-b790-ab76148bba51 | simpletaskdemo | failure | Temporary Run | True | - | 2024-16-Jul 19:03:05 | 2024-16-Jul 19:03:32 | 28 | simpleorchestrationworkspace | 
task = wl.list_tasks()[0]
Orchestration task logs are retrieved via the Wallaroo SDK. For full details, see Task Run Logs.
The output of a task is displayed with the Task Run logs() method that takes the following parameters.
| Parameter | Type | Description | 
|---|---|---|
| limit | Integer (Optional) | Limits the lines returned from the task run log. The limit parameter is based on the log tail: it starts from the last line of the log file, then works up until the limit of lines is reached. This is useful for viewing final outputs, exceptions, etc. | 
The Task Run logs() method returns the log entries as a list of strings, with each entry as an item in the list. Task runs are retrieved with the task's last_runs() method. For example, if the task has run 5 times, the runs are available as last_runs()[0], last_runs()[1], etc.
- IMPORTANT NOTE: It may take around a minute for task run logs to be integrated into the Wallaroo log database.
import time

# give the task time to complete and the log entries time to be stored
time.sleep(60)
recent_run = task.last_runs()[0]
# display() is available by default in Jupyter notebooks
display(recent_run.logs())
2023-22-May 19:59:29 Getting the pipeline orchestrationpipelinetgiq
2023-22-May 19:59:29 Getting arrow table file
2023-22-May 19:59:29 Inference time.  Displaying results after.
2023-22-May 19:59:29 pyarrow.Table
2023-22-May 19:59:29 time: timestamp[ms]
2023-22-May 19:59:29 in.tensor: list<item: float> not null
2023-22-May 19:59:29   child 0, item: float
2023-22-May 19:59:29 out.variable: list<inner: float not null> not null
2023-22-May 19:59:29 anomaly.count: int8
2023-22-May 19:59:29   child 0, inner: float not null
2023-22-May 19:59:29 ----
2023-22-May 19:59:29 time: [[2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,...,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767,2023-05-22 19:58:49.767]]
2023-22-May 19:59:29 in.tensor: [[[4,2.5,2900,5505,2,...,2970,5251,12,0,0],[2,2.5,2170,6361,1,...,2310,7419,6,0,0],...,[3,1.75,2910,37461,1,...,2520,18295,47,0,0],[3,2,2005,7000,1,...,1750,4500,34,0,0]]]
2023-22-May 19:59:29 check_failures: [[0,0,0,0,0,...,0,0,0,0,0]]
2023-22-May 19:59:29 out.variable: [[[718013.75],[615094.56],...,[706823.56],[581003]]]
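The fixed time.sleep(60) above waits the full minute even when the logs arrive sooner. The following is a minimal sketch of a polling alternative. The helper name wait_for_logs is hypothetical, and it assumes logs() returns an empty result until entries reach the log database, which is not confirmed here.

```python
import time


def wait_for_logs(fetch_logs, timeout_seconds=120.0, poll_seconds=5.0, sleep=time.sleep):
    """Call fetch_logs() until it returns a non-empty result or the timeout passes.

    `fetch_logs` is any zero-argument callable, for example
    lambda: task.last_runs()[0].logs(limit=20).
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        logs = fetch_logs()
        # Assumption: an empty result means the logs are not yet available.
        if logs or time.monotonic() >= deadline:
            return logs
        sleep(poll_seconds)
```

The limit parameter in the usage comment is the one described in the parameter table above.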
Get Orchestration Upload Logs
During the Orchestration upload process, the packaging and conversion process is viewable through the following procedures using kubectl.
- During the orchestration upload process, use kubectl get namespaces to identify the namespace used. It will be in the format tasks-{identifier}. For example, in the following, tasks-1f26cbf9-ccc9-4c-20 is 1 second old, indicating it is the task we want for a recent upload orchestration command.
  kubectl get namespaces
  NAME                                    STATUS   AGE
  api-pipeline-with-models-11             Active   23d
  cellai-pipeline-sdk-10                  Active   28d
  default                                 Active   588d
  gke-managed-system                      Active   588d
  gke-managed-volumepopulator             Active   159d
  gmp-public                              Active   588d
  gmp-system                              Active   588d
  kube-node-lease                         Active   588d
  kube-public                             Active   588d
  kube-system                             Active   588d
  metrics-retrieval-tutorial-pipeline-9   Active   30d
  simpleorchestrationtutorial-12          Active   3m22s
  tasks-1f26cbf9-ccc9-4c-20               Active   1s   <--- OUR TASK
  velero                                  Active   512d
  wallaroo                                Active   140d
- Use the kubectl get pods -n {namespace} command to retrieve a list of the pods used. For example:
  kubectl get pods -n tasks-1f26cbf9-ccc9-4c-20
  NAME                                        READY   STATUS    RESTARTS   AGE
  1f2-exec-orch-packaging-arb-one-exe-76r4f   1/1     Running   0          25s
- Use the command kubectl logs -f {POD NAME} -n {NAMESPACE} to retrieve and continue to follow the process. For example:
  kubectl logs -f 1f2-exec-orch-packaging-arb-one-exe-76r4f -n tasks-1f26cbf9-ccc9-4c-20
  +++ dirname /build/mkenv.sh
  ++ realpath /build
  + BUILD_PATH=/build
  + MINIO_NAME=minio
  + MINIO_HOST=http://minio.wallaroo.svc.cluster.local:9000
  + MINIO_BASE=model-bucket
  + MINIO_USER=minio
  + MINIO_PASS=AYf47eeewd67ELd0kyzisr2Zcm35zhRg
  + SCRIPT_PATH=/tmp
  + NATS_SUBJECT_SUB=orch_packaging.start
  + NATS_SUBJECT_PUB=orch_packaging.update
  + READY_NATS_SUBJECT_PUB=orch_packaging.ready
  + VENV_PATH=/home/jovyan/venv
  + PYPI_INDEX_ARGS=
  + '[' -n '' ']'
  + echo 'No private PyPI configuration found, using default public PyPI'
  No private PyPI configuration found, using default public PyPI
  + trap 'cat /build/start.json | jq -cM '\''.data.status="failure"'\'' | nats -s nats.wallaroo.svc.cluster.local publish --force-stdin orch_packaging.update' ERR
  + sleep 5
  + nats -s nats.wallaroo.svc.cluster.local sub orch_packaging.start -r --count=1
  + nats -s nats.wallaroo.svc.cluster.local publish --force-stdin orch_packaging.ready
  ++ date -u --iso-8601=ns
  ++ sed s/+00:00/Z/
  ++ sed s/,/./
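The first step above picks the tasks-{identifier} namespace by its age. The following is a minimal Python sketch of that selection from saved kubectl get namespaces output. The function names age_to_seconds and newest_tasks_namespace are hypothetical, and the sketch assumes ages use the s, m, h, d and y suffixes seen in kubectl output such as 3m22s.

```python
import re

# Approximate seconds per unit; a year is treated as 365 days.
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "y": 365 * 86400}


def age_to_seconds(age: str) -> int:
    """Convert a kubectl AGE value such as '3m22s' or '4d2h' to seconds."""
    parts = re.findall(r"(\d+)([smhdy])", age)
    return sum(int(number) * _UNIT_SECONDS[unit] for number, unit in parts)


def newest_tasks_namespace(kubectl_output: str):
    """Return the youngest namespace whose name starts with 'tasks-', or None."""
    candidates = []
    for line in kubectl_output.splitlines():
        fields = line.split()
        # Columns are NAME, STATUS, AGE; extra trailing text is ignored.
        if len(fields) >= 3 and fields[0].startswith("tasks-"):
            candidates.append((age_to_seconds(fields[2]), fields[0]))
    return min(candidates)[1] if candidates else None
```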