The following tutorial is available on the Wallaroo GitHub Repository.
The Yolov8 computer vision model is used for fast recognition of objects in images. This tutorial demonstrates how to deploy a Yolov8n pre-trained model into a Wallaroo Ops server and perform inferences on it.
Wallaroo Ops Center provides the ability to publish Wallaroo pipelines to an Open Container Initiative (OCI) compliant registry, then deploy those pipelines on edge devices as Docker containers or Kubernetes pods. See Wallaroo SDK Essentials Guide: Pipeline Edge Publication for full details.
For this tutorial, the helper modules CVDemoUtils and WallarooUtils are used to transform a sample image into a pandas DataFrame. This DataFrame is then submitted to the Yolov8n model deployed in Wallaroo.
This tutorial relies on Computer Vision Yolov8n Pipeline Publish in Wallaroo being run first to create the publish.
This demonstration follows these steps:

- Connect to the Wallaroo Ops instance and retrieve the pipeline publish.
- Deploy the published pipeline on an edge device through Docker.
- Verify the edge deployment through its /pipelines and /models endpoints.
- Perform an inference through the edge deployed pipeline.
To run this tutorial in the Wallaroo JupyterHub Service, install the tensorflow-cpu library by executing the following command in the terminal shell:
pip install tensorflow-cpu==2.13.1 --user
Then proceed with the tutorial. This only applies to running this tutorial in Wallaroo’s JupyterHub service, and does not affect model upload and packaging in Wallaroo.
The first step is loading the required libraries including the Wallaroo Python module.
# Import Wallaroo Python SDK
import wallaroo
from wallaroo.object import EntityNotFoundError
from wallaroo.framework import Framework
from CVDemoUtils import CVDemo
from WallarooUtils import Util
cvDemo = CVDemo()
util = Util()
# used to display DataFrame information without truncating
from IPython.display import display
import pandas as pd
pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_columns', None)
The next step is to connect to Wallaroo through the Wallaroo client. The Python library is included in the Wallaroo install and available through the JupyterHub interface provided with your Wallaroo environment.
This is accomplished using the wallaroo.Client() command, which provides a URL to grant the SDK permission to your specific Wallaroo environment. When displayed, enter the URL into a browser and confirm permissions. Store the connection into a variable that can be referenced later.
If logging into the Wallaroo instance through the internal JupyterHub service, use wl = wallaroo.Client(). For more information on Wallaroo Client settings, see the Client Connection guide.
wl = wallaroo.Client()
We retrieve the publish for this specific pipeline through the pipeline.publishes() method. We'll set the current workspace, retrieve the pipeline, then list its publishes.
model_name = 'yolov8n'
model_filename = './models/yolov8n.onnx'
pipeline_name = 'yolo8demonstration'
workspace_name = 'yolo8-edge-demonstration'
workspace = wl.get_workspace(name=workspace_name, create_if_not_exist=True)
wl.set_current_workspace(workspace)
pipeline = wl.get_pipeline(pipeline_name)
# list the publishes from this pipeline
pipeline.publishes()
id | pipeline_version_name | engine_url | pipeline_url | created_by | created_at | updated_at |
---|---|---|---|---|---|---|
2 | 960dc42c-2aaf-4b18-81b1-4d2a8027eb75 | ghcr.io/wallaroolabs/doc-samples/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini:v2024.4.0-6077 | ghcr.io/wallaroolabs/doc-samples/pipelines/yolo8demonstration:960dc42c-2aaf-4b18-81b1-4d2a8027eb75 | john.hansarick@wallaroo.ai | 2025-29-Apr 20:11:29 | 2025-29-Apr 20:11:29 |
# display the publish details
pipeline.publishes()[0]
| Field | Value |
|---|---|
| ID | 2 |
| Pipeline Name | yolo8demonstration |
| Pipeline Version | 960dc42c-2aaf-4b18-81b1-4d2a8027eb75 |
| Status | Published |
| Engine URL | ghcr.io/wallaroolabs/doc-samples/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini:v2024.4.0-6077 |
| Pipeline URL | ghcr.io/wallaroolabs/doc-samples/pipelines/yolo8demonstration:960dc42c-2aaf-4b18-81b1-4d2a8027eb75 |
| Helm Chart URL | oci://ghcr.io/wallaroolabs/doc-samples/charts/yolo8demonstration |
| Helm Chart Reference | ghcr.io/wallaroolabs/doc-samples/charts@sha256:0b4d4dd37daed6628f03ea4d618bf5ae1538f8531ccd86522b141385c2a0e869 |
| Helm Chart Version | 0.0.1-960dc42c-2aaf-4b18-81b1-4d2a8027eb75 |
| Engine Config | {'engine': {'resources': {'limits': {'cpu': 1.0, 'memory': '1Gi'}, 'requests': {'cpu': 1.0, 'memory': '1Gi'}, 'arch': 'x86', 'accel': 'none', 'gpu': False}}, 'engineAux': {'images': {}, 'autoscale': {'type': 'none'}}} |
| User Images | [] |
| Created By | john.hansarick@wallaroo.ai |
| Created At | 2025-04-29 20:11:29.045179+00:00 |
| Updated At | 2025-04-29 20:11:29.045179+00:00 |
| Replaces | |
| Docker Run Command | Note: Please set the EDGE_PORT, OCI_USERNAME, and OCI_PASSWORD environment variables. |
| Helm Install Command | Note: Please set the HELM_INSTALL_NAME, HELM_INSTALL_NAMESPACE, OCI_USERNAME, and OCI_PASSWORD environment variables. |
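The individual fields of the publish are available as attributes on the publish object. The following is a minimal sketch for capturing the URLs a DevOps engineer needs for deployment; pipeline_url is demonstrated elsewhere in this tutorial, while engine_url is an assumption based on the publish detail fields shown above.

# Capture the publish URLs for handoff to the deployment step.
publish = pipeline.publishes()[0]

# pipeline_url is used earlier in this tutorial; engine_url is assumed
# to mirror the "Engine URL" field shown in the publish details.
print(f"PIPELINE_URL={publish.pipeline_url}")
print(f"Engine image: {publish.engine_url}")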
Once a pipeline is published to the Edge Registry service, a DevOps engineer can deploy it in environments such as Docker, Kubernetes, or similar container runtime services.
First, the DevOps engineer must authenticate to the same OCI Registry service used for the Wallaroo Edge Deployment registry.
For more details, check the documentation for your artifact service. Instructions are provided by the three major cloud services: Amazon Elastic Container Registry, Azure Container Registry, and Google Artifact Registry.
For the deployment, the engine URL is specified with the following environment variables:

- DEBUG (true|false): Whether to include debug output.
- OCI_REGISTRY: The URL of the registry service.
- CONFIG_CPUS: The number of CPUs to use.
- OCI_USERNAME: The edge registry username.
- OCI_PASSWORD: The edge registry password or token.
- PIPELINE_URL: The published pipeline URL.

From our published pipeline we have the docker run command. This command is used to deploy the publish.
docker run \
-p $EDGE_PORT:8080 \
-e OCI_USERNAME=$OCI_USERNAME \
-e OCI_PASSWORD=$OCI_PASSWORD \
-e CONFIG_CPUS=1.0 --cpus=1.0 --memory=1g \
-e PIPELINE_URL={your registry server}/pipelines/edge-cv-retail:bf70eaf7-8c11-4b46-b751-916a43b1a555 \
ghcr.io/wallaroolabs/doc-samples/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini:v2024.4.0-6077
docker_deploy = f'''
docker run \\
-p $EDGE_PORT:8080 \\
-e OCI_USERNAME=$OCI_USERNAME \\
-e OCI_PASSWORD=$OCI_PASSWORD \\
-e CONFIG_CPUS=1.0 --cpus=1.0 --memory=1g \\
-e PIPELINE_URL={pipeline.publishes()[0].pipeline_url} \\
ghcr.io/wallaroolabs/doc-samples/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini:v2024.4.0-6077
'''
print(docker_deploy)
docker run \
-p $EDGE_PORT:8080 \
-e OCI_USERNAME=$OCI_USERNAME \
-e OCI_PASSWORD=$OCI_PASSWORD \
-e CONFIG_CPUS=1.0 --cpus=1.0 --memory=1g \
-e PIPELINE_URL=ghcr.io/wallaroolabs/doc-samples/pipelines/yolo8demonstration:960dc42c-2aaf-4b18-81b1-4d2a8027eb75 \
ghcr.io/wallaroolabs/doc-samples/engines/proxy/wallaroo/ghcr.io/wallaroolabs/fitzroy-mini:v2024.4.0-6077
Once deployed, we can check the pipelines and models available. We'll use a curl command, but any HTTP-based request will work the same way.
The endpoint /pipelines returns:

- id: The pipeline name.
- status: Running, or Error if there are any issues.

curl localhost:8080/pipelines
{"pipelines":[{"id":"yolo8demonstration","status":"Running"}]}
The following example uses the host testboy.lan. Replace this with the host name of your own edge deployed pipeline.
!curl testboy.lan:8081/pipelines
{"pipelines":[{"id":"yolo8demonstration","version":"960dc42c-2aaf-4b18-81b1-4d2a8027eb75","status":"Running"}]}
The endpoint /models returns a List of models with the following fields:

- name: The model name.
- sha: The sha hash of the model.
- status: The model status.
- version: The model version.

curl localhost:8080/models
{"models":[{"name":"yolov8n","sha":"3ed5cd199e0e6e419bd3d474cf74f2e378aacbf586e40f24d1f8c89c2c476a08","status":"Running","version":"7af40d06-d18f-4b3f-9dd3-0a15248f01c8"}]}
The following example uses the host testboy.lan. Replace this with the host name of your own edge deployed pipeline.
!curl testboy.lan:8081/models
{"models":[{"sha":"3ed5cd199e0e6e419bd3d474cf74f2e378aacbf586e40f24d1f8c89c2c476a08","name":"yolov8n","version":"3fad4605-7384-4f26-9f4c-b6712140310f","status":"Running","model_version_id":17}]}
The inference endpoint takes the following pattern:
/infer
Wallaroo inference endpoint URLs accept the following data inputs through the Content-Type header:

- Content-Type: application/vnd.apache.arrow.file: For Apache Arrow tables.
- Content-Type: application/json; format=pandas-records: For pandas DataFrames in record format.
The endpoint returns Content-Type: application/json; format=pandas-records by default with the following fields:

- check_failures: The number of validation checks that failed.
- elapsed: A list of time in nanoseconds for serializing the input and for each pipeline step.
- model_name: The name of the model used.
- model_version: The version of the model.
- original_data: The original input data. Returns null if the input may be too long for a proper return.
- outputs: The outputs of the inference result.
- pipeline_name: The name of the pipeline.
- shadow_data: Any shadow deployed data inferences.
- time: The time the inference was made.
if the input may be too long for a proper return.Once deployed, we can perform an inference through the deployment URL. We’ll assume we’re running the inference request through the localhost and submitting the local file ./data/dogbike.df.json
. Note that our inference endpoint is pipelines/yolo8demonstration
- the same as our pipeline name.
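The request is shown below with curl, but it can also be made directly from Python. The following is a minimal sketch using the requests library, assuming the same host and input file as the curl example:

import requests

EDGE_HOST = "testboy.lan:8081"  # replace with your edge deployment host

# Submit the pandas-records payload to the edge inference endpoint.
with open('./data/dogbike.df.json', 'rb') as f:
    response = requests.post(
        f"http://{EDGE_HOST}/infer",
        headers={"Content-Type": "application/json; format=pandas-records"},
        data=f.read(),
    )
response.raise_for_status()

# Save the results for later processing.
with open('./edge-results.df.json', 'wb') as f:
    f.write(response.content)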
The following example demonstrates sending an inference request to the edge deployed pipeline and storing the results in a pandas DataFrame in record format. The results can then be exported to other processes to render the detected images or other use cases.
!curl -X POST testboy.lan:8081/infer \
-H "Content-Type: application/json; format=pandas-records" \
--data @./data/dogbike.df.json > edge-results.df.json
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 38.0M  100 22.9M  100 15.0M  4760k  3133k  0:00:04  0:00:04 --:--:-- 7631k
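The saved results can then be loaded back into a pandas DataFrame for further processing, such as rendering the detected objects with the CVDemo helpers. A minimal sketch, assuming the edge-results.df.json file created by the request above:

import pandas as pd
from IPython.display import display

# Load the inference results saved by the request above.
results = pd.read_json('./edge-results.df.json', orient='records')

# Inspect the fields returned by the edge inference endpoint.
print(results.columns.tolist())
display(results.head(1))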