Custom Model Best Practices
The following guide details the best practices for Wallaroo's Custom Model framework, also known as Bring Your Own Model (BYOP). It outlines the recommended order of operations to test, validate, and deploy a BYOP model efficiently.
It is recommended that scripts are tested locally (or on a virtual machine) to verify the Custom Model works as expected before uploading to Wallaroo. This helps find issues before uploading and deploying the model in Wallaroo.
Some of the troubleshooting checks require administrative access to the cluster hosting the Wallaroo installation and the kubectl command.
Custom Model Basics
Custom Model (BYOP) models are uploaded to Wallaroo via a ZIP file with the following components:
| Artifact | Type | Description |
|---|---|---|
| Python scripts (.py files) with classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder | Python Script (Required) | Extend the classes mac.inference.Inference and mac.inference.creation.InferenceBuilder. These are included with the Wallaroo SDK. Further details are in Custom Model Script Requirements. Note that there are no specified naming requirements for the classes that extend mac.inference.Inference and mac.inference.creation.InferenceBuilder; any valid class name is sufficient as long as these two classes are extended as described. |
| requirements.txt | Python requirements file (Required) | This sets the Python libraries used for the Custom Model. Best practice is to lock the requirements to a specific version; for example sample-library==1.2.1 instead of sample-library. These libraries should target Python 3.10 compliance. The requirements and the library versions should be exactly the same between creating the model and deploying it in Wallaroo. This ensures that the script and methods function exactly the same as during the model creation process. This file is required; even if the requirements.txt file is empty, the file must be present and be the only requirements.txt file included in the ZIP file. |
| artifacts folder | Folder (Optional) | The artifacts folder in the root directory of the ZIP file contains additional files, directories, etc. that are not part of the inference process but may be used for other purposes (for example, other Python scripts, training artifacts, or documentation). This directory is ignored by the Wallaroo validation process at model upload. This directory is not required. |
| Additional Files and Folders | Files (Optional) | Other models, files, and artifacts used in support of this model. Note that any items outside of the artifacts folder in the root directory are subject to validation based on the rules above. |
For full details on Custom Models requirements and use, see Custom Model.
Test the Custom Model
Before uploading the Custom Model to Wallaroo, follow these best practices to verify the Custom Model performs as expected.
Prepare the Testing Environment
This step prepares the testing environment, where the Custom Model is verified before uploading to Wallaroo. This environment is created via the following steps.
- Create a Virtual Environment. For this step, use conda or venv to create a new environment with the following conditions:
  - Python Version: Set to Python 3.10 (e.g., 3.10.16).
- Wallaroo SDK: Install the wallaroo package via pip. Ensure the version matches your Wallaroo instance version.
For example, using conda, the command would be:
conda create -n "byop-test-environment" python=3.10
Then install Wallaroo with the command:
pip install wallaroo==2025.2.1
Prepare the Model Artifact
The BYOP framework handles most native Python code, but specific formatting is required for production stability. Use the following process to create the proper structure.
- Gather Packages: Create a requirements.txt file listing all necessary Python packages.
  - Note: Ensure these packages are compatible with Linux-based environments (packages with C-extensions must be Linux-compatible).
- Folder Structure: Create a clean working directory with the following structure:
  - byop/ (Contains your model script and artifacts)
    - requirements.txt: REQUIRED. A list of the Python libraries required by this Custom Model that are not provided by default with the Wallaroo SDK. Best practice is to lock by version; for example sample-library==1.2.4 instead of sample-library.
    - artifacts/: OPTIONAL. Additional files and folders that are not part of the inference process but are included for other purposes. Any files or folders in this directory are ignored by the Custom Model validation process.
  - data/ (Contains testing inputs/outputs)
  - schemas/ (Contains Arrow schema files)
  - deployment.ipynb (Or test.py for local orchestration)
IMPORTANT NOTE
Before running any scripts, ensure that there are no hidden files in the folder. Common hidden files to check for and remove include:
- .venv directories
- .DS_Store files
- __pycache__ directories
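One way to check for and remove these before packaging is a short cleanup script. The following is a minimal sketch; the byop path refers to the working directory described above and should be adjusted to the folder being packaged:

import shutil
from pathlib import Path

# Working directory to clean before zipping (adjust as needed).
root = Path("byop")

# Remove common hidden files and directories that should not ship in the ZIP file.
for pattern in (".venv", "__pycache__", ".DS_Store"):
    for match in root.rglob(pattern):
        if match.is_dir():
            shutil.rmtree(match)
        else:
            match.unlink()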
Prepare Test Data
It is highly recommended to use a Python script, for example test.py, that contains the Custom Model mac.inference.Inference and mac.inference.creation.InferenceBuilder extensions. For best results, define the inputs and expected outputs to ensure the “contract” is clear.
This script should include all steps for running the model locally, from loading and shaping data to performing the inference to dealing with outputs.
- Input/Output Relationship: Ensure a strictly 1:1 relationship between input rows and output rows. Wallaroo does not support many-to-one or one-to-many inference mapping in this context.
- Data Format:
  - Format data as a dictionary of numpy arrays.
  - Each key represents a field name; the value is the numpy array of data.
  - This replicates the InferenceData format used by the Wallaroo backend.
- Action: Verify the output data format from test.py agrees with all Wallaroo requirements before proceeding to the next step.
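The following is a minimal sketch of that data format. The input_number and id fields are hypothetical examples (the same field names used in the sample test script later in this guide); replace them with your own fields:

import pandas as pd

# Hypothetical test inputs; replace with your own field names and values.
input_df = pd.DataFrame({"input_number": [1, 2, 3], "id": [101, 102, 103]})

# InferenceData-style format: a dictionary of numpy arrays, one key per field.
input_dictionary = {col: input_df[col].to_numpy() for col in input_df.columns}

# Every array has the same length, preserving the strict 1:1 row relationship.
assert len({len(arr) for arr in input_dictionary.values()}) == 1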
Develop the Inference Builder Class
Create a script (e.g., custom_model.py) inside the byop folder. This will be the main entry point.
The InferenceBuilder class loads artifacts efficiently to prevent high latency during inference. To achieve this:
- Implement the create function.
  - Inside create, call a helper function (e.g., load_artifacts()) to load heavy files: model weights, static lookup tables, explainers, or configuration files.
  - Do not call load_artifacts from the predict function. Doing so causes the artifacts to reload on every single inference request, significantly degrading performance.
  - Store these artifacts in a dictionary and assign them to inference.model.

A combined sketch of the builder and inference classes is provided at the end of the next section.
Develop the Inference Class
The Inference class manages the inference process, taking in the data, processing the inference request and returning the results.
The predict(input_data: InferenceData) -> InferenceData method is the entry point for execution and is responsible for the following:

- Retrieving artifacts from self.model['artifact_key'].
- Accepting input data as a dictionary of numpy arrays.
The following are best practices for the Inference class:

- Code Organization:
  - Keep the predict method readable.
  - Extract complex logic into helper functions located in a separate script (e.g., utils.py) within the byop folder.
- Error Handling & Logging:
  - Wrap model steps in try/except blocks.
  - Import the traceback module.
  - Use a logger (instantiated at the script level) to capture errors: logger.error(traceback.format_exc()).
  - Timing: Log the execution time of critical functions to assist with latency debugging.
- Return Values:
  - Return a dictionary of numpy arrays.
  - Ensure data types match the expected schema.
  - Data cannot be nested dictionaries.
  - Nested JSON should be converted to a string and returned.
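The following is a minimal sketch of a custom_model.py that pulls these practices together. The class names (SampleInference, SampleInferenceBuilder), the load_artifacts helper, and the sample_model.joblib artifact are hypothetical examples, and the exact abstract methods required by the mac.inference base classes may vary by SDK version; see Custom Model Script Requirements for the authoritative interface.

import logging
import time
import traceback
from pathlib import Path

import joblib  # assumption: the model weights are stored as a joblib artifact
import numpy as np

from mac.config.inference import CustomInferenceConfig
from mac.inference import Inference
from mac.inference.creation import InferenceBuilder
from mac.types import InferenceData

# Logger instantiated at the script level, per the logging guidance above.
logger = logging.getLogger(__name__)


def load_artifacts(model_path: Path) -> dict:
    # Load heavy files (weights, lookup tables, etc.) once at creation time,
    # so predict() never pays the loading cost.
    return {"model": joblib.load(model_path / "sample_model.joblib")}


class SampleInference(Inference):
    @property
    def expected_model_types(self) -> set:
        # The artifacts are stored as a plain dictionary.
        return {dict}

    @Inference.model.setter
    def model(self, model) -> None:
        self._raise_error_if_model_is_wrong_type(model)
        self._model = model

    def _predict(self, input_data: InferenceData) -> InferenceData:
        # Implementation hook; the base class exposes the public predict()
        # entry point described above.
        try:
            start = time.perf_counter()
            model = self.model["model"]  # retrieve the artifact stored by the builder
            predictions = model.predict(input_data["input_number"].reshape(-1, 1))
            logger.info("prediction took %.4f seconds", time.perf_counter() - start)
            # Return a dictionary of numpy arrays: one output row per input row.
            return {"result": np.asarray(predictions).flatten(), "id": input_data["id"]}
        except Exception:
            logger.error(traceback.format_exc())
            raise


class SampleInferenceBuilder(InferenceBuilder):
    @property
    def inference(self) -> SampleInference:
        return SampleInference()

    def create(self, config: CustomInferenceConfig) -> SampleInference:
        inference = self.inference
        # Load artifacts once and attach them as a dictionary.
        inference.model = load_artifacts(config.model_path)
        return inference

In the sample test script below, the equivalent builder class is imported as BYOPInferenceBuilder from byop/byop.py; the file and class names are up to you, as noted in Custom Model Basics.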
Local Verification
Use the test.py script to verify the sample Custom Model script as follows:
- Convert Results: In your test.py, convert the result dictionary to a Pandas DataFrame for easy inspection.
- Shape Check: Verify the input row count matches the output row count exactly.
- Value Check: Validate that the inference results match expected values.
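A minimal sketch of these checks follows. It assumes results is the dictionary returned by inference.predict and input_df is the test input DataFrame (as in the sample script below); expected_df is a hypothetical DataFrame of known-good outputs:

import pandas as pd

# Convert the result dictionary of numpy arrays to a DataFrame for inspection.
results_df = pd.DataFrame({key: value.tolist() for key, value in results.items()})

# Shape check: strictly one output row per input row.
assert len(results_df) == len(input_df), "input and output row counts differ"

# Value check: inference results match the expected values (expected_df is hypothetical).
pd.testing.assert_frame_equal(
    results_df[["result"]].reset_index(drop=True),
    expected_df[["result"]].reset_index(drop=True),
    check_dtype=False,
)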
Sample Test Script
The following example shows a test script that performs the following:
- Loads sample data.
- Loads any modules used by the Custom Model.
- Executes an inference request and displays the results and the data shape.
# import libraries
from pathlib import Path
from mac.config.inference import CustomInferenceConfig
from byop.byop import BYOPInferenceBuilder
import pandas as pd
import pyarrow as pa
# create the input dataframe and convert to dictionary for testing
input_df = pd.DataFrame({"input_number": [1,2,3],
"id": [20000000004093819,20012684296980773,481562342]
})
input_dictionary = {
col: input_df[col].to_numpy() for col in input_df.columns
}
print(input_dictionary)
# prepare the BYOP and import any modules
builder = BYOPInferenceBuilder()
config = CustomInferenceConfig(
framework="custom",
model_path=Path("./byop/"), modules_to_include={"*.py"}
)
# create the BYOP object
inference = builder.create(config)
# run a simulated inference
results = inference.predict(input_data=input_dictionary)
print(results)
# Schema Generation
# convert results into a dataframe
results_df = pd.DataFrame({
key : value.tolist() for key, value in results.items()
})
input_schema = pa.Schema.from_pandas(input_df).remove_metadata()
output_schema = pa.Schema.from_pandas(results_df).remove_metadata()
print(input_schema)
print(output_schema)
The following shows the results of this script:
python test.py
{'input_number': array([1, 2, 3]), 'id': array([20000000004093819, 20012684296980773, 481562342])}
INFO byop.byop - INFO: Starting prediction process byop.py:36
INFO byop.byop - INFO: Gathering of input data features: 2 byop.py:38
INFO byop.byop - INFO: Converting input data to DataFrame byop.py:41
INFO byop.byop - INFO: Running model prediction byop.py:48
INFO byop.byop - INFO: Predictions completed. byop.py:61
{'result': array([21, 22, 23]), 'id': array([20000000004093819, 20012684296980773, 481562342])}
input_number: int64
id: int64
result: int64
id: int64
Schema Generation
Once the test script verifies the Custom Model works as defined:
- Generate the Input and Output schemas using the verified data.
- Save these as PyArrow schemas in your schemas folder.
  - These schemas are mandatory during the Wallaroo model upload process.
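One way to persist the schemas generated in the test script is to serialize them to the schemas folder, as in the following sketch. The file names are arbitrary, and input_schema and output_schema are the variables created above:

import pyarrow as pa

# Persist the schemas generated by the test script for reuse at upload time.
with open("schemas/input_schema.arrow", "wb") as f:
    f.write(input_schema.serialize())
with open("schemas/output_schema.arrow", "wb") as f:
    f.write(output_schema.serialize())

# Later, reload a schema before calling the model upload command.
with open("schemas/input_schema.arrow", "rb") as f:
    input_schema = pa.ipc.read_schema(pa.BufferReader(f.read()))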
For more details, see:
Packaging and Deployment
With the Custom Model testing complete, it is ready for packaging and uploading to Wallaroo.
Zipping the Artifacts
Custom Model (BYOP) models are uploaded to Wallaroo via a ZIP file containing the components described in Custom Model Basics above: the Python scripts extending mac.inference.Inference and mac.inference.creation.InferenceBuilder, the required requirements.txt file, the optional artifacts folder, and any additional supporting files.
The following shows a sample Custom Model's artifacts and the zip command for packaging them.
sample_model/
requirements.txt
sample_model.h5
custom_model.py
artifacts/
additional_file.txt
another_file.txt
zip -r sample_model.zip sample_model/*
Once zipped, use the Model Upload guide to upload the model into Wallaroo. The following sample uses the Wallaroo SDK to upload the model sample-model from the file sample_model.zip, and assumes the input_schema and output_schema variables are properly defined.
# import the Wallaroo SDK
import wallaroo
# set the Wallaroo client
wl = wallaroo.Client()
# upload the model
model = wl.upload_model("sample-model",
path="sample_model.zip",
framework=wallaroo.framework.Framework.CUSTOM,
input_schema=input_schema,
output_schema=output_schema)
Model Upload Troubleshooting
Use the following guides to resolve issues that may arise. Some of these require use of the kubectl terminal application and administrative access to the cluster hosting the Wallaroo installation.
Namespace Check
If deployment stalls, run kubectl get ns. Look for a namespace named task-{number} (usually the most recent one). Use the kubectl command to get the pods and show the logs. For example:
Get the pods in the namespace:
kubectl get pods -n task-sample-namespace
Get the logs of a specific pod. The option -f is to continue following the log output.
kubectl logs -f sample-pod-name -n task-sample-namespace
Deployment and Inference Testing
With the model uploaded, the next step is testing the actual deployment of the model. Deploying a model follows the same procedure as Model Deploy:
- Create a pipeline and assign the model as a pipeline step.
- Define the deployment configuration and assign how many CPUs, how much memory, etc. will be allocated. Custom Models are assigned to the Wallaroo Containerized Runtime, and their resource allocations are set using the sidekick methods (sidekick_cpus, sidekick_memory, etc.) as defined in Deployment Configuration with the Wallaroo SDK.
- Issue the deploy command with the deployment configuration.
The following is a simplified version of this process for a sample Custom Model previously uploaded, with the model version saved to the variable model. The model is assigned 4 CPUs and 2 Gi of memory through the sidekick_cpus and sidekick_memory options.
Note that this model is a sample; large models may require more resources, GPUs, or other configurations.
# import the deployment configuration builder
from wallaroo.deployment_config import DeploymentConfigBuilder

# set the deployment configuration
deployment_config = DeploymentConfigBuilder() \
.cpus(1).memory('2Gi') \
.sidekick_cpus(model, 4) \
.sidekick_memory(model, '2Gi') \
.build()
# build the pipeline and assign the model as a pipeline step
pipeline = wl.build_pipeline("sample-pipeline")
pipeline.add_model_step(model)
# deploy the pipeline with the deployment configuration
pipeline.deploy(deployment_config=deployment_config)
Verify the pipeline is deployed via the pipeline.status() method. If there are no issues, the deployment status will first show Starting, and when deployment is complete the status is Running. If there are errors, the status is Error. The following is an example of a successful deployment. Note that sample-model is shown as running under both the engines and sidekicks sections.
pipeline.status()
{'status': 'Running',
'details': [],
'engines': [{'ip': '10.240.5.6',
'name': 'engine-7bd8d4664d-69qfx',
'status': 'Running',
'reason': None,
'details': [],
'pipeline_statuses': {'pipelines': [{'id': 'sample-pipeline',
'status': 'Running',
'version': 'c27736f6-0ee2-4ca0-9982-9845d2d5f756'}]},
'model_statuses': {'models': [{'name': 'sample-model',
'sha': 'ffa1a170b0e1628924c18a96c44f43c8afef1e535e378c2eb071a61dd282c669',
'status': 'Running',
'version': '4d3f402d-e242-409f-8678-29c18f59a4a8'}]}}],
'engine_lbs': [{'ip': '10.240.5.7',
'name': 'engine-lb-776bbf49b9-rb5mt',
'status': 'Running',
'reason': None,
'details': []}],
'sidekicks': [{'ip': '10.240.5.8',
'name': 'engine-sidekick-sample-model-99-55d95d96f5-gjml9',
'status': 'Running',
'reason': None,
'details': [],
'statuses': '\n'}]}
Troubleshooting Custom Model Deployment
If an error is returned, one method to check for errors is to review the Kubernetes logs through the following procedure:
- Identify the namespace for the pipeline in the format {pipeline-name}-{id}. For example, the pipeline sample-pipeline has the namespace sample-pipeline-48:

  kubectl get namespaces
  NAME                 STATUS        AGE
  default              Active        153d
  edge-pipeline-6      Terminating   22m
  kube-node-lease      Active        153d
  kube-public          Active        153d
  kube-system          Active        153d
  sample-pipeline-48   Active        53s
  velero               Active        140d
  wallaroo             Active        4d2h

- Check the pods and identify the sidekick pod - this is the pod the Custom Model is run in.

  kubectl get pods -n sample-pipeline-48
  NAME                                            READY   STATUS        RESTARTS   AGE
  engine-85cb7b6bd8-bqf6v                         1/1     Running       0          17s
  engine-d4cfb84c8-fs89h                          1/1     Terminating   0          2m55s
  engine-lb-584f54c899-qj55z                      1/1     Running       0          2m55s
  helm-runner-s5t5k                               0/1     Completed     0          2m56s
  engine-sidekick-sample-model-75dc59c578-sld6m   1/1     Running       0          2m56s

- Check the sidekick pod logs via the kubectl logs command. The following example shows the command. The option -f is to continue following the log output.

  kubectl logs -f engine-sidekick-sample-model-75dc59c578-sld6m -n sample-pipeline-48
Additional logging options are described in The Wallaroo.AI Cheat Sheet: Retrieve Pipeline System Logs.
Inference Testing
After uploading to Wallaroo, perform sample inference tests using the same input data as the local tests and verify the data returned has the same shape and format as the local test. Note that the local test inputs and outputs are likely dictionaries of numpy arrays, while Wallaroo inference inputs and outputs are either pandas DataFrames or Apache Arrow Tables. For example:
Local Inference Test:
# run a simulated inference
results = inference.predict(input_data=input_dictionary)
print(results)
{'result': array([21, 22, 23]),
'id': array([20000000004093819, 20012684296980773, 481562342])}
Wallaroo Inference Test:
print(pipeline.infer(input_df))
| | time | in.id | in.input_number | out.id | out.result | anomaly.count |
|---|---|---|---|---|---|---|
| 0 | 2026-01-28 21:51:37.363 | 20000000004093819 | 1 | 20000000004093819 | 21 | 0 |
| 1 | 2026-01-28 21:51:37.363 | 20012684296980773 | 2 | 20012684296980773 | 22 | 0 |
| 2 | 2026-01-28 21:51:37.363 | 481562342 | 3 | 481562342 | 23 | 0 |
Inference Troubleshooting
The following tips can help resolve inference issues.
- Status Check: The pipeline status, checked via the SDK command pipeline.status() (replace pipeline with the name of the variable used in the SDK), may report Running before the sidekick is fully ready. If the first inference fails immediately, wait 60 seconds and retry.
- Failure Analysis:
  - If inference fails, check the sidekick logs using the process described in Troubleshooting Custom Model Deployment. Additional logging options are described in The Wallaroo.AI Cheat Sheet: Retrieve Pipeline System Logs.
  - Look for the traceback and logger messages added in Develop the Inference Class.
- Connectivity: If the sidekick shows no activity, check the engine-lb (Load Balancer) and engine pod logs for errors. This usually indicates a data formatting issue preventing the request from reaching the sidekick. For example:

  kubectl get pods -n sample-pipeline-48
  NAME                                            READY   STATUS      RESTARTS   AGE
  engine-85cb7b6bd8-bqf6v                         1/1     Running     0          17s
  engine-lb-584f54c899-qj55z                      1/1     Running     0          2m55s
  helm-runner-s5t5k                               0/1     Completed   0          2m56s
  engine-sidekick-sample-model-75dc59c578-sld6m   1/1     Running     0          2m56s

  Get the engine logs:

  kubectl logs engine-85cb7b6bd8-bqf6v -n sample-pipeline-48

  Get the engine-lb logs:

  kubectl logs engine-lb-584f54c899-qj55z -n sample-pipeline-48