sklearn and XGBoost Regression Auto-Conversion Tutorial Within Wallaroo

How to use the Wallaroo convert_model method with sklearn and XGBoost models.

Auto-Conversion And Upload Tutorial

Machine Learning (ML) models can be converted and uploaded into a Wallaroo workspace using the Wallaroo Client convert_model(path, source_type, conversion_arguments) method. This conversion process transforms the model into an open format that can be run across different frameworks at compiled C-language speeds.

The three input parameters are:

  • path (STRING): The path to the ML model file.
  • source_type (ModelConversionSource): The type of ML model to be converted. Wallaroo auto-conversion currently supports the following source types and their associated ModelConversionSource:
    • sklearn: ModelConversionSource.SKLEARN
    • xgboost: ModelConversionSource.XGBOOST
  • conversion_arguments: The arguments for the conversion:
    • name: The name of the model being converted.
    • comment: Any comments for the model.
    • number_of_columns: The number of columns the model was trained for.
    • input_type: The ModelConversionInputType, typically Float or Double depending on the model.
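
As a rough illustration of the conversion arguments listed above, the sketch below checks a set of values before they are handed to the SDK. The ConversionArgsCheck class is hypothetical and is not part of the Wallaroo SDK; it only mirrors the four argument names from the list.

```python
from dataclasses import dataclass


# Hypothetical stand-in, not a Wallaroo SDK class: it only mirrors the four
# conversion arguments listed above so they can be sanity-checked before use.
@dataclass(frozen=True)
class ConversionArgsCheck:
    name: str
    comment: str
    number_of_columns: int
    input_type: str  # for example "Float" or "Double"; not validated here

    def problems(self):
        """Return a list of human-readable problems; an empty list means no issue was found."""
        found = []
        if not self.name.strip():
            found.append("name must not be empty")
        if not isinstance(self.number_of_columns, int) or self.number_of_columns < 1:
            found.append("number_of_columns must be a positive integer")
        return found


args = ConversionArgsCheck("sklearntest", "test linear regression", 25, "Double")
print(args.problems())
```

Running the check first is only a convenience: it surfaces an empty name or a non-positive column count before any conversion request is made.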

The following tutorial demonstrates how to convert a sklearn Linear Model and an XGBoost Regression Model, and upload them into a Wallaroo Workspace. The following files are provided for the tutorial:

  • sklearn-linear-model.pickle: A sklearn linear model. An example of training the model is provided in the Jupyter Notebook sklearn-linear-model-example.ipynb. The model was trained on 25 columns.
  • xgb_reg.pickle: An XGBoost regression model. An example of training the model is provided in the Jupyter Notebook xgboost-regression-model-example.ipynb. The model was trained on 25 columns.



Before starting, the following must be available:

  • The model to upload into a workspace.
  • The number of columns the model was trained for.
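
If the number of columns is not recorded anywhere, it can often be read back from the fitted model. The helper below is a sketch under assumptions: count_training_columns is a hypothetical name, and it relies on the n_features_in_ attribute that fitted scikit-learn estimators expose (XGBoost's scikit-learn wrapper is assumed to behave the same way), falling back to the length of coef_ for linear models.

```python
def count_training_columns(model):
    """Best-effort guess of how many input columns a fitted model expects."""
    # Fitted scikit-learn estimators record the column count as n_features_in_.
    n = getattr(model, "n_features_in_", None)
    if n is not None:
        return int(n)
    # Fallback for linear models: coef_ has one entry per input column
    # (or one row of entries per target when it is two-dimensional).
    coef = getattr(model, "coef_", None)
    if coef is not None:
        first = coef[0]
        return len(first) if hasattr(first, "__len__") else len(coef)
    raise ValueError("could not determine the number of columns; check the training data")


# Typical use with a pickled model (file name taken from this tutorial):
#   with open("sklearn-linear-model.pickle", "rb") as f:
#       print(count_training_columns(pickle.load(f)))
```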

Wallaroo supports the following model versions:

  • XGBoost: 1.6.2
  • sklearn: 1.1.2
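
One way to compare the local environment against the versions above is sketched below using only the standard library. The helper names are hypothetical, and the distribution names "scikit-learn" and "xgboost" are the package names as published on PyPI.

```python
from importlib import metadata

# Versions listed above as supported for auto-conversion.
SUPPORTED = {"scikit-learn": "1.1.2", "xgboost": "1.6.2"}


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_report(supported=SUPPORTED, lookup=installed_version):
    """Map each package name to a tuple of (installed, expected, matches)."""
    report = {}
    for package, expected in supported.items():
        found = lookup(package)
        report[package] = (found, expected, found == expected)
    return report


for package, (found, expected, ok) in version_report().items():
    status = "OK" if ok else "differs from the supported version"
    print(f"{package}: installed={found} expected={expected} -> {status}")
```

The comparison is an exact string match, which is deliberately strict; whether nearby versions also convert cleanly is not something this check can tell you.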

Import Libraries

Import the libraries that will be used for the auto-conversion process.

import pickle
import json

import wallaroo

from wallaroo.ModelConversion import ConvertSKLearnArguments, ConvertXGBoostArgs, ModelConversionSource, ModelConversionInputType
from wallaroo.object import EntityNotFoundError

# Verify the versions of sklearn and XGBoost used to generate the models

import sklearn
import sklearn.datasets

import xgboost as xgb

print(sklearn.__version__)
print(xgb.__version__)


The following sections connect to the Wallaroo instance, then either connect to an existing workspace or create a new one. For more details on working with workspaces, see the Wallaroo Workspace Management Guide.

Connect to Wallaroo

Connect to your Wallaroo instance.

# Client connection from local Wallaroo instance

wl = wallaroo.Client()

# SSO login through keycloak

# wallarooPrefix = "YOUR PREFIX"
# wallarooSuffix = "YOUR SUFFIX"

# wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}.api.{wallarooSuffix}", 
#                     auth_endpoint=f"https://{wallarooPrefix}.keycloak.{wallarooSuffix}", 
#                     auth_type="sso")

Set the Workspace

We’ll connect to or create the workspace testautoconversion and use it for our model testing.

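workspace_name = 'testautoconversion'

def get_workspace(name):
    # Return the workspace with the given name, creating it if it does not exist.
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

workspace = get_workspace(workspace_name)

# Make it the current workspace so converted models are uploaded into it.
wl.set_current_workspace(workspace)

wl.get_current_workspace()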


{'name': 'testautoconversion', 'id': 20, 'archived': False, 'created_by': '435da905-31e2-4e74-b423-45c38edb5889', 'created_at': '2023-02-27T20:05:40.6269+00:00', 'models': [], 'pipelines': []}
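
The get-or-create pattern can be tried without a Wallaroo instance. The sketch below is a variant that returns as soon as a match is found; the stub classes are hypothetical stand-ins for the client and workspace objects, and it assumes a workspace exposes its name through a name() method.

```python
def get_or_create_workspace(client, name):
    """Return the first workspace called `name`, creating one if none exists."""
    for ws in client.list_workspaces():
        if ws.name() == name:  # assumes the name is exposed as a method
            return ws
    return client.create_workspace(name)


# Hypothetical stand-ins so the logic can be exercised offline.
class _StubWorkspace:
    def __init__(self, name):
        self._name = name

    def name(self):
        return self._name


class _StubClient:
    def __init__(self):
        self._workspaces = []

    def list_workspaces(self):
        return list(self._workspaces)

    def create_workspace(self, name):
        ws = _StubWorkspace(name)
        self._workspaces.append(ws)
        return ws


client = _StubClient()
first = get_or_create_workspace(client, "testautoconversion")
second = get_or_create_workspace(client, "testautoconversion")
print(first is second)
```

Calling the helper twice yields the same workspace object, which is the property that makes it safe to rerun the notebook cell.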

Set the Model Conversion Arguments

We’ll create two different configurations, one for each of our models:

  • sklearn_model_conversion_args: Used for our sklearn model.
  • xgboost_model_conversion_args: Used for our XGBoost model.
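
# The number of columns the models were trained on
NF = 25

sklearn_model_conversion_args = ConvertSKLearnArguments(
    name="sklearntest",
    comment="test linear regression",
    number_of_columns=NF,
    input_type=ModelConversionInputType.Double  # assumed; match the model's input type
)
sklearn_model_conversion_type = ModelConversionSource.SKLEARN

xgboost_model_conversion_args = ConvertXGBoostArgs(
    name="xgbtestreg",
    comment="xgboost regression model test",
    number_of_columns=NF,
    input_type=ModelConversionInputType.Float32  # assumed; match the model's input type
)
xgboost_model_conversion_type = ModelConversionSource.XGBOOST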

Convert the Models

The convert_model method converts the model using the arguments and uploads it into the current workspace, in this case testautoconversion. Once complete, we can run get_current_workspace to verify that the models were uploaded.

# converts and uploads the sklearn model.
wl.convert_model('sklearn-linear-model.pickle', sklearn_model_conversion_type, sklearn_model_conversion_args)

# converts and uploads the XGBoost model.
wl.convert_model('xgb_reg.pickle', xgboost_model_conversion_type, xgboost_model_conversion_args)

{'name': 'xgbtestreg', 'version': '42779f3a-e874-4f8d-b985-822f2128a954', 'file_name': 'bffa90a2-5a27-4e77-a2f4-cd0f68f99d24-converted.onnx', 'image_path': None, 'last_update_time': datetime.datetime(2023, 2, 27, 20, 11, 36, 906580, tzinfo=tzutc())}

# Verify that both models were uploaded into the current workspace.
wl.get_current_workspace()

{'name': 'testautoconversion', 'id': 20, 'archived': False, 'created_by': '435da905-31e2-4e74-b423-45c38edb5889', 'created_at': '2023-02-27T20:05:40.6269+00:00', 'models': [{'name': 'sklearntest', 'versions': 1, 'owner_id': '""', 'last_update_time': datetime.datetime(2023, 2, 27, 20, 11, 35, 745445, tzinfo=tzutc()), 'created_at': datetime.datetime(2023, 2, 27, 20, 11, 35, 745445, tzinfo=tzutc())}, {'name': 'xgbtestreg', 'versions': 1, 'owner_id': '""', 'last_update_time': datetime.datetime(2023, 2, 27, 20, 11, 36, 906580, tzinfo=tzutc()), 'created_at': datetime.datetime(2023, 2, 27, 20, 11, 36, 906580, tzinfo=tzutc())}], 'pipelines': []}
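
The workspace listing above can also be checked programmatically. The sketch below assumes the workspace details are available as a plain dictionary shaped like that output; missing_models is a hypothetical helper, and the sample dictionary is trimmed to the fields it reads.

```python
def missing_models(workspace_info, expected_names):
    """Return the expected model names that are absent from the workspace listing."""
    uploaded = {model["name"] for model in workspace_info.get("models", [])}
    return sorted(set(expected_names) - uploaded)


# Trimmed copy of the workspace output shown above.
workspace_info = {
    "name": "testautoconversion",
    "models": [
        {"name": "sklearntest", "versions": 1},
        {"name": "xgbtestreg", "versions": 1},
    ],
    "pipelines": [],
}

print(missing_models(workspace_info, ["sklearntest", "xgbtestreg"]))
```

An empty list means every expected model name appears in the workspace; any name that is returned points to a conversion that did not upload.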