XGBoost Regression Auto-Convert Within Wallaroo

How to convert XGBoost ML Regression models and upload them to Wallaroo with the Wallaroo convert_model method.

Introduction

The following tutorial is a brief example of how to convert an XGBoost Regression ML model with the convert_model method and upload it into your Wallaroo instance.

This tutorial assumes that you have a Wallaroo instance and are running this Notebook from the Wallaroo Jupyter Hub service. It demonstrates how to:

  • Convert an XGBoost Regression ML model and upload it into the Wallaroo engine.
  • Run a sample inference on the converted model in a Wallaroo instance.

This tutorial provides the following:

  • xgb_reg.pickle: A pretrained XGBoost Regression model with 25 columns.
  • xgb_regression_eval.json: Test data to perform a sample inference.

Conversion Steps

The Wallaroo auto-converter method convert_model(path, source_type, conversion_arguments) takes three parameters. For XGBoost conversions they are:

  • path (STRING): The path to the ML model file.
  • source_type (ModelConversionSource): The type of ML model to be converted. As of this time Wallaroo auto-conversion supports the following source types and their associated ModelConversionSource:
    • sklearn: ModelConversionSource.SKLEARN
    • xgboost: ModelConversionSource.XGBOOST
    • keras: ModelConversionSource.KERAS
  • conversion_arguments: The arguments for the conversion, based on the type of model being converted. For XGBoost models this is:
    • wallaroo.ModelConversion.ConvertXGBoostArgs: Used for XGBoost models and takes the following parameters:
      • name: The name of the model being converted.
      • comment: Any comments for the model.
      • number_of_columns: The number of columns the model was trained for.
      • input_type: A TensorFlow dtype given in the format ModelConversionInputType.{type}, where {type} is Float32, Double, etc., depending on the model.

Import Libraries

The first step is to import the libraries needed.

import wallaroo

from wallaroo.ModelConversion import ConvertXGBoostArgs, ModelConversionSource, ModelConversionInputType
from wallaroo.object import EntityNotFoundError

Configuration and Methods

The following sets the workspace, pipeline, and model names, the model file name used when uploading and converting the XGBoost model, and the sample data file.

The function get_workspace(name) returns the workspace with the requested name, creating it if it does not exist. The function get_pipeline(name) returns the pipeline with the requested name, building it in the current workspace if it does not exist.

workspace_name = 'xgboost-regression-autoconvert-workspace'
pipeline_name = 'xgboost-regression-autoconvert-pipeline'
model_name = 'xgb-regression-model'
model_file_name = 'xgb_reg.pickle'
sample_data = 'xgb_regression_eval.json'

def get_workspace(name):
    # Return the workspace with the given name, creating it if it does not exist
    wl = wallaroo.Client()
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    # Return the pipeline with the given name, building it if it does not exist
    wl = wallaroo.Client()
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline

Connect to Wallaroo

Connect to your Wallaroo instance and store the connection in the variable wl.

wl = wallaroo.Client()

Set the Workspace and Pipeline

Set or create the workspace and pipeline based on the names configured earlier.

workspace = get_workspace(workspace_name)

wl.set_current_workspace(workspace)

pipeline = get_pipeline(pipeline_name)
pipeline
   
name xgboost-regression-autoconvert-pipeline
created 2022-07-08 14:12:08.527632+00:00
last_updated 2022-07-08 14:12:08.527632+00:00
deployed (none)
tags
steps

Set the Model Autoconvert Parameters

Set the parameters for converting the xgb-regression-model.

# the number of columns the model was trained for
NF = 25

model_conversion_args = ConvertXGBoostArgs(
    name=model_name,
    comment="xgboost regression model test",
    number_of_columns=NF,
    input_type=ModelConversionInputType.Float32
)
model_conversion_type = ModelConversionSource.XGBOOST
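
Since number_of_columns should match the width of the data later sent for inference, it can be worth checking the sample data before converting. The following is a minimal sketch and not part of the Wallaroo SDK: rows_match_columns is a hypothetical helper, and the "tensor" key is an assumption about how the sample JSON is laid out, so adjust it to the actual structure of xgb_regression_eval.json.

```python
def rows_match_columns(payload, expected_columns, key="tensor"):
    """Return True if every row under `key` has exactly `expected_columns` values.

    `key` is an assumed field name for the input rows; change it to match
    the real layout of the sample data file.
    """
    rows = payload.get(key, [])
    # An empty or missing row list is treated as a failed check
    return bool(rows) and all(len(row) == expected_columns for row in rows)


# Hypothetical in-memory payload standing in for the sample file
example = {"tensor": [[0.0] * 25, [1.0] * 25]}
print(rows_match_columns(example, 25))
```

To check the real file, load it with json.load and pass the resulting dictionary together with NF as expected_columns.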

Upload and Convert the Model

Now we can upload and convert the model. Once finished, it will be stored as {unique-file-id}-converted.onnx.

# convert and upload
model_wl = wl.convert_model(model_file_name, model_conversion_type, model_conversion_args)
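
Because convert_model is given a path to the model file, a quick local check can catch a wrong path before the call is made. The following is a minimal sketch using only the Python standard library; require_model_file is a hypothetical helper name, not a Wallaroo function.

```python
from pathlib import Path


def require_model_file(path):
    """Return `path` as a string if it points to an existing file, else raise."""
    model_path = Path(path)
    if not model_path.is_file():
        raise FileNotFoundError(f"Model file not found: {model_path.resolve()}")
    return str(model_path)


# A missing file raises a FileNotFoundError that names the resolved path
try:
    require_model_file("missing-model.pickle")
except FileNotFoundError as err:
    print(err)
```

For example, require_model_file(model_file_name) could be called just before wl.convert_model(...).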

Test Inference

With the model uploaded and converted, we can run a sample inference.

Deploy the Pipeline

Add the uploaded and converted model_wl as a step in the pipeline, then deploy it.

pipeline.add_model_step(model_wl).deploy()
Waiting for deployment - this will take up to 45s ..... ok
   
name xgboost-regression-autoconvert-pipeline
created 2022-07-08 14:12:08.527632+00:00
last_updated 2022-07-08 14:12:10.324722+00:00
deployed True
tags
steps xgb-regression-model

Run the Inference

Use xgb_regression_eval.json, set earlier as sample_data, to perform the inference.

result = pipeline.infer_from_file(sample_data)
result[0].data()
[array([[  30.71360016],
        [-202.30688477],
        [ 285.74139404],
        [ -56.76713943],
        [-238.28738403]])]
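
The output above is a list holding one array with a single prediction per row. For further processing it is often easier to work with a flat list of numbers. The following is a minimal sketch, assuming each output is a column of one-element rows as shown; flatten_predictions is a hypothetical helper, and the sample values are copied from the output above as plain nested lists so the example runs without a Wallaroo instance.

```python
def flatten_predictions(outputs):
    """Flatten a list of column-shaped outputs into one flat list of floats."""
    flat = []
    for output in outputs:
        for row in output:
            # Each row is assumed to hold exactly one predicted value
            flat.append(float(row[0]))
    return flat


# Values copied from the sample inference output, written as nested lists
sample_outputs = [[
    [30.71360016],
    [-202.30688477],
    [285.74139404],
    [-56.76713943],
    [-238.28738403],
]]
print(flatten_predictions(sample_outputs))
```

The same loop also works on array objects that can be iterated row by row and indexed by position, so it could be applied to result[0].data() directly.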

Undeploy the Pipeline

With the tests complete, we will undeploy the pipeline to return its resources to the Wallaroo instance.

pipeline.undeploy()
Waiting for undeployment - this will take up to 45s ................................. ok
   
name xgboost-regression-autoconvert-pipeline
created 2022-07-08 14:12:08.527632+00:00
last_updated 2022-07-08 14:12:10.324722+00:00
deployed False
tags
steps xgb-regression-model
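
If an inference fails partway through a notebook, the pipeline can be left deployed and keep holding resources. One way to guard against that is to wrap the deploy and undeploy calls in a context manager. The following is a minimal sketch that relies only on the deploy() and undeploy() methods used in this tutorial; deployed is a hypothetical helper, and FakePipeline is a stand-in used here so the example runs without a Wallaroo instance.

```python
from contextlib import contextmanager


@contextmanager
def deployed(pipeline):
    """Deploy `pipeline` on entry and always undeploy it on exit."""
    pipeline.deploy()
    try:
        yield pipeline
    finally:
        # Runs even if the body of the with-block raises
        pipeline.undeploy()


class FakePipeline:
    """Hypothetical stand-in that records calls instead of contacting Wallaroo."""

    def __init__(self):
        self.calls = []

    def deploy(self):
        self.calls.append("deploy")
        return self

    def undeploy(self):
        self.calls.append("undeploy")
        return self


fake = FakePipeline()
with deployed(fake) as active:
    # With a real pipeline, the inference call would go here
    active.calls.append("infer")
print(fake.calls)
```

With a real pipeline, the body of the with-block would hold the infer_from_file call, and the pipeline would be undeployed whether or not that call succeeds.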