Model Naming Requirements
Model names map onto Kubernetes objects, and must be DNS compliant. The strings for model names must lower case ASCII alpha-numeric characters or dash (-) only. .
and _
are not allowed.
Wallaroo supports TensorFlow models by containerizing the model and running as an image.
Parameter | Description |
---|---|
Web Site | https://www.tensorflow.org/ |
Supported Libraries | tensorflow==2.9.3 |
Framework | Framework.TENSORFLOW aka tensorflow |
Runtime | Native aka tensorflow |
Supported File Types | SavedModel format as .zip file |
IMPORTANT NOTE
These requirements are not for Tensorflow Keras models, only for non-Keras Tensorflow models in the SavedModel format. For Tensorflow Keras deployment in Wallaroo, see the Tensorflow Keras requirements.TensorFlow File Format
TensorFlow models are .zip file of the SavedModel format. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm
:
├── saved_model.pb
└── variables
├── variables.data-00000-of-00002
├── variables.data-00001-of-00002
└── variables.index
This is compressed into the .zip file alohacnnlstm.zip
with the following command:
zip -r alohacnnlstm.zip alohacnnlstm/
ML models that meet the Tensorflow and SavedModel format will run as Wallaroo Native runtimes by default.
See the SavedModel guide for full details.
Uploading TensorFlow Models
TensorFlow models are uploaded to Wallaroo through the Wallaroo Client upload_model
method.
Upload TensorFlow Model Parameters
The following parameters are required for TensorFlow models. Tensorflow models are native runtimes in Wallaroo, so the input_schema
and output_schema
parameters are optional.
Parameter | Type | Description |
---|---|---|
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model. |
path | string (Required) | The path to the model file being uploaded. |
framework | string (Required) | Set as the Framework.TENSORFLOW . |
input_schema | pyarrow.lib.Schema (Optional) | The input schema in Apache Arrow schema format. |
output_schema | pyarrow.lib.Schema (Optional) | The output schema in Apache Arrow schema format. |
convert_wait | bool (Optional) (Default: True) | Not required for native runtimes.
|
arch | wallaroo.engine_config.Architecture | The architecture the model is deployed to. If a model is intended for deployment to an ARM architecture, it must be specified during this step. Values include: X86 (Default): x86 based architectures. ARM : ARM based architectures. |
Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.
Model Config Options
Model version configurations are updated with the wallaroo.model_version.config
and include the following parameters. Most are optional unless specified.
Parameter | Type | Description |
---|---|---|
runtime | String (Optional) | The model runtime from wallaroo.framework. } |
tensor_fields | (List[string]) (Optional) | A list of alternate input fields. For example, if the model accepts the input fields ['variable1', 'variable2'] , tensor_fields allows those inputs to be overridden to ['square_feet', 'house_age'] , or other values as required. |
input_schema | pyarrow.lib.Schema | The input schema for the model in pyarrow.lib.Schema format. |
output_schema | pyarrow.lib.Schema | The output schema for the model in pyarrow.lib.Schema format. |
batch_config | (List[string]) (Optional) | Batch config is either None for multiple-input inferences, or single to accept an inference request with only one row of data. |
Upload TensorFlow Model Return
For example, the following example is of uploading a TensorFlow ML Model to a Wallaroo instance.
from wallaroo.framework import Framework
model = wl.upload_model(model_name,
model_file_name,
framework=Framework.TENSORFLOW
)
Pipeline Deployment Configurations
Pipeline configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space.
This model will always run in the native runtime space.
Native Runtime Pipeline Deployment Configuration Example
The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.
deployment_config = DeploymentConfigBuilder()
.cpus(0.25)
.memory('1Gi')
.build()
Pipeline Deployment Timeouts
Pipeline deployments typically take 45 seconds for Wallaroo Native Runtimes, and 90 seconds for Wallaroo Containerized Runtimes.
If Wallaroo Pipeline deployment times out from a very large or complex ML model being deployed, the timeout is extended from with the wallaroo.Client.Client(request_timeout:int)
setting, where request_timeout
is in integer seconds. Wallaroo Native Runtime deployments are scaled at 1x the request_timeout
setting. Wallaroo Containerized Runtimes are scaled at 2x the request_timeout
setting.
The following example shows extending the request_timeout
to 2 minutes.
wl = wallaroo.Client(request_timeout=120)
wl.timeout
120