Model names map onto Kubernetes objects and must be DNS compliant. Model names must contain only ASCII alphanumeric characters or dashes (-). Periods (.) and underscores (_) are not allowed.
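The naming rule can be checked before an upload is attempted. The following is a minimal sketch that covers only the character rule stated above; `is_valid_model_name` is a hypothetical helper and not part of the Wallaroo SDK, and any further DNS constraints (such as length limits) are not checked here.

```python
import re

# Only ASCII letters, digits, and dashes are accepted, per the rule above.
_MODEL_NAME_RE = re.compile(r"[A-Za-z0-9-]+")


def is_valid_model_name(name: str) -> bool:
    """Return True if name contains only ASCII alphanumerics or dashes."""
    return _MODEL_NAME_RE.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("mac-keras-single-io-example", "my_model", "my.model"):
        print(candidate, is_valid_model_name(candidate))
```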
Wallaroo supports TensorFlow/Keras models by containerizing the model and running it as an image.
Parameter | Description |
---|---|
Web Site | https://www.tensorflow.org/api_docs/python/tf/keras/Model |
Supported Libraries | |
Framework | Framework.KERAS aka keras |
Supported File Types | SavedModel format as .zip file and HDF5 format |
Runtime | Containerized aka mlflow |
TensorFlow Keras SavedModel models are uploaded as a .zip file of the SavedModel directory. For example, the Aloha sample TensorFlow model is stored in the directory alohacnnlstm:
├── saved_model.pb
└── variables
    ├── variables.data-00000-of-00002
    ├── variables.data-00001-of-00002
    └── variables.index
This is compressed into the .zip file alohacnnlstm.zip with the following command:
zip -r alohacnnlstm.zip alohacnnlstm/
See the SavedModel guide for full details.
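Where the `zip` command is not available, the same archive can be produced from Python. This is a sketch using only the standard library; `zip_saved_model` is a hypothetical helper name, and the check for `saved_model.pb` is an assumption based on the directory listing above.

```python
import shutil
from pathlib import Path


def zip_saved_model(model_dir: str) -> str:
    """Zip a SavedModel directory, keeping the directory name as the archive root.

    Returns the path of the created .zip file.
    """
    src = Path(model_dir).resolve()
    if not (src / "saved_model.pb").is_file():
        raise FileNotFoundError(f"{src} does not look like a SavedModel directory")
    # Writes <parent>/<name>.zip with entries such as <name>/saved_model.pb,
    # which mirrors `zip -r <name>.zip <name>/` run from the parent directory.
    return shutil.make_archive(
        base_name=str(src.parent / src.name),
        format="zip",
        root_dir=str(src.parent),
        base_dir=src.name,
    )
```

For the Aloha sample, `zip_saved_model("alohacnnlstm")` would write alohacnnlstm.zip next to the model directory.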
Wallaroo also supports the H5 (HDF5) format for TensorFlow Keras models.
TensorFlow Keras models are uploaded to Wallaroo through the Wallaroo Client upload_model method.
The following parameters are required for TensorFlow Keras models. Note that while some fields are optional for the upload_model method in general, they are required to upload a TensorFlow Keras model to Wallaroo.
Parameter | Type | Description |
---|---|---|
name | string (Required) | The name of the model. Model names are unique per workspace. Models that are uploaded with the same name are assigned as a new version of the model. |
path | string (Required) | The path to the model file being uploaded. |
framework | string (Upload Method Optional, TensorFlow Keras model Required) | Set as Framework.KERAS. |
input_schema | pyarrow.lib.Schema (Upload Method Optional, TensorFlow Keras model Required) | The input schema in Apache Arrow schema format. |
output_schema | pyarrow.lib.Schema (Upload Method Optional, TensorFlow Keras model Required) | The output schema in Apache Arrow schema format. |
convert_wait | bool (Upload Method Optional, TensorFlow Keras model Optional) (Default: True) | |
Once the upload process starts, the model is containerized by the Wallaroo instance. This process may take up to 10 minutes.
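When a script does not block on the conversion (for example, with convert_wait set to False), it needs some way to wait for the model to become ready. The following is a generic polling sketch with a timeout that matches the 10 minute figure above. How the conversion status is queried is not covered in this section, so `is_ready` is a caller-supplied placeholder and `wait_until` is a hypothetical helper, not a Wallaroo SDK function.

```python
import time
from typing import Callable


def wait_until(is_ready: Callable[[], bool],
               timeout_s: float = 600.0,
               interval_s: float = 10.0) -> bool:
    """Poll is_ready() until it returns True or timeout_s seconds elapse.

    Returns True if is_ready() succeeded, False if the timeout was reached.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if is_ready():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval_s, remaining))
```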
The following example uploads a TensorFlow Keras model in H5 format to a Wallaroo instance.
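import pyarrow as pa
import wallaroo
from wallaroo.framework import Framework

# Connect to the Wallaroo instance
wl = wallaroo.Client()

# Input: one fixed-size list of 10 float64 values per row
input_schema = pa.schema([
    pa.field('input', pa.list_(pa.float64(), list_size=10))
])

# Output: one fixed-size list of 32 float64 values per row
output_schema = pa.schema([
    pa.field('output', pa.list_(pa.float64(), list_size=32))
])

model = wl.upload_model('mac-keras-single-io-example',
                        './models/single_io_keras_sequential_model.h5',
                        framework=Framework.KERAS,
                        input_schema=input_schema,
                        output_schema=output_schema)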
Pipeline deployment configurations are dependent on whether the model is converted to the Native Runtime space, or Containerized Model Runtime space. This is determined when the model is uploaded based on the size, complexity, and other factors.
Once uploaded, the Model method config().runtime() will display which space the model is in.
Runtime Display | Model Runtime Space | Pipeline Configuration |
---|---|---|
tensorflow | Native | Native Runtime Configuration Methods |
onnx | Native | Native Runtime Configuration Methods |
python | Native | Native Runtime Configuration Methods |
mlflow | Containerized | Containerized Runtime Deployment |
For example, uploading an ONNX model to a Wallaroo workspace would return the following config().runtime():
ccfraud_model = wl.upload_model(model_name, model_file_name, Framework.ONNX).configure()
ccfraud_model.config().runtime()
'onnx'
By contrast, the following model is allocated to the containerized runtime after conversion:
model = wl.upload_model(model_name, model_file_name,
                        framework=framework,
                        input_schema=input_schema,
                        output_schema=output_schema)
model.config().runtime()
'mlflow'
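The runtime table above can be expressed as a simple lookup, so that a script can branch on the value returned by config().runtime(). This is a sketch that hard-codes only the four runtime strings listed in the table; `RUNTIME_SPACE` and `runtime_space` are hypothetical names, not part of the Wallaroo SDK.

```python
# Runtime display string -> model runtime space, copied from the table above.
RUNTIME_SPACE = {
    "tensorflow": "Native",
    "onnx": "Native",
    "python": "Native",
    "mlflow": "Containerized",
}


def runtime_space(runtime: str) -> str:
    """Return 'Native' or 'Containerized' for a runtime display string."""
    try:
        return RUNTIME_SPACE[runtime]
    except KeyError:
        # Any runtime not listed in the table is treated as an error.
        raise ValueError(f"unrecognized runtime: {runtime!r}") from None
```

A call such as `runtime_space(model.config().runtime())` would then indicate which deployment configuration methods apply.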
The following configuration allocates 0.25 CPU and 1 Gi RAM to the native runtime models for a pipeline.
from wallaroo.deployment_config import DeploymentConfigBuilder

deployment_config = (DeploymentConfigBuilder()
                     .cpus(0.25)
                     .memory('1Gi')
                     .build())
The following configuration allocates 0.25 CPU and 1 Gi RAM to a specific containerized model in the containerized runtime, along with environment variables for that model. Note that for containerized models, resources must be allocated per model.
# sm_model is the containerized model returned by upload_model
deployment_config = (DeploymentConfigBuilder()
                     .sidekick_cpus(sm_model, 0.25)
                     .sidekick_memory(sm_model, '1Gi')
                     .sidekick_env(sm_model,
                                   {"GUNICORN_CMD_ARGS": "--timeout=188 --workers=1"})
                     .build())
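Because the two runtime spaces use different builder methods, a script that handles both may want to choose between them based on the reported runtime. The following is a minimal sketch under the assumption that the builder exposes the cpus, memory, sidekick_cpus, and sidekick_memory methods shown in the examples above; `apply_resources` is a hypothetical helper, not part of the Wallaroo SDK.

```python
def apply_resources(builder, model, runtime: str,
                    cpus: float = 0.25, memory: str = "1Gi"):
    """Apply CPU and memory settings using the methods that match the runtime.

    builder is expected to behave like DeploymentConfigBuilder (assumption):
    each method returns the builder so that calls can be chained.
    """
    if runtime == "mlflow":
        # Containerized runtime: resources are allocated per model.
        return builder.sidekick_cpus(model, cpus).sidekick_memory(model, memory)
    # Native runtime: resources are allocated for the native runtime models
    # of the pipeline as a whole.
    return builder.cpus(cpus).memory(memory)
```

A call might look like `apply_resources(DeploymentConfigBuilder(), model, model.config().runtime()).build()`.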