Module `wallaroo.deployment`

Functions

def create_engine_config(image: str = None, cpu: int = None, memory: str = None, replicas: int = None)

Convenience function to create a reasonable engine config with some non-default values. Currently only memory and cpu can be updated, and each is set for both limits and requests.

Current defaults are::

engine:
    image: "someimage"
    replicas: 1
    inputProtocol: http
    sinkType: http_response
    cpu: 4
    mode: inference
    model_concurrency: 2
    resources:
        limits:
            memory: 3Gi
            cpu: 4
        requests:
            memory: 3Gi
            cpu: 4

So, this command::

create_engine_config(image="...fitzroy-mini:latest", cpu=1, memory="2Gi")

Will create::

{
    "engine": {
        "image": "ghcr.io/wallaroolabs/fitzroy-mini:latest",
        "cpu": 1,
        "resources": {
            "limits": {"memory": "2Gi", "cpu": 1},
            "requests": {"memory": "2Gi", "cpu": 1},
        },
    }
}
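The override behaviour described above can be sketched in a few lines. This is a minimal illustration, not the library's implementation: sketch_engine_config is a hypothetical stand-in that only reproduces the documented output shape. It assumes that parameters left as None are omitted from the result and that replicas, when given, sits directly under engine as in the defaults listing.

```python
from typing import Any, Dict, Optional


def sketch_engine_config(
    image: Optional[str] = None,
    cpu: Optional[int] = None,
    memory: Optional[str] = None,
    replicas: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical stand-in that mirrors the documented output shape."""
    engine: Dict[str, Any] = {}
    if image is not None:
        engine["image"] = image
    if replicas is not None:
        # Assumption: replicas is placed directly under "engine".
        engine["replicas"] = replicas
    if cpu is not None:
        engine["cpu"] = cpu

    # cpu and memory are written to both limits and requests.
    resources = {
        key: value
        for key, value in (("memory", memory), ("cpu", cpu))
        if value is not None
    }
    if resources:
        engine["resources"] = {
            "limits": dict(resources),
            "requests": dict(resources),
        }
    return {"engine": engine}


config = sketch_engine_config(
    image="ghcr.io/wallaroolabs/fitzroy-mini:latest", cpu=1, memory="2Gi"
)
```

Under these assumptions, config has the same structure as the documented result: cpu appears once at the engine level and once in each of limits and requests.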

Classes

class Deployment (gql_client: gql.client.Client)

Manages model deployments (i.e., engine services, endpoints, predictors). Models are uploaded separately, and then Deployments are created and managed to manifest the resources that process inference requests.

Create a deployment client with the specified gql client.

Methods

def create_deployment(self, deployment_id: str, model_config_id: int, deployed: bool, replicas: int = None, image: str = None, cpu=None, memory=None) ‑> dict: Creates a deployment with a model config using the provided parameters. If image, cpu, and memory are not provided, the default system model config is used. Returns a dict of the deployment data, including a url field that can be used to run inferences against this deployment.

NOTE: This function returns immediately but the operation may take several minutes to complete.
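Because the call returns before the operation finishes, callers usually need to poll for completion. The helper below is a minimal sketch of that pattern, not part of this module: wait_for_deployment, fetch, and is_ready are hypothetical names. The readiness check is passed in by the caller because this page does not document which fields of the returned dict signal completion.

```python
import time
from typing import Callable, Optional


def wait_for_deployment(
    fetch: Callable[[], Optional[dict]],
    is_ready: Callable[[dict], bool],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
) -> dict:
    """Poll fetch() until is_ready(details) is true or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        details = fetch()
        # fetch() may return None while the deployment is not yet visible.
        if details is not None and is_ready(details):
            return details
        if time.monotonic() >= deadline:
            raise TimeoutError("deployment was not ready before the timeout")
        time.sleep(interval_s)
```

With a Deployment instance named client, fetch could be a lambda that calls client.get_deployment with the deployment id; the is_ready predicate has to be written against whatever status information the returned dict actually contains.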
def create_deployment_model_config(self, deployment_id: int, model_config_id: int) ‑> dict
def create_pipeline_deployment(self, deployment_id_str: str, pipeline_pk_id: int, model_config_ids: List[int], deployed: bool, replicas: int = None, image: str = None, cpu=None, memory=None) ‑> Optional[dict]: Creates a deployment that serves a pipeline. Requires the pipeline row id and the model config ids, plus the general deployment parameters that control resources.

NOTE: This function returns immediately but the operation may take several minutes to complete.
def get_deployment(self, deploy_id: str) ‑> Optional[dict]: Gets the details for a particular deployment. Returns None if no deployment with that deploy_id is found.
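Since get_deployment returns None for an unknown deploy_id, callers that expect the deployment to exist may prefer to fail loudly instead of passing None along. The wrapper below is a small sketch of that idea; require_deployment and SupportsGetDeployment are hypothetical names introduced for illustration and rely only on the get_deployment signature shown above.

```python
from typing import Optional, Protocol


class SupportsGetDeployment(Protocol):
    """Anything exposing get_deployment as documented above."""

    def get_deployment(self, deploy_id: str) -> Optional[dict]: ...


def require_deployment(client: SupportsGetDeployment, deploy_id: str) -> dict:
    """Return the deployment details, raising LookupError if none is found."""
    details = client.get_deployment(deploy_id)
    if details is None:
        raise LookupError(f"no deployment found with deploy_id {deploy_id!r}")
    return details
```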
def get_deployment_model_config(self) ‑> List[dict]
def get_deployments(self) ‑> List[dict]: Get a list of the current deployments in the system.
def update_deployment(self, deploy_id: str, deployed: bool = None, image: str = None, cpu: int = None, memory=None): General function to update a deployment. Performs the update in two steps, using update engine config and set deployed. Returns the updated deployment details from the database.

NOTE: This function returns immediately but the operation may take several minutes to complete.
def update_deployment_pipeline_version(self, deployment_id: int, pipeline_version_pk_id: int) ‑> dict