Module wallaroo.deployment
Functions
def create_engine_config(image: str = None, cpu: int = None, memory: str = None, replicas: int = None)
-
Convenience function that creates a reasonable engine config with some non-default values. Currently only memory and cpu can be updated, and each is set for both limits and requests.
Current defaults are::
    engine:
      image: "someimage"
      replicas: 1
      inputProtocol: http
      sinkType: http_response
      cpu: 4
      mode: inference
      model_concurrency: 2
      resources:
        limits:
          memory: 3Gi
          cpu: 4
        requests:
          memory: 3Gi
          cpu: 4
So, this command::
    create_engine_config(image="...fitzroy-mini:latest", cpu=1, memory="2Gi")
Will create::
    {
        "engine": {
            "image": "ghcr.io/wallaroolabs/fitzroy-mini:latest",
            "cpu": 1,
            "resources": {
                "limits": {"memory": "2Gi", "cpu": 1},
                "requests": {"memory": "2Gi", "cpu": 1},
            },
        }
    }
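The mapping from arguments to config described above can be sketched in a few lines. This is not the library's implementation: `sketch_engine_config` is a hypothetical name, and the handling of `replicas` and of omitted arguments is an assumption. It only mirrors the documented rule that cpu and memory are written to both limits and requests.

```python
from typing import Any, Dict, Optional


def sketch_engine_config(
    image: Optional[str] = None,
    cpu: Optional[int] = None,
    memory: Optional[str] = None,
    replicas: Optional[int] = None,
) -> Dict[str, Any]:
    """Hypothetical re-creation of the documented output shape."""
    engine: Dict[str, Any] = {}
    if image is not None:
        engine["image"] = image
    if replicas is not None:
        # Assumption: replicas sits directly under "engine", as in the defaults.
        engine["replicas"] = replicas
    if cpu is not None:
        engine["cpu"] = cpu

    # cpu and memory are applied identically to limits and requests.
    shared: Dict[str, Any] = {}
    if memory is not None:
        shared["memory"] = memory
    if cpu is not None:
        shared["cpu"] = cpu
    if shared:
        engine["resources"] = {"limits": dict(shared), "requests": dict(shared)}

    return {"engine": engine}


if __name__ == "__main__":
    print(
        sketch_engine_config(
            image="ghcr.io/wallaroolabs/fitzroy-mini:latest", cpu=1, memory="2Gi"
        )
    )
```

Copying `shared` into limits and requests separately keeps the two sub-dicts independent, so a later change to one does not silently alter the other.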
Classes
class Deployment (gql_client: gql.client.Client)
-
Manages model deployments (i.e. engine services, endpoints, predictors). Models are uploaded separately, and then Deployments are created and managed to manifest the resources to process inference requests.
Create a deployment client with the specified gql client.
Methods
def create_deployment(self, deployment_id: str, model_config_id: int, deployed: bool, replicas: int = None, image: str = None, cpu=None, memory=None) ‑> dict
-
Creates a deployment with a model config using the provided parameters. If image, cpu, and memory are not provided, the default system model config is used. Returns a dict of the deployment data, including a url field that can be used to run inferences against this deployment.
NOTE: This function returns immediately but the operation may take several minutes to complete.
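A call to `create_deployment` might look like the sketch below. It uses only the parameter names from the signature above and the documented `url` field. `deploy_and_get_url` and `StandInDeployment` are hypothetical names, and the stand-in class exists only so the example runs without a live system; with a real client you would pass a `Deployment` instance instead.

```python
from typing import Any, Optional


def deploy_and_get_url(
    client: Any,
    deployment_id: str,
    model_config_id: int,
    cpu: Optional[int] = None,
    memory: Optional[str] = None,
) -> str:
    """Create a deployment and return the inference URL from the result."""
    data = client.create_deployment(
        deployment_id=deployment_id,
        model_config_id=model_config_id,
        deployed=True,
        cpu=cpu,
        memory=memory,
    )
    # The docs state the returned dict includes a "url" field.
    return data["url"]


class StandInDeployment:
    """Hypothetical stand-in with the documented method signature."""

    def create_deployment(self, deployment_id, model_config_id, deployed,
                          replicas=None, image=None, cpu=None, memory=None) -> dict:
        # Canned response; a real deployment returns data from the system.
        return {"url": f"http://example.invalid/{deployment_id}"}


if __name__ == "__main__":
    print(deploy_and_get_url(StandInDeployment(), "demo", 1, cpu=1, memory="2Gi"))
```

Because the call returns immediately, the URL may not accept requests until the deployment has finished starting.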
def create_deployment_model_config(self, deployment_id: int, model_config_id: int) ‑> dict
-
Gets the details for a particular deployment. Returns None if no deployment with that deploy_id is found.
def create_pipeline_deployment(self, deployment_id_str: str, pipeline_pk_id: int, model_config_ids: List[int], deployed: bool, replicas: int = None, image: str = None, cpu=None, memory=None) ‑> Optional[dict]
-
Creates a deployment that serves a pipeline. Requires the pipeline row id and the model config ids plus other general deployment parameters to control resources.
NOTE: This function returns immediately but the operation may take several minutes to complete.
def get_deployment(self, deploy_id: str) ‑> Optional[dict]
-
Gets the details for a particular deployment. Returns None if no deployment with that deploy_id is found.
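Since the create and update calls return before the operation completes, one way to wait is to poll `get_deployment` until a caller-supplied check passes. The helper below is a sketch under assumptions: `wait_for_deployment` is a hypothetical name, and because the fields of the returned dict are not documented here, readiness is decided by a predicate you provide rather than by any built-in status field.

```python
import time
from typing import Callable, Optional


def wait_for_deployment(
    get_deployment: Callable[[str], Optional[dict]],
    deploy_id: str,
    is_ready: Callable[[dict], bool],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[dict]:
    """Poll until is_ready(details) is true; return None on timeout."""
    deadline = clock() + timeout_s
    while True:
        details = get_deployment(deploy_id)
        # get_deployment returns None when the deployment is not found.
        if details is not None and is_ready(details):
            return details
        if clock() >= deadline:
            return None
        sleep(interval_s)
```

With a real client this could be called as `wait_for_deployment(client.get_deployment, "my-deployment", my_check)`, where `my_check` inspects whichever field of the returned dict signals readiness in your system.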
def get_deployment_model_config(self) ‑> List[dict]
-
Gets the details for a particular deployment. Returns None if no deployment with that deploy_id is found.
def get_deployments(self) ‑> List[dict]
-
Get a list of the current deployments in the system.
def update_deployment(self, deploy_id: str, deployed: bool = None, image: str = None, cpu: int = None, memory=None)
-
General function to update a deployment. The update runs in two steps: it first updates the engine config, then sets the deployed flag. Returns the updated deployment details from the database.
NOTE: This function returns immediately but the operation may take several minutes to complete.
def update_deployment_pipeline_version(self, deployment_id: int, pipeline_version_pk_id: int) ‑> dict