Wallaroo SDK Reference Guide
- 1: wallaroo.assay
- 2: wallaroo.assay_config
- 3: wallaroo.auth
- 4: wallaroo.checks
- 5: wallaroo.client
- 6: wallaroo.comment
- 7: wallaroo.connection
- 8: wallaroo.datasizeunit
- 9: wallaroo.deployment
- 10: wallaroo.deployment_config
- 11: wallaroo.engine_config
- 12: wallaroo.expression
- 13: wallaroo.framework
- 14: wallaroo.functions
- 15: wallaroo.inference_decode
- 16: wallaroo.inference_result
- 17: wallaroo.logs
- 18: wallaroo.model
- 19: wallaroo.model_config
- 20: wallaroo.ModelConversion
- 21: wallaroo.models
- 22: wallaroo.notify
- 23: wallaroo.object
- 24: wallaroo.orchestration
- 25: wallaroo.pipeline
- 26: wallaroo.pipeline_config
- 27: wallaroo.pipeline_variant
- 28: wallaroo.tag
- 29: wallaroo.task
- 30: wallaroo.task_run
- 31: wallaroo.user
- 32: wallaroo.workspace
1 - wallaroo.assay
An Assay represents a record in the database. An assay contains some high level attributes such as name, status, active, etc. as well as the sub objects Baseline, Window and Summarizer which specify how the Baseline is derived, how the Windows should be created and how the analysis should be conducted.
Base constructor.
Each object requires:
- a GraphQL client - in order to fill its missing members dynamically
- an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
Disables the Assay. No further analysis will be conducted until the assay is enabled.
Creates a dataframe for the meta data in the baseline or window excluding the edge information.
Parameters
- assay_result: The dict of the raw assay result
Creates a dataframe specifically for the edge information in the baseline or window.
Parameters
- window_or_baseline: The dict from the assay result of either the window or baseline
The AssayAnalysis class helps handle the assay analysis logs from the Plateau logs. These logs are a json document with meta information on the assay and analysis as well as summary information on the baseline and window and information on the comparison between them.
Creates a simple dataframe making it easy to compare a baseline and window.
Creates a simple dataframe with the basic stats data for a baseline.
Creates a simple dataframe to compare the bin/edge information of baseline and window.
Helper class primarily to easily create a dataframe from a list of AssayAnalysis objects.
Creates and returns a summary dataframe from the assay results.
Creates and returns a dataframe with all values including inputs and outputs from the assay results.
Creates a basic chart of the scores from the dataframe created from the assay analysis list.
Wraps a list of assays for display in an HTML display-aware environment like Jupyter.
Inherited Members
- builtins.list
- list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
2 - wallaroo.assay_config
Simple function to placate pylance
Abstract base class for Baseline config objects. Currently only FixedBaseline is implemented though SlidingBaseline and others are planned.
The FixedBaseline is calculated from the inferences in a specific time window.
Inherited Members
Helper class that provides a standard way to create an ABC using inheritance.
Ensure the date is tz aware. If naive, assume it is in UTC.
Helps to easily create the config object for a FixedBaseline.
Inherited Members
The summarizer specifies how the bins of the baseline and window should be compared.
How the bins should be calculated:
- NONE - no bins. Only useful if we only care about the mean, median, etc.
- EQUAL - evenly spaced bins between the min and max, each of width (max - min) / num_bins.
- QUANTILE - based on percentages. If num_bins is 5 then quintiles, so bins are created at the 20%, 40%, 60%, 80% and 100% points.
- PROVIDED - the user provides the edge points for the bins.
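For intuition, here is a minimal NumPy sketch (not SDK code) showing how EQUAL and QUANTILE edges could be derived from a baseline sample:

```python
import numpy as np

baseline = np.random.default_rng(0).normal(loc=10.0, scale=2.0, size=1000)
num_bins = 5

# EQUAL: evenly spaced edges between the baseline min and max.
equal_edges = np.linspace(baseline.min(), baseline.max(), num_bins + 1)

# QUANTILE: edges at the 20%, 40%, 60%, 80% and 100% points for 5 bins,
# so each bin holds roughly the same number of baseline values.
quantile_edges = np.quantile(baseline, np.linspace(0, 1, num_bins + 1))

print("equal   :", np.round(equal_edges, 2))
print("quantile:", np.round(quantile_edges, 2))
```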
Inherited Members
- enum.Enum
- name
- value
- builtins.str
- encode
- replace
- split
- rsplit
- join
- capitalize
- casefold
- title
- center
- count
- expandtabs
- find
- partition
- index
- ljust
- lower
- lstrip
- rfind
- rindex
- rjust
- rstrip
- rpartition
- splitlines
- strip
- swapcase
- translate
- upper
- startswith
- endswith
- removeprefix
- removesuffix
- isascii
- islower
- isupper
- istitle
- isspace
- isdecimal
- isdigit
- isnumeric
- isalpha
- isalnum
- isidentifier
- isprintable
- zfill
- format
- format_map
- maketrans
What we use to calculate the score:
- EDGES - distances between the edges.
- DENSITY - percentage of values that fall in each bin.
- CUMULATIVE - cumulative percentage that fall in the bins.
Inherited Members
- enum.Enum
- name
- value
- builtins.str
- encode
- replace
- split
- rsplit
- join
- capitalize
- casefold
- title
- center
- count
- expandtabs
- find
- partition
- index
- ljust
- lower
- lstrip
- rfind
- rindex
- rjust
- rstrip
- rpartition
- splitlines
- strip
- swapcase
- translate
- upper
- startswith
- endswith
- removeprefix
- removesuffix
- isascii
- islower
- isupper
- istitle
- isspace
- isdecimal
- isdigit
- isnumeric
- isalpha
- isalnum
- isidentifier
- isprintable
- zfill
- format
- format_map
- maketrans
How we calculate the score:
- MAXDIFF - maximum difference between corresponding bins.
- SUMDIFF - sum of differences between corresponding bins.
- PSI - Population Stability Index.
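As a rough illustration of these metrics, the following sketch computes MAXDIFF, SUMDIFF and PSI from two binned density vectors; it mirrors the descriptions above rather than the SDK's internal implementation:

```python
import numpy as np

# Per-bin densities (fraction of values in each bin) for the baseline and a
# window, including the left and right outlier bins.
baseline = np.array([0.02, 0.20, 0.30, 0.28, 0.18, 0.02])
window   = np.array([0.05, 0.15, 0.25, 0.30, 0.20, 0.05])

max_diff = np.max(np.abs(baseline - window))   # MAXDIFF
sum_diff = np.sum(np.abs(baseline - window))   # SUMDIFF

# PSI: sum over bins of (window - baseline) * ln(window / baseline);
# a small epsilon guards against empty bins.
eps = 1e-6
b = np.clip(baseline, eps, None)
w = np.clip(window, eps, None)
psi = np.sum((w - b) * np.log(w / b))

print(max_diff, sum_diff, psi)
```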
Inherited Members
- enum.Enum
- name
- value
- builtins.str
- encode
- replace
- split
- rsplit
- join
- capitalize
- casefold
- title
- center
- count
- expandtabs
- find
- partition
- index
- ljust
- lower
- lstrip
- rfind
- rindex
- rjust
- rstrip
- rpartition
- splitlines
- strip
- swapcase
- translate
- upper
- startswith
- endswith
- removeprefix
- removesuffix
- isascii
- islower
- isupper
- istitle
- isspace
- isdecimal
- isdigit
- isnumeric
- isalpha
- isalnum
- isidentifier
- isprintable
- zfill
- format
- format_map
- maketrans
The UnivariateContinousSummarizer analyzes one input or output feature (univariate) at a time. Expects the values to be continuous, or at least numerous enough to fall in various/all of the bins.
Inherited Members
Helper class that provides a standard way to create an ABC using inheritance.
Builds the UnivariateSummarizer
Sets the binning mode. If BinMode.PROVIDED is specified a list of edges is also required.
Sets the number of bins. If weights have been previously set, they must be set to None to allow changing the number of bins.
Specifies the weighting to be given to the bins. The number of weights must be 2 larger than the number of bins to accommodate outliers smaller and larger than the values seen in the baseline. The passed-in values can be whole or real numbers and do not need to add up to 1 or any other specific value, as they will be normalized during the score calculation phase. The weights can be passed in as None to remove previously specified weights and to allow changing the number of bins.
Specifies the right-hand side (max value) of each bin. The number of edges must be equal to or one more than the number of bins. When equal to the number of bins, the edge for the left outlier bin is calculated from the baseline. When an additional edge is given (one more than the number of bins), that first (lower) value is used as the max value for the left outlier bin. The max value for the right-hand outlier bin is always Float MAX.
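A hedged configuration sketch using the builder methods described above; the attribute name summarizer_builder and the exact method names are assumptions to be checked against your SDK version, and assay_builder is assumed to come from client.build_assay():

```python
from wallaroo.assay_config import BinMode

# assay_builder is assumed to come from client.build_assay(...).
assay_builder.summarizer_builder.add_bin_mode(BinMode.QUANTILE)
assay_builder.summarizer_builder.add_num_bins(5)

# Weights must be num_bins + 2 long (left and right outlier bins included);
# they are normalized during scoring, so any non-negative numbers work.
assay_builder.summarizer_builder.add_bin_weights([0, 1, 1, 2, 1, 1, 0])

# Or provide explicit edges instead (the right-hand/max value of each bin):
# assay_builder.summarizer_builder.add_bin_mode(BinMode.PROVIDED)
# assay_builder.summarizer_builder.add_bin_edges([11.0, 12.0, 13.0, 14.0, 15.0])
```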
Configures a window to be compared against the baseline.
Helps build a WindowConfig. model and width are required but there are no good default values for them because they depend on the baseline. We leave it up to the assay builder to configure the window correctly after it is created.
The model name (model_id) that the window should analyze.
The width of the window to use when collecting data for analysis.
The width of the window to use when collecting data for analysis.
Used to format datetimes as needed when encoding to JSON.
Configuration for an Assay record.
Runs this assay interactively. The assay is not saved to the database nor are analysis records saved to a Plateau topic. Useful for exploring pipeline inference data and experimenting with thresholds.
Analyzes the inputs given to create an interactive run for each feature column. The assay is not saved to the database nor are analysis records saved to a Plateau topic. Useful for exploring inputs for possible causes when a difference is detected in the output.
Helps build an AssayConfig
Specify what the assay should analyze. Should start with input or output and have indexes (zero based) into row and column: For example 'input 0 1' specifies the second column of the first input.
Creates and adds a UCS to this assay builder.
If the user specifies a number of bins or a strategy for calculating them, use that. Else use the min of the square root or 50.
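Putting this module together, a hedged end-to-end sketch (assumes a reachable Wallaroo instance; pipeline and model names are placeholders, and exact return types, e.g. of pipelines_by_name, should be verified against your SDK version; add_iopath, interactive_run, and chart_scores follow the descriptions in this section):

```python
import datetime
import wallaroo

wl = wallaroo.Client()
pipeline = wl.pipelines_by_name("my-pipeline")   # placeholder pipeline name

baseline_end = datetime.datetime.now(datetime.timezone.utc)
baseline_start = baseline_end - datetime.timedelta(hours=2)

# Build an assay config against the first output column, then run it
# interactively (nothing is saved to the database or to Plateau).
assay_builder = wl.build_assay(
    "my-assay", pipeline, "my-model", baseline_start, baseline_end
)
assay_builder.add_iopath("output 0 0")
results = assay_builder.build().interactive_run()
results.chart_scores()
```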
3 - wallaroo.auth
Handles authentication to the Wallaroo platform.
Performs a "device code"-style OAuth login flow.
The code is organized as follows:
Auth objects returned by create() should be placed on each request to platform APIs. Currently, we have the following types:
- NoAuth: Does not modify requests
- PlatformAuth: Places Authorization: Bearer XXX headers on each outgoing request
Objects derived from TokenFetcher know how to obtain an AccessToken from a particular provider:
- KeycloakTokenFetcher: Fetches a token from Keycloak using a device code login flow
- CachedTokenFetcher: Wraps another TokenFetcher and caches the value to a JSON file to reduce the number of user logins needed.
Defines all the supported auth types.
Handles conversions from string names to enum values.
Inherited Members
- enum.Enum
- name
- value
TokenData(token, user_email, user_id)
Create new instance of TokenData(token, user_email, user_id)
Inherited Members
- builtins.tuple
- index
- count
Returns an auth object of the corresponding type.
Parameters
- str keycloak_addr: Address of the Keycloak instance to auth against
- AuthType or str auth_type: Type of authentication to use
Returns
Auth object that can be passed to all requests calls
Raises
- NotImplementedError: if auth_type is not recognized
Removes cached values for all third-party auth providers.
This will not invalidate auth objects already created with create().
Base type for all errors in this module.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Errors encountered while performing a login.
Errors encountered while refreshing an AccessToken.
4 - wallaroo.checks
Root base class for all model-checker expressions. Provides pythonic magic-method sugar for expression definitions.
Root base class for all model-checker expressions. Provides pythonic magic-method sugar for expression definitions.
Inherited Members
Root base class for all model-checker expressions. Provides pythonic magic-method sugar for expression definitions.
Inherited Members
Declares a model variable that can be used as an :py:Expression: in the model checker. Variables are identified by their model_name, a position of either "input" or "output", and the tensor index.
Inherited Members
Root base class for all model-checker expressions. Provides pythonic magic-method sugar for expression definitions.
Inherited Members
Returns true if a string is compliant with DNS label name requirements to ensure it can be part of a full DNS host name
Validates that 'name' complies with DNS naming requirements or raises an exception
5 - wallaroo.client
Client handle to a Wallaroo platform instance.
Objects of this class serve as the entrypoint to Wallaroo platform functionality.
Create a Client handle.
Parameters
- str api_endpoint: Host/port of the platform API endpoint
- str auth_endpoint: Host/port of the platform Keycloak instance
- int timeout: Max timeout of web requests, in seconds
- str auth_type: Authentication type to use. Can be one of: "none", "sso", "user_password".
- bool interactive: If provided and True, some calls will print additional human information, or won't when False. If not provided, interactive defaults to True if running inside Jupyter and False otherwise.
- str time_format: Preferred strftime format string for displaying timestamps in a human context.
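A hedged connection sketch using the parameters above; endpoint URLs are placeholders:

```python
import wallaroo

# Inside a Wallaroo-hosted JupyterHub instance the defaults are usually enough:
wl = wallaroo.Client()

# From an external environment, the endpoints and auth type are given explicitly.
wl = wallaroo.Client(
    api_endpoint="https://wallaroo.example.com",    # placeholder
    auth_endpoint="https://keycloak.example.com",   # placeholder
    auth_type="sso",
    timeout=90,
    interactive=True,
)
```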
Method to calculate the auth values specified as defaults, as params or in ENV vars. Made static to be testable without reaching out to SSO, etc.
List all deployments (active or not) on the platform.
Returns
A list of all deployments on the platform.
Search for pipelines. All parameters are optional, in which case the result is the same as list_pipelines(). All times are strings to be parsed by datetime.isoformat. Example:
myclient.search_pipelines(created_end='2022-04-19 13:17:59+00:00', search_term="foo")
Parameters
- str search_term: Will be matched against tags and model names. Example: "footag123".
- bool deployed: Pipeline was deployed or not
- str created_start: Pipeline was created at or after this time
- str created_end: Pipeline was created at or before this time
- str updated_start: Pipeline was updated at or after this time
- str updated_end: Pipeline was updated at or before this time
Returns
A list of all pipelines on the platform.
Search models owned by you.
Parameters
- search_term: Searches the following metadata: names, shas, versions, file names, and tags
- uploaded_time_start: Inclusive time of upload
- uploaded_time_end: Inclusive time of upload
Search all models you have access to.
Parameters
- search_term: Searches the following metadata: names, shas, versions, file names, and tags
- uploaded_time_start: Inclusive time of upload
- uploaded_time_end: Inclusive time of upload
Deactivates an existing user of the platform
Deactivated users cannot log into the platform. Deactivated users do not count towards the number of allotted user seats from the license.
The Models and Pipelines owned by the deactivated user are not removed from the platform.
Parameters
- str email: The email address of the user to deactivate.
Returns
None
Activates an existing user of the platform that had been previously deactivated.
Activated users can log into the platform.
Parameters
- str email: The email address of the user to activate.
Returns
None
Upload a model defined by a file as a new model variant.
Parameters
- name: str The name of the model of which this is a variant. Names must be ASCII alpha-numeric characters or dash (-) only.
- path: Union[str, pathlib.Path] Path of the model file to upload.
- framework: Optional[Framework] Supported model frameworks. Use models from Framework Enum. Example: Framework.PYTORCH, Framework.TENSORFLOW
- input_schema: Optional pa.Schema Input schema, required for flavors other than ONNX, Tensorflow, and Python
- output_schema: Optional pa.Schema Output schema, required for flavors other than ONNX, Tensorflow, and Python
- convert_wait: Optional bool Defaults to True. Specifies if method should return when conversion is over or not.
Returns
The created Model.
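A hedged upload sketch; paths and names are placeholders, Framework.ONNX is assumed to be a member of the Framework enum, and the pyarrow schemas are only needed for flavors other than ONNX, TensorFlow, and Python as noted above:

```python
import pyarrow as pa
import wallaroo
from wallaroo.framework import Framework

wl = wallaroo.Client()

# ONNX models need no schemas.
onnx_model = wl.upload_model("my-onnx-model", "./model.onnx", framework=Framework.ONNX)

# Other flavors (e.g. PyTorch) require input and output schemas.
input_schema = pa.schema([pa.field("tensor", pa.list_(pa.float32(), 10))])
output_schema = pa.schema([pa.field("output", pa.list_(pa.float32(), 1))])
torch_model = wl.upload_model(
    "my-torch-model", "./model.pt",
    framework=Framework.PYTORCH,
    input_schema=input_schema,
    output_schema=output_schema,
    convert_wait=True,
)
```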
Registers an MLFlow model as a new model.
Parameters
- str model_name: The name of the model of which this is a variant. Names must be ASCII alpha-numeric characters or dash (-) only.
- str image: Image name of the MLFlow model to register.
Returns
The created Model.
Fetch a Model by name.
Parameters
- str model_class: Name of the model class.
- str model_name: Name of the variant within the specified model class.
Returns
The Model with the corresponding model and variant name.
Fetch a Deployment by name.
Parameters
- str deployment_name: Name of the deployment.
Returns
The Deployment with the corresponding name.
Fetch Pipelines by name.
Parameters
- str pipeline_name: Name of the pipeline.
Returns
The Pipeline with the corresponding name.
Starts building a pipeline with the given pipeline_name, returning a :py:PipelineConfigBuilder:. When completed, the pipeline can be uploaded with .upload().
Parameters
- pipeline_name string: Name of the pipeline, must be composed of ASCII alpha-numeric characters plus dash (-).
Creates a new PipelineVariant of a "value-split experiment" type.
Parameters
- str name: Name of the Pipeline
- meta_key str: Inference input key on which to redirect inputs to experiment models.
- default_model ModelConfig: Model to send inferences by default.
- challenger_models List[Tuple[Any, ModelConfig]]: A list of meta_key values -> Models to send inferences. If the inference data referred to by meta_key is equal to one of the keys in this tuple, that inference is redirected to the corresponding model instead of the default model.
Cleans up the inference result and log data from engine / plateau for display (ux) purposes.
Get logs for the given topic.
Parameters
- topic: str The topic to get logs for.
- limit: Optional[int] The maximum number of logs to return.
- start_datetime: Optional[datetime] The start time to get logs for.
- end_datetime: Optional[datetime] The end time to get logs for.
- dataset: Optional[List[str]] By default this is set to ["*"] which returns ["time", "in", "out", "check_failures"]. Other available options - ["metadata"]
- dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
- dataset_separator: Optional[Union[Sequence[str], str]] If set to ".", return dataset will be flattened.
- directory: Optional[str] If set, logs will be exported to a file in the given directory.
- file_prefix: Optional[str] Prefix to name the exported file. Required if directory is set.
- data_size_limit: Optional[str] The maximum size of the exported data in MB. Size includes all files within the provided directory. By default, the data_size_limit will be set to 100MB.
- arrow: Optional[bool] If set to True, return logs as an Arrow Table. Else, returns Pandas DataFrame.
Returns
Tuple[Union[pa.Table, pd.DataFrame, LogEntries], str] The logs and status.
Gets logs from Plateau for a particular time window without attempting to convert them to Inference LogEntries. Logs can be returned as strings or the json parsed into lists and dicts.
Parameters
- topic str: The name of the topic to query
- start Optional[datetime]: The start of the time window
- end Optional[datetime]: The end of the time window
- limit int: The number of records to retrieve. Note retrieving many records may be a performance bottleneck.
- parse bool: Whether to attempt to parse the string as a JSON object.
- verbose bool: Prints out info to help diagnose issues.
Gets logs from Plateau for a particular time window and filters them for the model specified.
Parameters
- pipeline_name str: The name/pipeline_id of the pipeline to query
- topic str: The name of the topic to query
- start Optional[datetime]: The start of the time window
- end Optional[datetime]: The end of the time window
- model_id: The name of the specific model to filter if any
- limit int: The number of records to retrieve. Note retrieving many records may be a performance bottleneck.
- verbose bool: Prints out info to help diagnose issues.
Gets the assay results for a particular time window, parses them, and returns an AssayAnalysisList of AssayAnalysis.
Parameters
- assay_id int: The id of the assay we are looking for.
- start datetime: The start of the time window
- end datetime: The end of the time window
Creates an AssayBuilder that can be used to configure and create Assays.
Parameters
- assay_name str: Human friendly name for the assay
- pipeline Pipeline: The pipeline this assay will work on
- model_name str: The model that this assay will monitor
- baseline_start datetime: The start time for the inferences to use as the baseline
- baseline_end datetime: The end time of the baseline window. Windows start immediately after the baseline window and are run at regular intervals continuously until the assay is deactivated or deleted.
Creates an assay in the database.
Parameters
- config AssayConfig: The configuration for the assay to create.
Returns
The identifier (int) for the assay that was created.
Create a new workspace with the current user as its first owner.
Parameters
- str workspace_name: Name of the workspace, must be composed of ASCII alpha-numeric characters plus dash (-)
List all workspaces on the platform which this user has permission to see.
Returns
A list of all workspaces on the platform.
Any calls involving pipelines or models will use the given workspace from then on.
Return the current workspace. See also set_current_workspace.
Given an inbound source model, a model type (xgboost, keras, sklearn), and conversion arguments, converts the model to ONNX and adds it to the available models for a pipeline.
Parameters
- Union[str, pathlib.Path] path: The path to the model to convert, i.e. the source model.
- ModelConversionSource source: The origin model type i.e. keras, sklearn or xgboost.
- ModelConversionArguments conversion_arguments: A structure representing the arguments for converting a specific model type.
Returns
An instance of the Model being converted to ONNX.
Raises
- ModelConversionGenericException: On a generic failure, please contact our support for further assistance.
- ModelConversionFailure: Failure in converting the model type.
- ModelConversionUnsupportedType: Raised when the source type passed is not supported.
- ModelConversionSourceFileNotPresent: Raised when the passed source file does not exist.
List all Orchestrations in the current workspace.
Returns
A List containing all Orchestrations in the current workspace.
Upload a file to be packaged and used as an Orchestration.
The uploaded artifact must be a ZIP file which contains:
- User code. If main.py exists, then that will be used as the task entrypoint. Otherwise, the first main.py found in any subdirectory will be used as the entrypoint.
- Optional: A standard Python requirements.txt for any dependencies to be provided in the task environment. The Wallaroo SDK will already be present and should not be mentioned. Multiple requirements.txt files are not allowed.
- Optional: Any other artifacts desired for runtime, including data or code.
Parameters
- Optional[str] path: The path to the file on your filesystem that will be uploaded as an Orchestration.
- Optional[bytes] bytes_buffer: The raw bytes to upload to be used as an Orchestration. Cannot be used with the path param.
- Optional[str] name: An optional descriptive name for this Orchestration.
- Optional[str] file_name: An optional filename to describe your Orchestration when using the bytes_buffer param. Ignored when path is used.
Returns
The Orchestration that was uploaded.
Raises
- OrchestrationUploadFailed: If a server-side error prevented the upload from succeeding.
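A hedged sketch of a conforming ZIP layout and the upload call; file and orchestration names are placeholders:

```python
# orchestration.zip
# ├── main.py            <- task entrypoint
# ├── requirements.txt   <- optional; do not list the wallaroo SDK itself
# └── data/prices.csv    <- optional extra artifacts

import wallaroo

wl = wallaroo.Client()
orch = wl.upload_orchestration(path="./orchestration.zip", name="nightly-batch-infer")

# Alternatively, upload raw bytes instead of a path:
with open("./orchestration.zip", "rb") as f:
    orch = wl.upload_orchestration(
        bytes_buffer=f.read(),
        file_name="orchestration.zip",
        name="nightly-batch-infer",
    )
```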
List all Tasks in the current Workspace.
Returns
A List containing Task objects.
Retrieve a Task by its ID.
Parameters
- str task_id: The ID of the Task to retrieve.
Returns
A Task object.
Determines if this code is inside an orchestration task.
Returns
True if running in a task.
When running inside a task (see in_task()), obtain arguments passed to the task.
Returns
Dict of the arguments
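A hedged sketch of how code inside an orchestration's main.py might branch on in_task() and read task_args(); the argument key is a placeholder:

```python
import wallaroo

wl = wallaroo.Client()

if wl.in_task():
    # json_args passed at run/schedule time arrive here as a dict.
    args = wl.task_args()
    pipeline_name = args.get("pipeline_name", "my-pipeline")   # placeholder key and default
else:
    pipeline_name = "my-pipeline"

print(f"running against pipeline {pipeline_name}")
```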
Creates a Connection with the given name, type, and type-specific details.
Returns
Connection to an external data source.
6 - wallaroo.comment
Comment that may be attached to models and pipelines.
Base constructor.
Each object requires:
- a GraphQL client - in order to fill its missing members dynamically
- an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
7 - wallaroo.connection
Connection to an external data source or destination.
Base constructor.
Each object requires:
- a GraphQL client - in order to fill its missing members dynamically
- an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
Wraps a list of connections for display in a display-aware environment like Jupyter.
Inherited Members
- builtins.list
- list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
8 - wallaroo.datasizeunit
Data size limits for exported pipeline log files
Inherited Members
- enum.Enum
- name
- value
9 - wallaroo.deployment
Common base class for all non-exit exceptions.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Unspecified run-time error.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Base class for all backend GraphQL API objects.
This class serves as a framework for API objects to be constructed based on a partially-complete JSON response, and to fill in their remaining members dynamically if needed.
Base constructor.
Each object requires:
- a GraphQL client - in order to fill its missing members dynamically
- an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
Deploys this deployment, if it is not already deployed.
If the deployment is already deployed, this is a no-op.
Shuts down this deployment, if it is deployed.
If the deployment is already undeployed, this is a no-op.
Returns a dict of deployment status useful for determining if a deployment has succeeded.
Returns
Dict of deployment internal state information.
Waits for the deployment status to enter the "Running" state.
Will wait up "timeout_request" seconds for the deployment to enter that state. This is set in the "Client" object constructor. Will raise various exceptions on failures.
Returns
The deployment, for chaining.
Waits for the deployment to end.
Will wait up "timeout_request" seconds for the deployment to enter that state. This is set in the "Client" object constructor. Will raise various exceptions on failures.
Returns
The deployment, for chaining.
Returns an inference result on this deployment, given a tensor.
Parameters
- tensor: Union[Dict[str, Any], List[Any], pd.DataFrame, pa.Table]. The tensor to be sent to run inference on.
- timeout: Optional[Union[int, float]] infer requests will time out after the provided number of seconds is exceeded. timeout defaults to 15 secs.
- dataset: Optional[List[str]] By default this is set to ["*"] which returns ["time", "in", "out", "check_failures"]. Other available options - ["metadata"]
- dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
- dataset_separator: Optional[str] If set to ".", returned dataset will be flattened.
Returns
InferenceResult in dictionary, dataframe or arrow format.
This method is used to run inference on a deployment using a file. The file can be in one of the following formats: .arrow, or .json containing data either in the pandas records format or the Wallaroo custom JSON format.
Parameters
- filename: Union[str, pathlib.Path]. The file to be sent to run inference on.
- data_format: Optional[str]. The format of the data in the file. If not provided, the format will be inferred from the file extension.
- timeout: Optional[Union[int, float]] infer requests will time out after the provided number of seconds is exceeded. timeout defaults to 15 secs.
- dataset: Optional[List[str]] By default this is set to ["*"] which returns ["time", "in", "out", "check_failures"]. Other available options - ["metadata"]
- dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
- dataset_separator: Optional[str] If set to ".", returned dataset will be flattened.
Returns
InferenceResult in dictionary, dataframe or arrow format.
Replaces the current model with a default-configured Model.
Parameters
- Model model: Model variant to replace current model with
Replaces the current model with a configured variant.
Parameters
- ModelConfig model_config: Configured model to replace current model with
Returns the internal inference URL that is only reachable from inside of the Wallaroo cluster by SDK instances deployed in the cluster.
If both pipelines and models are configured on the Deployment, this gives preference to pipelines. The returned URL is always for the first configured pipeline or model.
Returns the inference URL.
If both pipelines and models are configured on the Deployment, this gives preference to pipelines. The returned URL is always for the first configured pipeline or model.
Deployment.logs() has been removed. Please use pipeline.logs() instead.
10 - wallaroo.deployment_config
Inherited Members
- builtins.dict
- get
- setdefault
- pop
- popitem
- keys
- items
- values
- update
- fromkeys
- clear
- copy
Configures the minimum and maximum for autoscaling
Sets the average CPU metric to scale on in a percentage
Sets the number of CPUs to be used for the model's sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment.
Parameters
- Model model: The sidekick model to configure.
- int core_count: Number of CPU cores to use in this sidekick.
Returns
This DeploymentConfigBuilder instance for chaining.
Sets the memory to be used for the model's sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment.
Parameters
- Model model: The sidekick model to configure.
- str memory_spec: Specification of amount of memory (e.g., "2Gi", "500Mi") to use in this sidekick.
Returns
This DeploymentConfigBuilder instance for chaining.
Sets the environment variables to be set for the model's sidekick container. Only affects image-based models (e.g. MLFlow models) in a deployment.
Parameters
- Model model: The sidekick model to configure.
- Dict[str, str] environment: Dictionary of environment variables names and their corresponding values to be set in the sidekick container.
Returns
This DeploymentConfigBuilder instance for chaining.
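A hedged sketch combining the sidekick settings above; build() and the deployment_config parameter of pipeline.deploy() are assumptions, and mlflow_model / pipeline stand in for objects created earlier:

```python
from wallaroo.deployment_config import DeploymentConfigBuilder

# mlflow_model is an image-based (e.g. MLFlow) Model registered earlier;
# build() is assumed to finalize the builder into a deployment configuration.
deployment_config = (
    DeploymentConfigBuilder()
    .sidekick_cpus(mlflow_model, 2)              # CPU cores for the model container
    .sidekick_memory(mlflow_model, "2Gi")        # memory for the model container
    .sidekick_env(mlflow_model, {"GUNICORN_CMD_ARGS": "--timeout=180"})
    .build()
)

pipeline.deploy(deployment_config=deployment_config)
```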
11 - wallaroo.engine_config
Wraps an engine config.
Creates an EngineConfig for use with standalone mode
12 - wallaroo.expression
13 - wallaroo.framework
An Enum to represent the supported frameworks.
Inherited Members
- enum.Enum
- name
- value
- builtins.str
- encode
- replace
- split
- rsplit
- join
- capitalize
- casefold
- title
- center
- count
- expandtabs
- find
- partition
- index
- ljust
- lower
- lstrip
- rfind
- rindex
- rjust
- rstrip
- rpartition
- splitlines
- strip
- swapcase
- translate
- upper
- startswith
- endswith
- removeprefix
- removesuffix
- isascii
- islower
- isupper
- istitle
- isspace
- isdecimal
- isdigit
- isnumeric
- isalpha
- isalnum
- isidentifier
- isprintable
- zfill
- format
- format_map
- maketrans
14 - wallaroo.functions
15 - wallaroo.inference_decode
Decode inference results. Since they have a potentially rich structure, this could become a substantial effort in the future.
TODO: Support multiple outputs TODO: Support multiple data types
Converts a possibly multidimensional list of numbers into a dict where each item in the list is represented by a key/value pair in the dict. Does not maintain dimensions since dataframes are 2d. Does not maintain/manage types since it should work for any type supported by numpy.
For example [1,2,3] => {prefix_0: 1, prefix_1: 2, prefix_2: 3}. [[1,2],[3,4]] => {prefix_0_0: 1, prefix_0_1: 2, prefix_1_0: 3, prefix_1_1: 4}
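An illustrative implementation of that flattening (not the SDK's internal code), assuming nested Python lists:

```python
def nested_list_to_dict(values, prefix="out"):
    """Flatten a possibly nested list into {prefix_i_j...: value} pairs."""
    flat = {}
    for i, v in enumerate(values):
        key = f"{prefix}_{i}"
        if isinstance(v, list):
            flat.update(nested_list_to_dict(v, key))
        else:
            flat[key] = v
    return flat

print(nested_list_to_dict([1, 2, 3], "prefix"))
# {'prefix_0': 1, 'prefix_1': 2, 'prefix_2': 3}
print(nested_list_to_dict([[1, 2], [3, 4]], "prefix"))
# {'prefix_0_0': 1, 'prefix_0_1': 2, 'prefix_1_0': 3, 'prefix_1_1': 4}
```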
Recursively flattens the input dict, setting the values on the output dict. Assumes simple value types (str, numbers, dicts, and lists). If a value is a dict it is flattened recursively. If a value is a list each item is set as a new k, v pair.
Very similar to dict_list_to_dataframe but specific to inference logs since they have hierarchical input and output fields/structures that must be treated in particular ways.
Primarily for assay result lists but can be used for any list of simple dicts.
16 - wallaroo.inference_result
17 - wallaroo.logs
Wraps a single log entry.
This class is highly experimental, is unsupported/untested, and may change/disappear in the near future.
Wraps a list of log entries.
This class is highly experimental, is unsupported/untested, and may change/disappear in the near future.
Inherited Members
- builtins.list
- list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
Wraps a list of log entries.
This class is highly experimental, is unsupported/untested, and may change/disappear in the near future.
Inherited Members
- builtins.list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
18 - wallaroo.model
Wraps a backend Model object.
Base constructor.
Each object requires:
- a GraphQL client - in order to fill its missing members dynamically
- an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
Convenience function to quickly deploy a Model. It will configure the model, create a pipeline with a single model step, deploy it, and return the pipeline.
Typically, the configure() method is used to configure a model prior to deploying it. However, if a default configuration is sufficient, this function can be used to quickly deploy with said default configuration.
The filename this Model was generated from needs to have a recognizable file extension so that the runtime can be inferred. Currently, this is:
- .onnx -> ONNX runtime
Parameters
- str deployment_name: Name of the deployment to create. Must be unique across all deployments. Deployment names must be ASCII alpha-numeric characters plus dash (-) only.
Wraps a list of Models for display in a display-aware environment like Jupyter.
Inherited Members
- builtins.list
- list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
19 - wallaroo.model_config
Wraps a backend ModelConfig object.
Base constructor.
Each object requires:
- a GraphQL client - in order to fill its missing members dynamically
- an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
Creates a ModelConfig intended for use in generating standalone configurations
20 - wallaroo.ModelConversion
An enumeration.
Inherited Members
- enum.Enum
- name
- value
ConvertKerasArguments(name, comment, input_type, dimensions)
Create new instance of ConvertKerasArguments(name, comment, input_type, dimensions)
Inherited Members
- builtins.tuple
- index
- count
ConvertSKLearnArguments(name, number_of_columns, input_type, comment)
Create new instance of ConvertSKLearnArguments(name, number_of_columns, input_type, comment)
Inherited Members
- builtins.tuple
- index
- count
ConvertXGBoostArgs(name, number_of_columns, input_type, comment)
Create new instance of ConvertXGBoostArgs(name, number_of_columns, input_type, comment)
Inherited Members
- builtins.tuple
- index
- count
An enumeration.
Inherited Members
- enum.Enum
- name
- value
Common base class for all non-exit exceptions.
Inherited Members
- builtins.Exception
- Exception
- builtins.BaseException
- with_traceback
- args
Common base class for all non-exit exceptions.
Inherited Members
- builtins.Exception
- Exception
- builtins.BaseException
- with_traceback
- args
Common base class for all non-exit exceptions.
Inherited Members
- builtins.Exception
- Exception
- builtins.BaseException
- with_traceback
- args
Common base class for all non-exit exceptions.
Inherited Members
- builtins.Exception
- Exception
- builtins.BaseException
- with_traceback
- args
21 - wallaroo.models
A Wallaroo Model object. Models may have multiple versions, accessed via .versions()
Base constructor.
Each object requires:
- a GraphQL client - in order to fill its missing members dynamically
- an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
Wraps a list of Models for display in a display-aware environment like Jupyter.
Inherited Members
- builtins.list
- list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
22 - wallaroo.notify
23 - wallaroo.object
Represents a not-set sentinel value.
Attributes that are null in the database will be returned as None in Python, and we want them to be set as such, so None cannot be used as a sentinel value signaling that an optional attribute is not yet set. Objects of this class fill that role instead.
Decorator that rehydrates the named attribute if needed.
This should decorate getter calls for an attribute:
```python
@rehydrate(_foo_attr)
def foo_attr(self):
    return self._foo_attr
```
This will cause the API object to "rehydrate" (perform a query to fetch and fill in all attributes from the database) if the named attribute is not set.
Returns a value in a nested dictionary, or DehydratedValue.
Parameters
- str path: Dot-delimited path within a nested dictionary; e.g. foo.bar.baz
Returns
The requested value inside the dictionary, or DehydratedValue if it doesn't exist.
Raised when an API object is initialized without a required attribute.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when a model file fails to upload.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when a model file fails to convert.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when a model conversion took longer than 10 minutes
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when a query for a specific API object returns no results.
This is specifically for queries by unique identifiers that are expected to return exactly one result; queries that can return 0 to many results should return empty list instead of raising this exception.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when deployment fails.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when a community instance has hit the user limit
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when deployment fails.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when inference fails.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when an entity's name does not meet the expected criteria.
Parameters
- str name: the name string that is invalid
- str req: a string description of the requirement
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when some component cannot be contacted. There is a networking, configuration or installation problem.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Base class for all backend GraphQL API objects.
This class serves as a framework for API objects to be constructed based on a partially-complete JSON response, and to fill in their remaining members dynamically if needed.
Base constructor.
Each object requires:
- a GraphQL client - in order to fill its missing members dynamically
- an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
24 - wallaroo.orchestration
An Orchestration object that represents some user-defined code that has been packaged into a container and can be deployed.
Base constructor.
Each object requires:
- a GraphQL client - in order to fill its missing members dynamically
- an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
Runs this Orchestration once.
Parameters
- name str A descriptive identifier for this run.
- json_args Dict[Any, Any] A JSON object containing deploy-specific arguments.
- timeout Optional[int] A timeout in seconds. Any instance of this orchestration that is running for longer than this specified time will be automatically killed.
- debug Optional[bool] Produce extra debugging output about the run
Returns
A metadata object associated with the deploy Task
Runs this Orchestration on a cron schedule.
Parameters
- name str A descriptive identifier for this run.
- schedule str A cron-style scheduling string, e.g. "* * * * *" or "*/15 * * * *"
- json_args Dict[Any, Any] A JSON object containing deploy-specific arguments.
- timeout Optional[int] A timeout in seconds. Any single instance of this orchestration that is running for longer than this specified time will be automatically killed. Future runs will still be scheduled.
- debug Optional[bool] Produce extra debugging output about the run
Returns
A metadata object associated with the deploy Task
Wraps a list of orchestrations for display in a display-aware environment like Jupyter.
Inherited Members
- builtins.list
- list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
Raised when uploading an Orchestration fails due to a backend issue.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when uploading an Orchestration without providing a file-like object.
Inherited Members
- builtins.BaseException
- with_traceback
- args
Raised when deploying an Orchestration fails due to a backend issue.
Inherited Members
- builtins.BaseException
- with_traceback
- args
25 - wallaroo.pipeline
A pipeline is an execution context for models. Pipelines contain Steps, which are often Models. Pipelines can be deployed or un-deployed.
Base constructor.
Each object requires:
- a GraphQL client - in order to fill its missing members dynamically
- an initial data blob - typically from unserialized JSON, contains at least the data for required members (typically the object's primary key) and optionally other data members.
Get a pipeline configuration for a specific version.
Parameters
- version: str Version of the pipeline.
Returns
Dict[str, Any] Pipeline configuration.
Get inference logs for this pipeline.
Parameters
- limit: Optional[int]: Maximum number of logs to return.
- start_datetime: Optional[datetime.datetime]: Start time for logs.
- end_datetime: Optional[datetime.datetime]: End time for logs.
- valid: Optional[bool]: If set to False, will include logs for failed inferences
- dataset: Optional[List[str]] By default this is set to ["*"] which returns, ["time", "in", "out", "check_failures"]. Other available options - ["metadata"]
- dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
- dataset_separator: Optional[Union[Sequence[str], str]] If set to ".", return dataset will be flattened.
- arrow: Optional[bool] If set to True, return logs as an Arrow Table. Else, returns Pandas DataFrame.
Returns
Union[LogEntries, pa.Table, pd.DataFrame]
Export logs to a user provided local file.
Parameters
- directory: Optional[str] Logs will be exported to a file in the given directory. By default, logs will be exported to new "logs" subdirectory in current working directory.
- file_prefix: Optional[str] Prefix to name the exported file. By default, the file_prefix will be set to the pipeline name.
- data_size_limit: Optional[str] The maximum size of the exported data in bytes. Size includes all files within the provided directory. By default, the data_size_limit will be set to 100MB.
- limit: Optional[int] The maximum number of logs to return.
- start_datetime: Optional[datetime.datetime] The start time to filter logs by.
- end_datetime: Optional[datetime.datetime] The end time to filter logs by.
- valid: Optional[bool] If set to False, will return logs for failed inferences.
- dataset: Optional[List[str]] By default this is set to ["*"] which returns, ["time", "in", "out", "check_failures"]. Other available options - ["metadata"]
- dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
- dataset_separator: Optional[Union[Sequence[str], str]] If set to ".", return dataset will be flattened.
- arrow: Optional[bool] If set to True, return logs as an Arrow Table. Else, returns Pandas DataFrame.
Returns
None
Deploy pipeline. pipeline_name is optional if deploy was called previously. When specified, pipeline_name must be ASCII alpha-numeric characters, plus dash (-) only.
Returns an inference result on this deployment, given a tensor.
Parameters
- tensor: Union[Dict[str, Any], pd.DataFrame, pa.Table] Inference data. Should be a dictionary. Future improvement: will be a pandas dataframe or arrow table
- timeout: Optional[Union[int, float]] infer requests will time out after the provided number of seconds is exceeded. timeout defaults to 15 secs.
- dataset: Optional[List[str]] By default this is set to ["*"] which returns ["time", "in", "out", "check_failures"]. Other available options - ["metadata"]
- dataset_exclude: Optional[List[str]] If set, allows user to exclude parts of dataset.
- dataset_separator: Optional[Union[Sequence[str], str]] If set to ".", return dataset will be flattened.
Returns
InferenceResult in dictionary, dataframe or arrow format.
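A hedged end-to-end sketch of building, deploying and querying a pipeline; names and the model file are placeholders:

```python
import pandas as pd
import wallaroo
from wallaroo.framework import Framework

wl = wallaroo.Client()
model = wl.upload_model("my-model", "./model.onnx", framework=Framework.ONNX)   # placeholder

pipeline = wl.build_pipeline("my-pipeline")
pipeline = pipeline.add_model_step(model)
pipeline = pipeline.deploy()

# Inference with a pandas DataFrame; the default dataset returns the
# "time", "in", "out" and "check_failures" columns.
data = pd.DataFrame({"tensor": [[1.0, 2.0, 3.0, 4.0]]})
result = pipeline.infer(data, timeout=30)
print(result)

pipeline.undeploy()
```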
Returns an inference result on this deployment, given tensors in a file.
Replaces the step at the given index with a model step
Perform inference on the same input data for any number of models.
Replaces the step at the index with a multi model step
Run audit logging on a specified slice of model outputs. The slice must be in python-like format. start:, start:end, and :end are supported.
Split traffic based on the value at a given meta_key in the input data, routing to the appropriate model. If the resulting value is a key in options, the corresponding model is used. Otherwise, the default model is used for inference.
Replace the step at the index with a key split step
Routes inputs to a single model, randomly chosen from the list of weighted options.
Each model receives inputs that are approximately proportional to the weight it is assigned. For example, with two models having weights 1 and 1, each will receive roughly equal amounts of inference inputs. If the weights were changed to 1 and 2, the models would receive roughly 33% and 66% respectively instead.
When choosing the model to use, a random number between 0.0 and 1.0 is generated. The weighted inputs are mapped to that range, and the random input is then used to select the model to use. For example, for the two-models equal-weight case, a random key of 0.4 would route to the first model. 0.6 would route to the second.
To support consistent assignment to a model, a hash_key can be specified. This must be between 0.0 and 1.0. The value at this key, when present in the input data, will be used instead of a random number for model selection.
Replace the step at the index with a random split step
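A hedged sketch of a two-to-one random split with an optional hash key, following the description above; argument shapes should be checked against your SDK version:

```python
# control and challenger are previously uploaded wallaroo Models;
# wl is a wallaroo.Client() connected earlier.
pipeline = wl.build_pipeline("ab-test-pipeline")

# Roughly 2/3 of inferences go to control and 1/3 to challenger; the optional
# "session_id" hash key makes routing deterministic per session value.
pipeline = pipeline.add_random_split(
    [(2, control), (1, challenger)],
    "session_id",
)
pipeline = pipeline.deploy()
```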
Create a "shadow deployment" experiment pipeline. The champion
model and all challengers
are run for each input. The result data for
all models is logged, but the output of the champion
is the only
result returned.
This is particularly useful for "burn-in" testing a new model with real world data without displacing the currently proven model.
This is currently implemented as three steps: A multi model step, an audit step, and a select step. To remove or replace this step, you need to remove or replace all three. You can remove steps using pipeline.remove_step
Replace a given step with a shadow deployment
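A hedged sketch of a shadow deployment following the champion/challenger description above; model variables stand in for previously uploaded models:

```python
# champion, challenger_a and challenger_b are previously uploaded wallaroo Models;
# wl is a wallaroo.Client() connected earlier.
pipeline = wl.build_pipeline("shadow-pipeline")
pipeline = pipeline.add_shadow_deploy(champion, [challenger_a, challenger_b])
pipeline = pipeline.deploy()

# Only the champion's result is returned; challenger results are logged
# alongside it and can be inspected later via pipeline.logs().
results = pipeline.infer(my_input_dataframe)
```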
Add a validation with the given name. All validations are run on all outputs, and all failures are logged.
Replace the step at the given index with the specified alert
Get the details of an explainability config.
Wraps a list of pipelines for display in a display-aware environment like Jupyter.
Inherited Members
- builtins.list
- list
- clear
- copy
- append
- insert
- extend
- pop
- remove
- index
- count
- reverse
- sort
26 - wallaroo.pipeline_config
An enumeration.
Inherited Members
- enum.Enum
- name
- value
- builtins.str
- encode
- replace
- split
- rsplit
- join
- capitalize
- casefold
- title
- center
- count
- expandtabs
- find
- partition
- index
- ljust
- lower
- lstrip
- rfind
- rindex
- rjust
- rstrip
- rpartition
- splitlines
- strip
- swapcase
- translate
- upper
- startswith
- endswith
- removeprefix
- removesuffix
- isascii
- islower
- isupper
- istitle
- isspace
- isdecimal
- isdigit
- isnumeric
- isalpha
- isalnum
- isidentifier
- isprintable
- zfill
- format
- format_map
- maketrans
Perform inference with a single model.
Replaces the step at the given index with a model step
Perform inference on the same input data for any number of models.
Replaces the step at the index with a multi model step
Run audit logging on a specified slice of model outputs. The slice must be in python-like format. start:, start:end, and :end are supported.
Replaces the step at the index with an audit step
Replaces the step at the index with a select step