Inference
How to perform inferences and view inference logs in Wallaroo.
Deployed ML models provide endpoints for performing inference requests. The results of these requests are available through the Wallaroo pipeline logs.
The following guides detail:
- Inference: The methods of performing inference requests on deployed pipelines in various environments.
- Production Features: The tools to track inference performance and scale resources automatically.
- High performance serving: How to leverage Wallaroo capabilities to increase inference throughput.
The following guides detail how to perform inference requests against ML models deployed through Wallaroo and how to retrieve the logs of those inference requests. For full details on methods and parameters, see the following reference guides.
- Wallaroo SDK: The Wallaroo SDK provides methods for performing inference requests on ML models deployed through the Wallaroo Ops Center; see the SDK sketch after this list.
- API Requests: API requests submitted to the inference endpoints of ML models deployed through Wallaroo Ops or Run Anywhere; see the HTTP sketch after this list.
- OpenAI Compatibility: Inference requests on models deployed with OpenAI compatibility enabled; see the OpenAI client sketch after this list.
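To illustrate the SDK path, here is a minimal sketch, assuming a Wallaroo connection is available and a pipeline named `sample-pipeline` is already deployed; the pipeline name and input column are placeholders, and retrieval methods may vary slightly by SDK version.

```python
import pandas as pd
import wallaroo

# Connect to the Wallaroo instance; connection details depend on your deployment.
wl = wallaroo.Client()

# Placeholder: retrieve an already-deployed pipeline by name.
pipeline = wl.get_pipeline("sample-pipeline")

# Inference inputs are passed as a pandas DataFrame (Apache Arrow tables are
# also accepted); column names must match the model's expected inputs.
input_df = pd.DataFrame({"tensor": [[1.0, 2.0, 3.0, 4.0]]})

results = pipeline.infer(input_df)
print(results)
```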
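For direct API requests, the following is a hedged sketch using Python's `requests`; the endpoint URL, token, and input schema are placeholder assumptions and should be replaced with the values shown for your deployed pipeline.

```python
import requests

# Placeholders: the deployment's inference endpoint and an auth token
# issued by your Wallaroo instance.
url = "https://example.wallaroo.ai/v1/api/pipelines/infer/sample-pipeline"
headers = {
    "Authorization": "Bearer <token>",
    "Content-Type": "application/json; format=pandas-records",
    "Accept": "application/json; format=pandas-records",
}

# Input rows in pandas-records format; fields must match the model's inputs.
data = [{"tensor": [1.0, 2.0, 3.0, 4.0]}]

response = requests.post(url, headers=headers, json=data, timeout=30)
response.raise_for_status()
print(response.json())
```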
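For models deployed with OpenAI compatibility enabled, the standard `openai` client can be pointed at the deployment's endpoint. This is a sketch under the assumption that the base URL, token, and model name below are placeholders for your deployment's actual values.

```python
from openai import OpenAI

# Placeholders: the OpenAI-compatible endpoint of the deployed model
# and an auth token from the Wallaroo instance.
client = OpenAI(
    base_url="https://example.wallaroo.ai/openai/v1",
    api_key="<token>",
)

completion = client.chat.completions.create(
    model="sample-llm",  # placeholder deployment/model name
    messages=[{"role": "user", "content": "Hello, world."}],
)
print(completion.choices[0].message.content)
```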
Production Features
Model inference and results management
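As a brief illustration of the production features, the sketch below configures replica autoscaling when deploying a pipeline through the SDK; the pipeline name and scaling bounds are assumptions, and the exact builder methods should be checked against your SDK version.

```python
import wallaroo

wl = wallaroo.Client()

# Placeholder: an existing pipeline in the current workspace.
pipeline = wl.get_pipeline("sample-pipeline")

# Assumed autoscaling configuration: allow the engine to scale between
# 0 and 5 replicas depending on inference load.
deployment_config = (
    wallaroo.DeploymentConfigBuilder()
    .replica_autoscale_min_max(minimum=0, maximum=5)
    .build()
)

pipeline.deploy(deployment_config=deployment_config)
```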