Production Features Tutorials

Model inference and results management


Infer Tutorial

How to perform inferences via the Wallaroo SDK and via the API using the Pipeline Inference URL.
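
For orientation, a minimal sketch of both paths, assuming a connection to a Wallaroo instance and an already deployed pipeline. The pipeline name is hypothetical, and the `get_pipeline`, `pipeline.url()`, and `wl.auth.auth_header()` calls should be verified against your SDK version.

```python
import pandas as pd
import requests
import wallaroo

# Connect to the Wallaroo instance and retrieve a deployed pipeline.
wl = wallaroo.Client()
pipeline = wl.get_pipeline("my-pipeline")  # hypothetical pipeline name

# SDK inference: pass a pandas DataFrame, receive results as a DataFrame.
input_df = pd.DataFrame({"tensor": [[1.0, 2.0, 3.0, 4.0]]})
result = pipeline.infer(input_df)
print(result)

# API inference: POST the same data to the Pipeline Inference URL.
# The URL and auth header retrieval below are assumptions for illustration.
url = pipeline.url()
headers = wl.auth.auth_header()
headers["Content-Type"] = "application/json; format=pandas-records"
response = requests.post(url, headers=headers, data=input_df.to_json(orient="records"))
print(response.json())
```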

Inference Endpoint API Spec Tutorial

How to use the Wallaroo SDK to create the inference endpoint API spec.

Autoscaling with Llama 3 8B and Llama.cpp

How to configure autoscaling for LLM inference with Llama 3 8B deployed via Llama.cpp.
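
The general pattern combines a deployment configuration with replica autoscaling limits. A minimal sketch, assuming a hypothetical pipeline name and illustrative resource sizes; confirm the builder methods against your SDK version.

```python
import wallaroo
from wallaroo.deployment_config import DeploymentConfigBuilder

# Deployment configuration with replica autoscaling.
# Resource values are illustrative, not prescriptive.
deployment_config = (
    DeploymentConfigBuilder()
    .replica_autoscale_min_max(minimum=1, maximum=3)
    .cpus(4)
    .memory("8Gi")
    .build()
)

wl = wallaroo.Client()
pipeline = wl.get_pipeline("llama-cpp-pipeline")  # hypothetical pipeline name
pipeline.deploy(deployment_config=deployment_config)
```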

Inference Endpoint API Spec for OpenAI Compatibility Enabled vLLM Tutorial

How to use the Wallaroo SDK to create the inference endpoint API spec for vLLM deployments with OpenAI compatibility enabled.
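
Once OpenAI compatibility is enabled, the deployed pipeline can be called with a standard OpenAI client. A sketch under stated assumptions: the `base_url` path, bearer token handling, and model name below are placeholders; take the actual values from your pipeline's inference endpoint details.

```python
from openai import OpenAI

# base_url and api_key are hypothetical placeholders for a Wallaroo
# pipeline endpoint with OpenAI compatibility enabled.
client = OpenAI(
    base_url="https://<wallaroo-host>/<pipeline-openai-endpoint>/v1",
    api_key="<wallaroo-bearer-token>",
)

response = client.chat.completions.create(
    model="llama-3-8b",  # hypothetical model name
    messages=[{"role": "user", "content": "Summarize OpenAI compatibility in one sentence."}],
    max_tokens=128,
)
print(response.choices[0].message.content)
```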

Async Infer Tutorial

How to perform asynchronous inferences via async_infer.
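
A minimal sketch of the idea: the request is awaited, so other work can proceed while it is in flight. The pipeline name and the `async_infer` parameter names (`tensor`, `async_client`) are assumptions to verify against your SDK version.

```python
import asyncio

import httpx
import pandas as pd
import wallaroo

async def main():
    wl = wallaroo.Client()
    pipeline = wl.get_pipeline("my-pipeline")  # hypothetical pipeline name

    input_df = pd.DataFrame({"tensor": [[1.0, 2.0, 3.0, 4.0]]})

    # Submit the inference without blocking; parameter names are assumed.
    async with httpx.AsyncClient() as client:
        task = asyncio.create_task(
            pipeline.async_infer(tensor=input_df, async_client=client)
        )
        # ... other concurrent work can run here ...
        result = await task
        print(result)

asyncio.run(main())
```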

Inference Logs Tutorial

How to retrieve Inference logs as DataFrame or Apache Arrow tables, and save inference logs to files.
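
A short sketch of the three retrieval options, assuming a hypothetical pipeline name; the `export_logs` parameters shown are assumptions based on typical usage and should be checked against your SDK version.

```python
import wallaroo

wl = wallaroo.Client()
pipeline = wl.get_pipeline("my-pipeline")  # hypothetical pipeline name

# Most recent inference logs as a pandas DataFrame (the default).
logs_df = pipeline.logs(limit=100)
print(logs_df.head())

# The same logs as an Apache Arrow table.
logs_table = pipeline.logs(limit=100, arrow=True)
print(logs_table.schema)

# Save inference logs to files; parameter names here are assumptions.
pipeline.export_logs(directory="./logs", file_prefix="my-pipeline-logs")
```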

Wallaroo SDK Parallel Infer Tutorial

How to use Wallaroo parallel infer for faster inference requests for large data sets.
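
The idea is to split a large dataset into a list of smaller DataFrames and let the SDK distribute them across parallel connections. A sketch under stated assumptions: the pipeline name and the `parallel_infer` parameter names (`tensor_list`, `timeout`, `num_parallel`, `retries`) follow common examples and should be confirmed against your SDK version.

```python
import asyncio

import pandas as pd
import wallaroo

async def main():
    wl = wallaroo.Client()
    pipeline = wl.get_pipeline("my-pipeline")  # hypothetical pipeline name

    # Split a large dataset into single-row DataFrames so the requests
    # can be distributed across parallel connections.
    big_df = pd.DataFrame({"tensor": [[float(i)] * 4 for i in range(1000)]})
    batches = [big_df.iloc[[i]] for i in range(len(big_df))]

    # Parameter names below are assumptions for illustration.
    results = await pipeline.parallel_infer(
        tensor_list=batches,
        timeout=20,
        num_parallel=8,
        retries=1,
    )
    print(len(results))

asyncio.run(main())
```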