Wallaroo.AI (Version 2026.1)
High Performance Tutorials
The following tutorials demonstrate how to optimize LLM inference performance in Wallaroo.
Continuous Batching for Llama 3.1 8B with vLLM
Dynamic Batching with Llama 3 8B Instruct vLLM Tutorial
Dynamic Batching with Llama 3 8B with Llama.cpp CPUs Tutorial
Llama 3 8B Instruct with vLLM
Quantized Llava 34B with Llama.cpp