Wallaroo.AI (Version 2024.3)
LLM Performance Optimizations
The following tutorials demonstrate how to optimize LLM performance in Wallaroo.
Autoscaling with Llama 3 8B and Llama.cpp
Dynamic Batching with Llama 3 8B Instruct vLLM Tutorial
Dynamic Batching with Llama 3 8B with Llama.cpp CPUs Tutorial
Llama 3 8B Instruct with vLLM
Quantized Llava 34B with Llama.cpp