Wallaroo.AI (Version 2024.3)
LLM Performance Optimizations
The following tutorials demonstrate how to optimize LLM performance in Wallaroo, using techniques such as autoscaling, dynamic batching, and quantization.
Autoscaling with Llama 3 8B and Llama.cpp
Dynamic Batching with Llama 3 8B Instruct vLLM Tutorial
Dynamic Batching with Llama 3 8B with Llama.cpp CPUs Tutorial
Llama 3 8B Instruct with vLLM
Quantized Llava 34B with Llama.cpp