Wallaroo.AI (Version 2025.1)
LLM Performance Optimizations
The following tutorials demonstrate how to optimize LLM performance in Wallaroo.
Autoscaling with Llama 3 8B and Llama.cpp
Continuous Batching for Llama 3.1 8B with vLLM
Continuous Batching for Custom Llama with vLLM
Dynamic Batching with Llama 3 8B Instruct vLLM Tutorial
Dynamic Batching with Llama 3 8B with Llama.cpp CPUs Tutorial
Llama 3 8B Instruct with vLLM
Quantized Llava 34B with Llama.cpp