Wallaroo.AI (Version 2024.3)
LLM Performance Optimizations
The following tutorials demonstrate how to optimize LLM performance in Wallaroo.
Autoscaling with Llama 3 8B and Llama.cpp
Dynamic Batching with Llama 3 8B Instruct vLLM Tutorial
Dynamic Batching with Llama 3 8B with Llama.cpp CPUs Tutorial
Llama 3 8B Instruct with vLLM
Quantized Llava 34B with Llama.cpp