High Performance Tutorials

The following tutorials demonstrate how to optimize LLM performance with Wallaroo.

Continuous Batching for Llama 3.1 8B with vLLM

Dynamic Batching with Llama 3 8B Instruct vLLM Tutorial

Dynamic Batching with Llama 3 8B with Llama.cpp CPUs Tutorial

Llama 3 8B Instruct with vLLM

Quantized Llava 34B with Llama.cpp

Continuous Batching for Custom Llama with vLLM