Wallaroo.AI (Version 2025.2)

High Performance Tutorials

The following tutorials demonstrate how to optimize LLM inference performance with Wallaroo.

  • Continuous Batching for Llama 3.1 8B with vLLM
  • Dynamic Batching with Llama 3 8B Instruct vLLM Tutorial
  • Dynamic Batching with Llama 3 8B with Llama.cpp CPUs Tutorial
  • Llama 3 8B Instruct with vLLM
  • Quantized Llava 34B with Llama.cpp
  • Continuous Batching for Custom Llama with vLLM

© 2026 Wallaroo Labs, Inc.