Wallaroo.AI (Version 2024.3)

LLM Performance Optimizations

The following tutorials demonstrate how to optimize LLM performance in Wallaroo, covering autoscaling, dynamic batching, and deployments with vLLM and Llama.cpp.


Autoscaling with Llama 3 8B and Llama.cpp

Dynamic Batching with Llama 3 8B Instruct vLLM Tutorial

Dynamic Batching with Llama 3 8B with Llama.cpp CPUs Tutorial

Llama 3 8B Instruct with vLLM

Quantized Llava 34B with Llama.cpp

© 2026 Wallaroo Labs, Inc.