Wallaroo.AI (Version 2025.1)

LLM Performance Optimizations

The following tutorials demonstrate how to optimize LLM performance in Wallaroo.


Autoscaling with Llama 3 8B and Llama.cpp

Continuous Batching for Llama 3.1 8B with vLLM

Continuous Batching for Custom Llama with vLLM

Dynamic Batching with Llama 3 8B Instruct vLLM Tutorial

Dynamic Batching with Llama 3 8B with Llama.cpp CPUs Tutorial

Llama 3 8B Instruct with vLLM

Quantized Llava 34B with Llama.cpp

© 2026 Wallaroo Labs, Inc.