
High Performance Tutorials


The following tutorials demonstrate how to optimize LLM inference performance in Wallaroo; a hedged configuration sketch follows the list.


  • Continuous Batching for Llama 3.1 8B with vLLM
  • Dynamic Batching with Llama 3 8B Instruct vLLM Tutorial
  • Dynamic Batching with Llama 3 8B with Llama.cpp CPUs Tutorial
  • Llama 3 8B Instruct with vLLM
  • Quantized Llava 34B with Llama.cpp
  • Continuous Batching for Custom Llama with vLLM
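
For orientation only, the sketch below shows the general shape of a dynamic batching setup through the Wallaroo SDK. It is a minimal sketch under stated assumptions: the class DynamicBatchingConfig and the parameters max_batch_delay_ms and batch_size_target, as well as the model name, file path, and pipeline name, are assumptions used for illustration. The tutorials listed above document the exact SDK calls and recommended values for each batching strategy.

```python
# Hypothetical sketch of enabling dynamic batching for an LLM in Wallaroo.
# The configuration class and parameter names below are assumptions based on
# the pattern these tutorials cover; consult the linked tutorials for the
# exact API and values for your model.
import wallaroo
from wallaroo.dynamic_batching_config import DynamicBatchingConfig

# Connect to the Wallaroo instance (assumes credentials are already configured).
wl = wallaroo.Client()

# Upload the packaged LLM; the model name and file path are placeholders.
llm = wl.upload_model(
    "llama-3-8b-instruct",
    "./models/llama-3-8b-instruct.zip",
)

# Group incoming requests into larger batches: wait up to 500 ms or until
# 32 requests accumulate, whichever comes first, before running inference.
llm = llm.configure(
    dynamic_batching_config=DynamicBatchingConfig(
        max_batch_delay_ms=500,
        batch_size_target=32,
    )
)

# Deploy the model behind a pipeline and serve inference requests.
pipeline = wl.build_pipeline("llm-dynamic-batching")
pipeline.add_model_step(llm)
pipeline.deploy()
```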
