2025.1 Product Release Notes
We are pleased to announce the following product improvements in our 2025.1 release:
- Air-gapped Kubernetes Cluster and Single Node Support: Wallaroo is installable on air-gapped Kubernetes clusters and single-node servers. This lets organizations run AI workloads in a fully isolated environment without external connections (including to the internet) and comply with the highest levels of privacy and security protocols for their data and infrastructure.
- LLM Optimizations with Continuous Batching: Continuous batching provides additional optimizations for LLM deployments on NVIDIA GPUs by dynamically grouping incoming requests in real time. This gives LLM-based and agentic AI applications more flexibility to maximize performance and hardware utilization.
- Edge and Multi-cloud Pipeline Publish Updates: The Wallaroo SDK enhances its edge and multi-cloud publishing methods. These enhancements give developers more information about pipeline publish sources, the edges associated with each publish, and version details, along with role-based filters.
- Private Package Repository: Wallaroo allows configuring package repositories with custom and private artifacts for use when uploading AI models and workloads to Wallaroo. This gives AI developers full control, security, and privacy over the packages and libraries used in standard and air-gapped Wallaroo Enterprise deployments.
- LLM deployment on QAIC: Wallaroo supports Qualcomm QAIC, providing GenAI/LLM deployment on Qualcomm’s AI 100 accelerator cards for optimized performance and lower costs.
- LLM token streaming with OpenAI compatibility deployment: Wallaroo provides OpenAI compatibility for a better interactive token streaming experience in LLM-based applications, while taking advantage of Wallaroo’s ability to maximize throughput and optimize latency. Additionally, with OpenAI compatibility, AI developers can seamlessly migrate their applications from OpenAI endpoints to Wallaroo on-prem endpoints, in connected and air-gapped environments, without losing any functionality.
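
To illustrate the continuous batching item above, the following is a minimal, self-contained simulation of the general technique. It is a sketch under stated assumptions, not Wallaroo's implementation: the `Request` class, the function names, and the "one token per request per step" cost model are all hypothetical, and it ignores real-world concerns such as prompt prefill and GPU memory limits. It compares continuous batching, where a finished request frees its slot immediately, with a static baseline where a batch runs until its longest request completes.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical request: an id and the number of tokens still to generate."""
    request_id: str
    tokens_remaining: int


def continuous_batching(requests, max_batch_size):
    """Simulate decode steps; return {request_id: step at which it finished}.

    Assumes every request needs at least one token.
    """
    # Copy the requests so the caller's objects are not mutated.
    waiting = deque(Request(r.request_id, r.tokens_remaining) for r in requests)
    active = []
    finished_at = {}
    step = 0
    while waiting or active:
        # Before each step, admit waiting requests into any free slots.
        while waiting and len(active) < max_batch_size:
            active.append(waiting.popleft())
        step += 1
        # One decode step: every active request produces one token.
        for req in active:
            req.tokens_remaining -= 1
            if req.tokens_remaining <= 0:
                finished_at[req.request_id] = step
        # Finished requests leave right away, freeing slots for the next step.
        active = [req for req in active if req.tokens_remaining > 0]
    return finished_at


def static_batching(requests, max_batch_size):
    """Baseline: fixed batches, each held until its longest request is done."""
    finished_at = {}
    step = 0
    for i in range(0, len(requests), max_batch_size):
        batch = requests[i:i + max_batch_size]
        step += max(req.tokens_remaining for req in batch)
        for req in batch:
            finished_at[req.request_id] = step
    return finished_at
```

With four requests needing 2, 6, 3, and 1 tokens and a batch size of 2, this simulation completes all work in 6 steps under continuous batching versus 9 steps under the static baseline, because short requests no longer wait for the longest request in their batch.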