2026.1 Product Release Notes
We are pleased to announce the following product improvements in our 2026.1 release:
- Wallaroo Support for AMD Instinct-series GPUs and the SGLang Runtime: Wallaroo supports LLM models configured for deployment with the SGLang Runtime and the AMD ROCm library with AMD Instinct-series GPUs. AMD ROCm provides hardware acceleration for Generative AI with powerful x64 support, and adds to the multiple architectures and AI accelerators supported by Wallaroo.
- Updated Support for vLLM: This update improves performance for vLLM frameworks with support for vLLM 0.15.1 for CUDA. This provides AI developers with more performant inference management and OpenAI compatibility enabled by default.
Technical Release Notes
The following technical details cover changes to the SDK, API, and libraries, along with related updates and upgrades. They are provided so users know which changes they need to make to their environments when updating Wallaroo.
Sidekick Logging
- Delivered the full SDK sidekick log function and API route for retrieving Sidekick pod logs, which includes error responses.
  - The API route (/v1/api/pipelines/get_sidekick_pod_logs) now requires the workspace_name, pipeline_name, and sidekick_name parameters.
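As a rough illustration of the three required parameters, the sketch below builds a request for the route without sending it. Only the route path and the parameter names come from these notes. The POST method, the JSON body, the bearer-token header, and the host name are assumptions made for the example and should be checked against the Wallaroo API documentation.

```python
import json
import urllib.request

# Route path as given in these release notes.
ROUTE = "/v1/api/pipelines/get_sidekick_pod_logs"


def build_sidekick_logs_request(base_url, token, workspace_name,
                                pipeline_name, sidekick_name):
    """Build (but do not send) a request for Sidekick pod logs."""
    # The three fields below are the parameters the notes list as required.
    payload = {
        "workspace_name": workspace_name,
        "pipeline_name": pipeline_name,
        "sidekick_name": sidekick_name,
    }
    missing = [key for key, value in payload.items() if not value]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))
    return urllib.request.Request(
        url=base_url.rstrip("/") + ROUTE,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # assumption: bearer-token auth
            "Content-Type": "application/json",
        },
        method="POST",  # assumption: POST with a JSON body
    )


# Hypothetical host, token, and names, used only to show the call shape.
request = build_sidekick_logs_request(
    "https://wallaroo.example.com", "YOUR_TOKEN",
    "my-workspace", "my-pipeline", "my-sidekick",
)
# To send it: urllib.request.urlopen(request)
```

Validating the parameters before building the request means a missing name fails locally instead of surfacing as an error response from the route.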
Improvements and Enhancements
- Performance and Load Management
  - Completed engine load management enhancements.
  - Resolved out-of-memory (OOM) errors during bulk uploads by changing the memory management behavior that restricts the number of simultaneous uploads and upload parallelism. The default limit allows 3 parallel uploads.
- Customer Metrics and SDK
  - Delivered performance and cost endpoints (TPS, TTFT, cost per token) for Custom Model aka BYOP (Bring Your Own Predict), with documentation.
  - Completed async SDK enhancements.
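The parallel upload limit described under Performance and Load Management can be pictured with a small client-side sketch. This is not Wallaroo's implementation. It only illustrates the general technique of capping work in flight, using the default of 3 from these notes. The fake_upload function and the model names are hypothetical stand-ins.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_UPLOADS = 3  # default limit stated in these release notes


def run_bounded_uploads(items, upload_fn, limit=MAX_PARALLEL_UPLOADS):
    """Run upload_fn over items with at most `limit` uploads in flight.

    Returns (results, peak), where peak is the highest number of uploads
    observed running at the same time.
    """
    lock = threading.Lock()
    state = {"active": 0, "peak": 0}

    def tracked(item):
        # Count this upload as active and record the highest count seen.
        with lock:
            state["active"] += 1
            state["peak"] = max(state["peak"], state["active"])
        try:
            return upload_fn(item)
        finally:
            with lock:
                state["active"] -= 1

    # The pool size is the cap: a fourth upload waits until a worker is free.
    with ThreadPoolExecutor(max_workers=limit) as pool:
        results = list(pool.map(tracked, items))
    return results, state["peak"]


def fake_upload(name):
    # Hypothetical stand-in for a real model upload.
    time.sleep(0.01)
    return f"uploaded:{name}"


results, peak = run_bounded_uploads(
    [f"model-{i}" for i in range(10)], fake_upload
)
```

Because each upload holds its buffers only while it runs, capping the number of concurrent uploads also bounds peak memory use, which is the effect the fix above is aiming for.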
Bug Fixes
- Model and Platform Compatibility
  - Fixed Custom Model aka BYOP Model Upload Validation failures on OpenShift using POWER10 architecture.
- Data and Metrics Accuracy
  - Fixed an issue where Assay Results did not show for Input NaN values.
- General Fixes
  - Fixed an issue where Prod Orchestration was unable to retrieve orchestration info.