Alibaba Cloud's Bailian Cuts Implicit Cache Pricing for DeepSeek-V4-Pro

Bailian Platform Announces Pricing Optimization for DeepSeek-V4-Pro

Alibaba Cloud's Bailian, a leading large model service platform, has announced a pricing update. Effective 23:59:59 Beijing Time on April 29, 2026, the platform will apply revised billing rates to the implicit cache feature of the DeepSeek-V4-Pro model.

Details of the Implicit Cache Adjustment

The new pricing sets the implicit cache rate at 1 yuan per million tokens. The rate applies to input tokens in a user request that match cached content; those tokens are billed under the cached_token category.

How the Billing Works

Cache Hits: Input tokens that match existing cached content are billed at the reduced cached_token rate.
Cache Misses: Requests that do not match the cache continue to be billed at the standard input_token rate.
Base Pricing Unchanged: The adjustment affects only implicit cache costs; the model's base inference pricing stays the same.
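
The billing rules above can be sketched as a small cost estimate. This is an illustrative calculation only, not Bailian's billing code: the function name `estimate_input_cost` and its parameters are hypothetical, and the standard input rate is left as a caller-supplied value because the announcement does not state it. Only the 1 yuan per million tokens cache-hit rate comes from the announcement.

```python
from decimal import Decimal

# From the announcement: cache-hit tokens cost 1 yuan per million tokens.
CACHED_RATE_PER_MILLION = Decimal("1")
TOKENS_PER_MILLION = Decimal(1_000_000)


def estimate_input_cost(cached_tokens: int, uncached_tokens: int,
                        standard_rate_per_million: Decimal) -> Decimal:
    """Estimate the input-side cost in yuan (hypothetical helper).

    cached_tokens: input tokens that matched cached content.
    uncached_tokens: input tokens billed at the standard input rate.
    standard_rate_per_million: ASSUMPTION supplied by the caller; the
        announcement does not give the standard input price.
    """
    if cached_tokens < 0 or uncached_tokens < 0:
        raise ValueError("token counts must be non-negative")
    hit_cost = Decimal(cached_tokens) / TOKENS_PER_MILLION * CACHED_RATE_PER_MILLION
    miss_cost = Decimal(uncached_tokens) / TOKENS_PER_MILLION * standard_rate_per_million
    return hit_cost + miss_cost


if __name__ == "__main__":
    # Placeholder standard rate for illustration, NOT a published price.
    placeholder_rate = Decimal("10")
    print(estimate_input_cost(800_000, 200_000, placeholder_rate))
```

Output-token charges are outside this sketch, since the announcement says the model's base pricing is unchanged. Substitute the real standard input rate from the platform's price list before relying on the numbers.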

Practical Benefits for Developers

The change mainly benefits development teams whose requests reuse repetitive or similar prompts, because only tokens that hit the cache receive the lower rate. Savings scale with the share of input tokens served from the cache, so high-frequency workloads such as batch content generation, repetitive task processing, and standardized query responses stand to gain the most.