A New Powerhouse in Multimodal AI: Introducing Qwen3.7-Plus
The AI landscape has just witnessed a significant advancement. Alibaba Cloud's Qwen series has officially unveiled its newest member: the Qwen3.7-Plus multimodal large model. This release represents more than a routine update; it signifies a fundamental leap in core capabilities.
Benchmark Dominance: A Dual Mastery of Vision and Language
The true test of an AI model lies in independent evaluation. On the renowned Vision Arena leaderboard, Qwen3.7-Plus has made a striking entrance, securing a position within the global top five and establishing itself as the leading model of its kind in China. This achievement solidifies its integrated prowess in both visual comprehension and linguistic tasks.
From Perception to Action: Closing the Loop on Complex Tasks
Moving beyond the conventional role of passive understanding, Qwen3.7-Plus introduces a paradigm shift. It seamlessly merges advanced visual perception with code generation, tool invocation, and even Graphical User Interface (GUI) control. This integration enables a single AI agent to autonomously plan and execute a sequence of complex actions—not just identifying an issue in an image, but also devising a solution, calling tools, and generating the necessary code to resolve it, completing an end-to-end task loop.
Accessible Innovation: Cloud APIs Now Available
To empower developers and businesses, immediate access is key. Qwen3.7-Plus is now live and accessible via API on its dedicated cloud platform as well as Alibaba Cloud's Model Studio. This eliminates the hurdles of local deployment, allowing users to seamlessly integrate this cutting-edge multimodal capability into their own applications and workflows, dramatically lowering the barrier to leveraging frontier AI technology.
The launch of Qwen3.7-Plus marks a substantial step forward for multimodal AI, transitioning the technology from a passive observer to an active participant in solving real-world challenges.