Command-A-Plus-05-2026-Bf16 Arrives With 128K Context And Agentic Reasoning

Command-A-Plus-05-2026-Bf16 is an openly licensed model with 25 billion active parameters out of 218 billion total, and it processes both text and images. It supports a 128K-token context window and can generate up to 64K tokens in a single response. The model is optimized for agentic workflows, multilingual tasks, and reasoning-heavy workloads, which places it among the more capable openly available models.

CohereLabs built Command A+ on a sparse mixture-of-experts design and published the weights on Hugging Face under an open license. The team focused on enterprise-grade performance, integrating tool use, vision, and support for 48 languages. The release brings datacenter-class capabilities to researchers and teams who want to inspect, fine-tune, or host a powerful language model themselves.

A model built for complex tasks

Key Features
  • 128K-token context window and up to 64K output tokens.
  • Accepts both text and image inputs.
  • Enables conversational tool use with APIs.
  • Trained on 48 languages.
  • Uses a sparse mixture-of-experts architecture.
  • Designed for agentic and reasoning workloads.
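
The context and output limits interact when a request is sized. The helper below is a minimal sketch of that budgeting, not an official API. It assumes that "128K" and "64K" mean 128 × 1024 and 64 × 1024 tokens and that generated tokens count against the context window; neither point is confirmed by the model card quote, and the function name is hypothetical.

```python
# Assumption: "K" means 1024 tokens. Adjust if the model card states otherwise.
CONTEXT_LIMIT = 128 * 1024  # total tokens the model can attend to
MAX_OUTPUT = 64 * 1024      # maximum tokens generated in one response


def max_new_tokens_for(prompt_tokens: int, requested: int) -> int:
    """Clamp a requested generation length to what the limits allow.

    Assumption: prompt and generated tokens share one context window.
    """
    if prompt_tokens < 0 or requested < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens >= CONTEXT_LIMIT:
        raise ValueError("prompt alone fills or exceeds the context window")
    remaining = CONTEXT_LIMIT - prompt_tokens
    return min(requested, MAX_OUTPUT, remaining)


if __name__ == "__main__":
    # A 100,000-token prompt leaves 31,072 tokens for the answer.
    print(max_new_tokens_for(100_000, 65_536))
```

Under these assumptions, a short prompt can use the full 64K output budget, while a long document leaves less room for the answer.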

This release targets organizations with access to multi‑GPU server setups, such as enterprise AI labs and research institutions. Teams that need secure, on‑premise reasoning over multilingual documents or integration with external tools will find the model immediately useful. The open license also encourages the community to explore smaller, quantized deployments that could widen hardware accessibility.
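
To see why quantization matters for hardware accessibility, it helps to estimate how much memory the weights alone occupy at different precisions. The sketch below uses the 218B total parameter count from this article; the 8-bit and 4-bit rows are illustrative precisions, not a list of the quantizations actually published. The figures cover weights only and ignore the KV cache, activations, and any per-block quantization overhead.

```python
TOTAL_PARAMS = 218e9  # total parameters, as stated in the article


def weight_memory_gb(total_params: float, bits_per_param: int) -> float:
    """Approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return total_params * bits_per_param / 8 / 1e9


if __name__ == "__main__":
    # bf16 is 16 bits per parameter; the other rows are hypothetical formats.
    for name, bits in [("bf16", 16), ("8-bit", 8), ("4-bit", 4)]:
        print(f"{name:>5}: {weight_memory_gb(TOTAL_PARAMS, bits):.0f} GB")
    # bf16: 436 GB, 8-bit: 218 GB, 4-bit: 109 GB
```

All experts must be resident in memory even though only 25B parameters are active per token, so the total count, not the active count, drives the footprint.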

Developer notes and hardware demands

The model was trained in a fully dropless manner, meaning no tokens are discarded when an expert is oversubscribed, and it uses a token-choice router with additive-bias load balancing, which helps keep expert utilization even. To handle its custom chat template, thinking tags, and citation spans, you need to install transformers from the source repository. Several quantizations are provided, but the full bf16 version requires at least four B200 or eight H100 GPUs, which puts it firmly in the datacenter tier.
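
The routing idea can be illustrated with a small sketch. The code below shows one common form of token-choice routing with additive-bias balancing: each token picks its own top-k experts using biased scores, the gate weights are computed from the unbiased scores, and the bias is nudged after each batch toward underused experts. This is an assumption about the general technique, not Command A+'s actual implementation. The function names, the step size, and the softmax gating are all hypothetical.

```python
import math


def route_tokens(scores, bias, top_k):
    """Token-choice routing: every token selects its own top_k experts.

    scores: one list of router affinities per token (one value per expert).
    bias:   per-expert additive bias, used for expert selection only.
    Returns one (expert_ids, gate_weights) pair per token. No capacity
    limit is applied, so no token is dropped.
    """
    routed = []
    for token_scores in scores:
        biased = [s + b for s, b in zip(token_scores, bias)]
        # Pick the top_k experts by biased score (ties go to the lower index).
        chosen = sorted(range(len(biased)), key=lambda e: (-biased[e], e))[:top_k]
        # Gate weights: softmax over the unbiased scores of the chosen experts.
        exps = [math.exp(token_scores[e]) for e in chosen]
        total = sum(exps)
        routed.append((chosen, [x / total for x in exps]))
    return routed


def update_bias(bias, routed, step=0.01):
    """Lower the bias of overloaded experts and raise it for underloaded ones."""
    load = [0] * len(bias)
    for chosen, _ in routed:
        for e in chosen:
            load[e] += 1
    mean_load = sum(load) / len(load)
    new_bias = []
    for b, n in zip(bias, load):
        if n > mean_load:
            new_bias.append(b - step)
        elif n < mean_load:
            new_bias.append(b + step)
        else:
            new_bias.append(b)
    return new_bias, load


# Four identical tokens that all prefer experts 0 and 1.
scores = [[2.0, 1.0, 0.5, 0.1] for _ in range(4)]
bias = [0.0, 0.0, 0.0, 0.0]
routed = route_tokens(scores, bias, top_k=2)
bias, load = update_bias(bias, routed)
```

In this toy batch, experts 0 and 1 receive all the traffic, so their bias drops while the bias of experts 2 and 3 rises. Repeating the update over many batches shifts selection toward the idle experts without changing the gate weights of the experts that are chosen.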

“Command A+ supports a context length of 128K & 64K output length.” — Source: Hugging Face model card