Tencent Youtu-LLM-2B Powers Smart AI Agents

    
By vramkickedin | February 20, 2026 at 11:37 pm | 2 min read

Tencent has introduced Youtu-LLM-2B, a lightweight language model with 1.96 billion parameters that aims to balance computational efficiency with native agentic intelligence. Unlike many small models that rely on distillation from larger systems, it was pre-trained from scratch on 11 trillion tokens.

The model supports a 128K-token context window (131,072 tokens), which lets it keep its reasoning coherent over long sequences, a requirement for extended agent tasks. It uses a dense Multi-Latent Attention (MLA) architecture, which compresses the attention key-value cache so that long-context inference fits in a small memory footprint. The team credits the pre-training recipe, rather than the architecture alone, for systematically cultivating reasoning and planning capabilities instead of simply mimicking output behaviors.
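To make the memory argument concrete, here is a rough back-of-the-envelope sketch comparing the key-value cache of standard multi-head attention with an MLA-style compressed cache. Only the 32 layers and the 128K context come from the article. The head count, head dimension, latent width, positional-key width, and 16-bit cache precision are illustrative assumptions, not the model's confirmed configuration.

```python
from dataclasses import dataclass

BYTES_PER_VALUE = 2  # assumption: 16-bit (fp16/bf16) cache entries


@dataclass
class CacheShape:
    layers: int
    values_per_token_per_layer: int

    def bytes_for(self, tokens: int) -> int:
        """Total cache size in bytes for a sequence of `tokens` tokens."""
        return (self.layers * self.values_per_token_per_layer
                * tokens * BYTES_PER_VALUE)


def mha_cache(layers: int, heads: int, head_dim: int) -> CacheShape:
    # Standard multi-head attention caches a full key and a full value
    # vector for every head in every layer.
    return CacheShape(layers, 2 * heads * head_dim)


def mla_cache(layers: int, latent_dim: int, rope_dim: int) -> CacheShape:
    # MLA-style attention caches one compressed latent per token per layer,
    # plus a small positional key shared across heads.
    return CacheShape(layers, latent_dim + rope_dim)


LAYERS = 32             # from the article
CONTEXT = 128 * 1024    # 128K tokens, from the article

# Hypothetical shapes chosen only for illustration.
mha = mha_cache(LAYERS, heads=16, head_dim=128)
mla = mla_cache(LAYERS, latent_dim=512, rope_dim=64)

GIB = 2 ** 30
print(f"MHA cache at full context: {mha.bytes_for(CONTEXT) / GIB:.1f} GiB")
print(f"MLA cache at full context: {mla.bytes_for(CONTEXT) / GIB:.1f} GiB")
```

Under these assumed shapes the full-context cache shrinks from 32 GiB to 4.5 GiB per sequence, which is the kind of saving that makes a 128K window practical on modest hardware. The real figures depend on the published configuration and the serving precision.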

Youtu-LLM's core features

Two model variants: Youtu-LLM-2B-Base and Youtu-LLM-2B Instruct
Dense Multi-Latent Attention
1.96B Parameters
130K+ Context Length
128K+ Vocabulary Size
32 Layers

Agentic benchmark performance

A display of Youtu-LLM's agentic performance

Performance evaluations indicate that Youtu-LLM sets a new standard for sub-2B-parameter models. On general benchmarks it is competitive with larger models, and on agent-specific tasks it clearly surpasses existing baselines. For instance, on the GAIA benchmark for deep research, the model scored 33.9%, ahead of the larger Qwen3-4B model at 25.5%.

It also performed well on code-related agent tasks, scoring 17.7% on SWE-Bench-Verified compared with 5.7% for Qwen3-4B. On HumanEval it scored 95.9%, showing proficiency in coding tasks. These results suggest that lightweight models can have strong intrinsic agentic capabilities when trained with the right methodology.
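To put the two agentic comparisons on a common footing, the short sketch below computes the absolute gap in percentage points and the ratio between the two models. The scores are the ones quoted above. The helper name `compare` is made up for this example.

```python
def compare(ours: float, baseline: float) -> tuple[float, float]:
    """Return (gap in percentage points, ratio of ours to baseline)."""
    return ours - baseline, ours / baseline


# (Youtu-LLM-2B score, Qwen3-4B score), as reported in the article.
scores = {
    "GAIA": (33.9, 25.5),
    "SWE-Bench-Verified": (17.7, 5.7),
}

for name, (youtu, qwen) in scores.items():
    gap, ratio = compare(youtu, qwen)
    print(f"{name}: +{gap:.1f} points, {ratio:.2f}x the baseline")
```

The GAIA lead is about 8.4 points (roughly 1.33x), while the SWE-Bench-Verified lead is 12.0 points (roughly 3.1x), so the relative advantage is considerably larger on the coding-agent task.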

Pre-training methodology & further insights

The development team emphasizes that the model's strength comes from injecting agent-oriented signals early in the training process. The technical report states,

'Unlike typical small models that rely on distillation, Youtu-LLM (1.96B) is pre-trained from scratch to systematically cultivate reasoning and planning capabilities.'

The researchers argue that while existing methods often align output behavior, they fail to systematically cultivate underlying cognitive capabilities. By using a principled training paradigm that enhances native agentic capabilities, the team has provided evidence that agentic pre-training can unlock agent potential in lightweight LLMs. The paper further notes,