HRM-Text-1B Bends Time with Dual Recurrent Loops for Deep Reasoning


Sapient Intelligence has released HRM-Text-1B, a 1-billion-parameter language model that uses a new dual-timescale architecture instead of a standard transformer. The model processes information through two recurrent loops, a slow strategic layer and a fast execution layer, so its reasoning depth grows with the number of loop iterations rather than with the parameter count. According to the team, this approach allowed them to train the model from scratch on only 40 billion unique tokens with a budget of $1,500.
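To make the two-loop idea concrete, here is a minimal toy sketch in Python. It is not the HRM-Text implementation: the scalar states, the fixed coefficients, and the names `fast_step`, `slow_step`, and `dual_loop` are all assumptions chosen for illustration. It only shows how a fast loop nested inside a slow loop reuses the same small set of weights, so the number of sequential updates can grow while the parameter count stays fixed.

```python
import math

# Toy illustration only: scalar states and hand-picked coefficients stand in
# for the real learned modules, which this article does not describe in detail.

def fast_step(z_fast, z_slow, x):
    """Fast (execution) update: sees its own state, the slow state, and the input."""
    return math.tanh(0.5 * z_fast + 0.3 * z_slow + 0.2 * x)

def slow_step(z_slow, z_fast):
    """Slow (strategic) update: runs once per cycle and reads the fast state."""
    return math.tanh(0.7 * z_slow + 0.3 * z_fast)

def dual_loop(x, slow_cycles, fast_steps):
    """Run `fast_steps` fast updates inside each of `slow_cycles` slow updates.

    Returns the final slow state and the number of sequential updates applied.
    """
    z_slow, z_fast = 0.0, 0.0
    updates = 0
    for _ in range(slow_cycles):
        for _ in range(fast_steps):
            z_fast = fast_step(z_fast, z_slow, x)
            updates += 1
        z_slow = slow_step(z_slow, z_fast)
        updates += 1
    return z_slow, updates

# Same five coefficients in both calls; only the loop counts change.
print(dual_loop(1.0, slow_cycles=2, fast_steps=3)[1])  # 8 sequential updates
print(dual_loop(1.0, slow_cycles=4, fast_steps=8)[1])  # 36 sequential updates
```

In this sketch the effective depth is `slow_cycles * (fast_steps + 1)`, which is the sense in which a recurrent design can trade extra compute for depth without adding weights.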

Sapient Intelligence designed HRM-Text to show that efficient pretraining is possible outside of massive data centers. Unlike most large models, which learn from raw internet text, HRM-Text was trained exclusively on instruction-response pairs using a task-completion objective and a custom PrefixLM masking setup. The result is a research checkpoint that performs competitively with models 2 to 7 times its size on key benchmarks.
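For readers unfamiliar with PrefixLM masking, the sketch below shows the generic technique, not Sapient's custom variant, whose details are not given here. In a prefix-LM mask, every position can attend to the whole prompt (the prefix), while response positions attend causally to earlier response tokens. The function names are hypothetical, and the second helper assumes that a task-completion objective means the loss is computed only on response tokens.

```python
def prefix_lm_mask(prefix_len, total_len):
    """Return mask[i][j] == True when position i may attend to position j.

    Positions 0..prefix_len-1 are the prompt (instruction); the rest are the response.
    """
    if not 0 <= prefix_len <= total_len:
        raise ValueError("prefix_len must be between 0 and total_len")
    mask = []
    for i in range(total_len):
        row = []
        for j in range(total_len):
            in_prefix = j < prefix_len   # the prompt is visible to every position
            causal = j <= i              # otherwise only current and earlier tokens
            row.append(in_prefix or causal)
        mask.append(row)
    return mask

def response_loss_mask(prefix_len, total_len):
    """Assumption: loss is taken on response tokens only, not on the prompt."""
    return [pos >= prefix_len for pos in range(total_len)]

# A 2-token prompt followed by a 2-token response.
for row in prefix_lm_mask(prefix_len=2, total_len=4):
    print(" ".join("1" if allowed else "0" for allowed in row))
```

With a 2-token prompt and a 2-token response, the rows printed are `1 1 0 0`, `1 1 0 0`, `1 1 1 0`, and `1 1 1 1`: the prompt is fully visible to itself, and each response token sees the prompt plus the response so far.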

Trained on structured instruction data, not raw scrapes

Key Features
  • Dual-timescale recurrent architecture for deeper reasoning.
  • Trained solely on instruction-response pairs, not raw web text.
  • Reaches competitive scores on MMLU, GSM8K, and MATH.
  • Uses roughly 100-900x fewer training tokens than peers.
  • At 1B parameters, small enough to run on consumer GPUs.
  • Requires specific condition tags for best results.
  • Open checkpoint for further alignment and fine-tuning.
  • All training data and code are publicly available.
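To put the token-efficiency bullet in perspective, the ratio can be turned into absolute numbers. This is back-of-the-envelope arithmetic under one assumption: that the 100-900x figure is measured against the 40 billion unique tokens cited above. The function name is hypothetical and the results are implied ranges, not reported figures for any specific model.

```python
HRM_TOKENS = 40e9  # 40 billion unique tokens, as stated in the article

def implied_peer_tokens(ratio, hrm_tokens=HRM_TOKENS):
    """Training-token count implied for a peer that used `ratio` times more tokens."""
    return ratio * hrm_tokens

low = implied_peer_tokens(100)
high = implied_peer_tokens(900)
print(f"{low / 1e12:.0f}T to {high / 1e12:.0f}T tokens")  # 4T to 36T tokens
```

Under that assumption, the comparison models would have been trained on roughly 4 to 36 trillion tokens.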

This model is built for AI hobbyists, privacy-conscious professionals, and small teams who want to experiment with language model pretraining on local hardware. Because it was not tuned for chat or instruction following, users must add their own alignment stage, but its small size makes fine-tuning cheap and accessible. Those running local servers or working with sensitive data can take this base and tailor it to specific tasks without sending information to the cloud.

Important notes about the raw checkpoint

The current release is a pre-alignment model, not a finished assistant. It handles English well but performs poorly on coding tasks because no code datasets were used during training. To get coherent answers for reasoning or math problems, you must use the composite condition tag `synth,cot`; plain zero-shot prompts will be much weaker.

“Despite utilizing roughly 100-900x fewer training tokens and 96-432x less estimated compute than standard baselines, HRM-Text performs competitively with 2-7B parameter open models.” — Source: arXiv paper