SupraLabs Stretches Supra-1.5-50M-Base-exp Context Window Fivefold With New Pretraining Mix

    
        By vramkickedin    
     | 
    
            June 28, 2026 at 4:15 pm        
    
     | 
    
        2 min read

SupraLabs has published Supra-1.5-50M-Base-exp, a continued pretraining update for their 50-million-parameter language model that stretches the usable context window fivefold. The release takes the original Supra-50M architecture and uses RoPE scaling with full-weight training to jump from 1,024 tokens to 5,120 tokens. This experimental base model is designed specifically as a foundation for future supervised fine-tuning and reinforcement learning projects.

The team SupraLabs who also released Supra-50m-Reasoning, continued training the model on a fresh 3-billion-token mix rather than starting from scratch. That mix deliberately blends 30% tool calling data, 30% ChatML conversations, 25% factual text from articles and essays, and 15% math and logic problems. The project ships alongside an Instruct fine-tune and GGUF quantized versions, making the entire family immediately usable on consumer hardware.

Context expansion and data mix

Key changes from the original

Context length expanded from 1,024 to 5,120 tokens.
Continued pretraining on 3 billion packed tokens.
Data mix includes tool calling and ChatML.
Same 50M parameter architecture and tokenizer.
GGUF quantizations range from 1-bit to 32-bit.
Instruct version uses Alpaca chat format.
Raw and normalized inference show task-based differences.

Small-scale AI tinkerers and hobbyists can grab the GGUF files and run them locally through llama.cpp with a single command. Developers interested in supervised fine-tuning experiments now have a base model that understands tool calling and conversational formats natively, reducing the need for custom data preprocessing. The tiny footprint means even the full 32-bit version weighs just 208 megabytes, making it viable for embedded projects, rapid prototyping, and resource-constrained environments.

What the developers are noting

The team labels this release experimental and frames it as part of a larger initiative called Project Chimera. The Instruct variant scored a consistent 67.4 on BLiMP evaluations, with an unusual pattern emerging: science and factual questions performed better under raw inference, while math and logic tasks improved with normalized inference. Looking ahead, SupraLabs plans to release Supra-124M and Supra-350M families covering base, chat, reasoning, and coding capabilities, all under the Apache 2.0 license.

"The biggest upgrade is context. Supra-1.5 expands from 1,024 to 5,120 tokens using RoPE scaling, with continued pretraining on a 3B token mix of tool calling data, ChatML conversations, factual text, and math." — Source: Reddit

Project Links