Intern-S2-Preview Packs Trillion-Scale Science Smarts Into A 35B Model

    
        By vramkickedin    
     | 
    
            May 27, 2026 at 9:15 am        
    
     | 
    
        2 min read

Intern-S2-Preview is a 35-billion-parameter scientific multimodal model that analyzes text, images, and time-series data while calling external tools. It continues pretraining from Qwen3.5 and undergoes a full training chain from large-scale task exposure to reinforcement learning. The model matches the output quality of much larger trillion-parameter systems on specialized scientific benchmarks.

The InternLM team built this release to test task scaling—training on increasingly harder, more diverse professional assignments instead of just growing parameter counts. This approach lets a 35B model rival trillion-scale alternatives on tasks like molecular structure generation, physical signal analysis, and agentic reasoning, all while staying compact enough for local hardware. It is fully open-source, giving researchers and developers the freedom to inspect, fine-tune, and deploy it privately.

Efficient scientific reasoning and tool use

Key Features

Scales scientific tasks from pretraining to RL.
Generates material crystal structures from text descriptions.
Handles heterogeneous time-series data of any length.
Switches between deep-thinking and fast-response modes.
Improves agent performance on scientific workflows.

Professionals and serious hobbyists running consumer GPUs can use this model for complex scientific analysis without relying on cloud services. Small agencies benefit from private, local processing of sensitive data like research logs or sensor readings, with strong reasoning kept in-house. Its native tool-calling support also simplifies automating multi-step tasks such as weather data retrieval or seismic event detection.

Developer notes and deployment tips

The team implemented multi-token prediction with a shared-weight design and KL loss to speed up token output while keeping training and inference aligned. For agent-based tasks, the developers recommend leaving the thinking mode on, as turning it off can hurt performance. Time-series inference currently works only through the LMDeploy framework, though general text and image serving also supports vLLM and SGLang.

"Intern-S2-Preview achieves performance comparable to the trillion-scale Intern-S1-Pro on multiple core professional scientific tasks, while using only 35B parameters." — Source: Hugging Face

Project Links