Granite-4.1-30b Empowers Private AI Agents With Multi-Tool Skills

A colossal slab of raw granite roughly hewn but smoothed on one face with the characters granite-4.1-30b etched.

IBM has released Granite-4.1-30b, a 30‑billion parameter instruct model that brings upgraded tool calling and long‑context abilities to the open‑source community. It can summarize text, answer questions, write code, and handle function‑calling tasks across a dozen languages. The model was fine‑tuned from a base version using a mix of public permissively‑licensed datasets and internally generated synthetic data.

The Granite Team at IBM who also made Granite-4.0-3B and Granite-4.1-8B built this release by running the model through an improved post‑training pipeline. That pipeline combines supervised fine‑tuning with reinforcement learning alignment to produce better instruction following and chat behavior. The result is a dense transformer model that works as a foundation for AI assistants, business applications, and LLM agents that need to call external tools.

Stronger tool use and multilingual performance

Key Features
  • 30 billion parameters with 128k context length.
  • Improved tool calling for external API integration.
  • Supports 12 languages including German, Japanese.
  • Strong performance on math and code benchmarks.
  • Fill‑in‑the‑Middle for code completion tasks.
  • Designed for business assistants and agent workflows.
  • Reinforcement learning alignment for safer chat.
  • Open license for fine‑tuning and commercial use.

Small agencies and privacy‑conscious professionals can run this model on a single high‑end GPU to keep sensitive data local while automating business tasks with tool‑calling features. Hobbyists with prosumer hardware get a capable assistant for coding, multilingual projects, and retrieval‑augmented generation. The model’s balance of size and power makes it practical even for users who are not large‑scale cloud operators.

Training setup and known limits

IBM trained the Granite‑4.1‑30B on an NVIDIA GB200 NVL72 cluster at CoreWeave, linked by a fast InfiniBand fabric that scaled the work across thousands of GPUs. The team cautions that multilingual performance can trail English results, though adding a few examples often helps the model produce more accurate answers. It may still generate inaccurate, biased, or unsafe responses, so the developers recommend pairing it with a safety guard model.

Granite 4.1 models have gone through an improved post-training pipeline, including supervised finetuning and reinforcement learning alignment, resulting in enhanced tool calling, instruction following, and chat capabilities.