Multimodal

About multimodal releases

Discover new open-source multimodal models. This archive covers models that handle multiple modalities, such as text, images, audio, and more.

Latest multimodal models

May 15, 2026
Qwen3.6-27B-MTP-UD-GGUF Makes Your GPU Think Ahead

Havenoammo’s new Qwen3.6-27B-MTP-UD-GGUF package combines Unsloth Dynamic 2.0 XL quantization with grafted Multi-Token Prediction (MTP) layers for the Qwen3.6 27B model. This format enables speculative decoding, where the model predicts […]

May 15, 2026
MiniCPM-V-4.6 Packs Private Visual AI Into Phones

MiniCPM-V-4.6 is a new open-source multimodal model that brings image and video understanding directly to smartphones and small computers. It answers questions about photos and video clips without a cloud […]

May 12, 2026
Qwen3.5-9B-DeepSeek-V4-Flash-GGUF Brings Deep Reasoning Home

The Qwen3.5-9B-DeepSeek-V4-Flash-GGUF is a compressed language model that packs DeepSeek-V4’s advanced reasoning into a 9-billion-parameter package for local use. It converts the full model into the GGUF format, so it […]

May 12, 2026
Qwen3.6-27B-Heretic-Uncensored-FINETUNE-NEO-CODE-Di-IMatrix-MAX-GGUF

The Qwen3.6-27B-Heretic-Uncensored-FINETUNE-NEO-CODE-Di-IMatrix-MAX-GGUF package delivers an uncensored, performance-enhanced version of Qwen’s latest 27B model in highly accurate compressed formats. This release strips away the original model’s refusal behavior, cutting the refusal […]

May 12, 2026
Google Turbocharges Gemma 4 With Gemma-4-26B-A4B-it-assistant

Google just dropped a new tool that makes its open-source AI models run much faster. The Gemma-4-26B-A4B-it-assistant is a lightweight draft model that predicts tokens ahead of the main model, […]

May 12, 2026
Google Drops Gemma-4-31B-It-Assistant To Triple Local AI Speed

The Gemma-4-31B-It-Assistant is a lightweight draft model built to speed up text generation when paired with Google’s full Gemma 4 31B instruction-tuned model. It uses a technique called speculative decoding […]

May 10, 2026
Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4 Opens Local Multimodal AI

NVIDIA has released Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4, an open multimodal AI model that simultaneously processes video, audio, images, and text. The 31-billion-parameter system uses a hybrid Mamba2-Transformer design that activates only about 3 […]

May 7, 2026
Mistral AI Introduces Mistral-Medium-3.5-128B As One Unified Tool

Mistral-Medium-3.5-128B is a dense flagship model designed to handle complex reasoning, coding, and instruction-following tasks. It serves as a unified replacement for several previous models released by the company. The […]

April 30, 2026
Nvidia Unleashes Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16 Locally

NVIDIA recently released Nemotron-3-Nano-Omni-30B-A3B-Reasoning-BF16, an open multimodal AI system that processes video, audio, images, and text in a single workflow. Users can run it locally to summarize lengthy meetings, transcribe […]
