May 2026

The open‑source AI scene delivered an avalanche of new models, tools, and adapters this month. From trillion‑parameter LLMs to tiny on‑device translation apps, there’s something for everyone. Here’s your straight‑to‑the‑point breakdown.

Large Language Models

Heavyweight Reasoning Giants

Ring‑2.6‑1T is a trillion‑parameter model purpose‑built for continuous agent workflows and complex multi‑step tasks. Ling‑2.6‑1T brings another trillion‑parameter option, this time focused on coding and tool calling. Mistral‑Medium‑3.5‑128B arrives as a dense flagship that consolidates several previous Mistral releases into one model.

Command‑A‑Plus‑05‑2026‑Bf16 runs 25 billion active parameters and handles both text and images with a 128K context window. Step‑3.7‑Flash is a 198B vision‑language model that activates only 11B parameters per token thanks to sparse mixture‑of‑experts routing. NVIDIA‑Nemotron‑Labs‑3‑Elastic‑30B‑A3B‑BF16 is a single checkpoint that can serve three different reasoning model sizes on the fly.
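For readers new to sparse mixture‑of‑experts routing, here is a minimal sketch of the general idea: a router scores every expert, only the top few run, and their outputs are blended. It is not Step‑3.7‑Flash's actual implementation. The function names, the scalar "experts", and the choice of 2 active experts out of 8 are all assumptions made for illustration.

```python
import math
import random

def top_k_route(router_logits, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    top = sorted(range(len(router_logits)),
                 key=lambda i: router_logits[i], reverse=True)[:k]
    # Softmax over the selected logits only, so the k weights sum to 1.
    peak = max(router_logits[i] for i in top)
    exps = [math.exp(router_logits[i] - peak) for i in top]
    total = sum(exps)
    return [(i, e / total) for i, e in zip(top, exps)]

def moe_layer(x, experts, router_logits, k=2):
    """Blend the outputs of the k selected experts; the rest are never called."""
    out = 0.0
    for idx, weight in top_k_route(router_logits, k):
        out += weight * experts[idx](x)
    return out

# Hypothetical toy setup: 8 scalar "experts", 2 active per token.
experts = [lambda x, s=s: s * x for s in range(1, 9)]
random.seed(0)
logits = [random.gauss(0.0, 1.0) for _ in experts]

routes = top_k_route(logits, k=2)
y = moe_layer(3.0, experts, logits, k=2)
active_fraction = len(routes) / len(experts)
print(routes, y, active_fraction)
```

In this toy, a quarter of the experts run per token. The same principle, at a different ratio, is how a 198B model can get by with roughly 11B active parameters per token.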

Emo introduces a mixture‑of‑experts design where experts self‑organize into topics without human labels. ZAYA1‑8B uses only 760 million active parameters for deep long‑form reasoning tasks. Intern‑S2‑Preview is a 35B scientific assistant that understands text, images, and time‑series while calling external tools.

Uncensored & Modified Models

The refusal‑free movement is stronger than ever. Qwen3.6‑27B‑OBLITERATED surgically reduces safety refusals through direct weight editing, while Qwen3.5‑27B‑uncensored‑heretic‑v2‑Native‑MTP‑Preserved keeps all 15 Multi‑Token Prediction layers intact after censorship removal. Qwen3.6‑35B‑A3B‑Uncensored‑Genesis‑V2‑APEX‑MTP‑GGUF delivers a refusal‑free Qwen MoE as a ready‑to‑run quantized package.

Qwen3.6‑35B‑A3B‑uncensored‑heretic‑Native‑MTP‑Preserved cuts unwanted refusals by 88% while preserving 19 MTP layers. Qwen3.6‑27B‑AEON‑Ultimate‑Uncensored‑BF16 strips the “safety tax” completely for direct instruction‑following. Qwen3.6‑27B‑Heretic‑Uncensored‑FINETUNE‑NEO‑CODE‑Di‑IMatrix‑MAX‑GGUF packages the refusal‑free model in highly accurate compressed GGUF formats.

Gemma 4 models got the same treatment. Gemma‑4‑Ortenzya‑The‑Creative‑Wordsmith‑31B‑it‑uncensored‑heretic cuts refusals while boosting creative writing, G4‑MeroMero‑31B‑uncensored‑heretic strips refusals for storytelling, and Gemma‑4‑Gembrain‑31B‑it‑uncensored‑heretic uses abliteration to remove safety blocks. Gemma4‑26B‑A4B‑Uncensored‑HauhauCS‑Balanced scores zero refusals across 465 test prompts while keeping full capabilities.

Compact & Specialized Models

Small models are punching above their weight. Supra‑50M is a tiny 50M‑parameter model trained from scratch that beats GPT‑2 on specific benchmarks. MiniCPM5‑1B runs entirely on personal devices, switching between a fast assistant and a deeper reasoning mode. Nandi‑Mini‑600M‑Early‑Checkpoint is an early preview of a compact model supporting English and 11 Indic languages.

Translation gets a major boost. Hy‑MT2‑30B‑A3B is an open‑source MoE translator covering 33 languages, while Hy‑MT2‑1.8B offers speedier translation for real‑world text. Hy‑MT1.5‑1.8B‑1.25bit shrinks the translation system to run entirely on a phone offline.

Domain‑specific models also appeared. AntAngelMed is a medical MoE model for clinical reasoning; Leanly_AI supports psychologists working with obesity patients. SmallCode is a terminal‑based coding agent that keeps code private on your hardware. needle is a 26M‑parameter model built exclusively for function calling and tool use. BitCPM4‑CANN‑8B restricts each weight to one of three values, cutting memory use sixfold. Ettin‑Reranker‑1b‑V1 boosts search quality by scoring text pairs. HRM‑Text‑1B uses a dual‑timescale architecture instead of a standard transformer.
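To make the three‑value idea behind BitCPM4‑CANN‑8B concrete, here is a rough sketch of ternary quantization. It assumes absmean rounding and 2‑bit packing, both borrowed from common ternary schemes rather than confirmed for this model. The sample weights and function names are hypothetical.

```python
def ternary_quantize(weights):
    """Map floats to {-1, 0, +1} plus one shared scale (absmean rounding, assumed)."""
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    codes = [max(-1, min(1, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Approximate the original weights from the ternary codes."""
    return [c * scale for c in codes]

def pack_2bit(codes):
    """Pack four ternary codes per byte (2 bits each; one bit pattern stays unused)."""
    packed = bytearray()
    for start in range(0, len(codes), 4):
        byte = 0
        for offset, code in enumerate(codes[start:start + 4]):
            byte |= (code + 1) << (2 * offset)
        packed.append(byte)
    return bytes(packed)

# Hypothetical sample weights.
weights = [0.42, -0.05, -0.91, 0.30, 0.02, -0.47, 0.88, -0.11]
codes, scale = ternary_quantize(weights)
packed = pack_2bit(codes)
fp16_bytes = 2 * len(weights)  # 16-bit storage for comparison

print(codes)                    # [1, 0, -1, 1, 0, -1, 1, 0]
print(fp16_bytes, len(packed))  # 16 2
```

The packed tensor here is 8 times smaller than 16‑bit storage. The sixfold figure quoted above is the release's own whole‑model number, which this sketch does not try to reproduce.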

Fara‑7B is an open‑weight computer‑use agent that plans and executes web tasks by seeing screenshots. NuExtract3 extracts structured data from documents into Markdown using a 4B vision‑language model. Qwopus3.5‑9B‑Coder‑GGUF and MiMo‑V2.5‑coder‑Q2 bring compressed coding agents to local machines. Qwopus3.6‑27B‑v2‑MTP‑GGUF delivers a quantized reasoning model using multi‑token prediction.

Vision & Image Models

Text‑to‑Image Generators

Microsoft’s Lens is a 3.8B foundational model that outperforms many 6B+ alternatives with far less training compute. Its distilled sibling Lens‑Turbo generates high‑quality images in just four steps. Walkyrie‑1.3B‑v1.0 was rebuilt from a video model to produce crisp 1024×1024 images from prompts.

HiDream‑O1‑Image creates, edits, and personalizes pictures without needing separate compression tools. Nemotron‑Labs‑Diffusion‑14B can generate text either normally or with a faster diffusion‑based parallel method. Anima Base v1.0 is a 2B model focused on anime‑style and non‑photorealistic artwork.

Adapters & LoRAs

Style control is expanding. Flux.2‑Klein‑Loras packs multiple style adapters for the Flux.2 Klein 9B model. AsymFLUX.2‑klein‑9B lets the same base model generate raw pixel images without a VAE. Qwen‑2512‑portrait sharpens human portraits with natural skin detail, and UltraReal_FineTune_Anima pushes the Anima generator toward realistic photo outputs.

Vision‑Language Models

Vision‑language AI is now faster and more capable. LocateAnything‑3B from NVIDIA marks objects or text in images based on plain prompts. Keye‑VL‑2.0‑30B‑A3B understands long videos and performs agent tasks like web search using sparse attention. MiniCPM‑V‑4.6 brings image and video understanding directly to smartphones with a cloud‑free experience.

SenseNova‑U1‑A3B‑MoT unifies image understanding, generation, and editing without separate visual encoders. Ovis2.6‑80B‑A3B examines high‑res images and long documents efficiently. NVIDIA also released Nemotron‑3‑Nano‑Omni‑30B‑A3B‑Reasoning‑NVFP4, a multimodal AI that processes video, audio, and text in a single pipeline.

NVIDIA’s PiD is a pixel diffusion decoder that speeds up high‑resolution image generation by denoising directly in pixel space. Lance is a unified model that handles image and video understanding, generation, and editing in one place. IBM contributed Granite‑4.1‑8b for chat and instruction following, and Granite‑4.1‑30b for upgraded tool calling and long‑context tasks.

Video Models & Tools

Video Generation

SANA‑WM Bidirectional creates minute‑long 720p videos from a single starting image. LongCat‑Video‑Avatar‑1.5 generates talking avatar videos with realistic characters or animations. LTX2.3‑10Eros turns a still image into a short motion clip by merging layers from different training steps.

Pixal3D transforms a single image into a detailed 3D asset. ScreenDiffusion instantly reimagines your desktop as living art through image‑to‑image AI. Phosphene is a free Mac panel that creates video clips with synced audio using LTX 2.3. studiomi300 chains multiple models to produce a 30‑second cinematic reel from a text prompt.

Video Editing & Adapters

The LTX Video 2.3 ecosystem is exploding. LTX‑2.3 Upscale IC Lora turns soft clips into cleaner, sharper footage. VR‑360‑Outpaint‑LTX2.3‑IC‑LoRA converts widescreen to full 360‑degree equirectangular video. SYSTMS‑FLW‑IC‑LORA‑LTX‑2.3 creates smooth shot‑to‑shot transitions.

LTX‑2.3‑Dearchive‑Lora makes vintage footage look like it was shot yesterday. Obscura Remova removes haze, smoke, or foreground objects from clips. LTX‑2.3‑22b‑IC‑LoRA‑LipDub replaces speech and lip motion with synchronized audio. OmniNFT uses reinforcement learning to better align audio and video generation.

Video Understanding

Marlin‑2B extracts structured descriptions and second‑precise timestamps from footage. Causal‑Forcing is a training method that distills large video models into efficient real‑time generators. Vlo v0.2.0 is a timeline‑based video editor with ComfyUI‑powered generative AI.

Audio, Voice & Music

Text‑to‑Speech & Voice Cloning

MOSS‑TTS‑v1.5 upgrades zero‑shot voice cloning with better quality. DramaBox turns scene descriptions into expressive speech with laughs, sighs, and pauses. Scenema‑Audio clones voices and performs scene‑aware emotional speech with ambient sounds.

Supertonic‑3 runs fully on‑device TTS with ONNX, expanding language support. Derpy‑Turtle‑The‑Kokoro‑Trainer blends Kokoro TTS with RVC voice conversion in a Windows GUI. Comfyui‑controlfoley generates synced foley sound effects directly from silent video.

Music Generation

Ace‑Step‑1.5‑XL‑Concept‑Sliders lets you push a local music generator toward or away from specific audio traits. Ace‑Step‑1.5‑Api‑server‑UI wraps the model into a full‑featured local studio interface. MusiCue converts songs into timeline‑based cues for driving animation or show‑control software.

The ComfyUI Explosion

Prompt & Style Nodes

ComfyUI‑SmartPromptCrafter auto‑builds optimized prompt pairs for any checkpoint. RebelsPromptEnhancer rewrites short ideas into detailed prompts using a local 4B model. ErniePEUnleashed adds foreground‑background layering and lighting logic to descriptions. ComfyUI‑Anima‑Style‑Nodes lets you visually browse and apply anime artist tags, and Comfyui‑Anima‑Regional‑Conditioning routes attention so masked areas get specific prompts only.

Image Generation & Editing Nodes

Orion4D_generative_paint adds a full painting interface right in the browser. ComfyUI‑Olm‑Liquify brings Photoshop‑style warping with push, twirl, and pinch brushes. ComfyUI‑Angelo merges a sampler and inpaint refiner so you can paint fixes directly on outputs. ComfyUI_KleinTiledUpscaler upscales using tiled inpainting for creative detail.

ComfyUI‑PiD integrates NVIDIA’s pixel diffusion decoder, skipping the traditional VAE. ComfyUI‑FeatherOps accelerates diffusion inference on AMD RDNA3 GPUs with a custom HIP kernel. ComfyUI‑Safe‑Chunked‑Image‑Blend gives explicit control over batch resize and blending. ComfyUI‑ReferenceLatentPlus provides per‑image control over how references influence outputs. ComfyUI‑Untwisting‑RoPE brings training‑free style transfer to diffusion transformer models.

Audio, Video & 3D Nodes

ComfyUI‑Yedp‑Action‑Director puts a full 3D viewport with path tracing into workflows. ComfyUI‑Magos‑Nodes adds a skeleton editor and retargeter for body and face keypoints. Comfyui_VideoCombine_Plus extends video combining with sound volume and extra controls. ComfyUI‑DramaBox brings ResembleAI’s expressive TTS into the node graph. ComfyUI‑XAV‑Google‑Sheets pulls text from a public Google Sheet to drive generations.

Workflow, Utility & Hardware Nodes

WorkflowX‑Configurator switches between workflow profiles without duplicating graphs. ComfyUI‑Workflow‑Finder searches local workflow collections with plain English descriptions. ComfyUI‑lora‑FindingLora replaces the stock LoRA loader with fuzzy search, bookmarking, and trigger word storage. BangtrixToolkit overlays a real‑time hardware monitor onto the canvas and includes a universal prompt translator.

ComfyUI‑ialhabbal bundles eight tools including interactive prompt review and batch loading. ComfyUI_ShowMe lets you draw annotation notes directly on the workflow canvas. ComfyUI‑gonztok_nodes replaces text inputs with visual pickers for images and LoRAs. ComfyUI‑SPEED nearly doubles sampling speed with Spectral Progressive Diffusion. Comfyui‑Mesh splits inference across two NVIDIA GPUs using NVENC video encoder chips. ComfyUI‑PlagueKind‑Nodes unifies image and mask resizing in one step.

ComfyUI‑Fayens streamlines face swap pipelines with automatic face crops and masks. Comfyui‑Clippy‑Reloaded pastes images directly from your clipboard into a workflow. comfyui‑artius‑browser adds a fast sidebar asset manager for dragging images, videos, and 3D files. Comfyui‑node‑canvas gives a GUI app to build custom nodes without writing boilerplate code.

Inference Engines & Local Utilities

Running Models on Your Own Hardware

hipEngine is a new ROCm‑native inference engine for AMD RDNA3 GPUs that runs LLMs without PyTorch. MiniCPM‑V‑4.6‑OrangePi uses a from‑scratch C++ engine to bring the vision‑language model to a $100 edge board. Beellama.cpp forks llama.cpp with DFlash speculative decoding, TurboQuant KV‑cache compression, and better memory usage.

ExLlamaV3 introduces the EXL3 quantization format, which targets very low bitrates. ds4.pinokio launches DeepSeek V4 Flash on Apple Silicon Macs with a native Metal engine. Deepseek‑V4‑GGUF shrinks the massive model to fit high‑end consumer GPUs. Qwen3.5‑9B‑DeepSeek‑V4‑Flash‑GGUF packs DeepSeek‑V4’s reasoning into a 9B package for local use.

Draft models speed up generation. Gemma‑4‑26B‑A4B‑It‑Assistant and Gemma‑4‑31B‑It‑Assistant are lightweight drafters that predict tokens ahead of the full model. Gemma‑4‑31B‑It‑DFlash works alongside Gemma 4 31B Instruct to accelerate text output. Lucebox‑hub adds DFlash speculative decoding and PFlash speculative prefill for AMD Ryzen GPUs. Kimi‑K2.6‑NVFP4 is a pre‑quantized version of the Kimi‑K2.6 model for NVIDIA hardware.
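If the drafter idea is new to you, the sketch below shows the basic loop of greedy speculative decoding: a cheap model guesses a few tokens ahead, and the full model keeps the guesses it agrees with and corrects the first one it does not. This is a toy under stated assumptions, not how DFlash or the Gemma drafters are implemented. The two "models" are hypothetical functions over integer tokens.

```python
def speculative_step(context, draft_next, target_next, k=4):
    """One round of greedy speculative decoding; returns the tokens accepted."""
    # 1. The cheap drafter guesses k tokens ahead.
    proposal, ctx = [], list(context)
    for _ in range(k):
        tok = draft_next(ctx)
        proposal.append(tok)
        ctx.append(tok)
    # 2. The full model checks each guess (a real engine does this in one batched pass).
    accepted, ctx = [], list(context)
    for tok in proposal:
        expected = target_next(ctx)
        if tok != expected:
            accepted.append(expected)  # replace the first wrong guess and stop
            return accepted
        accepted.append(tok)
        ctx.append(tok)
    accepted.append(target_next(ctx))  # every guess was right: one bonus token
    return accepted

def generate(prompt, draft_next, target_next, n_tokens, k=4):
    """Generate n_tokens and report how many verification rounds were needed."""
    out, rounds = list(prompt), 0
    while len(out) - len(prompt) < n_tokens:
        out.extend(speculative_step(out, draft_next, target_next, k))
        rounds += 1
    return out[len(prompt):len(prompt) + n_tokens], rounds

# Hypothetical toy models: the target counts upward modulo 10,
# and the drafter agrees except right after token 7.
def target_next(ctx):
    return (ctx[-1] + 1) % 10

def draft_next(ctx):
    return 0 if ctx[-1] == 7 else (ctx[-1] + 1) % 10

tokens, rounds = generate([0], draft_next, target_next, n_tokens=12)
print(tokens)  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
print(rounds)  # 3
```

The output matches what the target model alone would produce, but it takes 3 verification rounds instead of 12 sequential steps. The speedup depends entirely on how often the drafter guesses correctly.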

Compression & Quantization Tools

FP16‑FP8‑to‑NVFP4 converts diffusion model files to NVFP4 format for Blackwell GPUs. Torch‑Nvenc‑Compress uses the GPU’s idle video encoder to compress ML data. Shrinking models is key; Qwen3.6‑27B‑GGUF‑MTP keeps multi‑token prediction layers intact in GGUF, while Qwen3.6‑27B‑MTP‑UD‑GGUF pairs Unsloth quantization with grafted MTP layers for speculative decoding.

Everyday Interface & Privacy Tools

TextGen morphs into a no‑install portable desktop app for local LLMs. Tokenspeed helps you feel what different tokens‑per‑second rates actually mean while working. Nexus‑BTA bundles image, video, and 3D generation into one local AI studio with an embedded ComfyUI runtime. EasyUI replaces node‑graph editing with a clean open‑source web interface for local tools. somni‑comfyui delivers a polished ComfyUI frontend with a chat mode for quick generations.
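The arithmetic behind tokens‑per‑second is simple enough to try yourself. This small sketch is not taken from Tokenspeed. The 500‑token reply length and the three rates are arbitrary assumptions.

```python
def seconds_to_generate(n_tokens, tokens_per_second):
    """Wall-clock time to stream n_tokens at a steady generation rate."""
    return n_tokens / tokens_per_second

answer_tokens = 500  # hypothetical reply length
timings = {rate: seconds_to_generate(answer_tokens, rate) for rate in (5, 20, 60)}

for rate, seconds in timings.items():
    print(f"{rate:>3} tok/s -> {seconds:.1f} s")
```

At 5 tokens per second that reply takes 100 seconds, at 20 it takes 25, and at 60 it lands in a little over 8.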

OpenReader reads EPUB, PDF, and Markdown files aloud while highlighting words in sync. AI Metadata Viewer lets you drag an AI‑generated image to see all creation data instantly. Streamlined‑HF‑Model‑Search is a single HTML file that explores Hugging Face models and quantizations. Merlin‑community strips repeated text from prompts to improve output quality. Opendesk gives agents direct control over a desktop, including screenshot, mouse, and keyboard actions.

Datasets, Training & Curation

Dataset Preparation & Refinement

IMG‑Dataset‑Refiner turns messy folders into clean, balanced datasets with a visual editor. Caption‑Creator generates high‑quality image captions and tags locally. Diff‑forge automates video dataset preparation for diffusion model fine‑tuning. Cull scrapes, classifies, and sorts AI‑generated images into organized folders. Deepbooru‑tagwalker improves existing tags in image datasets without manual editing.

Training & Fine‑Tuning Helpers

Bracket runs many short training experiments in parallel to find the best fine‑tuning hyperparameters. Anima‑TrainFlow is a single‑page desktop tool for training LoRAs on the Anima 2B model. ControlLight brightens low‑light photos with a simple slider while maintaining quality. FP‑Background_Obliterator pairs AI background removal with a full layer‑based editor. ShrinkComfy compresses ComfyUI PNG outputs to WEBP or JPG while preserving workflow metadata.

Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF Removes All Refusals

31 May 2026

Qwen3.6-35B-A3B-Uncensored-Genesis-V2-APEX-MTP-GGUF is a fully quantized, refusal-free language model that packages the original Qwen3.6‑35B‑A3B MoE architecture into ready‑to‑run GGUF files. This release combines APEX and MTP‑APEX quantization formats with a numerical […]

Kezmark Drops ErniePEUnleashed To Craft Cinematic Scene Prompts

31 May 2026

ErniePEUnleashed is a fine-tuned prompt enhancement model that transforms short ideas into richly detailed, spatially structured descriptions for AI image generation. It pays attention to foreground-midground-background layering, lighting logic, camera […]

Lens Focuses High-Quality Image Creation On Your Home GPU

30 May 2026

Microsoft has released Lens, a 3.8-billion-parameter text-to-image model that generates high-quality images with much lower training compute requirements than larger alternatives. It outperforms or matches 6B+ parameter models on standard […]

Qwopus3.6-27B-v2-MTP-GGUF Puts Faster Stepwise AI on Your GPU

29 May 2026

Jackrong has released Qwopus3.6-27B-v2-MTP-GGUF, a quantized version of the new Qwopus reasoning model that uses multi-token prediction to speed up text generation. The original Qwopus3.6-27B-v2-MTP model was fine-tuned from Qwen3.6-27B […]

Zero Refusals Gemma4-26B-A4B-Uncensored-HauhauCS-Balanced Drops

27 May 2026

Gemma4-26B-A4B-Uncensored-HauhauCS-Balanced is a version of Google’s Gemma 4-26B model with all refusal mechanisms removed while keeping the original capabilities fully intact. This release candidate scored zero refusals across 465 standard […]

Marlin-2B Pins Down Every Second Of Your Video

26 May 2026

Marlin-2B is a new open-source video language model that extracts structured descriptions and second‑precise timestamps from video footage. It answers the two questions developers most often ask about a video: […]

Opendesk Unlocks Direct Desktop Control for AI Agents

26 May 2026

The Opendesk framework gives any AI agent direct control over a desktop computer—screenshots, mouse, keyboard, and app interaction—just like a real person. It works across macOS, Linux, and Windows, turning […]

Studiomi300 Spins One Prompt Into a 30s Cinematic Reel

26 May 2026

The studiomi300 pipeline turns a single text prompt into a complete 30-second cinematic reel, complete with consistent characters, music, and voice-over. It strings together multiple large AI models—a director, image […]

MusiCue Chisels Music Into Frame-Perfect Animation Cues

26 May 2026

MusiCue is an open-source tool that converts songs into typed, timeline-based cues for driving animation and show-control software. Developer cedarconnor built it to break audio into separated stems, beats, drum […]

Unsloth Drops Qwen3.6-27B-GGUF-MTP For 2x Faster Local AI

19 May 2026

Unsloth has released Qwen3.6-27B-GGUF-MTP, a quantized model file that preserves the multi-token prediction (MTP) layers from Qwen’s latest 27-billion-parameter language model. This GGUF format makes it possible to run the […]

Ovis2.6-80B-A3B Lands Private Visual AI on a Single GPU

19 May 2026

Ovis2.6-80B-A3B is a new multimodal AI that pairs vision and language through a mixture-of-experts design, keeping it fast and efficient. It can examine high-resolution images, long documents, and even videos, […]

Trim 31GB AI Models To 13GB With FP16-FP8-to-NVFP4

15 May 2026

The FP16-FP8-to-NVFP4 tool by developer Thenotrealuser is a Windows-based converter that turns FP16 or BF16 diffusion model files into NVFP4 format for Blackwell GPUs. It targets popular image generation models […]

Qwen3.6-27B-MTP-UD-GGUF Makes Your GPU Think Ahead

15 May 2026

Havenoammo’s new Qwen3.6-27B-MTP-UD-GGUF package combines Unsloth Dynamic 2.0 XL quantization with grafted Multi-Token Prediction (MTP) layers for the Qwen3.6 27B model. This format enables speculative decoding, where the model predicts […]

MiniCPM-V-4.6 Packs Private Visual AI Into Phones

15 May 2026

MiniCPM-V-4.6 is a new open-source multimodal model that brings image and video understanding directly to smartphones and small computers. It answers questions about photos and video clips without a cloud […]

Qwen3.6-27B-Heretic-Uncensored-FINETUNE-NEO-CODE-Di-IMatrix-MAX-GGUF

12 May 2026

The Qwen3.6-27B-Heretic-Uncensored-FINETUNE-NEO-CODE-Di-IMatrix-MAX-GGUF package delivers an uncensored, performance-enhanced version of Qwen’s latest 27B model in highly accurate compressed formats. This release strips away the original model’s refusal behavior, cutting the refusal […]

ComfyUI-Fayens Brings Cinematic Polish to Face Swaps

11 May 2026

ComfyUI-Fayens is a new collection of custom nodes for ComfyUI that streamlines face swap workflows from start to finish. It automatically extracts clean face crops, generates accurate masks, and prepares […]

AEON-7 Unlocks Qwen3.6-27B-AEON-Ultimate-Uncensored-BF16

7 May 2026

Qwen3.6-27B-AEON-Ultimate-Uncensored-BF16 is a high-precision, uncensored large language model designed to follow instructions without refusal. It removes the "safety tax" found in standard models, allowing for more direct reasoning and compliance. […]