Supertonic-3 is a lightweight text-to-speech system that runs entirely on your device using ONNX Runtime, with no cloud calls needed for synthesis. This open-weight release expands language support from 5 […]
News
HiDream-O1-Image is an open-source image generation model that creates, edits, and personalizes visuals without relying on separate compression tools. It uses a Pixel-level Unified Transformer to process raw pixels, text, […]
MiniCPM-V-4.6 is a new open-source multimodal model that brings image and video understanding directly to smartphones and small computers. It answers questions about photos and video clips without a cloud […]
ComfyUI-XAV-Google-Sheets is a new custom node package for ComfyUI that lets you pull text directly from a public Google Sheet. It loads a shared spreadsheet as a data table and […]
Flux.2-Klein-Loras is a fresh bundle of style adapters for the Flux.2 Klein 9b distilled image model. It packs multiple LoRA files that let users generate or edit pictures in distinct […]
Causal-Forcing is a new training method that distills large autoregressive video models into efficient ones that can generate video in real time. The approach bridges a structural mismatch between teacher and […]
ComfyUI-ReferenceLatentPlus is a custom node that completely replaces ComfyUI’s original ReferenceLatent tool. It gives creators per-image control over how reference images influence the final output during image and video generation. […]
The Comfyui-Clippy-Reloaded add-on by Shootthesound lets you paste images directly from your computer’s clipboard into a ComfyUI workflow. It grabs whatever image you’ve copied, such as a screenshot, a browser image, […]
Qwen3.5-9B-DeepSeek-V4-Flash-GGUF is a compressed language model that packs DeepSeek-V4’s advanced reasoning into a 9-billion-parameter package for local use. It converts the full model into the GGUF format, so it […]