CaptionFoundry Free Captioning Tool

CaptionFoundry is a free desktop application that helps users prepare image datasets for AI model training. It uses local vision AI models to automatically generate captions for images, eliminating the need for manual captioning or cloud services.
Created by whatsthisaithing as a side project, this tool addresses the tedious task of organizing and captioning hundreds of images for LoRA and fine-tuning workflows. Everything runs on your own computer, so your data stays private and there are no API costs.
What CaptionFoundry Offers
- Folder tracking with drag-and-drop support.
- AI auto-captioning using local Ollama or LM Studio vision models.
- Bulk caption editing with find/replace, prepend/append, and regex support.
- Version history with rollback for all caption changes.
- Smart export with sequential numbering and format conversion.
- 100% non-destructive workflow that never modifies original files.
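To make the bulk-editing and version-history features above concrete, here is a minimal Python sketch of the general idea. It is not CaptionFoundry's code: the function names (`edit_caption`, `edit_with_history`) and the list-based history are hypothetical, chosen only to illustrate find/replace, prepend/append, regex edits, and rollback on caption text.

```python
import re

def edit_caption(caption, find=None, replace="", prepend="", append="", use_regex=False):
    """Return an edited copy of a caption (hypothetical helper, not the tool's API)."""
    if find:
        if use_regex:
            caption = re.sub(find, replace, caption)   # regex find/replace
        else:
            caption = caption.replace(find, replace)   # literal find/replace
    return f"{prepend}{caption}{append}"

def edit_with_history(history, **edits):
    """Return a new history list with the edited caption appended.

    Earlier versions are never overwritten, so rolling back is just
    picking an earlier entry.
    """
    return history + [edit_caption(history[-1], **edits)]

history = ["a photo of a dog on a beach"]
history = edit_with_history(history, find="dog", replace="corgi")
history = edit_with_history(history, prepend="mytoken, ")
history = edit_with_history(history, find=r"\s+on a \w+$", use_regex=True)

print(history[-1])  # mytoken, a photo of a corgi
print(history[0])   # a photo of a dog on a beach
```

Keeping every version in a list is the simplest way to show rollback; a real tool would more likely store versions in a database or sidecar files.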
Hobbyists training custom AI models will find this tool useful for preparing datasets without spending hours on manual captioning. Privacy-conscious users can process their images locally without uploading anything to external servers.
A side project built for real workflows
The developer built CaptionFoundry to solve their own pain points with existing captioning tools. They describe it on Reddit as 'vibecoded in a day, work in progress' and note that it may not have every feature competitors offer. As a free side project, it comes with no guarantees, but the developer plans to continue supporting it as time allows.
Future updates may include basic video dataset support, though the developer mentions having a day job and family commitments. The application requires Python 3.10+, Node.js 18+, and either Ollama or LM Studio with a vision model loaded.
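For readers curious what talking to a local vision model looks like, the sketch below sends one image to Ollama's local HTTP API using only the Python standard library. It illustrates the general approach and is not taken from CaptionFoundry. It assumes Ollama is running on its default port, and the model name `llava`, the prompt text, and both function names are placeholders picked for this example.

```python
import base64
import json
import urllib.request

# Ollama's default local endpoint; adjust if your setup differs.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_caption_request(image_bytes, model="llava",
                          prompt="Describe this image in one sentence."):
    """Build the JSON payload for a single-image caption request.

    The model name and prompt are assumptions for this sketch.
    """
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for one complete JSON reply instead of a stream
    }

def request_caption(image_path, **kwargs):
    """Send the image to the local Ollama server and return the caption text."""
    with open(image_path, "rb") as f:
        payload = build_caption_request(f.read(), **kwargs)
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"].strip()
```

Calling `request_caption("photo.jpg")` only works with Ollama running and a vision model already pulled; the image never leaves the local machine.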
CaptionFoundry offers a practical solution for anyone preparing image datasets locally without relying on cloud services or paid APIs.
Get CaptionFoundry on GitHub. Image by whatsthisaithing.