ImageTagger Debuts to Clean Machine Learning Datasets

Large slate-colored metallic tag with the word ImageTagger engraved.

ImageTagger is a desktop annotation tool designed for managing image and text pairs, specifically built for machine learning dataset curation workflows. The application provides a streamlined interface for teams and solo practitioners who need to maintain clean, consistent, and model-ready image caption or tagging datasets.

The tool was created as a replacement for TagGUI, an abandoned project that served as the primary inspiration for the core UI layout. Developer artemyvo built ImageTagger to fill the gap left by TagGUI while adding new capabilities like Ollama integration for AI-assisted annotation workflows.

Core ImageTagger functionality and workflow

  • Fast 3-pane workflow with image thumbnail previews and tag input panels.
  • File-paired annotation editing for images and their .txt sidecar files.
  • Ollama integration for connecting to local or remote vision models.
  • Batch generation for tags and description text at scale.
  • Validation pipeline for identifying issues in existing annotations.
  • Merge-based fixup dialog for safely reviewing and applying corrections.
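The file-paired layout in the list above, where each image has a .txt sidecar with the same base name, can be sketched in a few lines of Python. This is an illustration only, not ImageTagger's own code: the function name `pair_images_with_sidecars` and the list of image extensions are assumptions.

```python
from pathlib import Path

# Assumed set of extensions; the article does not say which formats ImageTagger reads.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def pair_images_with_sidecars(folder):
    """Map each image file name in `folder` to its sidecar text, or None if no .txt exists."""
    pairs = {}
    for image_path in sorted(Path(folder).iterdir()):
        if image_path.suffix.lower() not in IMAGE_EXTENSIONS:
            continue  # skip sidecars and unrelated files
        sidecar = image_path.with_suffix(".txt")  # same base name, .txt extension
        if sidecar.is_file():
            pairs[image_path.name] = sidecar.read_text(encoding="utf-8").strip()
        else:
            pairs[image_path.name] = None  # image has no annotation yet
    return pairs
```

Images that map to None are the ones still waiting for a caption or tag list, which makes them natural candidates for batch generation.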

Machine learning engineers maintaining image-caption datasets and researchers running iterative cleanup before training will find the validation and merge workflow particularly useful. The tool helps identify annotation issues early and supports continuous maintenance for large collections that accumulate noisy or inconsistent descriptions over time.
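To make the validate-then-merge idea concrete, here is a rough sketch of what checking a comma-separated tag string and proposing a cleaned version might look like. The specific checks (empty annotations, stray commas, case-insensitive duplicates) and the function names are assumptions chosen for illustration; the article does not describe which rules ImageTagger's validation pipeline actually applies.

```python
def validate_tags(text):
    """Return a list of human-readable issues found in one comma-separated tag string."""
    if text is None or not text.strip():
        return ["empty annotation"]
    issues = []
    tags = [t.strip() for t in text.split(",")]
    if any(not t for t in tags):
        issues.append("blank tag (stray comma)")
    seen = set()
    for tag in tags:
        key = tag.lower()
        if key and key in seen:
            issues.append(f"duplicate tag: {tag}")
        seen.add(key)
    return issues


def propose_fix(text):
    """Suggest a cleaned tag string: trimmed, blanks dropped, first occurrence of each tag kept."""
    kept, seen = [], set()
    for tag in (t.strip() for t in (text or "").split(",")):
        if tag and tag.lower() not in seen:
            seen.add(tag.lower())
            kept.append(tag)
    return ", ".join(kept)
```

In a merge-style review, the original string and the output of `propose_fix` would be shown side by side so a person can accept or reject the change rather than having it applied silently.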

Developer notes and platform support

The developer notes that ImageTagger is intentionally smaller in scope than TagGUI and does not currently implement advanced tag operations such as regex-based filtering or tag replacement flows, though these may be added in future development. The most significant addition beyond the original inspiration is Ollama integration, which lets users connect to a server and choose models directly within the app. The developer states,

'hooking that to vision-enabled models produces interesting results.'

Based on project usage, the developer recommends Qwen3-VL 8B for the best balance of performance and annotation accuracy, as it fits well on consumer GPUs with 16 GB of VRAM. Because ImageTagger is built with Python and PyQt6, it is expected to be cross-platform. Development has focused primarily on Windows, however, so Linux and macOS support is presumed but not thoroughly tested.
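For readers curious what talking to an Ollama server with a vision model involves, the sketch below sends one image to Ollama's `/api/generate` endpoint using only the Python standard library. It is not taken from ImageTagger. The server address is Ollama's usual local default, and the model tag `qwen3-vl:8b`, the prompt, and the function names are assumptions that may need adjusting to match what is installed.

```python
import base64
import json
import urllib.request

# Assumptions: Ollama's default local address and an illustrative model tag.
OLLAMA_URL = "http://localhost:11434/api/generate"
MODEL = "qwen3-vl:8b"


def build_request(image_bytes, prompt, model=MODEL):
    """Build the JSON body for /api/generate with one base64-encoded image attached."""
    return {
        "model": model,
        "prompt": prompt,
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,  # ask for a single JSON reply instead of a token stream
    }


def describe_image(image_path, prompt="List comma-separated tags for this image.",
                   url=OLLAMA_URL):
    """Send one image to the server and return the generated text."""
    with open(image_path, "rb") as f:
        body = build_request(f.read(), prompt)
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]
```

Pointing `url` at another machine covers the remote-server case; looping `describe_image` over a folder and writing each result to the matching .txt file would approximate batch generation.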

Learn more about ImageTagger on GitHub.