Unsloth Releases Quantized Z Image GGUF for Creators


Unsloth has released Z Image GGUF, a quantized version of the Z-Image foundation model tailored for efficient local execution on smaller GPUs. The release uses the 'Unsloth Dynamic 2.0 methodology for SOTA performance', in which 'important layers are upcasted to higher precision' to preserve output quality.
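To make the idea of upcasting important layers concrete, here is a minimal sketch of how mixed precision affects file size. The layer names, parameter counts, and bit widths are hypothetical toy values chosen for easy arithmetic; they are not Unsloth's actual recipe or Z-Image's real layout, and the estimate ignores the per-block scale overhead that real GGUF quantization formats carry.

```python
# Hypothetical layer inventory: (name, parameter count, treated as sensitive?)
# These are toy numbers for illustration, not the real Z-Image architecture.
LAYERS = [
    ("embed", 10_000_000, True),
    ("attention_blocks", 400_000_000, False),
    ("mlp_blocks", 580_000_000, False),
    ("final_proj", 10_000_000, True),
]


def estimate_size_gb(layers, low_bits=4, high_bits=8):
    """Estimate weight storage when sensitive layers keep more bits per weight."""
    total_bits = 0
    for _name, n_params, sensitive in layers:
        bits = high_bits if sensitive else low_bits
        total_bits += n_params * bits
    return total_bits / 8 / 1e9  # bits -> bytes -> gigabytes


mixed_gb = estimate_size_gb(LAYERS)                            # sensitive layers at 8 bits
uniform_gb = estimate_size_gb(LAYERS, low_bits=4, high_bits=4)  # everything at 4 bits

print(f"mixed precision: {mixed_gb:.2f} GB, uniform 4-bit: {uniform_gb:.2f} GB")
```

In this toy setup the sensitive layers make up only 2% of the weights, so keeping them at higher precision adds very little to the total size.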

By leveraging tooling from ComfyUI-GGUF by city96, the project adapts the Single-Stream Diffusion Transformer for broader hardware compatibility. The release provides a 'full-capacity, undistilled transformer' designed to serve as a robust backbone for creative and development workflows.

Core features & capabilities

  • Fits on GPUs with under 8GB of VRAM, less than Tongyi-MAI's original Z-Base model requires.
  • Supports full Classifier-Free Guidance (CFG) for precise prompt engineering.
  • Operates across a wide aesthetic spectrum including photography and anime.
  • Provides enhanced output diversity for distinct facial identities and lighting.
  • Allows for robust negative prompting to suppress unwanted artifacts.
  • Functions as a suitable base for LoRA training and structural conditioning.
  • Requires 28 to 50 steps for generation, prioritizing quality over speed.
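As a rough illustration of how CFG and negative prompting interact, the sketch below applies the standard classifier-free guidance formula to two toy prediction vectors. The function names and the numbers are hypothetical and stand in for real model outputs; this is not code from Unsloth, ComfyUI-GGUF, or the Z-Image repository.

```python
def cfg_combine(cond, neg, guidance_scale):
    """Standard CFG: start from the negative (or unconditional) prediction
    and move toward the prompt-conditioned prediction by guidance_scale."""
    return [n + guidance_scale * (c - n) for c, n in zip(cond, neg)]


def forward_passes(steps, use_cfg=True):
    """With CFG, each sampling step evaluates the model twice:
    once for the prompt and once for the negative prompt."""
    return steps * 2 if use_cfg else steps


# Toy per-step predictions standing in for real model outputs.
cond_pred = [0.5, -0.2, 1.0]  # conditioned on the prompt
neg_pred = [0.1, 0.0, 0.6]    # conditioned on the negative prompt

guided = cfg_combine(cond_pred, neg_pred, guidance_scale=4.0)
print(guided)
print(forward_passes(28), forward_passes(50))
```

A guidance scale of 1.0 reproduces the prompt-conditioned prediction, while larger values push the result further from whatever the negative prompt describes. Because each step needs two model evaluations under this scheme, the 28 to 50 step range translates to 56 to 100 forward passes per image.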

What the developer says

The project documentation emphasizes the importance of a non-distilled approach for professional creative work. The source material states that

'as a non-distilled base model, Z-Image preserves the complete training signal.'

This design serves a specific segment of the user base, as the model is

'designed to be the backbone for creators, researchers, and developers who require the highest level of creative freedom.'
