Saving Space with Z-Image Base NVFP4 by marcorez8

Marcorez8 has released Z-Image Base NVFP4, a series of quantized models based on Z-Image that shrink the original 12.3 GB BF16 model to as little as 3.5 GB. The release targets ComfyUI users and is designed for Blackwell GPUs such as the RTX 5080 or 5090.

The project uses NVFP4, NVIDIA's 4-bit floating-point format, to compress the 6B-parameter diffusion transformer, which is based on the NextDiT architecture. Users can choose among the quantized variants to trade quality against storage and hardware requirements, though the project notes that the Mixed and Full variants currently yield poor-quality results.
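To make the idea of block-scaled 4-bit quantization concrete, here is a minimal sketch in plain Python. It assumes the publicly documented NVFP4 layout: each weight is stored as a 4-bit E2M1 float, and every group of 16 weights shares one scale. It is not the project's code or NVIDIA's kernel. In particular, the real format stores the block scale in FP8 alongside a per-tensor scale, whereas this sketch keeps the scale as an ordinary Python float, and the function names are illustrative.

```python
# Simplified illustration of block-scaled 4-bit float quantization.
# Assumption: values snap to the E2M1 grid and share one scale per 16-value block.

E2M1_MAGNITUDES = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # magnitudes E2M1 can represent
BLOCK_SIZE = 16  # number of values that share one scale


def quantize_block(block):
    """Return (scale, codes); each code is a signed value from the E2M1 grid."""
    amax = max(abs(x) for x in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    scale = amax / E2M1_MAGNITUDES[-1]  # map the largest magnitude onto 6.0
    codes = []
    for x in block:
        target = abs(x) / scale
        nearest = min(E2M1_MAGNITUDES, key=lambda m: abs(m - target))
        codes.append(-nearest if x < 0 else nearest)
    return scale, codes


def dequantize_block(scale, codes):
    """Recover approximate weights from a scale and its grid codes."""
    return [c * scale for c in codes]


def quantize(values):
    """Split a flat list of weights into blocks and quantize each block."""
    blocks = [values[i:i + BLOCK_SIZE] for i in range(0, len(values), BLOCK_SIZE)]
    return [quantize_block(b) for b in blocks]
```

Because only eight magnitudes are available per block, weights that fall between grid points are rounded, which is the source of the quality loss the project describes for its most aggressive variants.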

Core Features & Technical Capabilities

  • Four distinct model variants: Ultra (~8.0 GB), Quality (~6.5 GB), Mixed (~4.5 GB), and Full (~3.5 GB).
  • NVFP4 (NVIDIA 4-bit floating-point) quantization, a format that requires Blackwell-class hardware.
  • Selective layer quantization that keeps attention layers in higher precision in the recommended variants.
  • Based on the NextDiT architecture, with 30 main transformer layers and a hidden dimension of 3840.
  • ComfyUI integration with specific dependency requirements (comfy-kitchen >= 0.2.7).
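The sizes in the list above can be turned into compression ratios with a few lines of arithmetic. The sketch below uses only the figures quoted in this article. The second function is a rough estimate that assumes 1 GB = 10^9 bytes, 2 bytes per BF16 parameter, and one 8-bit scale shared by every 16 four-bit values; those storage details are assumptions about the format, not numbers from the project.

```python
BF16_SIZE_GB = 12.3  # original checkpoint size quoted in the article

VARIANT_SIZES_GB = {"Ultra": 8.0, "Quality": 6.5, "Mixed": 4.5, "Full": 3.5}


def savings(variant_gb, baseline_gb=BF16_SIZE_GB):
    """Return (compression ratio, percent of disk space saved)."""
    ratio = baseline_gb / variant_gb
    saved_pct = (1 - variant_gb / baseline_gb) * 100
    return ratio, saved_pct


def estimated_full_fp4_gb(bf16_gb=BF16_SIZE_GB, block_size=16, scale_bits=8):
    """Rough size if every BF16 weight became a 4-bit value plus a shared block scale."""
    params = bf16_gb * 1e9 / 2  # BF16 stores 2 bytes per parameter
    bits_per_param = 4 + scale_bits / block_size
    return params * bits_per_param / 8 / 1e9


for name, size in VARIANT_SIZES_GB.items():
    ratio, saved = savings(size)
    print(f"{name:8s} {size:4.1f} GB  {ratio:.2f}x smaller  {saved:.0f}% saved")

print(f"Estimated all-4-bit size: {estimated_full_fp4_gb():.2f} GB")
```

Under those assumptions the all-4-bit estimate comes out at roughly 3.46 GB, which is consistent with the ~3.5 GB quoted for the Full variant.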

Developer Insights

Technical details provided by the developer highlight the sensitivity of specific neural network components during the compression process.

'The key insight is that attention layers (qkv, out) are much more sensitive to quantization than feed_forward layers (w1, w2, w3),'

states the project description, explaining why the recommended variants prioritize keeping attention layers in higher precision.
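A selective quantization plan of this kind can be sketched as a simple rule over weight names. The layer names (qkv, out, w1, w2, w3) come from the project description. The dotted key layout, the `quantize_attention` flag, and the precision labels are hypothetical, chosen only for illustration, and do not reflect the project's actual conversion script.

```python
ATTENTION_PARTS = ("qkv", "out")          # attention projections named by the project
FEED_FORWARD_PARTS = ("w1", "w2", "w3")   # feed_forward projections named by the project


def choose_precision(weight_name, quantize_attention=False):
    """Pick a storage precision for one weight, given a hypothetical dotted key."""
    parts = weight_name.split(".")
    if "feed_forward" in parts and any(p in FEED_FORWARD_PARTS for p in parts):
        return "nvfp4"  # less sensitive layers are quantized first
    if "attention" in parts and any(p in ATTENTION_PARTS for p in parts):
        # Sensitive layers stay in BF16 unless the caller opts in.
        return "nvfp4" if quantize_attention else "bf16"
    return "bf16"  # anything unrecognized is left untouched


# Hypothetical key names for illustration only.
for key in ("layers.0.attention.qkv.weight", "layers.0.feed_forward.w1.weight"):
    print(key, "->", choose_precision(key))
```

Defaulting unknown weights to BF16 is a conservative design choice: a layer is quantized only when the rule explicitly recognizes it.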

This approach allows the 'Ultra' and 'Quality' models to preserve output quality while significantly reducing the model's footprint on disk. The project also warns that this is Z-Image Base, not Turbo, advising users to use

'28-50 steps with CFG guidance, not 8 steps like Turbo.'

The directive is meant to keep users from applying Turbo settings to the Base model.
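The recommended sampler range is easy to encode as a sanity check. The sketch below is a hypothetical helper, not part of the project or of ComfyUI. The 28-50 step range comes from the project's advice, while treating a CFG scale of 1.0 or lower as "guidance off" is an assumption based on common sampler conventions.

```python
RECOMMENDED_MIN_STEPS = 28  # lower bound quoted by the project
RECOMMENDED_MAX_STEPS = 50  # upper bound quoted by the project


def check_base_settings(steps, cfg_scale):
    """Return a list of warnings for settings that look like Turbo rather than Base."""
    warnings = []
    if not RECOMMENDED_MIN_STEPS <= steps <= RECOMMENDED_MAX_STEPS:
        warnings.append(
            f"steps={steps} is outside the recommended "
            f"{RECOMMENDED_MIN_STEPS}-{RECOMMENDED_MAX_STEPS} range for Z-Image Base"
        )
    if cfg_scale <= 1.0:  # assumption: a scale of 1.0 or less means no effective guidance
        warnings.append(f"cfg_scale={cfg_scale} looks like CFG guidance is disabled")
    return warnings


print(check_base_settings(steps=8, cfg_scale=1.0))   # Turbo-style settings
print(check_base_settings(steps=30, cfg_scale=4.0))  # within the recommended range
```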

Market Impact & Hardware Ecosystem

This release targets the hardware ecosystem around NVIDIA's Blackwell architecture, relying on its NVFP4 support to cut storage requirements. The developer explicitly notes that

'NVFP4 is a Blackwell-exclusive feature,'

limiting the immediate user base to those with specific high-end hardware configurations.
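A quick way to reason about that hardware restriction is to look at a GPU's CUDA compute capability. The helper below is a heuristic sketch, not an official check. It assumes Blackwell GPUs report a major version of 10 or higher (12.0 on the RTX 50 series), so later architectures would also pass. On a PyTorch system the tuple could come from `torch.cuda.get_device_capability()`, but the function takes it as an argument so it runs without a GPU.

```python
def looks_like_blackwell_or_newer(compute_capability):
    """Heuristic: True if the (major, minor) compute capability is 10.x or higher.

    Assumption: Blackwell-generation GPUs report major version 10 or above.
    """
    major, _minor = compute_capability
    return major >= 10


# Example capability tuples; the GPU labels are for illustration.
for label, capability in (("RTX 5090", (12, 0)), ("RTX 4090", (8, 9))):
    verdict = "may support NVFP4" if looks_like_blackwell_or_newer(capability) else "no NVFP4"
    print(f"{label}: {verdict}")
```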

Learn more about Z-Image Base NVFP4.