Pixal3D Breathes 3D Life Into Your Photos With Pixel-Perfect Precision

Pixal3D is a new open-source model that turns a single image into a detailed, high-fidelity 3D asset. Unlike typical generation methods, it establishes a direct pixel-to-3D correspondence, so the resulting geometry and textures closely match the original photo. The model outputs a GLB mesh with physically based rendering (PBR) materials, ready for use in games, design, or visual effects.
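To make the GLB-with-PBR output concrete, here is a minimal sketch of how a generated file could be checked using only the Python standard library. It relies on the public glTF 2.0 binary layout (a 12-byte header followed by a JSON chunk) and is not part of Pixal3D itself; the function names read_glb_json and summarize_pbr_materials are hypothetical helpers introduced for illustration.

```python
import json
import struct

GLB_MAGIC = 0x46546C67        # ASCII "glTF" read as a little-endian uint32
JSON_CHUNK_TYPE = 0x4E4F534A  # ASCII "JSON" read as a little-endian uint32


def read_glb_json(data: bytes) -> dict:
    """Return the JSON scene description stored in the first chunk of a GLB blob."""
    # GLB header: magic, container version, total file length (3 x uint32).
    magic, version, _total_length = struct.unpack_from("<III", data, 0)
    if magic != GLB_MAGIC:
        raise ValueError("not a GLB file (bad magic)")
    if version != 2:
        raise ValueError(f"unsupported GLB version: {version}")
    # First chunk header: chunk length and chunk type (2 x uint32).
    chunk_length, chunk_type = struct.unpack_from("<II", data, 12)
    if chunk_type != JSON_CHUNK_TYPE:
        raise ValueError("first GLB chunk is not JSON")
    return json.loads(data[20:20 + chunk_length].decode("utf-8"))


def summarize_pbr_materials(gltf: dict) -> list:
    """List each material and which PBR texture slots it fills."""
    summary = []
    for material in gltf.get("materials", []):
        pbr = material.get("pbrMetallicRoughness", {})
        summary.append({
            "name": material.get("name", "<unnamed>"),
            "has_base_color_texture": "baseColorTexture" in pbr,
            "has_metallic_roughness_texture": "metallicRoughnessTexture" in pbr,
            "has_normal_texture": "normalTexture" in material,
        })
    return summary
```

With a file on disk (the name asset.glb is a placeholder), the check would be summarize_pbr_materials(read_glb_json(open("asset.glb", "rb").read())). Which texture slots a Pixal3D export actually fills is not stated in the announcement, so treat the result as something to inspect rather than assume.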

Tencent ARC Lab, Tsinghua University, and Victoria University of Wellington collaborated on the project. The team’s SIGGRAPH 2026 paper tackles a common problem in image-to-3D conversion: fine details get lost along the way. By back-projecting image features into 3D with pixel-level alignment, Pixal3D preserves those details for accurate asset creation.
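The general idea behind pixel-aligned back-projection can be sketched in a few lines of plain Python: project each 3D point into the image with a camera model, then read the image feature at exactly that pixel location. This is a textbook illustration assuming a simple pinhole camera and bilinear sampling, not the Pixal3D implementation; the function names and the parameters focal, cx, and cy are assumptions made for the example.

```python
from math import floor


def project_point(point, focal, cx, cy):
    """Pinhole projection of a camera-space point (x, y, z) to pixel coords (u, v)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    return focal * x / z + cx, focal * y / z + cy


def bilinear_sample(feature_map, u, v):
    """Interpolate an H x W grid of feature vectors at continuous pixel (u, v)."""
    height, width = len(feature_map), len(feature_map[0])
    # Clamp to the grid so points projecting outside the image reuse edge features.
    u = min(max(u, 0.0), width - 1.0)
    v = min(max(v, 0.0), height - 1.0)
    u0 = min(int(floor(u)), max(width - 2, 0))
    v0 = min(int(floor(v)), max(height - 2, 0))
    u1, v1 = min(u0 + 1, width - 1), min(v0 + 1, height - 1)
    du, dv = u - u0, v - v0
    channels = len(feature_map[0][0])
    # Weighted blend of the four neighbouring feature vectors, per channel.
    return [
        (1 - du) * (1 - dv) * feature_map[v0][u0][c]
        + du * (1 - dv) * feature_map[v0][u1][c]
        + (1 - du) * dv * feature_map[v1][u0][c]
        + du * dv * feature_map[v1][u1][c]
        for c in range(channels)
    ]


def back_project_features(points, feature_map, focal, cx, cy):
    """Attach to every 3D point the image feature found at its projected pixel."""
    return [
        bilinear_sample(feature_map, *project_point(p, focal, cx, cy))
        for p in points
    ]
```

Because every 3D location looks up the feature at the pixel it projects to, detail at that pixel stays tied to a specific place in space. That is the kind of direct pixel-to-3D link the article describes, although the actual model may use a different camera model and sampling scheme.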

High-fidelity 3D from a single image

Key Features
  • Generates high-quality GLB meshes from photos.
  • Pixel-aligned back-projection for accurate details.
  • Includes physically based rendering (PBR) textures.
  • Low-VRAM mode supports consumer-grade GPUs.
  • Full training code and data toolkit available.
  • Web demo on Hugging Face for quick tests.

Artists and designers can quickly turn a reference image into a 3D model. Hobbyists with consumer GPUs can use the low-VRAM mode. Small studios gain an on-premises tool for private asset creation.

Developer notes and training pipeline

Training is split into three stages of increasing resolution, up to 1024. Custom fine-tuning is possible with the provided configs, but it requires substantial GPU memory. Pixal3D is built on TRELLIS.2, and its authors describe it as the first model to demonstrate 3D-native pixel-aligned generation at scale.

“Pixal3D for the first time demonstrates 3D-native pixel-aligned generation at scale, and provides a new inspiring way towards high-fidelity 3D generation of object or scene from single or multi-view images.” — Source: arXiv paper