ACE-Step 1.5 ComfyUI Generates Songs Locally
ACE-Step 1.5 ComfyUI brings commercial-grade music generation to local machines. This open-source audio model now runs natively in ComfyUI and can create full songs in under 10 seconds using standard consumer hardware.
The ACE-Step team developed this update to provide high-quality AI music generation. It requires the latest version of ComfyUI and can run on less than 4GB of VRAM, making it accessible to users with typical gaming GPUs.
Model size: not specified. GPU VRAM: less than 4GB required.
What ACE-Step 1.5 Offers
- Generates full 4-minute songs.
- Supports over 50 languages with strong prompt adherence.
- LoRA fine-tuning allows creators to train personalized styles.
- Chain-of-Thought reasoning guides the diffusion process.
- Hybrid architecture combines a Language Model planner with a Diffusion Transformer.
What's Coming Next
The developers have previewed two features that are not yet supported in ComfyUI. 'Cover' lets users input any song with a new prompt to reimagine it in a completely different style, while 'Repaint' allows fixing specific sections of a generated track without regenerating the entire composition.
Regarding these upcoming features, the team noted that they 'have no doubt the community will figure it out.'
Users should note that the model can produce inconsistent results. Generating multiple samples with batch sizes of 8 or 16 helps find the best output. Starting with 90–120 second durations produces more consistent results than longer tracks.
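The batch advice above can be sketched as a small planning helper. This is only an illustration of the idea of rendering several candidates at a short duration and picking the best one by ear. The names GenerationJob and plan_batch are hypothetical and are not part of ACE-Step or ComfyUI. In practice the seed and duration would be entered into whatever fields your workflow exposes.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerationJob:
    """One candidate render. Field names are illustrative, not ACE-Step's API."""
    seed: int
    duration_seconds: int


def plan_batch(batch_size: int = 8,
               duration_seconds: int = 90,
               base_seed: int = 0) -> list[GenerationJob]:
    """Return batch_size jobs with distinct seeds and one shared duration."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    # A seeded RNG makes the plan reproducible across runs.
    rng = random.Random(base_seed)
    # sample() draws without replacement, so every seed is unique.
    seeds = rng.sample(range(2**32), batch_size)
    return [GenerationJob(seed=s, duration_seconds=duration_seconds) for s in seeds]


# Plan 8 candidates at 90 seconds, the low end of the suggested range.
jobs = plan_batch(batch_size=8, duration_seconds=90)
for job in jobs:
    print(job)
```

Keeping the seeds in a list like this makes it easy to note which candidate sounded best and reuse that seed later, for example when trying a longer duration.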
Read more about ACE-Step 1.5 ComfyUI on the project page and download files from Hugging Face.