Kezmark Drops ErniePEUnleashed To Craft Cinematic Scene Prompts

ErniePEUnleashed is a fine-tuned prompt enhancement model that transforms short ideas into richly detailed, spatially structured descriptions for AI image generation. It pays attention to foreground-midground-background layering, lighting logic, camera angles, and material textures rather than just piling on keywords. The model also weaves in specific art styles from a curated list, giving creators tight control over the final look.
Developer Kezmark built this as a custom fine‑tune of the ERNIE‑Image prompt enhancer from Baidu, using a custom‑built dataset of 3,804 unique composition examples. The goal was to teach genuine compositional reasoning instead of simple prompt expansion. Although originally targeted at Qwen‑Image‑2512 workflows, it works with any image model that accepts a single, long‑form prompt.
Composition‑aware enhancement with art style control
- Turns short prompts into spatially structured scenes.
- Describes foreground, midground, and background layers.
- Specifies lighting, camera angles, and material textures.
- Applies any of 42 trained art styles when given the exact style name.
- Works as a standalone composition builder when no style is named.
- Fine‑tuned on 3,804 purpose‑built examples.
- Runs locally in ComfyUI with no system‑prompt fuss.
Prosumer GPU owners and serious hobbyists can use this tool to get cinematic‑quality compositions without relying on cloud services. Privacy‑minded professionals benefit from fully local inference, so their prompts never leave their machine. Small agencies and non‑developer power users gain a way to produce consistent, art‑style‑aware prompts with minimal editing.
How the model was trained and what to expect
Kezmark trained the model as a rank‑32 LoRA at full 16‑bit precision on the Ministral‑3‑3B‑Instruct‑2512 base, keeping it light enough for consumer hardware. The dataset combines manually written and AI‑assisted compositions, produced with 24 local and 8 online models under strict rules to maintain structural quality. For the best art‑style results, users should copy the exact full style name from the provided list; without a style name, the model produces detailed but stylistically neutral compositions.
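Because style names have to match exactly, it can help to check them before the short idea goes to the enhancer. The following is a minimal Python sketch of that check, not part of the model or its ComfyUI workflow. The style names are placeholders (the real list ships with the model and is not reproduced here), the helper name build_request is hypothetical, and joining the idea and the style with a comma is an assumption about the input format.

```python
from typing import Optional, Sequence

# Placeholder names only: substitute the exact full names from the model's style list.
KNOWN_STYLES = ["Example Style One", "Example Style Two"]


def build_request(idea: str,
                  style: Optional[str] = None,
                  known_styles: Sequence[str] = KNOWN_STYLES) -> str:
    """Return the short text to hand to the enhancer (hypothetical helper)."""
    idea = idea.strip()
    if not idea:
        raise ValueError("idea must not be empty")

    # No style: the article says the model then writes a neutral composition.
    if style is None:
        return idea

    # Require an exact, case-sensitive match with a listed style name.
    if style not in known_styles:
        by_lower = {name.lower(): name for name in known_styles}
        hint = by_lower.get(style.strip().lower())
        message = f"unknown style {style!r}"
        if hint is not None:
            message += f"; did you mean {hint!r}?"
        raise ValueError(message)

    # Assumed format: idea followed by the style name, separated by a comma.
    return f"{idea}, {style}"


if __name__ == "__main__":
    print(build_request("a lighthouse at dusk"))
    print(build_request("a lighthouse at dusk", "Example Style One"))
```

Rejecting near misses such as a lowercase variant, instead of silently correcting them, keeps the text that reaches the model identical to a name it saw during training.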
“I trained the ernie‑image prompt enhancer with a custom built dataset of 3804 unique and highly detailed examples of compositions, including art style integrations (about 3 million token's worth total) if anyone wants to try it.” — Source: Reddit
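The figures in the quote give a rough sense of how long each training example is. A quick Python calculation, treating "about 3 million" as exactly 3,000,000 tokens (an approximation, since the exact count is not stated):

```python
EXAMPLES = 3804            # dataset size given in the quote
TOTAL_TOKENS = 3_000_000   # "about 3 million"; the exact total is not stated

# Average length of one training example, in tokens.
avg_tokens = TOTAL_TOKENS / EXAMPLES
print(f"about {avg_tokens:.0f} tokens per example")
```

That works out to roughly 790 tokens per example, consistent with the long‑form, multi‑layer descriptions the model is meant to produce.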