BigStationW Delivers ComfyUi-Untwisting-RoPE For Style Without Copying

BigStationW has released ComfyUi-Untwisting-RoPE, a custom node for ComfyUI that brings training-free style transfer to diffusion transformer (DiT) models. The tool tackles a common problem in which shared attention copies the content of a reference image instead of applying only its style. It does this by selectively modulating the frequency bands inside rotary positional embeddings (RoPE), a technique described in a recent research paper.
BigStationW, the developer also behind ComfyUI-NAG-Extended, packaged the paper’s method into a ready-to-use ComfyUI node. The original work by Aryan Mikaeili and colleagues analyzed why high-frequency components in RoPE force queries to lock onto spatially matching reference tokens, leading to unwanted content duplication. By selectively modulating those frequency bands, the node lets the model attend based on semantic meaning rather than rigid position matching, so stylistic cues transfer without copying subjects or scenes.
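The positional locking described above can be illustrated with a toy example. The sketch below is not the node's code and not the paper's exact formulation. It assumes 1D token positions, a head dimension of 8, and keys that all carry identical content, so any difference in attention logits comes from position alone. The helper names (`rope_frequencies`, `apply_rope`, `attention_logits`) and the `band_scale` parameter are hypothetical, and zeroing the highest-frequency bands is only one simple stand-in for "modulating" them.

```python
import math

def rope_frequencies(head_dim, base=10000.0):
    """Standard RoPE band frequencies: theta_i = base^(-2i / head_dim)."""
    return [base ** (-2.0 * i / head_dim) for i in range(head_dim // 2)]

def apply_rope(vec, pos, freqs, band_scale):
    """Rotate each (even, odd) pair of vec by pos * theta_i * band_scale[i].

    band_scale is a hypothetical per-band multiplier: 1.0 keeps a band
    unchanged, 0.0 removes its positional rotation entirely.
    """
    out = []
    for i, (theta, scale) in enumerate(zip(freqs, band_scale)):
        angle = pos * theta * scale
        x, y = vec[2 * i], vec[2 * i + 1]
        out.append(x * math.cos(angle) - y * math.sin(angle))
        out.append(x * math.sin(angle) + y * math.cos(angle))
    return out

def attention_logits(query_pos, key_positions, band_scale, head_dim=8):
    """Dot-product logits between one query and keys with identical content."""
    freqs = rope_frequencies(head_dim)
    content = [1.0, 0.0] * (head_dim // 2)  # same content for query and keys
    q = apply_rope(content, query_pos, freqs, band_scale)
    logits = []
    for pos in key_positions:
        k = apply_rope(content, pos, freqs, band_scale)
        logits.append(sum(a * b for a, b in zip(q, k)))
    return logits

def spread(values):
    """Gap between the strongest and weakest logit."""
    return max(values) - min(values)

positions = list(range(11))
# All four bands active: ordinary RoPE.
full = attention_logits(5, positions, band_scale=[1.0, 1.0, 1.0, 1.0])
# Two highest-frequency bands switched off (illustrative assumption).
damped = attention_logits(5, positions, band_scale=[0.0, 0.0, 1.0, 1.0])

print("full RoPE spread:  ", round(spread(full), 3))
print("damped RoPE spread:", round(spread(damped), 3))
```

With every band active, the logit peaks at the key whose position matches the query, which mirrors the copying behavior the paper attributes to high-frequency components. With the two highest-frequency bands switched off, the logits are nearly flat across positions, so in a real model the content of the tokens, rather than their location, would decide where attention goes.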
A smarter approach to style transfer
- Training-free, no fine-tuning required.
- Works with Flux.2, Qwen-Image/Edit, Z-Image, and Anima.
- Separates style from content during generation.
- Adjustable frequency modulation for fine control.
- Simple installation as a ComfyUI custom node.
- Based on novel RoPE frequency analysis.
- Prevents unintended reference image copying.
Artists and studios running models on their own hardware can generate styled images without sending any data to cloud services. The node gives privacy-conscious professionals full control over the style transfer process inside ComfyUI, with no extra training costs or complex setup. Hobbyists with consumer GPUs benefit as well, because it works with popular local DiT models they already have installed.
Behind the code and research
The entire node is a direct, lightweight translation of the paper’s mathematical insight and does not require downloading additional models. Its effectiveness depends on the underlying base model, and results may vary across different DiT architectures. Future improvements could expand the list of supported models as the research community continues to explore frequency-based attention control.
“we introduce a method for selectively modulating RoPE frequency bands so that attention reflects semantic similarity rather than strict positional alignment.” — Source: Paper