Qwen3.5 122B A10B Uncensored HauhauCS Aggressive Defies Limits

    
By vramkickedin | March 30, 2026 at 6:04 pm | 2 min read

Qwen3.5-122B-A10B-Uncensored-HauhauCS-Aggressive is a large language model modified to remove refusal responses while keeping its original capabilities intact. The release reports zero refusals across 465 tested prompts, with no degradation in performance. In practice, users get the original model's full functionality without the safety restrictions that typically block certain outputs.

HauhauCS developed this version for users who need unrestricted model access for their work. According to the developer, achieving lossless uncensoring took several weeks of continuous development. The model uses a Mixture of Experts architecture in which only about 10 billion of its 122 billion parameters are active per forward pass, so each token costs far less compute than the full parameter count suggests.
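To put the Mixture of Experts figures in perspective, here is a minimal sketch of the arithmetic using only the two parameter counts quoted above. The function name is illustrative, and the result describes the share of weights used per forward pass, not memory use: all 122 billion parameters still have to be stored.

```python
# Rough arithmetic on the parameter counts quoted in the article.
TOTAL_PARAMS = 122e9   # total parameters (from the article)
ACTIVE_PARAMS = 10e9   # approximate active parameters per forward pass (from the article)


def active_fraction(active: float, total: float) -> float:
    """Share of the weights that take part in a single forward pass."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"Active share per forward pass: {fraction:.1%}")  # prints 8.2%
```

The saving is in compute per token. Storage and memory still scale with the full parameter count.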

Model size: 122B parameters | GPU VRAM: requirements vary

Aggressive features of this release

122B total parameters with roughly 10B active per forward pass using Mixture of Experts design.
262K context window for handling long conversations and documents.
Multimodal support for text, image, and video inputs.
Zero refusals with no capability loss compared to the original.
New K_P quantizations offering better quality at slightly larger file sizes.
Multiple download options ranging from 40GB to 145GB for different hardware needs.
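The download sizes listed above can be turned into a rough average bits-per-weight figure, which gives a feel for how heavily each file is quantized. The sketch below assumes 1 GB = 1e9 bytes and that the whole file consists of model weights. Both are simplifications, since real files also carry metadata and may bundle tensors stored at other precisions.

```python
# Back-of-the-envelope estimate: average bits stored per parameter.
TOTAL_PARAMS = 122e9  # total parameter count (from the article)


def bits_per_weight(file_size_gb: float, n_params: float = TOTAL_PARAMS) -> float:
    """Average bits per parameter, assuming 1 GB = 1e9 bytes and that
    the entire file is made up of model weights (a simplification)."""
    return file_size_gb * 1e9 * 8 / n_params


# The smallest and largest download sizes mentioned in the article.
for size_gb in (40, 145):
    print(f"{size_gb} GB -> about {bits_per_weight(size_gb):.1f} bits per weight")
```

These are averages only. They do not show which tensors a given file keeps at higher precision.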

Professionals running local AI systems who need unrestricted outputs may find this model valuable. The variety of quantization options lets users match file sizes to their available storage and memory, though running larger variants still requires substantial hardware.

What the developer says about the process

HauhauCS described the creation process as demanding extensive work over multiple weeks. According to the developer, the final product shows no looping issues or quality problems in testing. This release also introduces K_P ('Perfect') quantizations, which use model-specific analysis to preserve quality where it matters most. The developer says these custom files perform about one to two quantization levels better than standard quantizations while being only 5-15% larger.
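As a quick illustration of the 5-15% size overhead, the sketch below computes the expected size range of a K_P file from the size of a standard quantization. The 60 GB input is a hypothetical figure chosen for the example, not a size taken from the release.

```python
from typing import Tuple


def kp_size_range(standard_size_gb: float,
                  low: float = 0.05,
                  high: float = 0.15) -> Tuple[float, float]:
    """Expected K_P file size range, given the 5-15% overhead quoted in the article."""
    return standard_size_gb * (1 + low), standard_size_gb * (1 + high)


# 60 GB is a hypothetical standard quantization size, used only for illustration.
smallest, largest = kp_size_range(60.0)
print(f"K_P estimate: {smallest:.1f} GB to {largest:.1f} GB")
```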

'This one was absolutely brutal.....Several weeks of literal nonstop work,' HauhauCS said about the development effort.

Users running llama.cpp should pass the --jinja flag so that the model's chat template is handled correctly. Thinking mode is on by default, but users can disable it through specific command settings. HauhauCS also said that Gemma3 is next on the roadmap, in response to community requests.
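A minimal sketch of how such a launch command might be assembled is shown below. Only the --jinja flag comes from the article. The model filename is a placeholder, and the thinking switch is an assumption: the article does not say which setting the developer means. The sketch uses the --chat-template-kwargs option with an enable_thinking key, which recent llama.cpp server builds and Qwen chat templates are understood to accept, so check it against your own build before relying on it.

```python
import json
import shlex
from typing import List, Optional


def build_llama_server_cmd(model_path: str,
                           thinking: bool = True,
                           ctx_size: Optional[int] = None) -> List[str]:
    """Assemble an argument list for llama-server (the command is built, not run)."""
    # --jinja is the flag the article recommends for template handling.
    cmd = ["llama-server", "-m", model_path, "--jinja"]
    if ctx_size is not None:
        cmd += ["-c", str(ctx_size)]  # context size in tokens
    if not thinking:
        # ASSUMPTION: the article does not name the setting. This uses
        # --chat-template-kwargs with an enable_thinking key; verify it
        # against your llama.cpp build and the model's chat template.
        cmd += ["--chat-template-kwargs", json.dumps({"enable_thinking": False})]
    return cmd


# "model.gguf" is a placeholder path, not a filename from the release.
print(shlex.join(build_llama_server_cmd("model.gguf", thinking=False)))
```

Building the command as a list keeps the JSON argument intact when it is later passed to subprocess.run, without relying on shell quoting.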

You can find Qwen3.5-122B-A10B-Uncensored-HauhauCS-Aggressive on Hugging Face.