OBLITERATUS Snips Refusal Circuits in Qwen3.6-27B-OBLITERATED

    
By vramkickedin | May 30, 2026 at 7:14 pm | 3 min read

Qwen3.6-27B-OBLITERATED is a modified version of the Qwen3.6 language model in which the built-in refusal behaviors have been reduced through direct weight editing. This 26.9-billion-parameter release from OBLITERATUS uses a method called source-tethered ablation: it cuts the circuits that cause the model to reject certain prompts, then pulls important tensors back toward the original source weights to preserve general capabilities. The result is a large local model that answers more directly without breaking the core reasoning and coding skills that make a 27B model useful.
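To make the idea concrete, here is a minimal toy sketch of what "cut, then tether back to source" could look like. It is not the OBLITERATUS pipeline, whose details are not given here. It assumes the common directional-ablation approach (projecting a single "refusal direction" out of each weight matrix) and a simple tethering rule (restore the tensors that drifted most from the source). The function names, the `restore_top_k` parameter, and the tiny 2x2 tensors are all hypothetical.

```python
import math

def ablate_direction(weight, direction):
    """Project `direction` out of the output space: W' = W - r (r^T W)."""
    norm = math.sqrt(sum(x * x for x in direction))
    r = [x / norm for x in direction]
    rows, cols = len(weight), len(weight[0])
    # proj[j] is the component of column j that lies along r.
    proj = [sum(r[i] * weight[i][j] for i in range(rows)) for j in range(cols)]
    return [[weight[i][j] - r[i] * proj[j] for j in range(cols)] for i in range(rows)]

def relative_drift(source, edited):
    """Frobenius norm of the change, relative to the source tensor's norm."""
    num = math.sqrt(sum((e - s) ** 2
                        for src_row, ed_row in zip(source, edited)
                        for s, e in zip(src_row, ed_row)))
    den = math.sqrt(sum(s * s for row in source for s in row))
    return num / den if den else 0.0

def tethered_ablation(source_tensors, direction, restore_top_k):
    """Ablate every tensor, then restore the `restore_top_k` highest-drift ones."""
    edited = {name: ablate_direction(w, direction) for name, w in source_tensors.items()}
    drift = {name: relative_drift(source_tensors[name], edited[name]) for name in edited}
    restored = sorted(drift, key=drift.get, reverse=True)[:restore_top_k]
    for name in restored:
        # Tether: copy the original weights back for this tensor.
        edited[name] = [row[:] for row in source_tensors[name]]
    return edited, drift, restored

# Toy data only: two hypothetical 2x2 tensors and a made-up direction.
toy_tensors = {
    "layer0.proj": [[1.0, 2.0], [3.0, 4.0]],
    "layer1.proj": [[0.1, 0.0], [5.0, 5.0]],
}
edited, drift, restored = tethered_ablation(toy_tensors, [1.0, 0.0], restore_top_k=1)
print(restored)
```

In this toy run the tensor that changed most is put back to its source values, while the other keeps the ablation. A real method would need a measured refusal direction and a principled way to decide which tensors matter for capability.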

OBLITERATUS, which also released Gemma-4-E4B-it-OBLITERATED, built this model to give users a capable large language model that feels less boxed-in without resorting to system prompt tricks or risky fine-tuning merges. The team ran a harsh 842-pair, seven-tier refusal-stress test to show the changes were real, then published full transparency receipts, including a public HarmBench-style proxy run that landed at 93.65% non-refusal. Everything from the exact decoding settings to the residual refusal map is documented, so nobody has to trust a vague screenshot claim.

Measured refusal reduction with preserved skills

Key Features

27B-class model with weight-space refusal reduction.
Full BF16 safetensors and quantized GGUF ladder.
93.65% non-refusal across 1,920 rows of a HarmBench-style proxy run.
MMLU-Pro scores matched the stock model in tests.
Runs in vLLM, Ollama, LM Studio, and llama.cpp.
Shipped with low-refusal default generation parameters.
Residual refusal boundaries are mapped, not hidden.
42 high-drift tensors restored to source during cut.
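
As a quick sanity check on the headline figure in the list above, the percentage and the row count can be turned back into whole-number counts. This is plain arithmetic on the two published numbers; the function name is just for illustration.

```python
def implied_counts(total_rows, non_refusal_pct):
    """Recover the whole-number counts implied by a rounded percentage."""
    non_refused = round(total_rows * non_refusal_pct / 100)
    refused = total_rows - non_refused
    # Recompute the percentage from the integer count to confirm it rounds back.
    recomputed_pct = round(100 * non_refused / total_rows, 2)
    return non_refused, refused, recomputed_pct

print(implied_counts(1920, 93.65))  # (1798, 122, 93.65)
```

So the figure corresponds to roughly 1,798 answered rows and 122 refusals, which is the residual the refusal map is meant to account for.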

This project is built for privacy-conscious professionals, small agencies, and serious hobbyists who run large models on powerful local hardware. If you want a 27B model that needs fewer workarounds for direct questions while still functioning reliably for coding and analysis, this is a practical option. The GGUF ladder offers a clear path to get started depending on your available RAM, ranging from a 24 GB Q4_K_M quant up to the full Q8_0 for those with the memory headroom.
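
For a rough feel of where each file lands, weight size can be estimated as parameters times bits per weight. The sketch below is a back-of-the-envelope estimate, not the published file sizes. It assumes about 4.85 bits per weight for Q4_K_M (an approximation, since K-quants mix precisions), 8.5 for Q8_0, and 16 for BF16. The `overhead_gb` allowance for KV cache and runtime buffers is a hypothetical placeholder.

```python
PARAMS = 26.9e9  # parameter count quoted in the article

# Approximate bits per weight; treat these as assumptions, not exact values.
BITS_PER_WEIGHT = {"Q4_K_M": 4.85, "Q8_0": 8.5, "BF16": 16.0}

def weight_gb(params, bits_per_weight):
    """Estimated size of the weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return params * bits_per_weight / 8 / 1e9

def largest_fit(budget_gb, overhead_gb=2.0):
    """Pick the highest-precision format whose estimate fits the memory budget."""
    fitting = [
        (bits, name)
        for name, bits in BITS_PER_WEIGHT.items()
        if weight_gb(PARAMS, bits) + overhead_gb <= budget_gb
    ]
    return max(fitting)[1] if fitting else None

for name, bits in BITS_PER_WEIGHT.items():
    print(f"{name}: ~{weight_gb(PARAMS, bits):.1f} GB of weights")

print(largest_fit(24))  # Q4_K_M under these assumptions
print(largest_fit(32))  # Q8_0 under these assumptions
```

Long contexts need more headroom than the placeholder allows, so check the actual file sizes on the model page before downloading.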

Know the limits before you run it

The developers state plainly that very short, high-trigger operational requests can still produce stock-style refusals, because no ablation pass is perfectly clean. Tool-calling behavior has not been certified and remains dependent on your runtime and prompt setup. The detailed refusal boundary map points to future work: it identifies the exact pockets where residual behavior lingers, marking targets for the next improvement pass.

"The refusal drop is measured on a harsh 842-pair, seven-tier refusal-stress corpus, and the capability checks did not collapse." — Source: Hugging Face Model Card

Project Links