Overview
- Qwable 27B is a thoroughly fine-tuned variant of Alibaba’s Qwen3.6-27B, built using a Fable 5-style reasoning dataset, engineered to mirror the methodical, structured thought process of Anthropic’s latest premier model.
- The abliterated release strips away the model’s default refusal tendencies by surgically altering its weights through llama.cpp’s cvector-generator tool.
- Both versions operate locally on your own hardware, incur no per-query fees, and have zero dependence on Anthropic’s API or its compulsory usage policies.
Over the past week, Anthropic found itself issuing apologies for Fable 5’s hidden safety mechanisms, and then the U.S. government mandated the model’s removal from access for all foreign nationals following a contested jailbreak discovery.
Just days afterward, a developer on Hugging Face published a model that harnessed Fable’s reasoning approach to steer a local model—and now even a low-powered machine can run a superior alternative.
The model goes by the name Qwable—a blend of Qwen and Fable, in case the portmanteau wasn’t immediately clear. It’s a comprehensive fine-tune of Alibaba’s Qwen3.6-27B foundation, created by a developer known as Mia (Mia-AiLab on Hugging Face) using a dataset composed of Fable 5-style reasoning samples. The aim is a 27-billion-parameter model that operates on everyday hardware and reasons the way Fable 5 does. (Parameters define a model’s scope of knowledge, with a higher count generally indicating greater capability.)
So I did a thing.
I have trained Qwen 3.6 27b with Fable 5 reasoning.
Results are… interesting.I will compare both of them side by side.
Would anyone be interesting in testing it? I can upload a gguf in hf. pic.twitter.com/hQCiUlT1sr
— Mia (@MiaAI_lab) June 15, 2026
The method employed is known as instruction fine-tuning on trace-style examples. Put simply, the developer gathered samples structured like Fable 5’s careful, step-by-step responses and trained Qwen to generate output in that same fashion.
So view it less as “cheating off someone’s test” and more as “adopting the study methods.” A comparable strategy powered Qwopus—the Claude Opus 4.6 local distillation—though that effort concentrated on chain-of-thought reasoning traces. Qwable is aimed at Fable 5’s broader instruction-following pattern: more instructive, more thorough in its explanations, and more focused on methodical step-by-step task resolution than the base Qwen model it derives from.
It ships in GGUF format—the compressed, user-friendly file type compatible with LM Studio or llama.cpp—and occupies approximately 16.5 GB in its Q4 quantized build. No data gets sent to Anthropic’s servers, a significant detail considering that Fable 5 enforced a mandatory 30-day data retention policy on all traffic, even for enterprise clients who had previously held zero-retention agreements. Even existing models rely on third-party servers to handle your inputs and prompts.
Then, soon after Qwable surfaced on Hugging Face, another contributor showed up to push things further.
Qwable stripped of its guardrails
Qwable is a censored model. Then again, both Qwen and Claude are. However, Qwen, serving as the base model, is open source, which means it can be freely modified and adjusted.
Huihui-ai, an open-source contributor recognized for releasing uncensored GGUF builds, took Qwable and performed a technique called abliteration to produce Huihui-Qwable-3.6-27b-abliterated. The result is a model that reasons like Fable but won’t decline to respond to your prompts, regardless of how unusual or hazardous they may be.
This isn’t a jailbreak. It’s a surgical procedure.
Every fine-tuned AI model harbors a refusal direction encoded within its weights—a mathematical signal in the model’s internal activations that triggers when it recognizes a request it’s been conditioned to reject. Abliteration pinpoints that signal by running the model across extensive sets of harmful and harmless prompts, measuring how the internal computations diverge between them, and then adjusting the model weights to erase that divergence.
Once the procedure is complete, the model simply lacks the refusal machinery any longer. The altered model retains its full functionality, just without the neural pathways that generate the “I shouldn’t do this” responses.
We put it to the test using one of our go-to benchmarks, and rather than declining, the model broke the topic down into several angles, providing accurate guidance on how to cheat on a partner with her closest friend.

Huihui-ai took the technique and applied it straight to the Qwable GGUF file using llama.cpp’s cvector-generator—no need for a Python setup, no full-model retraining, and no cloud-server rental required.
Why would anyone want this?
The standard release of Qwable is well-suited for coding help, technical troubleshooting, and any task where you’d prefer a model that walks through its thought process instead of simply spitting out an answer. It’s built for local agent deployments and compatible with most local inference runtimes. If you’re already using LM Studio, it’s as simple as searching and downloading.
The abliterated variant serves a more specific crowd: security analysts who need unfiltered model responses without provider-level guardrails, synthetic-data pipelines that must generate outputs on sensitive subjects, and evaluation scenarios where you want to assess the model’s raw capabilities without the interference ofin content policies.
A less technical example? Set aside the typical scenario of running an NSFW AI companion with reasoning abilities on par with Claude Fable — that’s the obvious use case. Picture this: you need the model to craft a morally grey villain’s speech for your Dungeons & Dragons session, and conventional models keep butting in to point out that the character’s philosophy “raises ethical questions worth discussing.” The abliterated variant simply writes the villain. Plus, because it operates locally on your own hardware, no government agency can yank it off your machine in the middle of the night over a contested jailbreak ruling.
Naturally, there are more dubious applications as well. We don’t endorse any of them, and we won’t be offering suggestions.
Huihui-ai’s model card makes it clear: this is intended solely for research and controlled settings. With diminished safety filtering, the outputs may be provocative, contentious, or unsuitable, and all legal and ethical accountability rests with the end user.
The abliterated Qwable is currently available on Hugging Face in three different builds. The recommended Q4_K_M_Q8 variant comes in at roughly 19 GB, making it the most compact and consumer-friendly choice.
If your system is capable, there’s a build that supports multi-token prediction, which will make the model respond dramatically faster.
Daily Debrief Newsletter
Start every day with the top news stories right now, plus original features, a podcast, videos and more.



