Dear Team mradermacher,
could you please quant this model?
https://huggingface.co/nightmedia/Qwen3.6-27B-Architect-DS9-1M-bf16
(Model card image caption: 'Beam me up': Zeiss ZF-100-T/Nikon D300)
This model is a NuSLERP merge using Qwen3.6-27B as a base:
- nightmedia/Qwen3.5-27B-Engineer-Deckard-Claude-TNG-C
- nightmedia/Qwen3.5-27B-Engineer-Deckard-Claude
- DavidAU/Qwen3.5-27B-Deckard-PKD-Heretic-Uncensored-Thinking
- DavidAU/Qwen3.5-27B-Claude-4.6-OS-INSTRUCT
- DavidAU/Qwen3.5-27B-Star-Trek-TNG-DS9-Heretic-Uncensored-Thinking
Brainwaves

| Quant | arc | arc/e | boolq | hswag | obkqa | piqa | wino |
|---|---|---|---|---|---|---|---|
| bf16 | 0.678 | 0.852 | 0.911 | | | | |
| mxfp8 | 0.690 | 0.867 | 0.909 | | | | |
| qx64-hi | 0.685 | 0.855 | 0.903 | | | | |
| mxfp4 | 0.679 | 0.858 | 0.911 | | | | |

| Quant | Perplexity | Peak Memory | Tokens/sec |
|---|---|---|---|
| bf16 | 4.017 ± 0.026 | 60.75 GB | 262 |
| mxfp8 | 4.026 ± 0.026 | 34.74 GB | 178 |
| qx86-hi | 3.917 ± 0.025 | 32.36 GB | 180 |
| qx64-hi | 4.036 ± 0.026 | 25.64 GB | 218 |
| mxfp4 | 4.102 ± 0.027 | 21.30 GB | 221 |

Component metrics
Qwen3.6-27B-Claude-4.6-OS

| Quant | arc | arc/e | boolq | hswag | obkqa | piqa | wino |
|---|---|---|---|---|---|---|---|
| bf16 | 0.683 | 0.858 | 0.910 | 0.797 | 0.494 | 0.820 | 0.755 |
| mxfp8 | 0.695 | 0.869 | 0.910 | 0.791 | 0.504 | 0.824 | 0.760 |
| qx64-hi | 0.688 | 0.859 | 0.903 | | | | |

| Quant | Perplexity | Peak Memory | Tokens/sec |
|---|---|---|---|
| mxfp8 | 4.006 ± 0.026 | 34.74 GB | 187 |
| qx64-hi | 4.098 ± 0.027 | 25.64 GB | 208 |

Qwen3.6-27B-Deckard-Claude-DS9

| Quant | arc | arc/e | boolq | hswag | obkqa | piqa | wino |
|---|---|---|---|---|---|---|---|
| mxfp8 | 0.672 | 0.845 | 0.909 | | | | |
| qx64-hi | 0.685 | 0.851 | 0.903 | | | | |

Baseline model: Qwen3.6-27B-Instruct

| Quant | arc | arc/e | boolq | hswag | obkqa | piqa | wino |
|---|---|---|---|---|---|---|---|
| qx86-hi | 0.637 | 0.798 | 0.911 | 0.775 | 0.442 | 0.807 | 0.737 |

This model uses the fixed Jinja chat template from froggeric/Qwen-Fixed-Chat-Templates.
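If you want to render prompts with that template outside of a GGUF runtime, one option is to override the tokenizer's chat template in transformers. This is only a minimal sketch: the local file name below is a placeholder for whichever template file in froggeric/Qwen-Fixed-Chat-Templates matches this model, and you need to download it first.

```python
from pathlib import Path
from transformers import AutoTokenizer

# Placeholder file name: save the matching template from
# froggeric/Qwen-Fixed-Chat-Templates locally before running this.
fixed_template = Path("qwen_fixed_chat_template.jinja").read_text()

tok = AutoTokenizer.from_pretrained("nightmedia/Qwen3.6-27B-Architect-DS9-1M-bf16")
tok.chat_template = fixed_template  # override whatever template ships with the repo

# Render a prompt string (no tokenization) with the generation prompt appended.
prompt = tok.apply_chat_template(
    [
        {"role": "system", "content": "You are a coding assistant."},
        {"role": "user", "content": "What's 2+2?"},
    ],
    tokenize=False,
    add_generation_prompt=True,
)
print(prompt)
```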
Thinking toggle
Drop <|think_on|> or <|think_off|> anywhere in your system or user prompt. The template intercepts the tag, removes it from context so the model never sees it, and flips the mode.
Fast answer, no reasoning:
System: You are a coding assistant. <|think_off|>
User: What's 2+2?
Deep reasoning:
System: You are a coding assistant. <|think_on|>
User: Implement a red-black tree in Rust.
The tag syntax (<|think_on|>, <|think_off|>) uses Qwen's control-token delimiters, so it will never collide with real text. Earlier community templates used /think, which broke legitimate paths like cd /mnt/project/think.
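As a usage sketch, assuming you serve one of the quants behind an OpenAI-compatible endpoint (llama-server, LM Studio, and similar), the toggle can simply ride along in the system message. The base URL and model name here are placeholders, not values from this thread.

```python
from openai import OpenAI

# Placeholder endpoint and model name; point these at wherever you serve the quant.
client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

def ask(prompt: str, *, thinking: bool) -> str:
    # The toggle tag rides along in the system prompt; per the description above,
    # the chat template strips it before the model sees any text.
    toggle = "<|think_on|>" if thinking else "<|think_off|>"
    response = client.chat.completions.create(
        model="Qwen3.6-27B-Architect-DS9-1M",
        messages=[
            {"role": "system", "content": f"You are a coding assistant. {toggle}"},
            {"role": "user", "content": prompt},
        ],
    )
    return response.choices[0].message.content

print(ask("What's 2+2?", thinking=False))                          # fast answer
print(ask("Implement a red-black tree in Rust.", thinking=True))   # deep reasoning
```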
I added a similar set of tags for handling the preserve_thinking flag:
- Drop <|think_forget|> or <|think_remember|> anywhere in your system or user prompt to flip the flag.
- The template handles them the same way: it intercepts the tag and removes it from the context so the model never sees it (see the sketch after this list).
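The actual interception lives in the Jinja template itself, which isn't reproduced in this thread. Purely as an illustration of the idea, here is a small Python sketch that strips the four tags from the messages and records the modes they set; the default flag values are my assumption.

```python
import re

# Tag names copied from the post above; the real logic lives in the Jinja
# chat template, this is only a Python illustration of the same idea.
TOGGLE_TAGS = {
    "<|think_on|>": ("thinking", True),
    "<|think_off|>": ("thinking", False),
    "<|think_remember|>": ("preserve_thinking", True),
    "<|think_forget|>": ("preserve_thinking", False),
}

def intercept(messages, flags=None):
    """Strip toggle tags from the messages and record the modes they set."""
    # Default flag values are assumptions, not documented behavior.
    flags = dict(flags or {"thinking": True, "preserve_thinking": False})
    cleaned = []
    for msg in messages:
        content = msg["content"]
        for tag, (flag, value) in TOGGLE_TAGS.items():
            if tag in content:
                flags[flag] = value
                content = content.replace(tag, "")
        # Collapse the whitespace left behind so the model never sees the tag.
        cleaned.append({**msg, "content": re.sub(r"\s{2,}", " ", content).strip()})
    return cleaned, flags

msgs, flags = intercept([
    {"role": "system", "content": "You are a coding assistant. <|think_off|>"},
    {"role": "user", "content": "What's 2+2?"},
])
print(flags)    # {'thinking': False, 'preserve_thinking': False}
print(msgs[0])  # tag removed from the system message
```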
-G
It's queued!
You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Qwen3.6-27B-Architect-DS9-1M-bf16-GGUF for quants to appear.
