https://huggingface.co/nightmedia/Qwen3.6-27B-Architect-DS9-1M-bf16

#2262
by nightmedia - opened

Dear Team Radermacher,
could you please quant this model?

https://huggingface.co/nightmedia/Qwen3.6-27B-Architect-DS9-1M-bf16

[Image: "Beam me up" (Zeiss ZF-100-T / Nikon D300)]

This model is a NuSLERP merge using Qwen3.6-27B as a base:

  • nightmedia/Qwen3.5-27B-Engineer-Deckard-Claude-TNG-C
    • nightmedia/Qwen3.5-27B-Engineer-Deckard-Claude
      • DavidAU/Qwen3.5-27B-Deckard-PKD-Heretic-Uncensored-Thinking
      • DavidAU/Qwen3.5-27B-Claude-4.6-OS-INSTRUCT
    • DavidAU/Qwen3.5-27B-Star-Trek-TNG-DS9-Heretic-Uncensored-Thinking
  • DavidAU/Qwen3.5-27B-Claude-4.6-OS-INSTRUCT

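For readers who want to see what the SLERP family of merges actually does to the weights, here is a minimal, generic sketch of spherical linear interpolation over a single tensor. It is an illustration of the underlying operation only, not mergekit's NuSLERP implementation or the exact recipe used for this merge; the tensor shapes and the 0.5 interpolation factor are placeholders.

```python
import numpy as np

def slerp(t: float, a: np.ndarray, b: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two weight tensors.

    t=0 returns a, t=1 returns b; intermediate values move along the arc
    between the two directions instead of a straight line.
    """
    a_flat, b_flat = a.ravel(), b.ravel()
    a_dir = a_flat / (np.linalg.norm(a_flat) + eps)
    b_dir = b_flat / (np.linalg.norm(b_flat) + eps)
    dot = float(np.clip(np.dot(a_dir, b_dir), -1.0, 1.0))
    theta = np.arccos(dot)                     # angle between the two tensors
    if theta < eps:                            # nearly parallel: plain lerp is fine
        return (1 - t) * a + t * b
    w_a = np.sin((1 - t) * theta) / np.sin(theta)
    w_b = np.sin(t * theta) / np.sin(theta)
    return (w_a * a_flat + w_b * b_flat).reshape(a.shape)

# Placeholder example: blend one layer's weights 50/50.
layer_a = np.random.randn(1024, 1024).astype(np.float32)  # weight from model A
layer_b = np.random.randn(1024, 1024).astype(np.float32)  # weight from model B
merged = slerp(0.5, layer_a, layer_b)
```

mergekit applies this per tensor across the whole checkpoint and adds options such as per-component weighting; the sketch only shows the core interpolation.
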
Brainwaves

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
bf16     0.678  0.852  0.911
mxfp8    0.690  0.867  0.909
qx64-hi  0.685  0.855  0.903
mxfp4    0.679  0.858  0.911

Quant    Perplexity      Peak Memory   Tokens/sec
bf16     4.017 ± 0.026   60.75 GB      262
mxfp8    4.026 ± 0.026   34.74 GB      178
qx86-hi  3.917 ± 0.025   32.36 GB      180
qx64-hi  4.036 ± 0.026   25.64 GB      218
mxfp4    4.102 ± 0.027   21.30 GB      221
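
For reference, perplexity figures in this style are conventionally the exponential of the mean negative log-likelihood per token over an evaluation text, with the ± value a standard-error estimate; the exact corpus and tooling behind the table above are not stated here. A minimal sketch of that computation, assuming you already have per-token log-probabilities from the model:

```python
import math

def perplexity(token_logprobs: list[float]) -> tuple[float, float]:
    """Perplexity = exp(mean negative log-likelihood), with a rough
    standard error propagated from the per-token NLL variance."""
    n = len(token_logprobs)
    nlls = [-lp for lp in token_logprobs]
    mean_nll = sum(nlls) / n
    var_nll = sum((x - mean_nll) ** 2 for x in nlls) / max(n - 1, 1)
    ppl = math.exp(mean_nll)
    stderr = ppl * math.sqrt(var_nll / n)  # delta-method approximation
    return ppl, stderr

# Hypothetical per-token log-probabilities for a short evaluation text.
print(perplexity([-1.2, -0.7, -2.3, -0.9, -1.6]))
```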

Component metrics

Qwen3.6-27B-Claude-4.6-OS

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
bf16     0.683  0.858  0.910  0.797  0.494  0.820  0.755
mxfp8    0.695  0.869  0.910  0.791  0.504  0.824  0.760
qx64-hi  0.688  0.859  0.903

Quant    Perplexity      Peak Memory   Tokens/sec
mxfp8    4.006 ± 0.026   34.74 GB      187  
qx64-hi  4.098 ± 0.027   25.64 GB      208

Qwen3.6-27B-Deckard-Claude-DS9

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
mxfp8    0.672  0.845  0.909
qx64-hi  0.685  0.851  0.903

Baseline model

Qwen3.6-27B-Instruct

         arc    arc/e  boolq  hswag  obkqa  piqa   wino
qx86-hi  0.637  0.798  0.911  0.775  0.442  0.807  0.737

This model uses the fixed Jinja chat template from froggeric/Qwen-Fixed-Chat-Templates.

Thinking toggle

Drop <|think_on|> or <|think_off|> anywhere in your system or user prompt. The template intercepts the tag, removes it from context so the model never sees it, and flips the mode.

Fast answer, no reasoning:

System: You are a coding assistant. <|think_off|>
User: What's 2+2?

Deep reasoning:

System: You are a coding assistant. <|think_on|>
User: Implement a red-black tree in Rust.

The tag syntax (<|think_on|>, <|think_off|>) uses Qwen's control-token delimiters, so it will never collide with real text. Earlier community templates used /think, which broke legitimate paths like cd /mnt/project/think.
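
A minimal usage sketch with Hugging Face transformers, assuming the fixed template is bundled with this repo so that apply_chat_template picks it up (the repo id is taken from the request above):

```python
from transformers import AutoTokenizer

# Assumes the fixed chat template ships in the model repo's tokenizer config.
repo = "nightmedia/Qwen3.6-27B-Architect-DS9-1M-bf16"
tok = AutoTokenizer.from_pretrained(repo)

messages = [
    {"role": "system", "content": "You are a coding assistant. <|think_off|>"},
    {"role": "user", "content": "What's 2+2?"},
]

prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# If the template behaves as described, the toggle tag is stripped from the
# rendered prompt and the model is steered into the no-thinking mode.
assert "<|think_off|>" not in prompt
print(prompt)
```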


I added a similar set of tags for handling the preserve_thinking flag:

  • Drop <|think_forget|> or <|think_remember|> anywhere in your system or user prompt to flip the flag.
  • The template intercepts the tag, removes it from context so the model never sees it, and flips the flag (see the example below).
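
For example, placement works the same way as the thinking toggle:

System: You are a coding assistant. <|think_forget|>
User: Implement a red-black tree in Rust.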

-G

It's queued!

You can check for progress at http://hf.tst.eu/status.html or regularly check the model
summary page at https://hf.tst.eu/model#Qwen3.6-27B-Architect-DS9-1M-bf16-GGUF for quants to appear.
