I have been using Qwen3.5-35B-A3B a lot in local testing, and it is by far the m...

Hugsun · 2026-04-17T11:43:47 1776426227

Unfortunately, llama.cpp quantization technology has been stagnant for two years. The main quantization developer left or was kicked out of llama.cpp due to an attribution dispute. He created his own fork, ik_llama.cpp, where he has developed multiple new and better quants.

unsloth and byteshape are just using and highlighting features that have been available the whole time. I am very invested in figuring out a solution to this dispute, or some way to get the new quants upstreamed.

kanemcgrath · 2026-04-16T23:42:18 1776382938

Now that I have tried it out on a few tasks, Qwen3.6 is a huge jump in capability. It can make improvements to a project that Qwen3.5 always struggled with.

burgertea · 2026-04-18T01:31:56 1776475916

Could you share more about your config? I've also got a 3060 12GB and 64GB of RAM, but I've never got local models running well enough to be useful.

edg5000 · 2026-04-17T07:44:50 1776411890

What can and what can't it do compared to Codex and CC?

mettamage · 2026-04-17T11:26:34 1776425194

How does it compare against Qwen3.5 27B?

kanemcgrath · 2026-04-17T20:12:08 1776456728

I haven't run 27B that much because it only runs at like 2 tokens/sec on my computer.

jadbox · 2026-04-16T23:44:00 1776383040

Which one is best?

kanemcgrath · 2026-04-17T00:24:49 1776385489

I would say byteshape is smaller and faster, and I can’t really notice a quality difference. But I haven’t used it as much, since I only started using it a few days ago.