Hacker News
GaggiX
63 days ago
on:
Qwen3.6-27B: Flagship-Level Coding in a 27B Dense ...
At 4-bit quantization it should already fit quite nicely.
Aurornis
63 days ago
Unfortunately not with a reasonable context length.
regularfry
63 days ago
I've got 139k context with the UD-Q4_K_XL quant on a 4090, with the KV cache quantized to q8_0 (ctk/ctv). Could probably squeeze out a little more, but that's enough for me for the moment.
corysama
63 days ago
Hey, buddy! Can I bum a command line arg list off ya?
GaggiX
63 days ago
The model uses Gated DeltaNet and Gated Attention, so the memory usage of the KV cache is very low, even at BF16 precision.
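A back-of-the-envelope sketch of why a hybrid layout like this would keep the KV cache small: only the full-attention layers store per-token keys and values, while linear-attention layers such as Gated DeltaNet keep a fixed-size state that does not grow with context. Every number below (layer count, KV heads, head dimension, and the one-in-four ratio of full-attention layers) is a hypothetical placeholder, not the model's published configuration.

```python
def kv_cache_bytes(n_attn_layers, n_kv_heads, head_dim, context_len, bytes_per_elem):
    """Size of a standard KV cache: K and V each hold
    n_kv_heads * head_dim values per token per attention layer."""
    return 2 * n_attn_layers * n_kv_heads * head_dim * context_len * bytes_per_elem


# Hypothetical configuration, chosen only to illustrate the arithmetic.
N_LAYERS = 64            # assumed total layer count
N_KV_HEADS = 8           # assumed number of KV heads (grouped-query attention)
HEAD_DIM = 128           # assumed per-head dimension
CONTEXT = 131072         # 128k tokens
BF16_BYTES = 2           # bytes per element at BF16

# Every layer uses full attention.
full = kv_cache_bytes(N_LAYERS, N_KV_HEADS, HEAD_DIM, CONTEXT, BF16_BYTES)

# Assumed hybrid: only one layer in four uses full attention; the rest
# carry a constant-size recurrent state, ignored here because it does
# not scale with context length.
hybrid = kv_cache_bytes(N_LAYERS // 4, N_KV_HEADS, HEAD_DIM, CONTEXT, BF16_BYTES)

GIB = 1024 ** 3
print(f"all-attention: {full / GIB:.1f} GiB")   # 32.0 GiB
print(f"hybrid (1/4):  {hybrid / GIB:.1f} GiB")  # 8.0 GiB
```

Under these assumed numbers the cache shrinks in direct proportion to the fraction of layers that use full attention; quantizing the cache below BF16 would reduce it further.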
kkzz99
63 days ago
It really depends on what you think a reasonable context length is, but I can get 50k-60k on a 4090.