
Seems to be marginally better than gpt-20b, but this is 30b?

I find gpt-oss 20b very benchmaxxed; as soon as a solution isn't clear, it will hallucinate.

Every time I've tried to actually use gpt-oss 20b, it's just gotten stuck in weird feedback loops, reminiscent of the time HAL got shut down back in the year 2001. And these are very simple tests, e.g. trying to get it to check today's date from the time tool so it can pull more recent search results from the arxiv tool.
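
For concreteness, the flow I'm after is roughly the sketch below; time_tool and arxiv_tool are hypothetical stand-ins for the two tools I mentioned, not any particular framework's API:

    from datetime import date, timedelta

    # hypothetical stand-in: returns today's date as an ISO string
    def time_tool():
        return date.today().isoformat()

    # hypothetical stand-in: a real version would hit the arXiv API
    # with a date filter on the submission date
    def arxiv_tool(query, since):
        return f"results for {query!r} submitted after {since}"

    # intended behavior: fetch the date first, then run a date-bounded search
    today = date.fromisoformat(time_tool())
    since = (today - timedelta(days=30)).isoformat()
    print(arxiv_tool("sparse attention", since))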

It actually seems worse. gpt-20b is only 11 GB because it ships prequantized in mxfp4, while GLM-4.7-Flash is 62 GB. By download size, GLM is closer to, and actually slightly larger than, gpt-120b, which is 59 GB.
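
As a back-of-the-envelope check (my arithmetic, assuming ~4.25 effective bits per weight for mxfp4, i.e. 4-bit values plus a shared 8-bit scale per 32-weight block, 16 bits for bf16, and the nominal parameter counts from this thread):

    def approx_size_gb(params_billions, bits_per_weight):
        # on-disk size in GB (1e9 bytes): params * bits / 8 bits-per-byte
        return params_billions * bits_per_weight / 8

    print(approx_size_gb(21, 4.25))  # gpt-oss-20b in mxfp4: ~11 GB
    print(approx_size_gb(30, 16))    # a 30B model in bf16:  ~60 GB

So the 11 GB vs 62 GB gap is mostly quantization, not parameter count.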

Also, according to the gpt-oss model card, 20b scores 60.7 on SWE-Bench Verified (GLM claims they measured 34 for that model) and 120b scores 62.7, versus the 59.7 GLM reports for GLM-4.7-Flash.

