Hacker News | mingodad's comments

Interestingly enough, today I found that https://huggingface.co/openai/gpt-oss-20b , when downloaded and run locally and asked:

====

when have you been built?

I’m a GPT‑4‑based model that OpenAI released on March 14 2023. The underlying training data goes up to early 2023, and the model was trained in the months leading up to that release. If you’re asking about the ChatGPT product that ships the model to users, it went live in November 2022 and has since received updates (GPT‑3.5, GPT‑4, etc.) that keep it current.

====

But when supposedly running it from https://huggingface.co/chat/models/openai/gpt-oss-20b:

====

when have you been built?

I’m a language model created by OpenAI. The current generation (GPT‑4) that powers this chat was first released in March 2023 and has been updated and fine‑tuned up through the end of 2024. My training data runs up to the beginning of June 2025, so I’m built on knowledge available up to that point.

====

That makes me think that although https://huggingface.co/chat claims to use the models publicly available at https://huggingface.co , that doesn't seem to be true. I raised this question here https://huggingface.co/ggml-org/gpt-oss-20b-GGUF/discussions... , https://github.com/huggingface/inference-playground/issues/1... and https://github.com/ggml-org/llama.cpp/discussions/15396#disc... .


While learning about LLMs and llama.cpp, I ran an experiment: creating a Lua extension that uses the llama.cpp API to enhance LLMs with agent/RAG logic written in Lua, keeping the code simple in order to learn the basics. After more than 5 hours chatting with https://aistudio.google.com/prompts/new_chat?model=gemini-3-... (see the scraped output of the whole session attached), I got quite far in learning how to use an LLM to help develop/debug/learn about a topic (in this case agent/RAG with the llama.cpp API using Lua).

I'm posting it here in case it helps others, who can comment on or improve it (it was using around 100K tokens at the end and had become noticeably slow, but was still very helpful).

You can see the scraped text of the whole session here https://github.com/ggml-org/llama.cpp/discussions/17600


I asked it to index my project https://github.com/mingodad/parsertl-playground and the result https://deepwiki.com/mingodad/parsertl-playground seems reasonably good (I'm still going through it in more detail, but overall it's impressive).



I got it working with recent compilers/OSs here https://github.com/mingodad/cfront-3


I'm building a collection of PEG grammars here https://mingodad.github.io/cpp-peglib and of Yacc/Lex grammars here https://mingodad.github.io/parsertl-playground/playground . Both are wasm-based playgrounds to test/develop/debug grammars.

The idea is to improve the tooling to work with grammars, for example generating railroad diagrams, source, stats, state machines, traces, ...

On both of them, select a grammar from "Examples", then click "Parse" to see a parse tree or AST for the content in "Input source", then edit the grammar/input to test new ideas.

There is also https://mingodad.github.io/plgh/json2ebnf.html to generate EBNF for railroad diagram generation from tree-sitter grammars.

Any feedback or contribution is welcome!
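
For anyone curious about the general idea, here is a minimal Python sketch of turning a tree-sitter `grammar.json` into W3C-style EBNF. It is not the actual json2ebnf code, just an illustration under assumptions: the helper names (`quote`, `to_ebnf`, `grammar_to_ebnf`) and the `EXAMPLE` grammar are made up, the node types are the ones tree-sitter emits in `grammar.json` as far as I know, and regex patterns are simply shown as quoted `/.../` text because plain EBNF has no regex syntax.

```python
# Node types that only add metadata (precedence, field names, aliases, token
# boundaries); for a railroad diagram we just look through them.
WRAPPERS = {"PREC", "PREC_LEFT", "PREC_RIGHT", "PREC_DYNAMIC",
            "TOKEN", "IMMEDIATE_TOKEN", "FIELD", "ALIAS"}


def quote(text):
    # W3C EBNF has no escape sequences, so pick the quote character
    # that does not occur in the text (simplification: assumes one is free).
    return f"'{text}'" if '"' in text else f'"{text}"'


def to_ebnf(node):
    """Render one grammar.json node as an EBNF fragment."""
    kind = node["type"]
    if kind == "STRING":
        return quote(node["value"])
    if kind == "PATTERN":
        # Assumption: show the regex as a quoted /.../ placeholder.
        return quote("/" + node["value"] + "/")
    if kind == "SYMBOL":
        return node["name"]
    if kind == "BLANK":
        return ""
    if kind == "SEQ":
        return " ".join(p for p in map(to_ebnf, node["members"]) if p)
    if kind == "CHOICE":
        parts = [to_ebnf(m) for m in node["members"]]
        alternatives = " | ".join(p for p in parts if p)
        # A BLANK alternative means the whole choice is optional.
        return f"( {alternatives} )" + ("?" if "" in parts else "")
    if kind in ("REPEAT", "REPEAT1"):
        return f"( {to_ebnf(node['content'])} )" + ("*" if kind == "REPEAT" else "+")
    if kind in WRAPPERS:
        return to_ebnf(node["content"])
    raise ValueError(f"unhandled node type: {kind}")


def grammar_to_ebnf(grammar):
    """Emit one 'name ::= body' line per rule, in file order."""
    return "\n".join(f"{name} ::= {to_ebnf(rule)}"
                     for name, rule in grammar["rules"].items())


# Hypothetical toy grammar: a bracketed, comma-separated list of numbers.
EXAMPLE = {
    "name": "list",
    "rules": {
        "list": {"type": "SEQ", "members": [
            {"type": "STRING", "value": "["},
            {"type": "CHOICE", "members": [
                {"type": "SEQ", "members": [
                    {"type": "SYMBOL", "name": "item"},
                    {"type": "REPEAT", "content": {"type": "SEQ", "members": [
                        {"type": "STRING", "value": ","},
                        {"type": "SYMBOL", "name": "item"}]}},
                ]},
                {"type": "BLANK"},
            ]},
            {"type": "STRING", "value": "]"},
        ]},
        "item": {"type": "PATTERN", "value": "[0-9]+"},
    },
}

if __name__ == "__main__":
    print(grammar_to_ebnf(EXAMPLE))
```

For the toy grammar this prints `list ::= "[" ( item ( "," item )* )? "]"` followed by `item ::= "/[0-9]+/"`. A real converter would also need to handle externals, extras, and inlined rules, which this sketch ignores.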


This is awesome! I've recently begun diving deeper into working with grammars, using them as part of a new project, and these tools look super useful.


There is also https://github.com/ricomariani/CG-SQL-author that has powerful stored procedure capabilities that can be transpiled to C/Lua/... You can try it in your browser here https://mingodad.github.io/CG-SQL-Lua-playground .


There is also https://sciter.com/ , whose author tried to find funding to make it open source but couldn't find enough supporters.


It's not a browser.


I do have a big collection of LALR(1) grammars to test/study/develop/document here https://mingodad.github.io/parsertl-playground/playground/ including sqlite, tidb, vitess, postgresql, mysql, ...


Not exactly the same, but here https://github.com/robertoraggi/cplusplus there is a serious one-person effort to create a C++23 compiler front end.


It does look serious, and it is able to parse one simple case I had at hand that openwatcom and orangec failed on. Thanks for bringing it up.

