
My biggest problem with LLMs at this point is that they produce different and inconsistent results, or behave differently, given the same prompt. Better grounding would be amazing. I want to give an LLM the same prompt on different days and be able to trust that it will do the same thing as yesterday. Currently they misbehave multiple times a week and I have to steer them manually, which destroys certain automated workflows completely.




It sounds like you have dug into this problem with some depth, so I would love to hear more. When you've tried to automate things, I'm guessing you've got a template plus some data, and the same or similar input gives totally different results? What details can you share about how different the results are? Are you asking for, e.g., JSON output and getting something that totally isn't, or is it a more subtle difference?

> I want to give an LLM the same prompt on different days and I want to be able to trust that it will do the same thing as yesterday

Bad news, it's winter now in the Northern hemisphere, so expect all of our AIs to get slightly less performant as they emulate humans under-performing until Spring.


You need to set the temperature to 0 and tune your prompts for automated workflows.
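
A minimal sketch of that setup, assuming the OpenAI Python SDK (the model name and seed value are illustrative, not recommendations); temperature 0 plus a pinned seed makes runs mostly repeatable, though providers still don't guarantee bit-identical output:

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    resp = client.chat.completions.create(
        model="gpt-4o-mini",   # illustrative model name
        temperature=0,         # minimize sampling randomness
        seed=42,               # best-effort determinism, not a hard guarantee
        messages=[
            {"role": "system", "content": "Reply with valid JSON only."},
            {"role": "user", "content": "Summarize: the nightly build failed on step 3."},
        ],
    )
    print(resp.choices[0].message.content)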

It doesn't really solve it, as a slight shift in the prompt can still have totally unpredictable results. And if your prompt is always exactly the same, you could just cache the response and bypass the LLM entirely (something like the sketch below).

What would really be useful is for a very similar prompt to always give a very similar result.
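
For the exact-match case a cache really is trivial; a minimal sketch, where call_llm is a hypothetical stand-in for whatever client you actually use:

    import hashlib

    _cache: dict[str, str] = {}

    def cached_completion(prompt: str, call_llm) -> str:
        """Return the stored answer for a byte-identical prompt, else call the model once."""
        key = hashlib.sha256(prompt.encode("utf-8")).hexdigest()
        if key not in _cache:
            _cache[key] = call_llm(prompt)  # call_llm is hypothetical, not a real API
        return _cache[key]

Which is also why it doesn't address the harder ask above: a cache keyed on exact bytes tells you nothing about prompts that are merely similar.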


This doesn't work with the current architecture, because we have to introduce some element of stochastic noise into the generation or else they're not "creatively" generative.

Your brain doesn't have this problem because the noise is already present. You, as an actual thinking being, are able to override the noise and say "no, this is false." An LLM doesn't have that capability.
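
The knob being described is temperature-scaled sampling. A toy sketch of the difference between greedy decoding (temperature effectively 0) and sampling; the vocabulary and logits here are made up:

    import math
    import random

    def sample_token(logits: dict[str, float], temperature: float) -> str:
        """Pick a token: argmax when temperature is ~0, otherwise sample from a softmax."""
        if temperature < 1e-6:
            return max(logits, key=logits.get)           # deterministic greedy choice
        scaled = {t: l / temperature for t, l in logits.items()}
        m = max(scaled.values())
        weights = {t: math.exp(v - m) for t, v in scaled.items()}
        r = random.random() * sum(weights.values())
        for tok, w in weights.items():
            r -= w
            if r <= 0:
                return tok
        return tok  # fallback for floating-point edge cases

    logits = {"yes": 2.1, "no": 2.0, "maybe": 0.5}       # made-up logits
    print(sample_token(logits, temperature=0.0))         # always "yes"
    print(sample_token(logits, temperature=1.0))         # varies run to run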


Well, that's because if you look at the structure of the brain, there's a lot more going on than what goes on inside an LLM.

It’s the same reason why great ideas almost appear to come randomly - something is happening in the background. Underneath the skin.


That's a way different problem, my guy.

Have you tried this? It doesn't work because of the way inference runs at the big providers: your query isn't run in isolation, it gets batched with other traffic, and the output can shift with the batch (see the sketch below).

Maybe it can work if you are running your own inference.
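
One concrete mechanism (an illustration, not a claim about any specific provider's stack): floating-point addition is not associative, so the same values reduced in a different order, which is what changing batch shapes and kernels does, can produce slightly different logits, and at temperature 0 a tiny logit difference is enough to flip the argmax:

    import random

    # Sum the same 10,000 numbers in two different orders; the totals differ
    # slightly because floating-point addition is not associative.
    random.seed(0)
    xs = [random.uniform(-1.0, 1.0) for _ in range(10_000)]

    forward = sum(xs)

    shuffled = xs[:]
    random.shuffle(shuffled)
    reordered = sum(shuffled)

    print(forward, reordered)     # very close, but usually not bit-identical
    print(forward == reordered)   # frequently False

    # If two candidate tokens have logits this close together, a reordering
    # like this is enough to change which one greedy decoding picks.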



