
One issue is the accuracy of these AI models: you can't -really- trust them to do a task fully, which makes it hard to fully automate things with them. But the other is cost. Anyone using these models to do something at scale is paying maybe 100X over what it would cost in compute to run deterministic code to do the same thing. So in cases where you can write deterministic code to do something, or build a UI for a user to do it themselves, that still seems to be the best way. Once AI gets to the point where you can fully trust some model, we've probably already hit AGI, and at that point we're probably all in pods with cables in our brainstems, so who cares...


The thing is that I don't use AI to replace things I can do deterministically with code. I use it to replace things I cannot do deterministically with code - often something I would have a person do. People are also fallible and can't be completely trusted to do the thing exactly right. I think it works very well for things that have a human in the loop, like coding agents where someone needs to review changes. For instance, I put an agent in a tool for generating AWS access policies from English descriptions or answering questions about current access (where the agent has access to tools to see current users, bucket policies, etc.). I don't trust the agent to get it exactly right, so it just proposes the policies and I have to accept or modify them before they are applied, but it's still better than writing them myself. And it's better than a web interface, which lacks that context.
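Roughly the shape of it, as a minimal sketch in Python - call_llm is a stand-in for whatever model API you use, and the prompt and flow are simplified:

  import json
  import boto3

  def call_llm(prompt: str) -> str:
      # Stand-in for your model API of choice; expected to return policy JSON.
      raise NotImplementedError

  def propose_policy(description: str) -> dict:
      raw = call_llm("Return only an AWS IAM policy JSON for: " + description)
      return json.loads(raw)  # fail fast if the model returned non-JSON

  description = input("Describe the access you need: ")
  policy = propose_policy(description)
  print(json.dumps(policy, indent=2))

  # The human-in-the-loop step: nothing is applied without explicit approval.
  if input("Apply this policy? [y/N] ").strip().lower() == "y":
      boto3.client("iam").create_policy(
          PolicyName=input("Policy name: "),
          PolicyDocument=json.dumps(policy),
      )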

I think it's a good example of the kind of internal tool the article is talking about. I would not have spent the time to build it without Claude making stand-alone projects much faster to build, and without LLMs I would have no way to do the English -> policy step at all.


>> The thing is that I don't use AI to replace things I can do deterministically with code. I use it to replace things I cannot do deterministically with code - often something I would have a person do.

Nailed it. And the thing is, you can (and should) still have deterministic guard rails around AI! Things like normalization, data mapping, validations etc. protect against hallucinations and help ensure AI’s output follows your business rules.
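For example, a deterministic validator can sit between the model and AWS so nothing over-broad slips through - a rough sketch, where the allowed prefixes are made-up business rules:

  ALLOWED_ACTION_PREFIXES = ("s3:Get", "s3:List", "iam:Get", "iam:List")

  def validate_policy(policy: dict) -> list[str]:
      # Deterministic guard rail: reject hallucinated or over-broad output.
      errors = []
      for stmt in policy.get("Statement", []):
          if stmt.get("Effect") not in ("Allow", "Deny"):
              errors.append("bad Effect: %r" % stmt.get("Effect"))
          actions = stmt.get("Action", [])
          if isinstance(actions, str):
              actions = [actions]
          for action in actions:
              if action == "*" or not action.startswith(ALLOWED_ACTION_PREFIXES):
                  errors.append("action outside business rules: " + action)
      return errors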


> Things like normalization, data mapping, validations etc. protect against hallucinations

And further downstream: audit trails, human sign-offs, and operations that are reversible, or that have another workflow for applying compensating actions to fix things up.
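A sketch of what that could look like in the policy tool above - an append-only audit trail where each entry carries what you'd need to undo it, plus a compensating action (iam is a boto3 IAM client; the log format is made up):

  import json, time

  def record(action: str, detail: dict) -> None:
      # Append-only audit trail, one JSON line per change.
      with open("policy_audit.jsonl", "a") as f:
          f.write(json.dumps({"ts": time.time(), "action": action, **detail}) + "\n")

  def apply_policy(iam, name: str, document: dict) -> str:
      resp = iam.create_policy(PolicyName=name, PolicyDocument=json.dumps(document))
      arn = resp["Policy"]["Arn"]
      record("create_policy", {"arn": arn})
      return arn

  def compensate(iam, arn: str) -> None:
      # Compensating action: deleting the policy reverses the create.
      iam.delete_policy(PolicyArn=arn)
      record("delete_policy", {"arn": arn})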


Or you could make a tool that generates this stuff deterministically, the exact same way every time. Then you can audit the tool itself and see whether it is correct. Your setup still leaves the point of failure on the user - arguably a bigger one, because they can get complacent with the LLM output and assume it is correct when it isn't.

In my mind you are trading a function that always evaluates the same way for a given x for one that might not, and that requires oversight.
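For fixed, known request shapes, that tool can be as simple as a parameterized template - same input in, same policy out, every time (a sketch):

  def s3_read_policy(bucket: str) -> dict:
      # Deterministic: audit this function once and you've audited
      # every policy it will ever emit.
      return {
          "Version": "2012-10-17",
          "Statement": [{
              "Effect": "Allow",
              "Action": ["s3:GetObject", "s3:ListBucket"],
              "Resource": [
                  "arn:aws:s3:::" + bucket,
                  "arn:aws:s3:::" + bucket + "/*",
              ],
          }],
      }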


So how would you implement "generating AWS access policies from English descriptions" using deterministic code, in a way that doesn't require human oversight?


> I think it works very well for things that have a human in the loop, like coding agents where someone needs to review changes

This is the best case for AI. It's not very different from a Level 3 autonomous car with the driver in the loop, as opposed to a fully autonomous Level 5 vehicle, which probably requires AGI-level AI.

The same applies to medicine, where a limited number of specialists (radiologists/cardiologists/oncologists/etc.) in the loop are assisted by AI on work that would take experts too much time to review manually - especially non-obvious early symptom detection in X-rays/ECGs/MRIs for the modern practice of evidence-based medicine.


> Anyone using these models to do something at scale is paying maybe 100X over what it would cost in compute to run deterministic code to do the same thing

That's fine if the person wouldn't be able to write the code otherwise.

There are lots and lots of people in positions that are "programming adjacent". They use computers as their primary tool and are good at something (like CAD), but can't necessarily sling code. So a task like "We're about to release these drawings to an external client. Please write a script to check that all the drawings have an author, project, and contract number matching what they should be for this client, and flag any that don't." is good AI bait. Or "Please shovel this data from X, Y, and Z into an Excel spreadsheet" is also decent AI bait.
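The first of those is a genuinely small script. A sketch of what the AI might hand back, assuming the drawing metadata has already been exported to a CSV with drawing/author/project/contract columns (real CAD formats would need a library):

  import csv

  EXPECTED = {"author": "J. Smith", "project": "ACME-2024", "contract": "C-1138"}

  def flag_mismatches(csv_path: str) -> list[str]:
      # Flag any drawing whose metadata doesn't match this client's values.
      flagged = []
      with open(csv_path, newline="") as f:
          for row in csv.DictReader(f):
              bad = [k for k, v in EXPECTED.items() if row.get(k, "").strip() != v]
              if bad:
                  flagged.append(row["drawing"] + ": check " + ", ".join(bad))
      return flagged

  for line in flag_mismatches("drawings.csv"):
      print(line)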

Programmers underestimate how difficult it is for non-programmers to synthesize code from thin air. It is much easier to read a small script than to construct one.


The article kind of addresses that by identifying the types of problems AI is best suited to solve.



