
Okay, I read your post more carefully, and it seems like you're attempting to build one central script for a given URL. Assuming one-shot script generation is unreliable and requires iterative improvement, this makes sense. Of course, I'm biased in favor of local-first, privacy-preserving, non-distributed solutions if they exist, so I'd be curious to know if/how you measured the reliability of one-shot script generation for a basket of likely web apps.


One-shot is pretty much not going to work, either at the single-step level or if you ask the LLM to generate the whole workflow in one shot. We haven't measured it as such, but even for static websites like the Hacker News front page it takes a couple of rounds of back-and-forth for the LLM to get it right. Somehow, after all the instructions, the LLM will still "guess" the selector instead of checking the page/DOM contents. And then there are a lot of other minor details that need to be captured, like "you need to wait a couple of seconds for the autocomplete results to show up". If you tell it to just make a workflow, it will generate some garbage and call it a day.
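
To make the "guessed selector" problem concrete, here is a rough sketch (not our actual implementation) of the kind of check that catches it: run the proposed selector against the real page HTML and hand the failure back to the model. Everything here is an assumption for illustration: the names `count_matches`, `check_selector`, and `SAMPLE_HTML` are hypothetical, the markup is made up, and the matcher only understands `tag`, `.class`, `#id`, `tag.class`, and `tag#id`.

```python
from html.parser import HTMLParser

# Made-up markup for illustration; not the real markup of any site.
SAMPLE_HTML = """
<table>
  <tr class="story"><td><a href="/item/1">First story</a></td></tr>
  <tr class="story"><td><a href="/item/2">Second story</a></td></tr>
</table>
"""


class _Collector(HTMLParser):
    """Records (tag, id, classes) for every start tag in the document."""

    def __init__(self):
        super().__init__()
        self.elements = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        classes = set((attr_map.get("class") or "").split())
        self.elements.append((tag, attr_map.get("id"), classes))


def count_matches(html, selector):
    """Count elements matching a very simple selector (see the caveats above)."""
    tag, want_id, want_class = selector, None, None
    if "#" in selector:
        tag, want_id = selector.split("#", 1)
    elif "." in selector:
        tag, want_class = selector.split(".", 1)
    collector = _Collector()
    collector.feed(html)
    return sum(
        1
        for el_tag, el_id, el_classes in collector.elements
        if (not tag or el_tag == tag)
        and (want_id is None or el_id == want_id)
        and (want_class is None or want_class in el_classes)
    )


def check_selector(html, selector):
    """Return feedback text for the model, or None if the selector matches."""
    if count_matches(html, selector) == 0:
        return f"Selector {selector!r} matched 0 elements; inspect the DOM and try again."
    return None


if __name__ == "__main__":
    # A guessed selector that is not on the page, then one that is.
    print(check_selector(SAMPLE_HTML, ".headline"))
    print(check_selector(SAMPLE_HTML, "tr.story"))
```

In a real setup you would run the selector in the browser itself rather than a toy matcher like this, but the loop is the same: never accept a selector until it has matched something on the actual page.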



