They have browser automation and a bunch of other agent tools to manage tasks, do things like PowerPoint slides, etc. I find ChatGPT agent mode better for most tasks, though.
I just did this test with our web QA agent (kodefreeze.com). It was able to test creating an account until it reached the screen that requires email confirmation.
Support for receiving email and other custom actions is on our roadmap, but I'd love to hear whether getting this far would be valuable to you. The test used the email test@kodefreeze.com.
How do you build trust in a system like that? The flowchart style has the advantage that you can decide when you want a human to review/approve, as well as ensuring that actions that need to happen under certain conditions actually do happen.
Yeah, the flowchart style does have that advantage, since you can add approvals and conditions directly into the flow (see the sketch at the end of this comment). The tradeoff is that you end up limited to simpler logical flows that are easier to verify.
Our take is that trust in agent systems has to be empirical: you start with manual testing and then layer on AI-based simulations (we're adding this in Rowboat soon) to test more scenarios at scale. Splitting work across multiple agents also makes it easier to isolate and test parts separately.
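As a rough sketch of what an approval gate in a flowchart-style flow looks like (the step names, the refund example, and the ask_human helper are purely illustrative, not from Rowboat or any specific framework):

    # Illustrative only: a flow step with a hard condition and a human gate.
    from dataclasses import dataclass

    @dataclass
    class Order:
        id: str
        amount: float

    def ask_human(prompt: str) -> bool:
        # In practice this would go to a review queue; stdin keeps the sketch runnable.
        return input(f"{prompt} [y/N] ").strip().lower() == "y"

    def refund_flow(order: Order) -> str:
        # The condition lives in the flow itself, so it always runs,
        # regardless of what the agent decided upstream.
        if order.amount > 100:
            if not ask_human(f"Approve refund of {order.amount} for order {order.id}?"):
                return "rejected by reviewer"
        # ...call the actual refund tool/API here...
        return "refunded"

The point is just that the approval and the condition are part of the flow, not something the agent can decide to skip.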
This is really interesting. We've been working on a smaller subset of this problem space. We've also found that in some cases you need to pass the model the sequence of events that happened on the page (something like a video of a transition).
For instance, we were running a test case on an e-commerce website that shows a random popup after the initial DOM is rendered but before the next action can be taken. This would confuse the LLM about the next action it needed to take, because it didn't know the popup had appeared.
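Roughly the kind of thing we mean, sketched with Playwright (this isn't our actual implementation; the reportDomChange binding name and the URL are placeholders):

    # Sketch: record DOM additions between the agent's actions, so the model
    # can be told "these elements appeared since your last action" (e.g. a popup)
    # instead of silently acting on a stale view of the page.
    from playwright.sync_api import sync_playwright

    events = []  # filled from the browser via the exposed binding

    OBSERVER_JS = """
    new MutationObserver((mutations) => {
      for (const m of mutations) {
        for (const node of m.addedNodes) {
          if (node.nodeType === 1) {
            // Report a short snippet of anything newly added to the page.
            window.reportDomChange(node.outerHTML.slice(0, 200));
          }
        }
      }
    }).observe(document, { childList: true, subtree: true });
    """

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page()
        page.expose_function("reportDomChange", lambda html: events.append(html))
        page.add_init_script(OBSERVER_JS)
        page.goto("https://example.com")  # placeholder URL

        # ... agent performs an action here ...
        page.wait_for_timeout(2000)  # give late popups a chance to show up

        # In a real setup you'd only keep changes since the last action and
        # include them in the next prompt; this sketch just dumps everything.
        print("\n".join(events))
        browser.close()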