Hacker News | kajman's comments

I dismissed the earlier non-technical blog post as shameless product boosterism for Anthropic. The linked hacks blog (which is a better source than this article) is a welcome release. It's hard to deny there's something real to this now, I think. Mozilla's internal definition of a "vulnerability" is also probably more widely applied than what many would intuit, but it is good that these issues are being taken seriously and fixed.


At the same time, other companies like AISLE are matching Mythos on vulnerabilities using older models but their own harness: https://aisle.com/blog/aisle-matches-anthropic-mythos-on-fre...

So while Mythos certainly is real I think you could do the same with Deepseek pro, GPT 5.5 etc...


I used to work with a guy who would always say "if you're looking for trouble, you are going to find it"

When I hear that "we found X bugs using some new tool", where the standard for bugs is low and doesn't necessarily require user impact in realistic scenarios, I think to myself: duh! You went looking for bugs, of course you found them.

For a sufficiently complicated product, in my experience, you don't have to look far.


Well it helps if 'looking for bugs' doesn't cost $300 per hour per set of eyes.

how much does it cost? my understanding of Mythos is that it runs a lot to find issues

The things I’ve read from various open source orgs with access to it is that Anthropic is giving them unmetered access for now as part of Glasswing. I’d bet that the corporate partners have to pay though.

> if you're looking for trouble, you are going to find it

That's the "'No Way to Prevent This,' Says Only Nation Where This Regularly Happens" of unsafe languages.

There are huge swathes of problems we know how to categorically prevent, but some people won't do it because they're more comfortable believing it was never preventable than accepting any culpability for not preventing it previously.


As the Hacks.Mozilla article notes: "We began with small-scale experiments prompting the harness to look for sandbox escapes with Claude Opus 4.6. Even with this model, we identified an impressive amount of previously-unknown vulnerabilities which required complex reasoning over multiprocess browser engine code."

Agreed. The earlier blog post did not explicitly claim this, but I think casual viewers were prompted to believe that the Magic of Mythos (TM) went and found (and fixed??) a bunch of vulnerabilities with minimal human guidance, and even contrasted this with their fuzzing infrastructure and made it sound (to me) like it was casting shade on it.

This new post makes it pretty clear that this was all bolted on top of their existing fuzzing infrastructure, and really just used to get more and better initial hits for a very skilled team to look at. I assume Anthropic was giving them a very good deal on inference for the positive PR, but I believe these other reports and suspect Mozilla did not really need them.


Wasn't AISLE only able to find the same bugs when it was shown only the known faulty code? The worrying part about Mythos isn't the fact that it can find bugs. The worrying part is Mythos being able to find them on its own across an entire codebase as vast as Firefox, then write exploits for what it's found with a very basic prompt.

The skill required to find then create zero days is quickly approaching the floor.


I think they split the codebase in smaller files or modules and then tell the AI there's a bug in this particular file and to go find it.

Then they loop over a codebase like this. This way you always point a model at a 'known' bug. And I assume a smaller context window helps with quality.

Not entirely sure, though; it's obviously proprietary.
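
A minimal sketch of the loop guessed at above, purely as an illustration: it is not AISLE's or anyone else's actual harness. The `scan_codebase` function, the prompt wording, the file suffixes, and the `ask_model` callable are all hypothetical; `ask_model` stands in for whatever model client you would plug in.

```python
from pathlib import Path
from typing import Callable, Iterable

# Hypothetical prompt: asserts a bug exists in the file, as the comment speculates.
PROMPT = (
    "There is a vulnerability in the file below. Find it and explain it.\n\n"
    "--- {name} ---\n{source}"
)


def scan_codebase(
    root: Path,
    ask_model: Callable[[str], str],
    suffixes: Iterable[str] = (".c", ".cpp", ".h"),
    max_chars: int = 20_000,
) -> dict[str, str]:
    """Point the model at one file at a time and collect its answers.

    ask_model is an assumed stand-in for a real model client: it takes a
    prompt string and returns the model's reply as a string.
    """
    wanted = tuple(suffixes)
    findings: dict[str, str] = {}
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in wanted:
            continue
        # Truncate each file so every call sees a small, bounded context.
        source = path.read_text(errors="replace")[:max_chars]
        prompt = PROMPT.format(name=path.name, source=source)
        findings[path.relative_to(root).as_posix()] = ask_model(prompt)
    return findings
```

Every file gets the same "there is a bug here" framing, so a real pipeline would still need a triage step to throw out the false positives this invites.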


I don't know what to call this - a "freelancer launch"? It is the best executed one I've seen, though. Maybe even a black mark on OSS if it does not go well.

> Maybe even a black-mark on OSS if it does not go well.

No, because realistically, this is the opposite of what corporations want. If a project is only being maintained by one or two people, that’s a risk, pure and simple. So you look somewhere else for something that matches your needs, with a more sustainable story.

Nothing against the author, but what he’s describing is a business model - just one that’s likely to bring in a negligible amount of money. This is less about open source and more about what kinds of projects society is willing to pay people to work on.


Corporations seem to rely on key software that just a few people maintain all the time already, but you're right and the bus factor does not look great. Mise is also currently MITMing my shell, along with presumably many other dev machines, so the threat of compromise is pretty scary.

> So I left Figma to work on these full time.

The Mise website makes way more sense to me now. I suppose some artistic license is justified when you're at the cutting-edge of the CLI aesthetic and what not.


This would not have ever been announced while Lina Khan was running the FCC.

What does the FCC have to do with this?

Anti-trust. They're selling part of the problem (inference via Gemini) and now they're selling a solution. They also dominate web standards by developing the dominant browser. And they control one of two dominant phone platforms that will collaborate to enable this solution.

If this were some smaller company that just did cloud then it'd never even make it to PoC. This can only happen because it's Google Cloud, and they can leverage everything they own all at once. Those not buying into their ecosystem can take a hike.


The FCC doesn't enforce antitrust law. That's the FTC. (The FTC is also the commission that Lina Khan chaired for a while.)

Oops, Yes. I got 2/3 of the letters correct, though. I think that might be a better rate of success than their court cases during those years.

Such a bizarre boondoggle for a company that otherwise seems to have smart and focused offerings. They may as well announce a Slack or Jira replacement next.

Orion is their first product, before Kagi.

I wasn't aware, and I even paid for their search for a time.

I still do not understand the market for a proprietary browser aimed at privacy-conscious power-users. There is a non-proprietary option that is many years ahead, along with another proprietary browser marketing to the same niche demographic. Good luck to them.


Underwriters already have a solution, but there's a national flood insurance program ensuring taxpayers hold the bag instead.

I hope the only reason people are pretending these markdown suggestions are a "workflow" is fear that a more structured approach will be obsolete by the time it's polished. I can't imagine the pace of innovation with the underlying models will stay like this forever.

I hope to see harnesses that will demand instead of ask. Kill an agent that was asked to be in plan mode but did not play the prescribed planning game. Even if it's not perfect, it'd have to be better than the current regime when combined with a human in the loop.
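
To make "demand instead of ask" concrete, here is a rough sketch under stated assumptions. The `Action` type, the tool names in `PLAN_MODE_ALLOWED`, and `PlanModeViolation` are all made up for illustration and do not correspond to any existing agent framework. The idea is that the harness checks every action itself and stops the run on the first one that is not planning, rather than trusting a markdown instruction.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator

# Hypothetical set of tools an agent may use while in plan mode.
PLAN_MODE_ALLOWED = frozenset({"read_file", "search", "write_plan"})


@dataclass(frozen=True)
class Action:
    """One step the agent wants to take (assumed shape, for illustration)."""
    tool: str
    argument: str = ""


class PlanModeViolation(RuntimeError):
    """Raised to end the run when the agent leaves the planning game."""


def enforce_plan_mode(
    actions: Iterable[Action],
    allowed: frozenset = PLAN_MODE_ALLOWED,
) -> Iterator[Action]:
    """Pass through permitted actions; stop the run on the first other one."""
    for action in actions:
        if action.tool not in allowed:
            # The harness refuses outright instead of asking the agent to behave.
            raise PlanModeViolation(
                f"agent attempted {action.tool!r} while in plan mode"
            )
        yield action
```

A human in the loop would then see the violation and decide whether to restart the agent or let it leave plan mode.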


I'm confident the world will need more software developers than ever before, no matter where "AI" goes from here.

I don't think most of those jobs will be in the West, though.


Why not? Are there any software-related industries in the West where software engineers are not needed or won't be needed?

As in, full replacement of developers at Western companies by the "AI"? I would be very surprised.

I do suspect that developing software in California in 10-20 years may be looked at as if one were proposing a sweatshop to sew pants there right now.


All of the three sectors you've mentioned are not in a good place right now. Probably much less stressful to be an unemployed programmer than trying to make a hobby-scale farm profitable with soaring fuel and fertilizer prices, along with a labor force that is fleeing.

E: Farm automation probably has some juice though, regardless of how close the androids I keep seeing in demos actually are.


Sure is nice as a user. I get better frames in some games than Windows users do!
