Hacker News | enraged_camel's comments

More expensive than Sonnet 4.5, but no comparison benchmarks. I think I’ll pass.

I had the same thought. Cursor 1.0 was cheap and blazingly fast. 1.5 seems to keep the speed, but who knows how much better it is, and it's no longer cheap.

We've found it to be a strong mix of speed and intelligence. It scores higher than Sonnet 4.5 on Terminal-Bench 2, maybe we will post more on this later.

You should! This blog post doesn't really give any reason to use it besides "it's better on Cursor's internal benchmark". A full model card would be great.

The way benchmarks for Composer have been presented since v1 feels unusually cautious. To users, that reads as “the model isn’t very good”.

Yeah, please do. Because when the AI labs you are competing with are posting extensive benchmarks and you just say "well we used our own internal benchmark" it is a bit sus, especially given the fact that the price has tripled.

>> Appliances like Microwaves, etc were revolutionary for its time. Only problem: they lasted forever (>20 years). No 1 needed to buy it again = no business. It was deliberately not made to last as long and possibly not exactly cheaper both in cost and retail price.

This is a common myth that was debunked a while back. Essentially people get fooled by survivorship bias: they only see the few old appliances that somehow survived, and that leads them to conclude that things were higher quality back in the day.


> Essentially people get fooled by survivorship bias

It's still a thing today though? It's not survivorship bias. Take the Microwave example. In a lot of countries it is very hard to buy just a traditional (convection) Microwave now. They force these 4-in-1 or an inverter Microwave.

> leads them to conclude that things were higher quality back in the day

It says nothing about "quality". So continuing, yes the inverter Microwaves are "quality", offer more control and cost more, but due to all the complexity they die way faster. A lot of them die in <3 years when the traditional ones last way longer. Back in the day we only had convection Microwaves. The end.

> This is a common myth that was debunked a while back.

By who? By you? Another example - SSDs have been made to not last as long. We went from SLC -> MLC -> TLC -> QLC etc. The write endurance was reduced. Did the consumers want this? No. There just wasn't much "choice". Top of the line Samsung consumer SSDs just changed. During COVID some vendors sneakily adjusted it too. So, yes, quality got worse. Deliberately.


There is zero evidence this is the case. You are making a baseless accusation, probably due to partisan motivations.

edit: love the downvotes. I guess HN really is Reddit now. You can make any accusation without evidence and people are supposed to just believe it. If you call it out you get downvoted.


Is there any evidence the opposite is the case?

It doesn’t work like that. The burden is on the person making the claim. If you are going to accuse someone of posting an AI-written article you need to show evidence.

It's a losing strategy in 2026 to assume by default that any questionable spam blog/comment/etc content is written by an actual human unless proven otherwise.

Besides, if there are enough red flags that make it indistinguishable from actual AI slop, then chances are it's not worth reading anyway and nothing of value was lost by a false positive.


Please don't tell me you read that article and thought it was written by a person. This is clearly AI generated.

What evidence are you expecting exactly? It's vacuous AI slop that spends 1000 words just making vague assertions about how incredible OpenClaw is without a single actual example. There's nothing here, it's not real. You are going to struggle going forward if you can't detect AI slop this obvious.

A good way to think about it is that ChatGPT is well on its way to becoming a verb like Google did. Doesn't roll off the tongue as easily but in terms of brand awareness it feels ubiquitous.

This reads like a total joke.

>> The whole point is that you can't 100% trust the LLM to infer your intent with accuracy from lossy natural language.

You can't 100% trust a human either.

But, as with self-driving, the LLM simply needs to be better. It does not need to be perfect.


> You can't 100% trust a human either.

We do have a system of checks and balances that does a reasonable job of it. Not everyone in a position of power is willing to burn their reputation and land in jail. You don't check the food at the restaurant for poison, nor check whether the gas in your tank is ok. But you would if the cook or the gas manufacturer was as reliable as current LLMs.


> But you would if the cook or the gas manufacturer was as reliable as current LLMs.

No, in that scenario there would be no restaurants and you would travel by horse.


Good analogy

Incidentally, I've been using AI to deal with the weird bugs, cryptic errors and generally horrendous complexities of a framework we've been using at work (Elixir's Ash). It's really nice to no longer have to read badly organized docs, search the Internet for similar problems and ask around in the developers' Slack/Discord.

So many of my coding agent sessions start with "clone <github URL to relevant dependency> into /tmp for reference" - it's such a great pattern because incomplete or inaccurate documentation matters way less if the agent can dispatch a sub-agent to explore the codebase any time it needs to answer an obscure question.

I don't know how anyone can make this assumption in good faith. The poster did not imply anything along those lines.

That looked like a leading question to me, asking for confirmation rather than making an outright assumption. Seems like a fair question.

I have not been able to switch to Opus 4.5 in Xcode. It defaults to Sonnet 4.5 and I couldn't find where to change it (or if it's possible). Anyone know?

Most of this comment was written by an LLM. There are certain tells, such as the tone, as well as usage of “ for quotations instead of the much more common ". I think you added the last couple of sentences.
