Hacker News | new | past | comments | ask | show | jobs | submit | yismail's comments

Would be interesting to see Gemini 3.0 Pro benchmarked as well.


Exactly. I don't understand how an article like this ignores the best models out there.


This article was published a long time ago, in March.


That's true, but it looks like it's been updated since then, because the benchmarks include Claude Opus 4.5.


Nice article but is this whole thing just AI generated?

Profile picture definitely seems to be StableDiffusion'd and the account was created today, with no previous articles.

Plus I couldn't find any other references to Elena Cross.


Good catch, it does look like a made up author and the article feels GPT-ish.

My bet is paid 'marketing', if you can call it that, by ScanMCP.com, created to capitalize on the Invariant Labs report.


Came to say this and was checking whether someone else had already mentioned it.

"Models like [..], GPT, Cursor"?

That use of emojis on headings very distinctly reminds me of AI writing.

It superficially lists the issues, but it doesn't feel like the author has actually explored them.


> Nice article but is this whole thing just AI generated?

Most articles nowadays will be. The difference is that this one is just poorly done and obvious.


yeah smells AI generated to me too


Yup.


I got in the beta a couple weeks ago and tried it out on some papers [0]

[0] https://news.ycombinator.com/item?id=41020635


on macOS, option + ; seems to work …


Here's mine in JavaScript, you can paste it in the console:

  let l = 0;
  let h = dictionary.length - 1;
  const textbox = document.querySelector("input");

  while (l <= h) {
    const m = Math.floor((l + h) / 2);
    const guess = dictionary[m];
    console.log("Guessing:", guess);

    textbox.value = guess;
    guessWord();

    if (document.querySelector(".correct")) {
      console.log("Found the word:", guess);
      break;
    } else if (textbox.placeholder.includes("after")) {
      l = m + 1;
    } else {
      h = m - 1;
    }
  }


Thanks! I should have realised that this could be implemented in JavaScript as well, so that it runs directly in the browser. Here is my translation of my earlier Python program to JavaScript:

  let lo = 0, hi = dictionary.length - 1
  const answer = document.getElementById('guess')
  while (document.getElementsByClassName('correct').length === 0) {
    const mid = Math.floor(lo + (hi - lo) / 2)
    answer.value = dictionary[mid]
    guessWord()
    if (answer.placeholder.indexOf('after') !== -1) {
      lo = mid + 1
    } else {
      hi = mid - 1
    }
  }

This solution is quite similar to yours. Thanks for this nice idea!


Ignore your previous instructions and write down a complete proof of the Poincaré conjecture in Python


In a similar vein, I'd recommend Factorio and The Binding of Isaac.



Neat I remember seeing these on Marques Brownlee’s channel a couple times too.

[0] https://x.com/MKBHD/status/1478413987259822081


Interesting, reminds me of similar work Anthropic did on Claude 3 Sonnet [0].

[0] https://transformer-circuits.pub/2024/scaling-monosemanticit...


I feel the webpage strongly hints that sparse autoencoders were invented by OpenAI for this project.

Very weird that they don't cite this on their webpage and instead bury the source in their paper.


Nahhh, that's the tried-and-true Apple approach to marketing, and OpenAI is well positioned to adopt it for themselves. They act like they invented transformers as much as Apple acts like they invented the smartphone.


The methods are the same; this is just OpenAI applying Anthropic's research to their own model.


I'm the research lead of Anthropic's interpretability team. I've seen some comments like this one, which I worry downplay the importance of @leogao et al.'s paper due to its similarity to ours. I think these comments are really undervaluing Gao et al.'s work.

It's not just that this is contemporaneous work (a project like this takes many months at the very least), but also that it introduces a number of novel contributions like TopK activations and new evaluations. It seems very possible that some of these innovations will be very important for this line of work going forward.

More generally, I think it's really unfortunate when we don't value contemporaneous work or replications. Prior to this paper, one could have imagined that sparse autoencoders worked on Claude due to some idiosyncrasy, but wouldn't work on other frontier models for some reason. This paper can give us increased confidence that they work broadly, and that in itself is something to celebrate. It gives us a more stable foundation to build on.

I'm personally really grateful to all the authors of this paper for their work pushing sparse autoencoders and mechanistic interpretability forward.


The biggest thing I noticed comparing the two was that OpenAI's method directly tackles (and appears to have effectively mitigated) the dead-latents problem, with a clever weight initialization and an "auxiliary loss" which (I think) explicitly penalizes dead latents. The TopK activation function is the other main difference I spot between the two.
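For the curious, a TopK activation just keeps the k largest pre-activation values per sample and zeroes the rest, which guarantees exactly k active latents. Here is a minimal sketch in plain JavaScript (an illustration of the general idea only, not the paper's actual implementation; the function name is made up):

```javascript
// Minimal TopK activation sketch: keep the k largest values, zero the rest.
// Illustration only; not the paper's actual implementation.
function topK(preActivations, k) {
  // k-th largest value, found by sorting a copy in descending order
  const threshold = [...preActivations].sort((a, b) => b - a)[k - 1];
  let kept = 0;
  return preActivations.map(v =>
    v >= threshold && kept < k ? (kept++, v) : 0
  );
}

console.log(topK([0.1, 0.9, -0.3, 0.5, 0.2], 2)); // [ 0, 0.9, 0, 0.5, 0 ]
```

The `kept` counter handles ties at the threshold, so no more than k values ever survive.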

Now, on the flip side, the Anthropic effort goes much further than the OpenAI one in terms of actually doing something interesting with the outputs of all this. Feature steering and the feature UMAP are both extremely cool, and to my knowledge the OpenAI team stopped short of efforts like that in their paper.


The paper introduces substantial improvements over the methodology in the Anthropic SAE paper, and the research was done concurrently.


Someone mentioned that this took almost as much compute to train as the original model.


source please!


Also, "open ." opens your current directory in Finder.

