  textbox.value = guess;
  guessWord();
  if (document.querySelector(".correct")) {
    console.log("Found the word:", guess);
    break;
  } else if (textbox.placeholder.includes("after")) {
    l = m + 1;
  } else {
    h = m - 1;
  }
}
Here's mine in JavaScript, you can paste it in the console.
Thanks! I should have realised that a solution for this could be implemented in JavaScript as well, so it can run directly in the browser. Here is my JavaScript translation of my earlier Python program:
let lo = 0, hi = dictionary.length - 1
const answer = document.getElementById('guess')
while (document.getElementsByClassName('correct').length === 0) {
  const mid = Math.floor(lo + (hi - lo) / 2)
  answer.value = dictionary[mid]
  guessWord()
  if (answer.placeholder.indexOf('after') !== -1) {
    lo = mid + 1
  } else {
    hi = mid - 1
  }
}
This solution is quite similar to yours. Thanks for this nice idea!
Nahhh, that's the tried-and-true Apple approach to marketing, and OpenAI is well positioned to adopt it for themselves. They act like they invented transformers as much as Apple acts like they invented the smartphone.
I'm the research lead of Anthropic's interpretability team. I've seen some comments like this one, which I worry downplay the importance of @leogao et al's paper due to its similarity to ours. I think these comments are really undervaluing Gao et al's work.
It's not just that this is contemporaneous work (a project like this takes many months at the very least), but also that it introduces a number of novel contributions like TopK activations and new evaluations. It seems very possible that some of these innovations will be very important for this line of work going forward.
More generally, I think it's really unfortunate when we don't value contemporaneous work or replications. Prior to this paper, one could have imagined it being the case that sparse autoencoders worked on Claude due to some idiosyncrasy, but wouldn't work on other frontier models for some reason. This paper gives us increased confidence that they work broadly, and that in itself is something to celebrate. It gives us a more stable foundation to build on.
I'm personally really grateful to all the authors of this paper for their work pushing sparse autoencoders and mechanistic interpretability forward.
The biggest thing I noticed comparing the two was that OpenAI's method directly tackles (and appears to have effectively mitigated) the dead-latents problem with a clever weight initialization and an "auxiliary loss" which (I think) explicitly penalizes dead latents. The TopK activation function is the other main difference I spot between the two.
Now, on the flip side, the Anthropic effort goes much further than the OpenAI one in terms of actually doing something interesting with the outputs of all this. Feature steering and the feature UMAP are both extremely cool, and to my knowledge the OpenAI team stopped short of efforts like that in their paper.
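For anyone unfamiliar with the TopK activation mentioned above: the idea is simply to keep the k largest pre-activations and zero out the rest, so sparsity is enforced exactly rather than via an L1 penalty. Here's a rough JavaScript sketch of the concept (my own illustrative code operating on a plain array, not the paper's actual implementation):

```javascript
// Illustrative TopK activation: keep the k largest values, zero the rest.
// (Hypothetical helper for explanation; real SAEs do this per-sample on
// tensors, with tie-breaking and backward-pass details this sketch skips.)
function topK(preActivations, k) {
  // Find the k-th largest value by sorting a copy in descending order.
  const sorted = [...preActivations].sort((a, b) => b - a);
  const threshold = sorted[k - 1];
  // Zero every entry below the threshold; ties at the threshold are kept.
  return preActivations.map(v => (v >= threshold ? v : 0));
}

console.log(topK([0.2, 1.5, -0.3, 0.9, 0.1], 2)); // [0, 1.5, 0, 0.9, 0]
```

Note the appeal: the latent vector has exactly k nonzero entries by construction, so you tune sparsity with a single integer instead of sweeping an L1 coefficient.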