obiefernandez's comments

Anyone else look at this and think to themselves, "thank god"?

Like it's probably a good thing for humanity if the USA does not feel the need to go to war with China over Taiwan.


The RLM framing basically turns long-context into an RL problem over what to remember and where to route it: main model context vs Python vs sub-LLMs. That’s a nice instantiation of The Bitter Lesson, but it also means performance is now tightly coupled to whatever reward signal you happen to define in those environments. Do you have any evidence yet that policies learned on DeepDive / Oolong-style tasks transfer to “messy” real workloads (multi-week code refactors, research over evolving corpora, etc.), or are we still in the “per-benchmark policy” regime?
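
To make the framing concrete, here's a toy sketch of the action space I mean (all names are mine, not from the post):

    from dataclasses import dataclass
    from enum import Enum, auto

    class Route(Enum):
        KEEP_IN_CONTEXT = auto()      # spend main-model context tokens on it
        SEND_TO_PYTHON = auto()       # handle it programmatically in the REPL
        DELEGATE_TO_SUB_LLM = auto()  # summarize/scan via a cheaper sub-call

    @dataclass
    class Decision:
        chunk_id: int
        route: Route

    def naive_policy(chunk: str, chunk_id: int) -> Decision:
        # Stand-in for the learned policy: route big blobs away from the
        # controller's context, keep short snippets in it.
        if len(chunk) > 4000:
            return Decision(chunk_id, Route.DELEGATE_TO_SUB_LLM)
        if chunk.lstrip().startswith(("{", "[", "<")):  # structured -> code
            return Decision(chunk_id, Route.SEND_TO_PYTHON)
        return Decision(chunk_id, Route.KEEP_IN_CONTEXT)

The RL question is whether the learned version of that policy generalizes beyond the reward you trained it on.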

The split between main model tokens and sub-LLM tokens is clever for cost and context rot, but it also hides the true economic story. For many users the cost that matters is total tokens across all calls, not just the controller’s context. Some of your plots celebrate higher “main model token efficiency” while total tokens rise substantially. Do you have scenarios where RLM is strictly more cost-efficient at equal or better quality, or is the current regime basically “pay more total tokens to get around context limits”?
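
For concreteness, a back-of-envelope version of the accounting I mean (numbers invented, not taken from your plots):

    main_tokens_plain = 120_000      # one long-context call
    main_tokens_rlm   = 30_000       # controller context stays small...
    sub_llm_tokens    = 8 * 20_000   # ...but eight sub-LLM scans add up

    total_plain = main_tokens_plain
    total_rlm   = main_tokens_rlm + sub_llm_tokens

    print(f"main-model 'efficiency' gain: {main_tokens_plain / main_tokens_rlm:.0f}x")
    print(f"total tokens: {total_plain:,} plain vs {total_rlm:,} RLM")
    # -> controller looks 4x more efficient while total spend rises ~58%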

math-python is the most damning data point: same capabilities, but the RLM harness makes models worse and slower. That feels like a warning that “more flexible scaffold” is not automatically a win; you’re introducing an extra layer of indirection that the model has not been optimized for. The claim that RL training over the RLM will fix this is plausible, but also unfalsifiable until you actually show a model that beats a strong plain-tool baseline on math with less wall-clock time and fewer tokens.

Oolong and verbatim-copy are more encouraging: the controller treating large inputs as opaque blobs and then using Python + sub-LLMs to scan/aggregate is exactly the kind of pattern humans write by hand in agents today. One thing I’d love to see is a comparison vs a well-engineered non-RL agent baseline that does essentially the same thing but with hand-written heuristics (chunk + batch + regex/SQL/etc.). Right now the RLM looks like a principled way to let the model learn those heuristics, but the post doesn’t really separate “benefit from architecture” vs “benefit from just having more structure/tools than a vanilla single call.”
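
By "hand-written heuristics" I mean something as dumb as this (obviously simplified; the pattern and chunk size are arbitrary choices):

    import re
    from collections import Counter

    def chunks(text: str, size: int = 8_000):
        for i in range(0, len(text), size):
            yield text[i:i + size]

    def scan_and_aggregate(corpus: str, query_pattern: str) -> Counter:
        # No learning anywhere: chunk + scan + merge, the way agent
        # authors hand-roll it today.
        pattern = re.compile(query_pattern, re.IGNORECASE)
        hits: Counter = Counter()
        for c in chunks(corpus):
            hits.update(m.group(0).lower() for m in pattern.finditer(c))
        return hits

If the RLM can't clearly beat that kind of baseline, the win is mostly "structure and tools", not "learned policy".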

On safety / robustness: giving the model a persistent Python REPL and arbitrary pip is powerful, but it also dramatically expands the attack surface if this ever runs on untrusted inputs. Are you treating RLM as strictly a research/eval harness, or do you envision this being exposed in production agent systems? If the latter, sandboxing guarantees and resource controls probably matter as much as reward curves.
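
Even basic hard resource caps on the REPL's subprocess would be table stakes. A POSIX-only sketch of what I mean (real sandboxing needs much more, e.g. namespaces/seccomp or a VM):

    import resource
    import subprocess

    def _limits():
        resource.setrlimit(resource.RLIMIT_CPU, (5, 5))             # 5s CPU
        resource.setrlimit(resource.RLIMIT_AS, (512 * 2**20,) * 2)  # 512 MB

    proc = subprocess.run(
        ["python3", "-c", "print('untrusted REPL code runs here')"],
        preexec_fn=_limits, capture_output=True, text=True, timeout=10,
    )
    print(proc.stdout)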


The beauty of Suno, at least for me, was the opportunity to turn my original lyrics into listenable music for free, without having it attached in any way to any of the big labels, who are evil to the core. I really hope they keep the existing user experience intact.


I don’t see how Suno is less evil if you consider the labels evil.


Nothing has “killed Ruby on Rails”.

Ridiculous comment.


I'm honored to have you of all people make that comment.

But with all due respect, the excitement and job market for Ruby aren't anything close to what they used to be:

[0]: https://trends.google.com/trends/explore?date=all&q=%2Fm%2F0...


Location: Mexico City (US Citizen)

Remote: Yes (Remote Only)

Willing to relocate: No

Technologies: Ruby on Rails, AI

Resume/CV: https://www.linkedin.com/in/obiefernandez/

Email: obiefernandez@gmail.com

Hello, I'm one of the original evangelists for Ruby on Rails and the author of The Rails Way as well as Patterns of Application Development Using AI. Over the past three decades, I’ve led teams and built products at every scale — from early-stage startups to global platforms — combining deep technical expertise with a creative, forward-looking approach to software craftsmanship.

I bring 30 years of hands-on engineering experience, including senior leadership in architecture, AI integration, and product strategy. Whether working as an individual contributor or guiding organizations through transformation, I focus on delivering clarity, velocity, and sustainable innovation. My last gig was leading AI strategy related to Developer Experience at Shopify.

Currently evaluating consulting and permanent opportunities, with a preference for an executive leadership position at a larger company, although I will consider consulting and fractional-CTO-type roles for startups and smaller ventures if the project and team are interesting enough.


Big news in the music AI space. Interesting and potentially worrying implications for Suno, which has pulled far ahead in the race and recently announced a $150M ARR milestone.


Tobi Luetke at Shopify too


My biggest TIL takeaway from that article was an "oh wow" moment:

> The other sound that ‘ȝ’ once spelled is the “harsh” or “guttural” sound made in the back of the mouth, which you hear in Scots loch or German Bach. This sound is actually the reason for the most famous bit of English spelling chaos: the sometimes-silent, sometimes-not sequence ‘gh’ that you see in laugh, cough, night, and daughter. Maybe one day I’ll tell you that story too.


Lachen, Nacht and Tochter (I don't know a cognate for 'cough') still have this sound in Standard German.


'cough' could share a root with 'keuchen' (IANAL)


That has a different sound though. But yes, it might be a cognate.


In Dutch there is an even harder 'g' sound.


Is it less hard than the ‘k’ sound?


Yes, more in the back of the throat. One particularly nasty form is the one in 'Scheveningen'. The Scottish version comes close, for instance in 'Loch'.


I personally am more fond of provoking an "angstschreeuw" in English speakers by asking them to pronounce "slechtstschrijvend" or "zachtstschrijdend" and watching them recoil in horror at the consonant clusters[0][1].

[0] https://en.wikipedia.org/wiki/Consonant_cluster

[1] https://nl.wikipedia.org/wiki/Medeklinkerstapeling (Dutch wiki page for consonant clusters with more examples)


> [1]

It's funny: I had just started reading and understanding the first paragraph before recognizing that it's a foreign language I don't know at all.


Those are funny!


I've heard that Knecht (servant in German) is the same word as Knight in English.


Nonstarter if you can't use it with the Max plan.


Jose commented on that elsewhere:

> I'd love to integrate with whatever model subscription is available but it seems using Max outside of Claude products is against their terms. I suggest reaching out to Anthropic and letting them know you would like to use your Max subscription with other coding agents.


Hey, if you don't mind updating this, can you please allow the tempo to go as high as 150 bpm?


Hey, sure! I forgot it was limited to 130, it's been a few years! I've just updated it.


That might tickle your tinrib. If you want to stay up forever, maybe go to 160 bpm. Or even some industrial-strength 200 bpm.


And also, different tempos per instrument :)

