This is truly tremendous to watch. Eleven years from TPP, and we're watching the...

IX-103 · on Feb 27, 2025

A few years ago there was another AI that tried to beat Pokemon. It wasn't an LLM. I think it was an LSTM trained with reinforcement learning. It got stuck in Mt Moon.

Right now, Claude has been stuck in Mt Moon for nearly a day. It keeps forgetting where it has been. It also almost always runs from battles instead of changing Pokemon or fighting.

At one point it got stuck in a Pokemon center when it mistook the character's red hat for the red carpet around the exit. It kept pressing down and wondering why it wasn't working. It only broke out of that when it mistakenly concluded it had successfully exited the Pokemon center. Then it wandered around a bit and only realized it was still in the Pokemon center after talking to Nurse Joy.

Philpax · on Feb 27, 2025

You're thinking of https://www.youtube.com/watch?v=DcYLT37ImBY and https://github.com/PWhiddy/PokemonRedExperiments.

> It also almost always runs from battles instead of changing Pokemon or fighting.

I believe this is because all of its Pokemon are on the verge of fainting, so it's trying to conserve them while it tries to find its way out.

> It keeps forgetting where it has been.

I'm wondering if this could be solved with a better harness; on one hand, that hurts the elegance of having one model dedicated to playing the game, but their existing harness is already cheating a little (they have a second LLM for verification). They're frequently compacting what's in context, which means its visual memory is quite poor - that could potentially be a point of improvement?
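To make the harness idea concrete, here is a minimal sketch of a memory kept outside the model's context, so it would survive compaction. Everything in it is hypothetical (the `VisitLog` class, the tile coordinates, the map name); it is not how the actual harness works, only one way a "where have I been" summary could be re-injected into the prompt.

```python
from collections import Counter


class VisitLog:
    """Hypothetical harness-side memory: counts visits per (map, x, y) tile."""

    def __init__(self):
        self.visits = Counter()

    def record(self, map_name, x, y):
        # Called by the harness every time the player lands on a tile.
        self.visits[(map_name, x, y)] += 1

    def summary(self, map_name, top=3):
        # Build a short text line that could be added back to the prompt
        # after the context has been compacted.
        tiles = [(pos, n) for pos, n in self.visits.items() if pos[0] == map_name]
        # Most visited first; ties broken by coordinates for a stable order.
        tiles.sort(key=lambda t: (-t[1], t[0]))
        parts = [f"({x},{y}) x{n}" for (_, x, y), n in tiles[:top]]
        return f"{map_name}: most visited tiles: " + ", ".join(parts)


# Assumed example data: five steps on one floor of Mt Moon.
log = VisitLog()
for x, y in [(3, 4), (3, 5), (3, 4), (3, 4), (2, 5)]:
    log.record("Mt Moon 1F", x, y)

summary_line = log.summary("Mt Moon 1F")
print(summary_line)
```

The trade-off is the one mentioned above: the more state the harness tracks, the less the run says about the model on its own.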

Y_Y · on Feb 27, 2025

Or the converse: feed all of Twitch chat to Claude and see if it can output the correct button presses.

unification_fan · on Feb 27, 2025

You'd have to feed it all of Twitch chat correlated to whatever frame was being streamed at the time and adjusted for network jitter and buffering.

Good luck
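For what it's worth, the simplest version of that correlation could look like the sketch below. It assumes a single fixed stream delay (`delay_s`), which is exactly the part that jitter and buffering break in practice, since the real delay differs per viewer and over time. The function name and the sample data are made up for illustration.

```python
from bisect import bisect_right


def align_chat_to_frames(frame_times, messages, delay_s):
    """Map each chat message to the frame its author was probably looking at.

    frame_times: sorted frame timestamps in seconds (assumed input)
    messages:    list of (sent_at_seconds, text) tuples (assumed input)
    delay_s:     assumed constant stream latency; real latency varies
    """
    aligned = []
    for sent_at, text in messages:
        seen_at = sent_at - delay_s
        # Index of the last frame shown at or before the adjusted time.
        idx = bisect_right(frame_times, seen_at) - 1
        if idx >= 0:  # drop messages that predate the first frame
            aligned.append((idx, text))
    return aligned


frames = [0.0, 1.0, 2.0, 3.0]
chat = [(0.5, "up"), (2.5, "a"), (4.2, "down")]
aligned = align_chat_to_frames(frames, chat, delay_s=1.5)
print(aligned)
```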