
Something I do sometimes is:

- Have an AI chat model come up with an answer to a problem.

- Have it write a report discussing the details of the problem and why its answer is correct, directed at a person or AI model who has no knowledge of the initial problem or technical field.

- Have a second AI model with no knowledge of the problem grade the report, and write its own report either (a) asking for clarification / more information about the problem that the original model didn't provide or (b) pointing out an inconsistency in the argument posed by the original model. Give this report back to the original model and ask it to write its own report back with either the necessary information or changes.

- Repeat until either the second AI model is convinced by the first AI model's explanation or the first AI model has implemented all the changes requested by the second AI model.

It's super clunky but has given pretty good results in the cases where I tried it lol
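
Roughly, the loop looks like the sketch below. chat(model, messages) is a hypothetical helper standing in for whatever API you use, and the CONVINCED marker is just one assumption about how you'd detect convergence:

    # Rough sketch of the two-model report/critique loop described above.
    # chat(model, messages) is a hypothetical helper that returns the
    # assistant's reply as a string -- swap in your actual API client.

    def debate(problem, solver="model-a", reviewer="model-b", max_rounds=5):
        answer = chat(solver, [
            {"role": "user", "content": f"Solve this problem:\n{problem}"}])
        report = chat(solver, [
            {"role": "user", "content":
             "Write a report explaining the problem and why this answer is "
             "correct, for a reader with no knowledge of the problem or field.\n\n"
             f"Problem: {problem}\n\nAnswer: {answer}"}])

        for _ in range(max_rounds):
            # The reviewer only ever sees the report, never the original problem.
            critique = chat(reviewer, [
                {"role": "user", "content":
                 "Grade this report. Either ask for missing information or "
                 "point out an inconsistency in its argument. Reply CONVINCED "
                 f"if neither applies.\n\n{report}"}])
            if "CONVINCED" in critique:
                break
            # The solver revises its report in light of the critique.
            report = chat(solver, [
                {"role": "user", "content":
                 "Revise your report to address this critique.\n\n"
                 f"Report:\n{report}\n\nCritique:\n{critique}"}])
        return report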



Ah, now we know why Spain was out of electricity yesterday.


Here I was thinking cryptocurrency pre-heated the grids (and GPU manufacturing) for us already.


Oh that was a good one XD


For anything semi-adversarial, I have had good results asking the AI to come up with a plan, then having it take the side of the opponent and come up with counter-play / ways to defeat the plan, and finally asking for a revision of the initial plan given the potential reaction from the opponent.

The final plan you obtain is generally a lot more well rounded and thought out.

I find that amusing because the technique also works when I apply it to me. Picking flaws in your plan before revisiting it actually works.
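
The chain is basically three prompts; a sketch (chat() is a hypothetical single-turn helper and the prompt wording is just an assumption):

    # Sketch of the plan -> opponent's counter-play -> revision chain.
    # chat(prompt) is a hypothetical single-turn helper returning a string.

    def adversarial_plan(goal):
        plan = chat(f"Come up with a plan to achieve: {goal}")
        counter = chat(
            "Take the side of the opponent. How would you counter or "
            f"defeat this plan?\n\n{plan}")
        revised = chat(
            "Revise the original plan to account for this likely "
            f"reaction.\n\nPlan:\n{plan}\n\nOpponent's counter:\n{counter}")
        return revised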


To be honest, this is what I assumed this repo was doing from the title. It talks about arguing with itself, but it looks like it's just generating multiple alternative responses in parallel and selecting the best one.

Do you find your method handles "sycophancy" well?


I don’t really know.

I stopped using ChatGPT at some point because I disliked how cagey it became about a lot of topics. I used to enjoy making it write improbable movie mashups when GPT-3 was released, and at some point it became very touchy about IP rights and violence, which was annoying.

I generally use Deepseek nowadays which is not sycophantic and surprisingly doesn’t seem as censored to me especially if you use a version not hosted by Deepseek themselves.


Which hosting service would you recommend?


I do the same, and I have one other technique.

I will often have a few chats going for a project, but with different contexts. For example, one might be tech focused, another marketing focused, another with some context on my personal goals, etc.

So I will take the same question and feed it into the chats with differing context. It is almost like having different perspectives on the same problem. And the conclusions can often differ based on the differing contexts.
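
Mechanically it amounts to something like this sketch, where chat(system, question) is a hypothetical helper and the context strings are placeholders:

    # Sketch: ask the same question against several differently-primed chats.
    # chat(system, question) is a hypothetical helper returning a string.

    contexts = {
        "tech": "You are advising on the technical architecture...",
        "marketing": "You are advising on positioning and marketing...",
        "personal": "Here are my personal goals for this project...",
    }

    question = "Should we build the mobile app before the web version?"

    answers = {name: chat(system, question)
               for name, system in contexts.items()}

    for name, answer in answers.items():
        print(f"--- {name} perspective ---\n{answer}\n")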


This is how I’ve been using Gemini and it’s the first time I’m really seeing consistent value.

I’ll get a context into a solid place with as much information as I can about a project. Usually getting up to 100k tokens.

Then I ask it to give me a summary I can use in a fresh chat, one that will maintain the current context. This lets me reclaim space, bring responsiveness back to sane levels, and have a baseline chat I use to spin up branches for marketing, design (it's pretty helpful at troubleshooting Substance Designer graphs), etc.

I’ve found myself going into sub branches from there… like a marketing context that pushes branches into different marketing channels.
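
In rough code the hand-off looks something like this (a sketch; chat() is a hypothetical helper taking a message list, and the prompt wording is an assumption):

    # Sketch of the "summarize, then branch" workflow.
    # chat(messages) is a hypothetical helper; `history` is the long
    # (~100k-token) conversation built up so far, as a list of messages.

    summary = chat(history + [
        {"role": "user", "content":
         "Summarize everything relevant so far into a briefing I can "
         "paste into a fresh chat to preserve this context."}])

    def branch(focus, question):
        # Each branch starts from the compact summary, not the full history.
        return chat([
            {"role": "user", "content": f"{summary}\n\nFocus: {focus}"},
            {"role": "user", "content": question}])

    marketing = branch("marketing", "Which channels should we test first?")
    design = branch("design", "Why is this Substance Designer graph noisy?")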


This reminds me a lot of the YT video that went over using Monte Carlo Tree Search with LLMs to maximize result quality. Link: https://www.youtube.com/watch?v=mfAV_bigdRA&ab_channel=Treli...

It seemed like a pretty good idea, though I'd guess that it would greatly increase token usage. I'd also be concerned that the LLM as a judge might struggle to grade things accurately if it wasn't also able to generate good enough answers to begin with.
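
The general shape of MCTS over LLM outputs is roughly the sketch below (all assumptions on my part, not the video's code: generate() extends a draft, judge() is an LLM-as-judge returning a score in [0, 1], and the tree policy is plain UCT). The token-usage concern shows up directly, since every iteration costs at least one generation plus one judge call:

    # Very rough sketch of MCTS-style search over LLM drafts.
    # generate(draft) -> a new/extended draft (hypothetical LLM call)
    # judge(draft)    -> score in [0, 1]     (hypothetical LLM-as-judge call)
    import math, random

    class Node:
        def __init__(self, draft, parent=None):
            self.draft, self.parent = draft, parent
            self.children, self.visits, self.value = [], 0, 0.0

        def uct(self, c=1.4):
            if self.visits == 0:
                return float("inf")
            return (self.value / self.visits +
                    c * math.sqrt(math.log(self.parent.visits) / self.visits))

    def search(prompt, iterations=30, branching=3):
        root = Node(prompt)
        for _ in range(iterations):
            # Selection: walk down by UCT until reaching a leaf.
            node = root
            while node.children:
                node = max(node.children, key=Node.uct)
            # Expansion: ask the LLM for a few candidate continuations.
            if node.visits > 0:
                node.children = [Node(generate(node.draft), node)
                                 for _ in range(branching)]
                node = random.choice(node.children)
            # Simulation: the judge scores the draft (this is the token-hungry part).
            reward = judge(node.draft)
            # Backpropagation.
            while node:
                node.visits += 1
                node.value += reward
                node = node.parent
        best = max(root.children, key=lambda n: n.visits)
        return best.draft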


If you think about marginal cost, such experiments can be run for little more than the cost of the extra electricity used for that computation, which in Europe is often effectively zero, at least for those who own the compute.


Kagi’s Assistant feature makes this super easy. Just switch assistants and ask them to check the other’s work.


How?


Ask the AI assistant for instructions.

Pretty soon we'll have new acronyms such as "IDKATFAIA" ["I don't know, ask the f'ing AI already"] as we all succumb to the knowledge soup.


RTFP


Read The Fine Prompt, more or less, right?


Honestly, the AI assistant isn't as smart as I thought - I'm still having to check its work.


I do it all the time in SillyTavern in a group chat: three characters kind of resembling what you just described, plus me, participating in the "conversation", with them going back and forth until they're satisfied.

With a good model role-playing them, it works awesome.


Were there any situations where the AI's first conclusion was completely changed? Can you give some general examples of situations where it changed or significantly improved the overall result? It sounds cool.


I would be interested to know how often "oscillations" occur, where they flip-flop from being too "agreeable" to challenges (which is probably just a sparse latent space). This happens to me pretty frequently: you can repeatedly say "no, that's wrong" and the LLM will do a 180, explaining why it was "in fact" wrong and you are "right", repeat.


Isn't this kind of another way of doing inference-time scaling? It basically produces several chains of thought and then pursues the one that has maximum reward based on an internal function?
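
The degenerate best-of-n version of that is just the sketch below, where generate() and reward() are hypothetical stand-ins for sampling and the internal scoring function:

    # Sketch of best-of-n inference-time scaling:
    # sample several chains of thought, keep the highest-reward one.
    # generate(prompt) and reward(candidate) are hypothetical helpers.

    def best_of_n(prompt, n=8):
        candidates = [generate(prompt) for _ in range(n)]
        return max(candidates, key=reward)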


I've wondered if it might be helpful to randomly "shard" training data between two LLMs; just feed half the training data to one, and the rest to the other, with no overlap.

So instead of using two models, you'd be making two halves of one model do a similar (deliberative) process to yours. I wonder if that would result in a benefit over a single model with the full training set, and if you could continue to do the same thing by sharding the shards.
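
As a toy sketch of what that might look like (train(), answer(), and critique() are hypothetical stand-ins; corpus and problem are assumed inputs):

    # Toy sketch: shard the training data 50/50 with no overlap,
    # train two models, then have each grade the other's answers.
    # train(), answer(), and critique() are hypothetical stand-ins.
    import random

    random.shuffle(corpus)
    half = len(corpus) // 2
    model_a = train(corpus[:half])    # sees only the first shard
    model_b = train(corpus[half:])    # sees only the second shard

    draft = answer(model_a, problem)
    review = critique(model_b, draft)  # the grader never saw model_a's shard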


There's some precedent for that: you can do some useful things with the cross entropy of the two models. And k-fold cross validation might also be relevant.
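
For example, you can compare the two models' next-token distributions directly on held-out text. A sketch using Hugging Face transformers (the model names are placeholders, and it assumes both models share a tokenizer/vocabulary):

    # Sketch: cross-entropy of model B's predictions under model A's
    # distribution, averaged over a text -- one rough measure of how much
    # two disjointly trained models disagree.
    # Model names are placeholders; assumes a shared tokenizer/vocabulary.
    import torch
    import torch.nn.functional as F
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tok = AutoTokenizer.from_pretrained("model-a")
    model_a = AutoModelForCausalLM.from_pretrained("model-a")
    model_b = AutoModelForCausalLM.from_pretrained("model-b")

    ids = tok("some held-out text to compare on", return_tensors="pt").input_ids
    with torch.no_grad():
        logits_a = model_a(ids).logits   # shape (1, seq, vocab)
        logits_b = model_b(ids).logits

    p_a = F.softmax(logits_a, dim=-1)
    log_p_b = F.log_softmax(logits_b, dim=-1)
    cross_entropy = -(p_a * log_p_b).sum(dim=-1).mean()
    print(float(cross_entropy))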


This takes such a long time to do though, no? What problems does this save you time on?


I don't understand, is it doing your schoolwork?



