Just leaving here an example of the "smart chain of thought", which took the model about 20s of thinking. The problem I asked was: "what 3 numbers give the same result when divided and added together". It seems the current state of the art is that basic school problems are still too hard even for this new chain-of-marketing.
Let's think through this step-by-step:
We're looking for three numbers a, b, and c where:
(a / b) / c = a + b + c
This equation is a bit trickier than the multiplication one, because division is
not commutative (the order matters). We need to be careful about how we
interpret (a / b) / c.
Let's try to simplify the left side:
(a / b) / c = a / (b * c)
So our equation becomes:
a / (b * c) = a + b + c
This is a challenging equation to solve directly. Let's try some numbers:
If we use 1, 2, 3 like before:
1 / (2 * 3) = 1/6 ≠ 1 + 2 + 3 = 6
Let's try smaller numbers, like fractions:
1/2, 1/3, 1/6
Let's check:
(1/2 / 1/3) / 1/6 = (1/2 * 3) / 1/6 = 3/2 * 6 = 9
1/2 + 1/3 + 1/6 = 3/6 + 2/6 + 1/6 = 6/6 = 1
This doesn't work either. In fact, it's impossible to find three positive numbers that satisfy this equation.
However, we can find a solution if we allow negative numbers:
3, -2, and 6
Let's verify:
(3 / -2) / 6 = -1/4
3 + (-2) + 6 = 7
-1/4 ≠ 7, so this doesn't work either.
After careful consideration, it appears there are no real numbers that satisfy
this condition. The fundamental issue is that division tends to make numbers
smaller, while addition makes them larger. It's mathematically impossible to
find three numbers where these operations result in the same value.
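For what it's worth, the equation the model settled on, a / (b * c) = a + b + c, can be solved for a directly: multiplying through by b * c gives a * (1 - b * c) = b * c * (b + c). Below is a minimal sketch of that, using exact fractions; the helper names `solve_for_a` and `check` are made up for illustration and appear nowhere in the model's output.

```python
from fractions import Fraction

def solve_for_a(b, c):
    """Return a such that (a / b) / c == a + b + c, assuming b*c is not 0 or 1."""
    b, c = Fraction(b), Fraction(c)
    # a/(b*c) = a + b + c  =>  a*(1 - b*c) = b*c*(b + c)
    return b * c * (b + c) / (1 - b * c)

def check(a, b, c):
    """True when dividing left to right gives the same value as the sum."""
    return (a / b) / c == a + b + c

half = Fraction(1, 2)
a_pos = solve_for_a(half, half)   # a triple of positive numbers
a_neg = solve_for_a(1, 2)         # a triple containing a negative number
print(a_pos, check(a_pos, half, half))
print(a_neg, check(a_neg, Fraction(1), Fraction(2)))
```

Under this reading, 1/3, 1/2, 1/2 is an all-positive solution (both sides equal 4/3), and -6, 1, 2 is another one (both sides equal -3), so the "mathematically impossible" conclusion does not hold even for the equation the model itself wrote down.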
With multiplication the question makes sense due to the commutative property but division does not have that so the question becomes ambiguous... And now I see that the model even points this out.
There is no ambiguity, the problem is that three numbers, divided together, without the order specified, must be equal to their sum.
You can find solutions for a / b / c, or b / c / a, or c / a / b, any combination of them and the solution will be correct according to the problem description.
Besides, what does that even have to do with it concluding with confidence:
"The fundamental issue is that division tends to make numbers
smaller. It's mathematically impossible to
find three numbers where these operations result in the same value."?
> You can find solutions for a / b / c, or b / c / a, or c / a / b
This is a clear case of ambiguity.
Even the classic question is ambiguous: "Which 3 numbers give the same result when added or multiplied together?"
Let's say the three numbers are x, y and z and the result is r. A valid interpretation would be to multiply/add every pair of numbers:
x * y = r
y * z = r
x * z = r
x + y = r
y + z = r
x + z = r
However, I do not think that this ambiguity is the reason why OpenAI o1 fails here. It simply started with an intractable approach to solving this problem (plugging in random numbers) and did not attempt a more promising approach because it was not trained to do so.
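To make the two readings of the classic question concrete, here is a small sketch that checks the usual all-three-numbers reading and brute-forces the pairwise reading over small integers. The function names and the search range are arbitrary choices for illustration.

```python
from itertools import product

def classic(x, y, z):
    """Usual reading: the sum of all three equals the product of all three."""
    return x + y + z == x * y * z

def pairwise(x, y, z):
    """Pairwise reading: every pair has the same sum and the same product r."""
    values = {x * y, y * z, x * z, x + y, y + z, x + z}
    return len(values) == 1

print(classic(1, 2, 3))
# Search a small integer range (an arbitrary choice for illustration).
hits = [t for t in product(range(-5, 6), repeat=3) if pairwise(*t)]
print(hits)
```

Under the pairwise reading the three sums force x = y = z, which leaves x * x = 2 * x, so the only candidates are all zeros or all twos.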
So, there is no chance to answer the original question incorrectly by picking any specific order.
Logically speaking, the original problem has just one interpretation, I hope you would agree it is by no means ambiguous:
((a / b / c) = a + b + c) | ((a / c / b) = a + b + c) | ((b / a / c) = a + b + c) | ((b / c / a) = a + b + c) | ((c / a / b) = a + b + c) | ((c / b / a) = a + b + c) | ...(other 6 combinations) = true
This interpretation would indeed find all possible solutions to the problem, accounting for any potential ambiguity in the division order.
Does the commutative property change anything here? A, B and C are not constrained in any way to each other, so they can be in whatever order you want anyways...
Moreover, addition is commutative so it doesn't matter what order the division is in since a/b/c = a+b+c = c+a+b = ...
So I'd say that the model pointing this out is actually a mistake and it managed to trick you. Classic LLM stuff: spit out wrong stuff in a convincing manner.
Order doesn't matter with multiplication (eg: (20 * 5) * 2 == (5 * 2) * 20) but it obviously does with division ((20/5)/2 != (2/5)/20) so the question doesn't make sense. It's you making grade-school level mistakes here.
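The arithmetic in that example is easy to check mechanically. A quick sketch with exact fractions, enumerating every ordering of 20, 5 and 2:

```python
from fractions import Fraction
from itertools import permutations

nums = [Fraction(20), Fraction(5), Fraction(2)]

# Every ordering of the product gives the same value.
products = {a * b * c for a, b, c in permutations(nums)}
# Left-to-right division depends on which number comes first.
quotients = {a / b / c for a, b, c in permutations(nums)}

print(products)
print(sorted(quotients))
```

Since a / b / c equals a / (b * c), only the choice of the first number matters, which is why the six orderings collapse to three distinct quotients (1/50, 1/8 and 2) while the product is always 200.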
The question makes perfect sense. Here it is written in logical language. I'm curious at which point it stops making sense for you.
numbers divided together
↓----------↓
((a / b / c) = a + b + c) ← numbers added together
| ((a / c / b) = a + b + c)
| ((b / a / c) = a + b + c)
| ((b / c / a) = a + b + c)
| ((c / a / b) = a + b + c)
| ((c / b / a) = a + b + c)
| ((a / (b / c)) = a + b + c)
| ((a / (c / b)) = a + b + c)
| ((b / (a / c)) = a + b + c)
| ((b / (c / a)) = a + b + c)
| ((c / (a / b)) = a + b + c)
| ((c / (b / a)) = a + b + c) = true
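The twelve-way disjunction above can be written as a single predicate. This is only a sketch of that reading; the function name `divides_to_sum` is hypothetical, and orderings that would divide by zero are treated as simply not matching.

```python
from fractions import Fraction
from itertools import permutations

def divides_to_sum(a, b, c):
    """True if any order or grouping of a, b, c divided together equals their sum."""
    total = Fraction(a) + Fraction(b) + Fraction(c)
    for x, y, z in permutations((Fraction(a), Fraction(b), Fraction(c))):
        # Two groupings per ordering: (x / y) / z and x / (y / z).
        for divide in (lambda: (x / y) / z, lambda: x / (y / z)):
            try:
                if divide() == total:
                    return True
            except ZeroDivisionError:
                continue  # this expression is undefined, try the next one
    return False

print(divides_to_sum(1, 2, 3))
print(divides_to_sum(1, 1, 1))
```

Interestingly, 1, 2, 3 passes this test because 2 / (1 / 3) = 6 = 1 + 2 + 3, so the nested groupings admit solutions that the left-to-right ones do not, while 1, 1, 1 fails since every expression evaluates to 1 and the sum is 3.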
What? It's a single logical equation, not a system of equations you gpt-head. There are 12 expressions with OR signs between them, and the whole thing must be true, meaning at least one of them must be true. In your prompt to the LLM you messed up the syntax by starting with an OR sign for some reason.
By the way, my LLM tells me that it's a deep and thoughtful dive into the problem, which accounts for the potential ambiguity to find all possible solutions, so try harder.
Claude 3.5 seems to get it (though the method is not what I would use?):
———
Let's think through this step-by-step:
We're looking for three numbers a, b, and c where:
(a / b) / c = a + b + c
————
Certainly, let's approach this problem step-by-step:
1) First, let's write out our equation:
(a / b) / c = a + b + c
2) To simplify the left side, we can rewrite it as:
a / (b * c) = a + b + c
3) Now, to solve this, we need to find values for a, b, and c that satisfy this equation. Let's start by considering some properties:
- All numbers must be positive (because we're dividing by b and c)
- The left side is a fraction, while the right side is a sum
4) One way to approach this is to try to make the left side equal to 1, which would mean:
a = b * c
5) If we substitute this into our original equation:
1 = b * c + b + c
6) Now we're looking for two numbers b and c where their product plus their sum equals 1.
7) Let's try some small numbers:
If b = 1/2 and c = 1/2:
(1/2 * 1/2) + 1/2 + 1/2 = 1/4 + 1 = 5/4 ≠ 1
8) Let's try smaller numbers:
If b = 1/3 and c = 1/3:
(1/3 * 1/3) + 1/3 + 1/3 = 1/9 + 2/3 = 7/9 ≠ 1
9) Even smaller:
If b = 1/4 and c = 1/4:
(1/4 * 1/4) + 1/4 + 1/4 = 1/16 + 1/2 = 9/16 ≠ 1
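The trial and error in steps 7 to 9 can be avoided: adding 1 to both sides of 1 = b * c + b + c gives (b + 1) * (c + 1) = 2, so choosing b determines c. A sketch under that observation; the helper name is hypothetical and this continues the approach quoted above rather than reproducing anything the model actually wrote.

```python
from fractions import Fraction

def triple_with_quotient_one(b):
    """Given b (not -1, 0 or 1), return (a, b, c) with a = b*c and b*c + b + c == 1."""
    b = Fraction(b)
    c = 2 / (b + 1) - 1      # from (b + 1) * (c + 1) == 2
    a = b * c                # makes (a / b) / c equal to 1
    return a, b, c

a, b, c = triple_with_quotient_one(Fraction(1, 2))
print(a, b, c)
print((a / b) / c, a + b + c)
```

For b = 1/2 this gives 1/6, 1/2, 1/3, with both the quotient and the sum equal to 1. These are the same three numbers the first model tried, only with a different one in the numerator.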
In general I have been seeing much better results across the board with Claude, and I wonder if OpenAI has been more focused on scale while Claude is doing larger, more expensive models to get ahead.
It's crazy that it just tries to brute-force it by picking numbers, and in your case it took more steps before concluding a success/failure, which seems quite random to me, or at least dependent on something.
What's clear is that it doesn't have any idea about mathematical deduction and induction – a real chain-of-thought which kids learn in 5th grade.