This wasn't tested in the original paper, but I've found that GPT-4 with Python (Advanced Data Analysis) can often solve these kinds of problems by writing a Python program that searches the solution space:
Here, GPT wrote a Python program that tries every expression of the form (A^B)^(C^D), where each "^" is one of the four basic arithmetic operators (+, -, *, /) and A, B, C, D is a permutation of the given numbers.
It then found the solution (14 - 8) * (8 / 2) = 24, which is correct. It also did so in a relatively small number of tokens (input = 112 tokens, output = 512 tokens, so at least 624 tokens in total once the Advanced Data Analysis prompt is included), whereas AoT would likely require far more (the openai.logs file in this repo, for instance, is 15,306 tokens).
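For reference, the kind of search described above can be sketched roughly as follows. This is my own minimal reconstruction, not the program GPT actually produced: the function name `solve`, the float tolerance `eps`, and the input list `[14, 8, 8, 2]` (inferred from the solution shown) are all assumptions. It only covers the single template (A op B) op (C op D), so it would miss solutions that need a different parenthesization.

```python
from itertools import permutations, product
import operator

# The four basic arithmetic operators, keyed by their symbol.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def solve(numbers, target=24, eps=1e-9):
    """Return every expression (A op1 B) op2 (C op3 D) that evaluates to target.

    Hypothetical helper: tries all orderings of the four numbers and all
    combinations of three operators for this one expression template.
    """
    solutions = []
    for a, b, c, d in permutations(numbers):
        for o1, o2, o3 in product(OPS, repeat=3):
            try:
                value = OPS[o2](OPS[o1](a, b), OPS[o3](c, d))
            except ZeroDivisionError:
                # Skip expressions that divide by zero.
                continue
            if abs(value - target) < eps:
                expr = f"({a} {o1} {b}) {o2} ({c} {o3} {d})"
                # Repeated input numbers produce duplicate expressions.
                if expr not in solutions:
                    solutions.append(expr)
    return solutions


if __name__ == "__main__":
    for expr in solve([14, 8, 8, 2]):
        print(expr, "= 24")
```

Run on `[14, 8, 8, 2]`, the results include `(14 - 8) * (8 / 2)`. The whole search is at most 24 orderings times 64 operator combinations, which is why a short generated program can cover it exhaustively.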