ChatGPT Fails at Basic Math


As a fan of the Channel 4 television show 8 Out of 10 Cats Does Countdown, I wanted to see if ChatGPT could re-create the math problem segment from that show.

Contestants are prompted to select 7 numbers, choosing either small or big numbers. Small numbers are 1 to 9 and big numbers are 10, 25, 50, 75, and 100. Once the numbers have been selected, a target number is randomly generated. The contestants then have to use the four basic arithmetic operations to get as close to the target number as possible.
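
The draw described above can be sketched in a few lines of Python. This is only an illustration: the names SMALL_NUMBERS, BIG_NUMBERS and draw_round are made up for this example, drawing without repeats is an assumption, and the 2 big / 5 small split and the 101 to 899 target range are taken from the prompt used later in this post.

```python
import random

SMALL_NUMBERS = list(range(1, 10))     # 1 to 9, as described above
BIG_NUMBERS = [10, 25, 50, 75, 100]    # big numbers as listed above


def draw_round(n_big=2, n_small=5, rng=None):
    """Pick the numbers and a target for one round.

    Assumptions: numbers are drawn without repetition, and the target
    lies between 101 and 899 (the range used in the prompt in this post).
    """
    rng = rng or random.Random()
    selected = rng.sample(BIG_NUMBERS, n_big) + rng.sample(SMALL_NUMBERS, n_small)
    target = rng.randint(101, 899)
    return selected, target


if __name__ == "__main__":
    selected, target = draw_round()
    print("Selected numbers:", selected)
    print("Target number:", target)
```

Passing a seeded random.Random as rng makes a draw repeatable, which helps when comparing several attempts on the same round.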

Experimenting with Math Problems for ChatGPT

I used the text-davinci-003 model, which is intended for text completion, answering questions, and solving problems. I refined the prompt over several passes to check that the AI correctly understood the problem I was posing, starting by defining what Small numbers, Big numbers, and the Target number are.

Prompt:
Small numbers: 1,2,3,4,5,6,7,8,9
Allowed: addition, subtraction, multiplication, division
Randomly select 2 Big numbers and 5 Small numbers and call them Selected numbers. Then select one random number between 101 and 899 and call it Target number.

Response:
Selected numbers: 25, 50, 1, 2, 6, 8, 9
Target number: 567

The response is, as you can see, perfect. I continued by adding to the prompt that I wanted a solution that uses the Selected numbers to get as close as possible to the Target number. At first it looked like ChatGPT was going to solve this problem easily too.

New Prompt:
Randomly select 2 Big numbers and 5 Small numbers and call them Selected numbers. Then select one random number between 101 and 899 and call it Target number. Then list all possible ways to to get the Target number by using the Selected numbers by using the Allowed. Not all of the Selected numbers has to be used to solve the problem. If you cannot find a solution give a solution to the closest you can get to Target number.

Response:
Selected numbers: 25, 50, 1, 2, 3, 4, 5
Target number: 806

Solution 1: (50 + 25) x 16 = 806 
Solution 2: (50 - 4) x 17 = 806 
Solution 3: (25 + 50 + 5) x 16 = 806

At first glance I thought, wow, it can actually do this too. But as you can see, ChatGPT fails pretty badly at basic math: none of the three expressions equals 806. In this instance it even fails to follow the instructions and uses numbers (16 and 17) that were not in the set.
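
Claims like these are easy to check mechanically. The following is a minimal sketch, not a full validator: evaluate and check are hypothetical helper names, and it assumes each solution is written as a single expression with "x" for multiplication. It evaluates the expression with exact fractions and reports any numbers that were not available in the Selected numbers.

```python
import ast
import operator
from collections import Counter
from fractions import Fraction

# The four allowed operations, keyed by their syntax-tree node type.
OPS = {ast.Add: operator.add, ast.Sub: operator.sub,
       ast.Mult: operator.mul, ast.Div: operator.truediv}


def evaluate(expr):
    """Return (value, numbers_used) for an expression using only + - * /."""
    used = []

    def walk(node):
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            return OPS[type(node.op)](walk(node.left), walk(node.right))
        if isinstance(node, ast.Constant) and type(node.value) is int:
            used.append(node.value)
            return Fraction(node.value)
        raise ValueError("only whole numbers and + - * / are allowed")

    tree = ast.parse(expr.replace("x", "*"), mode="eval")
    return walk(tree.body), used


def check(expr, claimed, selected):
    """Compare the real value with the claimed one and flag unavailable numbers."""
    value, used = evaluate(expr)
    extra = Counter(used) - Counter(selected)  # numbers used but not in the set
    return {"value": value, "math_ok": value == claimed,
            "illegal_numbers": sorted(extra.elements())}


if __name__ == "__main__":
    print(check("(50 + 25) x 16", 806, [25, 50, 1, 2, 3, 4, 5]))
```

For Solution 1 this reports a real value of 1200 rather than 806, and flags 16 as a number that was never selected.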

Selected Numbers: 50, 75, 1, 2, 3, 4, 9
Target Number: 567 

Solution 1: 75 + 50 + 4 + 3 + 2 + 1 = 575  (closest solution) 
Solution 2: 75 x 9 - 4 = 681

Here it keeps to the rules, but the math is still wrong: 75 + 50 + 4 + 3 + 2 + 1 is 135, not 575, and 75 x 9 - 4 is 671, not 681.

Selected numbers: 50, 100, 1, 3, 4, 8, 9
Target number: 686 

Solution 1: 50 + 100 + 4 + 8 + 9 = 171 * 4 = 684 
Solution 2: 50 x 100 - 1 - 3 - 4 = 4947 - 8 = 4939

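This time it actually got close. The arithmetic in Solution 1 is correct (171 x 4 = 684, two short of the target), although it uses the 4 twice, and Solution 2 is still wrong (50 x 100 - 1 - 3 - 4 is 4992, not 4947). It took about 20 tries to get this one correct calculation.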

What is interesting is that it seems to follow the instructions flawlessly most of the time, but it does not seem to care that it gives incorrect solutions. I might give this another go by working on the prompt and trying out other values for parameters such as temperature, but for now it seems like we cannot trust ChatGPT when it comes to math.
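
For comparison, a conventional program can solve these rounds by brute force. The sketch below is one simple way to do it and not a tuned solver: solve is a hypothetical name, and keeping only positive whole-number intermediate results is an assumption borrowed from the usual Countdown rules. It repeatedly replaces two numbers with the result of one operation and remembers the value closest to the target.

```python
from itertools import combinations


def solve(numbers, target):
    """Return (value, expression) for the result closest to target.

    Exhaustive search: pick two numbers, combine them with one operation,
    and recurse on the shorter list. Stops early on an exact hit.
    """
    best = (numbers[0], str(numbers[0]))

    def search(items):
        nonlocal best
        for value, expr in items:
            if abs(value - target) < abs(best[0] - target):
                best = (value, expr)
        if best[0] == target or len(items) < 2:
            return
        for i, j in combinations(range(len(items)), 2):
            (a, ea), (b, eb) = items[i], items[j]
            if a < b:  # keep a >= b so subtraction never goes negative
                a, ea, b, eb = b, eb, a, ea
            rest = [items[k] for k in range(len(items)) if k not in (i, j)]
            candidates = [(a + b, f"({ea} + {eb})"), (a * b, f"({ea} x {eb})")]
            if a > b:
                candidates.append((a - b, f"({ea} - {eb})"))
            if a % b == 0:  # only exact division is allowed
                candidates.append((a // b, f"({ea} / {eb})"))
            for cand in candidates:
                search(rest + [cand])
                if best[0] == target:
                    return

    search([(n, str(n)) for n in numbers])
    return best


if __name__ == "__main__":
    print(solve([50, 75, 1, 2, 3, 4, 9], 567))
```

The second example above does have an exact answer, for instance 9 x (75 - 3 x 4) = 567. A full search over seven numbers with no exact solution can take a while in plain Python, so this is meant as a reference check rather than something fast.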
