
I like this prompt for testing LLMs as the problem is easy to reason about but likely doesn't come up a lot in training data:

> I'm playing assetto corsa competizione, and I need you to tell me how many liters of fuel to take in a race. The qualifying time was 2:04.317, the race is 20 minutes long, and the car uses 2.73 liters per lap.

GPT-3.5 gave me a right-ish answer of 24.848 liters, but it did not realize the last lap needs to be completed once the leader finishes. GPT-4 gave me 28-29 liters as the answer, recognizing that a partial lap needs to be added due to race rules, and that it's good to have 1-2 liters of safety buffer.

I prompted Bard today and the three drafts gave three different answers: 18.28, 82.5, and 327.6 liters. All of these were wildly wrong in different ways.



I like that you're solving precisely a problem I face on a daily basis! (I still don't understand why ACC doesn't have a built-in calculator for this.)


Interesting. Regenerating replies does seem to help, for whatever reason. I've noticed the first reply after model loading seems more likely to be inaccurate or a hallucination.

Regenerating a GPT4xAlpaca 30B model from its initial answer of 243L, I got a variation of the following for the next 5 rerolls:

"Based on your qualifying time of 2:04.317 and the length of the race being 20 minutes, we can calculate the number of laps needed for the entire race. Assuming each lap takes approximately 2 minutes (based on the average speed), there will be 20 / 2 = 10 laps during the race.

Using the information provided about the car using 2.73 liters per lap, we can determine the total amount of fuel required for the race. Therefore, it would be advisable to carry at least 10 * 2.73 = 27.3 liters of fuel for this race."


Can you explain how the problem is solved for someone who doesn't understand racing?


Sure thing! If the race is 20 minutes and each lap takes about 2m04s, that means there will be about 9.7 laps until the timer runs out, and you round that up to 10 since a partial lap must be finished. You need 2.73 liters per lap, so the 10 laps will use 27.3 liters total. GPT-4 is correct in suggesting a small safety buffer above that in case fuel usage differs from what you expect.
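For anyone who wants to plug in their own numbers, here is a rough Python sketch of the same arithmetic. The function name and the `buffer_l` parameter are made up for illustration, and it assumes every race lap is run at qualifying pace, which is optimistic but errs on the side of more laps and therefore more fuel.

```python
import math

def fuel_for_timed_race(lap_time_s, race_length_s, fuel_per_lap_l, buffer_l=0.0):
    """Estimate laps and fuel for a timed race (hypothetical helper).

    Assumes every lap takes lap_time_s and that the lap in progress
    when the timer expires must be completed, so laps are rounded up.
    """
    laps = math.ceil(race_length_s / lap_time_s)
    fuel_l = laps * fuel_per_lap_l + buffer_l
    return laps, fuel_l

# Numbers from the prompt: 2:04.317 lap, 20 minute race, 2.73 L per lap.
lap_time_s = 2 * 60 + 4.317
# The 1.5 L buffer is an assumed value inside the 1-2 L range GPT-4 suggested.
laps, fuel_l = fuel_for_timed_race(lap_time_s, 20 * 60, 2.73, buffer_l=1.5)
print(laps, round(fuel_l, 1))
```

With these inputs it should come out to 10 laps and roughly 28.8 liters, which lands in the 28-29 liter range GPT-4 gave. Set `buffer_l` to 0 to get the bare 27.3 liters.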


It's a math word problem, which is the kind of task LLMs don't perform well on. I have no idea why people try stuff like this.


People try stuff like this because it's precisely the kind of problem that AI would be useful for. If one of these models turned out to be really good at it, it would signify that they're now useful for a whole class of problems.


If you want to solve math problems, LLMs are very useful, in exactly the way trained professionals are.

You have the model write code that solves the problem for you.

Would you ask your dad to compute the correlation matrix between 40 thousand vectors? No? Then don't ask an LLM to do it.


I ask ChatGPT / Bard to do all kinds of things I wouldn't ask my dad to do. This is a weird perspective.


Besides, GPT-4 did solve this question perfectly. I like that rather than just involving math, there’s also some real life knowledge needed to give a practical answer.


Because it exposes accuracy problems, as queries often involve implicit math skills.



