
People are focusing on chess, which is complicated, but LLMs fail at even simple games like tic-tac-toe, where you'd think that if they were capable of "reasoning" they would be able to understand where they went wrong. That doesn't seem to be the case.

What they can do is write and execute code to generate the correct output, but isn't that cheating?




Which SOTA LLM fails at tic-tac-toe?


I don't know, but it's not a hard test: get the LLM to play a perfect game of tic-tac-toe against itself, look at the output, and see if it goes wrong.
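For reference, the "perfect game" baseline is easy to pin down programmatically. A minimal sketch, assuming nothing about any particular LLM harness: two minimax players play tic-tac-toe against each other, and since optimal play from both sides always ends in a draw, any decisive result would flag a mistake. All function names here are illustrative.

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board stored as a 9-char string.
LINES = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def best_move(board, player):
    """Return (score, move) for `player`: +1 win, 0 draw, -1 loss."""
    w = winner(board)
    if w:  # the previous mover just won, so `player` has lost
        return -1, None
    if ' ' not in board:
        return 0, None
    opponent = 'O' if player == 'X' else 'X'
    best = (-2, None)
    for i, cell in enumerate(board):
        if cell == ' ':
            nxt = board[:i] + player + board[i+1:]
            score, _ = best_move(nxt, opponent)
            if -score > best[0]:  # negamax: opponent's loss is our gain
                best = (-score, i)
    return best

def self_play():
    board, player = ' ' * 9, 'X'
    while winner(board) is None and ' ' in board:
        _, move = best_move(board, player)
        board = board[:move] + player + board[move+1:]
        player = 'O' if player == 'X' else 'X'
    return winner(board)  # None means a draw

print(self_play())  # perfect play from both sides: prints None
```

Comparing an LLM's move at each position against `best_move` (or just checking the game ends drawn) gives an objective pass/fail without any human judging.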




