
Anecdata seems quite valid for LLM comparison when trying to evaluate 'usefulness' for users. The lmsys chat leaderboard is literally just mass anecdata.


Yes, "mass anecdata" + blind collection is usually called "data".



