Hacker News
simonw | 4 months ago | on: Jagged AGI: o3, Gemini 2.5, and everything after
Right: we effectively all need our own evals for the tasks that matter to us... but writing those evals continues to be one of the least well-documented aspects of using LLMs effectively.
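
To make "our own evals" a bit more concrete, here is a minimal sketch of what a personal eval harness could look like. Everything in it is hypothetical and for illustration only: `EvalCase`, `run_evals`, `pass_rate`, and the `fake_model` stand-in are made-up names, not part of any library, and the two test cases are placeholders for whatever tasks actually matter to you.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    """One task we care about: a prompt plus a check on the model's output."""
    name: str
    prompt: str
    check: Callable[[str], bool]  # returns True if the output is acceptable


def run_evals(model: Callable[[str], str], cases: list[EvalCase]) -> dict[str, bool]:
    """Run every case through the model and record pass/fail per case name."""
    return {case.name: case.check(model(case.prompt)) for case in cases}


def pass_rate(results: dict[str, bool]) -> float:
    """Fraction of cases that passed (0.0 if there are no cases)."""
    return sum(results.values()) / len(results) if results else 0.0


# Hypothetical stand-in so the sketch runs without any API access.
# Replace with a function that calls the model you want to evaluate.
def fake_model(prompt: str) -> str:
    return "4" if "2 + 2" in prompt else "I don't know"


# Placeholder cases; real ones would come from your own workload.
cases = [
    EvalCase(
        "arithmetic",
        "What is 2 + 2? Answer with a number only.",
        lambda out: out.strip() == "4",
    ),
    EvalCase(
        "capital",
        "What is the capital of France? One word.",
        lambda out: "paris" in out.lower(),
    ),
]

results = run_evals(fake_model, cases)
print(results, pass_rate(results))
```

The point of passing the model in as a plain function is that the same cases can be rerun unchanged against each new model, which is roughly what "our own evals" would need in practice. Real checks are usually fuzzier than exact string matches, so treat the lambdas as the part to replace first.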