I think GP means that if you internalize the bitter lesson (more data more compu... | Hacker News

itsalotoffun 23 days ago | on: My 2.5 year old laptop can write Space Invaders in...

I think GP means that if you internalize the bitter lesson (more data more compute wins), you stop imagining how to squeeze SOTA minus 1 performance out of constrained compute environments.

reactordev 23 days ago

This. When we ran out of speed on the CPU, we moved to the GPU. Same thing here. The more we work with (22T) models, quants, and decimating precision - the more we learn and find more novel ways to do things.
