
For posterity, GPT-3.5/4's tokenizer had a vocabulary of about 100k tokens (cl100k_base). The benefit of a larger vocabulary is more efficient tokenization (fewer tokens per text, and therefore cheaper/faster), but with heavily diminishing returns: a larger vocabulary makes the model harder to train while typically reducing token usage by only 10-15%.
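
To make the trade-off concrete, here is a minimal sketch using the tiktoken library to compare token counts under the ~100k vocabulary (cl100k_base, used by GPT-3.5/4) and the larger ~200k vocabulary (o200k_base). The sample text is an arbitrary assumption; the exact savings vary by content and language.

    # Sketch: compare token counts under a ~100k and a ~200k vocabulary.
    # The sample text below is arbitrary and only for illustration.
    import tiktoken

    text = (
        "Tokenizer vocabulary size trades training difficulty "
        "against token efficiency. "
    ) * 20

    for name in ("cl100k_base", "o200k_base"):
        enc = tiktoken.get_encoding(name)
        n_tokens = len(enc.encode(text))
        print(f"{name}: {n_tokens} tokens")

For typical English text the larger vocabulary produces somewhat fewer tokens, which lowers cost and latency, but the gain shrinks as the vocabulary keeps growing.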

