Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Thanks!

Our paper [1] is kind of a goofy adversarial thing where we thought "here's this cool metric, how can we break it?". The tokenizers we propose are definitely not tokenizers you should use in practice.

The original paper that proposes the metric is, imo, much more interesting theoretically [2].

[1]: https://aclanthology.org/2024.lrec-main.1469/

[2]: https://aclanthology.org/2023.acl-long.284/



Consider applying for YC's Winter 2026 batch! Applications are open till Nov 10

Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: