Essentially, instead of training on tokens that are "already there" in the text, distillation lets us simulate training data from a larger model: the teacher's predicted distribution over next tokens serves as a soft target for the student.
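The idea can be sketched as follows. This is a minimal illustration, not any particular library's API: the function names and the tiny 4-token vocabulary are hypothetical, and a real setup would use a framework's KL-divergence loss over batches of teacher logits. The key point is that the student's loss is computed against the teacher's full soft distribution, so every vocabulary entry contributes signal, not just the one token that actually occurred.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Scale logits by a temperature, then normalize into a probability
    # distribution; subtracting the max keeps exp() numerically stable.
    z = logits / temperature
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # Cross-entropy of the student against the teacher's *soft* targets:
    # unlike one-hot next-token training, every token in the vocabulary
    # carries some probability mass from the teacher.
    p_teacher = softmax(teacher_logits, temperature)
    log_q_student = np.log(softmax(student_logits, temperature))
    return -np.sum(p_teacher * log_q_student)

# Hypothetical next-token logits over a tiny 4-token vocabulary.
teacher = np.array([4.0, 2.0, 1.0, 0.5])
aligned_student = np.array([4.1, 2.1, 0.9, 0.4])
misaligned_student = np.array([0.5, 1.0, 2.0, 4.0])

# The loss is lower when the student's distribution matches the teacher's.
assert distillation_loss(teacher, aligned_student) < \
       distillation_loss(teacher, misaligned_student)
```

The temperature softens both distributions so that the teacher's relative preferences among unlikely tokens (its "dark knowledge") still reach the student.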