jakobov on June 27, 2024, on: Gemma 2: Improving Open Language Models at a Pract...

How much faster (in terms of the number of iterations to a given performance) is training from distillation?