Hacker News

The mistake you make here is to forget that the training data of the original models was also _full_ of errors and biases — and yet they still produced coherent and useful output. LLM training seems to be incredibly resilient to noise in the training set.
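A toy sketch of why averaging over lots of data washes out label noise (ordinary least squares here, not an LLM; the 20% corruption rate, the true weight of 3.0, and all variable names are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Generate a clean "dataset" following y = 3x, then corrupt 20% of the
# labels with large additive noise, mimicking errors in a training corpus.
n = 10_000
x = rng.normal(size=n)
y = 3.0 * x
bad = rng.random(n) < 0.2                         # 20% corrupted examples
y[bad] += rng.normal(scale=5.0, size=bad.sum())   # garbage added to labels

# The least-squares estimate averages over all examples; the zero-mean
# noise largely cancels and the fit still lands near the true weight 3.0.
w = (x @ y) / (x @ x)
print(f"recovered weight: {w:.2f}")
```

The noise only inflates the estimator's variance, which shrinks as the dataset grows; the analogy is loose, but it captures why a large corpus can tolerate a substantial fraction of bad examples.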


