They embed biases from the training data, which is taken from the internet at la... | Hacker News


goatlover on April 17, 2024 | on: NPR suspends veteran editor as it grapples with hi...

They embed biases from the training data, which is taken from the internet at large. The models themselves aren't inherently biased; they're just trying to generate the next token or scene. And these models weren't built in the 1950s or by researchers from that era. The models have guardrails added to try to prevent bias (and other content deemed harmful) from being generated.

LamaOfRuin on April 17, 2024

None of these data sets are based on "the internet" as a whole. They are built from a specific subset of it, and the training, reinforcement, and guardrails are in no way neutral (because neutrality isn't a thing).
