Hacker News
transcriptase
on June 29, 2023
on:
Junk websites filled w AI-generated text pulling i...
One of the questions I have is whether models are being trained on the SEO {spam|blogspam|adsense optimized|spun} websites.
duskwuff
on June 29, 2023
Almost certainly. The web crawl data that GPT (and similar) LLMs are trained on is far too large to be entirely curated.