> How do they know that these are LLM crawlers and not anything else?
I can tell you what it looks like in case of a git web interface like cgit: you get a burst of one or two isolated requests from a large number of IPs each for very obscure (but different) URLs, like a file contents at a specific commit id. And the user agent suggesting it's coming from IPhone or Android.
I can tell you what it looks like in case of a git web interface like cgit: you get a burst of one or two isolated requests from a large number of IPs each for very obscure (but different) URLs, like a file contents at a specific commit id. And the user agent suggesting it's coming from IPhone or Android.