For me, the kicker: if I'm reading it correctly, over 40% of DNS traffic to the root server they examined is just diagnostic probes from Google Chrome, used to spot DNS resolvers that hijack lookups for nonexistent domains.
We got hit by this issue in March, when our remote users increased more than fivefold and the DNS traffic going through our VPNs started giving our DNS servers a headache. We pinpointed it to this Chrome functionality, which also affects other Chromium-based browsers like the new Edge, and we had to deploy a GPO to disable it.
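In case it saves someone the digging: the GPO boils down to one policy value per browser. A minimal sketch of the equivalent change in Python, assuming the DNSInterceptionChecksEnabled policy documented for Chrome/Edge 80+ (verify the name and paths against your versions):

    # Set the policy to 0 (disabled) for Chrome and Chromium-based Edge.
    # Windows only; needs to run elevated. Paths are the standard policy
    # registry locations for each browser.
    import winreg

    POLICY_KEYS = [
        r"SOFTWARE\Policies\Google\Chrome",
        r"SOFTWARE\Policies\Microsoft\Edge",
    ]

    for path in POLICY_KEYS:
        key = winreg.CreateKeyEx(winreg.HKEY_LOCAL_MACHINE, path, 0, winreg.KEY_SET_VALUE)
        winreg.SetValueEx(key, "DNSInterceptionChecksEnabled", 0, winreg.REG_DWORD, 0)
        winreg.CloseKey(key)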
Some background: I'm talking about ~200k+ remote users. Also, while in the office the load is distributed across tens of DNS servers, on VPN only a fraction of those are used. Furthermore, if I remember correctly, this "feature" in Chrome was enabled in a version that was distributed to our clients maybe a month before the lockdowns, so there was little time to see the effect while clients were still in the office.
In a corporate / enterprise network where the DNS servers are Windows servers (domain controllers, in my experience, most of the time), the best thing you can do is stand up a few instances of <insert favorite DNS server here>, running on Linux, set them up as slaves for your internal zones, and point your users at those servers instead of your Windows servers.
You can also use stub zones to forward traffic for a single subdomain to your AD servers, while the new DNS servers handle recursive queries to the internet themselves.
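If you go that route, a quick sanity check from a client that both halves behave is something like the following (dnspython; the resolver address and hostnames are placeholders for your own):

    # Internal names should come from the slaved zone; everything else via recursion.
    # pip install dnspython
    import dns.resolver

    r = dns.resolver.Resolver(configure=False)
    r.nameservers = ["10.0.1.53"]  # one of the new Linux resolvers (example address)

    print(r.resolve("dc01.corp.example.com", "A").rrset)  # served from the internal zone
    print(r.resolve("news.ycombinator.com", "A").rrset)   # answered by normal recursion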
The last time I saw DNS throughput or performance issues was around 2003 on a network with 200K desktops and servers. That was 17 years ago, and they don't have a problem any more, despite growing in footprint to nearly half a million client machines.
I struggle to understand how DNS can possibly be a performance issue in 2020. In most corporate environments, the "working set" of a typical DNS server will fit in the L3 cache of the CPU, or even the L2 cache.
The amount of network traffic involved is similarly minuscule. If all 200K of your client machines sent 100 requests per second, each 100 bytes in size, all of those to just one server, that adds up to 2 GB/s, or about 16 Gbps -- and real clients issue nowhere near 100 queries a second, so the realistic aggregate is a tiny fraction of that.
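For anyone who wants to redo the envelope math:

    # Deliberately pessimistic inputs from the comment above.
    clients = 200_000
    queries_per_second = 100   # per client; real desktops average far less
    bytes_per_query = 100

    bits_per_second = clients * queries_per_second * bytes_per_query * 8
    print(f"{bits_per_second / 1e9:.0f} Gbps")  # ~16 Gbps in total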
If your DNS servers are struggling with that, get better servers. Or networks. Or IT engineers.
Google Chrome (Linux, Mac, Windows) since version 80
Google Chrome OS (Google Chrome OS) since version 80
Chrome 80: February 4, 2020
and as a clarification
When you connect to the corporate network via VPN, DNS queries are not distributed the way they are when you are in the office. You have X entry points for the VPN, served by Y DNS servers, which is fewer than the total number of DNS servers available on the corporate network. Add to that the vastly increased number of remote users, the VPN technology in use, and the DNS servers themselves. Not that simple, I'm afraid.
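Toy numbers to illustrate the concentration effect (every figure here is made up; the real X and Y are obviously different):

    # Same total query volume, but funneled through far fewer resolvers when on VPN.
    users = 200_000
    queries_per_user_per_sec = 1   # assumed average, probes included
    office_resolvers = 40          # load spread across many DCs on the LAN
    vpn_resolvers = 4              # only the resolvers behind the VPN entry points

    total_qps = users * queries_per_user_per_sec
    print(total_qps / office_resolvers)  # 5,000 qps per server in the office
    print(total_qps / vpn_resolvers)     # 50,000 qps per server on VPN, before any user growth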
Also keep in mind that Edge is Chromium-based now and has the same issue. And since MS is making it the standard, the impact is only increasing.
Sure, lockdown and increased VPN use make sense as to why this got painful in March. However, I expect GP was quibbling with this part of your statement:
>Furthermore, if I remember correctly, this "feature" in Chrome was enabled in a version that was distributed to our clients maybe a month before the lockdowns, so there was little time to see the effect while clients were still in the office.
Which claims that the feature was rolled out recently.
Just thinking for a few seconds, I can think of a number of ways to not repeatedly spam DNS servers while still accomplishing the objective. If you were to send 3 random queries to Google every time you opened your browser, you would quickly get hellbanned behind a recaptcha.
Not saying that we should embark on some quest for retribution against Google. It's just sad.
I bet they just made this as a throwaway and it worked fine when it was 1% of traffic and maybe they just haven't looked at it again. I bet if the APNIC people harassed them they would change it.
No, they couldn't. The whole purpose of these probe requests is to assess whether the DNS server used by a particular client is acting normally (responding with NXDOMAIN if a domain does not exist), so they must be sent to that client's DNS server. Unless that server performs the very hijacking that is to be detected, the queries will inevitably end up at a root DNS server, because no server in the hierarchy knows those domains.
Forcing these probe requests onto Google's DNS would completely defeat their purpose in the first place.
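For concreteness, the check amounts to something like this (a sketch reconstructed from the thread and the article, not Chrome's actual code; uses dnspython):

    # Ask the configured resolver for a few made-up single-label hostnames.
    # An honest resolver answers NXDOMAIN for all of them; if any "resolve",
    # the resolver is rewriting answers for nonexistent domains.
    import random
    import string
    import dns.resolver  # pip install dnspython

    def random_hostname():
        return "".join(random.choices(string.ascii_lowercase, k=random.randint(7, 15)))

    def nxdomain_is_hijacked(probes=3):
        for _ in range(probes):
            try:
                dns.resolver.resolve(random_hostname(), "A")
                return True  # a name that cannot exist got an answer
            except (dns.resolver.NXDOMAIN, dns.resolver.NoAnswer):
                pass         # the expected, honest outcome
        return False

    print(nxdomain_is_hijacked())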
Instead of http://asdoguhwrouyh, they could probe something like http://asdoguhwrouyh.google or anything else in a zone owned by them, so the uncachable traffic would hit only their authoritative name servers and not the root servers.
But then a lying DNS server could easily identify those, and NOT lie about http://*.google -- the reason these requests are entirely random domain names is so they're not easily recognized as probes.
It wouldn't usually help to use 8.8.8.8, but they probably could use their own authoritative servers instead of the root servers. Look up <random chars>.dnstest.google.com or <random chars>.dev or something.
The problem with this is, of course, that a malicious resolver could detect this and NXDOMAIN those queries, while passing others through. I don't see what the incentive would be for ISPs to do that, but ISPs are weird.
They're suggesting you install your own local DNS server --- nothing prevents you from doing that, and just talking straight to the roots instead of through your ISP's DNS server.
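To make "talking straight to the roots" concrete, this is the first hop a local recursive resolver (unbound, BIND, whatever) performs on your behalf; a sketch with dnspython, root server address hard-coded for illustration:

    import dns.message
    import dns.query
    import dns.rdatatype

    # Ask a.root-servers.net directly. You don't get the final answer, just a
    # referral to the .com servers; a recursive resolver follows such referrals
    # down the tree and caches what it learns.
    query = dns.message.make_query("news.ycombinator.com", dns.rdatatype.A)
    response = dns.query.udp(query, "198.41.0.4", timeout=5)

    print(response.answer)        # empty: the root is not authoritative for this name
    print(response.authority[0])  # NS records delegating .com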
The real problem is not your ISP, but rather the fact that the most important sites on the Internet have rejected DNSSEC and aren't signed. DNSSEC can't do anything for you with hostnames in zones that haven't been signed by their operators, and, to a first approximation, every zone managed by a serious security team (with a tiny number of exceptions like Cloudflare, who sells DNSSEC services) has declined to do so.
Do any ISPs intercept upstream requests when running your own recursive resolver? If not, DNSSEC isn’t relevant here and you should be just fine with “only” running your own without requiring DNSSEC.