*The Google Fonts API is designed to limit the collection, storage, and use of e...

bscphil · on June 19, 2022

Thank you for saying this. Memory suggested that this was the case; I think one problem that happens on this site is that people distrust Google so much that they will trust some completely unknown organization that they've never heard of before over one (Google) that has presumably made themselves legally liable if they use your data to track you.

(I would also note to everyone that you can simply disable sending referrers to third parties, which means that even if Google is using this data to track you, they won't know what sites you are visiting unless those sites use very specific combinations of fonts.)

reaperducer · on June 19, 2022

> The Google Fonts API is designed to limit the collection, storage, and use of end-user data to only what is needed to serve fonts efficiently.

There's an awful lot of weasel words in there.

If it was a simple "The Google Fonts API doesn't collect or store any user data" that would be good. But there's so much hidden language in that one sentence.

- "Designed" — Well, it was designed to do that, but it doesn't. After we're caught, we'll put out a press release saying We Can Do Better™.

- "Limit" - It limits the collection. It doesn't prevent the collection. It doesn't not collect any data. It just collects "limited" data. And "limited" is defined by us and can be revised whenever we want.

- "collection, storage, and use of end-user data" has so many ways to be abused.

- "efficiently" — Efficient for who? Google? Google's advertising department? Google's profiling department? What if there's an inefficient way? What if there's a more efficient way, but it gives Google less data?

All this may seem unkind, but Google has earned the planet's distrust. In the early years, Google didn't believe that reputation matters. It does. And that's why the legal departments of billion-dollar companies like the one I work for don't allow us to use Google products.

yunohn · on June 19, 2022

There is no such thing as absolute privacy. By virtue of being a web-hosted service, you will need to interact with the end server, and that already has the potential to expose details like IP, referer, user-agent, etc.

The wording around designing and limiting collection is acknowledging this inherent problem and letting the user know that they’ve done their best to prevent malice.

It’s not weasel wording except for anons who like hating on the internet.

kube-system · on June 19, 2022

You can load fonts with absolute privacy from google by not loading fonts from google.

wewxjfq · on June 19, 2022

Does Chrome send the unique identifier with Google Fonts API requests? If so, they don't need cookies.

jefftk · on June 19, 2022

Are you talking about the x-client-data header (which isn't unique, but is relatively high entropy at <= 13-bits)? [1] It is used for evaluating the effect of experiments that Chrome is running on other Google services, which does include ads. But it is not used for personalization (I wish they would say that publicly).

For example, when I look at a Google Fonts request in Chrome developer tools I see:

    x-client-data: CKe1yQEIkrbJAQiitskBCMS2yQEIqZ3KAQiVocsBCOeEzAEIhKvMAQjys8wBCL+1zAE=
    Decoded:
    message ClientVariations {
      // Active client experiment variation IDs.
      repeated int32 variation_id = [3300007, 3300114, 3300130, 3300164, 3313321, 3330197, 3342951, 3347844, 3348978, 3349183];
    }

Each of those numbers represents an experimental treatment that is currently active for my Chrome instance. (It looks like more entropy because it's multiple values, but they're all derived from a single 13-bit per-instance seed.)
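For anyone who wants to reproduce the decoding above, here is a minimal Python sketch. It assumes the header value is a base64-encoded protobuf message in which field 1 is a repeated varint holding the variation IDs; that layout is inferred from this one payload, not from any published schema. The function names `read_varint` and `decode_variation_ids` are hypothetical.

```python
import base64


def read_varint(buf: bytes, pos: int) -> tuple[int, int]:
    """Read one protobuf base-128 varint from buf starting at pos.

    Returns the decoded value and the position just past it.
    """
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # low 7 bits carry data
        if not byte & 0x80:               # high bit clear: last byte
            return result, pos
        shift += 7


def decode_variation_ids(header_value: str) -> list[int]:
    """Collect every varint stored under field 1 of the payload.

    Assumption: field 1 holds the variation IDs, as seen in this example.
    """
    raw = base64.b64decode(header_value)
    ids = []
    pos = 0
    while pos < len(raw):
        tag, pos = read_varint(raw, pos)
        field_number, wire_type = tag >> 3, tag & 0x07
        if wire_type != 0:
            # This sketch only understands varint fields (wire type 0).
            raise ValueError(f"unsupported wire type {wire_type}")
        value, pos = read_varint(raw, pos)
        if field_number == 1:
            ids.append(value)
    return ids


HEADER = "CKe1yQEIkrbJAQiitskBCMS2yQEIqZ3KAQiVocsBCOeEzAEIhKvMAQjys8wBCL+1zAE="
ids = decode_variation_ids(HEADER)
print(ids)
```

Run against the header value quoted above, this should print the same ten variation IDs shown in the decoded message. A header from another Chrome instance may contain fields this sketch does not handle.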

[1] https://www.google.com/chrome/privacy/whitepaper.html#variat...

pdkl95 · on June 19, 2022

> is relatively high entropy at <= 13-bits

That is only true if we pretend those 13 bits are the only identifying information being sent to Google when requesting a font. The HTTP request is almost certainly being sent to Google wrapped inside an IP packet. For most[1] requests, there are at least 24 additional bits (why 24? see: [3]) of very-identifying data in the IPv4 Source Address field. More fingerprinting can probably be done on other protocol fields, and IPv6 obviously adds an additional 96 bits. Yes, IP addresses are not unique, but ~13 bits is easily sufficient to disambiguate most hosts on a private network behind a typical NAT. Correlating the tuple {IPv4 Src Addr, x-client-data} received on a font request is trivial: it only requires a user to log in to any Google webpage that includes a font request.
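To put rough numbers on the argument above, here is a minimal Python sketch. The 8000-value seed range comes from the Chrome whitepaper excerpt quoted in this thread, and the 24-bit IPv4 figure is the estimate made in this comment, not a measured value. The assumption that seeds are uniform and independent across hosts, and the helper name `expected_seed_collisions`, are illustrative only.

```python
import math

# Low-entropy seed range from the whitepaper excerpt: 0 to 7999.
LOW_ENTROPY_VALUES = 8000

# Estimate from the comment above, not a measured value.
IPV4_IDENTIFYING_BITS = 24

# Bits of information carried by the seed alone (slightly under 13).
seed_bits = math.log2(LOW_ENTROPY_VALUES)

# Bits available when the seed is combined with the source address,
# assuming the two are independent.
combined_bits = seed_bits + IPV4_IDENTIFYING_BITS


def expected_seed_collisions(hosts_behind_nat: int) -> float:
    """Expected number of host pairs behind one NAT sharing a seed.

    Assumes each host draws its seed uniformly and independently.
    """
    pairs = hosts_behind_nat * (hosts_behind_nat - 1) / 2
    return pairs / LOW_ENTROPY_VALUES


print(f"seed alone: {seed_bits:.2f} bits")
print(f"seed + IPv4 estimate: {combined_bits:.2f} bits")
print(f"expected colliding pairs among 20 hosts: {expected_seed_collisions(20):.4f}")
```

Under these assumptions, a household or small office sharing one public address would rarely have two Chrome installations with the same seed, which is the sense in which ~13 bits can disambiguate hosts behind a NAT.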

>> re: your [1]

    A given Chrome installation may be participating in a number
    of different variations (for different features) at the
    same time. These fall into two categories:

       Low entropy variations, which are randomized based
         on a number from 0 to 7999 (13 bits) that's randomly
         generated by each Chrome installation on the first run.

       High entropy variations, which are randomized using
         the usage statistics token for Chrome installations
         that have usage statistics reporting enabled.

How many users have 'usage statistics reporting' enabled, and are therefore eligible for a "High entropy variation"? Is it enabled by default, and thus only disabled by the minority of people who know how to opt out?

[1] Google reports[2] they currently see about a 60%/40% ratio of IPv4/IPv6.

[2] https://www.google.com/intl/en/ipv6/statistics.html

[3] my previous posts on this topic - re: x-client-data https://news.ycombinator.com/item?id=23562285 re: 24-bits-per-IPv4 https://news.ycombinator.com/item?id=15167059

johnchristopher · on June 19, 2022

> Google Fonts logs records of the CSS and the font file requests, and access to this data is kept secure.

and https://www.theregister.com/2022/01/31/website_fine_google_f...

leads me to believe that Google has PI when people visit sites using google fonts.

Even if they don't use it for advertising purposes, long-term log keeping is not required to serve fonts.

It doesn't really matter what the service is doing, they didn't ask for consent to log the IP of people downloading fonts.

To be perfectly clear: it wouldn't keep me from sleeping at night, but font permissions should be bundled with cookie consent, or there should be a permission prompt (just like when loading a YouTube video).

jefftk · on June 19, 2022

"by including Google-Fonts-hosted font on its pages, passed the unidentified plaintiff's IP address to Google without authorization and without a legitimate reason for doing so"

It isn't about whether the IP address was logged, but about whether it was sent. Which is an unavoidable aspect of loading a resource from a server.

johnchristopher · on June 19, 2022

My concern is totally about whether or not the IP is logged, though, and Google's vague language doesn't clear up doubts about that. On the contrary:

> Google Fonts logs records of the CSS and the font file requests, and access to this data is kept secure.

Why does it point out that this data is kept secure if there is no PI in the first place?

hedora · on June 19, 2022

Secure from whom? The mob? China? The US government? Google?

I'm more worried about the last two than the first two. It'd be illegal for them to secure it against US law enforcement, and they don't claim to secure the data they log against access by themselves.

rossjudson · on June 20, 2022

Does that mean that any web page that loads a resource from any other location on the web would be fined the same way?

eurasiantiger · on June 19, 2022

The service serves very fine-grained CSS based on device detection. I’m sure there is some fingerprinting going on.