Services like Plausible add time into the mix. So you know that someone visited these 5 pages in 20 min. But you won't know about returning visitors. I think that's a pretty significant difference.
But if what you are saying is true then it's impossible to know how many people visited your website unless you have a banner. What about logs then? Sounds like everybody is happily using those because they count as "legitimate interest", since servers supposedly couldn't work without them, but they hold way more identifying data than what Plausible saves.
> Services like Plausible add time into the mix. So you know that someone visited these 5 pages in 20 min. But you won't know about returning visitors. I think that's a pretty significant difference.
That doesn't make it any less PII. Also, the 20 minutes thing is just a number you plucked out of thin air - it's actually valid for 24 hours.
> But if what you are saying is true then it's impossible to know how many people visited your website unless you have a banner.
No, that's not what I'm saying at all. First of all, that claim is clearly false. If your web server logged only the URL and nothing else, no time, nothing, you would have accurate usage counts for every single part of your site.
For the record, I actually think Plausible attempts to do a good job - it's clear they are trying their best to be privacy focused, not log anything, only provide data in aggregate - that's all good stuff. However, I'm not sure their stance that they don't require consent is valid, because the hash itself is PII. The reason I think the hash is PII is because of how it is being used - to identify an individual user.
Oh, and servers can work perfectly fine without logs. People like logs, but they're by no means necessary.
Logs by themselves aren't necessarily a problem if you have a clear data policy in place, and there is a legitimate use for them. The point is disclosure of the data use, and timely deletion of any data that isn't strictly necessary for the business use. So, you can keep PII around relating to billing for as long as they have a subscription, or as long as you are legally required to keep customer records for. After that, it needs to be deleted. Anything like access logs that you can justify a business need for can be kept, perhaps for a few days or ideally hours until you extract aggregate data, but again you need to state that in your privacy policy, and they should be deleted as soon as reasonably possible.
And as I said before, all you need to do to comply with the law is to make sure you have the user's consent before tracking them. It isn't really that onerous. The question is, if you don't want the user to know how you're tracking them, why not? What are you hiding?
> And as I said before, all you need to do to comply with the law is to make sure you have the user's consent before tracking them. It isn't really that onerous. The question is, if you don't want the user to know how you're tracking them, why not? What are you hiding?
This is a super weird spin on what I said. I work on content-heavy media sites that are not ad-driven. They are either funded by grants (research, journalism) or they present commercial work. Architects, design studios, publishers, writers… All of these clients want to have ballpark numbers of how many people visited the site. Nobody processes or sells this data. It's 10s to 100s of visitors a day. We try to use the most private way we know of.
It's crazy that because of the sick practices of this industry I am suddenly the suspicious one. Some kind of nothing-to-hide fallacy, huh? No, we are not hiding anything. We just don't want an annoying consent banner because of a visitor counter. The ones hiding something are the ones with tricky psycho-designed multi-step consent banners. We just don't want to be in the same bunch just because of a few basic stats.
> All of these clients want to have ballpark numbers of how many people visited the site.
You don't need cookies for that.
Again, as I've said before, you can for instance log data for technical reasons, e.g. wanting to post-mortem a failure or attack, as long as the data is deleted promptly as a matter of course. You shouldn't use the PII in that log for analysis without the user's consent (so for a log file, that means you probably should never use the IP address except for endpoints that are only accessible to logged in users), but the URL they accessed isn't PII (unless you start putting identifying tokens in it).
If you just want ballpark numbers, just extract the URL field only, and count how many times each appears. Obviously, this will give you metrics on how popular each page / asset is, not how many unique users you have. To do that, you have to identify unique users, and to do that you need to have their consent.
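As a rough sketch of what I mean, assuming your server writes something like the Common/Combined Log Format (the regex and the `count_hits` name are just my illustration, adjust for whatever your host actually produces), you only ever look at the request path and throw everything else away:

```python
import re
from collections import Counter

# Matches the request line of a Common/Combined Log Format entry,
# e.g. "GET /about HTTP/1.1". Assumed format; adapt to your own logs.
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+"')

def count_hits(lines):
    """Count hits per URL path, never reading IP, timestamp or user agent."""
    hits = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match:
            # Drop the query string, since that is where identifying tokens
            # tend to end up.
            path = match.group(1).split("?", 1)[0]
            hits[path] += 1
    return hits
```

Feed it the lines of an access log and you get a per-page popularity count, which is a hit count and not a count of unique visitors.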
> We just don't want an annoying consent banner because of a visitor counter.
But the law requires you to get their consent.
> The ones hiding something are the ones with tricky psycho-designed multi-step consent banners.
To be fair, I agree with you. They are deliberately designed to be awful in the hopes that the user will just take the path of least resistance and accept their terms. However, it is still a choice. In the cases when I see such a consent form, I either just close the window or I re-open it in incognito mode so I won't get a persistent cookie if it's something I really want to read.
The point is that the regulatory line needs to be drawn somewhere. The law at the moment says the line is: If PII is required for your site to function, then you must ensure the user knows you're doing it. If PII isn't strictly required for your site to function, but it provides a benefit to your company (usually re-framed as how it ultimately helps the customer), then you must request consent. Both of these cases are covered by the usual kind of popup, but that's why you'll see some that you can disable (like sharing data with partners) and some you can't (like cookies for logging in). But you still need consent.
> We just don't want to be in the same bunch just because of a few basic stats.
Then just collect basic stats like how many hits each page got. That's fine, you don't need cookies or PII for that. Number of active users isn't a basic stat though, as it clearly requires you to distinguish between different users and any process you use to do that creates PII.
Perhaps you should consider just explaining why you want the cookie in your popup. If you word it in such a way that explains that you're only using daily active users as a metric to justify continued funding, you'll probably find most people are totally happy to click accept. A message plus simple ACCEPT / DECLINE is fine, as long as the message makes clear what you're doing. Note that you can set an "essential cookie" in response to them clicking DECLINE as long as you've explained that the website uses essential and non-essential cookies, but obviously it shouldn't contain anything other than a simple accept/decline result.
Nobody is setting any cookie. You know these services are cookieless; instead they use an ip+salt+time hash based on the request the client sends.
The problem with server-side metrics (and why Google Analytics became so popular) is, first, that they pick up lots of noise visits from bots. But more importantly, it's often not possible to implement them because the hosting is handled by an unable/unwilling third party.
I will not jump the gun just yet. We will stay in this gray zone until I see that the authorities have a problem with the approach of Matomo/Plausible. So far I have seen the opposite. If they did, we would remove the analytics entirely, because there is nothing worse than a cookie banner, which instantly annoys users and puts you on the same level as any other mainstream site that does fingerprinted tracking.
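For anyone following along, an ip+salt+time hash of this kind might look roughly like the sketch below. I'm assuming a random salt rotated once a day and a SHA-256 digest; the function names, the choice of inputs and the in-memory salt store are illustrative assumptions, not any particular service's actual code.

```python
import hashlib
import secrets
from datetime import date

# Hypothetical in-memory salt store holding a single day's random salt.
_daily_salts = {}

def salt_for(day):
    """Return the salt for `day`, creating it on first use and dropping older salts."""
    if day not in _daily_salts:
        _daily_salts.clear()  # yesterday's salt is gone, so old hashes can't be recomputed
        _daily_salts[day] = secrets.token_bytes(16)
    return _daily_salts[day]

def visitor_id(ip, user_agent, domain, day):
    """Hash request attributes with the day's salt; the raw IP is never stored."""
    digest = hashlib.sha256()
    digest.update(salt_for(day))
    for part in (domain, ip, user_agent):
        digest.update(b"\x00" + part.encode("utf-8"))
    return digest.hexdigest()
```

The same visitor gets the same ID within one day, and a different one once the salt rotates, which is why you can't see returning visitors across days.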
It's not a clear-cut case with IPs. As you say, if your server logs IPs, that seems to be classified as "legitimate interest" (for security reasons). But if you use that data to track unique users for product dev, marketing etc. reasons, that's not "legitimate interest" anymore. At least, this is my understanding.
For example, it would make stopping a DDoS attack much harder if you needed to anonymize IPs.
Yeah, great point. It's how you process and store the data that's important.
One of the key rights individuals have is to request that ALL PII about them is deleted from all of your records, and you have to comply with this request within a certain timeframe, at most 30 days. This includes backups, logs, everything.
Obviously, it's impractical to try to edit old backups to remove PII, so you have to be careful how you deal with logs in the first place - you might want them to be backed up on another machine with a maximum lifetime of a few days, you might want to not back them up at all and only backup your aggregated data, etc.
Keeping logs for a few days can be justified for, as you say, DDoS mitigation, post-failure root-cause analysis, etc., but the default should be to delete that data as soon as it's no longer useful for that purpose, which for most companies will be a couple of days, maybe another couple for the weekend. You can keep it longer, for instance for active analysis, but the default should be to delete it as soon as possible.
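A "delete by default" policy like that can be as simple as a scheduled job. Here's a minimal sketch, assuming logs sit in one directory as `*.log*` files and that file modification time is a good enough proxy for age; `purge_old_logs` and the 3-day default are my own placeholders, not a recommendation for any specific setup.

```python
import time
from pathlib import Path

def purge_old_logs(log_dir, max_age_days=3, now=None):
    """Delete *.log* files older than max_age_days; return the deleted file names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    deleted = []
    for path in Path(log_dir).glob("*.log*"):
        # Age is judged by modification time, i.e. when the file was last written.
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            deleted.append(path.name)
    return sorted(deleted)
```

Run it from cron or similar once a day. If you need to hold on to a specific log for an ongoing investigation, move it out of that directory deliberately instead of raising the default.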