For me, the best Google Analytics replacement has been nothing. Just don’t do analytics at all. Your web site will still work without it. In fact, it will work better!
> Your web site will still work without it. In fact, it will work better!
It objectively won't.
Analytics tell you where your website isn't working, so you can fix it. Buttons you thought were obvious that users are blind to. Pages where nobody scrolls because they didn't realize there was more content. Figuring out where users get stuck because they don't understand the navigation you designed. Etc etc etc.
If you have a hobby website, then sure maybe analytics don't matter. But the idea that sites work better without analytics makes as much sense as saying you'll see better when you wear dark sunglasses.
Once upon a time we did analytics and error analysis by running shell scripts that ran awk, sed and grep over an Apache or nginx access log or error log.
What I am trying to say is that you can still do analytics, even pretty advanced stuff with some more elaborate scripting, if you want. The only thing you need is the access log.
Something which has been largely forgotten ever since tools like Urchin became a thing :)
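To make the access-log approach concrete, here is a minimal sketch in Python rather than awk/sed/grep. It assumes the log is in the common "combined" format that Apache and nginx can write; the regex, the `summarize` function and the sample lines are illustrative, not anything from a real site.

```python
import re
from collections import Counter

# Assumed input: Combined Log Format lines. Adjust the pattern if your
# server uses a custom log format.
LOG_LINE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)

def summarize(lines):
    """Count hits per path and responses per status code."""
    paths, statuses = Counter(), Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m is None:
            continue  # skip lines that don't match the assumed format
        paths[m["path"]] += 1
        statuses[m["status"]] += 1
    return paths, statuses

# Made-up sample lines using documentation-reserved IP ranges.
SAMPLE = [
    '203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "Mozilla/5.0"',
    '203.0.113.5 - - [10/Oct/2023:13:55:40 +0000] "GET /pricing HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
    '198.51.100.7 - - [10/Oct/2023:13:56:01 +0000] "GET /index.html HTTP/1.1" 200 2326 "-" "curl/8.0"',
    '198.51.100.7 - - [10/Oct/2023:13:56:02 +0000] "GET /missing HTTP/1.1" 404 153 "-" "curl/8.0"',
]

paths, statuses = summarize(SAMPLE)
print(paths.most_common(3))     # most requested paths
print(statuses.most_common())   # response codes, e.g. to spot 404s
```

For a real log you would pass an open file object instead of `SAMPLE`, since `summarize` only needs an iterable of lines.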
Except if any of your pages are cached between the eyeball and your server, in which case your server logs don't capture everything that is going on. You can get fancy with web server logs, but depending on what you're trying to understand it may not be the data you need.
<source: did fancy things with logs over the last 25 years, including running multiple tools on the same site in parallel to do comparisons (Analog, AWStats, Urchin, GA, Omniture, homegrown, etc...)>
If you control the cache layer, log it there. If you don't control the cache layer, does a read from the end user cache really count as a separate visit anyway?
There are plenty of situations where someone visiting a page once and someone repeatedly looking at that page over a period of days (even if it is pulled from their browser cache) is an important difference. Obviously it depends on what you're using the data to try to understand.
One of the greatest jobs I ever had from a technical perspective had terabytes of structured access logs hosted on-prem inside of a VPN, with a few small bespoke tools to search through them (and many more pages of commands for common tasks not yet implemented in a UI).
Not a single line of tracking or analytics on the front end; we just tracked everything we cared about at the server level.
That place didn't have any European operations so no GDPR concerns¹, but for what it's worth it was completely... pseudonymous, I think is the term we want? You couldn't link a server entry to an actual user account by any means² but you could group distinct server calls together as coming from the same person. These weren't "server logs" in the sense of IPs or user agents or that kind of thing. More like application logs w/ scrubbed/obfuscated user data just stored in gigantic text files.
¹ To those who would say it doesn't matter, I'd say that laws aren't laws if they can't be enforced and there's no enforcement mechanism for some EU bureaucrat to fine a company with no operations outside of the US.
² I'm sure the technical means existed to do it, especially if you already had access to the logs, but the point is we weren't explicitly storing any PII or data that was linked to a real account. Just actions throughout the apps.
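For anyone wondering what that kind of pseudonymous logging could look like, here is a rough sketch using a keyed hash. To be clear, this is one possible technique and not a description of how that system actually worked; `SECRET_KEY`, `pseudonym`, `log_entry` and `group_by_actor` are all hypothetical names.

```python
import hashlib
import hmac
from collections import defaultdict

# Hypothetical secret. The assumption is that it lives somewhere the
# people reading the logs cannot reach.
SECRET_KEY = b"example-key-kept-away-from-the-logs"

def pseudonym(user_id: str) -> str:
    """Map a real identifier to a stable token via a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(SECRET_KEY, user_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]

def log_entry(user_id: str, action: str) -> dict:
    """Build the record that gets written; the raw user_id is never stored."""
    return {"actor": pseudonym(user_id), "action": action}

def group_by_actor(entries):
    """Group logged actions by pseudonymous actor token."""
    grouped = defaultdict(list)
    for entry in entries:
        grouped[entry["actor"]].append(entry["action"])
    return dict(grouped)

entries = [
    log_entry("alice@example.com", "open_report"),
    log_entry("bob@example.com", "search"),
    log_entry("alice@example.com", "export_csv"),
]
sessions = group_by_actor(entries)
for actor, actions in sessions.items():
    print(actor, actions)
```

The same person always maps to the same token, so calls can be grouped, but the log itself holds no account identifier. As the footnote says, anyone who has both the key and a candidate identifier can recompute the token, so this is pseudonymous rather than anonymous.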
However, if you do this, you will still need to comply with all relevant privacy laws.
For example, in the EU, you need user consent to use server logs that include IP addresses for analytics. You also need to provide post-consent opt-outs and privacy statements and audit logs and all of a sudden you're building another analytics tool.
In the EU, IP addresses are personal data and you need a legal basis for each form of processing. You could make an argument that Fail2Ban falls under legitimate interest, but there is now precedent that analytics must have user consent and another legal basis will not be accepted.
Urchin was acquired by Google and was ultimately sunset in favor of Google Analytics. It supported local and hybrid analytics models, the latter of which arguably evolved into Google Analytics.
That's just not realistic though. People with marketing departments need analytics. Otherwise, they atrophy and reveal to everyone they are not as necessary as everyone was led to believe. People without marketing departments probably never look at the logs the way you do.
True, but for personal/hobby sites you're probably just better off not knowing. Nothing good comes of tying your self-worth to how much attention you think you're getting.
There is nothing to suggest that people who want to measure (and perhaps increase) their publishing reach are “tying [their] self-worth to how much attention [they] think [they’re] getting”.
This is sort of like assuming everyone who is taking photos at a tourist attraction is doing so to show off their holiday for social status.
If your site or content is truly valuable, it is a public good to monitor, analyze, and improve upon its reach and usability.
> Otherwise, they atrophy and reveal to everyone they are not as necessary as led to believe.
In my experience, when analytics and the related ads tracking tools break, Marketing departments are revealed to be much more important than generally believed in the business.
Product people need analytics too. You need to know how many people use each feature to make informed decisions on what needs to be invested in, what should be cut, etc.