That those practices are bad does not mean that something as simple & proper as ...

akshatpradhan · on March 17, 2018

> The GDPR is trying to do a good thing, but it goes too far

By asking for:

* Data Classifications?

* Privacy Impact Assessments?

* Access Controls?

* Breach Escalations?

If your business is collecting and processing data on individuals, you should already have these Security 101 basics in place.

stupidcar · on March 17, 2018

It also requires you explain the logic of any algorithmic decision you make.

I.e. if you use any sort of machine learning model, such as a neural network, you have to be able to explain every decision it makes. Given that there's currently no known method to fully explain the outcome of a neural network decision, GDPR would apparently make it illegal for any EU business to use a neural network in any user-facing capacity.

GDPR advocates' answer to this seems to be that, while the regulations might read as such, it hasn't actually been tested in court yet and, who knows, maybe whatever judge it eventually comes before will decide to keep neural networks legal!

So, if you're a company using machine learning in Europe, you just have to wait a few months and keep an eye on court news to determine if your entire strategy is permitted or not. Thank God for the stability and confidence provided to businesses by the single market!

geofft · on March 17, 2018

Why are you storing IP addresses in your logs? The web server my college computer club ran made a point of not storing IP addresses. We would respond to legitimate requests from campus police / the deans / other people who could kick us out of the college (usually when someone was, like, making threats of violence in the comments of a WordPress blog or whatever), but we would often answer with, "No, we don't have that data, it never got logged."

If the GDPR is forcing businesses to abandon dangerous logging practices that they don't really need, it is hardly going too far.

twunde · on March 17, 2018

Most applications will log IP addresses by default. Why? 1) Many security products rely on IP addresses in order to blacklist known malicious users (including fail2ban) and/or detect hacking attempts. Monitoring for stolen credentials will also typically check IP addresses, i.e. why is John showing up as logged in from an unknown IP address when he's in the office with me? 2) It can be useful for debugging. Are those requests going to the right server? Is there one server where something isn't working correctly, i.e. is that server misconfigured? 3) Some businesses are required to prevent screen scraping of certain data. Solutions to prevent that typically use and store IP address information.

geofft · on March 17, 2018

1 can be done by checking but not logging IPs, which we certainly did (we'd tcpdump traffic and throw it in iptables) or logging IPs of malicious requests but not of normal behavior.

2 is what this regulation is intended to stop: you shouldn't be trading off "it might be useful in the future" for "it can be misused by authorized users, or exfiltrated by hackers".

3 seems reasonable, but does that require a retention policy of more than a couple of hours?
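To illustrate point 1 above (checking IPs without logging them), here is a minimal Python sketch. It is not the tcpdump/iptables setup described in the comment; the class name `InMemoryIPGuard`, its thresholds, and the idea of counting failed attempts are all assumptions made for illustration. Addresses live only in process memory and are forgotten once their recent failures expire.

```python
import time
from collections import defaultdict, deque


class InMemoryIPGuard:
    """Hypothetical helper: tracks recent failures per IP in memory only.

    Nothing is written to disk, and an address is forgotten as soon as
    all of its recorded failures fall outside the time window.
    """

    def __init__(self, max_failures=5, window_seconds=600):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)  # ip -> timestamps of recent failures

    def record_failure(self, ip, now=None):
        """Note one suspicious or failed request from this address."""
        now = time.time() if now is None else now
        self._failures[ip].append(now)
        self._expire(ip, now)

    def is_blocked(self, ip, now=None):
        """True if the address has reached the failure limit within the window."""
        now = time.time() if now is None else now
        self._expire(ip, now)
        return len(self._failures.get(ip, ())) >= self.max_failures

    def _expire(self, ip, now):
        """Drop failures older than the window; forget the IP if none remain."""
        timestamps = self._failures.get(ip)
        if timestamps is None:
            return
        while timestamps and now - timestamps[0] > self.window_seconds:
            timestamps.popleft()
        if not timestamps:
            del self._failures[ip]
```

Under these assumptions, well-behaved visitors never appear in any stored record, and a misbehaving address is retained only for the length of the window.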

tombrossman · on March 17, 2018

> Why are you storing IP addresses in your logs?

Both Apache and Nginx (and others?) log IP addresses by default. The expected result of this is that storing IP addresses in server logs is widespread, useful for troubleshooting, and entirely normal.

geofft · on March 17, 2018

This seems like a very good reason to have a law about it: if the only reason you log IP addresses is that it's on by default and you haven't thought about the risks of storing it, you should think about it for your production servers. Otherwise it's purely a risk and has no benefit. (And the risk is to your users, not you, so it makes sense that a law would be needed to give your users the ability to complain about that needless risk.)

If you make the active decision that you specifically want the IP addresses, great, you can do that. Just have a strategy for keeping that sensitive data secure and getting rid of it at some point.
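One way to picture "getting rid of it at some point" is a simple retention purge. The following Python sketch assumes log records are dictionaries with a timezone-aware `timestamp` field; that record shape, the function name `purge_expired`, and the seven-day example period are all hypothetical choices, not anything prescribed by the GDPR or by a particular web server.

```python
from datetime import datetime, timedelta, timezone


def purge_expired(records, max_age, now=None):
    """Return only the records newer than max_age; older ones are dropped.

    `records` is assumed to be a list of dicts, each with a timezone-aware
    datetime under the key "timestamp" (a hypothetical record shape).
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - max_age
    return [r for r in records if r["timestamp"] >= cutoff]


# Example: keep at most seven days of entries that contain IP addresses.
recent = purge_expired(
    [{"timestamp": datetime.now(timezone.utc), "ip": "203.0.113.7"}],
    max_age=timedelta(days=7),
)
```

In practice the same effect is often achieved with log rotation rather than application code; the point of the sketch is only that the retention period is an explicit, chosen number.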

jimktrains2 · on March 17, 2018

What's the risk of storing an IP address? IP addresses aren't people, and most people have a carrier-NATed or rotating address.

The real question is what value does storing the IP address even hold?

alam2000 · on March 18, 2018

The IP address is like a temporary phone number. It cannot pinpoint a person persistently, but the person can be tracked down with the help of the ISP.

Other information you can retrieve from an IP address includes geolocation, using a service such as https://www.ip2location.com

jimktrains2 · on March 18, 2018

It's not a phone number because it doesn't need to resolve back to anything in particular. With carrier-grade NAT, for instance, it could be shared by many people. At best it's shared by a house.

With regard to tracking you with the help of the ISP, if you have someone with those resources after you, it's not you storing an IP address that's their biggest concern.

Geoinformation can also change daily. Figuring it out afterwards isn't reliable.

hotwire · on March 17, 2018

- You can identify trends and patterns in your traffic

- You can figure out where in the world your traffic is coming from

- It can be helpful in responding to security incidents

jimktrains2 · on March 17, 2018

- arbitrary user, session and request IDs

- this can be done at request time, and the address then discarded

- how so? How does knowing the address months later help? Something like fail2ban and other automated systems, sure, but long term logging?
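The first two replies above (arbitrary IDs for traffic patterns, and resolving location before discarding the address) could be sketched roughly as follows in Python. The function `make_log_record` and the caller-supplied `lookup_country` are hypothetical; no particular geolocation service or library is implied.

```python
import uuid


def make_log_record(ip, path, lookup_country):
    """Build a log record that never contains the IP address itself.

    `lookup_country` is a hypothetical caller-supplied function mapping an
    IP string to a coarse location such as a country code. The address is
    used once for that lookup and is not stored in the record.
    """
    return {
        "request_id": uuid.uuid4().hex,  # arbitrary ID for correlating log lines
        "country": lookup_country(ip),   # resolved now, since it may change later
        "path": path,
    }
```

A session or user ID could be added the same way to follow trends across requests without keeping the address.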

chopin · on March 19, 2018

If it doesn't have value, why store it then?

mcintyre1994 · on March 17, 2018

I just watched Wes Bos' Learn Redux course, which was sponsored by Sentry - and he had a little video showing their service. By default it logs the user's IP (or at least did when he recorded) on all events - any client-side error, any messages the developer raises to Sentry from the client side code, any feedback form powered by Sentry the developer uses. I'm sure you can turn it off, but I imagine that kind of thing is all over some companies.

geofft · on March 17, 2018

My new favorite thing is libraries that track every mouse move you make, every scroll action, every key you type, etc., including things you pasted into a textbox by mistake, and send it back over a websocket or something to a third-party service so that the website's UX people later can see how real people interact with their website and optimize it. Google for "website mouse tracking" or "website session replay" to see a bunch of startups that do this (many of whom have ads at the top).

I am incredibly excited for the GDPR to make these products too much of a regulatory burden to be worth considering.

FridgeSeal · on March 18, 2018

They are like privacy-invading rent-seekers of the internet and I too will be pleased to see a lot of these trash companies go under.