>> In that sense any opt-in choice given to another is yet another privacy breach on their 'contacts' for example.
That is a non sequitur when we are discussing opting in to social science research.
> That is a non sequitur when we are discussing opting in to social science research.
As I understand it, the commenter's point does not rest on 'contact' linking being present. Their point is that any kind of data linking poses a reidentification risk.
Regarding the risk of data linkages, how confident are you that Mozilla and others with access to the data will manage it ...
1. ... up to the currently-accepted level of knowledge (including hopefully some theoretical guarantees, if possible, and if not, mitigations with known kinds of risk) and ...
2. ... that the current level is acceptable given that history of data privacy doesn't paint a rosy picture?
To be open, I'm not interested in your confidence level per se, but rather the reasoning in your risk assessment. I want to weigh the various factors myself, in other words. For example, you appear to have more confidence in IRBs than I do.
Knowing the history of the "arms race" between deidentification and reidentification, I don't put a whole lot of trust in Institutional Review Boards. Many smart, well-meaning efforts have fallen prey to linkage attacks. They are insidious.
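To make the linkage-attack concern concrete, here is a minimal sketch (all data made up) of the classic attack: two datasets that each look harmless on their own are joined on shared quasi-identifiers such as ZIP code, birth date, and sex, reidentifying a "deidentified" record.

```python
# Illustrative linkage attack. Neither dataset alone pairs a name with a
# diagnosis, but joining on quasi-identifiers does. All records are invented.

medical = [  # names removed, so nominally "deidentified"
    {"zip": "02139", "dob": "1945-07-22", "sex": "F", "diagnosis": "hypertension"},
    {"zip": "02139", "dob": "1982-03-04", "sex": "M", "diagnosis": "asthma"},
]

voter_roll = [  # public record, includes names
    {"name": "Jane Doe", "zip": "02139", "dob": "1945-07-22", "sex": "F"},
    {"name": "John Roe", "zip": "02144", "dob": "1982-03-04", "sex": "M"},
]

def link(records_a, records_b, keys=("zip", "dob", "sex")):
    """Join two record sets on quasi-identifier fields."""
    matches = []
    for a in records_a:
        for b in records_b:
            if all(a[k] == b[k] for k in keys):
                matches.append({**a, **b})
    return matches

reidentified = link(medical, voter_roll)
# One match: Jane Doe's diagnosis is now linked to her name, even though
# neither dataset contained that pairing by itself.
```

The point of the sketch is that the attacker needs no special access, only a second dataset that overlaps on a few mundane attributes.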
P.S. In my view, using "non sequitur" here is a bit strong, perhaps even off-putting. It is only a "non sequitur" because you are making different logical assumptions than the commenter. Another approach would be to say "your conclusion only holds if..." This would make your point without being so pointed. It also helps show that you want to understand the other person's assumptions.
> As I understand it, the commenter's point does not rest on 'contact' linking being present. Their point is that any kind of data linking poses a reidentification risk.
It appears that the parent commenter revised their comment to indicate that the concern was indeed “your data getting mixed with my data, when browsing Facebook”, to paraphrase.
My response there was essentially: ethical review would have to determine if all data must be provided through informed consent of all the originating humans.
Held to the gold standard of ethics, an IRB would likely have to reject a research design that did not provide a way for every individual human involved to give informed consent. If any single individual in a data set indicated that they did not consent, then that data set would need to be reshaped to exclude that individual. Failing that, the entire data set would have to be excluded from study.
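The rule described above can be sketched as a filter: keep only records attributable to a single consenting individual, and exclude the whole dataset if any record cannot be attributed that way. The record shape and names here are hypothetical, purely for illustration.

```python
# Hypothetical consent filter. Each record lists the individuals ("subjects")
# whose data it contains; `consent` maps individual -> opted in?

def filter_by_consent(dataset, consent):
    """Return only records of consenting individuals, or None if the
    dataset cannot be reshaped (a record mixes multiple people)."""
    kept = []
    for record in dataset:
        subjects = record["subjects"]
        if len(subjects) != 1:
            return None  # can't cleanly remove one person: exclude the dataset
        if consent.get(subjects[0], False):
            kept.append(record)
    return kept

consent = {"alice": True, "bob": False}
usable = filter_by_consent(
    [{"subjects": ["alice"], "data": 1}, {"subjects": ["bob"], "data": 2}],
    consent,
)
# Only alice's record survives; bob withheld consent, so his is dropped.
```

The hard part, as the next paragraph notes, is that real browsing data rarely decomposes into single-subject records, so the `return None` branch would fire constantly.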
Of course, that has some complex implications when it comes to broad categories of data sources for browser usage: social networking sites would be a minefield. Did the website author consent to their content being machine-analyzed for sentiment, etc., if one really wanted to get down to it? You’d have to consider each and every resource location. You can’t assume that all browser traffic is open web traffic - someone could have left their Rally extension running while navigating a corporate confidential network, content under complex copyrights, etc.
My understanding is that the US Supreme Court is about to decide whether “if you can read it, you can keep it” holds, as a consequence of hiQ Labs v. LinkedIn (a Microsoft subsidiary), so don’t forget the “arms race” of justice, either.
> Many smart, well-meaning efforts have fallen prey to linkage attacks. They are insidious.
Indeed, even just basic double-blind medical studies are hard to defend when you consider operational security, let alone information security.
In case it is of interest, here is a fairly short article with a historical look at data de-identification. If nothing else, it is one jumping-off point.