The Data Transfer Project

rdiddly · on July 21, 2018

Dupe - https://news.ycombinator.com/item?id=17574707

I must be pretty naive or I've been working with databases too long, but I seem to be in the minority on HN that's actually sort of excited about this project.

larkeith · on July 21, 2018

As the web matures, I suspect better inter-service compatibility is necessary, and this potentially sets the stage for that. I'm optimistic for the future this lays the groundwork for, but am cautious with regard to this project specifically, simply due to the names involved; hopefully more FOSS leaders will become involved, or this version will be superseded by a later framework, as I do not trust the current members to design with sufficient privacy safeguards.

On a side note, this will likely continue the trend of increasing the barrier to entry for more of the internet, hurting startups. Though that's dependent on how the framework is designed.

d4l3k · on July 21, 2018

I feel like this will actually lower the barrier to entry. Currently there's no easy way to migrate from the big 4 websites to other providers. If you were a new startup competitor, having to implement only one import format seems to be a good thing.

larkeith · on July 21, 2018

That's a really good point! I had been thinking of it in terms of something else a new service is expected to support before going live, but if they do support it, it definitely should make it easier to get new users. I'd still be wary of it being overly complex and bloated, but if not, it may be a boon in disguise.

5thbootloader · on July 21, 2018

How might this increase the barrier to entry? I would believe that if a transfer protocol were standardized, the startups could focus on innovating rather than convincing people to give up on software they've grown dependent on.

larkeith · on July 21, 2018

As I mentioned elsewhere, I could see it being something that new services are essentially required to support - if you can easily transfer your data between almost all services, the cost of using one that doesn't support the framework is higher, relatively speaking. However, I hadn't really considered how much easier it may potentially make it to move to a new service, so it may well end up doing the reverse, lowering the barrier.

5thbootloader · on July 21, 2018

I'm not sure about the rest of HN but I'm definitely excited!

0xBA5ED · on July 21, 2018

>We want all individuals across the web to be in control of their data.

...by having us control it for them.

Seriously though, the more they repeat this line the less it convinces me. It simply isn't in their interest to give people _actual_ control over their data, so why would they?

thg · on July 21, 2018

> It simply isn't in their interest to give people _actual_ control over their data, so why would they?

GDPR demands it and it's good PR.

cududa · on July 21, 2018

I think it’s a bit more nuanced - they’re all giving people a full copy of their data, but you can’t do much with that on, say, FB -> Twitter.

This to me smells like preemptive action against legislation. Sort of the companies choosing which limb or organ to give up instead of having it dictated by legislation.

yumraj · on July 21, 2018

They should add a feature which allows me to delete my data from all the supported platforms.

d4l3k · on July 21, 2018

That feature already exists for all the supported platforms.

Google: https://support.google.com/accounts/answer/32046?hl=en

Twitter: https://help.twitter.com/en/managing-your-account/how-to-dea...

Microsoft: https://support.microsoft.com/en-us/help/12412/microsoft-acc...

Facebook: https://m.facebook.com/account/delete

sinnoh · on July 22, 2018

Did you read those links? He said to delete the data, not just the accounts.

politician · on July 21, 2018

Hey folks, a DTP team member responded to a lot of similar comments in the other thread from earlier today. Look for posts from bwillard.

https://news.ycombinator.com/item?id=17574707

joeblau · on July 21, 2018

What data does Microsoft have that one would want to "transfer"?

d4l3k · on July 21, 2018

The current "data models" appear to be Calendar, Contacts, Mail, Photos and Tasks. That fits pretty well with the data from Microsoft Office 365/GSuite. Interestingly enough it doesn't have a model for Tweets/Facebook posts so I'm not really sure those companies are part of this unless there's something more planned.

https://github.com/google/data-transfer-project/tree/master/...

joeblau · on July 21, 2018

You're the MVP for digging through that package list :). Thanks for the answer. I guess I was thinking about the other three companies' social media platforms and not necessarily the data you mentioned.

rvense · on July 21, 2018

Aren't there already perfectly good interchange formats for all these?

cududa · on July 21, 2018

Formats, yes. But if you want to do a massive migration between G Suite and O365, your solution is to hire a vendor or write up some annoyingly complicated script to perform it.

kjeetgill · on July 21, 2018

They own LinkedIn.

Disclaimer: I work for LinkedIn but have heard nothing about this.

joeblau · on July 21, 2018

AH! Another very good point — I didn't think about that either.

manojlds · on July 21, 2018

And GitHub!!!

cududa · on July 21, 2018

Perhaps migrating between O365 and G Apps (a nightmarish process I had the misfortune of experiencing earlier this year).

e9 · on July 21, 2018

That looks pretty scary actually. If mishandled, it will enable someone to steal all of your information from all the platforms.

d4l3k · on July 21, 2018

I assume it requires credentials for each platform, so it doesn't enable anything that they wouldn't already have access to if they could log in to every single one of your accounts.

c0p · on July 21, 2018

It requires credentials, but with a standard format/protocol it means extraction is easier for someone who might happen to acquire such credentials with nefarious intent.

lifeisstillgood · on July 21, 2018

Recently the EU laid out PSD2, under which banks MUST allow third-party access to customers' accounts, statements, etc.

The Data Transfer Project is a small glimmer of the same sort of thing - we own our data, and data providers MUST design themselves around that.

Large data holders cannot live behind walled gardens forever. Specialist data analysers are going to spring up: some will offer budget planning for my accounts, others will track my Twitter feed and let me know of things of interest. The news feed will be of almost as much interest as my bank account. Overall, third-party data wranglers are a good thing.

kyleperik · on July 21, 2018

This sounds odd. What's really the incentive for all of them behind this?

arendtio · on July 21, 2018

My best guess is the GDPR:

> To further strengthen the control over his or her own data, where the processing of personal data is carried out by automated means, the data subject should also be allowed to receive personal data concerning him or her which he or she has provided to a controller in a structured, commonly used, machine-readable and interoperable format, and to transmit it to another controller. Data controllers should be encouraged to develop interoperable formats that enable data portability. [...] Where technically feasible, the data subject should have the right to have the personal data transmitted directly from one controller to another. [1]

To me this sounds pretty similar to what we can see on Github.

[1]: https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX%3A...

stevenicr · on July 21, 2018

I am guessing it's self-regulation, to avoid Cali- and Euro-style for-the-people regulation.

Also our monopolies can be ported easily so there is no need to call us a monopoly or keep mentioning things like make us a public utility that has to follow anyone's rules but our own.

Some positive PR in the sea of "they are evil for this and that and that and..." - hey we're open.

The GDPR thing has a mandate in it that personal data must be exportable, doesn't it? So, two birds, one stone, add some positive PR.

Just guessing..

Let's hope this doesn't turn out how G and Fb bragged about XMPP or whatever until they didn't need it any more, and then your data and comms stopped being as interchangeable as they once were.

tannhaeuser · on July 21, 2018

I think the GDPR mandates that you can view your personal data, but doesn't require a specific format. I'd expect the motivation for this to be more like generating growth, or as a token initiative for data exchange standards.

larkeith · on July 21, 2018

Off the top of my head:

The pessimist in me says it's so each of them can create a more complete user profile for marketing, advertising, and tracking.

My more optimistic side says that Facebook, Google, MS, and Twitter are all tech giants filled with people intimately familiar with the web and its immediate future, and the majority of them are genuinely trying to improve the internet.

As usual, the answer probably includes both factors, and likely others.

tannhaeuser · on July 21, 2018

I was going to post that exact question. It's also surprising they're using Java as a modelling language.

anonytrary · on July 21, 2018

> The Data Transfer Project makes it easy for people to transfer their data between online services. We provide a common framework and ecosystem to accept contributions from service providers to enable seamless transfer of data into and out of their service.

You use product A and B. You tell B it's OK to import data from A in order to "improve the quality of services of B" and similarly for A. Now A and B both have more insight on your personal preferences, so they both win. The user "wins" because they get "improved quality of services".

TangoTrotFox · on July 21, 2018

It's not just more insight on your personal preferences. It's the 100% certainty that one user on two different platforms is the same person. It enables companies to trade and sell your data behind the scenes with much higher levels of reliability. For instance imagine there was another TangoTrotFox on Reddit or whatever, and AI parsing of his statements led to a strong overlap in self identifying statements, vernacular, etc. You still couldn't be 100% sure it was the same person and so collating data would not necessarily be as valuable for advertising companies as it might otherwise be.

And yeah, I completely agree that this is a spot where the user "wins." Congrats, you get spammed with even more heavily targeted ads, political engineering, and more. What happy days.

anonytrary · on July 21, 2018

> It's not just more insight on your personal preferences. It's the 100% certainty that one user on two different platforms is the same person.

That's what I was talking about, but thanks for explaining it to anyone who may have missed the point.

zouhair · on July 21, 2018

Where is the catch?

dotcoma · on July 21, 2018

Is this done in the interest of users, or in the interest of the Big Data / Big Tech military-industrial complex?

anonytrary · on July 21, 2018

Probably both -- not only is it more convenient for users to be able to swim between products, but having normalized and portable data is a godsend to anyone who needs to do data analysis. I'd imagine much more insight is possible when you can associate a user's activity on one platform with their activity on another. Imagine if your IP was shared between services; they'd know a lot more about you. On the surface, it sounds like a potential workaround to GDPR.

politician · on July 21, 2018

So, bwillard, a member of the DTP project team, answered some questions I had along those lines in quite a lot of detail earlier today.

https://news.ycombinator.com/item?id=17577353

troymc · on July 21, 2018

Probably both, and also in the interest of GDPR compliance (e.g. the "right to data portability").

jaimex2 · on July 21, 2018

What are the chances of Apple joining?

scarface74 · on July 21, 2018

What data does Apple keep that would be worth transferring?

d4l3k · on July 21, 2018

iCloud has a lot of similar data to what Office 365/GSuite has, so there's probably a lot that would be worth transferring.

navs · on July 21, 2018

I'd be curious about LinkedIn actually.

larkeith · on July 21, 2018

LinkedIn is owned by Microsoft, so they may already be in.

troymc · on July 21, 2018

Calendar, Contacts, Photos, Mail...