Recommender systems like Twitter are as much about data as they are about code. ...

sokoloff · on April 16, 2022

Exactly. “Here’s the code for our matrix multiplication, dot product, and sorting; you’re welcome to check it for bugs.”

sobkas · on April 16, 2022

> Recommender systems like Twitter are as much about data as they are about code. Without the dataset and the derived statistics and models that are used for ranking and recall, the code is not going to help much with transparency.

And also the other way around: data depends on the algorithms and tools that created it. You can't have an insightful understanding of data without knowing how and why it was created.

gfodor · on April 16, 2022

True - however the thrust of the arguments for opening Twitter seems to be focused on opening up the code - so the “code without data” scenario is the one worth homing in on.

cycrutchfield · on April 16, 2022

Right. It’s a bit worrisome that people (even people in this very thread) know nothing about how recommendation and ranking work but think they know everything. “Release the code” is nonsensical.

nopenopenopeno · on April 16, 2022

This is a ridiculous claim. While you won’t be able to make use of the code, it would still provide insight into the inputs in relative terms, and evidence of methods that do not apply.

Juliate · on April 16, 2022

> and provide insight as to some things that are not being done.

That's not an algorithm, that's product strategy & management.

gfodor · on April 16, 2022

It would provide insight, but if the goal is to increase transparency to the point of meaningfully improving trust in the centralized actor that is Twitter, I don’t see it moving that needle much. Given the right data you can kind of coax a lot of these algorithms into arbitrary outputs, so if you’re starting from a place of distrust and you can’t replicate the outputs yourself, you won’t have evidence to alter your priors significantly with just code, imo.
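To make the "coax into arbitrary outputs" point concrete, here is a minimal sketch, assuming a toy ranker that scores items by a dot product and then sorts. The function `rank`, the vectors, and the tweet IDs are all made up for illustration; none of this is Twitter's actual code or data.

```python
# Hypothetical toy ranker: score = dot(user_vector, item_vector), sort descending.
def rank(user_vec, item_vecs):
    scores = {
        item: sum(u * v for u, v in zip(user_vec, vec))
        for item, vec in item_vecs.items()
    }
    return sorted(scores, key=scores.get, reverse=True)

# Made-up user embedding.
user = [1.0, 0.5]

# Two made-up sets of learned item embeddings. The ranking code is identical;
# only the data (which would not be part of a code release) differs.
embeddings_a = {"tweet_1": [0.9, 0.1], "tweet_2": [0.2, 0.3], "tweet_3": [0.1, 0.1]}
embeddings_b = {"tweet_1": [0.1, 0.1], "tweet_2": [0.2, 0.3], "tweet_3": [0.9, 0.8]}

ranking_a = rank(user, embeddings_a)
ranking_b = rank(user, embeddings_b)

print(ranking_a)
print(ranking_b)
```

Same code, opposite orderings: reading `rank` alone tells you nothing about which of the two you would actually see.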

nopenopenopeno · on April 16, 2022

The point is to increase transparency, period. And you are the one arguing against that. Nobody is saying it’s enough, but it’s objectively better than nothing, and it would make it possible to ask more specific questions and make more specific arguments for the value of more transparency. Again, you are the one arguing against it but you have not provided a reason why.

gfodor · on April 16, 2022

I’m not arguing against it - I’m arguing that people who are focused on the idea of releasing the code are miscalibrated on the marginal impact of doing so given a lack of domain knowledge.

The net suggestion is to demand more, and start describing solutions, today, to get data visibility without compromising other things, to help offset the inevitable objections that will come once people are forced to demand it when they realize the code didn’t really solve the fundamental problem.

nopenopenopeno · on April 20, 2022

Ironically, the best way to prove your own point is to open the source code.

gopher_space · on April 16, 2022

The point I'm stuck on in this whole conversation is why you'd continue to use Twitter if you felt they were untrustworthy. Why are we discussing freedom of speech when you can just load another app and get all the freedom you want?

gfodor · on April 17, 2022

The reason people are complaining is that we have found ourselves in a situation where Twitter arguably already is, or indisputably will become, what amounts to a global public square.

The early philosophical developments around freedom of speech didn’t foresee this, but now that it is here, we have to ask what we ought to want if we are to uphold the principle that people should be free to speak unpopular ideas without censorship or fear of excommunication, given that this is a key ingredient in progress, revelation of the truth, and finding compromise to avoid violence.

Twitter is of course a private company, and can do what they want. But what those who wish to see free speech principles upheld say is we ought to want those in Twitter’s position to be as lenient as possible so as to not stifle the free exchange of ideas. And, perhaps failing that, we ought to want to see another mechanism that is less susceptible to widespread censorship and overreach. Too many people seem to counter someone explaining that they feel the status quo is undesirable with an is/ought fallacy - it’s our job in liberal democracies to continually raise and debate issues regarding basic freedoms and the ability of our society to continue to evolve according to shared principles.

dhzhzjsbevs · on April 16, 2022

I dunno, I deleted my twitter when they banned the sitting president for being too hyperbolic.

A social network complaining about hyperbole and outrageous claims being made on its platform is like complaining that people are breathing too much air.

Twitter is not a place grownups should be participating in.

soheil · on April 16, 2022

I think that's what everyone is talking about. They're just using the term algorithm colloquially.

gfodor · on April 16, 2022

I don’t think so - because releasing the data is a massive hill comparatively due to the immense privacy and legal issues.

soheil · on April 16, 2022

Hmm Twitter already sells its "firehose" to the highest bidder.

gfodor · on April 17, 2022

I’m sure Twitter’s algorithms take advantage of all interaction data, like impressions, scrolls, and clicks. Agree tho it’s good a lot of the data is public.

soheil · on April 17, 2022

Publishing user scrolls is a "massive hill" and a legal/privacy nightmare?

gfodor · on April 17, 2022

Sure seems like it could be - I wouldn’t opt-in to allowing mine to be distributed, why would I? And I don’t think Twitter has the rights to do so. So you’d at least need to solve the anonymization problem, or you’d have to package a data release such that you can replicate the ranking algorithm without it.

soheil · on April 18, 2022

User clicks in the form of "likes" are already public, so why would scroll data be so sacred?

djanogo · on April 16, 2022

Oh, you mean the matrix came up with a rule to ban people talking about lab leak? But the matrix can't figure out simple crypto bots and needs humans to report and manually ban them?

shalmanese · on April 17, 2022

That's moderation which is a completely separate topic from recommendation.

11111 · on April 16, 2022

At least according to this, Twitter both agrees with you and is in the process of exploring transparency for both.

https://blog.openmined.org/announcing-our-partnership-with-t...