I can hardly find a healthy torrent for an obscure feature film that I care about. How am I supposed to find a healthy torrent for a random web page from the aughts?
If we want it to be distributed across laymen, we need something easier than opening torrent files (or entering magnet URIs) over a thousand times. Perhaps https://github.com/ipfs/in-web-browsers?
You're correct, but even then you still have the problem of storage: the torrents (and there are a lot of them) are only useful if a sustainable number of seeds remain available.
> You can distribute less popular websites with more used ones to avoid losing it?
So long as this distributed protocol has the concept of individual files, there _will_ be clients out there that allow the user to select `popular-site.archive.tar.gz` and not `less-popular.tar.gz` for download.
And what one person doesn't download... they can't seed back.
Distributed stuff is really good for low cost, high scale distribution of in-demand content. It's _terrible_ for long term reliability/availability, though.
More concretely, nobody wants to donate anything. They just want it to exist. Charity has never been a functional solution to normal coordination problems. We have centuries of evidence of this.
Maybe there needs to be a torrentable offline-first HTML file (only goes online to tell you if there's a new torrent whatsoever with more files), that lets you look through for more torrents (Magnet links are really tiny).
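To put a rough number on "really tiny", here is a minimal sketch that builds a bare magnet URI from a hex info-hash. The helper name `magnet_link` and the example hash are only illustrative; the `xt=urn:btih:` and `dn=` parameters are the usual magnet fields for BitTorrent v1.

```python
from urllib.parse import quote

def magnet_link(info_hash_hex: str, name: str = "") -> str:
    """Build a BitTorrent v1 magnet URI from a 40-character hex info-hash."""
    if len(info_hash_hex) != 40:
        raise ValueError("expected a 40-character hex SHA-1 info-hash")
    link = "magnet:?xt=urn:btih:" + info_hash_hex.lower()
    if name:
        # Optional display name, percent-encoded.
        link += "&dn=" + quote(name)
    return link

# Example value only; any 40-hex-digit info-hash works the same way.
EXAMPLE_HASH = "D23F5C5AAE3D5C361476108C97557F200327718A"
bare = magnet_link(EXAMPLE_HASH)
print(bare, len(bare))
```

A bare link is 60 characters, so an index of a thousand of them fits in roughly 60 KB before compression.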
I miss when TPB used to have a CSV of all their magnet links; their new UI is trash. I can't find anything the way I could in the old days. TPB is pretty much a dying old relic.
I want it to protect all sorts of random obscure documents, mostly kind of crappy, that I can't predict in advance, so I can pursue my hobby of answering random obscure questions. For instance:
* What is a "bird famine", and did one happen in 1880?
* Did any astrologer ever claim that the constellations "remember" the areas of the sky, and hence zodiac signs, that they belonged to in ancient times before precession shifted them around?
* Who first said "psychology is pulling habits out of rats", and in what context? (That one's on Wikiquote now, but only because I put it there after research on IA.)
Or consider the recently rediscovered Bram Stoker short story. That was found in an actual library, but only because the library kept copies of old Irish newspapers instead of lining cupboards with them.
The necessary documents to answer highly specific questions are very boring, and nobody has any reason to like them.
You could let users choose what to mirror, and one of those choices could be a big bucket of all the least available stuff, for pure preservationists who don't want to focus on particular segments of the data.
Sort of like the bittorrent algorithm that favors retrieving and sharing the least-available chunks if you haven't assigned any priority to certain parts.
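For anyone who hasn't seen it, a rough sketch of that rarest-first idea: count how many known peers hold each piece and ask for the scarcest ones first. The function and variable names are made up for illustration, and real clients add randomisation among ties plus other heuristics that are left out here.

```python
from collections import Counter

def rarest_first(needed, peer_bitfields):
    """Order the pieces we still need from least to most available.

    needed: set of piece indices we do not have yet.
    peer_bitfields: dict mapping a peer id to the set of pieces that peer has.
    """
    availability = Counter()
    for pieces in peer_bitfields.values():
        availability.update(pieces & needed)
    # A piece that no connected peer has cannot be requested right now.
    candidates = [p for p in needed if availability[p] > 0]
    # Sort by availability; break ties by index to keep the sketch deterministic.
    return sorted(candidates, key=lambda p: (availability[p], p))

peers = {
    "peer_a": {0, 1, 2},
    "peer_b": {0, 1},
    "peer_c": {0, 3},
}
order = rarest_first({0, 1, 2, 3, 4}, peers)
print(order)  # [2, 3, 1, 0]
```

The same ordering could be applied to whole archive items instead of pieces, which is the "bucket of least available stuff" idea.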
Since the IA had a collection of emulators (some of them running online*), and old ROMs and floppies and such, it could probably help with that one too.
* Strictly speaking, running in-browser, but that sounded like "Bowser" so I wrote online instead.
Aren't torrents terrible at handling updates in general? If you want to make a change to the data, or even just add or remove data, you have to create a new torrent and somehow get people to update their torrent and data as well.
In practice, that's mostly how they're being used.
But the protocol does support mutation. The BEP describing the behavior even has archive.org as an example...
> The intention is to allow publishers to serve content that might change over time in a more decentralized fashion. Consumers interested in the publisher's content only need to know their public key + optional salt. For instance, entities like Archive.org could publish their database dumps, and benefit from not having to maintain a central HTTP feed server to notify consumers about updates.
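A minimal sketch of the "public key + optional salt" part, as I understand it from BEP 44 and BEP 46 (check the specs before relying on this): the DHT lookup target for a mutable item is the SHA-1 of the publisher's public key followed by the salt, and the magnet link carries the key and salt as hex. The all-zero key below is a placeholder, not a real ed25519 key, and the function names are my own.

```python
import hashlib

def mutable_target(public_key: bytes, salt: bytes = b"") -> bytes:
    """DHT target for a mutable item: SHA-1 of the public key followed by the salt."""
    if len(public_key) != 32:
        raise ValueError("ed25519 public keys are 32 bytes")
    return hashlib.sha1(public_key + salt).digest()

def mutable_magnet(public_key: bytes, salt: bytes = b"") -> str:
    """Magnet link in the urn:btpk form, with the optional salt as hex."""
    link = "magnet:?xs=urn:btpk:" + public_key.hex()
    if salt:
        link += "&s=" + salt.hex()
    return link

placeholder_key = bytes(32)   # placeholder only, NOT a real public key
salt = b"db-dumps"            # hypothetical salt naming one feed
target = mutable_target(placeholder_key, salt)
link = mutable_magnet(placeholder_key, salt)
print(target.hex())
print(link)
```

A consumer who knows the key and salt can recompute the same target and look up the latest signed pointer to the current torrent, without a central HTTP feed.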
How would preservationists go about automatically updating the torrent and data they seed? Or would they need to check manually, on a regular basis, whether they are still seeding the up-to-date content?
Perhaps a naïve question, but hasn't this problem been solved by the FreeNet Project (now HyphaNet) [0]? (the re-write — current FreeNet — was previously called Locutus, IIRC [1]).
Side note: As an outsider, and someone who hasn't tried either version of FreeNet in almost two decades, was this kind of a schism like the Python 2 vs. Python 3 kerfuffle? Is there more to it?
Hi, Freenet's FAQ explains the renaming/rebranding here: [1]
Neither version of Freenet is designed for long-term archiving of large amounts of data so it probably isn't ideally suited to replacing archive.org, but we are planning to build decentralized alternatives to services like wikipedia on top of Freenet.
You could say the same thing about perpetual motion. Being realistic about why past efforts have failed is key to doing better in the future: for example, people won’t mirror content which could get them in trouble and most people want to feel some kind of benefit or thanks. People should be thinking about how to change dynamics like those rather than burning out volunteers trying more ideas which don’t change the underlying game.
There are certainly research questions and cost questions and practicality and subsetting and whatnot. Addressed by some ideas and not by others.
What there isn't is a currently maintained and advertised client and plan. That I can find. Clunky or not, incomplete or not.
There are other systems that have a rough plan for duplication and local copy and backup. You can easily contribute to them, run them, or make local copies. But not IA. (I mean you can try and cook up your own duplication method. And you can use a personal solution to mirror locally everything you visit and such.) No duplication or backup client or plan. No sister mirrored institution that you might fund. Nothing.
Torrents have a bad reputation due to malicious executables. I have never met someone who genuinely saw piracy as stealing, only as dangerous. In fact, stealing as a definition cannot cover digital piracy, as stealing is to take something away, and to take is to possess something physically. The correct term is copying, because you are duplicating files. And that’s not even getting into the cultural protection piracy affords in today’s DRM and license-filled world.
What does this have to do with torrents? It is widely known that you shouldn't run an executable from the internet unless you trust it.
You can get malicious executables from websites too.
If this is what people think, we need to work on education...
Piracy also is not unique to torrents, and yet that was what GP used.
The average person, in my experience, can barely work a non-cellphone filesystem and actively stresses when a terminal is in front of them, especially for a brief moment. Education went out the window a decade ago.
It doesn't really, you can host a server off a raw IP.
Downloading from example.com is just peer to peer with someone big. There's lots of hosting providers and DNS providers that are happy to host illegal-in-some-places content.
That's not unique, not a product, and not the part I use most.
Well, OK, maybe other webpage archives don't work as well, I haven't tried them, but there are others. And they're newer, so don't have such extensive historical pages.
Large numbers of Wikipedia references (which relied on IA to prevent link rot) must be completely broken now.
This kind of talk is simply modern politik-speak. I can't stand it and the people who fall for their deception. Stretch the truth to disarm the constituents.
In what way? Torrents are used all over for content delivery. Battle.net uses a proprietary version of BitTorrent. It’s now owned by Microsoft. There are many more legitimate uses, as commented by many others.
Criminals using tools does not make the tools criminal.
This precedent is problematic I think. It seems like the populist way of addressing issues. Always just following the biggest outcry instead of the symptoms.
Just because there are currently more illegitimate users for a thing, we shouldn't prevent legitimate uses, I think. The ratio might just be skewed because in the legitimate world you grow your audience with marketing, investing tons of capital, while for illegitimate use cases the marketing is often just word of mouth because of features.
That precedent was and still is legally used to federally regulate marijuana more harshly than fentanyl, a precedent I strongly disagree with, so you'll have to forgive me for believing that the degree to which something causes harm matters more than the amount of misuse.
My understanding is that that court case did not show that operating a torrent tracker is illegal, but specifically operating a (any) service with the explicit intent of violating copyright... huge difference IMO.
To me that's not even related to it being a torrent tracker, just that they were "aiding and abetting" copyright infringement.
Ok. But what is the case law in hosting illegal content? Sure you may operate a torrent, but if your client is distributing child porn, in my view, you bear responsibility.
Trackers generally do not host any content, just hashcodes and (sometimes) meta data descriptions of content.
If "your" (ie let's say _you_ TZubiri) client is distributing child pornography content because you have a partially downloaded CP file then that's on _you_ and not on the tracker.
The "tracker" has unique hashcode signatures of tens of millions of torrents - it literally just puts clients (such as the one that you might be running yourself on your machine in the example above) in touch with other clients who are "just asking" about the same unique hashcode signature.
Some tracker affiliated websites (eg: TPB) might host searchable indexes of metadata associated with specific torrents (and still not host the torrents themselves) but "pure" trackers can literally operate with zero knowledge of any content - just arrange handshakes between clients looking for matching hashes - whether that's UbuntuLatest or DonkeyNotKong
We agree that if my client distributes illegal content, I am responsible, at least in part.
On the other hand, I also believe that a tracker that hosts hashes of illegal content, provides search facilities for them and facilitates their download is responsible, in a big way. That's my personal opinion and I think it's backed by cases like The Pirate Bay and Sci-Hub.
That 0 knowledge tracker is interesting, my first reaction is that it's going to end up in very nasty places like Tor, onion, etc.
A tracker (bit of central software that handles 100+ thousand connections/second) is not a "torrent site" such as TPB, EZTV, etc.
A tracker handshakes torrent clients and introduces peers to each other, it has no idea nor needs an idea that "SomeName 1080p DSPN" maps to D23F5C5AAE3D5C361476108C97557F200327718A
All it needs is to store IP addresses that are interested in that hash and to pass handfuls of interested IP addresses to other interested parties (and some other bookkeeping).
From an actual tracker PoV the content is irrelevant and there's no means of telling one thing from another other than size - it's how trackers have operated for 20+ years now.
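To make that concrete, here is a toy in-memory version of the bookkeeping described above. It is a sketch, not how any particular tracker is implemented: the class name `TinyTracker` and its `announce` method are invented, and real trackers speak HTTP or UDP and expire stale peers. Note that nothing in it ever sees a filename or file content.

```python
import random
from collections import defaultdict

class TinyTracker:
    """Maps an opaque 20-byte info-hash to the peers that announced interest in it."""

    def __init__(self):
        self._swarms = defaultdict(set)  # info_hash -> {(ip, port), ...}

    def announce(self, info_hash: bytes, ip: str, port: int, want: int = 50):
        """Register a peer and return up to `want` other peers for the same hash."""
        swarm = self._swarms[info_hash]
        others = [peer for peer in swarm if peer != (ip, port)]
        swarm.add((ip, port))
        # Hand back a random handful of peers; the tracker knows nothing else.
        return random.sample(others, min(want, len(others)))

tracker = TinyTracker()
info_hash = bytes.fromhex("D23F5C5AAE3D5C361476108C97557F200327718A")
first = tracker.announce(info_hash, "192.0.2.10", 6881)
second = tracker.announce(info_hash, "192.0.2.20", 6881)
print(first, second)
```

The first announcer gets an empty list, the second gets the first peer's address, and from there the two clients talk to each other directly.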
Trackers can hand out .torrent files if asked (bencoded dictionaries that describe filenames, sizes, checksums, directory structures of a torrent's contents) but they don't have to; mostly they hand out peer lists of other clients .. peers can also answer requests for .torrent files.
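For the curious, a small sketch of what "bencoded dictionary" means and where the 40-hex-digit hash comes from: in BitTorrent v1 the info-hash is the SHA-1 of the bencoded `info` dictionary. The encoder below is a minimal one written for illustration, and the `info` values (name, sizes, all-zero piece hashes) are placeholders rather than a real torrent.

```python
import hashlib

def bencode(value) -> bytes:
    """Minimal bencoder for ints, byte strings, text, lists and dicts."""
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, bytes):
        return b"%d:%s" % (len(value), value)
    if isinstance(value, str):
        return bencode(value.encode("utf-8"))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(v) for v in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear in sorted order.
        items = [(k.encode("utf-8") if isinstance(k, str) else k, v)
                 for k, v in value.items()]
        items.sort(key=lambda kv: kv[0])
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")

# Placeholder metadata: a 1 MiB single-file torrent split into four pieces.
info = {
    "name": "BeautifulSunset.mkv",
    "length": 1048576,
    "piece length": 262144,
    "pieces": bytes(80),  # four 20-byte SHA-1 piece hashes, zeroed here
}
info_hash = hashlib.sha1(bencode(info)).hexdigest()
print(info_hash)
```

The digest depends on the label and the piece checksums, but nothing in it tells an observer what the bytes actually depict.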
A .torrent file isn't enough to determine illegal content.
Pornography can be contained in files labelled "BeautifulSunset.mkv" and Rick Astley parody videos can frequently be found in files labelled "DirtyFilthyRepubicanFootTappingNudeAfrica.avi"
Given that, it's not clear how trackers could effectively filter by content that never actually traverses their servers.
Oh ok, it seems to be a misconception of mine then.
Mathematically, a tracker would offer a function that, given a hash, returns a list of peers with that file.
A "torrent site" like TPB or SH, on the other hand, would offer a search mechanism, whereby they would host an index, content hashes and English descriptors, along with a search engine.
A user would then need to first use the "torrent site" to enter their search terms, and find the hash, then they would need to give the hash to a tracker, which would return the list of peers?
Is that right?
In any case, each party in the transaction shares liability. If we were analyzing a drug case or a people trafficking case, each distributor, wholesaler or retailer would bear liability and face criminal charges. A legal defense of the type "I just connected buyers with sellers, I never exchanged the drug" would not have much chance of succeeding, although it is a common method to obstruct justice by complicating evidence gathering. (One member collects the money, the other gives the drugs.)
> A user would then need to first use the "torrent site" to enter their search terms, and find the hash, then they would need to give the hash to a tracker, which would return the list of peers?
> Is that right?
More or less.
> In any case, each party in the transaction shares liability.
That's exactly right Bob. Just as a telephone exchange shares liability for connecting drug sellers to drug buyers when given a phone number.
Clearly the telephone exchange should know by the number that the parties intend to discuss sharing child pornography rather than public access to free to air documentaries.
How do you propose that a telephone exchange vet phone numbers to ensure drugs are not discussed?
Bear in mind that in the case of a tracker the 'call' is NOT routed through the exchange.
With a proper telephone exchange the call data (voices) pass through the exchange equipment; with a tracker no actual file content passes through the tracker's hardware.
The tracker, given a number, tells interested parties about each other .. they then talk directly to each other; be it about The Sky at Night -s2024e07- 2024-10-07 Question Time or about Debbie Does Donkeys.
Also keep in mind that trackers juggle a vast volume of connections of which a very small amount would be (say) child abuse related.
I'll restate the principle of good usage to bad usage ratio, telephone providers are a well established service with millions of legitimate users and uses. Furthermore they are a recognized service in law, they are regulated, and they can comply with law enforcement.
They are closer to the ISP, which according to my theory has some liability as well.
It's just a matter of the liability being small and the service to society being useful and necessary.
To take a similar but newer tech, consider crypto. My position is that its legality, and its liability for illegal usage by users (considering that of exchanges and online wallets, since the network is often not a legal entity), will depend on the ratio of legitimate to illegitimate use that will be given to it.
There's definitely a second system effect, where undesirables go to the second system, so it might be a semantic difference unrelated to the technical protocols. Maybe if one system came first, or if by chance it were the most popular, the tables would be turned.
But I feel more strongly that there are design features that make law compliance, traceability and accountability difficult. In the case of trackers perhaps the microservice/object is a simple key-value store, but it is semantically associated with other protocols which have 'noxious' features described above AND are semantically associated with illegal material.
> I'll restate the principle of good usage to bad usage ratio, telephone providers are a well established service with millions of legitimate users and uses
Over 10 million torrents tracked daily, on the order of 300 thousand connections per second, handshaking between some 200 million peers per week.
That's material from the Internet Archive, software releases, pooled filesharing, legitimate content sharing via embedded clients that use torrents to share load, and a lot of TV and movies that have variable copyright status
(One of the largest TV|Movie sharing sites for decades recently closed down after the sole operator stopped bearing the cost and didn't want to take on dubious revenue sources; that was housed in a country that had no copyright agreements with the US or UK and was entirely legal on its home soil.
Another "club", MVGroup, only rips documentaries that are "free to air" in the US, the UK, Japan, Australia, etc. and in 20 years of publicly sharing publicly funded content hasn't had any real issues.)
> the ISP, which according to my theory has some liability as well.
The world's a big place.
The US MPA (Motion Picture Association - the big five) backed an Australian mini-me group AFACT (Australian Federation Against Copyright Theft) to establish ISP liability in a G20 country as a beachhead bit of legislation.
The alliance of 34 companies unsuccessfully claimed that iiNet authorised primary copyright infringement by failing to take reasonable steps to prevent its customers from downloading and sharing infringing copies of films and television programs using BitTorrent.
That was a three strikes total face plant:
The trial court delivered judgment on 4 February 2010, dismissing the application and awarding costs to iiNet.
An appeal to the Full Court of the Federal Court was dismissed.
A subsequent appeal to the High Court was unanimously dismissed on 20 April 2012.
It set a legal precedent:
This case is important in copyright law of Australia because it tests copyright law changes required in the Australia–United States Free Trade Agreement, and set a precedent for future law suits about the responsibility of Australian Internet service providers with regards to copyright infringement via their services.
It's also now part of Crown Law .. ie. not directly part of the core British Law body, but a recognised bit of Commonwealth High Court Law that can be referenced for consideration in the UK, Canada, etc.
> but it is semantically associated with other protocols which have 'noxious' features described above AND are semantically associated with illegal material.
Gosh, semantics hey. Some people feel in their waters that this is a protocol used by criminals and must therefore be banned or policed into nonexistence?
The list I gave was of some public trackers, I made no claim that they were zero knowledge trackers, I simply made a statement that trackers needn't be aware of .torrent file manifests in order to share peer lists.
I also indicated above that having knowledge of .torrent manifests is problematic as that doesn't provide real actual knowledge of file contents just knowledge of file names ... LatestActionMovie.mkv might be a rootkit virus and HappyBunnyRabbits.avi might be the worst most exploitative underage pornography you can think of.
Some trackers are also private and require membership keys to access.
I was skating a lot as TZubiri seems unaware of many of the actual details and legitimate use cases, existing law, etc.
I don't think TPB ever hosted any copyrighted content, even indirectly by its users. Torrent peers do not ever send any file contents through the tracker.
The IA has tried distributing their stores, but nowhere near enough people actually put their storage where their mouths are.