
I wasn't really commenting on the legal sufficiency of the indictment, just the rhetorical dishonesty of accusing someone of "steal[ing] well over 4,000,000 articles from JSTOR" (quote from the indictment) when JSTOR didn't own those articles. They could've just alleged violation of JSTOR's TOS and thereby theft of network services. I suspect JSTOR or people sympathetic to them had a hand in writing the indictment, though; JSTOR has a long history of attempting to spread the misinformation that it somehow "owns" its archive.



Well, if someone stole $100,000 of property from a storage facility, that wouldn't mean the storage facility claimed ownership of the property, just that it was the location of the theft. Maybe you're overthinking this a bit.


Well this is more akin to breaking into the storage facility and making a copy of all the Paintings stored there. The value of the goods being stored has not been reduced.


Not to torture this analogy any further, but would you feel safe storing your stuff at such a facility after something like that? No, you'd probably look elsewhere for your storage needs. Breaking in is still bad and would be the subject of criminal charges. If the US attorneys decide that use of the word 'stole' is somewhat over the top, then guess what? They can amend the indictment, just as the defense can amend their motions.

My point is not that the government is correct or morally justified in bringing this indictment, but that getting hung up on terminology like this obscures the legally problematic issue of having (allegedly) bypassed the security systems to download material he was not supposed to have access to, regardless of who actually owns said material.


But: (1) JSTOR isn't a storage facility in that sense; the copyright holders do not pay JSTOR to store their items, so this is a bad analogy.

(2) If the outrage is supposed to be about bypassing security systems, why is the government hung up on the "theft" terminology? Especially when JSTOR, the party arguably injured (in some way not specified), has asked the government not to prosecute?

No, this is clearly a convenient way to get a politically inconvenient person labeled a felon.


1. The analogy is only to point out that a 3rd party repository can be negatively affected by a break-in even if it doesn't have an ownership interest in the materials it stores.

2. The government is not hung up on the 'theft' terminology. The words 'steal' or 'stole' only appear three times in the 15 page indictment and the actual offenses he is charged with are wire fraud, computer fraud, unlawfully obtaining information from a protected computer, and recklessly damaging a protected computer.


Except that copyright infringement is definitely not theft. The owner doesn't lose his documents. It's fine if your opinion is that copyright infringement is wrong, but let's not call it by inappropriate names.


He is accused of stealing bandwidth from JSTOR, not the documents: "theft of services", not theft of property. Theft of bandwidth is almost as absurd as theft via copying. JSTOR apparently isn't interested in free transmission of knowledge.


If you read the indictment you'll see that they very much are not interested in free transmission of knowledge.

They charge >$50k/yr for access: " For a large research university, this annual subscription fee for JSTOR’s various collections of content can cost more than $50,000."


That price actually seems pretty reasonable for a large research university.

The real question is how much they charge individuals who want to get an article. My first google search (http://www.jstor.org/pss/27757488) results in $12/article. This is very steep when you're trying to do research and don't even know if the article is what you're looking for.


Well, you wouldn't want any old rabble getting access to valuable knowledge. Far better for that access to be safely controlled by the major research institutions, who can clearly be trusted to pursue knowledge in a responsible manner.


How is that reasonable? Sounds like Mr. Swartz was willing to host them for free! And he would have gotten away with it too if it wasn't for those meddling police.

But seriously, $12/article is ludicrous. That must be way above cost recovery or they're not doing a very efficient job of running JSTOR. Perhaps the co-founder of Reddit would do a better job...


Most public libraries have relationships with JSTOR that allow members to access the articles online. I use the Boston Public Library and look up articles via Google Scholar. All free.


Some public libraries do, but the vast majority of public libraries in the world do not.


Are you sure? Maybe not in the world, but I'm pretty sure all large public libraries in the US do subscribe to these kinds of databases.


I admit that I don't have statistics [edit: on libraries], but most libraries in the world are not large or in the US, and JSTOR's prices for a "small" library in "the rest of the world" are much, much larger than [edit: wrong — comparable to or perhaps a bit larger than, but not much, much larger than] their entire budget. Check out http://support.jstor.org/csp/PriceCalculator/. This code (for Chrome) gives me a yearly price of $81162.70, although it hangs the browser for a while first:

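    // Build a synthetic click event that bubbles and is cancelable (legacy createEvent/initMouseEvent API).
    function mouseEvent() { var event = document.createEvent("MouseEvents"); event.initMouseEvent("click", true, true, window, 0, 0, 0, 0, 0, false, false, false, false, 0, null); return event; }
    // Copy the live collection into an array first, so the loop is unaffected if clicks change the DOM.
    function each(list, thunk) { list = Array.prototype.slice.call(list); for (var ii = 0; ii < list.length; ii++) { thunk(list[ii]); } }
    // Click every element with class 'expand', then every element with class 'e-only'.
    each(document.getElementsByClassName('expand'), function(link) { link.dispatchEvent(mouseEvent()) })
    each(document.getElementsByClassName('e-only'), function(link) { link.dispatchEvent(mouseEvent()) })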


It's sad that you have to write javascript code to do that! (But also cool that you did. :)

"Complete Current Scholarship Collection" for 22751.90 is a duplicate of all the things above it. So I think some of the entries have been double counted.

The real price for most libraries may be about 1/2 or less of your estimate (they won't be interested in everything). And 20,000 to 40,000 isn't (well, shouldn't be) a lot of money for a public library.
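To make the correction concrete, here is a rough sketch of the arithmetic using only the two figures quoted in this thread. Subtracting the bundle price is just one reading of "double counted" (an assumption, not something the price calculator confirms), and the variable names are made up for illustration.

```javascript
// Figures quoted in the thread; not independently verified.
const scriptedTotalUsd = 81162.70; // yearly total reported by the script above
const bundleUsd = 22751.90;        // "Complete Current Scholarship Collection"

// Work in whole cents to avoid floating-point noise in the subtraction.
const toCents = (usd) => Math.round(usd * 100);

// Assumption: the bundle duplicates items already counted, so remove it once.
const withoutBundleUsd = (toCents(scriptedTotalUsd) - toCents(bundleUsd)) / 100;

// The "about 1/2 of your estimate" figure, for comparison.
const halfOfTotalUsd = Math.round(toCents(scriptedTotalUsd) / 2) / 100;

console.log(withoutBundleUsd.toFixed(2), halfOfTotalUsd.toFixed(2));
```

Under that assumption the total drops to about $58,411, and half of the original estimate is about $40,581, which is where the upper end of the 20,000 to 40,000 range comes from.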

That's the salary for a single employee! I would expect a library to have at least 5 employees, plus a budget to buy books.

Also I would expect a small library to have only a subset of the papers, and for serious research you would need to "go into the city".


Oh, thanks for finding that error!

I think you're thinking very much of US salaries. $40,000 a year shouldn't be a lot of money for a public library in the US, because it's the salary for a single employee (or the total costs for half an employee!), and the wonderful public library system in the US does indeed have multiple libraries. But world GDP per person is about US$10k per year, compared to the US's US$47k — and the bulk of that GDP comes from a few rich countries with only a small fraction of the population. An average country is something like Jamaica, Thailand, or the Dominican Republic, where the per-capita GDP is something like US$8.8k.

So US$40k per year is the salary for almost five employees. Except that within Jamaica or Thailand (or, to a lesser extent, the US) the median salary is much lower. And it's probably not the prime minister's niece who's working the librarian job. So maybe it's more like eight to ten employees.
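As a quick check of the "almost five employees" figure, a minimal sketch that divides the US$40k estimate by the per-capita GDP numbers quoted above. It treats per-capita GDP as a stand-in for a salary, which is the same simplification this comment makes; the names are hypothetical.

```javascript
// Figures quoted in this comment; none are independently verified here.
const subscriptionUsd = 40000; // upper end of the estimated yearly price
const perCapitaGdpUsd = {
  us: 47000,
  world: 10000,
  typicalCountry: 8800, // "something like Jamaica, Thailand, or the Dominican Republic"
};

// Express a price as a number of per-capita-GDP "salaries".
function salaryEquivalents(priceUsd, gdpUsd) {
  return priceUsd / gdpUsd;
}

for (const [place, gdp] of Object.entries(perCapitaGdpUsd)) {
  console.log(place, salaryEquivalents(subscriptionUsd, gdp).toFixed(1));
}
```

This prints roughly 0.9 for the US, 4.0 for the world average, and 4.5 for the "typical" country. The further step to eight to ten employees relies on median salaries being well below per-capita GDP, which these figures do not capture.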

So, yeah, most libraries — even measured numerically, but especially measured by the number of people who rely on them — are a lot poorer than what you're used to.

I haven't checked yet to see if the National Library here in Buenos Aires has JSTOR access.


I don't know this for sure, but I suspect that if you contacted JSTOR from a low income country they may give a better deal.

BTW, if you really do need JSTOR, it's not hard to find a library card number from a US library and use that for access anywhere. (Well, I don't know JSTOR specifically, but all the other databases I've used from my library are available to me at home after I put in my library card number.)


Their price schedule divides "Public Library – Small" into "US", "Canada", and "Rest of the World". It's possible that someone phoning them up from Senegal or Paraguay would be able to negotiate a lower price, but it's not as if their existing price list doesn't recognize the existence of different countries. (Still, lumping Switzerland and Malawi into the same category might not represent a deep level of consideration of the issues.)

For what it's worth, I was using their web site from my house here in Argentina, which is usually classified as a "middle-income country," but where you can hire a full-time employee illegally for US$4000 per year.


The prices are the same for all the versions (size or location), so I don't know why they ask.

The only thing that seems to change the price is the organization type.


So a Mercedes should cost 1/10th in Zimbabwe of what it does in the West, if people make 1/10th there?


I was rebutting a factual claim ("Most public libraries have relationships with JSTOR that allow members to access the articles online"), not a normative one. An analogous factual claim might be that most Zimbabweans drive Mercedes. Even without having access to Mercedes's sales figures by nation, that ought to appear unlikely to you?


Yes, let's agree on and further reason from the premise that it is not currently true that most Zimbabweans drive Mercedeses ;)

My point was: your argument seems to be based on refuting the argument that the JSTOR subscription is not expensive for the average library because it is only about one yearly salary of the average rank-and-file employee, by saying that that only holds for the libraries in the US (maybe some parts of Europe, but let's say the US for the sake of this argument), and that in many other countries salaries are lower and therefore the relative cost of a JSTOR subscription is higher.

So, my (perhaps naive) interpretation of this is that your ulterior argument is that JSTOR is too expensive for many libraries outside of the US, and that they therefore don't have access to its contents.

I further deduce from that, and from the context in which you bring it up, that you don't find it a problem that people take the content from JSTOR and redistribute it to people who don't have easy access to libraries that do have a subscription. Now I'll grant that this is a fairly big leap to make, and maybe you're not holding that position; but within the given context (of people arguing pro and con the actions of the Reddit guy what's-his-name), I think it's not unreasonable of me to assume so, either.

So, to close the circle, my 'question' was (but of course it is a 'question' that is, in the end, a way of stating my position in the discussion...) if it is reasonable to hold that when something is too expensive for people, it is OK to circumvent the rights holders' restrictions on the use of something. (I'm deliberately being vague on issues like 'moral ought' vs 'legal ought', if JSTOR really has a common-law variation of a database right on their collection, jurisdiction etc. - I don't really think they're important for the question at hand).


relationships = they pay the institutional fee (possibly reduced) to JSTOR


> somehow "owns" its archive

It does own its archive. They just may not own the exclusive copyright to the contents of the archive... there is a subtle distinction.


There is no non-exclusive copyright. J-STOR is not the copyright owner, period.

They do own the right to the composition of their collection, so someone who got the whole collection would be liable for infringing their right to the composition of the collection; in contrast, a random sample of articles would infringe on the publishers' IP rights rather than J-STOR's.

The subtle point is that J-STOR is absolutely not interested in the original copyright owners having to hunt down abusers, because that would (in all likelihood) appear like an additional, avoidable hassle to the latter and would make them less likely to agree to have J-STOR distribute their content. [edit: apparently it's the US Attorney General more than J-STOR who is pushing this case forward]

In comparison: if someone sneaks into a cinema to see a movie, you would accuse him of cheating them of the entrance fee, and not of "stealing the movie". If someone sneaks into a cinema and uses his camcorder to record the movie, he is cheating the movie theater of the entrance fee and misappropriating the production company's movie (with the suspicion that he might pirate it later), but he did not steal the movie from the theater. That would involve something like walking away with the movie theater's copy of the movie, which would fulfill the criterion that what's stolen is not there afterwards.

Misappropriation of IP is not stealing. It's unauthorized copying, which certainly has the potential to harm the bottom line of the copyright owner, but with an impact that is much harder to quantify than the stealing of an actual physical thing.

IP owners and friends of them who use the word 'stealing' want to frame the situation in such a way that appeal to the nonexistence of monetary loss is excluded - mostly because these same owners are investors in, and not creators of, the IP and do not have any other perspective than squeezing whatever value they can out of their investment.

(The authors of the original articles probably couldn't care less about some punk illegally downloading their texts, because they don't see any money from it anyways).


Notice on the first page of every JSTOR pdf:

Your use of the JSTOR archive indicates your acceptance of JSTOR's Terms and Conditions of Use, available at http://www.jstor.org/page/info/about/policies/terms.jsp. JSTOR's Terms and Conditions of Use provides, in part, that unless you have obtained prior permission, you may not download an entire issue of a journal or multiple copies of articles, and you may use content in the JSTOR archive only for your personal, non-commercial use.

Please contact the publisher regarding any further use of this work. Publisher contact information may be obtained at http://www.jstor.org/action/showPublisher?publisherCode=hyi.

Each copy of any part of a JSTOR transmission must contain the same copyright notice that appears on the screen or printed page of such transmission. JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide range of content in a trusted digital archive. We use information technology and tools to increase productivity and facilitate new forms of scholarship.


"you may not download an entire issue of a journal"

I didn't know this. It's as if the New York Times told paid subscribers they can only read 90% of any one issue.


Yeah, fortunately they don't appear to actually enforce that regularly through technical measures. As a researcher with legitimate paid access (via my institution) to JSTOR, it would be absurd if this were enforced. If there is a special issue of a journal exactly in my research area, I pretty much need to read all the articles in it, or at least skim them. To comply with the terms, do I really have to choose an article to avoid reading, so I only download (N-1) of the articles in the issue?


I agree with most of your comment, but there are a couple of points where I wanted to add some commentary.

Compilation copyright only applies to compilations where some creativity is employed in selecting the items to be included. There is no "sweat of the brow" database right under US law. JSTOR almost certainly does not have a compilation copyright on their collection, since any creativity being employed in selection is being employed by the journals they archive, not JSTOR employees.

At any rate, the indictment does not include any charges of copyright infringement.


> They do own the right to the composition of their collection

This was the point I was trying to make.

They own their database, even if they don't own the articles in it. The GP poster was trying to claim that they didn't "own their archive".


> the right to the composition of their collection

Is this a right recognized under US law?


§ 103. Subject matter of copyright: Compilations and derivative works

(a) The subject matter of copyright as specified by section 102 includes compilations and derivative works, but protection for a work employing preexisting material in which copyright subsists does not extend to any part of the work in which such material has been used unlawfully.

(b) The copyright in a compilation or derivative work extends only to the material contributed by the author of such work, as distinguished from the preexisting material employed in the work, and does not imply any exclusive right in the preexisting material. The copyright in such work is independent of, and does not affect or enlarge the scope, duration, ownership, or subsistence of, any copyright protection in the preexisting material.

http://www.copyright.gov/title17/92chap1.html#103


IANAL, but I found some testimony from the US Copyright office regarding this.

http://www.copyright.gov/docs/regstat092303.html

Excerpt: In the terminology of the copyright law, a database is a “compilation.” The Copyright Act defines a compilation as “a work formed by the collection and assembling of preexisting materials or of data....” (1) Compilations were protected as “books” as early as the Copyright Act of 1790.


By that logic nothing was stolen. They still own their archives.


I hate to quibble over words, but a painting can be stolen "from the Louvre" even if it's there on loan from a private collection, just as you could say, "The necklace was stolen from my jewelry box," without implying that your jewelry box was the legal owner of the necklace.


This is more like taking a photograph of a painting on loan to the Louvre, though--- they're alleging that a copy was made, in violation of their terms of service, of a document that they don't even own (but do host). In that case, I would think that you might be violating the Louvre's camera policy, and you might even be colloquially "stealing" something from the painting's author (e.g. if you go on to publish illicit copies from your photo), but you aren't plausibly stealing anything from the Louvre.


The argument over whether something digital can be stolen at all is a different argument than the one you made in the comment I replied to. The question of taking photographs of artwork (which are not exact copies, but which have their own cultural, educational, and commercial value) is a third question which is interesting in its own right. At this pace, I'm having a hard time keeping track of what we're arguing about.

In any case, my intention was not to signal my support for one of two predefined sides in a battle over the concept of intellectual property, it was just to point out stealing "from" doesn't have to have the meaning you read into it.



