The Open Source AI Definition (OSAID) is a slap in the face to anyone who has been part of the open source community. Allowing companies to redefine "Open" to allow closed components is a complete betrayal of everything the OSI should stand for, and it was done purely so large companies can pretend their closed models are open.
To be explicit, I believe your concern is that they are not requiring the training data and training methodology used to generate the open source model to be made accessible, so that anyone can essentially build the model themselves from raw ingredients, right? That is, imagining for a moment that folks have access to the kind of compute necessary to do that.
Nevertheless, giving people a building block that they can do what they want with certainly seems like free as in freedom to me. So I personally sympathize with the OSI approach, but in general I'm not big on the zealotry around the open source community.
It's almost like we have a third category here: free as in freedom but you can't necessarily rebuild it yourself.
In practice I would argue that intellectual talent has always been a hidden part of this anyway and therefore we're being intellectually dishonest to imply that this hasn't always been a de facto reality even for traditional software.
It's not just about reproducibility (although I do think that's important), it's about analysis of the model. With traditional software you have a pretty well defined "this code does this", but with machine learning models, inspecting the training data is one of the only ways to validate that bias or propaganda hasn't been inserted during training.
Nobody owns their data. They just scrape the internet, or pirate massive troves of books. Just forcing companies to get a license to all the data they use, let alone an open license, would be a massive impediment to the development of open models.
It is definitely doable to get openly licensed data; you just have to do it via voluntary participation in crowdsourced data acquisition programs. For example, the RNNoise model was retrained from such crowdsourced data.
Honest question; what does OSI actually do? I am involved with a number of OS projects and not once has OSI come up in any context, be it compliance, governance, education and so on.
They own the trademark of "Open Source" and use it to exercise a right to define which licences are truly open source. Now, I guess they are becoming involved in the question of what it means for an AI model to be open source, hence the politicking.
Previously, if your project used one of the main OS licences you were good as far as they were concerned. They mainly existed to avoid lawyers coming up with licenses that water down the rights an open source license provides.
This is trivial to look up. They do not in fact own the trademark "open source". Apparently I can't share a direct link to uspto search results, but you can search by owner and see they have 7 trademarks, none of which are for the term "open source".
They own the trademark "Open Source Initiative," which they say you can use with no advance written permission if you follow all specific guidelines, including:
> the use of the term “Open Source” is used solely in reference to software distributed under OSI Approved Licenses. [1]
So you can refer to any software as "Open Source," regardless of their definition. But, if you call a piece of software "Open Source" alongside the use of the Open Source Initiative's trademark, then you must also use their definition of "Open Source," unless you otherwise have written permission.
They do not own the Open Source trademark. They tried to trademark "open source", but the USPTO denied the application. Since then, they've worked at convincing the public that OSS means anything with a license approved by the OSI. This too is not so. For example, SQLite, arguably the most successful OSS tool ever built, is not covered by an OSI license and doesn't intend to be.
SQLite has been dedicated to the public domain, ostensibly removing all copyright restrictions. Technically, it has no license for the OSI to list as an OSI license.
In more concrete terms: they're the stewards of the Open Source Definition (OSD), which is a rather explicit, but still subject to interpretation, list of criteria to decide if a particular software license is, or is not, "really Open Source". This is very important in the context of "Open Source washing" that is still a thing, and was even more important a decade or two ago, when there was a Cambrian explosion of licenses which claimed to be Open Source.
They review licenses, and act as a sort of PR team for the Free Software movement. The whole point is to make Free Software not seem too scary to businesses.
In that context it is important to differentiate Free from Open Source software.
The OSI is specifically built with a different vision from the FSF.
Free software shall always be free, with all source and ideally all derived works.
Open Source wants the code to be spread, and for that it allows inclusion in commercial software. For example, Microsoft was able to take open source TCP/IP stacks from BSD (BSD License) and integrate them into Windows 95. That wouldn't have worked with a GPL Free Software implementation (or even LGPL).
The supporting argument there is: by allowing that, Microsoft's implementation was fully compatible with the rest of the world, instead of having "bugs" (purposely?) in their own implementation, which would limit interoperability.
The free software argument is that they now took the code and closed it, not giving users a freedom to review (verify) and fix themselves. Which allowed Windows to play in TCP world instead of being an outsider.
No. "Free Software" is a term created by RMS/FSF. "Open Source" was later "formalized" by OSI to differentiate.
FSF puts it this way:
> Another group uses the term “open source” to mean something close (but not identical) to “free software.” We prefer the term “free software” because, once you have heard that it refers to freedom rather than price, it calls to mind freedom. The word “open” never refers to freedom.
And yes, the term "open source" predates OSI, but until OSI it didn't have any specific definition and was slightly different for everybody. OSI created a mostly accepted definition which is distinct from FSF's Free Software definition.
If so, I think they made their point already? I mean, this list is completely filled to the brim with companies that use open source! https://www.apache.org/foundation/sponsors
As a reminder, the OSI was formed as a corporate-friendly foil to Stallman's FSF. This is how the OSI once described its own history on its website:
> The conferees decided it was time to dump the moralizing and confrontational attitude that had been associated with "free software" in the past and sell the idea strictly on the same pragmatic, business-case grounds that had motivated Netscape. They brainstormed about tactics and a new label. "Open source", contributed by Chris Peterson, was the best thing they came up with.
Given that the OSI exists to water down a distinctly moral framework like Free Software into a version that is less "moralizing" and "confrontational" so as to be more appealing to corporations, the path that Open Source has taken over the last few years is hardly surprising.
I've become convinced that the cure for what has been ailing us in the FOSS movement is going to come only as we buck the corporate elements and return to something more closely resembling the original Free Software ethics-based movement. The GPL and AGPL are some of the only licenses not to get totally sucked up in corporate interests, and that's not a coincidence: they were founded on the deeply and sincerely held principle that it is an ethical imperative to advance the good of software's individual human users.
A few things on the "internal use loophole"--first, I'm not sure a semantic debate over the meaning of "distribution" functions as a loophole. Courts are quite capable and ready to consider questions such as these, so I'm not sure it really matters very much. Outside of the academic exercises playing out like this one, actual fact and circumstance relevant to any given case play critically into its ultimate legal determination(s), and 8-12 person juries and/or appeals courts exist to handle things otherwise.
All that said, I'm not sure what your calling this out was meant to imply in the context of the person you quoted?
You can write whatever you want in a license, though. I don't see any issue with writing a license which says 'this may only be used by an individual, not an organisation', though of course there are likely to be significant grey areas which would need to be resolved in court if it came to that.
It would have made for a better discussion if you had stated openly that you are into some fringe self-made anti-org software license. That would have made it clear what you actually want to discuss.
> Be kind. Don't be snarky. Converse curiously; don't cross-examine. Edit out swipes.
> Comments should get more thoughtful and substantive, not less, as a topic gets more divisive.
> Please don't post shallow dismissals, especially of other people's work. A good critical comment teaches us something.
> Please respond to the strongest plausible interpretation of what someone says, not a weaker one that's easier to criticize. Assume good faith.
> Please don't post insinuations about astroturfing, shilling, brigading, foreign agents, and the like. It degrades discussion and is usually mistaken. If you're worried about abuse, email hn@ycombinator.com and we'll look at the data.
I don't think you need to distinguish individuals from corporations in order to advance human freedom. You can make the licenses such that any corporation that uses them to provide services to users must provide the users with required freedoms. If businesses can make money while respecting those terms, all the better!
The same goes for the internal use loophole: the corporation should be required to provide its internal users with certain freedoms, and if they do that then mission accomplished.