
Another reason scientists often don't release code is that code is not considered the valuable artifact produced by research--in more methods-focused fields in particular, it's instead the mathematical model, as embodied in some LaTeX in the published paper. "Reproducing" the results then consists not of trivially rerunning the same code, but of reimplementing the model, possibly in a different language. This could maybe be seen as valuable in that the reproduction will surface both implementation errors and logic/modeling errors (though, really, having the source facilitates both).

In fact, more applications-focused researchers (those who combine real data with established models, for instance) tend to write higher-level scripts stringing together more-engineered tools. In this case, open sourcing the scripts would be both easier and more pointless, since they will be almost the same as what is stated in English, tables, and plots in the paper. Epidemiology is usually this way, in my experience, though the linked repository seems to have some of both flavors.

The underlying misunderstanding in that issue thread seems to me to be a disagreement on what the main valuable byproducts of scientific programming are. Professional programmers will naturally think it's the code, but, traditionally, the code has been seen in academia as only a temporary vehicle for the model and its parameters, which are the real product. (Also, the "program" might really need to include the lab, its papers, and its institutional knowledge, which is harder to package and open-source.)

Right or wrong, the assumption is that any competent grad student could reproduce the result in a week or so of (admittedly unnecessary) greenfield coding. But this is clearly not ideal, and newer work does strive for more professionalism in coding, open-source-by-default, and therefore faster replication. The project in question clearly predates this trend.

(Of course, a third reason academics don't open source is that some secrecy is required before publication in competitive fields. On a months-long project, you might not want to be committing publicly before publication. But this isn't much of an excuse.)


