
“Running a paper” is honestly challenging these days because of the resource requirements of a lot of scientific code, or the size of certain datasets. One group might have access to a beefy cluster, so there’s no pressure to write performant code when they can parallelize the work across a few dozen Xeons or have TBs of memory available. Another group might be running their code on a laptop. And if your data is much larger than the authors’ data, their code may not even work, since it was designed around much smaller datasets.

Tools like Nextflow or Snakemake help here: they can potentially give you a one-liner that regenerates all the data in a paper, handle dependencies, declare resource expectations, and let you supply your own profile for environment-specific job-scheduling commands and parameters (see the sketch below). However, none of that helps if you don’t actually have access to the resources needed.
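As a rough illustration (the file names and the "analyze" command are made up, not from any real paper), a Snakemake rule can declare its expected threads and memory, and a site profile maps those onto whatever scheduler you have:

    # Snakefile, hypothetical paper pipeline
    rule all:
        input:
            "results/summary.tsv"

    rule analyze:
        input:
            "data/raw.tsv"
        output:
            "results/summary.tsv"
        threads: 8
        resources:
            mem_mb=32000,   # expected peak memory
            runtime=120     # expected wall time, minutes
        shell:
            "analyze --threads {threads} {input} > {output}"

Then "snakemake --profile slurm" submits jobs with your cluster’s scheduler flags, while "snakemake --cores 4" runs the same workflow on a laptop. Assuming, of course, the laptop actually has the 32 GB the rule asks for, which is exactly the problem.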


