
I see more and more posts on HN that rely on ArXiv for scientific articles. While many researchers in CS publish their research on preprint servers, these are (a) only one part of the scientific publications out there and (b) often not (yet) peer reviewed.

I get that ArXiv is a convenient way to download scientific papers, especially for people without university or library access, but one should always be careful with non-peer-reviewed research.



what a weird criticism. first of all in CS/math/physics I would wager >90% of publications are on arxiv as preprints before they're accepted into journals/conferences. so while it's only "one part" it's definitely the biggest part. second of all what should one be careful about wrt pre-review CS/math papers? that you implement some algo that doesn't work? this isn't cancer research that informs treatment regimens nor social science that informs public policy. it's code and theorems (please no imaginative extrapolation to algos that encode racism or something like that as a very very low likelihood risk).

but I agree with you that it's unfortunate that all papers aren't open. good thing there's sci-hub.tw and libgen.io though :)


It's not that cut and dried. Only a small subset of arXiv preprints ever makes it into a journal or conference proceedings.

On balance, it's good that more researchers these days publish preprints before submitting for peer review. But there are still two specific dangers in relying on preprints:

1. Most people reading papers don't try to implement or test them. Unless there is a glaring error, it's hard to tell if a paper is critically incorrect without a lot of effort.

2. Even if most people did try to implement papers, that would still make for a poor heuristic on the paper's merit. In an ideal world every valid paper could be implemented. But even in conference proceedings, it's extremely common for papers to be missing details critical to their implementation. In many cases you can't do the implementation because it requires a vast amount of computation or proprietary data available only to the company whose researchers wrote the paper.


Peer review doesn't really address those problems, though. Reviewers are just people reading papers; if it's hard to reproduce a paper's results, they can't verify that those results are correct. Peer review only helps to decide whether a paper looks worthy of attention, and if you found a paper on arXiv, you probably already have some other reason to think that it might be worth looking at.

If the paper comes with code you can simply run to reproduce the results, that's a stronger signal of correctness than whether it went through peer review or not.


> If the paper comes with code you can simply run to reproduce the results, that's a stronger signal of correctness than whether it went through peer review or not.

1. Most peer reviewed papers do not come with code.

2. Many that do come with code don't work out of the box.


Exactly. That's why it's a stronger signal.


i still don't understand the force of the criticism. what is the danger in reading a "bad" paper in our domain?

the person i was responding to said:

>but one should always be careful with non-peer reviewed research.

what do i need to be careful about? developing the wrong intuition?


> this isn't cancer research that informs treatment regimes nor social science that informs public policy. it's code and theorems

This is a dangerous line. As a reviewer in CS, I don't think peer review is easy. The work in this field is complex and requires a lot of time to correctly evaluate the claims.

First, CS/maths is not just code and theorems. There is indeed a lot of research that requires data and empirical validation, or proposes new models.

Second, it's not easy to verify claims, even for code or theorems. This is why peer review requires many reviewers, often 2 to 6, to evaluate one paper for a conference or journal. No one can evaluate in 5 minutes whether a 40-page paper on ArXiv makes sense or not. When I read a 70-page complex crypto paper, I prefer peer-reviewed, community-backed work to a PDF not verified by other researchers. When you're a reviewer, you realize that many papers are butchered, or contain (partially) false statements that are not always easy to spot.


OTOH a paper on arxiv that receives attention is likely, at some point, to be better reviewed by the community than it would be by 1-2 reviewers.


> OTOH a paper on arxiv that receives attention is likely, at some point, to be better reviewed by the community than it would be by 1-2 reviewers.

Anecdotally, I've occasionally received feedback in response to posting a manuscript on arXiv that was as useful as (or more useful than) what I received from the peer review process for the same paper.


it may not make sense to wait like a year for the research to get submitted to a conference, rejected by someone doing competing work, submitted to another conference, rejected by a reviewer who didn’t have time to read it, then submitted to a third conference and accepted.


it would be useful though if arxiv added a field where the authors could note the journal/conference where the paper was actually published
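
arXiv does in fact already have an optional journal-ref field that authors can fill in after publication; the problem is adoption, since many authors never go back and set it. A minimal sketch of reading that field through the public export API (Python; the helper name and the example ID are just illustrative):

    import urllib.request
    import xml.etree.ElementTree as ET

    # Atom feed namespaces used by the arXiv export API
    ATOM = "{http://www.w3.org/2005/Atom}"
    ARXIV = "{http://arxiv.org/schemas/atom}"

    def fetch_journal_ref(arxiv_id):
        # Query the public export API for a single entry's metadata
        url = "http://export.arxiv.org/api/query?id_list=" + arxiv_id
        with urllib.request.urlopen(url) as resp:
            feed = ET.fromstring(resp.read())
        entry = feed.find(ATOM + "entry")
        if entry is None:
            return None
        # journal_ref is optional: present only if the authors set it
        ref = entry.find(ARXIV + "journal_ref")
        return ref.text if ref is not None else None

    # prints the venue string, or None if the field was never filled in
    print(fetch_journal_ref("1706.03762"))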


#peerReview << #myMachineResults



