Replace peer review with “peer replication” (2021) (everydayscientist.com)
583 points by dongping on Aug 6, 2023 | hide | past | favorite | 337 comments


I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

This of course depends a lot on the specific field, but it can easily be months of effort to replicate a paper. You save some time compared to the original as you don't have to repeat the dead ends and you might receive some samples and can skip parts of the preparation that way. But properly replicating a paper will still be a lot of effort, especially when there are any issues and it doesn't work on the first try. Then you have to troubleshoot your experiments and make sure that no mistakes were made. That can add a lot of time to the process.

This is also all work that doesn't benefit the scientists replicating the paper. It only costs them money and time.

If someone cares enough about the work to build on it, they will replicate it anyway. And in that case they have a good incentive to spend the effort. If that works this will indirectly support the original paper even if the following papers don't specifically replicate the original results. Though this part is much more problematic if the following experiments fail, then this will likely remain entirely unpublished. But the solution here unfortunately isn't as simple as just publishing negative results; it takes far more work to create a solid negative result than just trying the experiments and abandoning them if they're not promising.


> I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

They also tend to over-estimate the effect of peer review (often equating peer review with validity).

> If someone cares enough about the work to build on it, they will replicate it anyway. And in that case they have a good incentive to spend the effort. If that works this will indirectly support the original paper even if the following papers don't specifically replicate the original results. Though this part is much more problematic if the following experiments fail, then this will likely remain entirely unpublished.

It can also remain unpublished if other things did not work out, even if the results could be replicated. A half-fictional example: a team is working on a revolutionary new material to solve complicated engineering problems. They found a material that was synthesised by someone in the 1980s, published once and never reproduced, which they think could have the specific property they are after. So they synthesise it, and it turns out that the material exists, with the expected structure but not with the property they hoped. They aren’t going to write it up and publish it; they’re just going to scrap it and move on to the next candidate. Different teams might be doing the same thing at the same time, and nobody coming after them will have a clue.


This waste of effort by way of duplicating unpublished negative results is a big factor in why replicated results deserve to be rated more highly than results that have not been replicated, regardless of the prestige of the researchers or the institutions involved… if no one can prove your work was correct… how much can anyone trust your work…

I have gone down the rabbit hole of engineering research before, and 90% of the time I've managed to find an anecdote, a footnote in later research, or an actual follow-up publication that substantially invalidated the lofty claims of the engineers in the 70s or 80s (which is still, despite this, a genuine treasure trove of unused and sometimes useful aerospace engineering research and development). Unfortunately, outside the few proper publications, a lot of these invalidations aren't properly reverse-cited. I could spend a week cross-referencing before I spot the link and realise that the unnamed work they say they're proving wrong is actually some footnotes containing the only published data (before their new paper) on some old work, surviving as a bad scan on the NASA NTRS server under some obscure title with no keywords related to the topic the research is notionally about…

Academic research can genuinely suck sometimes… particularly when you want to actually apply it.


Publishing only positive/noteworthy results does seem like an embarrassingly obvious major flaw in academia and the greater scientific community.

A research assistant would quickly be thrown out if he/she refused to record negative experimental results, yet we somehow decide that is fine when operating as a collective.


Alternatively, maybe researchers should be encouraged to publish all kinds of results and not just "novel" work. Successes, failures, insights, etc. This idea of how we evaluate researchers is often silly. We're asking people to push the bounds of knowledge and we don't know what will come from it or how long it will take. But we are hyper focused on this short term evaluation.


“They also tend to over-estimate the effect of peer review (often equating peer review with validity).”

In my experience, scientists are comfortably cynical about peer review - even those who serve as reviewers and editors - except maybe junior scientists who haven't gotten burned yet.


It's the general public that equates "peer reviewed" with "definitely correct, does not need to be questioned".


There is genuine merit in peer review though. A lot of the marketing "papers" from OpenAI and Google would benefit from going through a peer review process instead of scholarwashing and uploading straight to Arxiv.

One simple example (from memory): the Bard paper doesn't include results for experiments in which GPT-4 outperforms it. As a result, people come away from these works with an inflated understanding of their capabilities. This wouldn't pass peer review.


You can often see a p-hacked study from a mile away because they measure large numbers of unnecessary variables. They then pick those 4 variables that yield a probably false signal and publish on it. One would think these would never pass peer review, but they're regularly peer reviewed, published, and then, shockingly enough, fail to replicate. Hypothesizing after the results are known falls in the same bucket here. This is why pre-publishing kicked off, yet it's also hardly a savior.
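To make that concrete, here is a minimal sketch (not from the comment; plain NumPy/SciPy, made-up study size) of why measuring many unrelated variables tends to yield a few "significant" ones by chance alone:

    # Minimal sketch: correlate many pure-noise variables with an outcome;
    # at alpha = 0.05, roughly 5% of them will look "significant" by chance.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n_subjects, n_variables = 50, 100                 # hypothetical study size
    outcome = rng.normal(size=n_subjects)             # unrelated to everything below
    measurements = rng.normal(size=(n_variables, n_subjects))

    false_hits = 0
    for m in measurements:
        r, p = stats.pearsonr(m, outcome)             # correlation test per variable
        if p < 0.05:
            false_hits += 1

    # Expect roughly 5 spurious "findings" out of 100 noise variables.
    print(f"{false_hits} of {n_variables} noise variables look significant")

Picking only those few variables and writing the paper around them is exactly the pattern described above.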

The point I make is that peer review cannot be guaranteed to 'fix' science in any way we might like. The Sokal Affair [1] has now been replicated repeatedly, including in peer reviewed journals. The most recent one even got quite cheeky and published it under the names "Sage Owens, Kal Avers-Lynde III" - Sokal III. [2] It always preys on the same weakness - confirmation bias.

[1] - https://en.wikipedia.org/wiki/Sokal_affair

[2] - https://www.nationalreview.com/news/academic-journal-publish...


Industry doesn't need peer review. The proof of the pudding is in the sales.


Then why does it dress its marketing up in academic paper format?


In want of sales.


It gives a nice patina and some legitimacy. They have no need to publish quasi-scientific articles.


This is the final stage of capitalism, where market success is conflated with scientific rigor.


Engineering, not scientific.

High Sales = A large number of people can attest to the engineered good or service being of high enough quality that they will exchange hard earned money for the ability to use it.

Peer Review = Some folks who derive self-worth from citations ask you to add a citation to their work and you do it because you'll probably need to ask them for the same some time later.


This doesn't really account for fraud or abuse. There are plenty of profitable companies whose product is a scam, or that have other factors that make it successful.

Market success isn't the same thing as validity.


Peer review isn't the same thing as validity either. Both are proxies, sales are simply a much better proxy. Especially long term sales from an established company.

Look at it like this: a Peer Reviewed (tm) article comes out saying "Foos cannot Bar, it is impossible". The same day, Apple releases "Bar for Foos, $8/month". Over the next year you see media outlets discussing how well Foo Barring works. Online reviewers talk about how they've incorporated Barring their Foos into everyday life and it has benefitted them in all these great ways. Your colleagues at work mention their fruitful Foo Barring adventure over the past weekend. Routine posts on HN come up where hackers describe how they've incorporated the Foo Barring API's into their own products in some novel way.

Your mother then calls you up to ask if she should get involved with this new Foo Barring thing. What do you say, can Foos Bar?


A quick trip around audiophile companies should quickly disabuse you of the notion that high sales implies any kind of worth to a product. There are several companies making lots of money on selling gold HDMI, ethernet etc cables "to improve sound quality", and plenty of rubes buying them and "hearing the difference".


Real Hi-Fi companies like Sennheiser and Bose make far more money than people selling gold HDMI cables. And they've been making much more money for decades and will continue to make money for decades.

Grifters don't win in the long-term.


>Grifters don't win in the long-term.

Plenty of religions suggest otherwise.

(Dear reader: I'm not referring to your religion - I'm referring to the other religions.)


Wow so edgy and cool. Are you a teenager?


You frequently see this kind of reasoning in medical quackery:

> Butt-candling[1] must work, just look all these happy customers!

But history is replete with ineffective or downright harmful treatments being popular long after the evidence showed them to be ineffective or harmful. Homeopathy is a prime example of this, seeing as those concoctions contain either no active ingredients or (in cases of low dilutions still labeled "homeopathic") contain ingredients picked based on notions of sympathetic magic ("like cures like").

[1] A hopefully fictional example.


This is a perfect example. In the real world, some placebos work. You take them, believe that they will work, and they do. Fantastic! Your problem has been solved, you share your story with friend(s), the cure spreads based on how well it works for them.

It's only in academia that they must instead work in a sterile environment that some person who has never stepped foot off a school campus thinks is "more authentic" for them to be seen as legitimate.


What is honestly a lot more compelling than peer review is multiple supporting pieces of evidence from different sources. One result might be spurious, but if you can find a couple of independent studies showing as much, it's probably a real phenomenon.


Except for the ones that conclude it is all a scam.


It's because we all know it is a con to just get money from universities and governments. We had a long time without this system and we perverted it from its original intent. 3-4 people reading a paper are incapable of proving something is valid (the inverse is true, but also remember there are 3 options: true, false, indeterminate). It's even more silly that we think we can rank them (i.e. top tier venues). Rejection rate isn't a ranking, and it is silly to think it could be (way too easy to hack). We just go with the momentum and because that's what the admins require of us. Otherwise, let's be honest, we're all communicating our work outside the structure of journals and conferences (at least they offer meetups). It's very hard for me to justify their existence if we're being honest. And honestly, I don't see how abandoning these systems wouldn't just save us time and money. Like god, the number of hours I spend reworking things that got no actionable feedback from the previous round is insane. It is just too easy to get non-actionable rejections, and that just wastes everybody's time and money, and frankly holds back science.


Yes, because we know how the metaphorical sausage is made: with unpaid reviewers who have many other, more interesting things to do and often an axe to grind. That is, if they don’t delegate the review to one of their post-docs.


Post doc? In what kind of utopian field did you work? In my former institute virtually all papers were written by PhD candidates, and reviewed by PhD candidates. With the expected effect on quality (due to lack of experience and impostor-syndrome-induced "how can I propose to reject? They are likely better than me"). But the Prof-to-postdoc-to-PhD-ratio was particularly bad (1-2-15).


> "how can I propose to reject? They are likely better than me"

Funny enough, I see exactly the opposite. I've seen this in both reviews I've done and reviews I've received. Just this week I reviewed and saw one of my fellow reviewers write in their justifications: I am not familiar with X, but I am skeptical that the method can scale to a more complex application. Their weaknesses section was extremely generic and it was very clear they didn't understand the work. They gave a weak reject. In fact, when I first started reviewing, I was explicitly told to _only_ accept if I was confident that the work was good. So in my experience, the bias goes the other way from what you are proposing.

Btw, I've even seen undergrads acting as reviewers. I was asked to review in my first year of grad school. I don't think I was qualified then, but I was always a junior reviewer rather than a full one, so idk.


I was reviewing papers starting second semester of grad school with my advisor just signing off on it, so not even PhD candidates, and it was the same for my lab mates too.

Initially we spent probably a few hours on a paper for peer review because we were relatively unfamiliar with the field but eventually I spent maybe a couple of hours doing the review. Wouldn't say peer review is a joke but it's definitely overrated by the public.


> with unpaid reviewers who have many other, more interesting things to do

It's kinda funny. A journal doesn't make the product it sells (the papers that it copyrights). It doesn't pay for the service it performs ("vetting" and editing). And both of these would be done regardless of their existence. I can understand distribution, but that hasn't been useful for over a decade now. What even do these things do anymore?

(btw, I've seen profs delegate to undergrads. And it is quite common for post-docs AND grad students to be reviewers. Trust me, I am one)


> What even do these things do anymore?

Networking, mostly, in the sense that an article in a high impact journal has a higher probability to be integrated in citations networks. The fact that there is some gate keeping means that it’s valuable to be in rather than out, and that’s something you can use to get a position. Also, better journals (which are not necessarily the highest-impact ones) tend to have more thorough peer review (such as 3 reviewers by default instead of 1 or 2, editors who are not afraid to ask for more reviews if the 3 are not conclusive, etc).

> (btw, I've seen profs delegate to undergrads. And it is quite common for post-docs AND grad students to be reviewers. Trust me, I am one)

I am lucky not to have been in that situation when I was a student, and I did not delegate any further when I got the occasional review from the prof when I was a post-doc. But I am unfortunately not surprised.


I am not asking "what do they do" in the sense of bureaucratic circular logic, but rather what they provide us, as scientists, that is unique and not already solved by other things that we do.

What field are you in where post-docs aren't getting calls to review directly from the conference? I'm in ML and it isn't my advisor assigning me reviews, it is the conference.


I don't know how scientists handle peer review but aren't they fighting with peer review to get their papers published and apply for PhD and tenure and grants etc with these publications?


>I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

I think it would be fine to halve the productivity of these fields, if it means that you can reasonably expect papers to be accurate.


I believe that, contrary to popular belief, the implementation of this system would lead to a substantial increase in productivity in the long run. Here's why:

Currently, a significant proportion of research results in various fields cannot be reproduced. This essentially means that a lot of work turns out to be flawed, leading to wasted efforts (you can refer to the 'reproducibility crisis' for more context). Moreover, future research often builds upon this erroneous information, wasting even more resources. As a result, academic journals get cluttered with substandard work, making them increasingly difficult to monitor and comprehend. Additionally, the overall quality of written communication deteriorates as emphasis shifts from the accurate transfer and reproduction of knowledge to the inflated portrayal of novelty.

Now consider a scenario where 50% of all research is dedicated to reproduction. Although this may seem to decelerate progress in the short term, it ensures a more consistent and reliable advancement in the long term. The quality of writing would likely improve to facilitate replication. Furthermore, research methodology would be disseminated more quickly, enhancing overall research effectiveness.


In the current system scientists allocate reproduction efforts to results that they intend to build on. So if you’ve claimed a breakthrough technique for levitating widgets — and I think this widget technique can be used to build spacecraft (or if I think your technique is wrong) — then I will allocate precious time and resources to reproducing your work. By contrast if I don’t think your work is significant and worth following up on, then I allocate my efforts somewhere else. The advantage is that more apparently-significant results (“might cure cancer”) tend to get a bigger slice of very limited resources, while dead-end or useless results (“might slightly reduce flatulence in cats”) don’t. This distributed entrepreneurial approach isn’t perfect, but it works better than central planning. By contrast you could adopt a Soviet-like approach where cat farts and cancer both share replication resources, but this seems like it would be bad for everyone (except the cats.)


>In the current system scientists allocate reproduction efforts to results that they intend to build on. So if you’ve claimed a breakthrough technique for levitating widgets — and I think this widget technique can be used to build spacecraft (or if I think your technique is wrong) — then I will allocate precious time and resources to reproducing your work.

Wouldn't this imply that worthwhile results are already being replicated, so the primary cost has been paid and we just need some method to disseminate this work and then have it factor into the credibility of science? pre-print < peer reviewed < peer replicated, with the last step having internal rankings depending upon how much it has been replicated?

And it'll also show when someone is building on things but not replicating it, which I guess is an issue in some fields more than others.


"Cat fart" research might also be incredibly expensive to replicate compared to the "might cure cancer" research. In that case it would effectively get a bigger slice of resources because we're treating all research the same!


It is analogous to spending half your time writing tests when developing software. Yes it slows you down in the very short term, but in the medium and long term it speeds you up as you have confidence in the code and there aren't bugs lurking everywhere.
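As a toy illustration of that trade-off (hypothetical code, not from the comment): the test costs extra time up front, but anyone can re-run the check later, which is roughly the role a replication plays for a published result.

    # Hypothetical: the "claimed result" is a property of a function;
    # the test is a cheap, repeatable replication of that claim.
    def compound(balance: float, rate: float, years: int) -> float:
        """Claimed behaviour: balance grows by `rate` each year."""
        for _ in range(years):
            balance *= 1 + rate
        return balance

    def test_compound_matches_closed_form():
        # Re-running this is far cheaper than rediscovering a regression
        # after other code has been built on top of the claim.
        assert abs(compound(100.0, 0.05, 10) - 100.0 * 1.05**10) < 1e-9

    test_compound_matches_closed_form()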


It would be more than just half productivity. Not only do you have to do the work twice, but you add the delay of someone else replicating before something can be published and built upon by others. If you are developer, imagine how much your productivity would drop going from a 3 minute build to a 1 day build.


You can easily look up non peer reviewed papers on arxiv. Why would this be different with replication?


Terrible analogy. It might take months to come up with an idea, but someone else should be able to follow your method and implement it much more quickly than it took you to come up with the concept and implement it.


I think you don't understand how much work is involved in just building the techniques and expertise to pull some experiments off (let's not even talk about the equipment).

Even if someone meticulously documents their process, it could still take months to replicate the results.

I'm familiar with lithography/nanofabrication and I know that it is typically the case that a process developed in one clean room cannot be directly applied to a different clean room; instead one has to develop a new process based on the other lab's results.

Even in the same lab it can often happen that if you come back to a process after a longer time, things don't work out anymore and quite a bit of troubleshooting ensues (maybe a supplier for some chemical changed, and even though it should be the same formula it behaves slightly differently).


Months. Haha.

I previously worked in agricultural research (in the private sector), and we spent YEARS trying to replicate some published research from overseas. And that was research that had previously been successfully replicated, and we even flew in the original scientists and borrowed a number of their PhD students for several months, year after year, to help us try to make it work.

We never did get it to fully replicate in our country. We ended up having to make some pretty extreme changes to the research to get similar (albeit less reliable) results here.

We never did figure out why it worked in one part of the world but not another, since we controlled for every other factor we could think of (including literally importing the original team's lab supplies at great expense, just in case there was some trace contaminant on locally sourced materials).


> We never did figure out why it worked in one part of the world but not another

Doesn't that indicate further research is needed? It sounds fascinating to me. (I know it isn't interesting for the people who couldn't get it working.) It also might indicate that the original research was incomplete in the sense that it might be a fluke due to specific conditions in the original country which isn't universal.


You are making an argument for replication, not against it, by stressing how meticulously documenting your process is an "if", stressing how things like a supplier for a chemical can change and render reproduction impossible, and noting that if an observation can only be replicated in one clean room, you effectively only have as long as that clean room remains open to replicate it.

You are almost stressing all the ways we are producing garbage rendered non-reproducible with deficient documentation of processes, changes in supply, and changes in the environment. All three can be minimized through peer replication.


Well, there isn't a way to fix this replicative difficulty for many fields. So you're suggesting that the theoretical understanding gained from having produced a specific material just never be added to scientific understanding at all, ever. In an ideal world with ideal budget allotment there would be time and money to build entire replicating labs that can be moved and modularly reconfigured with absolute control over air particulates, pressure, and flow, sufficient to replicate any room possible anywhere on earth. But if a lab in the Himalayas was able to make a magnet that fundamentally reconfigures our understanding of physical phenomena, should we simply abandon that potential deeper truth until we were able to build a second lab in the Himalayas, or in the middle of the Atlantic?

Reproduction is hard, really really fucking hard. Just saying that we should replicate before trying to understand essentially means cutting off the understanding.

And like others have said, if someone wants to build from it they'll depend on that information being correct, if no one can manage to ever build from it then the idea dies.

Also there's a huge difference between, replicate this study on infant response to stimulus, or spider colony behavior, and, replicate this incredibly intricate semiconductor that took years of configuration to correctly produce.


Usually coming up with an idea is the easy part. For example, in my PhD project, I started with an idea from my advisor that he had in the early 2000s.

Implementing the code for the simulation and analysis of the data? four months, at most. Running the simulation? almost three years until I had data with good enough resolution for publishing.


It’s also very easy to come up with bad ideas — I did plenty of that and I still do, albeit less than I used to. Finding an idea that is novel, interesting, and tractable given your time, skills, resources, and knowledge of the literature is hard, and maybe the most important skill you develop as a researcher.

For a reductive example, the idea to solve P vs NP is a great one, but I’m not going to do that any time soon!


I'd say it is a bad idea because you're not going to succeed at it.


Sure because that’s an easy example and kind of my point. When you’re starting your PhD, it can be hard to determine what questions you can and cannot answer given your skills and surrounding literature. That comes from experience and trying (and failing) a lot.


Horrible take. Taking the LK99 situation as an example: simply copying and adapting a well described growth recipe to your own setup and lab conditions may take weeks. And how would you address situations where measurement setups only exist once on the earth? How would you do peer replication of LHC measurements? Wait for 50 years till the next super-collider is built and someone else can finally verify the results? On a smaller scale: if you need measurements at a synchrotron radiation source to replicate a measurement, is someone supposed to give up his precious measurement time to replicate a paper he isn't interested in? And is the original author of a paper that's in the queue for peer replication supposed to wait for a year or two till the reviewer gets a beamtime on an appropriate measurement station? Even smaller: I did my PhD in a lab with a specific setup that only a single other group in the world had an equivalent to. You simply would not be able to replicate these results.

Peer replication is completely unfeasible in experimental fields of science. The current process of peer review is alright, people just need to learn that single papers standing by themselves don't mean too much. The "peer replication" happens over time anyway when others use the same tools, samples, techniques on related problems and find results in agreement with earlier papers.


Where are you going to get the budget to build a second LHC solely for replication? How are you going to replicate a long-term medical cohort study which has been running for thirty years? What about a paper describing a one-off astronomical event, like the "Wow!" signal? What if you research the long-term impact of high-dose radiation exposure during Chernobyl?

There is plenty of science out there which financially, practically, or ethically simply by definition cannot be replicated. That doesn't mean their results should not be published. If peer review shows that their methods and analysis are sound, there is no reason to doubt the results.


I think your examples are more amenable to replication than you assume:

> Where are you going to get the budget to build a second LHC solely for replication?

In cases like this you could simply have a second, independent team time-sharing the LHC and using it to replicate experiments run by the first team. (And vice- versa). It’s not a perfect replication but it’s probably still an improvement over the “just trust me bro” status quo.

> How are you going to replicate a long-term medical cohort study which has been running for thirty years?

Run two independent studies in parallel from the beginning.

> What about a paper describing a one-off astronomical event, like the "Wow!" signal?

There was a ton of effort invested into trying to replicate that observation! Since nobody else ever managed to do so, we can’t draw any conclusions from it.

> What if you research the long-term impact of high-dose radiation exposure during Chernobyl?

That doesn’t preclude replication unless, for some reason, you’re the only researcher researching the long-term impact of high-dose radiation exposure during Chernobyl.


> In cases like this you could simply have a second, independent team time-sharing the LHC and using it to replicate experiments run by the first team. (And vice- versa). It’s not a perfect replication but it’s probably still an improvement over the “just trust me bro” status quo.

That's not a true replication, and it isn't going to avoid issues like the OPERA experiment measuring neutrinos going faster than the speed of light due to a loose connector. It would not be any different from having the second team just run their own analysis on the data from the first team - at which point the second team can just as well simply validate the first team's analysis like peer review is currently doing.

> Run two independent studies in parallel from the beginning.

So all currently-running long-running research has to be thrown out? What if the two studies find very small differences, are you allowed to publish either of them? Are the two teams allowed to collaborate at all?

> There was a ton of effort invested into trying to replicate that observation! Since nobody else ever managed to do so, we can’t draw any conclusions from it.

You can't "replicate" an observation of a freak astronomical event because you can't trigger a freak astronomical event. At best you can do observations and hope it happens again. We indeed cannot draw any conclusions from it, but that doesn't mean you can't publish papers about it. If replication is mandatory, you would not be allowed to do anything with it at all.

> That doesn’t preclude replication unless, for some reason, you’re the only researcher researching the long-term impact of high-dose radiation exposure during Chernobyl.

It cannot be reproduced because it would be unethical to expose people to near-fatal levels of radiation simply for reproduction. Simply reusing data from the original test subjects isn't a reproduction, after all.


> That's not a true replication

As you already quoted me as saying, it's not a perfect replication. Which is fine! I'm advocating a position of "replicate findings as much as reasonably possible", not a position of "we need to build redundant copies of every multi-billion-dollar research megaproject". My whole point is that this doesn't need to be an absolutist true-or-false sort of thing.

> and it isn't going to avoid issues like the OPERA experiment measuring neutrinos going faster than the speed of light due to a loose connector

Maybe not. I never claimed this would solve every problem in all of science forever.

> So all currently-running long-running research has to be thrown out?

No. I think it's reasonable to propose more rigorous standards for future studies without throwing out every in-progress study that didn't follow those same standards. After all, there are literally centuries of published science that didn't even follow the contemporary standards of peer review, and we haven't thrown any of that out.

> What if the two studies find very small differences, are you allowed to publish either of them?

That's an extremely broad question. You might as well ask, "what does it mean for a finding to be replicated?".

If you and I each independently go out to measure the length of the Golden Gate Bridge in millimeters, there are likely to be very small differences in the result you get and the result I get. There's an expected margin of error here where we can agree that our results are consistent with each other. Sometimes the differences are reasonable and can be explained, and sometimes they can't be explained.
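A small sketch of what "consistent within an expected margin of error" could mean in practice (made-up numbers, a simple two-sigma criterion; not a claim about how any particular field defines agreement):

    # Hypothetical: two independent measurements of the same quantity, each with
    # its own standard uncertainty. Treat them as replicating if the difference
    # is small relative to the combined uncertainty (two-sigma rule here).
    from math import sqrt

    def consistent(value_a: float, sigma_a: float,
                   value_b: float, sigma_b: float, k: float = 2.0) -> bool:
        combined_sigma = sqrt(sigma_a ** 2 + sigma_b ** 2)
        return abs(value_a - value_b) <= k * combined_sigma

    # e.g. a bridge length in millimetres measured by two teams (made-up values)
    print(consistent(2_737_050, 40, 2_737_010, 35))   # True: a 40 mm gap agrees within uncertainty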

Regardless, I also think it might even be valuable for the studies to be published in some form even if they don't replicate at all; just not necessarily with the imprimatur of some credible or prestigious journal.

> Are the two teams allowed to collaborate at all?

I would suggest keeping the two teams independent at least until they both have results. Afterwards, it might be valuable for the team to collaborate in terms of trying to reconcile their results.

> You can't "replicate" an observation of a freak astronomical event because you can't trigger a freak astronomical event. At best you can do observations and hope it happens again. We indeed cannot draw any conclusions from it, but that doesn't mean you can't publish papers about it. If replication is mandatory, you would not be allowed to do anything with it at all.

Yeah, I guess I'm fine publishing a paper that just says "here's this anomalous observation we had" if it's an especially interesting anomalous observation like that.

> It cannot be reproduced because it would be unethical to expose people to near-fatal levels of radiation simply for reproduction.

Obviously. I think you know that's not what I'm suggesting at all here.

> Simply reusing data from the original test subjects isn't a reproduction, after all.

At this point, I agree we're mostly stuck with whatever data we managed to get 37 years ago. But if a similar incident happened in the future, you could have independent teams collecting redundant sets of data.


We could easily 10x the funding and 5x the manpower we throw at STEM research if we actually cared what they produced.

NSF grants distribute 8.5 billion dollars a year, which is less than Major League Baseball (and its Congressionally granted monopoly) makes. The US Congress has directed 75 billion dollars in aid to Ukraine to date.


That would be reasonable, which isn't the point.


The issue that I see is: even if halving productivity is acceptable to the field as a whole; how do you incentivize a given scientist to put in the effort?

This seems particularly problematic because it is already notoriously hard to get tenure and academia is already notoriously unrewarding to researchers who don't have tenure.


Half would only be possible if, for every single paper published by a given team, there existed a second team just as talented as the original team, skilled in that specific package of techniques, just waiting to replicate that paper.


Half is wildly optimistic.


FYI there is at least one science journal that only publishes reproduced research:

Organic Syntheses "A unique feature of the review process is that all of the data and experiments reported in an article must be successfully repeated in the laboratory of a member of the editorial board as a check for reproducibility prior to publication"

https://en.wikipedia.org/wiki/Organic_Syntheses


Started in 1924 and still going strong 100 years later. The gold standard for organic chemistry procedures.

"If you can't reproduce a procedure in Org Syn, it's YOUR fault" - my PhD supervisor


It's simple but not easy: You create another path to tenure which is based on replication, or on equal terms as a part of a tenure package. (For example, x fewer papers but x number of replications, and you are expected to have x replications in your specialty.) You also create a grant funding section for replication which is then passed on to these independent systems. (You would have to have some sort of randomization handled as well.) Replication has to be considered at the same value as original research.

And maybe smaller faculties at R2s pivot to replication hubs. And maybe this is easier for some sections of biology, chemistry and psychology than it is for particle physics. We could start where cost of replication is relatively low and work out the details.

It's completely doable in some cases. (It may never be doable in some areas either.)


Your proposal has a whole slew of issues.

First, people that want to be professors normally do so because they want to steer their research agenda, not repeat what other people are doing without contribution. Second, who works in their lab? Most of the people doing the leg work in a lab are PhD students, and, to graduate, they need to do something novel to write up in their dissertation. Thus, they can’t just replicate three experiments and get a doctorate. Third, you underestimate how specialized lab groups are — both in terms of the incredibly expensive equipment it is equipped with and the expertise within the lab. Even folks in the same subfield (or even in the same research group!) often don’t have much in common when it comes to interests, experience, and practical skills.

For every lab doing new work, you’d basically need a clone of that lab to replicate their work.


> First, people that want to be professors normally do so because they want to steer their research agenda, not repeat what other people are doing without contribution.

If we're talking about weird incentives and academia you hit on one of the worst ones right here, I think, since nothing there is very closely connected to helping students learn.

I know that's a dead horse, but it's VERY easy to find reasons that we shouldn't be too closely attached to the status quo.

> For every lab doing new work, you’d basically need a clone of that lab to replicate their work.

Hell, that's how startup funding works, or market economies in general. Top-down, non-redundant systems are way more fragile than distributed ecosystems. If you don't have the competition and the complete disconnection, you so much more easily fall into political games of "how do we get this published even if it ain't great" vs "how do we find shit that will survive the competition"


I was thinking that this would be more of a postdoc position, or that most labs would be hybrid: that is, they are doing some portion of original research and replication research. If they are replicating, they are made authors on the paper as a new position on the paper. They get credit, but a different type of credit.

A Ph.D would be expected to perform some replication research as part of their package.

Finally, we would start with the fields that are easy to replicate and move up. We couldn't replicate CERN if we tried. But we could implement this in psychology tomorrow, for example.


I agree that asking psychology grad students to register & replicate a result as part of their training would be a boon to the field -- and useful experience for them.


Another approach I've seen actually used in Computer Science and Physics is to make replication a part of teaching to undergrads and masters candidates. The students learn how to do the science, and they get a paper out of replicating the work (which may or may not support the original results), and the field benefits from the replication.


I think that there are also a lot of psychological/cultural/political issues that would need to be worked out:

If someone wins the Nobel Prize, do the people who replicated their work also win it? When the history books are written do the replicators get equal billing to the people who made the discovery?

When selecting candidates for prestigious positions, are they really going to consider a replicator equal to an original researcher?


It's not easy because it isn't simple. How do you get all of the universities to change their incentives to back this?


We agree - the "simple not easy" turn of phrase is speaking to that point. It is easy once implemented, but it isn't easy to transition. (I am academia-adjacent by marriage but closer to the humanities, so I understand the amount of work it would take to perform the transition.)


This isn't just not easy, it would probably be extremely political to change the structure of the NSF, National Labs, all universities and colleges, etc., so dramatically.


> x fewer papers but x number of replications, and you are expected to have x replications in your specialty.

Could it be simplified even further to say x number of papers, but they only count if they're replicated by others in the field?


No, the idea is that the same researcher should produce k papers and n replications, instead of just k + n published papers.

I'd argue that since replication is somewhat faster than original research, the requirement would count a replication somewhat lower than an original paper (say, at 0.75).


That is my idea... If we opened it up, there's probably more interesting iterations, such as requiring pre-registration for all papers, having papers with pre-registration count as some portion of a full paper even if they fail so long as the pre-registration passed scrutiny, having non-replicated papers count as some portion of a fully replicated paper, and having replication as a separate category such that there is a minimum k, a minimum n, and a minimum k+n.

The non-easy part of this is that once we start making changes to the criteria for tenure, it opens the door to people trying to stuff in solutions for all of the problems that everyone already knows about. (See above.) Would someone try to stuff in a code-availability requirement for CS conference papers, for example? What does it mean for a poster session? At what point are papers released for pre-print? What does it mean for the tenure clock or the Ph.D clock? Does it mean that pre-tenure can't depend on studies that take time to replicate? What do we do with longitudinal studies?

I think you're looking at a 50 year transition where you would have to start simple and iterate.


Is tenure really as mechanical as "publish this many papers and you get it"? My impression was that it took into account things like impact factor and was much more subjective. If that were the case, then wouldn't you run into problems with whoever decides tenure paying lip service to counting replication or failed pre-registered papers but in practice being biased in favor of original research?


It depends on where you are. The more prestigious the university, the more that impact factors start to matter. There is also the matter of teaching and service, but I'm not talking about the entire tenure system.


Let's be brutally honest with ourselves.

99% of all papers mean nothing. They add nothing to the collective knowledge of humanity. In my field of robotics there are SOOO many papers that are basically taking three or four established algorithms/machine learning models, and applying them to off-the-shelf hardware. The kind of thing where any person educated in the field could almost guess the results exactly. Hundreds of such iterations for any reasonably popular problem space (prosthetics, drones for wildfires, museum guide robots), etc., every month. Far more than could possibly be useful to anyone.

There should probably be some sort of separate process for things that actually claim to make important discoveries. I don't know what that would be or how it should work. In all honesty maybe there should just be fewer papers, however that could be achieved.


> 99% of all papers mean nothing. They add nothing to the collective knowledge of humanity.

A lot of papers are done as a part of the process of getting a degree or keeping or getting a job. The value is mostly in the candidate showing they have the acumen to produce a paper of a quality that meets the publisher and peer review requirements. In some cases, it is to show a future employer some level of accomplishment or renown. The knowledge for humanity is mostly the author's ability to get published.


well yes. But these should go somewhere else than the papers that may actually contain significant results. The problem we have here is that there is an enormous quantity of such useless papers mixed in with the ones actually trying to do science.

I understand that part of the reason for that is that people need to appear as though they are part of the "actually trying" crowd to get the desired job effects. But it is nonetheless a problem, and a large one very worth at least trying to solve.


> I understand that part of the reason for that is that people need to appear as though they are part of the "actually trying" crowd to get the desired job effects

There's another less obvious problem: we don't always know what is groundbreaking and what is not until something is published and is accepted at large as being a really big deal.


99% of science is a waste of time, not just the papers. We just don't know which 1% will turn out not to be. The point is that this is how progress gets made. As such, these 99% definitely are adding to the collective knowledge. Maybe they add very little and maybe it's not worth the effort, but it's not nothing. I think one of the effects of AI progress will be to allow us to extract much more of the little value such publications have (the 99% of papers might not be worth reading but are good enough for feeding the AI).


> In my field of robotics there are SOOO many papers that are basically taking three or four established algorithms/machine learning models, and applying them to off-the-shelf hardware.

This is a direct result of the aggressive "publish or perish" system. I worked as an aide in an autonomous vehicles lab for a year and a half during my undergrad, and while the actual work we were doing was really cool cutting edge stuff, it was absolutely maddening the amount of time we wasted blatantly pulling bullshit nothing papers exactly like you describe out of our asses to satisfy the constant chewing out we got that "your lab has only published X papers this month".


Thank you. Not even saying this to shit on academia, but modern scientific publishing follows the same governing rules as publishing a YouTube video (in principle).

> There should probably be some sort of separate process for things that actually claim to make important discoveries.

This used to be Springer Nature and the likes, but they've had so many retractions in the past years + they broke their integrity in the Schoen scandal, allowing lenience in the review process to secure a prestigious publication in their journal.

In reality, I mean you're probably my academic senior: How does true advancement get publicized these days? You post a YouTube video somewhere. See LK99. No peer review, no fancy stuff, a YouTube video was enough to get Argonne National lab on the case.


99% is bombastic. What I would say is that the median scientific paper is wrong, and I'd back that up with a very long list of things that could make a paper "wrong" or "not even wrong". In the case of physics, everything about string theory may one day be considered "wrong". In the case of medicine, all the studies where N is less than a tenth of the number it would take to draw a reliable conclusion are wrong.
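For a sense of where "the number it would take" comes from, here is a rough sketch using the standard normal-approximation sample-size formula for comparing two group means (the effect size, alpha, and power below are assumptions, not values from the comment):

    # n per group ~ 2 * ((z_{1-alpha/2} + z_{1-beta}) / d)^2, with d = Cohen's d.
    from math import ceil
    from scipy.stats import norm

    def n_per_group(effect_size: float, alpha: float = 0.05, power: float = 0.80) -> int:
        z_alpha = norm.ppf(1 - alpha / 2)    # two-sided significance threshold
        z_beta = norm.ppf(power)             # desired power
        return ceil(2 * ((z_alpha + z_beta) / effect_size) ** 2)

    needed = n_per_group(0.3)                # a modest effect needs ~175 per group
    print(needed, "per group needed;", needed // 10, "per group would be badly underpowered")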


> If someone cares enough about the work to build on it, they will replicate it anyway.

Well, the trouble is that hasn't been the case in practice. A lot of the replication crisis was attempting for the first time to replicate a foundational paper that dozens of other papers took as true and built on top of, and then seeing said foundational paper fail to replicate. The incentives point toward doing new research instead of replication, and that needs to change.


It is the case in my field (ML): if I care enough about a published result I try to replicate it.


This is something very sensible in ML since you likely want to use that algorithm for something else (or to extend / modify it), so you need to get it working in your pipeline and verify it works by comparing with the published result.
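In practice that verification often boils down to something like the following sketch (the metric names, numbers, and tolerance are all hypothetical; the reported values would come from the paper and the reproduced ones from your own pipeline):

    # Hypothetical comparison of a reimplementation against numbers reported in a paper.
    published = {"top1_accuracy": 0.912, "f1": 0.874}    # reported in the paper (made up)
    reproduced = {"top1_accuracy": 0.905, "f1": 0.869}   # measured in your pipeline (made up)
    tolerance = 0.01                                     # acceptable absolute gap

    for metric, reported in published.items():
        gap = abs(reproduced[metric] - reported)
        status = "ok" if gap <= tolerance else "does not replicate"
        print(f"{metric}: reported {reported:.3f}, reproduced {reproduced[metric]:.3f} -> {status}")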

In something like psychology that is likely harder, since the experiment you want to do might be related to but differ significantly from the prior work. I am no psychologist, but I’d like to think that they don’t take one study as ground truth for that reason but try to understand causal mechanisms with multiple studies as data points. If the hypothesis is correct, it will likely present in multiple ways.


> If someone cares enough about the work to build on it, they will replicate it anyway.

Does it really deserve to be called work if it doesn't include a full, working set of instructions that, if followed to a T, allow it to be replicated? To me that's more like pollution, making it someone else's problem. I certainly don't see how "we did this, just trust us" can even be considered science, and that's not because I don't understand the scientific method; it's because I don't make a living with it, and have no incentive to not rock the boat.


I work with code, which is about as reproducible as it is possible to get - the artifacts I produce are literally just instructions on how to reproduce the work I've done again, and again, and again. And still people come to me with some bug that they've experienced on their machine, that I cannot reproduce on my machine, despite the two environments being as identical as I can possibly make them.

I agree that reproduction in scientific work is important, but it is apparently sometimes impossible even in the best possible circumstances. When dealing with physical materials, inexact measurements, margins of error, etc., I think we have to accept that there is no set of instructions that, if followed to a T, will ever ensure perfect replication.


> And still people come to me with some bug that they've experienced on their machine, that I cannot reproduce on my machine

But this is the other way around. Have you ever written a program that doesn't run anywhere except a single machine of yours? Would you release it and advertise it and encourage other people to use it as dependency in their software?

If it only runs on one machine of yours, you don't even know if your code is doing something, or something else in the machine/OS. Or in terms of science, whether the research says something about the world, or just about the research setup.


I think you misunderstand the point of scientific publication here (at least in theory, perhaps less so in practice). The purpose of a paper is typically to say "I have achieved these results in this environment (as far as I can tell)", and encourages reproduction. But the original result is useful in its own right - it tells us that there may be something worth exploring. Yes, it may just be a measurement error (I remember the magic faster than light neutrinos), but if it is exciting enough, and lots of eyes end up looking, then flaws are typically found fairly quickly.

And yes, there are often overly excited press releases that accompany it - the "advertise it and encourage others to use it as a dependency" part of the analogy - but this is typically just noise in the context of scientific research. If that is your main problem with scientific publishing, you may want to be more critical of science journalism instead.

Fwiw, yes of course I've written code that only runs on my machine. I imagine everyone has, typically accidentally. You do it, you realise your mistake, you learn something from it. Which is exactly what we expect from scientific papers that can't be reproduced.


> But the original result is useful in its own right - it tells us that there may be something worth exploring.

I disagree. It shows that when someone writes something in a text editor and publishes it, others can read the words they wrote. That's all it shows, by itself. Just like someone writing something on the web only tells us that a textarea accepts just about any input.

And even if it did show more than that, when someone "explores" it, is the result more of that: something that might be true, might not be, but "is worth exploring"? Then at what point does falsifiability enter into it? Why not right away? To me it's just another variation of making it someone else's problem, kicking the can down the road.

> if it is exciting enough, and lots of eyes end up looking, then flaws are typically found fairly quickly.

If that was true, there wouldn't even be a replication issue, much less a replication crisis. It's like saying open source means a lot of people look at the code, if it's important enough. Time and time again that's proven wrong, e.g. https://www.zdnet.com/article/open-source-software-security-...

> yes of course I've written code that only runs on my machine. I imagine everyone has

I wouldn't even know how to go about doing that. Can you post something that only runs on one of your machines, and you don't know why? Note I didn't say your machine, I said one machine of yours. Would you publish something that runs on one machine of yours but not a single other one, other than to ask "can anyone tell me why this only runs on this machine"? I doubt it.


I think you may be seeing the purpose of these papers differently to me, which may be the cause of this confusion.

The way you're describing a scientific publication is as if it were the end result of the scientific act. To use the software analogy, you're describing publication like a software release: all tests have been performed, all CI workflows have passed, QA have checked everything, and the result is about to be shipped to customers.

But talking to researchers, they see publishing more like making a new branch in a repository. There is no expectation that the code in that branch already be perfect (hence why it might only run on one machine, or not even run at all, because sometimes even something that doesn't work is still worth committing and exploring later).

And just like in software, where you might eventually merge those branches and create a release out of it, in the scientific world you have metastudies or other forms of analysis and literature reviews that attempt to glean a consensus out of what has been published so far. And typically in the scientific world, this is what happens. However, in journalism, this isn't usually what happens, and one person's experimental, "I've only tested this on my machine" research is often treated as equivalent to another person's "release branch" paper evaluating the state of a field and identifying which findings are likely to represent real, universal truths.

Which isn't to say that journalists are the only ones at fault here - universities that evaluate researchers primarily on getting papers into journals, and prestige systems that make it hard to go against conventional wisdom in the field, both cause similar problems by conflating different levels of research or adding competing incentives to researchers' work. But I don't think that invalidates the basic idea of published research: to present a found result (or non-result), provide as much information as possible about how to replicate the result again, and then let other people use that information to inform their work. It just requires us to be mindful of how we let that research inform us.


> But talking to researchers, they see publishing more like making a new branch in a repository.

Well some do, others don't. Like the one who wrote the article this is a discussion of.

https://en.wikipedia.org/wiki/Replication_crisis

> Replication is one of the central issues in any empirical science. To confirm results or hypotheses by a repetition procedure is at the basis of any scientific conception. A replication experiment to demonstrate that the same findings can be obtained in any other place by any other researcher is conceived as an operationalization of objectivity. It is the proof that the experiment reflects knowledge that can be separated from the specific circumstances (such as time, place, or persons) under which it was gained.

Or, in short, "one is none". One might turn into more than one, it might not. Until it does, it's not real.

More snippets from the above WP article:

> This experiment was part of a series of three studies that had been widely cited throughout the years, was regularly taught in university courses

> what the community found particularly upsetting was that many of the flawed procedures and statistical tools used in Bem’s studies were part of common research practice in psychology.

> alarmingly low replication rates (11-20%) of landmark findings in preclinical oncological research

> A 2019 study in Scientific Data estimated with 95% confidence that of 1,989 articles on water resources and management published in 2017, study results might be reproduced for only 0.6% to 6.8%, even if each of these articles were to provide sufficient information that allowed for replication

I'm not saying it couldn't be fine to just publish things because they "could be interesting". But the overall situation seems like quite the dumpster fire to me. As does software, FWIW.


> Note I didn't say your machine, I said one machine of yours.

This thread discusses peer replication; this is not even an analogy.


If you can't even replicate it yourself, what makes you think peers could? We are talking about something not being replicated, not even by the original author. The most extreme version would be something that you could only get to run once on the same machine, and never on any other machine.


You just described the majority of scientific papers. A "working set of instructions" is not really feasible in most cases. You can't include every piece of hard- and software required to replicate your own setup.


Then don't call it science, since it doesn't contribute anything to the body of human knowledge.

I think it's fascinating that we can at the same time hold things like "one is none" to be true, or that you should write tests first, but with science we already got so used to a lack of discipline that we just declare it fine.

It's not hard to not climb a tower you can't get down from. It's the default, actually. You start with something small where you can describe everything that goes into replicating it. Then you replicate it yourself, based on your own instructions. Before that, you don't bother anyone else with it. Once that is done, and others can replicate as well, it "actually exists".

And if that means the majority of stuff has to be thrown out, I'd suggest doing that sooner rather than later, instead of just accumulating scientific debt.


This is a very simplistic view. Why do you believe QC departments exist? Even in an industrial setting, companies make the same thing at the same place on the same equipment after sometimes years of process optimisation of well understood technology. This is essentially a best case scenario and still results fail to reproduce. How are scientists who work at the cutting edge of technology with much smaller budgets supposed to give instructions that can be easily reproduced on the first go? Moreover, how are they supposed to easily reproduce other results?

That is not to say that scientist should not document the process to their best ability so it can be reproduced in principle. I'm just arguing that it is impossible to easily reproduce other people's results. Again when chemical/manufacturing companies open another location they often spend months to years to make the process work in the new factory.


> companies make the same thing at the same place on the same equipment after sometimes years of process optimisation of well understood technology. This is essentially a best case scenario and still results fail to reproduce.

We're not talking about 1 of 10 reproduction attempts failing, we're talking about 100%. And no, companies don't time and time again try to reproduce something that has never been reproduced and fail, to then try again, endlessly. That's just not a thing.

> it is impossible to easily reproduce other people's results

We're also not talking about "easily" reproducing something, but at all. And in principle doesn't cut it, it needs to be reproduced in practice.


Imagine two scientists, Bob and Alice. Bob has spent the last 5 years examining a theory thoroughly. Now he can explain down to the last detail why the theory does not hold water, and why generations of researchers have been wrong about the issue. Unfortunately, he cannot offer an alternative, and nobody else can follow his long winded arguments anyway.

Meanwhile, Alice has spent the last 5 years making the best possible use of the flawed theory, and published a lot of original research. Sure, many of her publications are rubbish, but a few contain interesting results. Contrary to Bob, Alice can show actual results and has publications.

Who do you believe will remain in academia? And, according to public perception, will seem more like an actual scientist?


Then Bob has failed.

Academic science isn’t just the doing science part but the articulation and presentation of your work to the broader community. If Bob knows this space so well, he should be able to clearly communicate the issue and, ideally, present an easily understandable counter example to the existing theory.

Technical folks undervalue presentation when writing articles and presenting at conferences. The burden of proof is on the presenter, and, unless there’s some incredible demonstration at the end, most researchers won’t have the time or attention to slog through your mess of a paper to decipher it. There’s only so much time in the day and too many papers to read.

In my experience, the best researchers are also the best presenters. I’ve been to great talks out of my domain that I left feeling like I understood the importance of their work despite not understanding the details. I’ve also seen many talks in my field that I thought were awful because the presentation was convoluted or they didn’t motivate the importance of their problem / why their work addressed it


I disagree that Bob doesn't produce actual results, or that something that is mostly rubbish, but partly "interesting" is an actual result. We know the current incentives are all sorts of broken, across the board. Goodhart's law and all that. To me the question isn't who remains in academia given the current broken model, but who would remain in academia in one that isn't as broken.

To put a point on it, if public distrust of science becomes big enough, it all can go away before you can say "cultural revolution" or "fascist strongman". Then there'd be no more academia, and its shell would be inhabited by party members, so to speak. I'd gladly sacrifice the ability of Alice and others like her to live off producing "mostly rubbish" to at least have a chance to save science itself.


Sounds like a problem worth solving.


You should.


Also, don't forget that a lot of replication would fundamentally involve going and collecting additional samples / observations / etc in the field area, which is often expensive, time consuming, and logistically difficult.

It's not just "can we replicate the analysis on sample X", but also "can we collect a sample similar to X and do we observe similar things in the vicinity" in many cases. That alone may require multiple seasons of rather expensive fieldwork.

Then you have tens to hundreds of thousands of dollars in instrument time to pay to run the various analyses which are needed in parallel with the field observations.

It's rarely the simple data analysis that's flawed; far more frequently it's subtle issues with everything else.

In most cases, rather than try to replicate, it's best to test something slightly different to build confidence in a given hypothesis about what's going on overall. That merits a separate paper and also serves a similar purpose.

E.g. don't test "can we observe the same thing at the same place?", and instead test "can we observe something similar/analogous at a different place / under different conditions?". That's the basis of a lot of replication work in geosciences. It's not considered replication, as it's a completely independent body of work, but it serves a similar purpose (and unlike replication studies, it's actually publishable).


What's the value in publishing something that is never replicated? If no one ever reproduces the experiment and gets the same results then you don't know if any interpretations based on that experiment are valid. It would also mean that whatever practical applications could have come from the experiment are never realized. It makes the entire pursuit seem completely useless.


> What's the value in publishing something that is never replicated?

Because it presents an experimental result to other scientists that they may consider worth trying to replicate?


Then those unconfirmed results are better put on arxiv, instead of being used to evaluate the performance of scientists. Tenure and grant committees should only consider replicated work.


I don't agree. A published article should not be taken for God's Truth, no matter whether it's replicated or peer reviewed.

Lots of "replicated", "peer-reviewed" research has been found to be wrong. That's fine, it's part of the process of discovery.

A paper should be taken for what it is: a piece of scientific work, a part of a puzzle.


It still has value if we assume the experiment was done by competent, honest people who are unlikely to try to fool us on purpose and unlikely to have made errors.

It would be even better if it was replicated of course.

Depending on what certainty you need you might have to wait for the result of one or several replications, but that is application dependent.
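
For a rough sense of how much a replication buys you, here is a minimal sketch, under assumed numbers (the prior, power, and false-positive rate below are illustrative, not taken from this thread), of how confidence in a single positive finding grows with each successful replication:

    # Posterior probability that a finding is real after k successful
    # replications, under assumed prior and error rates (illustrative only).

    def posterior_after_replications(prior=0.3, power=0.8, alpha=0.05, k=0):
        # Each independent positive result multiplies the odds by power/alpha.
        odds = prior / (1 - prior)
        odds *= (power / alpha) ** (k + 1)  # original positive result plus k replications
        return odds / (1 + odds)

    for k in range(3):
        print(k, round(posterior_after_replications(k=k), 3))
    # 0 -> 0.873, 1 -> 0.991, 2 -> 0.999

Under those assumed numbers, a single successful replication already moves the posterior from roughly 0.87 to 0.99.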


Longer, even!

Some experiments that study biological development or trained animals can take a year or more of fairly intense effort to start generating data.


A year? Some data sets take decades to build up before significant papers can be published on their data. Replication of the dataset is just not feasible.

This whole thread just shows how little the average HNer knows about the academic sciences.


I know people that had to take a 6+ month trip to Antarctica for part of their work and others that had to share time on a piece of experimental equipment with a whole department — they got a few weeks per year to run their experiment and had to milk that for all it’s worth. Even if they had funding, that machine required large amounts of space and staff to keep it running and they aren’t off the shelf products — only a few exist at large research centers.


>I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

Then perhaps those papers shouldn't be published? Or held in any higher esteem than a blog post by the same authors?


An arXiv preprint is like a blog post.

A paper in a peer review journal is like posting a request for reproduction in a heavily moderated mailing list.

A paper in a predatory journal is like the "You are the best ___" prize that you get if you pay to attend the "congress" from a spam invitation.

Neither of them guarantees that the result is true. Publication in some peer-reviewed journals gives a minimal guarantee that the paper is not horribly bad, but I've seen too much crap there too.

I know a few journals and authors in my area that are serious, where I can guess the result will hold, but I find it very difficult to evaluate journals and authors in other areas.


When I looked into this, more than 15 years ago, I thought the difficult portion wasn't sharing the recipe, but the ingredients, if you will - granted I was in a molecular biology lab. Effectively the Material Transfer Agreements between Universities all trying to protect their IP made working with each other unbelievably inefficient.

You'd have no idea if you were going down a well trodden path which would yield no success because you have no idea it was well trod. No one publishes negative results, etc.


I think the current system is just measuring entirely the wrong thing. Yes, fewer papers would be published. But today's goal is "publish papers" not "learn and disseminate truly useful and novel things", and while this doesn't solve it entirely, it pushes incentives further away from "publish whatever pure crap you can get away with." You get what you measure -> sometimes you need to change what/how you measure.

> If someone cares enough about the work to build on it, they will replicate it anyway.

That's duplicative at the "oh maybe this will be useful to me" stage, with N different people trying to replicate. And with replication not a first-class part of the system, the effort of replication (e_R) is high. For appealing things, N is probably > 2. So the total effort is N × e_R.

If you move the burden to the "replicate to publish" stage, you can fix the number of replicas needed at N=2 (or whatever) and you incentivize the original researchers to make e_R lower (which will improve the quality of their research even before the submit-for-publication stage).
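
To make that arithmetic concrete, here is a toy comparison under assumed numbers (N and e_R below are purely illustrative, not measured values):

    # Toy comparison of total replication effort, in person-months.
    # All numbers are assumed for illustration only.

    n_informal = 4         # groups independently re-deriving an appealing result today
    effort_informal = 6.0  # person-months each, with no first-class support for replication

    n_required = 2         # fixed number of replications required before publication
    effort_required = 3.0  # person-months each, if authors are pushed to make replication easier

    print("status quo:", n_informal * effort_informal, "person-months")            # 24.0
    print("replicate-to-publish:", n_required * effort_required, "person-months")  # 6.0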

I've been in the system, I spent a year or two chasing the tail of rewrites, submissions, etc, for something that was detectable as low-effect-size in the first place but I was told would still be publishable. I found out as part of that that it would only sometimes yield a good p-value! And everything in the system incentivized me to hide that for as long as possible, instead of incentivizing me to look for something else or make it easy for others to replicate and judge for themselves.

Hell, do something like "give undergrads the opportunity to earn Master's degrees on top of their BSes, say, by replicating (or blowing holes in) other people's submissions." I would've eaten up an opportunity like that to go really, really deep in some specialized area in exchange for a masters degree in a less-structured way than "just take a bunch more courses."


> This is also all work that doesn't benefit the scientists replicating the paper. It only costs them money and time

If you build upon a result, you almost have to replicate it.

An acquaintance spent years building upon a result that turned out to be fraudulent/p-hacked.


While it is a lot of work, I tend to think that one can then always publish preprints if they can't wait for the replication. I don't understand why a published paper should count as an achievement (towards tenure or funding) at all before the work is replicated. The current model just creates perverse incentives to encourage lying, P-hacking, and cherry-picking. This would at least work for fields like machine learning.

This is, of course, a naive proposal without too much thought into it. But I was wondering what I would have missed here.


and in this proposal, who will be tasked with replicating the work?


In some fields, replication is already a prerequisite for benchmarking against the SoTA. So the incentives boil down to publishing the replications along with negative results. Or, as some have suggested, make it mandatory for PhD candidates to replicate.

Though it seems possible to game the system by intentionally producing positive/negative replications to collude with or harm the author.


> I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

I don't see how the current system really works either. Fraud is rampant, and a replication crisis is the most common state of most fields.

Basically the current system is failing at finding out what is true. Which is the entire point. That's pretty damn bad.


Fraud seems rampant because you hear about cases of fraud, but not about the tens of thousands of research labs plugging away day after day.


I agree that most labs are probably not out to defraud people. But without replication I don't think it's reasonable to have much confidence in what is published.


replication happens over time. For example, when I did my PhD I wanted to grow TaS2 monolayers on a graphene layer on an iridium crystal. So I took published growth recipes of related materials, adapted them to our setup and then fine-tuned the recipe for TaS2. This way I basically "peer replicated" the growth of the original paper. I then took those samples to a measurement device and modified the sample in-situ by evaporating Li atoms on top (which was the actual paper, but I needed a sample to modify first). I published the paper with the growth recipe and the modification procedure, and other colleagues then took those instructions to grow their own samples for their own studies (I think it was MoS2 on graphene on cobalt that they grew).

This way papers are peer replicated in an emergent manner, because the knowledge is passed from one group to another and they use parts of that knowledge to then apply it to their own research. You have to look at this from a more holistic perspective. Individual papers don't mean too much; it's their overlap that generates scientific consensus.

In contrast, requiring some random reviewer to instead replicate my full paper would be an impossible task. He/she would not have the required equipment (because there are only 2 lab setups in the whole world with the necessary equipment), he/she would probably not have the required knowledge (because my research and his only partially overlap - e.g. we're researching the same materials but I use angle-resolved photoemission experiments and he's doing electronic transport) and he/she would need to spend weeks first adapting the growth recipe to the point where his sample quality is the same as mine.


That's not what publication is about. Publication is a conversation with other researchers; it is part of the process of reaching the truth, not its endpoint.


People in general (at least on da Internetz) seem to focus way too much on single studies, and way too little on meta-studies.

AFAICT meta-studies are the level where we as a society really can try to say something intelligent about how stuff works. If an important question is not included in a meta-study, we (i.e. universities and research labs) probably need to do more research on that topic before we really can say that much about it.


Sure, and scientists need a place to have such conversations.

But publication is not a closed system. The "published, peer-reviewed paper" is frequently an artifact used to decide practical policy matters in many institutions both public and private. To the extent that Science (as an institution in its own right) wants to influence policy, that influence needs to be grounded in reproducible results.

Also, I would not be surprised if stronger emphasis on reproducibility improved the quality of conversation among scientists.


Maybe replication should (and probably does) happen when the published thing is relevant to some entity and also interesting.

I've never seen papers as "truth", but more as "possibilities". After many other "proofs" (products, papers, demos, etc.) you can assign some concepts/ideas the label "truth", but one/two papers from the same group is definitely not enough.


Yeah, passing peer review doesn't mean that the article is perfect and to be taken as truth now (and remember, to err is human; any coder on here has had some long-standing bug that went mostly unnoticed in their code base). It means it passed the journal's standards for novelty, interest, and rigor based on the described methods, as judged by the editor / area chair and peer reviewers who are selected for being knowledgeable on the topic.

Implicit in this process is that the authors are acting in good faith. To treat the authors as hostile is both demoralizing for the reviewers (who wants to be that cynical about their field) and would require extensive verification of each statement well beyond what is required to return the review in a timely manner.

Unless your paper has mathematical theory (and mistakes do slip through), a publication should not be taken as proof of something on its own, but a data point. Over time and with enough data points, a field builds evidence to turn a hypothesis into a scientific theory.


Unfortunately there's a lot of evidence that fraud really is very prevalent and we don't hear about it anywhere near enough. It depends a lot on the field though.

One piece of evidence comes from software like GRIM and SPRITE. GRIM was run over psychology papers and found around 50% had impossible means in them (that could not be arrived at by any combination of allowed inputs) [1]. The authors generally did not cooperate to help uncover the sources of the problems.
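
For context, the GRIM check itself is simple enough to sketch in a few lines. This is a minimal illustration of the idea, not the tool the authors used: for integer-valued responses, a reported mean is only possible if some integer total, divided by the sample size, rounds to that mean.

    import math

    def grim_consistent(reported_mean, n, decimals=2):
        """True if some integer total of n integer-valued scores rounds to reported_mean."""
        target = round(reported_mean, decimals)
        approx = reported_mean * n
        return any(
            round(total / n, decimals) == target
            for total in range(math.floor(approx) - 1, math.floor(approx) + 2)
        )

    # Example: a mean of 5.19 from 28 integer responses is impossible --
    # no integer total divided by 28 rounds to 5.19.
    print(grim_consistent(5.19, 28))  # False
    print(grim_consistent(5.18, 28))  # True (145 / 28 = 5.1786 -> 5.18)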

Yet another comes from estimates by editors of well known journals. For example Richard Horton at the Lancet is no stranger to fraud, having published and promoted the Surgisphere paper. He estimates that maybe 50% of medical papers are making untrue claims, which is interesting in that this intuition matches the number obtained in a different field by a more rigorous method. The former editor of the New England Journal of Medicine stated that it was "no longer possible to believe much of the medical research that is published".

50%+ is a number that crops up frequently in medicine. The famous Ioannidis paper, "Why most published research findings are false" (2005) has been cited over 12,000 times.

Marc Andreessen has said in an interview that he talked to the head of a very large government grant agency, and asked him whether it could really be true that half of all biomedical research claims were fake? The guy laughed and said no it's not true, it's more like 90%. [2]

Elizabeth Bik uncovers a lot of fraud. Her work is behind the recent resignation of the head of Stanford University for example. Years ago she said, "Science has a huge problem: 100s (1000s?) of science papers with obvious photoshops that have been reported, but that are all swept under the proverbial rug, with no action or only an author-friendly correction … There are dozens of examples where journals rather accept a clean (better photoshopped?) figure redo than asking the authors for a thorough explanation." In reality there seem to be far more than mere thousands, as there are companies that specialize in professionally producing fake scientific papers, and whole markets where they are bought and sold.

So you have people who are running the scientific system saying, on the record, that they think science is overrun with fake results. And there is some quantitative data to support this. And it seems to happen quite often now that presidents of entire universities are being caught having engaged in or having signed off on rule-breaking behavior, like image manipulation or plagiarism, implying that this behavior is at least rewarded or possibly just very common.

There are also whole fields in which the underlying premises are known to be false so arguably that's also pretty deceptive (e.g. "bot studies"). If you include those then it's quite likely indeed that most published research is simply untrue.

[1] https://peerj.com/preprints/2064v1/

[2] https://www.richardhanania.com/p/flying-x-wings-into-the-dea...


Maybe doing an experiment twice, even at double the cost, makes more sense, so that we don't all throw away our coffee when coffee is "bad", or throw away our gluten when gluten is "bad", etc. (those are trivial examples). Basically, the cost of performing the science is in many cases minuscule compared to how much it could affect society.


One: doing experiments is already difficult and painful enough.

Two: this drain of resources can't be done for free. Somebody will need to pay twice for half of the research [1], and faster. Peers will need to be hired and paid, maybe from the writer's grants. Researchers can't justify giving their own funds to other teams without a profound change in regulation, and even in that case they would be harming their own projects.

[1] as the valuable experts are now stuck validating things instead of doing their own job

It would also open a door for foul play: bogging competitor teams down in molasses by throwing them secondary, silly problems that they know are dead ends, while their own team works on the real deal and takes the advantage to win the patent.


In some fields research can’t be replicated later. Much autism research will NEVER be replicated because the population of those considered autistic is not stable over time.

Other research proves impossible to replicate because whatever experiment was not described in enough detail to actually replicate it, which should be grounds to immediately dismiss the research before publishing, but which can’t truly be caught if you don’t actually try to reproduce.

Finally, these practical concerns don’t even touch on the biggest benefit of reproduction as a standard, which is that almost nobody wants to reproduce research as they are not rewarded for doing so. This would give somebody, namely those who want to publish something, a strong impetus to get that reproduction done, which wouldn’t otherwise exist.


> [...] non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper

Either "peer reviewed" articles describe progress of promising results, or they don't. If they don't, the research is effectively ignored (at least until someone finds it promising). So let's consider specifically the output that describes promising results.

After "peer review" any apparently promising results prompt other groups to build on them by utilizing it as a step or building block.

It can take many failed attempts by independent groups before anyone dares publish the absence of the proclaimed observations, since they may try multiple times, thinking they must have botched it somewhere.

On paper it sounds more expensive to require independent replication, but only because the costs of replication attempts are hidden, and typically surface rather late.

Is it really more expensive if the replication attempts are in some sense mandatory?

Or is it perhaps more expensive to pretend science has found a one-shot "peer reviewed" method, resulting in uncoordinated independent reproduction attempts that may go unannounced before, or even after failed replications?

The pseudo-final word, end of line?

What about the "in some sense mandatory" replication? Perhaps roll provable dice for each article, and use in-domain sortition to randomly assign replicators. So every scientist would be spending a certain fraction of their time replicating the research of others. The types of acceptable excuses for shirking these duties should be scrutinized and controlled. But some excuses should be very valid, for example conscientious objection. If you are tasked to reproduce some of Dr. Mengele's works, you can cop out on condition that you thoroughly motivate your ethical concerns and objections. This could also bring a lot of healthy criticism to a lot of practices, which is otherwise just ignored and glossed over for fear of jeopardizing future career opportunities.
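
A minimal sketch of what "provable dice" plus in-domain sortition could look like, assuming a published randomness beacon value and a registry of eligible labs per field (the beacon string and lab names below are hypothetical):

    import hashlib

    # Deterministic, publicly verifiable assignment: anyone can recompute the
    # draw from the article identifier and the published beacon value.
    def assign_replicators(article_doi, beacon, labs, k=2):
        # Rank every eligible lab by a hash of (beacon, DOI, lab) and take the first k.
        def score(lab):
            return hashlib.sha256(f"{beacon}|{article_doi}|{lab}".encode()).hexdigest()
        return sorted(labs, key=score)[:k]

    labs = ["lab-alpha", "lab-beta", "lab-gamma", "lab-delta"]
    print(assign_replicators("10.1000/example.doi", "2023-08-06-round-1", labs))

Because the draw is a pure function of public inputs, a lab couldn't quietly dodge an assignment without everyone being able to see it; any objection would then have to be made explicitly, as suggested above.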


> I don't see how this could ever work, and non-scientists seem to often dramatically underestimate the amount of work it would be to replicate every published paper.

The alternative is a bunch of stuff being published that people believe is "science" but that doesn't hold up under scrutiny, which undermines the reliability of science itself. The current approach simply gives people reason to be skeptical.


I'm not convinced this proposed alternative is better than the status quo. It's simply not feasible, no matter how many benefits one might imagine.

the concern about skepticism is not irrelevant, but many of these skeptics also are skeptical of the earth being round, or older than a few thousand years, or not created by an omnipotent skylord, and I'm not sure it's actually a significant concern given the current number and expertise of those who are skeptical

so, we can hear their arguments for their skepticism, but that doesn't mean the arguments are valid to warrant the skepticism exhibited. And in the end, that's what matters: skepticism warranted by valid arguments, not just any Cletus McCletus's skepticism of heliocentrism, as if his opinion is equal to that of an astrophysicist (it isn't). And you know what? It isn't necessary to convince a ditch digger that the earth goes around the sun, if they feel like arguing about it.


> skeptical of the earth being round

I’m skeptical that these people truly exist outside of the internet wanting it to be true.


> I’m skeptical that these people truly exist outside of the internet wanting it to be true

I'm not, just like I'm not skeptical that climate change denialists truly exist outside of the internet

there's simply no valid argument to warrant skepticism of either, given the ease of locating evidence that both do


Sure there is. Ease of proof.

I can walk outside and prove gravity.

I can find a million pictures of the earth taken from space, look at a globe and view trade routes that circumnavigate it. I can also look to the sky and see the sun and moon are clearly circular which makes a pretty good case for a pattern.

Climate change or the age of the earth are based on a whole lot more interconnected bits of science, so much so that even if you studied your entire life you could not truly say you understand it all. You’re putting your trust in layers of science that add up to a certain conclusion (which is good). When people are given good reasons to believe that science and peer reviews aren’t always legitimate, it undermines that process of trust building on trust.

Modern aviation is layer upon layer of science building on each other, but I can easily watch a plane takeoff to validate all of those processes.

That’s it in a nutshell. If you can easily replicate it, it’s easy to trust. If you can’t, it’s not…especially when it’s used to drive politics.


as far as I can tell, you weren't asked to prove either climate change or a round earth, or any concept at all

recall the tangent we deflected to was the mere existence of round earth deniers, climate change deniers (read: people), both of which do, in fact, exist as real people in the real world, as can be easily confirmed by anyone researching the topic in good faith, like I said

anyhow, back on topic: the proposed alternative just doesn't seem better than the status quo, no matter how you've sliced it so far, for the reasons given in my original post and ignored by you


> there's simply no valid argument to warrant skepticism of either, given the ease of locating evidence that both do

I wasn’t proving either, just validating skepticism


you weren't able to do so, since "X is ridiculous to deny" is orthogonal to "there exist people who deny X", and the latter is the tangent deflected to. Like I said, anyone actually spending ten seconds researching the topic in good faith would discover that both round-earth denialists and climate-change denialists exist in real life, no matter how ridiculous their beliefs

anyhow, back on topic: the proposed alternative just doesn't seem better than the status quo, no matter how you've sliced it so far, for the reasons given in my original post and ignored by you


I explained plain as day how it’s easy to be skeptical of anything that can’t be verified quickly and then explained the difference.

There’s not a lot left to say if you’re going to ignore that.


>I explained plain as day how it’s easy to be skeptical of anything that can’t be verified quickly and then explained the difference

maybe, maybe not. Problem is, that's totally, absolutely, and completely irrelevant to your tangent, which was: you doubted the existence of round earth deniers in real life

(when they do, in fact, exist, just like climate change deniers, no matter how ridiculous either of them or their beliefs are)

see, you neglect that some people are also skeptical of things which CAN be verified quickly, for many varied reasons which we'll for expedience summarize here as "the dumb", and many of these round-earth-denying, perhaps climate-change-denying people are in the "we're losing faith in science!!!1" crowd

which brings us back on topic: the proposed alternative just doesn't seem better than the status quo, no matter how you've sliced it so far, for the reasons given in my original post and ignored by you


Yes it would indeed mean slowing down and having more scientists.

It would mean disruption is no longer a useful tool for human development.


I don't necessarily think it would mean more scientists, but it would mean more expense. You have a moderate number of low impact papers that people are doing for tenure today - papers for the purpose of cranking out papers. We are talking about redirecting efforts but increasing quality of what you have.


It may not be. I would be willing to argue that there was a tipping point and we've long exceeded its boundary - progress and disruption now is just making finding an equilibrium in the future increasingly difficult.

So entering into a paradigm where we test the known space - especially presently - would 1) help reduce cruft; 2) abate undesirable forward progress; 3) train the next generation(s) of scientists to be more diligent and better custodians of the domain.


> I don't see how this could ever work,

http://www.orgsyn.org/

> All procedures and characterization data in OrgSyn are peer-reviewed and checked for reproducibility in the laboratory of a member of the Board of Editors

Never is a strong word.


What about a system where peer replication is required if the number of citations exceeds some threshold?


Who will be replicating it? Why would I want to set aside my own research to replicate some claim someone made? How would this help my career?


I dunno. Offhand, I guess whoever is citing the work would need to replicate it, but only if it's cited sufficiently (overall number of citations, considered foundational, etc...)

This could help your career by increasing the probability that the work you're citing is more likely accurate, and as a result, your work is also likely more accurate.


A typical paper may cite dozens or hundreds of other papers. This does not sound feasible. It honestly seems like it would worsen the existing problem and force scientists to focus even more on their own original research in isolation from others, to avoid the expense of running myriad replication experiments that they likely don't have the funding and personnel to do.


Academia's values are not objective. Why is it that replicating or refuting a study is not seen as on par with being a co-author of said study? There is nothing set in stone preventing this, only the current academic culture.


Because I want to do original research, and be known for doing original research. Only if I fail at that, I might settle for being a guy who reproduces others’ work (which basically means the transition from a researcher to an engineer).


Whether or not you would be doing original research depends on whether the cited work can be replicated.

If the cited work is unable to be replicated, and you try to replicate but get different results, then you would be doing original research, and then you can base further work on your initial original study that came to a different result.

On the flip side, if you are able to replicate it, then you are doing extra work initially, but after replicating the work you've cited, the work you've done is more likely to be reproducible by someone else.

The amount of citations needed to require replication could itself be a function of how easy it is to replicate work across an entire field.

A field where there's a high rate of success in replicating work could have a higher threshold for requiring replication compared to a field where it's difficult to replicate work.
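
As a sketch of that rule (my own illustration of the idea above, not an existing policy; the baseline number is an assumption):

    BASE_THRESHOLD = 20  # assumed baseline citation count

    def replication_required(citations, field_replication_rate):
        # field_replication_rate: fraction of attempted replications in the
        # field that succeed, 0 <= rate < 1; higher success -> higher threshold.
        threshold = BASE_THRESHOLD / (1.0 - field_replication_rate)
        return citations > threshold

    print(replication_required(citations=150, field_replication_rate=0.4))  # True  (threshold ~33)
    print(replication_required(citations=150, field_replication_rate=0.9))  # False (threshold 200)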


So you want to be known for original results, which cannot be confirmed to be true? So what kind of results are they?

Also your mentality is exactly part of the problem: you arrogantly believe that replication work is beneath you and that originality is all that matters.


> So you want to be known for original results, which cannot be confirmed to be true?

My results can be confirmed by anyone who wishes to do so, because I always publish my code.

> you arrogantly believe that replication work is beneath you

I’ve done plenty of replication work when it was needed for my research. Replicating other people’s work, however, is not my responsibility, and it is not usually required to perform original research.


>Replicating other people’s work however is not my responsibility

so you want to do science.... which requires there to be replication.... but somehow it's not your responsibility to replicate anything? last I checked the scientific method has to be followed through all the steps to be valid, not just the ones that you personally like


I usually replicate someone's results for one of two reasons:

1. I don't trust the results

2. I want to understand their work in sufficient detail

Usually these reasons apply when I'm building directly on top of someone's work. This is not always the case. Often my research is based on ideas which could be tested by generating my own results. As long as I can replicate my own results I don't see a problem.


>As long as I can replicate my own results I don't see a problem.

replicating your own results is not science though, it's anecdote


If what the parent poster discovers is interesting, and other people consider it to be valuable, they will replicate their results in the process of building on top of it.

High-impact work that people care about gets built on, and replicated. Low impact work does not.


My Master's thesis was basically taking a purely theoretical paper and "replicating" it, by which I mean taking the formulae and just writing the software to run them. It sounds trivial to an outsider but even that was I guess 300 hours of work.

In general I think undergraduate projects are a great space to attempt to replicate findings, but it heavily depends on the field. Fundamental physics experiments can be expensive and require equipment that's outside the reach of undergrads. But one thing I love about engineering as an academic field, by comparison, is that anything you research tends to be more achievable for others to replicate because as your end goal you are aiming for something that's practical in the field.


Indeed, a friend of mine was running experiments on quantum computing - the first 18 months on each run of his tests, as he moved between jobs and universities, were spent just setting up the necessary refrigeration systems (and generally building them from scratch each time).

Only then could he even start building the experiment - total time to run it all seems to run across years.


Well, you could put incentives in place to make replication attractive. Give credit for replication. Give money to the researchers doing the replication/review. Today we pay an average of 2000€ per article, reviewers get 0€, and the publisher keeps all of it for putting a PDF online. I would say there is margin there to invest in improving the review process.


It's wild to me that although we know that it was Ghislaine Maxwell's daddy who started this incredibly corrupt system, people hardly mention this fact.

The US system, and others, even attack people who dare to try and make science more open. RIP Aaron Swartz, and long live Alexandra Elbakyan.


> This is also all work that doesn't benefit the scientists replicating the paper. It only costs them money and time.

Maybe this is what needs to change. If we only reward discovery and success, then the incentive is to only produce discovery and success.


Please excuse my ignorance, but I'm not convinced.

What are we supposed to do in a hundred years when the scientists of today are dead and we have a bunch of results with important implications that aren't documented well enough to replicate?


If it is not replicated it shouldn't be published, other than as a provisional draft. I don't care if it hurts your feelings.


The pressure to replicate would make folks publish things in forms that are easier to replicate. This cost would go down over time.


Yes, the amount of work done that could go into more paper churn?


The purpose of science publications is to share new results with other scientists, so others can build on or verify the correctness of the work. There has always been an element of “receiving credit” to this, but the communication aspect is what actually matters from the perspective of maximizing scientific progress.

In the distant past, publication was an informal process that mostly involved mailing around letters, or for a major result, self-publishing a book. Eventually publishers began to devise formal journals for this purpose, and some of those journals began to receive more submissions than it was feasible to publish or verify just by reputation. Some of the more popular journals hit upon the idea of applying basic editorial standards to reject badly-written papers and obvious spam. Since the journal editors weren’t experts in all fields of science, they asked for volunteers to help with this process. That’s what peer review is.

Eventually bureaucrats (inside and largely outside of the scientific community) demanded a technique for measuring the productivity of a scientist, so they could allocate budgets or promotions. They hit on the idea of using publications in a few prestigious journals as a metric, which turned a useful process (sharing results with other scientists) into [from an outsider perspective] a process of receiving “academic points”, where the publication of a result appears to be the end-goal and not just an intermediate point in the validation of a result.

Still other outsiders, who misunderstand the entire process, are upset that intermediate results are sometimes incorrect. This confuses them, and they’re angry that the process sometimes assigns “points” to people who they perceive as undeserving. So instead of simply accepting that sharing results widely to maximize the chance of verification is the whole point of the publication process, or coming up with a better set of promotion metrics, they want to gum up the essential sharing process to make it much less efficient and reduce the fan-out degree and rate of publication. This whole mess seems like it could be handled a lot more intelligently.


Very well put. This is the clearest way of looking at it in my view.

I’ll pile on to say that you also have the variable of how the non-scientist public gleans information from the academics. Academia used to be a more insular cadre of people seeking knowledge for its own sake, so this was less relevant. What’s new here is that our society has fixated on the idea that matters of state and administration should be significantly guided by the results and opinions of academia. Our enthusiasm for science-guided policy is a triple whammy, because 1. Knowing that the results of your study have the potential to affect policy creates incentives that may change how the underlying science is performed 2. Knowing that results of academia have outside influence may change WHICH science is performed, and draw in less-than-impartial actors to perform it 3. The outsized potential impact invites the uninformed public to peer into the world of academia and draw half-baked conclusions from results that are still preliminary or unreplicated. Relatively narrow or specious studies can gain a lot of undue traction if their conclusions appear, to the untrained eye, to provide a good bat to hit your opponent with.


A significant problem we face today is the way research, especially in academia, gets spotlighted in the media. They often hyper-focus on single studies, which can give a skewed representation of scientific progress.

The reality is that science isn't about isolated findings; it's a cumulative effort. One paper might suggest a conclusion, but it's the collective weight of multiple studies that provides a more rounded understanding. Media's tendency to cherry-pick results often distorts this nuanced process.

It's also worth noting the trend of prioritizing certain studies, like large RCTs or systematic reviews, while overlooking smaller ones, especially pilot studies. Pilot studies are foundational—they often act as the preliminary research needed before larger studies can even be considered or funded. By sidelining or dismissing these smaller, exploratory studies, we risk undermining the very foundation that bigger, more definitive research efforts are built on. If we consistently ignore or undervalue pilot studies, the bigger and often more impactful studies may never even see the light of day.


Most of this is very legit, but this

> Still other outsiders, who misunderstand the entire process, are upset that intermediate results are sometimes incorrect. This confuses them, and they’re angry that the process sometimes assigns “points” to people who they perceive as undeserving. So instead of simply accepting that sharing results widely to maximize the chance of verification is the whole point of the publication process, or coming up with a better set of promotion metrics, they want to gum up the essential sharing process to make it much less efficient and reduce the fan-out degree and rate of publication.

Does not represent my experience in the academy at all. There is a ton of gamesmanship in publishing. That is ultimately the yardstick academics are measured against, whether we like it or not. No one misunderstands that IMO, the issue is that it's a poor incentive. I think creating a new class of publication, one that requires replication, could be workable in some fields (e.g. optics/photonics), but probably is totally impossible in others (e.g. experimental particle physics).

For purely intellectual fields like mathematics, theoretical physics, philosophy, you probably don't need this at all. Then there are 'in the middle fields' like machine learning which in theory would be easy to replicate, but also would be prohibitively expensive for, e.g. baseline training of LLMs.


And on the extreme end you have the multi-decade longitudinal studies in epidemiology / biomedicine that would be more-or-less impossible to replicate.


I remember reading that some epidemiologists saw the wealth of new data from COVID as a silver lining, because of how few events there are at that scale. Apparently it’s not uncommon to still use the Spanish Flu data, which is spotty at best, because it might be the only thing available at the scale you’re interested in.


IMHO, physicists, especially theoretical physicists, must be able to create a physical model of something, to confirm that their mathematical models are somewhat connected to reality. WTF is a «wave of probability»? WTF is «bending of space-time»? These things are possible in dream-land physics only.


For sharing results widely, there's arxiv. The problem is that the fanout is now overwhelming.

The public perception of a publication in a prestigious journal as the established truth doesn't help either.


> The public perception of a publication in a prestigious journal as the established truth does not help, too.

it's not so much the public perception but what govs/media/tech and other institutions have pushed down so that the public doesn't question whatever resulting policy they're trying to put forth.

"Trust the science" means "Thou shalt not question us, simply obey".

Anyone with eyes who has worked in institutions knows that bureaucracy, careerism and corruption are intrinsic to them.


Your analysis seems to portray all scientists as pure hearted. May I remind you of the latest Stanford scandal where the president of Stanford was found to have manipulated data?

Today, publications do not serve the same purpose as they did before the internet. It is trivial today to write a convincing paper without research and get it published (www.theatlantic.com/ideas/archive/2018/10/new-sokal-hoax/572212/).


No subset of humanity is “pure hearted.” Fraud and malice will exist in everything people do. Fortunately these fraudulent incidents seem relatively rare, when one compares the number of reported incidents to the number of publications and scientists. But this doesn’t change anything. The benefit of scientific publication is to make it easier to detect and verify incorrect results, which is exactly what happened in this case.

I understand that it’s frustrating it didn’t happen instantly. And I also understand that it’s deeply frustrating that some undeserving person accumulated status points with non-scientists based on fraud, and that let them take a high-status position outside of their field. (I think maybe you should assign some blame to the Stanford Trustees for this, but that’s up to you.) None of this means we’d be better off making publication more difficult: it means the metrics are bad.

PS When a TFA raises something like “the replication crisis” and then entangles it with accusations of deliberate fraud (high profile but exceedingly rare) it’s like trying to have a serious conversation about automobile accidents, but spending half the conversation on a handful of rare incidents of intentional vehicular homicide. You’re not going to get useful solutions out of this conversation, because it’s (perhaps deliberately) misunderstanding the impact and causes of the problem.


For your analogy on car accidents - a notable difference between both is that in the case of car accidents, we are able to get numbers on when, how and why they happen and then make conclusions from that.

In this case, we are not even aware of most events of fraud/"bad papers"/manipulation - the "crisis" is that we are losing faith in the science we are doing - results that were cornerstones of entire fields are found to be nonreproducible, making all the work built on top of them pointless (psychology, cancer research, economics, etc. - I'm being very broad).

At this point, we don't know how deep the rot goes. We are at the point of recognizing that it's real, and looking for solutions. For car accidents, we're past that - we're just arguing about what are the best solutions. For the replication crisis, we're trying to find a way forward.

Like that scene in The Thing, where they test the blood? We're at the point where we don't know who to trust.

PS: what's a TFA?


Fraud isn't exceedingly rare :( It only seems that way because academia doesn't pay anyone to find it, reacts to volunteer reports by ignoring it, and the media generally isn't interested.

Fraud is so frequent and easy to find that there are volunteers who in their spare time manage to routinely uncover not just individual instances of fraud but entire companies whose sole purpose is to generate and sell fake papers on an industrial scale.

https://www.nature.com/articles/d41586-023-01780-w

Fraud is so easy and common that there is a steady stream of journals which publish entire editions consisting of nothing but AI-generated articles!

https://www.nature.com/articles/d41586-021-03035-y

Despite being written as a joke over a decade ago, you can page through an endless stream of papers that were generated by SciGen - a Perl script - and yet they are getting published:

https://pubpeer.com/search?q=scigen

The problem is so prevalent that some people created the Problematic Paper Screener, a tool that automatically locates articles that contain text indicative of auto-generation.

https://dbrech.irit.fr/pls/apex/f?p=9999:1::::::

This is all pre-ChatGPT, and is just the researchers who can't be bothered writing a paper at all. The more serious problem is all the human-written fraudulent papers with bad data and bad methodologies that are never detected, or only detected by randos with blogs or Twitter accounts that you never hear about.


The wonderful thing about the western world is that most countries value freedom of the press. The dark side of this is that you can spin up your own “scientific journal” and charge people to publish in it, game the rankings like any common SEO scam, and nobody will stop you because (especially here in the US) you’re exercising your first amendment rights. Then people can fill it with nonsense and even script-generated fake papers. People outside the scientific community can also scam more “legitimate” for-profit journals in various ways, resulting in more silly publications that the actual scientific community has to filter out. It’s very annoying.

None of this has any more bearing on fraud by professional scientists than, say, the existence of some garbage-filled Wikimedia server or a badly-edited Wikipedia page means that the Wikipedia editors themselves are fraudsters.


With respect, I think you should research the topic more deeply before assuming that this is some sort of fringe problem that doesn't exist in the "actual" scientific community. The second link I provided is by Nature News and states specifically that the problem affects "prestigious journals" (their words).

Auto-generated papers have been published in journals from the IEEE, Elsevier, Springer Nature and other well known publishing houses. These papers have supposedly passed peer review in western journals that have been around for decades, and have been signed off by professional academics. Invariably no satisfactory explanation for how this happens is provided, with "we got hacked" being a remarkably common claim. Quite how you publish an entire magazine full of fraudulent articles due to one person getting hacked is unclear; actual newspapers and magazines don't ever have this problem.

Here's an example. The Springer Nature journal "Personal and Ubiquitous Computing" was established in 1997 and has its own Wikipedia page:

https://en.wikipedia.org/wiki/Personal_and_Ubiquitous_Comput...

The Editor-In-Chief is a British academic, who also has his own Wikipedia page. So these aren't fly-by-night no-brand nobodies. Yet this journal somehow managed to publish dozens of obviously auto-generated papers, like this one:

https://static-content.springer.com/esm/art%3A10.1007%2Fs007...

"The conversion of traditional arts and crafts to modern art design under the background of 5G mobile communication network"

or

https://static-content.springer.com/esm/art%3A10.1007%2Fs007...

"The application of twin network target tracking and support tensor machine in the evaluation of orienteering teaching"

The papers are just template paragraphs from totally unrelated topics spliced together. Nobody noticed this had happened until months after publication, strongly implying that this journal has no readers at all (this is a common theme in all these stories; they never seem to notice themselves). The editor agreed the papers were nonsense (his words), but blamed peer reviewers. Yet this journal claims to have a large editorial board with over 40 people on it, mostly from universities in Europe, the USA and China.

What's amazing is that this exact same "attack" had happened before. The previous year Springer Nature had to retract over 400 papers which were auto-generated in the exact same way. They learned nothing and appear to treat the problem as a similar level of severity to filtering email spam.

And in the last six months alone we've seen major fraud scandals impacting Stanford (the President no less), Harvard and Yale. These are supposedly elite universities and researchers. Francesca Gino was earning over $1M a year. Yet their fraud is being uncovered by motivated volunteers, not any kind of systematic well funded science police.

So all the signs here point towards fraud being incredibly easy to get away with. Whole journals have literally no readers at all, academia relies on Scooby-Doo levels of policing, and supposedly prestigious brands are constantly having fraud uncovered by random tweeters, undergrads doing journalism as a hobby etc.


As far as research is concerned, the names you mentioned are not prestigious brands. Universities are organizations that provide facilities and services in exchange for their share of grant money. Publishers publish whatever people are willing to pay for. Prominent mentions of supposedly prestigious institutions are a red flag in science reporting. Very often, either the writer is trying to promote something, or they don't know what they are talking about.

There are two parallel academias. There is the reputable high-trust one, where it's easy to get away with fraud, because people generally don't commit it. And there is the scammy one that exists to help people to game the metrics. While the two overlap a bit, they are mostly disjoint.

If you are an academic, you get a steady stream of spam from the scammy side of the academia. You get calls to submit papers to a conference with "Proceedings by Springer" (but the scope of the conference is barely mentioned), you get invited to become an "ΕԀitоrial ΜҽmƄҽr" of a journal, and so on. Those are like Nigerian letters. They make it very explicit that they are scams, in order to avoid wasting people's time.

You guessed that nobody reads the journal you mentioned, and that's trivially true. Of course nobody reads journals, because their scopes are too wide. No matter what you are working on, most articles in the journals you publish in are irrelevant to you. People read only articles that look interesting or relevant. If nobody cares about an article, it doesn't get read.

While the rest of the world is based on top-down hierarchies, that's not a good model for understanding research. In general, the higher up in the hierarchy you go, the less relevant things become. The article is more relevant than the journal, and the journal is more relevant than the publisher. A rank-and-file professor is more relevant than a department chair. A department chair is more relevant than a dean. And a dean is more relevant than a chancellor/president/whatever.


I hope you appreciate the futility of trying to prove that fraud is ruining science publication by linking to a bunch of publications that capably detect and point out all the fraud.


What makes you think they detect and point out all the fraud? These sites are hobby sites run by people who just run some very basic text filtering software. If fraud were being reliably detected, paper mills wouldn't have a business, yet these companies seem to be quite common and exist in multiple countries.

And recall that I said all this work pre-dates ChatGPT. Using LLMs to generate scientific papers works great, and you won't be able to find them using regexes.

The journals themselves admit there are serious fraud problems and that they don't know what to do about it. So it's very concerning. The world needs a trustworthy scientific literature.


Thank you - just discovered Scigen, these links are incredible


Peer review does not serve to assure replication, but to ensure the readability and comprehensibility of the paper.

Given that some experiments cost billions to conduct, it is impossible to implement "Peer Replication" for all papers.

What could be done is to add metadata about papers that were replicated.


Isn't readability and comprehensibility the job of the editor/journal to check? (After all, they're actually paid.) Maybe not for conferences, but peer review is more for checking if the methodology, scope, claim, direction, conclusion and relevance are sound & trustable.

At least that's my understanding


The editor is often not the right person to decide based on technical details. Most often, the articles they receive are outside their field of expertise and they don't really have a way of deciding whether a section is comprehensible or not. It's very difficult for an outsider to know what bit of jargon is redundant and what bit is actually important to make sense of the results. So this bit of readability check falls to the referees.

In theory editors (or rather copyeditors, the editors themselves have to handle too many papers to do this sort of thing) should help with things like style, grammar, and spelling. In practice, quality varies but it is often subpar.


Highly dependent on journal / field. In mine (mathematics) most associate editors work for free, same as reviewers. The reviewers do all the things you say, and in addition try to ensure readability & novelty. Most journals do have professional copy editing, but that's separate from the content review.

I don't know how refereed conference proceedings work (we don't really use these). The only journals I know of that have professional editors (i.e., editors who are not active researchers themselves) are Nature and affiliated journals, but someone more knowledgeable should correct me here.


> Isn’t readability and comprehensibility the job of the editor/journal to check

Yes, and who do you think asks the reviewers to perform their reviews?

> peer review is more for checking if the methodology, scope, claim, direction, conclusion and relevance are sound & trustable.

No, the parent comment has it right. The only thing being reviewed is the paper, and the point is to make sure it communicates clearly, not that it’s “sound and trustable.”


The editor is basically deferring to people with expertise who can put the paper into context better than they could. The editor might be an expert in the field, but they can't speak for every aspect of it like someone working day to day in that specific aspect of the field could. Sometimes the authors themselves even recommend potentially relevant reviewers for the editor to contact for peer review.


In CS, the editor / journal don’t do those things. Instead, the reviewers do. (Sometimes reviewers “shepherd” papers to help fix readability after acceptance).

Also, most work goes to conferences; journals typically publish longer versions of published works.


Yes, a metadata relationship link would be outstanding. Reproduced in some paper xyz, or by some institution, named individuals, etc. Some kind of structured information would be very useful.


Barriers to publication should be lower for replication studies, I think that’s the main problem.

If someone wants to spend some time replicating something that's only been described in a paper or two, that is valuable work for the community and should be encouraged. If the person is a PhD student using that as an opportunity to hone their skills, it's even better. It's not glamorous, it's not something entirely new, but it is useful and important. And this work needs to go to normal journals, otherwise there'd just be journals dedicated to replication, and their impact factor will be terrible and nobody will care.


There are basically no barriers to publication. There are a number of normal journals that publish everything submitted if it appears to be honest research.


Not nice journals, though. At least not in my experience but that’s probably very field-dependent. It’s not uncommon to get a summary rejection letter for lack of novelty and that is one aspect they stress when they ask us to review articles.


But novelty IS what makes those journals nice and prestigious in the first place. It is the basis of their reputation.

It's basically a catch-22. We want replication in prestigious journals, but any journal with replications becomes less novel and prestigious.

It all comes down to what people value about journals. If people valued replication more than novelty, replication journals would be the prestigious ones.

It all comes back to the fact that doing novel science is considered more prestigious than replication. Institutions can play all kinds of games to try to make it harder for readers to tell novelty apart from replication, but people will just find new ways to signal and determine the difference.

Let's say we pass a law that prestigious journals must publish 50% replications. The prestige will just shift to publishing in that journal with something like "first demonstration" in the title, or to publishing in that journal plus having a high citation or impact value.

It is really difficult to come up with a system- or institution-level solution when novelty is still what individuals value.

As long as companies and universities value innovation, they will figure out ways to determine which scientists are innovative and value them more.


I wonder if undergrads could be harnessed to enter into this kind of work, maybe under the supervision of doctoral students and a well-meaning and interested PI.


Maybe add people as special authors/contributors to the original work.

There always seems to be a contingent of people who think that anything less than a 100% solution is inadequate, so nothing is done. Peer review has proven itself inadequate and people hang on to it tooth and nail. Some disciplines should require replication on everything - I won't name Psychology or Social Sciences in general, but the failure-to-replicate rate for some is unacceptable.


Let's not make perfect the enemy of good. We may never be able to replicate every field, but we could start in many fields today. It means changing our values to make replication a valid path to tenure and promotion, and a required element of Ph.D. studies.


>Peer review does not serve to assure replication, but to ensure the readability and comprehensibility of the paper.

I have had a paper rejected twice in a row over the last year. Both times the comments included something like "paper was very well-written; well-written enough that an undergrad could read it".

Peer review ensures the gates are kept.


>Experiments that cost billions to conduct

If you can't replicate them, it's like they didn't happen anyway


It’s a bit more subtle than that. Not all papers are equal and I’d trust an article from a large team where error and uncertainty analysis has been done properly (think the Higgs boson paper) over a handful of dodgy experiments that are barely documented properly.

But yeah, in the grand scheme of things if it hasn’t been replicated, then it hasn’t been proven, but some works are credible on their own.


So no experiments have happened because I don't have a lab, and CERN is just an elaborate ruse?


Ah yes, if I can’t run the LHC at home, none of the work there happened


For a while Reddit had the mantra “pics or it didn’t happen”.

At least in CS/ML there needs to be a “code or it didn’t happen”. Why? Papers are ambiguous. Even if they have mathematical formulas, not all components are defined.

Peer replication in these fields is an easy low hanging fruit that could set an example for other fields of science.


That is too simplistic. You underestimate the depth of academia. Sure, the latest breakthrough Alzheimer's study or related research would benefit from a replication, which gets done out of commercial interest anyway.

But your run-of-the-mill niche topic will not have the dollars behind it to replicate everyone's research. Just because CS/AI research is very convenient to replicate does not mean this can be extended to all research being done.

That is exactly why peer review exists to weed out the implausible and low effort/relevance work. It is not fraud proof because it was not designed to be.


CS and ML are my field, although I'm no longer active in research. I always made a code archive available. Want to replicate? Download and run.

This should be standard now, in the age of GitHub, GitLab, et al. If a paper discusses an implementation, but doesn't provide code, it is probably BS.


I like the idea of splitting "peer review" into two, and then having a citation threshold standard where a field agrees that a paper should be replicated after a certain number of citations. And journals should have a dedicated section for attempted replications.

1. Rebrand peer review as a "readability review" which is what reviewers tend to focus on today.

2. A "replicability statement", a separately published document where reviewers push authors to go into detail about the methodology and strategy used to perform the experiments, including specifics that someone outside of their specialty may not know. Credit NalNezumi ITT


Every experimental paper I've ever read has contained an "Experimental" section, where they provide the details on how they did it. Those sections tend to be general enough, albeit concise.

In some fields, aside from specialized knowledge, good experimental work requires what we call "hands." For instance, handling air-sensitive compounds, or anything in a condensed or crystalline state. In my thesis experiment, some of the equipment was handmade, by me.

Sometimes specialized facilities are needed. My doctoral thesis project used roughly 1/2 million dollars of gear, and some of the equipment that I used was obsolete and unavailable by the time I finished.


“Concise” isn’t good enough. If other scientists have to read tea leaves to figure out what you’re saying you did, that defeats the entire point of a paper. The purpose of science is to create knowledge that other people can use, and if people can’t replicate your work, that’s not science.


I think the point is you don't have to give a complete BOM that includes where you got the power cables. Each scientist has to decide what amount of information needs to be conveyed. Of course this can be abused, or done sloppily, like anything else.

A place where you can spread out more is in dissertations. Mine contained an entire chapter on the experiment, another on the analysis, and appendices full of source code, schematics, etc. I happily sent out copies, at my expense. My setup was replicated roughly 3 times.


Yes, maybe there should be a process where peer scientists review a paper before it’s published, to make sure it is written clearly enough for other scientists to understand it.


> My doctoral thesis project used roughly 1/2 million dollars of gear,

Wow, I envy you. My doctoral thesis project spent like... USD 2.5k directly on gear (half of it just to buy Lego bricks to build our own instrument, exactly because we couldn't afford a commercial one lol)


I used a 3 billion dollar space telescope. I don't think NASA are going to launch another to replicate some of my results.


Imo, a more realistic thing to do is a "replicability review" and/or a requirement to submit a "methodology map" with each paper.

The former would be a back and forth with a reviewer who inquires and asks questions (based on the paper) with the goal of being able to reproduce the result, but without having to actually reproduce it. This is usually good for finding missing details in the paper that the writer just took for granted everyone in the field knows (I've met Bio PhDs who have wasted months of their lives tracking down experimental details not mentioned in a paper)

The latter would be the result of the former. Instead of having a pages-long "appendix" section in the main paper, you produce another document with meticulous details of the experiment/methodology, every stone turned together with a peer reviewer. Stamp it with the peer reviewer's name so they can't get away with a hand-wavy review.

I've read too many papers where important information needed to reproduce the result is omitted (for ML/RL). If the code is included, I've countless times found implementation details that are not mentioned in the paper. As a matter of fact, there are even results suggesting that those details are the make-or-break of certain algorithms. [1] I've also seen breaking details only mentioned in code comments...

Another atrocious thing I've witnessed is a paper claiming they evaluated their method on a benchmark, and if you check the benchmark, the task they evaluated on doesn't exist! They forked the benchmark and made their own task without being clear about it! [2]

Shit like this makes me lose faith in certain science directions. And I've seen a couple of junior researchers give it all up because they concluded it's all just a house of cards.

[1] https://arxiv.org/abs/2005.12729

[2] https://arxiv.org/abs/2202.02465

Edit: also, if you think that's too tedious/costly, a reminder that publishers rake in record profits, so the resources are already there https://youtu.be/ukAkG6c_N4M
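
To make the "make or break" point concrete, here is a toy illustration I made up (it is not from [1] or [2]): the exact same gradient-descent code either works or diverges depending on a one-line preprocessing step that a terse paper might never mention.

    import numpy as np

    rng = np.random.default_rng(0)

    # Synthetic regression data with badly scaled features.
    X = rng.normal(size=(200, 2)) * np.array([1.0, 1000.0])
    true_w = np.array([2.0, -0.003])
    y = X @ true_w + rng.normal(scale=0.1, size=200)

    def fit_gd(features, targets, lr=1e-3, steps=5000):
        """Plain gradient descent on mean squared error."""
        w = np.zeros(features.shape[1])
        for _ in range(steps):
            grad = 2 * features.T @ (features @ w - targets) / len(targets)
            w -= lr * grad
        return w

    def mse(features, w, targets):
        return float(np.mean((features @ w - targets) ** 2))

    # Variant A: the method exactly as a terse paper might describe it.
    with np.errstate(over="ignore", invalid="ignore"):
        w_raw = fit_gd(X, y)  # diverges for these feature scales

    # Variant B: the same method plus an "obvious" preprocessing step
    # (feature standardisation) that the paper never mentions.
    mu, sigma = X.mean(axis=0), X.std(axis=0)
    X_std = (X - mu) / sigma
    w_std = fit_gd(X_std, y)

    print("MSE without standardisation:", mse(X, w_raw, y))      # inf/nan
    print("MSE with standardisation:   ", mse(X_std, w_std, y))  # roughly the noise floor

Nothing in a typical experimental section forces the author to spell out which variant they actually ran, which is exactly the kind of gap a replicability review could catch.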


> I've met Bio PhDs who have wasted months of their lives tracking down experimental details not mentioned in a paper

Same. Now, when I review manuscripts, I pay much more attention to whether there is enough information to replicate the experiment or simulation. We can put out a paper with wrong interpretations and that’s fine because other people will realise that when doing their own work. We cannot let papers get published if their results cannot be replicated.

> The latter would be the result of the former. Instead of having a pages-long "appendix" section in the main paper, you produce another document with meticulous details of the experiment/methodology, every stone turned together with a peer reviewer. Stamp it with the peer reviewer's name so they can't get away with a hand-wavy review

Things that take too much space to go in the experimental section should go to an electronic supplementary information document. But then it would be nice if the ESI were appended to the article when we download a PDF, because tracking them down is a pain in the backside. Some fields are better than others about this; for example, in materials characterisation studies it's very common to have ESI with a whole bunch of data and details.

Large datasets should go to a repository or a dataset journal; that way the method is still peer reviewed, and the dataset has a DOI and is much easier to re-use. It's also a nice way of doubling a student's paper count by the end of their PhD.

> Another atrocious thing I've witnessed is a paper claiming they evaluated their method on a benchmark, and if you check the benchmark, the task they evaluated on doesn't exist! They forked the benchmark and made their own task without being clear about it! [2]

That’s just evil!


> Large datasets should go to a repository or a dataset journal; that way the method is still peer reviewed, and the dataset has a DOI and is much easier to re-use.

This may be possible in some sciences, but not in epidemiology or biomed. Often the study is based on tissue samples owned by some entity, with permission granted only to a certain entity.

Datasets in epidemiology are often full of PII, and cannot be shared publicly for many reasons.


> the real test of a paper should be the ability to reproduce its findings in the real world. ...

> What if all the experiments in the paper are too complicated to replicate? Then you can submit to [the Journal of Irreproducible Results].

Observational science is still a branch of science even if it's difficult or impossible to replicate.

Consider the first photographs of a live giant squid in its natural habitat, published in 2005 at https://royalsocietypublishing.org/doi/10.1098/rspb.2005.315... .

Who seriously thinks this shouldn't have been published until someone else had been able to replicate the result?

Who thinks the results of a drug trial can't be published until they are replicated?

How does one replicate "A stellar occultation by (486958) 2014 MU69: results from the 2017 July 17 portable telescope campaign" at https://ui.adsabs.harvard.edu/abs/2017DPS....4950403Z/abstra... which required the precise alignment of a star, the trans-Neptunian object 486958 Arrokoth, and a region in Argentina?

Or replicate the results of the flyby of Pluto, or flying a helicopter on Mars?

Here's a paper I learned about from "In The Pipeline"; "Insights from a laboratory fire" at https://www.nature.com/articles/s41557-023-01254-6 .

"""Fires are relatively common yet underreported occurrences in chemical laboratories, but their consequences can be devastating. Here we describe our first-hand experience of a savage laboratory fire, highlighting the detrimental effects that it had on the research group and the lessons learned."""

How would peer replication be relevant?


For some of the things you mentioned (though admittedly not most), there's a dataset (somewhere) and some code run on that dataset (somewhere), and replication would mean someone else being able to run that code on that dataset and get the same results.

Would this require labs to improve their software environments and learn some new tools? Would this require labs to give up whatever used to be secret sauce? That's. The. Point.
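
A minimal sketch of what that lowest bar could look like in practice. All the file names, the script's command-line interface, and the expected digest below are hypothetical; the point is just "rerun the released code on the released data and check the output against what was reported":

    import hashlib
    import subprocess
    import sys

    # Hypothetical paths into a paper's released artifact.
    ANALYSIS_SCRIPT = "analysis/run_analysis.py"
    DATASET = "data/measurements.csv"
    OUTPUT = "results/table2.csv"
    # Digest the authors would publish alongside the paper (placeholder value).
    REPORTED_SHA256 = "0" * 64

    # Rerun the authors' pipeline exactly as released (hypothetical CLI).
    subprocess.run(
        [sys.executable, ANALYSIS_SCRIPT, DATASET, "--out", OUTPUT],
        check=True,
    )

    # Compare a digest of the regenerated output with the reported one.
    with open(OUTPUT, "rb") as f:
        digest = hashlib.sha256(f.read()).hexdigest()

    print("match" if digest == REPORTED_SHA256 else "mismatch", digest)

Even this weak form of replication (same code, same data, same result) would catch a surprising amount of bit rot and undocumented manual steps.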


In practice this is happening in many disciplines, for most research, on a daily basis. What _isn't_ happening is that the results of these replications are being independently peer reviewed, because that isn't incentivized. However, when replication fails for whatever reason, it usually leads to insights that themselves lead to stronger scientific work and better publications later on.


> someone else being able to run that code on that dataset and get the same results.

I think when people talk about "replicate" they mean something more than that.

The dataset could contain coding errors, and the analysis could contain incorrect formulas and bad modeling. Reproducing a bad analysis successfully provides no corrective feedback.

I know that for one paper, I could replicate its results using the paper's own analysis, but I couldn't replicate them using my own analysis.

> Would this require labs to give up whatever used to be secret sauce? That's. The. Point.

That seems to be a very different Point.

Newton famously published results made from using his secret sauce - calculus - by recasting them using more traditional methods.

In the extreme case, I could publish the factors for RSA-1024 without publishing my factorization method. "I prayed to God for the answer and He gave them to me." You can verify that result without the secret sauce.

I mean, people use all sorts of methods to predict a protein structure, including manual tweaking guided by intuition and insight gained during a reverie or day-dream (à la Kekulé) which is clearly not reproducible. Yet that final model may be publishable, because it may provide new insight and testable predictions.
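
To make the RSA point above concrete: checking such a claim needs none of the secret sauce used to find it. A toy sketch with small stand-in numbers (sympy's isprime standing in for a serious primality test at real key sizes):

    from sympy import isprime

    # Toy stand-ins for the RSA-1024 scenario above; verifying the claimed
    # factorisation does not require knowing how the factors were found.
    N = 3233        # published modulus (toy value)
    p, q = 61, 53   # claimed factors (toy values)

    assert p * q == N and isprime(p) and isprime(q)
    print("factorisation verifies")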


My point is that we can, apparently, improve the baseline expectations in the parts of science where this kind of reproducibility is possible. That isn't all science, granted, but it is some science. It isn't a panacea, granted, but it could guard against some forms of misconduct or honest error some of the time. The self-correcting part of science only works when there's something for it to work on, so open data and runnable code ought to improve that self-correction mechanism.


Understood.

But my point is that this linked-to essay appears not only to exclude some areas of good science, but to suggest that any topic which cannot be replicated before publication is only worthy of publication in the Journal of Irreproducible Results.

I gave examples to highlight why I disagree with the author's opinion.

Please do not interpret this to mean I do not think improvement is possible.


> Who seriously thinks this shouldn't have been published until someone else had been able to replicate the result?

Nobody, obviously. You cannot reproduce a result that hasn’t been published, so no new phenomenon is replicated the moment it is first published. The problem is not the publication of new discoveries, it’s the lack of incentives to confirm them once they’ve been published.

In your example, new observations of giant squids are still massively valuable even if not that novel anymore. So new observations should be encouraged (as I am sure they are).

> Or replicate the results of the flyby of Pluto, or flying a helicopter on Mars?

Well, we should launch another probe anyway. And I am fairly confident we'll have many instances of aircraft in Mars' atmosphere and more data than we'll know what to do with. We can also simulate the hell out of it. We'll point spectrometers and a whole bunch of instruments towards Pluto. These are not really good examples of unreproducible observations.

Besides, in such cases robustness can be improved by different teams performing their own analyses separately, even if the data comes from the same experimental setup. It’s not all black or white. Observations are on a spectrum, some of them being much more reliable than others and replication is one aspect of it.

> How would peer replication be relevant?

How would you know which aspects of the observed phenomena come from particularities of this specific lab? You need more than one instance. You need some kind of statistical and factor analyses. Replication in this instance would not mean setting actual labs on fire on purpose.

It’s exactly like studying car crashes: nobody is going to kill people on purpose, but it is still important to study them so we regularly have new papers on the subject based on events that happened anyway, each one confirming or disproving previous observations.


> Nobody, obviously. You cannot reproduce a result that hasn’t been published, .. The problem is not the publication of new discoveries, it’s the lack of incentives to confirm them once they’ve been published.

Your comment concerns post-publication peer-replication, yes?

If so, it's a different topic. The linked-to essay specifically proposes:

""Instead of sending out a manuscript to anonymous referees to read and review, preprints should be sent to other labs to actually replicate the findings. Once the key findings are replicated, the manuscript would be accepted and published.""

That's pre-publication peer-replication, and my comment was only meant to be interpreted in that light.


> That's pre-publication peer-replication, and my comment was only meant to be interpreted in that light.

Sorry, I might have gotten mixed up between threads.

Yeah, pre-publication replication is nice (I do it when I can and am suspicious of some simulation results), but it is not practical at scale. Besides, the role of peer review is not to ensure results are right; that is just not sustainable for referees.


I think in some of those cases you have conclusions drawn from raw data that could be replicated or reviewed. For example, many teams use the same raw data from large colliders, or JWST, or other large science projects to reach competing conclusions.

Yes in a perfect world we would also replicate the data collection but we do not live in a perfect world

The same is true for drug trials. There is always a battle over getting the raw data from drug trials, as the companies claim that data is a trade secret, so independent verification of drug trials is very expensive. But if the FDA required not just the release of redacted conclusions and supporting redacted data but 100% of all data gathered, it would be a lot better IMO

For example, the FDA says it will take decades to release the raw data from the COVID vaccine trials... Why? And that is after being forced to do so via a lawsuit.


> For example, many teams use the same raw data from large colliders, or JWST, or other large science projects to reach competing conclusions.

Yes, but why must the first team wait until the second is finished before publishing?

What if you are the only person in the world with expertise in the fossil record of an obscure branch of snails? You spend 10 years developing a paper knowing that the next person with the right training to replicate the work might not even be born yet.

Other paleontologists might not be able to replicate the work, but they can still tell if it's publishable - that's what they do now, yes?

> but we do not live in a perfect world

Alternatively, we don't live in a perfect world which is why we have the current system instead of requiring replication first.

Since the same logic works for both cases, I don't think it's persuasive logic.

> the FDA says it will take decades

Well, that's a tangent. The FDA is charged with protecting and promoting public health, not improving the state of scholarly literature.

And the FDA is only one of many public health organizations which carried out COVID vaccine trials.


I spent a lot of my graduate years in CS implementing the details of papers only to learn that, time and time again, the paper failed to mention all the shortcomings and failure cases of the techniques. There are great exceptions to this.

Due to the pressure of "publish or die" there is very little honesty in research. Fortunately there are some who are transparent with their work. But for the most part, science is drowning in a sea of research that lacks transparency and falls short on replication.


You'll quickly discover when you enter the workforce that the reasons we have CI/CD, Docker, and virtualization are because of a similar problem: the dreaded "it works on my machine" response.

CI/CD forces people to codify exactly how to build and deploy something in order for it to get into a production environment. Docker and VMs are ways around this by giving people a "my machine" that can be copied and shared easily.


I had a very similar experience in my masters. It really made me think: what exactly are the peers "reviewing" if they don't even know whether the technique works in the first place?


I have reviewed many papers and there is never the time to recreate the work and test it. That is why I love the "Papers with Code" site. I think every published CS paper should require a git repo with all their code and experimental data.


In the PL field, conferences have started to allow authors to submit packaged artifacts (typically, source code, input data, training data, etc) that are evaluated separately, typically post-review. The artifacts are evaluated by a separate committee, usually graduate students. As usual, everything is volunteer. Even with explicit instructions, it is hard enough to even get the same code to run in a different environment and give the same results. Would "replication" of a software technique require another team to reimplement something from scratch? That seems unworkable.

I can't even imagine how hard it would be to write instructions for another lab to successfully replicate an experiment at the forefront of physics or chemistry, or biology. Not just the specialized equipment, but we're talking about the frontiers of Science with people doing cutting-edge research.

I get the impression that suggestions like these are written by non-scientists who do not have experience with the peer review process of any discipline. Things just don't work like that.


Is PL theory actually science? Although we call it computer science, I don't personally think CS is actually a science in the sense of studying nature to understand it. Computers are artificial constructs. CS is a lot closer to engineering than science. Indeed it's kind of nonsensical to talk about replicating an experiment in programming language theory.

For the "hard" sciences, replication often isn't so difficult it seems. LK-99 being an interesting study in this, where people are apparently successfully replicating an experiment described in a rushed paper that is widely agreed to lack sufficient details. It's cutting edge science but replication still isn't a problem. Most science isn't the LHC.

The real problems with replication are found in the softer fields. There it's not just an issue of randomness or difficulty of doing the experiments. If that's all there was to it, no problem. In these fields it's common to find papers or entire fields where none of the work is replicable even in principle. As in, the people doing it don't think other people being able to replicate their work is even important at all, and they may go out of their way to stop people being able to replicate their work (most frequently by gathering data in non-replicable ways and then withholding it deliberately, but sometimes it's just due to the design of the study). The most obvious inference when you see this is that maybe they don't want replication attempts because they know their claims probably aren't true.

So even if peer reviewers or journals were just checking really basic things like, is this claim even replicable in principle, that would be a good start. You would still be left with a lot of papers that replicate fine but their conclusions are still wrong because their methodology is illogical, or papers that replicate because their findings are obvious. But there's so much low hanging fruit.


Well, there are entire areas of CS research tangential to PLs (say, SAT/SMT, software verification, program synthesis) where the fundamental problems are known to be NP-complete/exponential/undecidable. So it is pretty hard to declare a new approach "superior" on purely theoretical grounds: you usually have to run some benchmarks and see how your new approach compares to existing alternatives. And you want these benchmarks to be replicable across different machines and platforms.


Yes, performance involves experiments, but are they scientific experiments? I'm not sure it matters either way; it's a purely semantic debate. The issues that create replicability problems in CS are pretty different from the ones that create issues in other fields. My experience was that they're purely engineering problems rather than problems of incentives or non-replicable designs. If the issue were just that some papers occasionally don't replicate because the authors forgot a detail (they get queried and update the paper), then nobody would care about this issue. It gets attention because that's sadly not what happens.


> I get the impression that suggestions like these are written by non-scientists who do not have experience with the peer review process of any discipline. Things just don't work like that.

Not to mention that the cutting edge in many sciences is perhaps two or three research groups of 5-30 individuals each, at various research institutions around the world.


> Even with explicit instructions, it is hard enough to even get the same code to run in a different environment and give the same results.

Is it really that hard for researchers to standardize around providing Dockerfiles? Environment replication is a solved problem.


> Environment replication is a solved problem.

Unfortunately, no. Dockerfiles aren't as portable as you think, and they are not architecture-independent. VMs are better, but even then, performance isn't portable either.

The last artifact I produced included builds of 3 web browsers from source--it was over 10GB. One doesn't just "build Chrome in a dockerfile".


Peer Review is the right solution to the wrong problem: https://open.substack.com/pub/experimentalhistory/p/science-...

On replication, it is a worthwhile goal but the career incentives need to be there. I think replicating studies should be a part of the curriculum in most programs - a step toward getting a PhD in lieu of one of the papers.


Fear of the frontier... that's why, instead of people getting excited to look for new room-temperature, standard-pressure superconductor candidates, we get a lot of talk downplaying the only known one. Strong link vs weak link reminds me of how some cultures frown on stimulants while other cultures frown on relaxants.


The website dies if I try to figure out who the author (“sam”) is, but it sounds like they are used to some awful backwater of academia.

They have this idea that a single editor screens papers to decide if they are uninteresting or fundamentally flawed, then they want a bunch of professors to do grunt work litigating the correctness of the experiments.

In modern (post industrial revolution) branches of science, the work of determining what is worthy of publication is distributed amongst a program committee, which is comprised of reviewers. The editor / conference organizers pick the program committee. There are typically dozens of program committee members, and authors and reviewers both disclose conflicts. Also, papers are anonymized, so the people that see the author list are not involved in accept/reject decisions.

This mostly eliminates the problem where work is suppressed for political reasons, etc.

It is increasingly common for paper PDFs to be annotated with badges showing the level of reproducibility of the work, and papers can win awards for being highly reproducible. The people that check reproducibility simply execute directions from a separate reproducibility submission that is produced after the paper is accepted.

I argue the above approach is about 100 years ahead of what the blog post is suggesting.

Ideally, we would tie federal funding to double blind review and venues with program committees, and papers selected by editors would not count toward tenure at universities that receive public funding.


The computer science practice you describe is the exception, not the norm. It causes a lot of trouble when evaluating the merits of researchers, because most people in academia are not familiar with it. In many places, conference papers don't even count as real publications, putting CS researchers at a disadvantage.

From my point of view, the biggest issue is accepting/rejecting papers based on first impressions. Because there is often only one round of reviews, you can't ask the authors for clarifications, and they can't try to fix the issues you have identified. Conferences tend to follow fashionable topics, and they are often narrower in scope than what they claim to be, because it's easier to evaluate papers on topics the program committee is familiar with.

The work done by the program committee was not even supposed to be proper peer review but only the first filter. Old conference papers often call themselves extended abstracts, and they don't contain all the details you would expect in the full paper. For example, a theoretical paper may omit key proofs. Once the program committee has determined that the results look interesting and plausible and the authors have presented them in a conference, the authors are supposed to write the full paper and submit it to a journal for peer review. Of course, this doesn't always happen, for a number of reasons.


As much as I agree with the sentiment, we have to admit it isn't always practical. There's only one LIGO, LHC or JWST, for example. Similarly, not every lab has the resources or know-how to host multi-TB datasets for the general public to pick through, even if they wanted to. I sure didn't when I was a grad student.

That said, it infuriates me to no end when I read a Phys. Rev. paper that consists of a computational study of a particular physical system, and the only replicability information provided is the governing equation and a vague description of the numerical technique. No discretized example, no algorithm, and sure as hell no code repository. I'm sure other fields have this too. The only motivation I see for this behavior is the desire for a monopoly on the research topic on the part of authors, or embarrassment by poor code quality (real or perceived).


Gravitational waves are confirmed by watching distant quasars:

https://scitechdaily.com/gravitational-waves-detected-using-...

Hydrodynamic quantum analogs can be used to study quantum particles at macro-scale:

https://en.wikipedia.org/wiki/Hydrodynamic_quantum_analogs

The ESA Euclid near-infrared telescope launched a few weeks ago:

https://www.esa.int/Science_Exploration/Space_Science/Euclid...


One thing I think people are missing is that labs replicate other experiments all the time as part of doing their own research. It's just that the results are not always published, or not published in a like-for-like way.

But the information gets around. In my former field, everyone knew which were the dodgy papers, with results no-one could replicate.


That is something that I have struggled to convey. Working in a scientific field looks a lot different from just reading things from that scientific community. Sub-fields end up really small, and you know what is going on, what the problems are, and what kind of players are in your field.


Reproducibility would become a much higher priority if electronic versions of papers were required (by their distributors, archives, institutions, ...) to have reproduction sections, which the authors would be encouraged to update over time.

UPDATABLE COVER PAGE:

    Title
    Authors
    Abstract:
        Blah, blah, ...

    State of reproduction:
        Not reproduced.
        Successful reproductions: ...citations...
        Reproduction attempts: ...citations...
        Countering reproductions: ...citations...

UPDATABLE REPRODUCTION SECTION ATTACHED AT END:

    Reproduction resources:
        Data, algorithms, processes, materials, ...

    Reproduction challenges:
        Cost, time, one-off events, ...
Making this stuff more visible would help reproducers validate the value of reproduction to their home and funding institutions.

Having a standard section for this, with an initial state of "Not reproduced", provides more incentive for the original workers to provide better reproduction info.

For algorithm and math work, the reproduction could be served best by a downloadable executable bundle.
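
One way such a cover page could be made machine-readable, so that archives and search engines could surface the reproduction state directly. This is only a sketch; every field name below is invented for illustration rather than taken from any existing standard:

    from dataclasses import dataclass, field
    from enum import Enum
    from typing import List

    class ReproductionState(Enum):
        NOT_REPRODUCED = "not reproduced"
        REPRODUCED = "reproduced"
        DISPUTED = "disputed"

    @dataclass
    class ReproductionRecord:
        """One reproduction attempt, referenced by citation (DOI or similar)."""
        citation: str
        successful: bool

    @dataclass
    class ReproductionCoverPage:
        title: str
        authors: List[str]
        state: ReproductionState = ReproductionState.NOT_REPRODUCED
        reproduction_attempts: List[ReproductionRecord] = field(default_factory=list)
        reproduction_resources: List[str] = field(default_factory=list)   # data, algorithms, materials
        reproduction_challenges: List[str] = field(default_factory=list)  # cost, time, one-off events

    # A paper starts out "not reproduced" and the record is updated over time.
    paper = ReproductionCoverPage(title="An example title", authors=["A. Author"])
    paper.reproduction_attempts.append(
        ReproductionRecord(citation="doi:10.0000/placeholder", successful=True)
    )

Because the initial state is explicit, "not reproduced" stops being the invisible default and becomes something authors have an incentive to improve.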


I think Tim Berners-Lee would approve.



Thank you. Currently the original article is throttled.

Seems like the article is not about software code.


You know what I would love to see is metadata attributes surrounding a paper such as [retracted], [reproduced], [rejected], etc. We already have the preprint thing down. Some of these would be implied by being published, i.e. not a preprint. Maybe even a quick symbol for the method of proof relied upon: video evidence, randomized control trial, observational study, sample count of n>1000 (predefined inequality brackets), etc. I think having this quick digest of information would help an individual wade through a lot of studies quickly.


If they are, in fact, implying that another lab should produce a matching data-set to try to replicate results, well, I'm sorry, but that won't work, at least in a whole lot of fields. Data collection can be very expensive, and take a lot of time. It certainly is in my field.

If, on the other hand, they just want the raw data, and let others go to town on it in their own way, that's fine, probably. Results that don't depend on very particular details of the processing pipeline are probably more robust anyway.


In what field is it too expensive or difficult to reproduce the data?


Lots?

Human subjects research, for one. E.g., studies very often involve clinical populations that are very hard to recruit. You can spend tens of thousands in advertising, and multiples more in labor, to get a hundred participants in over the course of an entire year of effort, and that's not even counting the money spent on a clinician doing a diagnosis. And then, when you do, you may pay, say, $1000 per subject for the MRI, plus the $100 you pay directly to the participant themselves.


As reviewers are paid nothing and get no substantial credit for their work, I’m going to say “Every field”. Why would you do a significant chunk of a paper’s work for no reward? Replication studies are typically a bad deal even when you get to pick a notable study to reproduce and you get a paper to your name out of it - replicating what will probably be an obscure paper for no credit is not something most academics (let alone commercial research labs) would entertain.


Off the top of my head: say you want to research biological data from the Mariana trench.

Or even worse: some passing-by comet, or a planet/moon/whatever in the solar system. And just to make things EVEN worse, you need to analyze the data in some destructive way.

Certainly very plausible scenarios, but also ones which could be prohibitively expensive to do multiple times.


How do you replicate a literature review? Theoretical physics? A neuro case? Research that relies upon natural experiments? There are many types of research. Not all of them lend themselves to replication, but they can still contribute to our body of knowledge. Peer review is helpful in each of these instances.

Science is a process. Peer review isn't perfect. Replication is important. But it doesn't seem like the author understands what it would take to simply replace peer review with replication.


I don’t think the existence of papers that are difficult to replicate undermines the value of replicating those that are easier.


We can have tiers. Tier 1 peer reviewed. Tier 2 peer replicated. We can have it as a stamp on the papers.

All PhD programs have a requirement for a minimum number of novel publications. We could add to the requirements a minimum number of replications.

But truth be told, a PhD in science/engineering will probably spend their first two years trying to replicate the SOTA anyway. It's just that today you cannot publish this effort; nobody cares, except yourself and your advisor.


The problem is equating publication with truth.

Publication is a starting point, not a conclusion

Publication is submitting your code. It still needs to be tested, rolled out, evaluated, and time-tested.


It would already be a step in the right direction if papers also published a VM with all their code, data and dependencies. It is nice to have the code (https://blog.arxiv.org/2020/10/08/new-arxivlabs-feature-prov...), but without the necessary dependencies, the correct OS, compiler version, etc., replication is often impossible even with code.
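
Even short of shipping a full VM, the released code could at least emit a record of the environment it actually ran in. A minimal sketch (the manifest file name is arbitrary, and a VM or container image would still capture far more):

    import json
    import platform
    import sys
    from importlib.metadata import distributions

    # Record interpreter, OS and exact package versions next to the results.
    manifest = {
        "python": sys.version,
        "platform": platform.platform(),
        "packages": sorted(
            f"{dist.metadata['Name']}=={dist.version}"
            for dist in distributions()
            if dist.metadata["Name"]
        ),
    }

    with open("environment_manifest.json", "w") as f:
        json.dump(manifest, f, indent=2)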

Having running demos is another step in the right direction (see https://blog.arxiv.org/2022/11/17/discover-state-of-the-art-...).

But outside of computer science, replication is even more difficult. Maybe if people used standardized laboratories and robots, one could replicate findings by rerunning the robot's code in another standard robot lab (basically the idea here is to virtualize laboratory work).

But even then, for the biggest, most complex experiments this will not work: replicate CERN, anyone?


Peer review is also not part of the scientific method. It's nice, but it's not strictly part of the method.

It may be more accurate to suggest that repeatability is part of the scientific method. But even that is not strictly true.

Consider: the single longest-running scientific work was not repeatable, and was not shared with anyone outside the cadre of people doing it. Around 3000 years ago, a secretive caste of astrologers/scribes watched the heavens and recorded their observations for several centuries. They did not publish their findings, thus making them anecdotal (yes, that's what anecdotal means, just that they weren't published). The exact circumstances and variables were never repeatable, due to the movements of the celestial bodies, precession, etc.

Similarly, the UQ pitch drop experiment, having not yet completed, has not been repeated. But it's still an entirely valid scientific experiment.


Back when I was active in academia, our publishers were reluctant to print source code or even repository links (that was largely before GitHub), but they could still share a paper's source on demand. If you reference someone else's paper and want to quote some formula, it is easier and less error-prone to copy rather than retype.

At that point I thought about making a TeX interpreter so one could easily "run a paper" on their own data to see if the paper's claims hold. As it turned out, people often write the same formula in multiple ways, and to make a TeX interpreter you'd have to specify a "runnable" subset and convince everyone to use that subset instead of what they were used to. So the idea stalled.

In a few years, publishing a GitHub link along with the paper became the norm, and the problem disappeared. At least in applied geometry, people do replicate each other's results all the time.


“Running a paper” is honestly challenging these days because of the resource requirements of a lot of scientific code, or the size of certain datasets. One group might have access to a beefy cluster, with no pressure to write very performant code when they can parallelize the work across a few dozen Xeons or have access to TBs of memory. Another group might be running their code on a laptop. Maybe if your data is much larger than the authors' data, their code doesn't even work, since it was designed for much smaller datasets.

Tools like nextflow or snakemake help with respect to having a one-liner to generate all the data in a paper, handle dependencies, list resource expectations, and use your own profile to handle your environment-specific job scheduling commands and parameters. However, this still doesn't do anything about whether you have access to the resources needed.


Currently BOTH are being used - peer review is the first pass, reproduction the second.

Peer review might (or might not) weed out a few papers before they ever get to being reproduced - and that a paper "passed" peer review often means very little. (In some journals more, in some less).

You can't replace peer review with peer replication. Reviewers often do volunteer work - supporting their field and the journal by checking submissions just for any grave errors/mistakes. They often spend just 10 to 15 minutes per submission - for hundreds of submissions. It's not realistic to ask those reviewers to do a full replication attempt for hundreds of submissions.

So any attempt to "replace" review with replication, would end up basically removing review altogether, without increasing the amount of replication attempts being made.


I did some work for online journals where papers got published even IF the peer review was bad and rejections were exceptionally rare.

The review score of the abstract was only used to decide on the best topics to invite for a presentation or talk - and the review score of the paper was used to hand out awards, decide "highlighted" papers, and it also influenced how high up in the search results a given paper might appear.


Both review and replication have their place. The mistake is treating researchers and the scientific community as a machine: "pull here, fill in these forms, comment on this research, have a gold star".

Let people review what they want, where they want, how they want. Let people replicate what they find interesting and motivating to work on.


I review 10-20 papers a year.

It's a ton of unpaid, volunteer work. If I want to be a high-quality reviewer, then it's at least a day (at least 3 thorough reads, taking notes, writing the review, reviewer discussions, post-rebuttal, back-and-forth for journals). I am lucky and privileged that my employer counts this towards work time. Only 20% of papers get accepted in my domain.

Now if I had to spend a week on replicating a paper - and this is CS/graphics, where it's easy and "free" - I'd never volunteer to be a reviewer.

You'd need professional "replicators", but who will pay for them? And who would they be? You need experts, and if you are an expert, you don't want to merely replicate other people's work full time instead of working on your own innovations.


Replication in many fields comes with substantial costs. We are unlikely to see this strategy employed on many/most papers. I agree with other commenters that materials and methodology should be provided in sufficient detail so that others could replicate if desired.


While I agree with the general sentiment of the paper and creating incentives for more replication is definitely a good idea, I do think the approach is flawed in several ways.

The main point is that the paper seriously underestimates the difficulty and time it takes to replicate experiments in many experimental fields. Who will decide which work needs to be replicated? Should capable labs somehow become bogged down with just doing replication work, even if they don't find the results interesting?

In reality, if labs find results interesting enough to replicate, they will try to do so. The current LK-99 hurrah is a perfect example of that, but it happens on a much smaller scale all the time. Researchers do replicate and build on other work all the time; they just use that replication to create new results (and acknowledge the previous work) instead of publishing a "we replicated" paper.

Where things usually fail is in the publication of "failed replication" studies, and those are tricky. It is not always clear whether the original research was flawed or whether the people trying to reproduce it made an error (again, just have a look at what's happening with LK-99 at the moment). Moreover, it can be politically difficult to try to publish a "failed to reproduce" result if you are a small unknown lab and the original result came from a big, well-known group. Most people will believe that you are the one who made the error (and unfortunately big egos might get in the way, and the small lab will have a hard time).

More generally, in my opinion the lack of replication of results is just one symptom of a bigger problem in science today. We (as in society) have made the scientific environment increasingly competitive, under the guise of "value for taxpayer money". Academic scientists now have to constantly compete for grant funding and publish to keep the funding going. It's incredibly competitive to even get in. At the same time they are supposed to constantly provide big headlines for university press releases, communicate their results to the general public, and investigate (and patent) the potential for commercial exploitation. No wonder we see less cooperation.


Seems to have been hugged to death.

But - a quick counterexample as far as replication goes: what if the experiments were run on custom-made or exceedingly expensive equipment? How are the replicators supposed to access that equipment? Even in fields which are "easy" to replicate - like machine learning - we are seeing barriers to entry due to expensive computing power. Or data collection. Or both.

But then you move over to physics, and suddenly you're also dealing with these one-off custom setups, doing experiments which could be close to impossible to replicate (say you want to conduct experiments on some physical event that only occurs every xxxx years or whatever)


The incentive should be to clear the way for tenure track

The junior faculty will clear out the rotten apples at the top by finding flaws in their research, and in return will win the tenure that those apples lose

This will create a nice political atmosphere and improve science


How about we create a Nobel prize for replication? One impressive replication or refutation from the last decade (that holds up) gets the prize, split up to three ways among the most important authors.


Let's get people to publish their data and code first, shall we? That's sooo much easier than demanding that whole studies be replicated... and people still don't do it!


I think this wouldn't work, because many experiments need such specific equipment and expertise that it would be hard to find labs that already have said equipment.


Scientist publishes paper based on ABCD data.

Replicator: Do you know how much data I'll need to collect? 11,000 participants followed across multiple timepoints of MRI scanning. Show me the money.


Definitely something that needs large charitable investment, but charities like that do exist, e.g. the Wellcome Trust


Like 290+ million, just to get started.


Why shouldn't we hold science more accountable?

"Science needs accounting" is a search I had saved for months which really resonates with the idea of "peer replication."

In accounting, you always have checks and balances; you are never counting money alone. In many cases, accountants duplicate their work to make sure that it is accurate.

Auditors are the counterpart of the peer review process. They're not there to redo your work, but to verify that your methods and processes are sound.


Great, but who is going to fund the peer replication? The economics of research currently doesn't even provide compensation for the time spent on peer review.


Maybe the numerous complaints about the crisis of science are somehow related to the fact that scientific work is severely underpaid.

The pay difference between research and industry in many areas is not even funny.


We just need a second LHC with double the number of particle physicists in the world to replicate the observation of the Higgs boson, no big deal


My first thought was "this would never work, there is so much science being published and not enough resources to replicate it all".

Then I remembered that my main issue with modern academia is that everyone is incentivized to publish a huge amount of research that nobody cares about, and that I wish we would put much more work into each of far fewer research directions.


So why haven't "science modules" been developed yet? I picture a library-sized piece of equipment that physically performs the lab work and can be configured akin to CNC machining. Papers would then be submitted with the module program and be easily replicated by other labs.


One of the Nobel prizes in Physics was for the discovery of the Higgs boson at the LHC. It cost billions of dollars just to build the facility, and required hundreds of physicists just to conduct the experiment. You can't replicate this. Although I fully agree that replication must come first when it is reasonably doable.



Quite related: nowadays there is a movement within scientific research, i.e. Open Science, where the (raw) data from one's research is made openly available. Even methods for in-house fabrication and development, together with their source code, are made open (open hardware and open software)


Why not just develop a standard "replication instructions" format that papers would need to adhere to? All methods, source code, ingredients, processes, etc. would be documented in a standard way. This could help weed out a lot of bullshit just from reading this section.


I remember showing someone raw video of a Safire plasma chamber keeping the ball of plasma lit for several minutes. They said they would need to see a peer reviewed paper. The presumption brought about by the enlightenment era that everyone should get a vote was a mistake.


My mind automatically swapped out the word "peer" for "code". It took my brain to interesting places. When I came back to the actual topic, I had accidentally built a great way to contrast some of the discussion offered in this thread.


In the sense of replicating the results, we do have CI servers and even fuzzers running for our "code replication".


I don't want to derail the science discussion too much, but what if you actually had to reproduce the code by hand? Would that process produce anything of value? Would your habit of writing i+=1 instead of i++ matter? Or iteration instead of recursion?

Would code replication result in fewer use-after-free or off-by-one bugs than code review? Or would it mostly be a waste of resources, including time?


I'm not sure it is meaningful to divert the topic to an analogy that is never precise. But we already rerun the same code (the equivalent of the same procedure in a paper) in CI or on our workstations when debugging. Replicating the result of a program doesn't mean one has to rewrite the code; replicating the result of a computer vision paper may only involve reviewing the code and running the code released with the paper.


Is there space in the world for a few publications that only publish replicated work? Seems like that would be a reasonable compromise. Yes you were published, but were you published in Really Real Magazine? Get back to us when you have and we’ll discuss.


One thing that everyone needs to remember about “peer review” is that it isn’t part of the scientific method, but rather that it was imposed on the scientific enterprise by government funding authorities. It’s basically JIRA for scientists.


Seems like a great way for "inferior" journals to gain reputation. Counting citations seems like a pretty silly formula/hack; how often something gets said doesn't affect how true it is.


I review articles all the time. I look for things that tell me about the real work behind them. There are nuances to some experiments that can't be known without replication.


Peer review is not the end. When replication is particularly complex or expensive, peer review may just be a way to see if the study is worth replicating.


Why would you bother replicating someone else’s work (thereby validating it), when you could use that time and resources to do something novel?


If scientists are going to complain that it's too hard or too expensive to replicate their studies, then that just shows their work is BS.


>If scientists are going to complain that it's too hard or too expensive to replicate their studies, then that just shows their work is BS.

1 mg of anti-rabbit antibody (a common thing to use in a lot of biology experiments) is $225 [1]. Outside of things like standard buffers and growth medium for prokaryotes, this is going to be the cheapest thing you use in an experiment.

1/10th of that amount for anti-flagellin antibody is $372. [2]

A kit to prep a cell for RNA sequencing is $6-10 per use. That's JUST isolation of the RNA. Not including reverse transcribing it to cDNA for sequencing, or the sequencing itself. [3]

Let's not even get into things like materials science, where you may be working on an epitaxial growth paper and there are only a handful of labs that could even feasibly repeat the experiment.

Or say something with a BSL-3 lab where there are literally only 15 labs in the US that could feasibly do the work, assuming they aren't working on their own stuff. [4]

[1] https://www.thermofisher.com/antibody/product/Goat-anti-Rabb...
[2] https://www.invivogen.com/anti-flagellin
[3] https://www.thermofisher.com/order/catalog/product/12183018A
[4] https://www.niaid.nih.gov/research/tufts-regional-biocontain...


Nah, it doesn't. It just shows that it's time consuming and expensive to replicate their studies.


If that's the case, then don't claim confidence in the work or make policy decisions based on it. If there is no epistemological humility, then yes, it is still BS.


If a study costs X, the study plus a replication costs somewhere in the ballpark of 2X. This is not trivial.


But this is science we are talking about. A one-off lucky novel result should not be good enough. Why should our standards and our funding be so low?


I guess if software developers complain that it's too hard or too expensive to thoroughly test their code to ensure exactly zero bugs at release[1], then that just shows their work is BS.

[1]: if you have delivered telco code to Softbank you may have heard this sentence


Replication is not the same thing as zero bugs in software.



What if it REALLY is too expensive? You do realize that there are studies which literally cost millions of dollars? Getting funding for original studies is hard enough, good luck securing additional funds for replication.


Something in Switzerland called the Large Hadron Collider comes to mind.

I guess we should not talk about the Higgs before someone else builds a second one and replicates the papers.


Physics is generally better since they have good statistical models and can get six sigma (or whatever) results.

And replication can be done by the same party (although an independent party is better), and that may mean many trials.

And do we even set policy based on the existence or non-existence of Higgs bosons?

I am particularly unhappy with soft sciences in terms of replication.


There would need to be an incentive structure where the first replications get (nearly) the same credit as the original publisher.


What do we recommend for qualitative research, where replicability is not a quality criterion?


How would you peer-replicate observation of a rare, or unique event, for example in astronomy?


Either get your own telescope and gather your own data, or if only one telescope captured a fleeting event, take that data and see if the analysis turns out the same.


I wish we could replicate the LHC.


No talking about the Higgs before that happens, apparently.


We will, don’t worry.


"Replace peer code review with 'peer code testing.'"

Probably not gonna catch on.


"peer code testing" is already the job of the CI server. As it is nothing new, it probably is not going to catch on.


This proves my point: it's so unpopular that it won't happen unless it's automated.


This would not apply to math or to something subjective such as literature. Only experimental results need to be replicated.


Can everything be replicated in every field?


That’s the defining characteristic of engineering. If you can’t reliably replicate everything in an engineering discipline then it’s not an engineering discipline.


I assume that the goal here is to reduce the number of not-actually-valid results that get published. Not-actually-valid results happen for lots of reasons (whoops did experiment wrong, mystery impurity, cherry picked data, not enough subjects, straight-up lie, full verification expensive and time consuming but this looks promising) but often there's a common set of incentives: you must publish to get tenure/keep your job, you often need to publish in journals with high impact factor [1].

High impact journals [6] tend to prefer exciting, novel, and positive results (we tried new thing and it worked so well!) vs negative results (we mixed up a bunch of crystals and absolutely none of them are room-temp superconductors! we're sure of it!).

The result is that cherry-picking data pays, leaning into confirmation bias pays, and publishing replication studies and rigorous-but-negative results is not a good use of your academic inertia.

I think that creating a new category of rigor (i.e. journals that only publish independently replicated results) is not a bad idea, but: who's gonna pay for that? If the incentive is you get your name on the paper, doesn't that incentivize coming up with a positive result? How do you incentivize negative replications? What if there is only one gigantic machine anywhere that can find those results (the LHC, IceCube, a very expensive spaceship, etc.)?

There might be easier and cheaper pathways to reducing bad papers: incentivizing the publishing of negative results and replication studies separately, paying reviewers for their time, and coming up with new metrics for researchers that prioritize different kinds of activity (currently "how much you're cited" and "number of papers * journal impact" metrics are common; maybe a "how many of your results got replicated" score would be cool to roll into "do you get tenure"? See [3] for more details). PLoS, for instance, will publish negative results.

I really like OP's other article about a hypothetical "Journal of One Try" (JOOT) [2] to enable publishing of not-very-rigorous-but-maybe-useful-to-somebody results. If you go back and read OLD OLD editions of Philosophical Transactions (which goes back to the 1600s!! great time, highly recommend [4]; in many ways the archetype for all academic journals), there are a ton of wacky submissions that are just little observations and small experiments, and I think something like that (JOOT, let's say), tuned up for the modern era, would, if nothing else, make science more fun. Here's a great one about reports of "Shining Beef" (literally beef that is glowing, I guess?); enjoy [5]

[1] https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6668985/ [2] https://web.archive.org/web/20220924222624/https://blog.ever... [3] https://www.altmetric.com/ [4] https://www.jstor.org/journal/philtran1665167 [5] https://www.jstor.org/stable/101710 [6] https://en.wikipedia.org/wiki/Impact_factor, see also https://clarivate.com/


Hello, I love you



