Why Kurate?

Not all evidence is equal.

Systematic problems are widely reported across many domains of scientific evidence. A paper can be useful and still sit inside a public record shaped by missing registration data, late results reporting, selective publication, or incomplete accountability. This matters because clinical decisions are justifiable only when the evidence behind them can be inspected, weighed, and challenged. We have collected some of these concerns below, each with supporting references.

Clinical trials, medicine, pharma, and regulation

01 Trial registration and results-reporting compliance remain incomplete despite journal, WHO, and legal reporting frameworks.

De Angelis, C., Drazen, J. M., Frizelle, F. A., et al. (2004). Clinical trial registration: A statement from the International Committee of Medical Journal Editors. New England Journal of Medicine, 351, 1250–1251. doi:10.1056/NEJMe048225
Prayle, A. P., Hurley, M. N., & Smyth, A. R. (2012). Compliance with mandatory reporting of clinical trial results on ClinicalTrials.gov: Cross sectional study. BMJ, 344, d7373. doi:10.1136/bmj.d7373
Anderson, M. L., Chiswell, K., Peterson, E. D., Tasneem, A., Topping, J., & Califf, R. M. (2015). Compliance with results reporting at ClinicalTrials.gov. New England Journal of Medicine, 372, 1031–1039. doi:10.1056/NEJMsa1409364
DeVito, N. J., Bacon, S., & Goldacre, B. (2020). Compliance with legal requirement to report clinical trial results on ClinicalTrials.gov: A cohort study. The Lancet, 395(10221), 361–369. doi:10.1016/S0140-6736(19)33220-9

02 Selective publication, delayed publication, and outcome-reporting bias make the public trial literature an incomplete and biased sample of the evidence.

Hopewell, S., Loudon, K., Clarke, M. J., Oxman, A. D., & Dickersin, K. (2009). Publication bias in clinical trials due to statistical significance or direction of trial results. Cochrane Database of Systematic Reviews, 2009(1), MR000006. doi:10.1002/14651858.MR000006.pub3
Ross, J. S., Tse, T., Zarin, D. A., Xu, H., Zhou, L., & Krumholz, H. M. (2012). Publication of NIH funded trials registered in ClinicalTrials.gov: Cross sectional analysis. BMJ, 344, d7292. doi:10.1136/bmj.d7292
Speich, B., Gryaznov, D., Busse, J. W., Gloy, V. L., Lohner, S., Klatte, K., et al. (2022). Nonregistration, discontinuation, and nonpublication of randomized trials: A repeated metaresearch analysis. PLOS Medicine, 19(4), e1003980. doi:10.1371/journal.pmed.1003980
Turner, E. H., Matthews, A. M., Linardatos, E., Tell, R. A., & Rosenthal, R. (2008). Selective publication of antidepressant trials and its influence on apparent efficacy. New England Journal of Medicine, 358(3), 252–260. doi:10.1056/NEJMsa065779
Dwan, K., Altman, D. G., Arnaiz, J. A., et al. (2008). Systematic review of the empirical evidence of study publication bias and outcome reporting bias. PLOS ONE, 3(8), e3081. doi:10.1371/journal.pone.0003081
Kirkham, J. J., Dwan, K. M., Altman, D. G., Gamble, C., Dodd, S., Smyth, R., & Williamson, P. R. (2010). The impact of outcome reporting bias in randomised controlled trials on a cohort of systematic reviews. BMJ, 340, c365. doi:10.1136/bmj.c365
Dwan, K., Gamble, C., Williamson, P. R., & Kirkham, J. J. (2013). Systematic review of the empirical evidence of study publication bias and outcome reporting bias: An updated review. PLOS ONE, 8(7), e66844. doi:10.1371/journal.pone.0066844

03 Registered or protocol-specified outcomes often differ from published outcomes, creating endpoint-switching and selective-reporting risk.

Chan, A.-W., Hróbjartsson, A., Haahr, M. T., Gøtzsche, P. C., & Altman, D. G. (2004). Empirical evidence for selective reporting of outcomes in randomized trials: Comparison of protocols to published articles. JAMA, 291(20), 2457–2465. doi:10.1001/jama.291.20.2457
Mathieu, S., Boutron, I., Moher, D., Altman, D. G., & Ravaud, P. (2009). Comparison of registered and published primary outcomes in randomized controlled trials. JAMA, 302(9), 977–984. doi:10.1001/jama.2009.1242
Jones, C. W., Keil, L. G., Holland, W. C., Caughey, M. C., & Platts-Mills, T. F. (2015). Comparison of registered and published outcomes in randomized controlled trials: A systematic review. BMC Medicine, 13, 282. doi:10.1186/s12916-015-0520-3
Chen, T., Li, C., Qin, R., Wang, Y., Yu, D., et al. (2019). Comparison of clinical trial changes in primary outcome and reported intervention effect size between trial registration and publication. JAMA Network Open, 2(7), e197242. doi:10.1001/jamanetworkopen.2019.7242
Falk Delgado, A., & Falk Delgado, A. (2017). Outcome switching in randomized controlled oncology trials reporting on surrogate endpoints: A cross-sectional analysis. Scientific Reports, 7, 9206. doi:10.1038/s41598-017-09553-y
Florez, M. A., Abi Jaoude, J., Patel, R. R., et al. (2023). Incidence of primary end point changes among active cancer phase 3 randomized clinical trials. JAMA Network Open, 6(5), e2313819. doi:10.1001/jamanetworkopen.2023.13819

04 Surrogate endpoints can be weak substitutes for patient-important benefit when their relationship to survival, symptoms, or quality of life is uncertain.

Prasad, V., Kim, C., Burotto, M., & Vandross, A. (2015). The strength of association between surrogate end points and survival in oncology: A systematic review of trial-level meta-analyses. JAMA Internal Medicine, 175(8), 1389–1398. doi:10.1001/jamainternmed.2015.2829
Ciani, O., Buyse, M., Drummond, M., Rasi, G., Saad, E. D., & Taylor, R. S. (2016). Use of surrogate end points in healthcare policy: A proposal for adoption of a validation framework. Nature Reviews Drug Discovery, 15(7), 516. doi:10.1038/nrd.2016.81

05 “Spin” in abstracts, conclusions, press releases, and news coverage can make weak or nonsignificant clinical findings sound decisive.

Boutron, I., Dutton, S., Ravaud, P., & Altman, D. G. (2010). Reporting and interpretation of randomized controlled trials with statistically nonsignificant results for primary outcomes. JAMA, 303(20), 2058–2064. doi:10.1001/jama.2010.651
Yavchitz, A., Boutron, I., Bafeta, A., et al. (2012). Misrepresentation of randomized controlled trials in press releases and news coverage: A cohort study. PLOS Medicine, 9(9), e1001308. doi:10.1371/journal.pmed.1001308
Sumner, P., Vivian-Griffiths, S., Boivin, J., et al. (2014). The association between exaggeration in health related science news and academic press releases: Retrospective observational study. BMJ, 349, g7015. doi:10.1136/bmj.g7015
Chiu, K., Grundy, Q., & Bero, L. (2017). “Spin” in published biomedical literature: A methodological systematic review. PLOS Biology, 15(9), e2002173. doi:10.1371/journal.pbio.2002173

06 Industry sponsorship and financial conflicts of interest are associated with more favorable conclusions and outcomes.

Lexchin, J., Bero, L. A., Djulbegovic, B., & Clark, O. (2003). Pharmaceutical industry sponsorship and research outcome and quality: Systematic review. BMJ, 326(7400), 1167–1170. doi:10.1136/bmj.326.7400.1167
Bekelman, J. E., Li, Y., & Gross, C. P. (2003). Scope and impact of financial conflicts of interest in biomedical research: A systematic review. JAMA, 289(4), 454–465. doi:10.1001/jama.289.4.454
Lundh, A., Lexchin, J., Mintzes, B., Schroll, J. B., & Bero, L. (2017). Industry sponsorship and research outcome. Cochrane Database of Systematic Reviews, 2, MR000033. doi:10.1002/14651858.MR000033.pub3

07 Ghostwriting, guest authorship, and abandoned trials can make the published record misrepresent who did the work and what the data showed.

Gøtzsche, P. C., Hróbjartsson, A., Johansen, H. K., Haahr, M. T., Altman, D. G., & Chan, A.-W. (2007). Ghost authorship in industry-initiated randomised trials. PLOS Medicine, 4(1), e19. doi:10.1371/journal.pmed.0040019
Ross, J. S., Hill, K. P., Egilman, D. S., & Krumholz, H. M. (2008). Guest authorship and ghostwriting in publications related to rofecoxib: A case study of industry documents from rofecoxib litigation. JAMA, 299(15), 1800–1812. doi:10.1001/jama.299.15.1800
Doshi, P., Dickersin, K., Healy, D., Vedula, S. S., & Jefferson, T. (2013). Restoring invisible and abandoned trials: A call for people to publish the findings. BMJ, 346, f2865. doi:10.1136/bmj.f2865
Le Noury, J. C., Nardo, J. M., Healy, D., Jureidini, J., Raven, M., Tufanaru, C., & Abi-Jaoude, E. (2015). Restoring Study 329: Efficacy and harms of paroxetine and imipramine in treatment of major depression in adolescence. BMJ, 351, h4320. doi:10.1136/bmj.h4320

08 Harms and adverse events are often less completely reported than efficacy outcomes.

Ioannidis, J. P. A., Evans, S. J. W., Gøtzsche, P. C., et al. (2004). Better reporting of harms in randomized trials: An extension of the CONSORT statement. Annals of Internal Medicine, 141(10), 781–788. doi:10.7326/0003-4819-141-10-200411160-00009
Pitrou, I., Boutron, I., Ahmad, N., & Ravaud, P. (2009). Reporting of safety results in published reports of randomized controlled trials. Archives of Internal Medicine, 169(19), 1756–1761. doi:10.1001/archinternmed.2009.306
Wieseler, B., Wolfram, N., McGauran, N., et al. (2013). Completeness of reporting of patient-relevant clinical trial outcomes: Comparison of unpublished clinical study reports with publicly available data. PLOS Medicine, 10(10), e1001526. doi:10.1371/journal.pmed.1001526
Golder, S., Loke, Y. K., Wright, K., & Norman, G. (2016). Reporting of adverse events in published and unpublished studies of health care interventions: A systematic review. PLOS Medicine, 13(9), e1002127. doi:10.1371/journal.pmed.1002127

09 Trial design features such as inadequate allocation concealment, lack of blinding, and small samples can exaggerate treatment effects.

Schulz, K. F., Chalmers, I., Hayes, R. J., & Altman, D. G. (1995). Empirical evidence of bias: Dimensions of methodological quality associated with estimates of treatment effects in controlled trials. JAMA, 273(5), 408–412. doi:10.1001/jama.1995.03520290060030
Savović, J., Jones, H. E., Altman, D. G., et al. (2012). Influence of reported study design characteristics on intervention effect estimates from randomized, controlled trials. Annals of Internal Medicine, 157(6), 429–438. doi:10.7326/0003-4819-157-6-201209180-00537
Dechartres, A., Trinquart, L., Boutron, I., & Ravaud, P. (2013). Influence of trial sample size on treatment effect estimates: Meta-epidemiological study. BMJ, 346, f2304. doi:10.1136/bmj.f2304

10 Trials stopped early for benefit can overestimate treatment effects, especially when stopped after few outcome events.

Montori, V. M., Devereaux, P. J., Adhikari, N. K. J., et al. (2005). Randomized trials stopped early for benefit: A systematic review. JAMA, 294(17), 2203–2209. doi:10.1001/jama.294.17.2203
Bassler, D., Briel, M., Montori, V. M., et al. (2010). Stopping randomized trials early for benefit and estimation of treatment effects: Systematic review and meta-regression analysis. JAMA, 303(12), 1180–1187. doi:10.1001/jama.2010.310

11 Subgroup claims, baseline comparisons, and missing outcome data can generate misleading clinical interpretations.

Assmann, S. F., Pocock, S. J., Enos, L. E., & Kasten, L. E. (2000). Subgroup analysis and other misuses of baseline data in clinical trials. The Lancet, 355(9209), 1064–1069. doi:10.1016/S0140-6736(00)02039-0
Sun, X., Briel, M., Walter, S. D., & Guyatt, G. H. (2012). Credibility of claims of subgroup effects in randomised controlled trials: Systematic review. BMJ, 344, e1553. doi:10.1136/bmj.e1553
Wood, A. M., White, I. R., & Thompson, S. G. (2004). Are missing outcome data adequately handled? A review of published randomized controlled trials in major medical journals. Clinical Trials, 1(4), 368–376. doi:10.1191/1740774504cn032oa
Bell, M. L., Kenward, M. G., Fairclough, D. L., & Horton, N. J. (2013). Differential dropout and bias in randomised controlled trials: When it matters and when it may not. BMJ, 346, e8668. doi:10.1136/bmj.e8668

12 Strict eligibility criteria can make trial participants unrepresentative of the patients who later receive the intervention.

Rothwell, P. M. (2005). External validity of randomised controlled trials: “To whom do the results of this trial apply?” The Lancet, 365(9453), 82–93. doi:10.1016/S0140-6736(04)17670-8
Van Spall, H. G. C., Toren, A., Kiss, A., & Fowler, R. A. (2007). Eligibility criteria of randomized controlled trials published in high-impact general medical journals: A systematic review. JAMA, 297(11), 1233–1240. doi:10.1001/jama.297.11.1233
Kennedy-Martin, T., Curtis, S., Faries, D., Robinson, S., & Johnston, J. (2015). A literature review on the representativeness of randomized controlled trial samples and implications for the external validity of trial results. Trials, 16, 495. doi:10.1186/s13063-015-1023-4

Statistics, inference, and methods

13 P-values and “statistical significance” are widely misinterpreted as evidence strength, truth, or practical importance.

Gelman, A., & Stern, H. (2006). The difference between “significant” and “not significant” is not itself statistically significant. The American Statistician, 60(4), 328–331.
Gelman, A., & Loken, E. (2013). The garden of forking paths: Why multiple comparisons can be a problem, even when there is no “fishing expedition” or “p-hacking” and the research hypothesis was posited ahead of time.
Wasserstein, R. L., & Lazar, N. A. (2016). The ASA’s statement on p-values: Context, process, and purpose. The American Statistician, 70(2), 129–133. doi:10.1080/00031305.2016.1154108
Greenland, S., Senn, S. J., Rothman, K. J., et al. (2016). Statistical tests, P values, confidence intervals, and power: A guide to misinterpretations. European Journal of Epidemiology, 31, 337–350. doi:10.1007/s10654-016-0149-3
Wasserstein, R. L., Schirm, A. L., & Lazar, N. A. (2019). Moving to a world beyond “p < 0.05.” The American Statistician, 73(sup1), 1–19. doi:10.1080/00031305.2019.1583913
Amrhein, V., Greenland, S., & McShane, B. (2019). Scientists rise up against statistical significance. Nature, 567(7748), 305–307. doi:10.1038/d41586-019-00857-9
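
One common misreading is to treat a significant result and a nonsignificant result as if they disagree. The sketch below works through illustrative numbers of the kind discussed by Gelman and Stern (2006): two independent estimates of 25 and 10, each with an assumed standard error of 10. The function names and the normal approximation are assumptions made for this illustration.

```python
import math


def two_sided_p(estimate: float, std_error: float) -> float:
    """Two-sided p-value for a normally distributed estimate tested against zero."""
    z = abs(estimate / std_error)
    return math.erfc(z / math.sqrt(2.0))


def difference_p(est_a: float, se_a: float, est_b: float, se_b: float) -> float:
    """Two-sided p-value for the difference between two independent estimates."""
    diff = est_a - est_b
    # Variances of independent estimates add, so standard errors combine in quadrature.
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return two_sided_p(diff, se_diff)


if __name__ == "__main__":
    print(f"study A alone: p = {two_sided_p(25, 10):.3f}")
    print(f"study B alone: p = {two_sided_p(10, 10):.3f}")
    print(f"A minus B:     p = {difference_p(25, 10, 10, 10):.3f}")
```

Under these assumptions study A clears the 0.05 threshold and study B does not, yet the difference between them is itself far from significant, so the pair gives little evidence that the two effects differ.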

14 Low statistical power increases false negatives, inflates significant effect sizes, and lowers the positive predictive value of findings.

Ioannidis, J. P. A. (2005). Why most published research findings are false. PLOS Medicine, 2(8), e124. doi:10.1371/journal.pmed.0020124
Button, K. S., Ioannidis, J. P. A., Mokrysz, C., Nosek, B. A., Flint, J., Robinson, E. S. J., & Munafò, M. R. (2013). Power failure: Why small sample size undermines the reliability of neuroscience. Nature Reviews Neuroscience, 14, 365–376. doi:10.1038/nrn3475
Szucs, D., & Ioannidis, J. P. A. (2017). Empirical assessment of published effect sizes and power in the recent cognitive neuroscience and psychology literature. PLOS Biology, 15(3), e2000797. doi:10.1371/journal.pbio.2000797
Dumas-Mallet, E., Button, K. S., Boraud, T., Gonon, F., & Munafò, M. R. (2017). Low statistical power in biomedical science: A review of three human research domains. Royal Society Open Science, 4(2), 160254. doi:10.1098/rsos.160254
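
The link between power and positive predictive value (PPV) can be made concrete with a short calculation. The sketch below follows the basic framing in Ioannidis (2005) and ignores bias. The function name `positive_predictive_value` and the example prior of 0.1 are illustrative assumptions, not figures taken from the cited papers.

```python
def positive_predictive_value(power: float, alpha: float, prior: float) -> float:
    """Probability that a statistically significant finding reflects a true effect.

    power: probability of detecting a true effect (1 - beta)
    alpha: false-positive rate of the test
    prior: assumed share of tested hypotheses that are true (illustrative)
    """
    true_positives = power * prior
    false_positives = alpha * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    for power in (0.8, 0.5, 0.2):
        ppv = positive_predictive_value(power, alpha=0.05, prior=0.1)
        print(f"power={power:.1f} -> PPV={ppv:.2f}")
```

With these assumed inputs, lowering power from 0.8 to 0.2 cuts the PPV from about 0.64 to about 0.31, even though the significance threshold never changes.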

15 Researcher degrees of freedom, p-hacking, and HARKing can turn flexible analysis pipelines into false-positive machines.

Kerr, N. L. (1998). HARKing: Hypothesizing after the results are known. Personality and Social Psychology Review, 2(3), 196–217. doi:10.1207/s15327957pspr0203_4
Simmons, J. P., Nelson, L. D., & Simonsohn, U. (2011). False-positive psychology: Undisclosed flexibility in data collection and analysis allows presenting anything as significant. Psychological Science, 22(11), 1359–1366. doi:10.1177/0956797611417632
John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532. doi:10.1177/0956797611430953
Head, M. L., Holman, L., Lanfear, R., Kahn, A. T., & Jennions, M. D. (2015). The extent and consequences of p-hacking in science. PLOS Biology, 13(3), e1002106. doi:10.1371/journal.pbio.1002106
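
A rough way to see how analytic flexibility inflates false positives is to count how often at least one of several analyses crosses the threshold when no effect exists. The sketch below is a simplified model, not a reconstruction of any cited study: it assumes the analyses are independent, models each null p-value as a uniform draw, and uses hypothetical function names.

```python
import random


def analytic_false_positive_rate(n_tests: int, alpha: float = 0.05) -> float:
    """Chance that at least one of n independent null tests reaches p < alpha."""
    return 1.0 - (1.0 - alpha) ** n_tests


def simulated_false_positive_rate(n_tests: int, alpha: float = 0.05,
                                  n_studies: int = 20000, seed: int = 0) -> float:
    """Simulate studies with no true effect that report their smallest p-value.

    Under the null hypothesis a p-value is uniform on [0, 1], so each
    analysis is modelled as one uniform draw (independence is an assumption).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_studies):
        smallest_p = min(rng.random() for _ in range(n_tests))
        if smallest_p < alpha:
            hits += 1
    return hits / n_studies


if __name__ == "__main__":
    for n_tests in (1, 5, 20):
        exact = analytic_false_positive_rate(n_tests)
        simulated = simulated_false_positive_rate(n_tests)
        print(f"{n_tests:2d} analyses: exact={exact:.3f} simulated={simulated:.3f}")
```

In this model, five independent analyses raise the nominal 5% error rate to roughly 23%. Real analyses of the same dataset are usually correlated, so the exact figure will differ, but the direction of the effect is the same.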

16 Circular analysis and “double dipping” allow data used to select a signal to be reused to test that same signal.

Kriegeskorte, N., Simmons, W. K., Bellgowan, P. S. F., & Baker, C. I. (2009). Circular analysis in systems neuroscience: The dangers of double dipping. Nature Neuroscience, 12(5), 535–540. doi:10.1038/nn.2303

17 Predictive, explanatory, and causal goals are often conflated, producing claims that outrun what the design and model can support.

Shmueli, G. (2010). To explain or to predict? Statistical Science, 25(3), 289–310. doi:10.1214/10-STS330
Rohrer, J. M. (2018). Thinking clearly about correlations and causation: Graphical causal models for observational data. Advances in Methods and Practices in Psychological Science, 1(1), 27–42. doi:10.1177/2515245917745629
Grosz, M. P., Rohrer, J. M., & Thoemmes, F. (2020). The taboo against explicit causal inference in nonexperimental psychology. Perspectives on Psychological Science, 15(5), 1243–1255. doi:10.1177/1745691620921521
Vowels, M. J. (2023). Misspecification and unreliable interpretations in psychology and social science. Psychological Methods, 28(3), 507–526.
Vowels, M. J. (2024). Trying to outrun causality with machine learning: Limitations of model explainability techniques for exploratory research. Psychological Methods. Advance online publication. doi:10.1037/met0000699

18 Weak construct validity and questionable measurement practices can make precise-looking estimates answer the wrong question.

Flake, J. K., & Fried, E. I. (2020). Measurement schmeasurement: Questionable measurement practices and how to avoid them. Advances in Methods and Practices in Psychological Science, 3(4), 456–465. doi:10.1177/2515245920952393
Loken, E., & Gelman, A. (2017). Measurement error and the replication crisis. Science, 355(6325), 584–585. doi:10.1126/science.aal3618

19 Statistical reporting errors are common enough to change some reported inferences.

García-Berthou, E., & Alcaraz, C. (2004). Incongruence between test statistics and P values in medical papers. BMC Medical Research Methodology, 4, 13. doi:10.1186/1471-2288-4-13
Nuijten, M. B., Hartgerink, C. H. J., van Assen, M. A. L. M., Epskamp, S., & Wicherts, J. M. (2016). The prevalence of statistical reporting errors in psychology, 1985–2013. Behavior Research Methods, 48(4), 1205–1226. doi:10.3758/s13428-015-0664-2
Veldkamp, C. L. S., Nuijten, M. B., Dominguez-Alvarez, L., van Assen, M. A. L. M., & Wicherts, J. M. (2014). Statistical reporting errors and collaboration on statistical analyses in psychological science. PLOS ONE, 9(12), e114876. doi:10.1371/journal.pone.0114876

Citation, evidence synthesis, and the published record

20 Citation bias and citation distortion can amplify positive, fashionable, or misleading claims while burying negative evidence.

Greenberg, S. A. (2009). How citation distortions create unfounded authority: Analysis of a citation network. BMJ, 339, b2680. doi:10.1136/bmj.b2680
Jannot, A.-S., Agoritsas, T., Gayet-Ageron, A., & Perneger, T. V. (2013). Citation bias favoring statistically significant studies was present in medical research. Journal of Clinical Epidemiology, 66(3), 296–301. doi:10.1016/j.jclinepi.2012.09.015
Duyx, B., Urlings, M. J. E., Swaen, G. M. H., Bouter, L. M., & Zeegers, M. P. (2017). Scientific citations favor positive results: A systematic review and meta-analysis. Journal of Clinical Epidemiology, 88, 92–101. doi:10.1016/j.jclinepi.2017.06.002

21 Coercive citation and continued citation of retracted work degrade the trustworthiness of citation metrics and literature trails.

Budd, J. M., Sievert, M., & Schultz, T. R. (1998). Phenomena of retraction: Reasons for retraction and citations to the publications. JAMA, 280(3), 296–297. doi:10.1001/jama.280.3.296
Wilhite, A. W., & Fong, E. A. (2012). Coercive citation in academic publishing. Science, 335(6068), 542–543. doi:10.1126/science.1212540
Bar-Ilan, J., & Halevi, G. (2017). Post retraction citations in context: A case study. Scientometrics, 113, 547–565. doi:10.1007/s11192-017-2242-0

22 Research waste arises when studies are poorly designed, unreliably reported, unnecessary, or synthesized in redundant and conflicted ways.

Chalmers, I., & Glasziou, P. (2009). Avoidable waste in the production and reporting of research evidence. The Lancet, 374(9683), 86–89. doi:10.1016/S0140-6736(09)60329-9
Moher, D., Tetzlaff, J., Tricco, A. C., Sampson, M., & Altman, D. G. (2007). Epidemiology and reporting characteristics of systematic reviews. PLOS Medicine, 4(3), e78. doi:10.1371/journal.pmed.0040078
Ioannidis, J. P. A. (2016). The mass production of redundant, misleading, and conflicted systematic reviews and meta-analyses. The Milbank Quarterly, 94(3), 485–514. doi:10.1111/1468-0009.12210
Page, M. J., Shamseer, L., Altman, D. G., et al. (2016). Epidemiology and reporting characteristics of systematic reviews of biomedical research: A cross-sectional study. PLOS Medicine, 13(5), e1002028. doi:10.1371/journal.pmed.1002028

23 Reporting guidelines exist because incomplete and nonstandard reporting makes studies difficult to interpret, appraise, and reproduce.

Schulz, K. F., Altman, D. G., & Moher, D. (2010). CONSORT 2010 statement: Updated guidelines for reporting parallel group randomised trials. PLOS Medicine, 7(3), e1000251. doi:10.1371/journal.pmed.1000251
Chan, A.-W., Tetzlaff, J. M., Gøtzsche, P. C., et al. (2013). SPIRIT 2013 explanation and elaboration: Guidance for protocols of clinical trials. Annals of Internal Medicine, 158(3), 200–207. doi:10.7326/0003-4819-158-3-201302050-00583
Page, M. J., McKenzie, J. E., Bossuyt, P. M., et al. (2021). The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ, 372, n71. doi:10.1136/bmj.n71
Percie du Sert, N., Hurst, V., Ahluwalia, A., et al. (2020). The ARRIVE guidelines 2.0: Updated guidelines for reporting animal research. PLOS Biology, 18(7), e3000410. doi:10.1371/journal.pbio.3000410

Reproducibility and preclinical research

24 Data, code, and materials often remain unavailable, making verification and reuse difficult.

Wicherts, J. M., Borsboom, D., Kats, J., & Molenaar, D. (2006). The poor availability of psychological research data for reanalysis. American Psychologist, 61(7), 726–728. doi:10.1037/0003-066X.61.7.726
Vines, T. H., Albert, A. Y. K., Andrew, R. L., et al. (2014). The availability of research data declines rapidly with article age. Current Biology, 24(1), 94–97. doi:10.1016/j.cub.2013.11.014
Iqbal, S. A., Wallach, J. D., Khoury, M. J., Schully, S. D., & Ioannidis, J. P. A. (2016). Reproducible research practices and transparency across the biomedical literature. PLOS Biology, 14(1), e1002333. doi:10.1371/journal.pbio.1002333
Hardwicke, T. E., Mathur, M. B., MacDonald, K., et al. (2018). Data availability, reusability, and analytic reproducibility: Evaluating the impact of a mandatory open data policy at the journal Cognition. Royal Society Open Science, 5(8), 180448. doi:10.1098/rsos.180448

25 Large-scale replication efforts show that many published findings do not reproduce with similar effect sizes or statistical significance.

Open Science Collaboration. (2015). Estimating the reproducibility of psychological science. Science, 349(6251), aac4716. doi:10.1126/science.aac4716
Camerer, C. F., Dreber, A., Holzmeister, F., et al. (2018). Evaluating replicability of social science experiments in Nature and Science between 2010 and 2015. Nature Human Behaviour, 2, 637–644. doi:10.1038/s41562-018-0399-z

26 Preclinical biomedical findings can fail to replicate, wasting resources and weakening translation into human trials.

Prinz, F., Schlange, T., & Asadullah, K. (2011). Believe it or not: How much can we rely on published data on potential drug targets? Nature Reviews Drug Discovery, 10(9), 712. doi:10.1038/nrd3439-c1
Begley, C. G., & Ellis, L. M. (2012). Raise standards for preclinical cancer research. Nature, 483(7391), 531–533. doi:10.1038/483531a
Freedman, L. P., Cockburn, I. M., & Simcoe, T. S. (2015). The economics of reproducibility in preclinical research. PLOS Biology, 13(6), e1002165. doi:10.1371/journal.pbio.1002165
Errington, T. M., Denis, A., Perfito, N., et al. (2021). Reproducibility in cancer biology: Challenges for assessing replicability in preclinical cancer biology. eLife, 10, e67995. doi:10.7554/eLife.67995

27 Animal studies often suffer from publication bias, weak internal validity, and poor reporting of randomization and blinding.

Bebarta, V., Luyten, D., & Heard, K. (2003). Emergency medicine animal research: Does use of randomization and blinding affect the results? Academic Emergency Medicine, 10(6), 684–687. doi:10.1111/j.1553-2712.2003.tb00056.x
Macleod, M. R., van der Worp, H. B., Sena, E. S., Howells, D. W., Dirnagl, U., & Donnan, G. A. (2008). Evidence for the efficacy of NXY-059 in experimental focal cerebral ischaemia is confounded by study quality. Stroke, 39(10), 2824–2829. doi:10.1161/STROKEAHA.108.515957
Sena, E. S., van der Worp, H. B., Bath, P. M. W., Howells, D. W., & Macleod, M. R. (2010). Publication bias in reports of animal stroke studies leads to major overstatement of efficacy. PLOS Biology, 8(3), e1000344. doi:10.1371/journal.pbio.1000344

Incentives, peer review, integrity, and metrics

28 Scientific incentives can select for publishable, novel, and significant results rather than reliable cumulative knowledge.

Munafò, M. R., Nosek, B. A., Bishop, D. V. M., et al. (2017). A manifesto for reproducible science. Nature Human Behaviour, 1, 0021.
Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7(6), 615–631. doi:10.1177/1745691612459058
Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384. doi:10.1098/rsos.160384

29 Peer review is useful but weakly evidenced, inconsistent, and vulnerable to bias and error.

Jefferson, T., Alderson, P., Wager, E., & Davidoff, F. (2002). Effects of editorial peer review: A systematic review. JAMA, 287(21), 2784–2786. doi:10.1001/jama.287.21.2784
Jefferson, T., Wager, E., & Davidoff, F. (2002). Measuring the quality of editorial peer review. JAMA, 287(21), 2786–2790. doi:10.1001/jama.287.21.2786
Smith, R. (2006). Peer review: A flawed process at the heart of science and journals. Journal of the Royal Society of Medicine, 99(4), 178–182. doi:10.1258/jrsm.99.4.178
Bornmann, L. (2011). Scientific peer review. Annual Review of Information Science and Technology, 45(1), 197–245. doi:10.1002/aris.2011.1440450112

30 Fabrication, falsification, image manipulation, and other misconduct contribute materially to retractions and unreliable literatures.

Fanelli, D. (2009). How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLOS ONE, 4(5), e5738. doi:10.1371/journal.pone.0005738
Fang, F. C., Steen, R. G., & Casadevall, A. (2012). Misconduct accounts for the majority of retracted scientific publications. Proceedings of the National Academy of Sciences, 109(42), 17028–17033. doi:10.1073/pnas.1212247109
Bik, E. M., Casadevall, A., & Fang, F. C. (2016). The prevalence of inappropriate image duplication in biomedical research publications. mBio, 7(3), e00809-16. doi:10.1128/mBio.00809-16

31 Predatory journals exploit author incentives and can contaminate indexing, citation, and evidence-synthesis systems.

Shen, C., & Björk, B.-C. (2015). “Predatory” open access: A longitudinal study of article volumes and market characteristics. BMC Medicine, 13, 230. doi:10.1186/s12916-015-0469-2
Grudniewicz, A., Moher, D., Cobey, K. D., et al. (2019). Predatory journals: No definition, no defence. Nature, 576(7786), 210–212. doi:10.1038/d41586-019-03759-y
Rice, D. B., Skidmore, B., Cobey, K. D., & Moher, D. (2021). Dealing with predatory journal articles captured in systematic reviews. Systematic Reviews, 10, 175. doi:10.1186/s13643-021-01733-2

32 Journal impact factors and simple citation metrics are poor proxies for research quality and can distort researcher behavior.

Seglen, P. O. (1997). Why the impact factor of journals should not be used for evaluating research. BMJ, 314(7079), 497–502. doi:10.1136/bmj.314.7079.497
Hicks, D., Wouters, P., Waltman, L., de Rijcke, S., & Rafols, I. (2015). Bibliometrics: The Leiden Manifesto for research metrics. Nature, 520(7548), 429–431. doi:10.1038/520429a