
  • 23 May 2018 11:26 | Michael Seadle (Administrator)

    Testing for Reliability

    The principle that scientists (and scholars generally) can build on past results means that past results ought to be replicable. Brownill et al. (2016) write:

    This replication by different labs and different researchers enables scientific consensus to emerge because the scientific community becomes more confident that subsequent research examining the same question will not refute the findings.

    And McMillan (2017) writes in his editorial “Replication Studies”:

    Replication studies are important as they essentially perform a check on work in order to verify the previous findings and to make sure, for example, they are not specific to one set of data or circumstance.

    Increasingly replication is also seen as a way to test for data falsification, on the presumption that unreliable results will not be replicable; but as with most forms of testing, it offers no simple answer.

    How does Replication Work?

    The ability to replicate results means that those doing the replication need exact information about how the original experiment was carried out. In physics and chemistry this means precise descriptions in lab books and in articles, and the same machines using the same calibration. In the social sciences, it can be much harder to reproduce the exact conditions, since they depend on human reactions and a variable environment. One well-known case comes from a study by Cornell social psychologist Daryl Bem, who did a word recognition test:

    “[Bem] published his findings in the Journal of Personality and Social Psychology (JPSP) along with eight other experiments providing evidence for what he refers to as “psi”, or psychic effects. There is, needless to say, no shortage of scientists sceptical about his claims. Three research teams independently tried to replicate the effect Bem had reported and, when they could not, they faced serious obstacles to publishing their results.” (Yong, 2012)

    The fact that the other research teams could not replicate the experiment successfully did not suggest to anyone that the data were fake (presumably the students could attest to that), but the failure did cast doubt on the apparent “psychic effects”. Since an exact replication using those Cornell students in that class with all the same social conditions was not possible, the question arises: how close to the original must a replication be to validate an original experiment?

    Dennis and Valacich (2014) talk about “three fundamental categories” of replication:

    • Exact Replications: These articles are exact copies of the original article in terms of method and context. All measures, treatments statistical analyses, etc. are identical to those of the original study…
    • Methodological Replications: These articles use exactly the same methods as the original study (i.e., measures, treatments, statistics etc.) but are conducted in a different context. …
    • Conceptual Replications: These articles test exactly the same research questions or hypotheses, but use different measures, treatments, analyses and/or context….

    Since the Cornell students were not available for the replications, the replications presumably come under the “methodological” category, or perhaps even the “conceptual”. Dennis and Valacich (2014) comment: “Conceptual replications are the strongest form of replication because they ensure that there is nothing idiosyncratic about the wording of items, the execution of treatments, or the culture of the original context that would limit the research conclusions.”

    In any case these replication types represent a significant contribution to knowledge by confirming or throwing skepticism on the earlier results. Why then did the research teams have trouble publishing their results?

    Publishing Replications

    Most journals do not encourage replications. A study that strikes readers as new and exciting and generates attention is a plus, whereas a study that appears to cover old ground, even if it has scholarly value, is less likely to get through the peer review process. Lucy Goodchild van Hilten (2015) writes:

    Publication bias affects the body of scientific knowledge in different ways, including skewing it towards statistically significant or “positive” results. This means that the results of thousands of experiments that fail to confirm the efficacy of a treatment or vaccine – including the outcomes of clinical trials – fail to see the light of day.

    This may be changing and the degree to which it is true depends in part on the academic discipline. David McMillan (2017) writes:

    Cogent Economics & Finance recognises the importance of replication studies. As an indicator of this importance, we now welcome research papers that focus on replication and whose ultimate acceptance depends on the accuracy and thoroughness of the work rather than seeking a ‘new’ result.

    If other journals follow this trend, there could be significantly more testing of scholarly results. Nonetheless a problem remains. Except for the design time, replicating results costs almost as much as doing the original experiment, and if the results are in fact exactly the same, the replication is unlikely to be published. Some fields solve the problem with a repeat-and-extend approach where replication is tied to new features that explicitly build on the replicated results. Much depends on the culture of the discipline.

    For all of its problems, replication remains one of the most effective and reliable tools for uncovering flaws and fake data, and should be used more widely.


    Bem (2015) did a further “meta-analysis of 90 [replication] experiments from 33 laboratories in 14 countries …” which he claims supports his hypothesis. He published this meta-analysis in an open-access journal for the life sciences that charges $1000 for an article of this length, and Bem explicitly declared that he had no grant support. If nothing else, this is a sign of how difficult it is to continue the discourse in standard academic venues.


    Ms. Melanie Rügenhagen (MA) suggested the topic and assisted with the research. Prof. Dr. Joan Luft provided research content.


    Bem D, Tressoldi PE, Rabeyron T and Duggan M. 2015. “Feeling the Future: A Meta-Analysis of 90 Experiments on the Anomalous Anticipation of Random Future Events.” F1000Research 4:1188. Available online.

    Brownill, Sue, Dennis, Alan R., Binny, Samuel, Tan, Barney, Valacich, Joseph and Whitley, Edgar A. 2016. “Replication Research: Opportunities, Experiences and Challenges.” In Thirty Seventh International Conference on Information Systems. Dublin, Ireland. Available online.

    Dennis, Alan R, and Joseph S Valacich. 2014. “A Replication Manifesto.” AIS Transactions on Replication Research 1 (1): 1–5.

    Goodchild van Hilten, Lucy. 2015. “Why It’s Time to Publish Research ‘Failures.’” Elsevier Connect. Available online.

    McMillan, David. 2017. “Replication Studies.” Cogent Economics and Finance, 2017. Available online.

    Yong, Ed. 2012. “Replication Studies: Bad Copy.” Nature 485 (7398): 298–300. Available online.

  • 16 May 2018 11:24 | Michael Seadle (Administrator)

    Burden of Proof

    Jochen Zenthöfer wrote an article in the Frankfurter Allgemeine newspaper on 18 April 2018 in which he expresses concern about the number of plagiarism cases under consideration at German universities. As he notes, the cases come largely from the VroniPlag Wiki. His article is the focus of this column.

    There is an assumption in most western legal systems that a person is innocent until proven guilty, but, as Zenthöfer (2018) notes, this principle derives from criminal law:

    “Als Grund führt die HU das Prinzip der Unschuldsvermutung an, das freilich nur im – hier nicht einschlägigen – Strafrecht gilt.” [The reason given by the HU is the principle of the presumption of innocence, which only applies in criminal law, and is not relevant here. – my translation]

    The author seems to imply that the presumption of innocence ought to be ignored in a process that could destroy a career and strip a person of the means of livelihood. When the accuser is an official body, such as a commission of a university, that has done a careful analysis and presents a well-founded conclusion, it may be reasonable to put the burden of proof of innocence on the accused, but when a self-constituted group such as the VroniPlag Wiki makes an accusation, the universities involved have an obligation to investigate thoroughly and carefully to see whether the accusation is legitimate.


    In conducting an investigation into plagiarism, appropriate standards need to be considered. In talking about “Policies and Initiatives Aimed at Addressing Research Misconduct in High-Income Countries,” Resnik and Master (2013) refer to the COPE guidelines (2018), which define plagiarism as occurring:

    “When somebody presents the work of others (data, words or theories) as if they were his/her own and without proper acknowledgment.” (COPE, 2018)

    While this definition is comprehensive, it gives no explicit measure to determine what actually constitutes plagiarism. Mere text overlap is insufficient. A short factual statement, such as “Berlin is the capital of Germany”, gets thousands of hits in a Google search, and could not reasonably be called plagiarism. The phrase in the guidelines about “proper acknowledgment” is equally unspecific, not merely because citation styles vary, but because expectations vary about exactly how and where in the text to put the reference.

    Paraphrasing is not the same as plagiarism. Lee (2015) explains rules for paraphrasing in the American Psychological Association Style Blog:

    “A paraphrase restates someone else’s words in a new way. For example, you might put a sentence into your own words, or you might summarize what another author or set of authors found. When you include a paraphrase in a paper, you are required to include only the author and date in the citation.”

    This definition leaves latitude for understanding what “in your own words” means, which does not necessarily imply avoiding all the original words and phrases. When paraphrasing, it is almost impossible to avoid content-carrying words or phrases that have a particular meaning. Nonetheless there are plagiarism hunters who see plagiarism in every overlap.


    When evaluating a work for plagiarism, it is important to have rational metrics. Copying a complete paragraph word for word (without quotes) is plagiarism. Copying a complete long sentence word for word suggests plagiarism. A case in which a majority of the words in a paragraph or sentence match words in the same order in another text could be deliberate plagiarism that the author tried to obscure, or it might be a case of good verbal memory or it might be that there was a logic to the order and the word choice. Absolute uniqueness of language is not necessarily the hallmark of good scholarship.

    Companies like iThenticate are very careful to talk about the percentage of plagiarism only in terms of the number of words in the whole work. VroniPlag counts plagiarism in terms of how many pages have hits (“Anzahl Seiten mit Funden”), which means that even a page with a mere nine matching words (a set of four words and a set of five words) adds to the page count (see VroniPlag). This exaggerates the impression of the problem to a point that could be considered misrepresentation in any scholarly work.
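    The difference between the two ways of counting can be illustrated with a minimal sketch. The page sizes and hit counts below are invented for illustration only, and the names `Page`, `word_share` and `page_share` are hypothetical; this is not how iThenticate or VroniPlag actually compute their figures.

```python
from dataclasses import dataclass


@dataclass
class Page:
    total_words: int    # all words on the page
    matched_words: int  # words that overlap with some other text


def word_share(pages):
    """Share of matched words in the whole work (word-based metric)."""
    total = sum(p.total_words for p in pages)
    matched = sum(p.matched_words for p in pages)
    return matched / total if total else 0.0


def page_share(pages):
    """Share of pages with at least one hit (page-based metric)."""
    if not pages:
        return 0.0
    flagged = sum(1 for p in pages if p.matched_words > 0)
    return flagged / len(pages)


# Hypothetical work: 10 pages of 300 words each; five of the pages
# contain nine matched words (a set of four and a set of five).
pages = [Page(300, 9) if i < 5 else Page(300, 0) for i in range(10)]

print(f"word-based: {word_share(pages):.1%}")  # 45 of 3000 words -> 1.5%
print(f"page-based: {page_share(pages):.1%}")  # 5 of 10 pages   -> 50.0%
```

    Under these assumed numbers the same overlap appears as 1.5% by words and as 50% by pages, which is the kind of gap the paragraph above describes.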

    In my book on “Quantifying Research Integrity” (December 2016), I suggest a grey-scale measure for plagiarism cases where the number of contiguous words is measured in a particular unit, such as a paragraph or sentence. One can disagree with the exact numbers, but using transparent metrics as a standard matters. Exactly where copying occurs matters too. It is less surprising to have word overlap in a literature review than in conclusions, and facts and standard phrases in an academic discipline need to be deducted.
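    One way to make the idea of counting contiguous words concrete is the sketch below. It finds the longest run of identical consecutive words that a sentence shares with a source and maps the run length onto a coarse grey scale. The thresholds (5, 10 and 20 words) and the function names are assumptions made up for this illustration; they are not the numbers from the book, and the sketch does not deduct facts or standard phrases.

```python
import re
from difflib import SequenceMatcher


def words(text):
    """Lower-cased word tokens, ignoring punctuation."""
    return re.findall(r"\w+", text.lower())


def longest_shared_run(candidate, source):
    """Length (in words) of the longest contiguous word run found in both texts."""
    a, b = words(candidate), words(source)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return match.size


def grey_level(run_length, thresholds=(5, 10, 20)):
    """Hypothetical grey scale: 0 = no concern ... 3 = strong concern."""
    return sum(run_length >= t for t in thresholds)


source = "Berlin is the capital of Germany and its largest city."
candidate = "As everyone knows, Berlin is the capital of Germany."

run = longest_shared_run(candidate, source)
print(run, grey_level(run))  # 6 shared words in a row -> level 1
```

    The example deliberately uses a plain fact: a six-word run registers on the scale, which shows why facts and standard phrases would have to be deducted before any such number is interpreted.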

    Ultimately decisions about plagiarism depend on the distinction between negligence and gross negligence. The former implies sloppiness, while the latter represents actual misconduct. Hunting for plagiarism may have a game-like quality for those who spend their free time doing it that pushes the volunteers toward judgments that increase the number of hits without regard to the distinction between negligence and gross negligence.


    Certainly plagiarism is an ethical and copyright problem, but its long term actual harm to modern scholarship may be modest. The real harm comes to the personal integrity of the person doing the plagiarism. Integrity matters and certainly instances of plagiarism need to be caught, but the current focus on hunting plagiarism may actually be a distraction from the more important task of identifying problems with falsified or manipulated data. False data undermines the foundations of scholarship (especially the natural sciences) in ways that plagiarism does not.

    The popularity of plagiarism hunting grows in part from tools that make it easy to compare texts word for word. Some British universities distinguish between actual plagiarism and the appearance of plagiarism by requiring students to submit their own papers to a plagiarism checker like Turnitin. King’s College London even allows students to submit their works multiple times (see “Submitting Assessments Online”). While this is a measure to prevent plagiarism, it serves also as a recognition that inadvertent copying is common and does not necessarily involve fraudulent intent.


    Ms. Melanie Rügenhagen (MA) assisted with the research.


    COPE (Committee on Publication Ethics). 2018. “Plagiarism.” 2018. Available online.

    Lee, Chelsea. 2015. “When and How to Include Page Numbers in APA Style Citations.” American Psychological Association: APA Style Blog. 2015. Available online.

    Resnik, David B., and Zubin Master. 2013. “Policies and Initiatives Aimed at Addressing Research Misconduct in High-Income Countries.” PLoS Medicine 10 (3). Public Library of Science: e1001406. Available online.

    Seadle, Michael. 2016. Quantifying Research Integrity. Morgan Claypool: Synthesis Lectures on Information Concepts, Retrieval, and Services. Available online.

    Zenthöfer, Jochen. 2018. “Wie Universitäten Auf Plagiate in Doktorarbeiten Reagieren: Auch Mit Diebstahl Kann Man Es Weit Bringen.” Frankfurter Allgemeine, April 18, 2018. Available online.

  • 2 May 2018 11:21 | Michael Seadle (Administrator)

    Justice is often slow. Articles with integrity problems can stay in print without any warning label for years. Chen (2013) wrote:

    “We found that it takes about 2 years, on average, to retract an article and another 2 years to see a substantial decrease of citations to the retracted article.”

    Two years may well even underestimate the time to retraction, since the accusation often triggers formal investigations at universities and at journals, before either institution is ready to take action. As soon as an accusation becomes public, the press typically pushes for swift action, and university authorities typically want to make the problem go away, without much concern for the assumption of innocence that is part of democratic justice systems. One of the constant themes of this column is that integrity problems are sometimes more complex than the accusations imply. Nonetheless two years is a long time, during which ideas can become easily established.

    From a journal perspective, the commercial value of an article declines sharply two years after publication, though value over time varies greatly with the field: humanities articles generally have a longer half-life than articles in the natural sciences or medicine. Most researchers in most fields will have read an article before two years are up, if it is at all relevant to their work. This means that an article that a publisher has retracted after two years has already exhausted a significant part of its commercial value and is intellectually present in the minds of the scholarly community. Two years more for a decrease in citations is hardly surprising, since scholars who read a paper are unlikely to go back to read it again. Likely they have a digital copy or a paper copy and work from that for their own new article.

    Authors may also ignore a retraction for a variety of reasons that may depend on the reason for the retraction. As Madlock-Brown and Eichmann (2015) wrote:

    There are many reasons articles may be retracted, some more problematic than others.

    A work that was retracted for plagiarism, for example, may still contain worthwhile information, despite the ethical and copyright violations. Readers may also discount retractions for procedural or peer review issues. Self-citation plays a role too.

    “18% of authors self-cite retracted work post retraction with only 10% of those authors also citing the retraction notice.” (Madlock-Brown & Eichmann, 2015)

    What exactly authors are citing from their own retracted paper may matter. It is not quite fair to assume that everything in a paper is contaminated because of a retraction. The degree to which an integrity violation in one part of a paper affects others may depend on the field. A humanities paper may, for example, draw multiple conclusions, only one of which the retraction affects. The assumption that everything in a retracted paper is flawed is part of the black-or-white thinking that currently pervades the integrity literature.

    The interesting question is whether the flawed portions of a retracted work, especially faked or manipulated data, continue in the minds of scholars after the integrity violation is discovered and established beyond reasonable doubt. Greitemeyer (2014) writes:

    … numerous studies have shown that corrections do not work as intended, in that individuals are influenced in their later judgments by misinformation even after correction. For instance, Loftus (1979) found that after witnessing an event, exposure to misleading information makes a person often report something that was only suggested. This phenomenon has been labeled the misinformation effect…

    In some ways this is not surprising. If the original article made a clear and cogent argument that seemed on the face of it to be reasonable, a memory of and even a belief in the argument may persist.

    “Once a belief is formed, people generate explanations that fit the evidence. These explanations continue to imply that the belief is correct even after exposure to evidence that invalidates the evidence once used to support one’s belief.” (Greitemeyer, 2014)

    An interesting example can be found in the retracted study by Diederik Stapel where he asks travelers to choose a chair next to a Dutch-African or a Dutch-Caucasian. (Stapel & Lindenberg, 2011) The data may have been fake, but the conclusion felt so plausible that it remained in the minds of many. Indeed, this reference to a retracted work is an example of why such citations may take place.

    The good news is that researchers who are accused and exonerated may not suffer long term damage to their reputation. Greitemeyer and Sagioglou (2015) write:

    The present research suggests that people do abandon their attitude toward an accused researcher after learning that the researcher has been exonerated. In both studies, participants in the exoneration condition had a more favorable attitude toward the researcher than participants in the uncorrected accusation condition. Moreover, in the exoneration condition, participants’ post-exoneration attitude was more favorable than their pre-exoneration attitude.

    This should be a comforting thought to those who are exonerated, but those cases seem to be rare. Interestingly enough Greitemeyer and Sagioglou (2015) begin with the example discussed in last week’s column, and note: “…it is important to keep in mind that the LOWI concluded that it cannot be determined whether Förster had manipulated the data.” Thus far he has not been exonerated and may well have given up hope. For others it may offer a grain of comfort after a time of stress.


    Ms. Vera Hillebrand (MA) suggested the topic and the title. She also provided most of the references.


    Chen, Chaomei, Zhigang Hu, Jared Milbank, and Timothy Schultz. 2013. “A Visual Analytic Study of Retracted Articles in Scientific Literature.” Journal of the American Society for Information Science and Technology 64 (2): 234–53. Available online.

    Greitemeyer, Tobias. 2014. “Article Retracted, but the Message Lives On.” Psychonomic Bulletin & Review 21 (2): 557–561. Available online.

    Greitemeyer, Tobias and Sagioglou, Christina. 2015. “Does Exonerating an Accused Researcher Restore the Researcher’s Credibility?” PloS One 10 (5). Available online.

    Madlock-Brown, C.R. & Eichmann, D. 2015. “The (Lack of) Impact of Retraction on Citation Networks.” Sci Eng Ethics 21 (127). Available online.

    Stapel, Diederik A, and Siegwart Lindenberg. 2011. “Coping with Chaos: How Disordered Contexts Promote Stereotyping and Discrimination.” Science 332 (6026): 251–253. Available online.

  • 1 May 2018 13:46 | Thorsten Beck (Administrator)

    This video introduces the image manipulation research carried out at the HEADT Centre. It discusses the relevance of understanding and analyzing images using existing software tools such as Photoshop, ImageJ or Gimp and explains the importance of establishing a comprehensive database that may help teams around the globe to develop and train algorithms for image manipulation detection. The overall aim is to raise awareness and to make detection tools more efficient. The Centre thus aims to make a significant contribution to a more thorough understanding of the phenomenon of image manipulation and to sharpen the view on what kinds of manipulation require a closer look.

  • 25 Apr 2018 11:16 | Michael Seadle (Administrator)

    Data falsification cases generally take time to discover, and generally require someone who is motivated enough to look for problems. Falsification should theoretically be found in the course of peer review, and sometimes is, but journals do not routinely make public the detailed results of peer review. Data falsification can also be hard to prove with certainty. This column will look at a case from social psychology that arose in the wake of the Diederik Stapel retractions. Stapel admitted his guilt and his name is now routinely part of discussions about data falsification. The 2014 case under discussion here is somewhat different because the author of the retracted papers still insists on his innocence. Since the person’s name is irrelevant to the scholarly discussion, this column will refer to him only as JF. Anyone who really wants to learn his name need only look at the reference.

    The issue in the JF case involves datasets whose results are statistically too perfect. An unnamed whistleblower did an analysis:

    “The chances of this happening were one in 508,000,000,000,000,000,000, he claimed.” (Kolfschooten, 2014)

    The whistleblower is apparently known to the university and to the National Board for Research Integrity (LOWI) in the Netherlands (Kolfschooten, 2014). Maintaining the whistleblower’s anonymity seems legitimate as long as due process is followed and the accused has a reasonable chance to respond. Just how much opportunity JF had to respond is unclear from published sources. He implied that the opportunity was limited in an open letter to Retraction Watch (Amarcus41, 2014):

    The rapid publication of the results of the LOWI and UvA [University of Amsterdam] case happened quite unexpectedly, the negative evaluation came unexpectedly, too. Note that we were all sworn to secrecy by the LOWI, so please understand that I have to write this letter in zero time. Because the LOWI, from my point of view, did not receive much more information than was available for the preliminary, UvA-evaluation, and because I did never did something even vaguely related to questionable research practices, I expected a verdict of not guilty… I do feel like the victim of an incredible witch hunt directed at psychologists after the Stapel-affair.

    JF appears not to have kept the original data, only his summary of the results, which is a lesson to other scholars not to be too ready to clean their files in case the original data are needed. Investigators also raised suspicions about the data in the thesis of one of JF’s doctoral students. The doctoral student was declared innocent of wrongdoing, because the data came from JF. For JF the trouble did not stop:

    “A panel of statistical experts from UvA that embarked on a second, more comprehensive investigation found “strong evidence for low veracity” of the results in all three papers, as well as in five others.” (Kolfschooten, 2016)

    And “… as part of a settlement with the German Society for Psychology (DGPs)” JF agreed to further retractions (Palus, 2016). The weight of opinion has been strongly against JF to the point that he left the academic world for private practice. (Stern, 2017)

    In a sense the case is closed, but questions remain. Accusations of fraud tend to come in groups, perhaps because an initial case inspires people to look more carefully, and perhaps because opinion shifts away from a presumption of innocence. After the Stapel case, Uri Simonsohn built a statistical tool to detect the possibility of certain kinds of fraud where the data patterns were too perfect to be believed (Enserink, 2012). There is no evidence that this tool was involved in JF’s case, but the principle appears to be the same: the data were just too perfect, not merely once, but in paper after paper. Of course high quality data are what scholars need to get publications. The push to get perfect data is strong.

    One should not forget how complex the creation of a research data set is, and that experienced researchers learn how to get good results without necessarily faking or directly manipulating the data. Selecting participants is an art in a world where genuine random selection is often impossible. A highly successful scholar might unconsciously seek just the right subjects without obvious tampering, and might learn how to ask exactly the right questions in exactly the right way to elicit exactly the right responses without further manipulation. Perhaps this seems implausible, but highly successful researchers must do something different or they would not be quite so untypical.

    In any particular case, repeated perfect results must seem unlikely, but it may be less unlikely that factors other than outright fraud could play a role. In the case of JF, the investigation seems never to have considered other reasons.

    One of the lessons from this case for researchers young and old is to keep all of the experimental data over a longer period. The lack of original data was a factor in this case that counted strongly against JF.


    Amarcus41. 2014. “Social Psychologist Förster Denies Misconduct, Calls Charge ‘Terrible Misjudgment.’” Retraction Watch. 2014. Available online.

    Enserink, Martin. 2012. “Fraud-Detection Tool Could Shake up Psychology.” Science 337 (6090). American Association for the Advancement of Science: 21–22. Available online.

    Kolfschooten, Frank van. 2014. “Scientific Integrity. Fresh Misconduct Charges Hit Dutch Social Psychology.” Science (New York, N.Y.) 344 (6184). American Association for the Advancement of Science: 566–67. Available online.

    Kolfschooten, Frank van. 2016. “No Tenure for German Social Psychologist Accused of Data Manipulation.” Science, July. Available online.

    Palus, Shannon. 2016. “Psychologist Jens Förster Earns Second and Third Retractions as Part of Settlement.” Retraction Watch. 2016. Available online.

    Stern, Victoria. 2017. “Psychologist under Fire Leaves University to Start Private Practice – Retraction Watch.” Retraction Watch. 2017-12-12. Available online.

  • 19 Apr 2018 11:13 | Michael Seadle (Administrator)

    Problems with data are arguably the most serious issue for information integrity in the research world, because they undermine the ability of scholars to build on past results. These problems come in many variations, including people who make up fake data, people who manipulate data to get specific results, and people who leave out data or sources. Each of these represents some form of misconduct when done deliberately. Nonetheless not everyone is guilty of malicious intent. Ordinary negligence plays a role too. The results remain unreliable and irreproducible, but the persons involved may be innocent of intentional wrongdoing. This column looks at the scholarly literature on “honest” errors.

    Classification Issues

    Resnik (2012) explains that recognizing honest error is important but hard:

    “It is important to distinguish between misconduct and honest error or a difference of scientific opinion to prevent unnecessary and time-consuming misconduct proceedings, protect scientists from harm, and avoid deterring researchers from using novel methods or proposing controversial hypotheses. … the line between misconduct and honest error or a scientific dispute is often unclear,”

    Precisely what constitutes honest error may depend on personal judgment. An older study by Nath (2006) in the Medical Journal of Australia looked at “[a]ll retractions of English language publications indexed in MEDLINE between 1982 and 2002…” and “[t]wo reviewers categorised the reasons for retraction of each article…”. Nath concluded that:

    “Of the 395 articles retracted between 1982 and 2002, 107 (27.1%) were retracted because of scientific misconduct, 244 (61.8%) because of unintentional errors, and 44 (11.1%) could not be categorised.”

    This percentage suggests a surprisingly high rate of unintentional error. While it is possible that misconduct has increased significantly over time (see below for more recent numbers), the more likely lesson here is that it matters how the classification is made. It is hard to know how accurate the classifications of misconduct are under circumstances where the assumption of innocence is not always strictly observed after an accusation has been made.

    Estimates of Size

    Later studies do not confirm the Nath estimate about the number of unintentional errors. An article by Arturo Casadevall (2014) argues that

    Analysis of the retraction notices for 423 articles indexed in PubMed revealed that the most common causes of error-related retraction are laboratory errors, analytical errors, and irreproducible results. … The database used for this study includes 2047 English language articles identified as retracted articles in PubMed as of May 3, 2012…

    This suggests that the cause of just under 12% of the PubMed retractions is essentially ordinary human error. A different study by Moylan and Kowalczuk (2016) looks at the BioMed Central journals and finds a similar percentage:

    “Honest error accounted for 17 retractions (13%) of which 10 articles (7%) were published in error. … A total of 13 articles (10%) of retractions were due to problems with the data. Often these issues occurred through honest error in how the data were handled, for example … although in some cases it is difficult to determine whether honest error or misconduct was the cause.”

    Daniele Fanelli (2016) offers a somewhat higher percentage of honest error:

    However, retractions reliably ascribed to honest error account for less than 20% of the total, and are often a source of dispute among authors and a legal headache for journal editors. The recalcitrance of scientists asked to retract work is not surprising. Even when they are honest and proactive, they have much to lose: a paper, their time and perhaps their reputation. Much reluctance to retract errors would be avoided if we could easily distinguish between ‘good’ and ‘bad’ retractions.

    In this case good retractions are generally ones where the authors recognize their own mistake and ask for the paper to be withdrawn. Fanelli (2016) makes the further argument that:

    Self-retractions should be considered legitimate publications that scientists would treat as evidence of integrity. Self-retractions from prestigious journals would be valued more highly, because they imply that a higher sacrifice was paid for the common good.

    This could, as he notes, be open to abuse, but some abuse could well be tolerable in the interests of providing an incentive for researchers to withdraw misleading results so that they do not mislead other scholars. Considering present publication pressure and the effect of public opinion, researchers may be unwilling to admit honest errors because they fear being thought guilty of misconduct. It may be hard to escape censure regardless of the choice.

    Greyscale Measurement

    One of the measurements that can help define honest error is the degree to which errors confirm the desired conclusions. This is not to say that every error in favor of the authors’ arguments is dishonest, but errors that weaken the conclusion are more likely to be unintentional. There is of course a human tendency to believe confirming results and to doubt disruptive ones, and a part of research training that may need more emphasis is a healthy skepticism toward desired results. Another form of measurement has to do with the frequency of error. Everyone makes some errors. When authors repeatedly make honest errors, it is reasonable to expect those errors to be spread in both directions, some for and some against the conclusions. A pattern that is consistently in favor of the desired conclusion may imply more bias than honesty.
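
    The reasoning about error direction can be made concrete with a simple sign test. The sketch below is only an illustration, not a method proposed in the literature cited here. It assumes that errors are independent and that an honest error is equally likely to help or hurt the authors' conclusion. Both are simplifying assumptions, and the function name and the counts are hypothetical.

```python
from math import comb


def sign_test_p_value(favourable: int, total: int) -> float:
    """One-sided probability of seeing at least `favourable` errors that
    support the conclusion out of `total` errors, assuming each error is
    equally likely to help or hurt (p = 0.5). This is an assumption of
    the sketch, not a finding from the cited studies."""
    if not 0 <= favourable <= total:
        raise ValueError("favourable must be between 0 and total")
    # Count the outcomes with `favourable` or more favourable errors,
    # then divide by the number of equally likely outcomes (2 ** total).
    tail = sum(comb(total, k) for k in range(favourable, total + 1))
    return tail / 2 ** total


# Hypothetical counts, not taken from any study cited in this column.
p_mixed = sign_test_p_value(favourable=6, total=10)
p_one_sided = sign_test_p_value(favourable=10, total=10)
print(f"6 of 10 errors favourable:  p = {p_mixed:.4f}")
print(f"10 of 10 errors favourable: p = {p_one_sided:.4f}")
```

    Under these assumptions a mixed pattern is unremarkable, while a run of errors that all favor the conclusion would be very unlikely to arise by chance. In keeping with the greyscale approach, a small probability is a reason to look more closely and not a verdict of misconduct.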

    Those judging integrity should not forget that honest errors exist, and that people under career or social pressure may be more error prone without particular ill intent.


    Casadevall, Arturo, R. Grant Steen, and Ferric C. Fang. 2014. “Sources of Error in the Retracted Scientific Literature.” FASEB Journal 28 (9): 3847–55. Available online.

    Fanelli, Daniele. 2016. “Set up a ‘Self-Retraction’ System for Honest Errors.” Nature. Available online.

    Moylan, Elizabeth C., and Maria K. Kowalczuk. 2016. “Why Articles Are Retracted: A Retrospective Cross-Sectional Study of Retraction Notices at BioMed Central.” BMJ Open 6 (11). Available online.

    Nath, Sara B., Steven C. Marcus, and Benjamin G. Druss. 2006. “Retractions in the Research Literature: Misconduct or Mistakes?” Medical Journal of Australia. Available online.

    Resnik, David B., and C. Neal Stewart. 2012. “Misconduct versus Honest Error and Scientific Disagreement.” Accountability in Research. Available online.

  • 11 Apr 2018 11:55 | Melanie Rügenhagen (Administrator)

    The HEADT Centre launches its first column on Information Integrity. It will appear weekly on Wednesdays with contributions primarily by Principal Investigator Prof. Dr. Michael Seadle. Other potential authors include Dr. Thorsten Beck, whose expertise is in image manipulation.

    If you are interested in scholarly perspectives on Information Integrity, follow this blog, where we announce each new article. You can also reach the column from the menu of our page (Research -> Column on Information Integrity).

    Read the first article here!

  • 11 Apr 2018 11:04 | Michael Seadle (Administrator)

    What is Information Integrity?

    Information integrity is fundamentally about what makes information true or false, both at the scholarly level (research integrity) and for public and policy discourse. There are reports about false information almost daily. A recent example involves the BBC, which has long been a model for the integrity of its reporting. (Sweney, 2018) This column will focus mainly on the scholarly aspects of information integrity, but the effect of integrity problems on policy matters (public health issues, for example) will not be ignored.

    The topic covers a broad range of problems, including data falsification, image manipulation, and plagiarism. While plagiarism is perhaps the most prominent issue, it is primarily an ethical and legal issue and generally does not undermine scholarship that builds on it, because the results are not necessarily false. This column will discuss all aspects of information integrity, but will focus especially on data problems, since no generalized tools exist for detecting them, though a few disciplines (such as psychology) are working on such tools.

    A core concept in my book on “Quantifying Research Integrity” (Seadle, 2017) is the greyscale approach: integrity issues rarely separate neatly into simple black and white, guilty or innocent, categories. Many scholarly works have imperfections, and problematic works may still contain valid information. From the viewpoint of a university or a publisher, formal decision-making processes involving punishments and retractions may make black-and-white decisions about integrity problems preferable, but such black-and-white decisions can themselves be an integrity issue, since an overly simplistic label is at least partly untrue.

    Scholarly literature contains a wealth of examples of integrity problems going well back in historical time. Today there are tools for investigating plagiarism and for examining some kinds of image manipulation. Data falsification presents more of a challenge because of its variety and complexity. Simple cases such as that of Diederik Stapel, who admitted manufacturing his results, are rarer than scholars who make poor choices about data or its interpretation. (Bhattacharjee, 2013) Unintentional error is also an information problem, even if it is not falsification.

    Selection Bias

    Selecting problematic research may have lasting effects on political discourse as well as on scholarship. While the evidence for climate change appears to be overwhelming, studies by a small number of skeptics have given oil and coal lobbies in the US a tool for opposing effective measures to reduce greenhouse gases in the atmosphere. Natural science builds on the ability to reproduce results, and when many scientists produce the same results based on a wide range of measures, the conclusions are normally accepted as valid. Lay persons unfamiliar with the scholarly literature sometimes select flawed studies that confirm their own personal preferences.

    Other more historical examples of selection bias can be found in claims about the inferiority of people in the US who were not of northern European descent — not merely those from Africa, but also from Italy, Ireland, and eastern Europe. Such claims were popular among the right wing in many European countries in the Nazi era, and are still popular among some groups today. A basis for them reaches back to Christoph Meiners (Grundriß der Geschichte der Menschheit, 1785) in the 18th century and is as modern as “The Bell Curve” by Richard Herrnstein and Charles Murray (1994). These studies did not fake their data and used scientific methods that seemed appropriate at the time, but they were selective about what evidence they included, and today it is widely accepted that the exclusions skewed results in a particular direction.

    Selection bias may have social and cultural origins that can change over time. For those who believe in the inerrancy of Holy Scripture, the data confirming evolution is invalid. A scholar of research integrity needs in some sense to be an historian, in order to understand the research in time and place, and to be an ethnographer, in order to understand integrity violations across cultures and disciplines. No one should imagine that integrity research involves simple labels.

    The Research Integrity Literature

    This column will focus on discussing papers about research integrity and will look at specific cases, whose complexity gives opportunities to apply a greyscale analysis. There are many good sources of information, not the least of which is Retraction Watch (Oransky, 2018), which provides an excellent news feed and classifies cases of retractions by type and field. Retractions may represent only part of the problem, simply because discovering problems is hard and because false positives may distract from more important issues. The ability to reproduce results is a classic hallmark of good science, but there is good evidence that results in behavioral and social science studies are harder to reproduce than natural-science results for the simple reason that social circumstances change.

    The goal of this column is scholarly, not investigative. It does not actively seek out new cases where research integrity may have been violated, but seeks to examine existing cases in order to apply a greyscale understanding of what happened and what the consequences are. As Principal Investigator for the research integrity part of the HEADT Centre, I will be the primary columnist, but others will likely contribute as well, including Dr. Thorsten Beck, who specializes in image manipulation.


    Bhattacharjee, Yudhijit. 2013. “The Mind of a Con Man.” New York Times, April 26, 2013. Available online.

    Seadle, Michael. 2017. Quantifying Research Integrity. Morgan & Claypool: Synthesis Lectures on Information Concepts, Retrieval, and Services. Available online.

    Sweney, Mark. 2018. “No Title.” New York Times, April 4, 2018. Available online.

    Oransky, Ivan, and Adam Marcus. 2018. “Retraction Watch.” Available online.

  • 28 Feb 2018 11:52 | Melanie Rügenhagen (Administrator)

    Michael Seadle (HEADT Centre) and David Neal (Elsevier) are on the steering committee of the NetDiploma project, which is currently giving a two-day workshop at Northumbria University. Read all about the NetDiploma project on their project website.

    Photo: Courtesy of the NetDiploma Project

  • 28 Feb 2018 11:45 | Melanie Rügenhagen (Administrator)

    The Berlin School of Library and Information Science at Humboldt-Universität zu Berlin will launch a new one-year postgraduate certificate programme in Digital Information Stewardship in autumn 2018. It is taught entirely in English and offers an option to take courses at University College Dublin.

    This programme blends distance learning with video-based discussions and three brief face-to-face meetings. The goal is to enable people to continue their careers and simultaneously to open new job opportunities.

    Find further details such as the admission requirements on our website: https://www.ibi.hu-berlin.de/en/teaching/postgraduate-certificate-programme/postgraduate-certificate-DIS

    You are welcome to apply. Contact the programme coordinator, Melanie Rügenhagen (melanie.ruegenhagen [at] hu-berlin.de), if you are interested.
