
  • 9 May 2019 14:00 | Michael Seadle (Administrator)

    VroniPlag has posted an analysis of the dissertation of Franziska Giffey (geb. Süllke), claiming to have documented plagiarism on 76 of the 205 pages with content, or 37.1%.

    "Bisher (8. Mai 2019, 09:20:57 (UTC+2)) wurden auf 76 von 205 Seiten Plagiatsfundstellen dokumentiert. Dies entspricht einem Anteil von 37.1% aller Seiten. Davon enthalten 11 Seiten 50%-75% Plagiatstext und 1 Seiten mehr als 75% Plagiatstext."

    "Up to now (8 May 2019 at 09:20:57 (UTC+2)), instances of plagiarism have been documented on 76 of 205 pages. This corresponds to 37.1% of all pages. Of those, 11 pages contain 50%-75% plagiarised text and 1 page more than 75%." [my translation] (VroniPlag, 2019)

    The figures on VroniPlag are misleading, because they give the impression that 37.1% of the whole content had plagiarism, rather than that problems (according to their definition) occurred on 37.1% of individual pages, regardless of whether just a few lines were involved. In fact the overall percentage is significantly lower by VroniPlag's own standards, if one uses the percentages linked to their own colour-coding:

    • Black is up to 50%

    • Dark Red is 50% to 75%

    • Red is 75% to 100%

    If one multiplies the number of pages in each of the colours times the maximum percentage, the results are as follows:

    • 64 pages are coloured black: 64 * 50% (the maximal value for black) = 32 pages worth of possible plagiarism.

    • 11 pages are coloured dark red: 11 * 75% (the maximal value for dark red) = 8.3 pages worth of possible plagiarism.

    • 1 page is coloured red: 1 * 100% (the maximal value for red) = 1 page worth of possible plagiarism.

    • The total for all three colours using the maximum percentages is: 41.3 pages or 20.1% of the 205 pages with content.

    Since the percentages associated with the colours are maximum values, the midpoint may give a more accurate picture:

    • 64 pages are coloured black: 64 * 25% (the midpoint between 0 and 50%) = 16 pages worth of possible plagiarism.

    • 11 pages are coloured dark red: 11 * 62.5% (the midpoint between 50% and 75%) = 6.9 pages worth of possible plagiarism.

    • 1 page is coloured red: 1 * 87.5% (the midpoint between 75% and 100%) = 0.9 pages worth of possible plagiarism.

    • The total for all three colours using the midpoint percentages is: 23.8 pages or 11.6% of the 205 pages with content.
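    The per-colour arithmetic above can be reproduced with a short script. This is only a sketch: the page counts and percentage bands are the ones quoted in this post, and the rounding is left exact rather than per-item.

    ```python
    # Page counts per VroniPlag colour band, taken from the figures above.
    pages = {"black": 64, "dark_red": 11, "red": 1}
    # Percentage bands associated with each colour: (lower, upper) bound.
    bands = {"black": (0.0, 0.50), "dark_red": (0.50, 0.75), "red": (0.75, 1.00)}
    total_pages = 205

    def weighted_estimate(point):
        """Sum pages * chosen point of each band ('max' or 'mid')."""
        total = 0.0
        for colour, count in pages.items():
            lo, hi = bands[colour]
            share = hi if point == "max" else (lo + hi) / 2
            total += count * share
        return total

    for point in ("max", "mid"):
        est = weighted_estimate(point)
        print(f"{point}: {est:.2f} pages = {est / total_pages:.1%}")
    ```

    Using the band maxima reproduces the roughly 20% figure; using the midpoints, the roughly 12% figure.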

    The VroniPlag figures need to be understood in context. VroniPlag's mission is to find plagiarism and they put the worst possible interpretation on their results. There is in fact a big difference between 37.1% and 11.6%. One could argue that 11.6% is still too much -- if all of the marked passages were genuine plagiarism -- but the numbers need to be presented in a more balanced and less misleading way, which VroniPlag fails to do. Systems like iThenticate give a percentage of words, not of pages with any potential plagiarism.

    VroniPlag publishes their criteria for plagiarism, which is commendable, but those criteria reflect rigid rules that are not universal in academic practice. The rules also take no account of legitimate choices about which source to cite or about the context within a work. In a literature review, for example, it is almost impossible not to reuse words from the articles being discussed.

    A set of standards that measures the number of overlapping words in a particular spatial context (that is, the number of words in a sentence or paragraph that overlap with another text) gives a more nuanced and more accurate view. An example can be found in my book "Quantifying Research Integrity". The results of this kind of "greyscale analysis" are not designed for capturing headlines, but for judging fairly.
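    A minimal sketch of such an overlap measure might count, for each sentence, the fraction of its words that also occur in a comparison text. This is my own illustration of the idea, not the greyscale method from the book, which also weighs context and word order:

    ```python
    import re

    def word_overlap_per_sentence(text, source):
        """For each sentence in `text`, return the fraction of its words
        that also appear in `source` (a crude spatial-overlap measure)."""
        source_words = set(re.findall(r"[a-z]+", source.lower()))
        results = []
        for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
            words = re.findall(r"[a-z]+", sentence.lower())
            if words:
                shared = sum(1 for w in words if w in source_words)
                results.append((sentence, shared / len(words)))
        return results

    # Toy example: a sentence that reuses wording scores higher than one that does not.
    source = "The integrity of research depends on accurate data."
    text = "The integrity of research depends on accurate data. Bees are insects."
    for sentence, score in word_overlap_per_sentence(text, source):
        print(f"{score:.0%}  {sentence}")
    ```

    A per-sentence score like this distinguishes a few reused phrases from a wholly copied page, which a page-level count cannot.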


    Seadle, Michael. December 2016. Quantifying Research Integrity. Morgan Claypool. Available online.

    VroniPlag, 2019. Eine kritische Auseinandersetzung mit der Dissertation von Dr. Franziska Giffey (geb. Süllke): Europas Weg zum Bürger - Die Politik der Europäischen Kommission zur Beteiligung der Zivilgesellschaft. Available online.

  • 4 Apr 2019 01:08 | Michael Seadle (Administrator)

    The US Federal Trade Commission has announced "a $50 million court judgment against Omics International of Hyderabad, India, and its owner, Srinubabu Gedela."  (New York Times, 3 April 2019). 

  • 2 Apr 2019 15:51 | Thorsten Beck (Administrator)

    How to Detect Image Manipulations Part V: The Subtraction Tool in ImageJ

    As already outlined in Part IV of this series, the image editing program ImageJ, developed by Wayne Rasband under the auspices of the United States National Institutes of Health, offers a wide range of image editing possibilities, as well as a wealth of tools and extensions that can be used for measuring and analyzing images.


    One of the obvious challenges when screening scientific images is to reliably identify elements that have been added subsequently. In the following experiment we are going to test the capacities and limitations of the subtraction operation in the Image Calculator – one of the tools available in ImageJ. We find this tool in the drop-down menu under ‘Process’ (see Fig. 1).

    Fig. 1. Image Calculator in ImageJ

    To test how effective the Image Calculator is, we imagine a scenario in which an element has been copied into an image. The Image Calculator should be able to identify this very element. I generated the following three examples for this test.


    In the first example (see Fig. 2 for the original image), one of the individual bees has been copied, moved and pasted back into the image (see Fig. 3). A comparison with the original shows how difficult it is to perceive this manipulation with the naked eye. Even a trained eye has difficulty clearly identifying such copied and re-used picture elements when there are many of the same kind.

    Fig. 2. Original Image

    Fig. 3. Image with bee added.

    In example 2 we apply this basic operation to an electrophoresis image to simulate a scenario in scientific research. An image element from the original image is duplicated and reused several times (see Fig. 4). Here, too, we will test how effectively the Image Calculator can help to identify and highlight this manipulation.

    Fig. 4. Electrophoresis Image (Original on the left, and altered version on the right.)

    In the third and last example (see Fig. 5 for the original image), one of the elements is removed from the photo using the Clone Stamp in Photoshop (and overwritten with background texture). Another element is copied and used to replace the removed element (see Fig. 6). The resulting image will then also be analyzed with the help of the ImageJ subtraction tool in the Image Calculator.

    Fig. 5. Original Image

    Fig. 6. Manipulated Version: Image element deleted with Photoshop Clone Stamp and pliers (Pos. 4 on the right) copied and pasted.


    In order to analyze the test images, a reference image is required as a source for comparison for each of the examples. Here are some instructions of the steps that must be performed one after the other in the program: First, the source image and the manipulated version are dragged and dropped onto the ImageJ tool bar. Both images are now displayed automatically and the Image Calculator opens. To determine the difference between both versions, the operation 'Subtract' (the original version is subtracted from the manipulated version) is selected (see Fig. 7). The result image clearly reveals the difference that was so difficult to see with the naked eye (see Fig. 8).

    Fig. 7. Image Calculator Menu

    Fig. 8. Subtraction Result of Bee Example Image
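    The subtraction step itself is straightforward image arithmetic and can be sketched outside ImageJ as well. The following fragment (using NumPy on a toy 3×3 greyscale "image", not the bee example) computes the same absolute pixel-wise difference, so identical regions come out black:

    ```python
    import numpy as np

    def subtract_images(suspect, reference):
        """Pixel-wise absolute difference of two equal-sized greyscale
        images (2-D uint8 arrays); identical regions come out black (0)."""
        if suspect.shape != reference.shape:
            raise ValueError("Images must have identical dimensions.")
        # Widen to int16 before subtracting to avoid uint8 wrap-around.
        diff = np.abs(suspect.astype(np.int16) - reference.astype(np.int16))
        return diff.astype(np.uint8)

    # Toy example: one altered pixel simulates a pasted-in element.
    original = np.zeros((3, 3), dtype=np.uint8)
    altered = original.copy()
    altered[1, 1] = 200
    diff = subtract_images(altered, original)
    print(diff)  # only the altered pixel is non-zero
    ```

    The shape check mirrors the limitation discussed below: once the two images no longer have identical dimensions, a pixel-wise comparison is no longer meaningful.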

    We proceed similarly with the second example. Here, too, the operation produces a quite clear result. The subsequently added blots are clearly displayed, while all identical image information remains black (see Fig. 9).

    Fig. 9. Subtraction Result of Electrophoresis Example Image

    Somewhat less clear, but still clear enough, is the result of the third experiment. The Image Calculator reveals both the copied and moved image element and the cloned region, as well as the image information that is now covered by background texture (see Fig. 10).

    Fig. 10. Subtraction Result of Tool Example Image


    Upon visual inspection, the Image Calculator in ImageJ reliably reveals the differences between non-manipulated source images and manipulated versions. Especially for images with elements that are difficult for the naked eye to identify and analyze, this tool is a real help.

    The procedure presented here, however, is an idealized scenario. The tool only produces such impressive results if the original version of the modified image is available for comparison with the manipulated image, which is rarely the case in real-world scenarios. Even if journal instructions required authors to submit the original image data along with the publication, a small change to the manipulated image is enough to significantly reduce the effectiveness of the Image Calculator. As soon as the dimensions of the two images are not identical, the program calculates a completely different result:

    Fig. 11. Subtraction Result with Cropped Reference Image

    In summary, the ImageJ subtraction tool delivered surprisingly clear and comprehensible results, with the caveat that these were artificial conditions.

    Provided that both the original and the reference image are available for analysis (and neither is cropped or stretched), the program delivers consistent and convincing results. It goes without saying that reality tends to be somewhat more complicated than hand-made test conditions.

    More tools will be evaluated soon – please visit HEADT.EU for upcoming posts.

    ©HEADT CENTRE 2019

  • 6 Dec 2018 13:55 | Melanie Rügenhagen (Administrator)

    In this video, Prof. Michael Seadle gives a short introduction to our exhibition “How Trustworthy? An Exhibition on Negligence, Fraud and Measuring Integrity”.

  • 19 Oct 2018 13:52 | Melanie Rügenhagen (Administrator)
    On the evening of 18 October 2018, we opened the exhibition HOW TRUSTWORTHY? – An Exhibition on Negligence, Fraud, and Measuring Integrity in the Jacob-und-Wilhelm-Grimm-Zentrum, the university library of Humboldt-Universität zu Berlin (HU Berlin). Short welcoming speeches by Prof. Dr. Degkwitz (director of the university library), Prof. Dr. Graßhoff (vice dean of the Faculty of Humanities) and Prof. Seadle, PhD (Senior Researcher at HU Berlin and Principal Investigator of the HEADT Centre) introduced the urgency of the topic.

    Impressions from setting up the exhibition. Photo credit: Thorsten Beck.

    The exhibition is a joint project of the HEADT Centre and the Berlin School of Library and Information Science at Humboldt-Universität zu Berlin and shows that both human error and deliberate manipulation can compromise the integrity of scholarly findings.

    Interested people can visit the exhibition until 17 December 2018 in the foyer of the library at no charge. The English version of the text can be accessed at any time on our website (here).

  • 10 Oct 2018 13:46 | Melanie Rügenhagen (Administrator)

    On 9 October, Elsevier Connect published the HEADT Centre article Combating image misuse in science: new Humboldt database provides “missing link” by Dr. Thorsten Beck. The article is free to access. You are welcome to read it here!

  • 21 Sep 2018 13:49 | Thorsten Beck (Administrator)

    The goal of this exhibition is to increase awareness about research integrity. The exhibition highlights areas where both human errors and intentional manipulation have resulted in damage to careers, and it serves as a learning tool.

    Cover Image: Eagle Nebula, M 16, Messier 16 // NASA, ESA / Hubble and the Hubble Heritage Team (2015) // Original picture in color, greyscale edited © HEADT Centre 2018

    The exhibition has four parts. One deals with image manipulation and falsification. Another addresses data problems, including human error and fabrication. A third is about plagiarism, fake journals and censorship. The last section covers detection and the nuanced analysis needed to distinguish the grey zones between minor problems and negligence in case of fraud.

    October 18 – December 17, 2018
    Opening on October 18, 6 pm
    in the foyer of the Jacob-und-Wilhelm-Grimm-Zentrum of Humboldt-Universität zu Berlin

    Geschwister-Scholl-Straße 1/3, 10117 Berlin

    Admission to the exhibition is free.

    An exhibition of the HEADT Centre (Humboldt-Elsevier Advanced Data & Text Center)
    in cooperation with the Berlin School of Library and Information Science
    at Humboldt-Universität zu Berlin

    Cover Image: Eagle Nebula, M 16, Messier 16
    False colored space photography (2015)
    NASA, ESA / Hubble and the Hubble Heritage Team
    Original picture in color, greyscale edited

  • 13 Jul 2018 12:00 | Melanie Rügenhagen (Administrator)

    We have published a new HEADT Centre video that introduces how the HEADT Centre is contributing to research on similarity search at Humboldt-Universität zu Berlin. The video is available here:

  • 13 Jun 2018 11:28 | Michael Seadle (Administrator)

    NOTE: Melanie Rügenhagen was co-author of this entry.

    Replication is difficult to apply to qualitative studies in so far as it means recreating the exact conditions of the original study — a condition that is often impossible in the real world. The key question then becomes: “how close to the original must a replication be to validate an original experiment?” (Seadle, 2018)

    This question is particularly important because of the widespread belief that only quantitative research is replicable. Leppink (2017) writes:

    “Unfortunately, the heuristic of equating a qualitative–quantitative distinction with that of a multiple–single truths distinction is closely linked with the popular belief that replication research has relevance for quantitative research only. In fact, the usefulness of replication research has not rarely been narrowed down even further to repeating randomised controlled experiments.” (Leppink, 2017)

    Dennis and Valacich (2014) suggest three categories for replication studies, only one of which is “exact” (see the column from 23 May 2018). The conceptual and methodological categories are both relevant to qualitative research, because the participants and the context can vary as long as the replication tests the inherent goals and concepts, as well as the methodological framework, of the original. In other words, successful qualitative replications can confirm the hypotheses at a higher level of generalisation, even when the specific contexts change. What matters is that the concepts and outcomes remain constant. As Polit and Beck (2010) write:

    “If concepts, relationships, patterns, and successful interventions can be confirmed in multiple contexts, varied times, and with different types of people, confidence in their validity and applicability will be strengthened.” (Polit & Beck, 2010)

    These authors support the use of replication in qualitative research, and argue that replication is the best way to confirm the results of a study:

    “Knowledge does not come simply by testing a new theory, using a new instrument, or inventing a new construct (or, worse, giving an inventive label to an old construct). Knowledge grows through confirmation. Many theses and dissertations would likely have a bigger impact on nursing practice if they were replications that yielded systematic, confirmatory evidence—or if they revealed restrictions on generalized conclusions.” (Polit & Beck, 2010)

    How can one ensure that the evidence is systematic? Leppink (2017) suggests that researchers in all kinds of studies have to decide when they no longer need more data in order to answer their research question and calls this concept saturation.

    It is important to remember that qualitative research normally does not generalise about results beyond the community involved in the samples, which sets a very limited and specific context for the research question. At some point researchers need to decide when their question is answered, stop their inquiries, and come to a conclusion. Leppink (2017) writes:

    “If saturation was achieved, one might expect that a replication of the study with a very similar group of participants would result in very similar findings. If the replication study leads to substantially different findings, this would provide evidence against the saturation assumption made by the researchers in the initial study.”

    Saturation means that the answer to a research question is complete, and becomes a core element of the “systematic, confirmatory evidence” (Polit & Beck, 2010) for analyzing validity. It can also help to provide metrics by uncovering the degree to which a study may be flawed or even intentionally manipulated.

    Nonetheless there are barriers. While a range of studies based on the same concepts and methodology can lead to insights about whether a phenomenon is true, not knowing exactly how the original researchers conducted their studies may make replication impossible (Leppink, 2017). This makes describing the methodology particularly important.

    None of this is easy. Replication studies remain a stepchild in the world of academic publishing. Gleditsch and Janz (2016) write about efforts to encourage replicating their own research area (international relations):

    “Nevertheless, progress has been slow, and many journals still have no policy on replication or fail to follow up in practice.”

    The problem is simple. There is no fame to be gained in showing that someone else’s ideas and conclusions are in fact correct, and it is hardly surprising that ambitious researchers avoid doing replications, especially for qualitative research, where the risk of failing is high and succeeding only makes readers think that the original study was done well.


    Gleditsch, Nils Petter, and Nicole Janz. 2016. “Replication in International Relations.” International Studies Perspectives, ekv003. Available online.

    Polit, Denise F., and Cheryl Tatano Beck. 2010. “Generalization in Quantitative and Qualitative Research: Myths and Strategies.” International Journal of Nursing Studies 47 (11): 1451–58. Available online.

    Michael Seadle. 2018. “Replication Testing.” Column on Information Integrity 2/2018. Published on 23 May 2018. Available online.

    Leppink, Jimmie. 2017. “Revisiting the Quantitative–Qualitative-Mixed Methods Labels: Research Questions, Developments, and the Need for Replication.” Journal of Taibah University Medical Sciences 12 (2). Elsevier B.V.: 97–101. Available online.

    Dennis, Alan R, and Joseph S Valacich. 2014. “A Replication Manifesto.” AIS Transactions on Replication Research 1 (1): 1–5.

  • 23 May 2018 11:26 | Michael Seadle (Administrator)

    Testing for Reliability

    The principle that scientists (and scholars generally) can build on past results means that past results ought to be replicable. Brownill et al. (2016) write:

    This replication by different labs and different researchers enables scientific consensus to emerge because the scientific community becomes more confident that subsequent research examining the same question will not refute the findings.

    And MacMillan (2017) writes in his editorial “Replication Studies”:

    Replication studies are important as they essentially perform a check on work in order to verify the previous findings and to make sure, for example, they are not specific to one set of data or circumstance.

    Increasingly replication is also seen as a way to test for data falsification, on the presumption that unreliable results will not be replicable; but as with most forms of testing, it offers no simple answer.

    How does Replication Work?

    The ability to replicate results means that those doing the replication need exact information about how the original experiment was carried out. In physics and chemistry this means precise descriptions in lab books and in articles, and the same machines using the same calibration. In the social sciences, it can be much harder to reproduce the exact conditions, since they depend on human reactions and a variable environment. One well-known case comes from a study by Cornell social psychologist Daryl Bem, who did a word recognition test:

    “[Bem] published his findings in the Journal of Personality and Social Psychology (JPSP) along with eight other experiments providing evidence for what he refers to as “psi”, or psychic effects. There is, needless to say, no shortage of scientists sceptical about his claims. Three research teams independently tried to replicate the effect Bem had reported and, when they could not, they faced serious obstacles to publishing their results.” (Yong, 2012)

    The fact that the other research teams could not replicate the experiment successfully did not suggest to anyone that the data were fake (presumably the students could attest to that), but the failure did cast doubt on the apparent “psychic effects”. Since an exact replication using those Cornell students in that class with all the same social conditions was not possible, the question arises: how close to the original must a replication be to validate an original experiment?

    Dennis and Valacich (2014) talk about “three fundamental categories” of replication:

    • Exact Replications: These articles are exact copies of the original article in terms of method and context. All measures, treatments, statistical analyses, etc. are identical to those of the original study…
    • Methodological Replications: These articles use exactly the same methods as the original study (i.e., measures, treatments, statistics etc.) but are conducted in a different context. …
    • Conceptual Replications: These articles test exactly the same research questions or hypotheses, but use different measures, treatments, analyses and/or context….

    Since the Cornell students were not available for the replications, the replications presumably come under the “methodological” category, or perhaps even the “conceptual”. Dennis and Valacich (2014) comment: “Conceptual replications are the strongest form of replication because they ensure that there is nothing idiosyncratic about the wording of items, the execution of treatments, or the culture of the original context that would limit the research conclusions.”

    In any case these replication types represent a significant contribution to knowledge by confirming or throwing skepticism on the earlier results. Why then did the research teams have trouble publishing their results?

    Publishing Replications

    Most journals do not encourage replications. A study that strikes readers as new and exciting and generates attention is a plus, whereas a study that appears to cover old ground, even if it has scholarly value, is less likely to get through the peer review process. Lucy Goodchild van Hilten (2015) writes:

    Publication bias affects the body of scientific knowledge in different ways, including skewing it towards statistically significant or “positive” results. This means that the results of thousands of experiments that fail to confirm the efficacy of a treatment or vaccine – including the outcomes of clinical trials – fail to see the light of day.

    This may be changing and the degree to which it is true depends in part on the academic discipline. David McMillan (2017) writes:

    Cogent Economics & Finance recognises the importance of replication studies. As an indicator of this importance, we now welcome research papers that focus on replication and whose ultimate acceptance depends on the accuracy and thoroughness of the work rather than seeking a ‘new’ result.

    If other journals follow this trend, there could be significantly more testing of scholarly results. Nonetheless a problem remains. Except for the design time, replicating results costs almost as much as doing the original experiment, and if the results are in fact exactly the same, the replication is unlikely to be published. Some fields solve the problem with a repeat-and-extend approach where replication is tied to new features that explicitly build on the replicated results. Much depends on the culture of the discipline.

    For all of its problems, replication remains one of the most effective and reliable tools for uncovering flaws and fake data, and should be used more widely.


    Bem (2015) did a further “meta-analysis of 90 [replication] experiments from 33 laboratories in 14 countries …” which he claims supports his hypothesis. He published this meta-analysis in an open-access journal for the life sciences that charges $1000 for an article of this length, and Bem explicitly declared that he had no grant support. If nothing else, this is a sign of how difficult it is to continue the discourse in standard academic venues.


    Ms. Melanie Rügenhagen (MA) suggested the topic and assisted with the research. Prof. Dr. Joan Luft provided research content.


    Bem D, Tressoldi PE, Rabeyron T and Duggan M. 2015. “Feeling the Future: A Meta-Analysis of 90 Experiments on the Anomalous Anticipation of Random Future Events.” F1000Research 4:1188. Available online.

    Brownill, Sue, Dennis, Alan R., Binny, Samuel, Tan, Barney, Valacich , Joseph and Whitley, Edgar A. 2016. “Replication Research: Opportunities, Experiences and Challenges.” In Thirty Seventh International Conference on Information Systems. Dublin, Ireland. Available online.

    Dennis, Alan R, and Joseph S Valacich. 2014. “A Replication Manifesto.” AIS Transactions on Replication Research 1 (1): 1–5.

    Goodchild van Hilten, Lucy. 2015. “Why It’s Time to Publish Research ‘Failures.’” Elsevier Connect. Available online.

    McMillan, David. 2017. “Replication Studies.” Cogent Economics and Finance, 2017. Available online.

    Yong, Ed. 2012. “Replication Studies: Bad Copy.” Nature 485 (7398): 298–300. Available online.
