The Clone Detection Tool
Another tool available on the website “Forensically” (https://29a.ch/photo-forensics/#clone-detection) helps with the detection of clones. Like the Error Level Analysis Tool, which was discussed earlier on this blog, the Clone Detection Tool provides a set of adjustable settings that help the user identify manipulated areas in images.
Before discussing this tool, we need to take a closer look at another instrument – the “clone stamp” in Adobe Photoshop. This tool makes it easy to alter images significantly in ways that the average viewer tends not to recognize: it reproduces the “natural” texture of one area of an image and uses it to overwrite another. Detecting image manipulations in all kinds of images (e.g., artistic or scientific) requires a basic understanding not only of the tools designed for detection, but also of the tools and practices that made the manipulation possible. In other words, whoever wants to detect manipulations must be able to produce them.
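At its core, the clone stamp simply copies pixel data from a source patch onto a destination patch. Here is a minimal sketch using NumPy; the function name and the tiny synthetic “image” are my own illustration, not Photoshop internals:

```python
import numpy as np

def clone_stamp(img, src, dst, size):
    """Copy a size x size patch from src=(row, col) over dst=(row, col),
    mimicking the basic behaviour of Photoshop's clone stamp."""
    out = img.copy()
    sr, sc = src
    dr, dc = dst
    out[dr:dr + size, dc:dc + size] = img[sr:sr + size, sc:sc + size]
    return out

# Tiny grayscale "image": a dark field with one bright, unwanted object.
image = np.zeros((8, 8), dtype=np.uint8)
image[2:4, 5:7] = 200                        # the object to erase
cleaned = clone_stamp(image, src=(0, 0), dst=(2, 5), size=2)
# The object's pixels are now overwritten with background texture.
```

The real tool adds soft brush edges and blending, but the essential operation – duplicating existing texture – is the same, and it is precisely this duplication that clone detectors look for.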
This is why I include three examples of how to work with the clone stamp:
Erasing Artifacts with the Clone Stamp in Adobe Photoshop
Fig. 1: Blossom and Bee
The idea of the first experiment is to make this picture a little more dramatic by erasing elements surrounding the blossom on the left and on the right.
Fig. 2: Schaffhausen Boat Scene
The second image shows a boat near the Rheinfall in Schaffhausen. I decided to let the boat disappear completely using the clone tool.
Fig. 3: Outdoor Scene with Shadows
The third image manipulation is a little more complicated, since it is not trivial to reproduce consistent color flows. The plan is to hide two of the shadows in the foreground (keeping only the shadow in the middle), and to erase the figure in the background.
Before I present and discuss the final results of the manipulations, here are some screenshots that document how I altered the images.
Overview of manipulations:
1_Blossom and Bee
Fig. 4: Erasing of background texture
Fig. 5: Reproducing flower petals.
2_Schaffhausen Boat Scene
Fig. 6: Cloning the water surface gives the impression that the boat is sinking.
Fig. 7: Half of the boat vanished.
3_Outdoor Scene with Shadows
Fig. 8: Image with replaced shadows in the foreground.
Figs. 9-11: Cloning the background figure. Because the figure is located in front of different background textures (like the tree, the fence, the rocks and the path), it is crucial to copy information from each of these elements to produce a somewhat natural impression.
THE CLONE DETECTION TOOL
Now, here are the three images after the manipulations:
Fig. 12: The image has a little less background texture, and duplicating the petals highlights the blossom.
Fig. 13: The boat is gone, and the manipulation is not easy to trace.
Fig. 14: A closer look reveals the manipulations in the foreground. The lighting on the right appears unnatural, and there are visible traces of the clone tool. The erased figure in the background is harder to trace.
Let us now see whether the Clone Detection Tool can reveal these manipulations. For comparison, I analyzed both the cloned and the original image to give an impression of how the results differ.
Example 1: Blossom and Bee
Fig. 15: It turns out that the Clone Detection Tool detects most of the manipulations. It highlights the duplicated/cloned flower petals and reveals some other similarities in background structures (not all of which represent manipulated areas). It does not, however, highlight any of the rather drastic background manipulations (I erased parts of the background on the left and enlarged the dark area on the right – best seen by comparison with the original below).
Here is how the Clone Detection Tool analyzes the unaltered version of the image:
Fig. 16: Although the tool traces some similarities, the difference between the original and the manipulated image remains clearly visible. Overall, the tool helps reveal many clones – except for those in the background texture.
Example 2: Schaffhausen Boat Scene
Fig. 17: In the second image, the analysis highlights certain areas, but since the texture of the water surface is naturally repetitive, it is not easy to tell the manipulated parts from the untouched ones.
Fig. 18: The comparison is useful in this case because it shows that the manipulated areas are highlighted differently from other highlighted areas. One possible conclusion: whenever the detection tool highlights a section strongly, this may be interpreted as evidence of cloning.
Example 3: Outdoor Scene with Shadows
Fig. 19: The last example shows some of the tool’s limitations. It fails in both directions: it highlights areas in the image that have not been touched, and it misses areas that the human eye can easily identify as suspicious.
Fig. 20: Most of the textures and elements in the image are highly repetitive, which may be a reason why the algorithm does not detect the cloned areas. What is obvious: the clone detection tool does not work well in this case.
The help section of the “Forensically” website says: “The clone detector highlights copied regions within an image. These can be a good indicator that a picture has been manipulated.” (https://29a.ch/photo-forensics/#help) As the above experiments have shown, there are clear limits to the tool’s capabilities. Yet none of the tools on “Forensically” claims to reveal all manipulations under all circumstances. Rather, they promise to make it easier to identify where to look more closely.
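The block-matching idea behind such a clone detector can be sketched in a few lines of Python: split the image into blocks and report blocks whose content occurs more than once. This is a deliberately naive sketch of my own (exact matching on non-overlapping blocks); production tools use overlapping blocks and robust features such as DCT coefficients so that matches survive recompression:

```python
import numpy as np
from collections import defaultdict

def find_cloned_blocks(img, block=4):
    """Naive copy-move detection: group identical pixel blocks and
    return the positions of blocks that occur more than once."""
    h, w = img.shape
    groups = defaultdict(list)
    for r in range(0, h - block + 1, block):
        for c in range(0, w - block + 1, block):
            groups[img[r:r + block, c:c + block].tobytes()].append((r, c))
    return [locs for locs in groups.values() if len(locs) > 1]

# Random "photo" with one clone-stamped region.
rng = np.random.default_rng(0)
photo = rng.integers(0, 256, (16, 16), dtype=np.uint8)
photo[8:12, 8:12] = photo[0:4, 0:4]   # simulate a clone-stamp copy
matches = find_cloned_blocks(photo)   # the copied pair (0, 0)/(8, 8) is reported
```

Even this toy version exhibits the behaviour seen above: on flat or highly repetitive textures such as a water surface, many blocks match by coincidence, so the highlighting cannot distinguish natural repetition from deliberate cloning.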
All in all, out of the three examples discussed in this blog post, there is only one in which the tool clearly highlighted repetitive areas (Example 1), another one in which the tool indicated areas within the image that the algorithm marked as suspicious (Example 2), and one in which the algorithm remained oblivious, if not misleading (Example 3).
More tools will be evaluated soon – visit headt.eu for upcoming posts.
HEADT CENTRE 2017
Institutions typically treat research integrity violations as black and white, right or wrong. The result is that the wide range of grayscale nuances that separate accident, carelessness, and bad practice from deliberate fraud and malpractice often get lost. This lecture looks at how to quantify the grayscale range in three kinds of research integrity violations: plagiarism, data falsification, and image manipulation.
Quantification works best with plagiarism, because the essential one-to-one matching algorithms are well known and established tools exist for detecting matches. Questions remain, however, about how many matching words, of what kind, in what location, and in which discipline constitute reasonable suspicion of fraudulent intent. Different disciplines take different perspectives on quantity and location. Quantification is harder with data falsification, because the original data are often not available and because experimental replication remains surprisingly difficult. The same is true of image manipulation, where tools exist for detecting certain kinds of manipulation, but where those tools are also easily defeated.
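To illustrate the kind of one-to-one matching referred to here, a toy word-trigram comparison suffices (a sketch only; the function name and parameters are my own, and real plagiarism detectors add normalization, fingerprinting, and citation-aware filtering):

```python
def shared_ngrams(a, b, n=3):
    """Return the word n-grams two texts have in common -- the kernel
    of one-to-one plagiarism matching."""
    def grams(text):
        words = text.lower().split()
        return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}
    return grams(a) & grams(b)

print(shared_ngrams("the quick brown fox jumps over the fence",
                    "a quick brown fox ran across the field"))
# → {('quick', 'brown', 'fox')}
```

The matching itself is trivial; the open questions named above – how many matches, of what kind, where, and in which discipline – are what make the grayscale judgment hard.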
This lecture looks at how to prevent violations of research integrity from a pragmatic viewpoint, and at what steps institutions and publishers can take to discourage problems beyond the usual ethical admonitions. There are no simple answers, but two measures can help: the systematic use of detection tools and the requirement to submit original data and images. These alone do not suffice, but they represent a start.
The scholarly community needs a better awareness of the complexity of research integrity decisions. Only an open and wide-spread international discussion can bring about a consensus on where the boundary lines are and when grayscale problems shade into black. One goal of this work is to move that discussion forward.
Over the last decade, inappropriate image manipulation has become a serious concern in many sectors of society, such as the news, politics, and entertainment. Digital image editing programs are now very powerful and are constantly changing how we produce and understand images. Images play a very important role in academia, and owing to a number of fraud incidents, image manipulation has gained more and more attention.
Given the number of scientific papers that contain problematic images (without necessarily reflecting fraudulent intent) and the fact that many retractions happen because of the inappropriate use of images, there is definitely a need to take effective measures against inappropriate manipulation. But this is far easier said than done, since it is often not trivial to detect such manipulations in the first place.
This is the first of a series of blog posts that deal with the simple question of how image manipulations can be detected. The field of research concerned with detecting image manipulation is called ‘image forensics’. Forensic experts analyze whether there is evidence that makes an image suspicious, and they gather all the clues that can be found in order to make informed judgments about the appropriateness of the image. This can include aspects such as compression, metadata, or lighting, and it can be carried out through mere observation or by applying suitable algorithms. Forensic analysis plays a practical role for insurance companies, in criminal investigations, and in all fields in which images possess evidential value.
There are a number of free online resources on the web that promise to support image analysis. Collections of forensic tools are available at https://29a.ch/photo-forensics ; http://fotoforensics.com/ ; or at http://www.getghiro.org/ , to name only a few.
THE ERROR LEVEL ANALYSIS TOOL
One tool that all of these collections include is “Error Level Analysis” (hereinafter: ELA). Jonas Wagner, the developer of “Forensically”, explains the tool on his website as follows:
“This tool compares the original image to a recompressed version. This can make manipulated regions stand out in various ways. For example they can be darker or brighter than similar regions which have not been manipulated.“ (https://29a.ch/photo-forensics/#help)
The tool is designed to identify areas within an image that are at a different compression level. When a JPEG image has been manipulated (e.g., with elements added or removed), the ELA tool is expected to identify and mark the manipulated regions, since resaving the image puts the original content and the added elements at different compression levels.
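The principle can be sketched in a few lines of Python with the Pillow library: resave the image as JPEG, subtract the resaved version from the original, and amplify the difference. The parameter values here are illustrative assumptions of mine, not the settings “Forensically” actually uses:

```python
import io
from PIL import Image, ImageChops

def error_level_analysis(img, quality=90, scale=20):
    """Recompress img as JPEG and return the amplified per-pixel
    difference. Regions at a different compression level than the rest
    of the image (e.g. freshly pasted elements) tend to stand out."""
    buf = io.BytesIO()
    img.convert("RGB").save(buf, "JPEG", quality=quality)
    resaved = Image.open(buf)
    diff = ImageChops.difference(img.convert("RGB"), resaved)
    return diff.point(lambda v: min(255, v * scale))

# Demo on a synthetic image with a sharp-edged pasted square.
base = Image.new("RGB", (64, 64), (90, 120, 160))
base.paste(Image.new("RGB", (16, 16), (250, 240, 30)), (24, 24))
ela = error_level_analysis(base)
```

Sharp edges and freshly pasted content produce large differences under recompression, which is why they light up in the ELA result, while content that has already been through the same compression changes very little.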
Let us now see how this works out in practice:
Below you see the unaltered (but downsized) version of a snapshot I took last summer near the river Elbe in Saxony, Germany, with a Sony Alpha 6000 camera:
This is the ELA analysis result of the digital image using the free online resource “Forensically”:
The image appears consistently dark, with only a few regions standing out slightly because of the original lighting. The edges of the objects appear a little lighter than the rest of the image, and in some areas they show a slight violet tint, while the sun appears as a uniform black stain. I then opened the original image in Adobe Photoshop and added and changed a number of features. For example, I included a PNG of the moon and duplicated it on the upper left side of the image. Moreover, I inserted a swarm of birds, removed some distracting stains from the glass in the foreground with the Photoshop eraser tool, and, last but not least, copied the flower from the milk can and duplicated it (see images below).
List of manipulations:
1_Added and duplicated moon on different saturation levels:
2_Inserted swarm of birds:
3_Removed stains (you can clearly see the round marks of the eraser tool):
4_Copied and pasted flowers on the milk carton
This is what the resulting image looks like:
Note: I intentionally did not alter any global settings, such as contrast or brightness, across the entire image, since doing so could affect the ELA results.
After carrying out these rather basic manipulations, I saved the image from Photoshop as a JPEG and uploaded it to “Forensically” for analysis with the ELA tool (the tool only accepts JPEG and TIFF images). This is what the result of the ELA analysis looks like:
Here are some details:
Moon area: Clearly visible. It is noteworthy that the ELA tool highlights all five objects uniformly and does not reflect their different saturation levels.
Added birds: The highlighting is clearly recognizable.
Flower area: The edges appear almost like those of unaltered objects in the photo. The highlighting of the copied and pasted objects is not clearly distinguishable from other structures in the image. In other words, had I not known about the manipulation, I would not have recognized it.
Removed stains on the glass: The area appears almost like the unaltered version. Traces of the eraser tool are not recognizable in the ELA analysis.
This short experiment showed some of the strengths and weaknesses of the ELA tool. The tool clearly identified, after a single resave, elements that had been introduced into the picture. (Any further resave would decrease the quality of the JPEG and consequently influence the ELA result.) ELA did not reveal other manipulations, such as the copied flower element or the removed stains on the glass, which definitely limits the usefulness of the tool. However, since ELA at least made it possible to identify some of the manipulations, it can be recommended as one possible starting point when analyzing images. Still, the user should be aware that regions that stand out do not necessarily imply manipulation. As Jonas Wagner points out in the help section of “Forensically”: “The results of this tool can be misleading (…)”.
Another aspect worth mentioning is that it takes a good deal of experience before you get useful results. The settings are not self-explanatory, and interpreting the ELA results definitely requires some visual training (as well as reading through tutorials). One interesting insight I gained from working with this tool is that whenever a JPEG is processed in Photoshop, it acquires a characteristic “rainbow effect”, which can best be observed with the settings slightly altered, as in the example below (JPEG Quality 90, Error Scale 53, Magnifier Enhancement: None, Opacity 0.64).
The characteristic rainbow effect that reveals an image has been processed in Photoshop (or another Adobe product):
In sum, Error Level Analysis opens the user’s mind to a more systematic evaluation of what is visible and what can be hidden in a picture. It helps reveal some hidden features, but it is definitely not a tool that can stand on its own or that produces comprehensive forensic results for the uninitiated user.
Prof. Michael Seadle gave two lectures via Skype on 24 November 2016 about the research integrity work of the HEADT Centre to students in the Scientific Writing in English Course at the National Library of Technology/Czech Technical University in Prague.
Today images play a predominant role in public communication – in advertising, the broadcasting industry, and on the Internet. Images massively influence how events are perceived: they attract attention and shape worldviews. At the same time, images are often intentionally altered to serve a given purpose. In scholarship, as in other fields of society, images are highly valued as a key currency in an economy of attention. Digital image editing programs make it relatively easy to produce images or to enhance their visual qualities, and thereby to create images that are cleaned up or beautified.
In the face of such operations, how do scholars actually conceive of image manipulation? How do biologists, computer scientists, art historians, or designers judge their liberty when altering images? Where do they draw the line between appropriate image editing and fraudulent image manipulation? These are the questions raised in the book “Shaping Images – Scholarly Perspectives on Image Manipulation”, published by De Gruyter on 12 September 2016. The book includes the perspectives of scholars with different disciplinary backgrounds – many of whom are associated with the Cluster of Excellence Image Knowledge Gestaltung at Humboldt-Universität zu Berlin – while other contributors have a background in the museum world.
Many of the scholars represented in this volume agree that image manipulation must remain transparent at all times in order to avoid inappropriate data falsification. The book compares their strategies for dealing with image manipulation and the degrees of liberty scholars claim for themselves, and it asks whether the integrity of images can be preserved at a time when digital image editing programs blur the boundaries between what is possible and what is acceptable.
The Research Integrity team at the HEADT Centre carried out research at the Annual Meeting of the Association for Computational Linguistics (ACL) in Berlin, August 7-9. Scholars and other visitors at the conference site had the opportunity to participate in an interactive online survey focused on the decision-making processes involved in evaluating textual similarities and/or plagiarism cases. We will present the results of the survey on our website soon.