We first perform two checks that could reveal the presence of a watermark:
- Watermark detection on the LSBs:
LSB (Least Significant Bit) watermarking is an invisible watermark, so we check that the images do not carry this type of mark. To do this, we write Python functions that decompose each image into its 8 bit planes, from the least to the most significant (see the first sketch after this list). Visually, nothing abnormal appears on the different planes, so we assume there is no watermark in the LSBs.
- Applying filters to the images:
We then vary different components of the image (contrast, brightness, colours, etc.) to see whether any marks stand out (see the second sketch after this list). Here again, nothing is detected with the naked eye.
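A minimal sketch of the bit-plane decomposition, assuming the images are handled with Pillow and NumPy (the original Python functions are not shown in the source, and the file name is hypothetical). For simplicity it works on the greyscale version of the image; the same loop can be run per RGB channel.

```python
import numpy as np
from PIL import Image

def bit_planes(path):
    """Split an image into its 8 bit planes, from LSB (plane 0) to MSB (plane 7)."""
    img = np.array(Image.open(path).convert("L"))  # single-channel uint8 array
    planes = []
    for bit in range(8):
        # Isolate one bit and stretch it to 0/255 so the plane is visible.
        plane = ((img >> bit) & 1) * 255
        planes.append(Image.fromarray(plane.astype(np.uint8)))
    return planes

# Save each plane for visual inspection; a hidden LSB watermark would
# typically show up as visible structure in plane 0.
for i, plane in enumerate(bit_planes("screw.png")):
    plane.save(f"plane_{i}.png")
```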
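A hedged sketch of the filter checks, using Pillow's ImageEnhance module; the exact filters and factors used in the study are not specified, and the file name is hypothetical.

```python
from PIL import Image, ImageEnhance

img = Image.open("screw.png").convert("RGB")  # hypothetical file name
for name, enhancer in [("contrast", ImageEnhance.Contrast(img)),
                       ("brightness", ImageEnhance.Brightness(img)),
                       ("colour", ImageEnhance.Color(img))]:
    for factor in (0.2, 0.5, 2.0, 5.0):
        # Exaggerating each component can make a faint mark stand out.
        enhancer.enhance(factor).save(f"{name}_{factor}.png")
```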
Since these checks reveal nothing, we try to adapt the dataset in order to bypass this measurement bias.
- Transformation of the dataset into greyscale:
We first assume that the bias comes from the RGB components of the image (a watermark on one or more RGB channels). We therefore convert the images to greyscale in order to remove the RGB components, creating a second dataset containing the same images in shades of grey. With Saimple, we again check the relevance masks to see whether spots still appear or whether the mechanical part is finally analysed correctly.
Screw relevance mask – greyscale dataset
The results obtained are slightly better (30% accuracy on real data), and the first relevance points seem to be correct. On the other hand, the measurement bias is still present: the more points we take, the more spots appear. Switching to greyscale therefore slightly improved the model's performance and reduced the dataset bias, but did not completely eliminate it. This is likely because the conversion is only a linear operation (Grayscale = 0.299·R + 0.587·G + 0.114·B, as sketched below), so a mark encoded in the RGB channels can survive, merely attenuated, in the grey values.
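A minimal sketch of the conversion, assuming Pillow (the file names are hypothetical); mode "L" applies exactly the linear combination quoted above.

```python
from PIL import Image

img = Image.open("screw.png").convert("RGB")  # hypothetical file name
# convert("L") computes L = 0.299*R + 0.587*G + 0.114*B per pixel.
# Because the transform is linear, a perturbation added to one RGB
# channel is only scaled by its coefficient, not removed.
grey = img.convert("L")
grey.save("screw_grey.png")
```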
- Transformation of the dataset into black and white:
We then convert the dataset into black and white (binary) images. This conversion completely removes the bias carried by the RGB channels.
Nut relevance mask – black and white dataset
This conversion again improves the performance of the model on real-world data (45% accuracy), but the accuracy on 3D images is lower (about 80%, compared with 90% previously). The predictions on real images are therefore better, but it is hard to imagine the network achieving much higher performance with this representation.
Indeed, we lose a lot of information with this conversion: each pixel goes from 3 channels with 256 possible values each (0 to 255) to a single channel with only two possible values (0 or 255). This representation is therefore not viable and not representative of the real world (a minimal conversion sketch is given at the end of this section).
However, the relevance results obtained via Saimple show that the measurement bias initially present in the data has been completely removed: the spots no longer appear.
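A hedged sketch of the black-and-white conversion, assuming a plain fixed threshold at 128; the actual binarisation method used in the study is not specified, and the file names are hypothetical.

```python
from PIL import Image

img = Image.open("screw_grey.png").convert("L")  # hypothetical file name
# Map every grey value to 0 or 255: each pixel keeps only 1 bit of
# information instead of 3 channels x 8 bits, which is why so much
# detail (and any channel-level watermark) disappears.
bw = img.point(lambda p: 255 if p >= 128 else 0)
bw.save("screw_bw.png")
```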