Medical images are used as a diagnostic tool, so protecting their confidentiality has long been a topic of study. We therefore propose a ResNet50-DCT-based zero-watermarking algorithm for medical images. First, ResNet50, a pre-trained network, is used to extract the deep features of the medical image. The deep features are then transformed by the DCT, and a perceptual hash function is used to generate the feature vector. The original watermark is scrambled chaotically to obtain the encrypted watermark, which is bound to the original medical image by an XOR operation with the feature vector, without modifying the image; the resulting logical key vector is saved. The same feature extraction method is then applied to the medical image to be tested to generate its feature vector. An XOR operation between this feature vector and the logical key vector yields the encrypted watermark, which is decrypted to obtain the restored watermark. The normalized correlation coefficient (NC) between the original watermark and the restored watermark is calculated to determine the ownership and watermark information of the medical image to be tested. In our experiments, most of the NC values are greater than 0.50. The experimental results demonstrate the algorithm's robustness, invisibility, and security, as well as its ability to accurately extract watermark information. The algorithm also shows good resistance to conventional attacks and geometric attacks.

In the digital age, intelligent medicine and telemedicine have developed rapidly. To make medical diagnosis convenient and fast, a large number of medical images must be transmitted over the Internet [

When protecting medical images, we cannot alter the original image, because any damage to it would affect the doctor's diagnosis. Researchers have therefore turned to digital zero-watermarking algorithms [

In 2003, Wen et al. proposed zero watermarking, a new digital watermarking technique that does not modify the original image data. In that work, high-order cumulants are used to extract the features of the image to construct the zero watermark [

Recently, convolutional neural networks (CNNs) and machine learning algorithms have been applied to computer vision, including the use of a trained CNN to extract image features for a given task. In this paper, a zero-watermarking algorithm for medical images based on the ResNet50 deep residual neural network is proposed. The image features are extracted by the trained CNN by taking the output of the fully connected layer (fc_1000). The deep features are then transformed by the DCT, and the feature vector is generated by the hash function. In the image verification phase, the watermark information is restored by a series of operations using the same method and compared with the original watermark information to verify the availability of the algorithm [

The ResNet50 network consists of 49 convolution layers and one fully connected layer. Its network structure can be divided into seven parts. The first part does not contain residual blocks and mainly applies convolution, batch normalization, the activation function and max pooling to the input. The second, third, fourth and fifth parts all contain residual blocks, which mainly address the vanishing-gradient problem that arises as the number of network layers increases. In the ResNet50 structure, each residual block has three convolution layers; the 16 residual blocks together with the initial convolution give 1 + 16 × 3 = 49 convolution layers. Adding the fully connected layer gives 50 layers in total, which is the origin of the name ResNet50. The input size of the network is 224 × 224 × 3. After the convolutions of the first five parts, the output size is 7 × 7 × 2048. The pooling layer then reduces the amount of computation and enhances the invariance of the image features, and the fully connected layer outputs a feature matrix of size 1 × 1000. The feature of 1 × 1000 is the [

Layer name | Conv1 | Conv2_x | Conv3_x | Conv4_x | Conv5_x | |
---|---|---|---|---|---|---|
Output size | 112 × 112 | 56 × 56 | 28 × 28 | 14 × 14 | 7 × 7 | 1 × 1 |
Parameters | 7 × 7, 64, stride 2 | 3 × 3 max pool, stride 2 | | | | average pool, 1000-fc, softmax |

The ResNet50 used in this algorithm has two basic blocks. In the Identity Block, the input and output dimensions are the same, so several such blocks can be connected in series. In the Conv Block, the input and output dimensions differ, so it cannot be connected in series; its function is to change the dimension of the feature vector. The two residual blocks contained in ResNet50 are shown in

The discrete cosine transform (DCT) is mainly used for data and image compression. Because the forward and inverse transforms are symmetric, the inverse DCT can be used to reconstruct the image information after quantization coding. The DCT is widely used in the compression field: not only in the common JPEG still image coding, but also in MJPEG and MPEG video coding [

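The two-dimensional discrete cosine transform (DCT) of an M × N image f(x, y) is:

$$F(u,v) = c(u)\,c(v) \sum_{x=0}^{M-1} \sum_{y=0}^{N-1} f(x,y) \cos\frac{(2x+1)u\pi}{2M} \cos\frac{(2y+1)v\pi}{2N}$$

$$c(u) = \begin{cases} \sqrt{1/M}, & u = 0 \\ \sqrt{2/M}, & u = 1, \dots, M-1 \end{cases} \qquad c(v) = \begin{cases} \sqrt{1/N}, & v = 0 \\ \sqrt{2/N}, & v = 1, \dots, N-1 \end{cases}$$

The inverse two-dimensional discrete cosine transform (IDCT) is:

$$f(x,y) = \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} c(u)\,c(v)\,F(u,v) \cos\frac{(2x+1)u\pi}{2M} \cos\frac{(2y+1)v\pi}{2N}$$

In the formulas, x and y are the sampling indices of the image in the spatial domain (x = 0, ..., M − 1; y = 0, ..., N − 1), and u and v are the sampling indices in the frequency domain (u = 0, ..., M − 1; v = 0, ..., N − 1).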

The logistic map is a chaotic map with a very simple mathematical form, which was used to describe population changes as early as the 1950s. This map has extremely complex dynamic behavior and is widely used in the field of secure communication [
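The logistic map is commonly written as $x_{k+1} = \mu x_k (1 - x_k)$ with $0 < x_k < 1$, and the sequence behaves chaotically for growth parameters roughly in the range $3.5699 < \mu \le 4$. The following Python sketch shows how such a sequence could be binarized into a chaotic matrix and XORed with a binary watermark. The initial value `x0 = 0.2`, the parameter `mu = 4.0`, the 0.5 binarization threshold and all function names are illustrative assumptions, not the settings used in this paper.

```python
import numpy as np

def logistic_sequence(x0, mu, length):
    """Iterate x_{k+1} = mu * x_k * (1 - x_k) and return `length` values."""
    seq = np.empty(length, dtype=np.float64)
    x = x0
    for k in range(length):
        x = mu * x * (1.0 - x)
        seq[k] = x
    return seq

def chaotic_mask(shape, x0=0.2, mu=4.0, threshold=0.5):
    """Binarize the chaotic sequence into a 0/1 matrix (threshold is an assumption)."""
    seq = logistic_sequence(x0, mu, shape[0] * shape[1])
    return (seq >= threshold).astype(np.uint8).reshape(shape)

def scramble(watermark, x0=0.2, mu=4.0):
    """XOR a binary watermark with the chaotic mask.

    XOR is its own inverse, so calling this twice with the same
    (x0, mu) restores the original watermark.
    """
    watermark = np.asarray(watermark, dtype=np.uint8)
    return np.bitwise_xor(watermark, chaotic_mask(watermark.shape, x0, mu))
```

Because the mask depends only on `x0` and `mu`, these two values act as the secret key in this sketch: the same pair must be supplied to undo the scrambling.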

In this paper, the normalized correlation coefficient (NC) is used as one of the indicators of algorithm performance, namely to evaluate the robustness of the algorithm. It is usually required that the value of the correlation coefficient be greater than 0.5 [

In the formula, m and n are the coordinates of the image pixels; A and B are the pixel values at those coordinates;
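As a rough illustration, NC between two watermark images could be computed as in the Python sketch below. It assumes the mean-subtracted (Pearson-style) form of the correlation coefficient; some NC definitions omit the mean subtraction, so the exact formula and the function name `normalized_correlation` should be read as assumptions.

```python
import numpy as np

def normalized_correlation(a, b):
    """Mean-subtracted correlation coefficient between two equally sized images."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    if a.shape != b.shape:
        raise ValueError("images must have the same shape")
    da = a - a.mean()
    db = b - b.mean()
    denom = np.sqrt((da ** 2).sum() * (db ** 2).sum())
    if denom == 0:
        # A constant image has no variance, so the coefficient is undefined;
        # this sketch reports 0.0 in that case.
        return 0.0
    return float((da * db).sum() / denom)
```

With this definition, identical watermarks give a value of 1, and a watermark compared with its bitwise inverse gives a value of -1.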

The second evaluation index in this paper, the peak signal-to-noise ratio (PSNR), is used to measure image quality. PSNR is required to be greater than or equal to 10 dB in this article [

In the formula, [m, n] refers to the size of the image, and
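For reference, PSNR is conventionally defined as $10 \log_{10}(\mathrm{peak}^2 / \mathrm{MSE})$, where MSE is the mean squared error between the two images. The Python sketch below follows that convention; the peak value of 255 (8-bit grayscale) and the function name `psnr` are assumptions.

```python
import numpy as np

def psnr(original, attacked, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    original = np.asarray(original, dtype=np.float64)
    attacked = np.asarray(attacked, dtype=np.float64)
    if original.shape != attacked.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((original - attacked) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

PSNR is only defined when the two images have the same size, which is why the function rejects inputs of different shapes.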

The algorithm mainly consists of three parts: image feature extraction, zero-watermark construction and embedding, and zero-watermark extraction. First, the image features are extracted by ResNet50 and the DCT, and the feature vector is generated by the perceptual hash. Second, an XOR operation between this feature vector and the encrypted watermark constructs and embeds the zero watermark. Finally, the zero-watermark detection algorithm is used to extract the watermark.

In this paper, a medical image of size 512 × 512 is selected as the input image, but the pre-trained network ResNet50 requires an input size of 224 × 224 × 3, so the original medical image must be preprocessed first. The preprocessed medical image is fed to the pre-trained ResNet50, which extracts the deep features through its convolution and pooling layers and then produces the output "fc_1000" through the fully connected layer. A DCT is applied to "fc_1000" to obtain a DCT coefficient matrix, and 64 valid coefficients are then taken from this matrix and combined with the perceptual hash algorithm to generate the 64-bit feature vector of the image [
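The hashing stage can be sketched in Python as follows. Running ResNet50 itself needs a deep-learning framework and pre-trained weights, so this sketch replaces the network output with a synthetic 1 × 1000 vector named `fc_1000`. Applying a one-dimensional orthonormal DCT to that vector, keeping the first 64 coefficients, and thresholding them against their mean are all assumptions made for illustration; the paper's exact coefficient selection and hash rule may differ.

```python
import numpy as np

def dct_ii(x):
    """Orthonormal 1-D DCT-II of a vector, computed with an explicit cosine basis."""
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    u = np.arange(n).reshape(-1, 1)   # frequency index
    i = np.arange(n).reshape(1, -1)   # sample index
    basis = np.cos(np.pi * (2 * i + 1) * u / (2 * n))
    scale = np.full(n, np.sqrt(2.0 / n))
    scale[0] = np.sqrt(1.0 / n)
    return scale * (basis @ x)

def feature_hash(features, bits=64):
    """Binary feature vector: 1 where a low-frequency coefficient is >= their mean."""
    coeffs = dct_ii(np.ravel(features))[:bits]
    return (coeffs >= coeffs.mean()).astype(np.uint8)

# Hypothetical stand-in for the ResNet50 "fc_1000" output of one image.
rng = np.random.default_rng(0)
fc_1000 = rng.standard_normal(1000)
hash_bits = feature_hash(fc_1000)
```

The explicit cosine basis keeps the sketch free of extra dependencies; a library DCT routine would normally be used instead.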

The construction and embedding process of zero watermark is shown in

Read the single-channel medical image

Read the original watermark image

This step is mainly to extract the medical image feature and combine the perceptual hashing algorithm to generate the feature vector

The scrambled watermark image is XORed with the medical image feature sequence, which realizes the construction and embedding of the zero watermark. At the same time, the logical key

The extraction process of the zero-watermark image is shown in

Read the single-channel medical image to be tested

The main purpose of this step is to extract the feature of the image to be tested and generate the feature vector

The feature vector of the medical image to be tested

After another XOR operation between the chaotic matrix generated by Logistic chaos and the watermark
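Taken together, the construction and extraction steps amount to two XOR passes in each direction. The Python sketch below runs the full round trip on synthetic data. The 64 × 64 watermark size, the row-wise reuse of the 64-bit feature vector, the logistic-map settings and all names (`build_key`, `extract_watermark`, and so on) are hypothetical choices for illustration only.

```python
import numpy as np

def logistic_mask(shape, x0=0.2, mu=4.0):
    """Binary chaotic matrix from the logistic map (parameters are assumptions)."""
    x, bits = x0, []
    for _ in range(shape[0] * shape[1]):
        x = mu * x * (1.0 - x)
        bits.append(1 if x >= 0.5 else 0)
    return np.array(bits, dtype=np.uint8).reshape(shape)

def build_key(feature_bits, watermark, x0=0.2, mu=4.0):
    """Construction: scramble the watermark, then XOR it with the feature bits."""
    encrypted = watermark ^ logistic_mask(watermark.shape, x0, mu)
    return encrypted ^ feature_bits   # feature vector is broadcast over the rows

def extract_watermark(feature_bits_test, key, x0=0.2, mu=4.0):
    """Extraction: XOR the key with the test features, then undo the scrambling."""
    encrypted = key ^ feature_bits_test
    return encrypted ^ logistic_mask(key.shape, x0, mu)

rng = np.random.default_rng(1)
watermark = rng.integers(0, 2, size=(64, 64), dtype=np.uint8)  # synthetic watermark
features = rng.integers(0, 2, size=64, dtype=np.uint8)         # synthetic feature vector

key = build_key(features, watermark)
restored = extract_watermark(features, key)   # identical features: exact recovery

attacked = features.copy()
attacked[:4] ^= 1                              # pretend an attack flipped 4 feature bits
degraded = extract_watermark(attacked, key)
```

In this sketch, each flipped feature bit corrupts one column of the restored watermark, which is why the robustness of the scheme depends on how stable the feature vector is under attack.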

The purpose of this experiment is to verify the performance and effectiveness of the algorithm by using conventional attacks (non-geometric attacks) and geometric attacks. Section 4.1.1 lists the experimental results of the algorithm’s ability to resist conventional attacks, and Section 4.1.2 lists the experimental results of the algorithm’s ability to resist geometric attacks. The medical image used in the experiment is shown in

In addition, the NC values between different images are tested. Nearly all of them are less than 0.5 (the only exception in the table is image1 versus image6, at 0.56), so different images can be distinguished. The experimental figure is shown in

NC | image1 | image2 | image3 | image4 | image5 | image6 |
---|---|---|---|---|---|---|
image1 | 1.00 | 0.31 | 0.37 | 0.47 | 0.47 | 0.56 |
image2 | 0.31 | 1.00 | 0.31 | 0.28 | 0.28 | 0.06 |
image3 | 0.37 | 0.31 | 1.00 | 0.22 | 0.47 | 0.31 |
image4 | 0.47 | 0.28 | 0.22 | 1.00 | 0.43 | 0.22 |
image5 | 0.47 | 0.28 | 0.47 | 0.43 | 1.00 | 0.34 |
image6 | 0.56 | 0.06 | 0.31 | 0.22 | 0.34 | 1.00 |

This part shows the experimental data as the attack intensity gradually increases under conventional attacks. The experimental results show that the algorithm proposed in this paper is robust against non-geometric attacks.

As shown in

Noise attack intensity | 1% | 3% | 5% | 8% | 10% |
---|---|---|---|---|---|
PSNR (dB) | 21.94 | 17.43 | 15.38 | 12.63 | 13.48 |
NC | 0.77 | 0.78 | 0.83 | 0.61 | 0.68 |

JPEG compression is widely used in image processing, and JPEG attacks are one of the common non-geometric attacks in digital watermarking. As shown in

JPEG compression attack strength | 5% | 10% | 15% | 20% | 30% | 40% |
---|---|---|---|---|---|---|
PSNR (dB) | 25.83 | 28.92 | 30.25 | 30.25 | 32.69 | 33.56 |
NC | 0.62 | 0.74 | 0.74 | 0.88 | 1.00 | 0.96 |

As shown in

Filter window size | 3 × 3 | 3 × 3 | 3 × 3 | 5 × 5 | 5 × 5 | 5 × 5 | 7 × 7 | 7 × 7 | 7 × 7 |
---|---|---|---|---|---|---|---|---|---|
Filtering times | 5 | 15 | 25 | 5 | 15 | 25 | 5 | 15 | 25 |
PSNR (dB) | 28.97 | 27.90 | 27.64 | 24.29 | 22.52 | 21.98 | 22.19 | 20.76 | 20.35 |
NC | 0.53 | 0.50 | 0.50 | 0.56 | 0.72 | 0.66 | 0.59 | 0.62 | 0.72 |

This part gives the experimental data for the image under geometric attacks of different intensities. The experimental results show that the proposed algorithm resists geometric attacks well, can effectively protect personal privacy information, and has good robustness.

Clockwise rotation. After the image is rotated by 30°, the NC value of the extracted watermark information is 0.79. When the image is rotated by 80°, the NC is 0.82, so relatively complete watermark information can still be extracted, which shows that the algorithm has good robustness. The experimental results of different degrees of rotation attacks are shown in

Rotation attack (clockwise) | 5° | 15° | 30° | 40° | 60° | 80° |
---|---|---|---|---|---|---|
PSNR (dB) | 19.40 | 16.38 | 15.31 | 15.03 | 14.26 | 13.81 |
NC | 0.88 | 0.81 | 0.79 | 0.84 | 0.80 | 0.82 |

Counterclockwise rotation. After the image is rotated by 40°, the NC value of the extracted watermark information is 0.91. When the image is rotated by 80°, the NC value is 0.80, so relatively complete watermark information can still be extracted, which shows that the algorithm has good robustness. The experimental results of different degrees of rotation attacks are shown in

Rotation attack (counterclockwise) | 5° | 15° | 30° | 40° | 60° | 80° |
---|---|---|---|---|---|---|
PSNR (dB) | 19.40 | 16.38 | 15.31 | 15.03 | 14.26 | 13.81 |
NC | 0.85 | 0.85 | 0.88 | 0.91 | 0.75 | 0.80 |

As shown in

Zoom attack factor | 0.2 | 0.5 | 1.0 | 1.2 | 1.6 | 2.0 |
---|---|---|---|---|---|---|
PSNR (dB) | – | – | – | – | – | – |
NC | 0.62 | 0.77 | 1.00 | 0.89 | 0.89 | 0.89 |

Translation attack (upward) | 5% | 10% | 15% | 20% | 30% | 40% |
---|---|---|---|---|---|---|
PSNR (dB) | 15.46 | 14.05 | 13.17 | 12.48 | 11.59 | 11.31 |
NC | 0.93 | 0.92 | 0.93 | 0.91 | 0.93 | 0.74 |

Translation attack (downward) | 5% | 10% | 15% | 20% | 30% | 40% |
---|---|---|---|---|---|---|
PSNR (dB) | 15.69 | 14.27 | 13.29 | 12.65 | 11.90 | 12.26 |
NC | 0.88 | 0.88 | 0.91 | 0.79 | 0.61 | 0.75 |

When the image is translated 15% to the left, the NC is 0.98, which is very close to 1.00. When the image is translated 40% to the left, the NC value is 0.85. The image translated 30% to the left is shown in

Translation attack (left) | 5% | 10% | 15% | 20% | 30% | 40% |
---|---|---|---|---|---|---|
PSNR (dB) | 15.14 | 14.17 | 13.29 | 12.87 | 12.43 | 12.19 |
NC | 0.92 | 0.87 | 0.98 | 0.92 | 0.88 | 0.85 |

Translation attack (right) | 5% | 10% | 15% | 20% | 30% | 40% |
---|---|---|---|---|---|---|
PSNR (dB) | 15.29 | 14.32 | 13.36 | 12.98 | 12.39 | 12.08 |
NC | 0.97 | 0.93 | 0.93 | 0.90 | 0.92 | 0.84 |
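A translation attack of the kind tabulated above could be simulated as in the following Python sketch. It assumes that the shift is expressed as a fraction of the image width or height and that the vacated region is filled with zeros (black); the paper does not state either detail here, and the function name `translate` is hypothetical.

```python
import numpy as np

def translate(image, fraction, direction):
    """Shift an image by a fraction of its size, filling vacated pixels with 0."""
    image = np.asarray(image)
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    if direction in ("left", "right"):
        s = min(int(round(fraction * w)), w)   # shift in pixels, clamped to the width
        if direction == "left":
            out[:, :w - s] = image[:, s:]
        else:
            out[:, s:] = image[:, :w - s]
    elif direction in ("up", "down"):
        s = min(int(round(fraction * h)), h)   # shift in pixels, clamped to the height
        if direction == "up":
            out[:h - s] = image[s:]
        else:
            out[s:] = image[:h - s]
    else:
        raise ValueError("direction must be 'left', 'right', 'up' or 'down'")
    return out
```

For example, `translate(img, 0.15, "left")` would produce a 15% leftward shift under these assumptions; the attacked image can then be passed through the same feature extraction as the original.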

The effect of the experiment is shown in

Clipping attack (X axis) | 5% | 15% | 20% | 30% | 40% |
---|---|---|---|---|---|
PSNR (dB) | – | – | – | – | – |
NC | 0.94 | 0.91 | 0.84 | 0.81 | 0.77 |

Clipping attack (Y axis) | 5% | 15% | 20% | 30% | 40% |
---|---|---|---|---|---|
PSNR (dB) | – | – | – | – | – |
NC | 0.75 | 0.88 | 0.80 | 0.74 | 0.65 |

To further illustrate the anti-geometric attack ability of the algorithm, some experimental data are compared. The comparison results are shown in

Attack type | Attack intensity | Yang et al. [ | Liu et al. [ | Zeng et al. [ | Yi et al. [ | Proposed algorithm |
---|---|---|---|---|---|---|
Gaussian noise | 5% | 0.92 | 0.93 | 0.79 | 0.90 | 0.83 |
JPEG compression | 5% | - | - | 0.79 | 0.90 | 0.62 |
Rotation (clockwise) | 10° | 0.82 | 0.61 | - | - | |
Rotation (clockwise) | 20° | 0.79 | 0.53 | - | - | |
Rotation (clockwise) | 80° | - | - | - | - | |
Translation (down) | 15% | - | 0.61 | - | 0.90 | |
Translation (left) | 10% | 0.63 | - | - | - | |
Translation (right) | 5% | - | - | 0.90 | 0.90 | |
Cropping (Y-axis) | 20% | 0.64 | - | 0.79 | - | |

For geometric attacks, when the rotation angle reaches 10°, the NC value of the proposed algorithm reaches 0.88, while the NC value of the algorithm [

To sum up, the proposed algorithm has good robustness and invisibility. The algorithm can effectively prevent information leakage and protect personal privacy information.

In recent years, watermarking algorithms for medical images that resist geometric attacks have been a hot topic and a challenge in the study of robust watermarking technology. In this paper, a zero-watermarking algorithm based on ResNet50-DCT is designed to withstand geometric attacks. ResNet50 is used to extract the deep features of medical images, while the discrete cosine transform and a mean-aware hashing algorithm are used to generate the zero watermark. By combining the concepts of the deep residual neural network, the discrete cosine transform, and zero watermarking, the algorithm primarily addresses the problem of watermarks resisting geometric attacks. In addition, the scrambling encryption of the watermark image ensures the algorithm's security. According to the experimental findings above, the proposed algorithm is efficient and trustworthy, and has some practical value for the protection of medical and patient-specific data.

However, the algorithm still needs to be improved. The experimental data show that it is difficult for the algorithm to strike a balance between resistance to geometric attacks and resistance to non-geometric attacks. This dilemma is not unique to this algorithm; similar algorithms proposed by others encounter it as well. We therefore suggest the following direction: ResNet50 and the DCT are the core tools of the algorithm, and combining them differently or changing the type of transform will affect the performance of the algorithm. The function of these core tools is to extract image features, so future research may focus on finding the optimal feature extraction method to balance the performance of the algorithm under geometric and non-geometric attacks.

This work was supported in part by the

The authors declare that they have no conflicts of interest to report regarding the present study.