These authors contributed equally to this work. Jiaqi Shao, Shuwen Chen and Jin Zhou are regarded as co-first authors

As a mainstream research direction in the field of image segmentation, medical image segmentation plays a key role in lesion quantification, three-dimensional reconstruction, region-of-interest extraction and related tasks. Compared with natural images, medical images come in a variety of modalities, and the information emphasized by each modality differs considerably. Because manual segmentation by experienced clinicians is time-consuming and inefficient, a large number of automated medical image segmentation methods have been developed. However, researchers have not yet developed a universal method for all types of medical image segmentation. This paper reviews the literature on segmentation techniques that have produced major breakthroughs in recent years. Among the many methods available, it mainly discusses two categories: improved strategies based on traditional clustering methods, and improved segmentation network architectures based on U-Net. The reviewed results indicate that deep learning-based methods perform significantly better than traditional methods. This paper discusses the advantages and disadvantages of the different algorithms, details how these methods can be used to segment lesions or other organs and tissues, and outlines possible technical trends for future work.

Medical imaging plays a leading role in the diagnosis and treatment of diseases. It enables localization and qualitative and quantitative analysis of diseases through noninvasive imaging. As an important branch of computer vision in medical image processing, medical image segmentation aims to extract regions of special significance or interest from complex medical images. The precision and accuracy of segmentation directly affect the subsequent diagnostic work. Time is also a factor that cannot be ignored. When serious diseases such as the novel coronavirus pneumonia are spreading, rapid automatic segmentation based on deep learning and artificial intelligence can help more patients see a doctor within the optimal treatment window [

Many researchers have devoted themselves to improving the accuracy as well as the speed of medical image segmentation [

Traditional medical image segmentation algorithms include threshold segmentation, edge detection, region growing and [

Although low-level feature maps capture rich spatial information for locating organs, they lack the support of semantic information. Conversely, although high-level feature maps acquire more semantic information, they lack the ability to perceive details. Therefore, a 2.5D image segmentation method based on U-Net is discussed in this paper [

In the first part of this paper, several methods commonly used in traditional medical image segmentation are introduced, including threshold segmentation, edge detection and region growing, and the typical problems of these algorithms are discussed. The second part discusses in detail the optimizations and improvements built on clustering algorithms, mainly C-means and K-means. For U-Net, taking the continuous improvement and innovation of the model as the main thread, this paper clarifies the segmentation performance and remaining problems of each network structure, which has certain significance for the development of U-Net [

Traditional medical image segmentation algorithms include threshold segmentation, edge detection, region growth [

Methods | Features | Disadvantages |
---|---|---|
Threshold segmentation | The grayscale histogram of the medical image is partitioned into classes by one or more grayscale thresholds. | It considers only the gray value of the image and ignores the complexity of the medical image. |
Edge detection segmentation | Detects pixels in the image that differ significantly from the gray value of adjacent pixels and connects them. | It is difficult to apply to datasets with diverse features. |
Region growing segmentation | Pixels with similar characteristics are aggregated into a region according to the growth rules. | Selecting the seed points and growth criteria is difficult, and performance is unstable. |
Image segmentation method based on energy functional | A continuous curve is used to express the target edge, and an energy functional is defined whose independent variables include the edge curve. | The segmentation speed is slow and the accuracy is low. |
Image segmentation method based on graph theory | Each node corresponds to a pixel, and the weight of each edge represents the non-negative similarity between neighboring pixels in terms of gray level, color, or texture. | It is simple to implement and relatively fast, but its precision is only moderate. |

As one of the segmentation algorithms, threshold segmentation technology divides the gray histogram of an image into several classes with one or more thresholds, and considers that pixels in the image with gray values in the same gray class have the same attributes. The selection of threshold value can be set manually or obtained by specific calculation. The segmentation of more than one threshold value is called multi-threshold segmentation. The pixel points of the image are divided into several classes according to the threshold value, and different sets are obtained through division. The pixels in each set have the same attributes. It is especially suitable for images with different levels of gray values between the target object and the background. The key point is to obtain the optimal threshold value, which directly affects the rationality and effect of image segmentation. There are three typical thresholding-based image segmentation methods, including iterative global threshold segmentation, Otsu global threshold segmentation [
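As an illustration of how an optimal global threshold can be computed, the following is a minimal sketch of Otsu's method, which picks the threshold that maximizes the between-class variance of the gray histogram. The function name `otsu_threshold` and the `bins` parameter are illustrative assumptions, not taken from the reviewed papers.

```python
import numpy as np

def otsu_threshold(image: np.ndarray, bins: int = 256) -> float:
    """Return the gray level that maximizes the between-class variance."""
    hist, edges = np.histogram(image.ravel(), bins=bins)
    centers = (edges[:-1] + edges[1:]) / 2.0   # representative gray level per bin
    prob = hist / hist.sum()                   # normalized histogram
    w0 = np.cumsum(prob)                       # weight of the class at or below each bin
    w1 = 1.0 - w0                              # weight of the class above each bin
    mu_cum = np.cumsum(prob * centers)         # cumulative (unnormalized) mean
    mu_total = mu_cum[-1]
    # Between-class variance for every candidate split; empty classes give 0/0.
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_total * w0 - mu_cum) ** 2 / (w0 * w1)
    sigma_b = np.nan_to_num(sigma_b, nan=0.0, posinf=0.0, neginf=0.0)
    return float(centers[np.argmax(sigma_b)])

# Usage: pixels above the threshold are treated as the target object.
demo = np.array([[50, 50, 200], [50, 200, 200]], dtype=float)
demo_mask = demo > otsu_threshold(demo)
```

Multi-threshold segmentation follows the same principle but searches over several split points at once, which is why the cost grows quickly with the number of thresholds.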

When the contrast between the object and the background is not the same everywhere in the image, different thresholds can be used for segmentation, according to the local features of the image. The study in [

The basic idea of image segmentation method based on edge detection is to determine the edge pixels in the image first, and then connect these pixels together to form the desired region boundary [
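The first step described above, finding pixels whose gray value differs sharply from that of their neighbors, can be sketched with a gradient operator. The example below uses the Sobel operator as one possible detector; the function name `sobel_edges` and the fixed `threshold` parameter are assumptions made for illustration, and the subsequent edge-linking step is not shown.

```python
import numpy as np

def sobel_edges(image: np.ndarray, threshold: float) -> np.ndarray:
    """Mark pixels whose Sobel gradient magnitude exceeds `threshold`."""
    img = image.astype(float)
    padded = np.pad(img, 1, mode="edge")       # replicate borders to keep the size
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    rows, cols = img.shape
    # Accumulate the 3x3 weighted neighborhood sums for both directions.
    for i in range(3):
        for j in range(3):
            window = padded[i:i + rows, j:j + cols]
            gx += kx[i, j] * window
            gy += ky[i, j] * window
    magnitude = np.hypot(gx, gy)
    return magnitude > threshold

# Usage: a vertical step edge between columns 3 and 4.
step = np.zeros((8, 8))
step[:, 4:] = 100.0
edge_mask = sobel_edges(step, threshold=50.0)
```

Detectors such as Canny add smoothing, non-maximum suppression and hysteresis on top of this gradient step.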

Canny applies non-maximum suppression to remove unwanted responses. Inspired by Canny edge detection, the authors proposed a Gaussian-filtered fuzzy C-means threshold segmentation method for edge detection [

The region growth method belongs to the region-based segmentation method. The region-based segmentation method [

In short, the key to the seeded region growing method is selecting the initial seed pixel and the growth criterion. Vyavahare’s team [

However, region growing alone cannot avoid the problems of noise and uneven gray levels. A common remedy is to combine the region-based method with other segmentation methods. Angelina et al. [
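To make the roles of the seed pixel and the growth criterion concrete, here is a minimal region growing sketch. It assumes 4-connectivity and a simple criterion (absolute difference from the seed intensity within `tolerance`); both choices, and the function name `region_grow`, are illustrative assumptions rather than the criteria used in the cited studies.

```python
from collections import deque

import numpy as np

def region_grow(image: np.ndarray, seed: tuple, tolerance: float) -> np.ndarray:
    """Grow a region from `seed`, adding 4-connected pixels similar to the seed."""
    rows, cols = image.shape
    mask = np.zeros((rows, cols), dtype=bool)
    seed_value = float(image[seed])
    mask[seed] = True
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            inside = 0 <= nr < rows and 0 <= nc < cols
            if inside and not mask[nr, nc]:
                # Growth criterion: intensity close enough to the seed intensity.
                if abs(float(image[nr, nc]) - seed_value) <= tolerance:
                    mask[nr, nc] = True
                    queue.append((nr, nc))
    return mask

# Usage: grow from the center of a bright 3x3 block.
demo_img = np.zeros((5, 5))
demo_img[1:4, 1:4] = 100.0
demo_region = region_grow(demo_img, seed=(2, 2), tolerance=10.0)
```

Because the criterion compares every candidate with the seed value, a noisy seed or an uneven gray level shifts the whole region, which is the instability noted above.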

This method is based on the active contour model, which uses continuous curves to express the target edge. According to how the curve evolves, active contour models can be divided into boundary-based, region-based and hybrid models. The basic idea of image segmentation based on an energy functional is to define an energy functional whose independent variables include the edge curve. Generally, the minimum of the energy functional is obtained by solving the Euler-Lagrange equation corresponding to the functional, and the curve with the minimum energy gives the position of the target contour [
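As a worked example of such an energy functional, the classical parametric snake is one common form. The notation below is an assumption introduced here for illustration and does not come from the reviewed papers: $v(s) = (x(s), y(s))$ is the contour parameterized by $s \in [0, 1]$, $\alpha$ and $\beta$ weight the elasticity and rigidity of the curve, and $E_{\mathrm{ext}}$ is an image-derived external energy, for example $-|\nabla I|^2$.

$$
E(v) = \int_0^1 \left[ \tfrac{1}{2}\left( \alpha \, |v'(s)|^2 + \beta \, |v''(s)|^2 \right) + E_{\mathrm{ext}}\big(v(s)\big) \right] ds
$$

For constant $\alpha$ and $\beta$, a minimizing curve must satisfy the corresponding Euler-Lagrange equation:

$$
\alpha \, v''(s) - \beta \, v''''(s) - \nabla E_{\mathrm{ext}}\big(v(s)\big) = 0
$$

The first two terms act as internal forces that keep the curve smooth, while the last term pulls it toward image features such as strong edges.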

Active Shape Models (ASM) and Active Appearance Models (AAM) are two parametric active contour models. Both are based on point distribution models [

After parametric active contours, the geometric active contour model was another major development, based on curve evolution theory and the level set method. In contrast to parametric active contour models, geometric active contour models can handle topological changes of curves, are insensitive to initial positions and have numerically stable solutions [

Image segmentation methods based on graph theory are often associated with the minimum cut problem on a graph. Each node in the graph corresponds to a pixel in the image, and each edge connects a pair of adjacent pixels. The weight of the edge represents the non-negative similarity between adjacent pixels in terms of gray level, color, or texture. The segmentation problem is thereby transformed into a labeling problem. The optimality principle is to keep the similarity within each resulting subgraph at a maximum and the similarity between subgraphs at a minimum [

Medical image segmentation can be popularly understood as clustering by dividing pixels into homogeneous regions. Clustering methods belong to unsupervised machine learning methods. Different medical image segmentation has different requirements [

The choice of K value in K-means algorithm will greatly affect the segmentation effect. Improper selection of K value will lead to over-segmentation or under-segmentation. Many scholars and researchers put forward different optimization methods after many experiments. Combined with spatial region information or other segmentation algorithms, it can solve the problems of fuzzy edges, inaccurate gray distribution, and local optimal solutions.
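The effect of K can be explored with a minimal K-means sketch that clusters pixels by gray level alone. The function name `kmeans_intensity` and the deterministic, evenly spaced initialization are assumptions made so the example is reproducible; the standard algorithm initializes the centroids randomly, which is one of the limitations discussed in this section.

```python
import numpy as np

def kmeans_intensity(image: np.ndarray, k: int, n_iter: int = 50):
    """Cluster pixel intensities into k classes; return (label image, centroids)."""
    pixels = image.ravel().astype(float)
    # Assumed initialization: centroids spread evenly over the intensity range.
    centroids = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(n_iter):
        # Assignment step: each pixel goes to its nearest centroid.
        distances = np.abs(pixels[:, None] - centroids[None, :])
        labels = np.argmin(distances, axis=1)
        # Update step: each centroid becomes the mean intensity of its members.
        new_centroids = centroids.copy()
        for j in range(k):
            members = pixels[labels == j]
            if members.size:
                new_centroids[j] = members.mean()
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels.reshape(image.shape), centroids

# Usage: two clearly separated intensity groups, K = 2.
demo_pixels = np.array([[10, 12, 200], [12, 202, 200]], dtype=float)
demo_labels, demo_centroids = kmeans_intensity(demo_pixels, k=2)
```

Running the same image with a larger K splits homogeneous tissue into several labels (over-segmentation), while too small a K merges distinct tissues (under-segmentation).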

Methods | Features | Performance |
---|---|---|
Optimized K-means clustering algorithm based on FGO [ | Sorts the pixels of preprocessed images into cluster spaces. | DiceCo reached 0.907 and JacInd reached 0.912. |
Median filtering, Sobel edge detection, and morphological operations are added to K-means [ | Automatic segmentation of medical images is realized. | Accuracy of 0.94. |
Four-stage fuzzy K-means algorithm based on automatic deep learning [ | Artificial neural networks are used to classify images, and fuzzy K-means is used to segment abnormal images. | Maximum accuracy of 0.94. |
K-means clustering is combined with discrete wavelet transform [ | The K-means algorithm separates the region of interest from the specified background, and a pair of low-pass and high-pass filters in the discrete wavelet transform decomposes the signal. | Accuracy reached 87.8 percent. |
Darwinian particle swarm optimization technique, K-means algorithm, and morphological reconstruction operation [ | K-means was used as an intermediate step before thresholding, plus morphological operations to remove non-brain tissue. | Accuracy, specificity, precision and sensitivity reached 99.88, 0.9992, 0.9123, and 0.9502, respectively. |

As the simplest unsupervised learning method, K-means clusters the data by calculating the average intensity of each category [

Inspired by deep learning and natural simulation algorithms for football game optimization, in the paper [

Mixing different segmentation techniques provides more possibilities. The author combined median filtering, K-means clustering, Sobel edge detection and morphological operations to segment medical images from magnetic resonance imaging (MRI) and computed tomography (CT) under different imaging modes [

Research methods can be applied to a larger database.

Expand future research methods.

Combining K-means clustering with the discrete wavelet transform is also a feasible method for medical image segmentation. After the image is acquired and normalized, the binarized gray or color image is further filtered by color thresholding. The region of interest is separated from the specified background by the K-means algorithm, and the signal is then decomposed by a pair of low-pass and high-pass filters in the discrete wavelet transform [

In order to overcome the limitations of the standard K-means algorithm, such as random initialization of cluster centroid and noise sensitivity, Mehidi et al. proved through experiments that the combination of Darwinian particle swarm optimization technology, K-means algorithm and morphological reconstruction operation could segment images more effectively [

Deep learning continues to advance in current medical research. Using computers to undertake large amounts of calculation has greatly improved work efficiency. A four-stage fuzzy K-means clustering method based on automatic deep learning is proposed for brain tumor segmentation [

The method mentioned in article [

Intuitionistic fuzzy clustering is an extension of fuzzy C-means [
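For reference, standard fuzzy C-means, which the variants discussed in this section extend, alternates between updating soft memberships and updating cluster centers. The sketch below works on gray values only; the function name `fuzzy_c_means`, the evenly spaced initialization and the small constant added to the distances are assumptions for illustration, and none of the spatial or intuitionistic terms from the cited papers are included.

```python
import numpy as np

def fuzzy_c_means(values, c: int, m: float = 2.0, n_iter: int = 100, tol: float = 1e-6):
    """Standard FCM on 1-D intensities; return (memberships, centers)."""
    x = np.asarray(values, dtype=float).ravel()
    centers = np.linspace(x.min(), x.max(), c)     # assumed deterministic start
    for _ in range(n_iter):
        # Distance of every sample to every center (offset avoids division by zero).
        dist = np.abs(x[:, None] - centers[None, :]) + 1e-12
        # Membership update: u_ij proportional to d_ij^(-2/(m-1)), rows sum to 1.
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
        # Center update: mean of the samples weighted by u_ij^m.
        um = u ** m
        new_centers = (um * x[:, None]).sum(axis=0) / um.sum(axis=0)
        shift = np.max(np.abs(new_centers - centers))
        centers = new_centers
        if shift < tol:
            break
    return u, centers

# Usage: hard labels are obtained by taking the largest membership per pixel.
demo_u, demo_centers = fuzzy_c_means([0, 1, 2, 100, 101, 102], c=2)
demo_hard = np.argmax(demo_u, axis=1)
```

The fuzzifier `m` controls how soft the partition is: values close to 1 approach hard K-means, while larger values spread membership across clusters.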

In order to deal with the problems such as noise and bias field effect in the process of MRI image segmentation, the spatial domain information is applied to the bias corrected intuitionistic fuzzy C-mean based on intuitionistic fuzzy set theory [

As the performance of classical algorithms in medical image segmentation cannot meet the ideal requirements [

Medical images obtained by different scanners or scanning protocols are often polluted by noise or disturbed by outliers, which increases the uncertainty of the boundaries between different tissues. The quality of medical images may vary greatly, leading to unsatisfactory segmentation results and bringing great challenges to medical image segmentation [

Transfer learning plays an important role in the fuzzy clustering framework: it handles differences in distribution, feature space or task between the source domain and the target domain, including the problem of inconsistent cluster numbers between the two domains. The maximum mean discrepancy (MMD) inspired an in-depth study of the transfer capability of each source-domain cluster in the shared latent space, which helps to transfer knowledge across different domains. Motivation is shown in
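To give a sense of how a distribution gap between two domains can be quantified, the following sketch computes the biased empirical estimate of the squared MMD with a Gaussian (RBF) kernel. The kernel choice, the `gamma` parameter and the function names are assumptions for illustration; the cited work may use the measure differently inside its clustering objective.

```python
import numpy as np

def rbf_kernel(a: np.ndarray, b: np.ndarray, gamma: float) -> np.ndarray:
    """Gaussian kernel matrix exp(-gamma * ||a_i - b_j||^2)."""
    sq = (np.sum(a ** 2, axis=1)[:, None]
          + np.sum(b ** 2, axis=1)[None, :]
          - 2.0 * a @ b.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))   # clamp tiny negative round-off

def mmd_squared(source: np.ndarray, target: np.ndarray, gamma: float = 1.0) -> float:
    """Biased estimate: mean(K_ss) + mean(K_tt) - 2 * mean(K_st)."""
    k_ss = rbf_kernel(source, source, gamma).mean()
    k_tt = rbf_kernel(target, target, gamma).mean()
    k_st = rbf_kernel(source, target, gamma).mean()
    return float(k_ss + k_tt - 2.0 * k_st)

# Usage: identical samples give zero, a shifted copy gives a positive value.
rng = np.random.default_rng(0)
src = rng.normal(size=(50, 2))
same_gap = mmd_squared(src, src)
shifted_gap = mmd_squared(src, src + 3.0)
```

A small value suggests that the source and target features are similarly distributed, so knowledge learned on the source clusters is more likely to transfer.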

Since traditional fuzzy C-means clustering cannot obtain ideal segmentation results, in the paper [

Dynamic correlation analysis can better control the precise correlation between pixels. Membership is used to measure correlation rather than image features, thus avoiding segmentation difficulties caused by noise pollution. In the paper [

When the classical FCM algorithm cannot directly process all the data sets in the multi-task scenario simultaneously, it is usually difficult to find the common information of related tasks by clustering the data of different tasks independently. Zhao et al. [

The main challenges in the segmentation of different brain tissue regions in brain magnetic resonance (MR) images are limited spatial resolution, signal-to-noise ratio (SNR) and RF coil inhomogeneity. In the paper [

Kollem et al. [

An optimized fuzzy clustering image segmentation algorithm is proposed [

The study in [

Halder’s team proposed a new clustering technique combining rough set and spatially oriented fuzzy C-means clustering (SKFCM) [

Semi-supervised clustering incorporates a small amount of prior knowledge into the clustering process. In the paper [

On the other hand, the authors believed that image intensity and spatial characteristic information should be fully considered in MRI image segmentation [

Image segmentation is not limited to local information. The author proposes an improved no-parameter fuzzy clustering algorithm NLFCM. In the fuzzy factor, the influence of adjacent pixels on the center pixel is called the damping range [

The fuzzy local information C-means clustering algorithm (FLICM) introduced a fuzzy factor as a fuzzy local similarity measure. Although the damping effect of adjacent pixels on the central pixel makes the membership degrees of pixels in a local window converge to similar values, the damping degree is estimated with relatively low accuracy, and FLICM considers only local information, which leads to unsatisfactory performance on high-noise images [

In order to overcome the shortcomings of the existing fast generalized fuzzy C-means clustering algorithm and improve the robustness of image segmentation, in the paper [

In order to alleviate the influence of gray level inhomogeneity and noise pollution, in the paper [

Introducing spatial information to optimize the objective function is beneficial to improve the noise resistance of the model. The introduction of non-local spatial information and local spatial information provides more information for image segmentation. Based on many experiments, many researchers have improved the segmentation effect by introducing prior probability and membership penalty terms. As listed in the paper [

The authors put forward a reliability-based spatial context fuzzy C-means (RSFCM) method for image segmentation [

Introducing spatial information into fuzzy clustering algorithm can further improve the accuracy of segmentation of cerebral hematoma. In papers [

Due to the complexity of left ventricular geometry and heart movement, automatic segmentation of clinical cardiac PET images is challenging. The author proposes a novel approach to segmentation of left ventricular medical images [

Computer-aided diagnostic automation, such as automatically finding the location of a tumor by automatically clicking on any MRI image, has been a huge help for neurosurgeons, and for this reason, the article [

As the plane-based soft clustering method can effectively process the data of non-spherical shape, for example, the material in human brain is non-spherical and the overlapping structure of brain tissue is uncertain [

Due to the high-noise environment, existing intuitionistic fuzzy clustering algorithms cannot segment medical images accurately. The authors explored a total Bregman divergence fuzzy clustering algorithm with multiple local information constraints driven by intuitionistic fuzzy information, aiming to improve the noise robustness and segmentation accuracy of intuitionistic fuzzy clustering algorithms [

Traditional medical image segmentation methods struggle to process images with weak edges accurately and often involve complex iterative processes. BIRCH (Balanced Iterative Reducing and Clustering Using Hierarchies) is a multi-stage clustering method using the clustering feature tree [

Non-local information and low-rank prior knowledge [

In view of how to segment human MRI images more effectively, a grouping method combining two types of K-means and fuzzy C-means (FCM) is proposed. It uses FCM to cluster the K-means clustering results which were generated by four categories into three categories again, achieving the requirements of improved accuracy. In the paper [

The above is the segmentation of unimodal images. In paper [

In addition to the mentioned method of combining K-means and C-means, Abraham’s research team proposed a region segmentation and clustering technology of K-region clustering (KRC) image segmentation method [

This section mainly focuses on the continuous optimization and improvement of medical image segmentation algorithms based on U-Net [

Methods | Features | Disadvantages |
---|---|---|
2.5D U-Net | The computation cost and segmentation accuracy are considered, and the training time is reduced. | The segmentation accuracy is not as good as 3D U-Net. |
3D U-Net | It consists of one primary U-Net and two secondary U-Nets. | There are dependencies between picture slices. |
Context nested U-Net | Take advantage of more semantic and spatial information. | There is room to improve the output mapping characteristics of the network. |
All connection U-Net | High amplitude activation is ensured. | Less feature space to process. |
RU-Net | Perception of the boundary, edge refinement segmentation. | The accuracy of convolution model needs to be strengthened. |
VGG16 U-Net | Feature capture capability is enhanced in the form of encoder-decoder. | There is an overfitting problem. |

The accuracy of U-Net built on two-dimensional CNNs still needs to be improved, while 3D CNNs require high computational cost and converge more slowly than two-dimensional U-Net. However, as listed in the paper [

Deep learning (DL) has developed rapidly and is widely used in medical image segmentation. The author studied the automatic segmentation of the human brain claustrum (CL) [

Sections were processed from 3DMRI.

Select the section of interest in the slice (ROI).

Use machine learning technology to normalize and enhance data operations.

Each MRI contains about 260 slices which are adjusted to have the same 256 × 256 pixel size. In order to ensure that only the slices containing CL are trained in the model, only 36 slices out of the 260 slices need to be left. The filtered slices are superimposed on the slices previously marked as areas of interest. The core of post-processing stage is to improve the accuracy of image segmentation.

Specifically, after false positives were removed and pixels falling outside the region were filtered out again, the annotated slices used for CL segmentation were finally stacked for 3D reconstruction. The Dice score measures the similarity between two samples on a scale of 0 to 1, with higher scores indicating better model accuracy. Intersection over union (IoU) [
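Both overlap measures mentioned above can be computed directly from binary masks. The sketch below is a minimal version; the function names and the convention of returning 1.0 when both masks are empty are assumptions made for illustration.

```python
import numpy as np

def dice_score(pred, truth) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    a = np.asarray(pred, dtype=bool)
    b = np.asarray(truth, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:                 # assumed convention: two empty masks match
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)

def iou_score(pred, truth) -> float:
    """IoU (Jaccard) = |A ∩ B| / |A ∪ B| for two binary masks."""
    a = np.asarray(pred, dtype=bool)
    b = np.asarray(truth, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:                 # assumed convention: two empty masks match
        return 1.0
    return float(np.logical_and(a, b).sum() / union)

# Usage: one overlapping pixel out of two predicted and two true pixels.
demo_dice = dice_score([1, 1, 0, 0], [1, 0, 1, 0])
demo_iou = iou_score([1, 1, 0, 0], [1, 0, 1, 0])
```

The two scores are monotonically related (Dice = 2·IoU / (1 + IoU)), so Dice is never lower than IoU for the same pair of masks.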

Three consecutive images were combined into a 2.5D slice image by exploiting the spatial constraint information between adjacent slices of the MRI brain image sequence. A 2.5D slice not only makes fuller use of the information in the 3D brain volume but also directly reduces the training time. The authors proposed a Triple U-Net based whole-brain segmentation method, which consists of a primary U-Net and two secondary U-Nets, each consisting of an encoding path and a decoding path [
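The construction of 2.5D inputs from a volume can be sketched as follows: each slice is stacked with its two neighbors as channels. The function name `make_25d_slices`, the `(depth, height, width)` layout and the replication of the first and last slices at the volume boundary are assumptions; the source only states that three consecutive images are combined.

```python
import numpy as np

def make_25d_slices(volume: np.ndarray) -> np.ndarray:
    """Turn a (D, H, W) volume into (D, 3, H, W) stacks of adjacent slices."""
    # Assumed boundary handling: repeat the first and last slice once.
    padded = np.concatenate([volume[:1], volume, volume[-1:]], axis=0)
    previous_slices = padded[:-2]
    center_slices = padded[1:-1]
    next_slices = padded[2:]
    return np.stack([previous_slices, center_slices, next_slices], axis=1)

# Usage: a toy volume with 4 slices of size 2 x 2.
demo_volume = np.arange(16).reshape(4, 2, 2)
demo_stacks = make_25d_slices(demo_volume)   # shape (4, 3, 2, 2)
```

Each stack can then be fed to a 2D network as a three-channel image, which keeps some through-plane context without the memory cost of 3D convolutions.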

Many researchers have focused on how to accurately segment lesions or regions of interest from CT images, such as liver tumors, kidney tumors, cardiovascular diseases, choroid plexus [

In paper [

The segmentation of organs at risk is very important in guiding radiotherapy treatment planning. It is decomposed into two stages, or sub-tasks: the bounding box is located first, and the organs within the bounding box are then segmented. Each task uses a 3D U-Net [

Full use of context information and feature representation can achieve more accurate segmentation of lesions. Hu et al. proposed a three-dimensional attention context U-net algorithm for MS lesion segmentation in paper [

In the paper [

Rich multi-scale semantic information can generate more representative feature maps. It is crucial for segmenting the tissue structure of small images, such as blood vessel segmentation and eye diseases [

U-Net is a fully convolutional network with a simple structure that is easy to study, as shown in

Based on U-Net, the author introduces a combined loss function that uses weighted binary cross-entropy and Dice loss, and proposes a compression encoder that uses a 4 × 4 max pooling operation in place of the widely used 2 × 2 pooling layer and three-layer convolution layer [
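A combined loss of this kind can be sketched in a few lines. The version below is written with NumPy for clarity rather than for training; the mixing weight `alpha`, the positive-class weight `pos_weight`, the smoothing constant and the function names are all assumptions, since the source does not give the exact weighting.

```python
import numpy as np

def weighted_bce(prob, target, pos_weight: float = 1.0, eps: float = 1e-7) -> float:
    """Binary cross-entropy with an extra weight on the foreground class."""
    p = np.clip(np.asarray(prob, dtype=float), eps, 1.0 - eps)
    t = np.asarray(target, dtype=float)
    loss = -(pos_weight * t * np.log(p) + (1.0 - t) * np.log(1.0 - p))
    return float(loss.mean())

def dice_loss(prob, target, smooth: float = 1.0) -> float:
    """Soft Dice loss: 1 - (2 * overlap + smooth) / (sum of both masks + smooth)."""
    p = np.asarray(prob, dtype=float)
    t = np.asarray(target, dtype=float)
    overlap = (p * t).sum()
    return float(1.0 - (2.0 * overlap + smooth) / (p.sum() + t.sum() + smooth))

def combined_loss(prob, target, pos_weight: float = 1.0, alpha: float = 0.5) -> float:
    """Assumed convex combination of weighted BCE and Dice loss."""
    return (alpha * weighted_bce(prob, target, pos_weight)
            + (1.0 - alpha) * dice_loss(prob, target))

# Usage: a confident correct prediction scores lower than a confident wrong one.
demo_target = np.array([1.0, 0.0, 1.0, 0.0])
good_loss = combined_loss(np.array([0.9, 0.1, 0.9, 0.1]), demo_target)
bad_loss = combined_loss(np.array([0.1, 0.9, 0.1, 0.9]), demo_target)
```

The cross-entropy term gives per-pixel gradients, while the Dice term directly rewards overlap, which helps when the foreground occupies only a small part of the image.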

Qinghua Zhou’s team considered the deep stacked sparse autoencoder as a classifier for the model [

In the paper [

RU-Net is also a deep learning-based medical image segmentation method. Using deep learning for medical image segmentation is not new, but improving the accuracy of the convolutional model, locating object boundaries more precisely, and better matching the network structure to medical images remain important research directions [

As listed in the paper [

In this formula, l represents the layer feature input, l represents the feature input of the layer, and, li, j (⋅) represents the mapping model of the feature channel at the th position J.

The convolution kernel size was set to 3 × 3, and batch normalization and regularization were performed after each convolution layer. The boundary-aware RU-Net learned to detect the boundaries of diseased tissue. RU-Net can be regarded as a combination of two U-Nets, which respectively segment the image and predict the final segmentation [

VGG 16 network is a deep convolutional neural network including convolutional layer, pooling layer, and fully connected layer. The combination of VGG 16 network structure and U-Net structure greatly improves the expressive power of the network. Ghosh et al. [

Finally, Dice coefficient, Jaccard coefficient and Precision, which are commonly used in medical image segmentation task, were used to analyze the results. The values of the three indexes were 0.895 ± 0.036, 0.812 ± 0.057 and 0.899 ± 0.062, respectively. In contrast, U-Net results were 0.818 ± 0.081, 0.699 ± 0.112 and 0.728 ± 0.118, respectively. The segmentation effect of VGG16 U-Net is much better than the performance of pure U-Net.

This paper classifies and describes two types of medical image segmentation techniques: improved clustering algorithms and deep learning algorithms based on the U-Net network. For cluster analysis, the paper focused on effective and concise fuzzy clustering, which applies fuzzy theory to the membership degree of each pixel of the image with respect to each cluster center, so that differences in the data are expressed through membership degrees. To bring automatic results closer to manual annotation of the target image, U-Net has served as a catalyst for development. One of the biggest advantages of U-Net is that it achieves accurate segmentation results with fewer data. However, problems such as how to reduce U-Net’s dependence on high-quality labeled datasets without reducing its accuracy, and how to compress network models without destroying their stability, still need to be solved. Medical image segmentation has broad application prospects in biomedical engineering. Through this research and discussion, we hope to provide food for thought for the development of medical image segmentation technology and to contribute to further improvement of existing segmentation methods, which will help doctors obtain more accurate pathological information and assist in disease diagnosis.

This work was supported partly by the Open Project of

The authors declare that they have no conflicts of interest to report regarding the present study.