Image segmentation is one of the most important steps in image processing. Objects in remote sensing images often have to be detected before subsequent processing steps can be performed. Remote sensing images are usually large and come in various spatial resolutions, which makes detecting objects in them very complicated. In this paper, we develop a model to detect objects in remote sensing images based on the combination of picture fuzzy clustering and the MapReduce method (denoted as MPFC). Firstly, picture fuzzy clustering is applied to segment the input images. Then, MapReduce is used to reduce the runtime while preserving segmentation quality. To convert data for MapReduce processing, two new procedures are introduced: Map_PFC and Reduce_PFC. The formal representation and details of these two procedures are presented in this paper. Experiments on satellite image and remote sensing image datasets are carried out to evaluate the proposed model. Validity indices and runtime are used to compare the proposed model with the picture fuzzy clustering model. The values of the validity indices show that picture fuzzy clustering integrated with MapReduce achieves better segmentation quality than picture fuzzy clustering alone. Moreover, on the two selected image datasets, the runtime of the MPFC model is much less than that of picture fuzzy clustering.
Object detection is an important step in image processing. Object detection systems are often integrated into other image processing models as the first step of the pipeline. Object detection in images has attracted the attention of many researchers [
Object detection in remote sensing images has been performed mainly by artificial intelligence methods such as random forest [
In 2014, [
With the development of cloud computing, data mining, Hadoop and MapReduce [
This paper introduces a model for remote sensing image segmentation that applies the picture fuzzy clustering (PFC) algorithm to increase the accuracy of the segmentation results. Moreover, the MapReduce procedure is applied to picture fuzzy clustering in order to reduce the runtime of PFC in remote sensing image segmentation without decreasing segmentation quality. The resulting MapReduce Picture Fuzzy Clustering (MPFC) model is proposed. Lastly, evaluations of the PFC and MPFC models on two different sets of remote sensing images are presented.
Picture fuzzy clustering algorithm [
By minimizing this function, the values of
The constraints of this problem are
Using the Lagrange multiplier method, based on the objective function
The main steps of PFC are presented in PFC algorithm as in
Input: Data set X including N samples with d attributes; the number of clusters (C); threshold ε; fuzzifier m; exponent α; and the maximum number of iterations Maxstep > 0  
Output: The matrices, 

1  t = 0 
2  Init: 
3  Repeat 
3.1  t = t + 1 
3.2  Calculate 
3.3  Calculate 
3.4  Calculate 
3.5  Calculate 
3.6  Until 
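The loop structure of the algorithm above (initialise, repeat the updates, stop at a threshold ε or after Maxstep iterations) can be sketched in Python. This is only an illustration of the control flow: the sketch substitutes the standard fuzzy c-means updates for the PFC update formulas, works on scalar samples, initialises the centers evenly over the data range, and stops on the shift of the centers. All of these choices, along with the name `fcm_1d`, are assumptions and not the authors' method.

```python
def fcm_1d(xs, c, m=2.0, eps=1e-4, maxstep=100):
    """Standard fuzzy c-means on scalar data (a stand-in, not PFC itself)."""
    if c < 2:
        raise ValueError("need at least two clusters")
    lo, hi = min(xs), max(xs)
    # Init (assumption): centers spread evenly over the data range.
    centers = [lo + (hi - lo) * j / (c - 1) for j in range(c)]
    t = 0
    while True:
        t += 1
        # Membership update: u[k][j] is the degree of sample k in cluster j.
        u = []
        for x in xs:
            d = [abs(x - v) + 1e-12 for v in centers]  # avoid division by zero
            u.append([
                1.0 / sum((d[j] / d[i]) ** (2.0 / (m - 1.0)) for i in range(c))
                for j in range(c)
            ])
        # Center update: mean of the samples weighted by u^m.
        new_centers = []
        for j in range(c):
            w = [row[j] ** m for row in u]
            new_centers.append(sum(wk * x for wk, x in zip(w, xs)) / sum(w))
        shift = max(abs(a - b) for a, b in zip(new_centers, centers))
        centers = new_centers
        # Stop when the centers barely move or the iteration budget is spent.
        if shift < eps or t >= maxstep:
            return centers, u, t


centers, u, steps = fcm_1d([0.8, 1.0, 1.2, 8.7, 9.0, 9.3], c=2)
```

On this toy input the two centers settle close to the two visible groups of values, and each row of `u` sums to one.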
Introduced by Google, MapReduce is a programming model for parallel and distributed processing. The model consists of two procedures, the “Map” procedure and the “Reduce” procedure. These two procedures are defined by the user as in
The detail of formal representation of MapReduce (
MapReduce formal representation:
As in [
where:
P1, C1 are the types of the key and the input value of the map function; p1, c1 are the corresponding objects of types P1, C1.
P2, C2 are the types of the key and the output value of the map function. They are also the types of the key and the input value of the reduce function; p2, c2 are the corresponding objects of types P2, C2.
P3, C3 are the types of the key and the output value of the reduce function; p3, c3 are the corresponding objects of types P3, C3.
We have:
If p1, c1, p2, c2 are defined, we get the input and output of the map function. Usually, for text data, p1 is the offset key of the data flow and c1 is the content of the data flow.
If p2, c2, p3, c3 are defined, we get the input and output of the reduce function.
The formal representation can be rewritten with only p1, c1, p2, c2, p3, c3 as below:
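In the standard MapReduce formulation, the map function takes a pair (p1, c1) and emits a list of pairs (p2, c2), and the reduce function takes a key p2 with the list of all c2 values sharing that key and emits a list of pairs (p3, c3). A minimal Python sketch of this flow on text data follows. The helper names `run_mapreduce`, `map_words` and `reduce_counts` are hypothetical, and the single-process grouping only stands in for the distributed shuffle of a real framework.

```python
from collections import defaultdict


def run_mapreduce(records, map_fn, reduce_fn):
    """Apply map_fn to each (p1, c1), group by p2, then apply reduce_fn."""
    groups = defaultdict(list)
    for p1, c1 in records:
        for p2, c2 in map_fn(p1, c1):
            groups[p2].append(c2)  # shuffle: collect values per key
    output = []
    for p2 in sorted(groups):
        output.extend(reduce_fn(p2, groups[p2]))
    return output


def map_words(p1, c1):
    # p1 is the offset of the line, c1 its content; emit (word, 1) pairs.
    return [(word, 1) for word in c1.split()]


def reduce_counts(p2, values):
    # p2 is a word, values the list of its counts; emit one (word, total) pair.
    return [(p2, sum(values))]


result = run_mapreduce([(0, "a b a"), (6, "b c")], map_words, reduce_counts)
```

Here p1 is the line offset and c1 the line content, as described for text data above; the offset is not needed by this particular map function.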
In this part, a combination of MapReduce and picture fuzzy clustering is introduced and applied to the image clustering problem.
The integration of picture fuzzy clustering and the MapReduce method is performed as follows. Firstly, the input image is converted to a list for MapReduce processing. Secondly, the cluster centers are generated randomly. Thirdly, the data is separated into many partitions. Each partition is processed in parallel by a MapTask. This step aims to calculate the membership degree of each sample in the data partition with respect to the cluster centers using
Our novel model is named MapReduce-based picture fuzzy clustering (MPFC). The framework of MPFC is given in
In this part, pixel data is converted to rows stored as a list. Each row holds the position of a pixel followed by the values representing that pixel. The position information is used to restore clustered images and to perform other tasks such as analysing and evaluating the results. Thus, the output of the clustering process is a set of data elements carrying intensity, median and position information.
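The conversion between an image and a list of position-tagged rows can be sketched as follows. This is a minimal illustration, assuming a grey-level image stored as a list of equal-length lists and a row layout of (k, j, x_kj) with a single intensity value per pixel; the function names are hypothetical and the median value mentioned above is left out for brevity.

```python
def image_to_rows(image):
    """Flatten a 2-D image into rows (k, j, x_kj) that keep the pixel position."""
    return [
        (k, j, image[k][j])
        for k in range(len(image))
        for j in range(len(image[k]))
    ]


def rows_to_image(rows, height, width):
    """Rebuild the 2-D image from rows using the stored positions."""
    image = [[0] * width for _ in range(height)]
    for k, j, x in rows:
        image[k][j] = x
    return image


image = [[12, 40, 41], [200, 198, 15]]
rows = image_to_rows(image)
restored = rows_to_image(rows, height=2, width=3)
```

Because every row carries its own position, the rows can be split into partitions in any order and the image can still be rebuilt afterwards.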
Then, we define d1, v1 and d3, v3 as below:
d1 is the offset; v1 is the content of the data stream (k, j, x_{kj}).
d3 is the information of the new cluster center c_new; v3 is the list of tuples (k, j, x_{kj}) of all elements belonging to the cluster in d3.
The Map function assigns data to the nearest cluster. Thus, d2 and v2 can be determined as:
d2 is the index of the cluster nearest to x_{kj}; v2 is the tuple (k, j, x_{kj}).
Then, the formal representation of Map and Reduce procedures is:
Input: Shared centers lstCenter, d1, v1  
Output: lstD2V2 (the list of pairs (d2,v2))  
1  Extract the information of intensity and median x_{kj} 
2  For cen_ind = 0 to lstCenter.length - 1 
3  Calculate 
4  Init (d2,v2): d2 = cen_ind; v2 = v1 
5  Add (d2,v2) to lstD2V2 
6  return lstD2V2 
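The steps of the Map procedure above can be sketched in Python. The quantity computed in step 3 is not reproduced in this sketch; as a stand-in it uses the standard fuzzy c-means membership degree on scalar intensities, and it appends that degree to the emitted value so that a reducer could use it. The function name `map_pfc_sketch`, the fuzzifier default and the extended value tuple are assumptions, not the authors' definition.

```python
def map_pfc_sketch(lst_center, d1, v1, m=2.0):
    """Emit one (d2, v2) pair per cluster center for a single pixel row."""
    k, j, x = v1  # d1 (the offset) is not needed for the computation itself
    # Distances to every center; the small constant avoids division by zero.
    dists = [abs(x - c) + 1e-12 for c in lst_center]
    lst_d2v2 = []
    for cen_ind in range(len(lst_center)):
        # Stand-in for step 3: fuzzy c-means membership of x in this cluster.
        u = 1.0 / sum((dists[cen_ind] / d) ** (2.0 / (m - 1.0)) for d in dists)
        d2 = cen_ind
        v2 = (k, j, x, u)  # assumption: carry the degree along with v1
        lst_d2v2.append((d2, v2))
    return lst_d2v2


pairs = map_pfc_sketch([10.0, 200.0], d1=0, v1=(3, 7, 20.0))
```

For the pixel with intensity 20.0, the degree attached to the center at 10.0 is much larger than the one attached to the center at 200.0, and the two degrees sum to one.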
Input: cen_ind; list(info(k,j, 

Output: Pair (d3,v3)  
1  Init 
2  totalM = 0 
3  For i, j, k in list((i,j, 
3.1  Extract the information of intensity, the positive, the neutral and the negative degrees 
3.2  Calculate 
3.3  Compute totalM += 
4  Divide 
5  d3 = 
6  v3 = list((info(k,j, 
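The accumulate-then-divide pattern of the Reduce procedure above can be sketched in Python. The weight computed in step 3.2 is not reproduced in this sketch; as a stand-in it uses the fuzzy c-means weight u^m on a single membership degree per element. The name `reduce_pfc_sketch`, the value layout (k, j, x, u) and the fuzzifier default are assumptions, and the sketch presumes that at least one weight is non-zero.

```python
def reduce_pfc_sketch(cen_ind, values, m=2.0):
    """Compute a new center for cluster cen_ind and return the pair (d3, v3)."""
    weighted_sum = 0.0
    total_m = 0.0
    members = []
    for k, j, x, u in values:
        w = u ** m           # stand-in for the weight of step 3.2
        weighted_sum += w * x
        total_m += w         # running sum of the weights
        members.append((k, j, x))
    c_new = weighted_sum / total_m  # divide the accumulated sum by totalM
    d3 = c_new
    v3 = members
    return d3, v3


d3, v3 = reduce_pfc_sketch(
    0, [(0, 0, 10.0, 1.0), (0, 1, 20.0, 1.0), (1, 0, 200.0, 0.0)]
)
```

In this toy call the third element has degree zero, so the new center is the mean of the first two intensities.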
Based on the results of the Reduce_PFC procedure, clustered images can be recovered using the position information and the intensities of the cluster centers. In addition, evaluation, analysis, recognition or classification can be performed on the clustering results.
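Recovering a clustered image from such results can be sketched as follows. The sketch assumes that the clustering output is a list of pairs (center intensity, list of (k, j, x_kj) members) and that every pixel appears in exactly one member list; the function name `recover_image` and this hard-assignment layout are assumptions made for illustration.

```python
def recover_image(clusters, height, width):
    """Paint every pixel with the intensity of the center of its cluster."""
    image = [[0.0] * width for _ in range(height)]
    for c_new, members in clusters:
        for k, j, _x in members:
            image[k][j] = c_new  # position comes from the stored (k, j)
    return image


clusters = [
    (15.0, [(0, 0, 10.0), (0, 1, 20.0)]),
    (200.0, [(1, 0, 190.0), (1, 1, 210.0)]),
]
segmented = recover_image(clusters, height=2, width=2)
```

The original intensities are ignored during recovery; only the positions and the center values determine the segmented image.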
In this research, two datasets are used in the experiments:
Satellite images extracted from the weather image database of NASA [
Remote sensing images of Hoa Binh province, Vietnam, presented in
Image  Type  Size (pixels)  

Image 1  Landsat  1596 × 1333  
Image 2  QuickBird  2056 × 2065  
Image 3  SPOT  2201 × 2101 
In this research, Apache Spark is used to implement the MPFC algorithm following the MapReduce model.
The runtime of the two proposed algorithms is evaluated and compared with that of PFC. Clustering quality is also assessed using validity indices, including the Silhouette Width Criterion (SWC) [
The experimental results of the PFC, MFC and MPFC models on the weather image dataset are shown in
Validity indices  

Methods  PBM+  SWC+ 
PFC  65,273,327  
MFC  56,327,370  0.573 
MPFC 
As shown in
In detail, the runtime of the two models on the weather image dataset is given in
No. of clusters  

Methods  5  7  9  11 
PFC  1,824,251  2,234,326  3,032,378  3,982,237 
MPFC  226,326  378,253  463,329  532,377 
As expected, the runtime grows as the number of clusters increases. From the results in
The results on remote sensing images of Hoa Binh province are also presented. Firstly, the validity indices obtained by applying PFC, MFC and MPFC are calculated and given in
As on the weather image dataset, in the case of 5 clusters, the validity indices of PFC and MPFC are similar. The values in
Validity indices  

Methods  PBM+  SWC+  
Image 1  PFC  44,657,721  
MFC  42,136,233  0.5621  
MPFC  0.5735  
Image 2  PFC  
MFC  19,826,388  0.5827  
MPFC  23,273,232  0.6023  
Image 3  PFC  
MFC  8,237,632  0.6124  
MPFC  9,452,320 
No. of clusters Methods  

Methods  5  7  9  11  
Image 1  PFC  1,783,364  2,327,362  4,362,327  8,827,237 
MPFC  102,363  373,801  546,327  632,377  
Image 2  PFC  1,728,337  3,363,436  5,938,434  17,347,437 
MPFC  234,327  433,433  843,437  1,272,327  
Image 3  PFC  4,236,327  5,033,437  7,227,372  18,273,237 
MPFC  543,273  921,237  1,392,377  2,387,237 
By applying the MapReduce procedure, the runtime of MPFC is much less than that of PFC. On average, MPFC takes only about 12.74% of the runtime of PFC.
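The average ratio can be checked directly from the runtime table for the three Hoa Binh images. The short Python sketch below takes the ratio MPFC / PFC for each image and each number of clusters (5, 7, 9, 11) and averages the twelve ratios; the variable names are illustrative, and averaging the per-cell ratios is an assumption about how the figure was obtained.

```python
# Runtimes copied from the table, ordered by number of clusters 5, 7, 9, 11.
pfc = {
    "Image 1": [1783364, 2327362, 4362327, 8827237],
    "Image 2": [1728337, 3363436, 5938434, 17347437],
    "Image 3": [4236327, 5033437, 7227372, 18273237],
}
mpfc = {
    "Image 1": [102363, 373801, 546327, 632377],
    "Image 2": [234327, 433433, 843437, 1272327],
    "Image 3": [543273, 921237, 1392377, 2387237],
}

# One ratio per (image, number of clusters) cell of the table.
ratios = [m / p for name in pfc for m, p in zip(mpfc[name], pfc[name])]
average_ratio = sum(ratios) / len(ratios)
print(f"MPFC takes {100 * average_ratio:.2f}% of the PFC runtime on average")
```

The individual ratios range from roughly 6% to 19%, so the speed-up varies noticeably with the image and the number of clusters.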
Thus, the proposed model gives better segmentation quality than MFC. In addition, the runtime of MPFC is less than that of PFC while the quality is the same.
In this paper, an improvement of picture fuzzy clustering applied to object detection in remote sensing images is proposed. In this model, picture fuzzy clustering is integrated with the MapReduce method. The paper makes three main contributions. Firstly, PFC is applied to the remote sensing image segmentation problem to increase segmentation quality. Secondly, an algorithm named MPFC is proposed. This algorithm uses MapReduce to shorten the computation time of PFC while preserving clustering quality. In addition, the formal representation and details of the Map_PFC and Reduce_PFC procedures are given. Thirdly, experiments on satellite image and remote sensing image datasets are performed. From the obtained results, PFC, MFC and MPFC are compared and analyzed using the SWC and PBM indices. The experimental results show that the clustering quality of MPFC is higher than that of PFC and MFC. Moreover, the runtime of MPFC is also much less than that of PFC.
In this approach, only image data is used to implement the models; other kinds of data are not considered. In future research, the proposed model will be applied to various kinds of data to evaluate its performance, so that the most suitable data for this model can be identified. Moreover, other problems on specific datasets will be addressed using this model as well.