Pattern classification is a crucial area of research and applications. Classifying patterns using fuzzy set theory has attracted great interest because of its ability to understand the parameters. One problem observed in the fuzzification of an unknown pattern is that importance is given only to the known patterns and not to their features, even though features play an essential role when their respective patterns overlap. In this paper, an optimal fuzzy nearest neighbor model is introduced in which a fuzzification process is carried out for the unknown pattern using

Pattern classification has been a challenging task over the past decades. It is used in many practical applications (such as pattern recognition, artificial intelligence, statistics, financial gaming, organization data, vision analysis, and medicine) [

In many pattern recognition problems, however, the categorization of the input pattern depends on the dataset, where the actual sample size for every class is limited and may not be indicative of the actual probability distributions, even when these are known. Under these conditions, many techniques rely on distance or similarity in the feature space, for example, discriminant analysis and clustering [

Meher [

According to the k-nearest neighbor algorithm, the class labels of the k closest patterns decide the label of the input pattern. The k closest patterns are selected based on a distance measure such as Euclidean or Manhattan [
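The k-nearest neighbor rule described above can be illustrated with a minimal sketch. The function name `knn_classify`, the toy patterns, and the choice of k = 3 are assumptions made for illustration only; they are not taken from the paper or from the vowel data set.

```python
from collections import Counter
import math

def knn_classify(train_patterns, train_labels, query, k=3, metric="euclidean"):
    """Label `query` by majority vote among its k closest training patterns."""
    def distance(a, b):
        if metric == "manhattan":
            return sum(abs(x - y) for x, y in zip(a, b))
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Sort training indices by distance to the query and keep the k closest.
    order = sorted(range(len(train_patterns)),
                   key=lambda i: distance(train_patterns[i], query))
    nearest_labels = [train_labels[i] for i in order[:k]]
    # Majority vote; on equal counts, most_common keeps first-seen order,
    # so a tie is resolved in favor of the closer neighbor.
    return Counter(nearest_labels).most_common(1)[0][0]

# Toy two-feature data (illustrative values only).
patterns = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (3.8, 4.0), (4.1, 3.9)]
labels = ["a", "a", "e", "e", "e"]
predicted = knn_classify(patterns, labels, query=(1.1, 1.0), k=3)
```

In this sketch every neighbor contributes one equal vote, which is the behavior that fuzzy variants of the algorithm refine.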

In this paper, a model is proposed for pattern classification. First, the model finds the nearest neighbors to the input pattern using the

The main contributions of this paper are as follows:

A particular problem in the fuzzy k-nearest neighbor algorithm is addressed, i.e., when the data has highly overlapping classes.

The identified problem is resolved by using a membership matrix and considering the importance of each pattern feature rather than only the significance of the pattern as a whole.

A pattern classification model is developed using the k-nearest neighbor algorithm. The model’s accuracy is verified against different classification models on the vowel data set.
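For orientation, the baseline fuzzy k-nearest neighbor rule that these contributions build on can be sketched as follows. This is not the proposed model: it is the inverse-distance membership rule in the form commonly attributed to Keller et al., under the assumptions that training labels are crisp and that the fuzzifier is m = 2. The function name and the toy data are hypothetical.

```python
import math

def fuzzy_knn_memberships(train_patterns, train_labels, query, k=3, m=2.0):
    """Return {class: membership} for `query` from its k nearest neighbors."""
    classes = sorted(set(train_labels))
    # Pair each training pattern's distance to the query with its label.
    dists = [(math.dist(p, query), lbl)
             for p, lbl in zip(train_patterns, train_labels)]
    dists.sort(key=lambda t: t[0])
    neighbors = dists[:k]

    eps = 1e-12  # guards against division by zero when a distance is 0
    # Inverse-distance weights with exponent 2 / (m - 1).
    weights = [1.0 / (d ** (2.0 / (m - 1.0)) + eps) for d, _ in neighbors]
    total = sum(weights)

    memberships = {c: 0.0 for c in classes}
    for w, (_, lbl) in zip(weights, neighbors):
        # Assumption: crisp training labels, so each neighbor belongs
        # fully to its own class and not at all to the others.
        memberships[lbl] += w / total
    return memberships

# Toy two-feature data (illustrative values only).
patterns = [(1.0, 1.0), (1.2, 0.8), (4.0, 4.2), (3.8, 4.0), (4.1, 3.9)]
labels = ["a", "a", "e", "e", "e"]
u = fuzzy_knn_memberships(patterns, labels, query=(1.1, 1.0), k=3)
```

The memberships sum to one across classes, and closer neighbors count for more. Note that this baseline weights whole patterns only; it assigns no separate importance to individual features.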

The paper is organized as follows: Section 2 discusses the steps of the proposed model; Section 3 describes the data set, presents the results and analysis, and compares the proposed model with the fuzzy k-nearest neighbor algorithm and with five other classification models; Section 4 draws the conclusion.

In this section, a model is proposed to classify unknown patterns, and the various steps are shown in

In this section, the nearest neighbors of the input pattern are chosen using the

The pattern

In this section, fuzzification of features of a pattern is processed by using

where

where

The fact that the sum of a feature’s membership values in the

For example, when

The output of the fuzzification process is a membership matrix

Finally, the rescaled vector is obtained as

If

In this section, we discuss the data set and the performance of the proposed model. Performance is reported as percentage accuracy (PA), defined as the proportion of the testing data that the proposed model classifies correctly. To measure accuracy, the known class labels of the testing data are compared with the labels assigned by the proposed model. The training and testing data are selected at random by partitioning the data set into two parts, so the testing data is independent of the training data.
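The evaluation protocol described above can be sketched as follows. The helper names `random_split` and `percentage_accuracy`, the fixed seed, and the toy labels are assumptions for illustration; the paper does not specify how the random partition was implemented.

```python
import random

def random_split(patterns, labels, train_fraction, seed=0):
    """Randomly partition the data into disjoint training and testing parts."""
    indices = list(range(len(patterns)))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(round(train_fraction * len(indices)))
    train_idx, test_idx = indices[:n_train], indices[n_train:]
    train = ([patterns[i] for i in train_idx], [labels[i] for i in train_idx])
    test = ([patterns[i] for i in test_idx], [labels[i] for i in test_idx])
    return train, test

def percentage_accuracy(true_labels, predicted_labels):
    """PA: percentage of testing patterns whose predicted label is correct."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return 100.0 * correct / len(true_labels)

# Toy data: ten one-feature patterns with made-up labels.
toy_patterns = [(float(i),) for i in range(10)]
toy_labels = ["a"] * 5 + ["e"] * 5
(train_x, train_y), (test_x, test_y) = random_split(toy_patterns, toy_labels, 0.8)

# Three of four hypothetical predictions are correct.
pa = percentage_accuracy(["a", "e", "i", "o"], ["a", "e", "o", "o"])
```

Repeating the split with different seeds gives several independent estimates of PA for the same training percentage.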

This paper verifies the proposed model on the benchmarked Telugu vowel data set [

The performance evaluated for different percentages of training data is illustrated in

Percentage of training data (%) | Classification accuracy (%) |
---|---|
10 | 76.02 |
30 | 81.31 |
50 | 84.86 |
70 | 87.79 |
80 | 89.35 |
90 | 90.91 |

The proposed model’s performance is compared with various classification models. The models listed below have benchmarked accuracy for the vowel data set at 50% and 80% training data [

Models used on 50% training data set

Model 1: Low, medium, and high (LMH) fuzzification (Meher [

Model 2: LMH with fuzzy product aggregation reasoning rule (FPARR) classification (Meher [

Model 3: Neuro-fuzzy (NF) classifier (Ghosh et al. [

Model 4: LMH and Pawlak’s rough set theory with FPARR (Meher [

Model 5: LMH and neighborhood rough set with FPARR (Meher [

Model 6: A pattern classification model for vowel data using fuzzy nearest neighbor (This model).

Models used on 80% training data set

Model 1: Neuro-fuzzy (NF) classifier (Ghosh et al. [

Model 2: Class dependent fuzzification with Pawlak’s rough set feature selection (Pal et al. [

Model 3: Class dependent fuzzification with neighborhood rough set (NRS) feature selection (Pal et al. [

Model 4: NRS fuzzification and neural network classifier with extreme learning machine algorithm (Meher [

Model 5: SSV decision tree (Duch et al. [

Model 6: A pattern classification model for vowel data using fuzzy nearest neighbor (This model).

For the 50% and 80% training data set, the performance of all the classification models is shown in

Model | Percentage accuracy (PA) at 50% training data set | Model | Percentage accuracy (PA) at 80% training data set |
---|---|---|---|
1 | 80.01 | 1 | 79.87 |
2 | 81.13 | 2 | 82.56 |
3 | 81.79 | 3 | 84.05 |
4 | 82.76 | 4 | 86.0 |
5 | 83.88 | 5 | 86.76 |
6 | 84.86 | 6 | 89.35 |

The percentage accuracy of the presented model is compared with the fuzzy k-nearest neighbor algorithm proposed by Keller et al. [

Split | Fknn PA (%) at 50% training data set | Proposed model PA (%) at 50% training data set | Split | Fknn PA (%) at 80% training data set | Proposed model PA (%) at 80% training data set |
---|---|---|---|---|---|
1 | 85.09 | 86.01 | 1 | 89.66 | 90.80 |
2 | 85.09 | 84.86 | 2 | 85.06 | 85.06 |
3 | 81.19 | 82.80 | 3 | 85.06 | 90.80 |
4 | 85.32 | 85.32 | 4 | 85.06 | 90.80 |
5 | 86.93 | 87.38 | 5 | 80.46 | 85.06 |

From

A pattern classification model for vowel data using fuzzy set theory has been proposed, exploring the advantage of the explicit fuzzy classification technique and improving the model’s performance. The model thus explores the collective benefits of these techniques, which provide better class partition details and are helpful for significantly overlapping data sets. The proposed model generates a membership matrix that represents the importance of the features of input patterns belonging to all classes rather than just one class. As a result, the ability to generalize is improved. The efficiency of the proposed model was evaluated through the percentage accuracy (PA), measured on a completely labeled vowel data set. The classification accuracy of the proposed model is also compared with the previous classification models and the fuzzy