Diabetes mellitus is a metabolic disease in which blood glucose levels rise as a result of failed pancreatic insulin production. If left untreated, it causes hyperglycemia and chronic multiorgan dysfunction, including blindness, renal failure, and cardiovascular disease. One of the essential checks that must be performed frequently in Type 1 Diabetes Mellitus is a blood test. This procedure involves drawing blood often, which causes discomfort and increases the possibility of infection. Existing methods for diabetes classification have lower classification accuracy and suffer from the vanishing gradient problem. To overcome these issues, we propose a stacking ensemble learning-based convolutional gated recurrent neural network (CGRNN) Metamodel algorithm. The proposed method first performs outlier detection with the Gaussian distribution method to remove outlier data, and uses the Box-Cox method to normalize the dataset. After outlier detection, missing values are replaced by the mean of the data rather than eliminated. The stacking ensemble base model employs multiple machine learning algorithms: Naïve Bayes, Bagging with random forest, and Adaboost Decision tree. The CGRNN Meta model uses two hidden layers, Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU), to calculate the weight matrix for diabetes prediction. Finally, the calculated weight matrix is passed to the softmax function in the output layer to produce the diabetes prediction results. With the LSTM-based CG-RNN, the mean square error (MSE) is 0.016 and the obtained accuracy is 91.33%.

Diabetes mellitus is a chronic disorder that causes certain illnesses that result in high sugar levels in the circulatory system. There is no permanent cure for diabetes and uncontrolled diabetes may lead to death. Individuals suffering from type 1 diabetes are prone to fatal comorbidities such as peripheral vascular disease, stroke, a heart disease involving the coronary artery, dyslipidemia, facial hypertension, and obesity [

The current Convolutional Gated Recurrent Network suffers from a vanishing gradient problem. In this problem, the hidden layer is affected by a given input [

In this paper, an ensemble learning-based Convolutional Gated Recurrent Neural Network for diabetes mellitus prediction is proposed. Initially, our suggested technique uses the Gaussian distribution method to discover and eliminate outlier data, and the Box-Cox method to appropriately arrange the dataset. Instead of being discarded after outlier identification, missing values are filled in with the mean of the data. The suggested Meta model CGRNN algorithm and softmax function are utilized for diabetes classification and prediction. Finally, we compare the proposed method with earlier methods and show that it outperforms them.

This section presents a review of different machine learning methods for diabetes prediction. The Database (DB) technology monitoring was developed in the field of management and is designed to help treat patients with type 1 diabetes mellitus (T1DM) [

The main model consists of hybrid wavelet neural networks (HWNNs) and a self-organizing map (SOM) network, which together constitute an ensemble of the following subsampling methods. Chi-square tests, binary logistic regression analysis, and screening for T2DM risk predict the most important risk factors for diabetes. The Synthetic Minority Over-sampling Technique (SMOTE) is used to balance cross-section data [

Prevention of diabetes or late onset of diabetes is very important. A framework is proposed to [

The combinative approach involving Infra-Red Spectroscopy coupled with multi-dimensional analysis has been put forward as a promising method for diagnostics. To examine the probability of an accurate and correct diagnosis, certain approaches such as Partial Least Squares (PLS), Cluster analysis, regression [

CNNs towards EEG multispectral images are used to improve classification performance. The distribution model was developed in this study based on the inception V1 multi-view convolutional neural network (MVCNN) [

It is shown in the (CNN) also known as Unified Convolutional Neural Network and the (M-bCNN) known as Matrix-based Convolutional Neural Network to meet this challenge, while standard feature selection algorithms are used as a relief [

They created a semi-automated framework that uses a machine-learning algorithm to increase recall while reducing false positives. They suggested a methodology that uses engineering and machine learning to identify subjects with or without T2DM from EHR. To quantify individual performance measures, the following machine learning models are examined and compared: Random Forest, Naïve Bayes, Logistic Regression, K-Nearest Neighbor, and Support Vector Machine. They took a random sample of 300 patient records from the 23,281 diabetic patient records in the EHR repository [

The implementation of gating mechanisms in Recurrent Neural Networks and their sophisticated units has emerged as a powerful technique for diabetes data prediction.

The above-proposed method block diagram represented in

The method involved here is the Box-Cox technique. Its purpose is to construct the matrix sets, which are derived from the feature values of the surrounding subset. It estimates the maximum likelihood value for every matrix so that the data can be normalized efficiently.

Calculation of lambda by using maximum likelihood estimation.

The normalization of data using a Box-Cox transformation

Let us assume that the optimal value of λ is selected from the range −5 to 5, x_{i} is the transformed data, and Pd represents the preprocessed dataset. An initial check for outliers is conducted on these two variables. Variables that are anomalous must be modified and then converted with Box-Cox to attain a normal distribution through
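
The λ search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation. It assumes the standard Box-Cox definition, y = (x^λ − 1)/λ for λ ≠ 0 and y = ln x for λ = 0, together with the usual profile log-likelihood under a normal model. The function names, the grid step of 0.01, and the sample values are hypothetical.

```python
import math

def box_cox(values, lam):
    """Box-Cox transform of strictly positive values for a given lambda."""
    if abs(lam) < 1e-12:
        return [math.log(v) for v in values]
    return [(v ** lam - 1.0) / lam for v in values]

def box_cox_log_likelihood(values, lam):
    """Profile log-likelihood of lambda, assuming the transformed data are normal."""
    n = len(values)
    transformed = box_cox(values, lam)
    mean = sum(transformed) / n
    variance = sum((t - mean) ** 2 for t in transformed) / n
    jacobian = (lam - 1.0) * sum(math.log(v) for v in values)
    return -0.5 * n * math.log(variance) + jacobian

def best_lambda(values, low=-5.0, high=5.0, step=0.01):
    """Grid search over [low, high] for the lambda with the highest log-likelihood."""
    steps = int(round((high - low) / step))
    candidates = [low + i * step for i in range(steps + 1)]
    return max(candidates, key=lambda lam: box_cox_log_likelihood(values, lam))

# Hypothetical, strictly positive, right-skewed measurements (not from the dataset).
insulin = [15.0, 36.0, 48.0, 94.0, 130.0, 168.0, 230.0, 543.0, 846.0]
lam = best_lambda(insulin)
transformed = box_cox(insulin, lam)
```

Box-Cox requires strictly positive inputs, so a feature containing zeros would need a shift before this transform is applied.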

Calculate the maximum likelihood of the Gaussian Probability Density Function with different mean and standard deviation values.

Normalization of data through calculation of its z-score as mentioned below:

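Z score can be calculated using

$$z_{i} = \frac{x_{i} - \mu}{\sigma}$$

To find the mean values, let us assume n refers to the number of data points.

$$\mu = \frac{1}{n}\sum_{i=1}^{n} x_{i}$$

where μ is the mean and σ is the standard deviation of the data.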

Normalization requires considerably less data than quantification. The expected prediction error should be consistent with the standard normal distribution (mean 0, standard deviation 1) and the confidence interval. The calculation of each variable's mean depends on assumptions such as a normal distribution, as represented in
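
The preprocessing steps described in this section (Gaussian outlier removal, mean imputation of missing values, and z-score normalization) can be sketched as below. This is an illustrative sketch under stated assumptions: missing values are represented as `None`, an observation counts as an outlier when its absolute z-score exceeds a threshold, and the threshold of 2.5 and the sample values are hypothetical choices that the paper does not specify.

```python
import math

def mean_and_std(values):
    """Mean and population standard deviation of the non-missing values."""
    observed = [v for v in values if v is not None]
    mu = sum(observed) / len(observed)
    sigma = math.sqrt(sum((v - mu) ** 2 for v in observed) / len(observed))
    return mu, sigma

def gaussian_pdf(x, mu, sigma):
    """Gaussian probability density at x for the given mean and standard deviation."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def preprocess(values, z_threshold=2.5):
    """Drop outliers, fill missing values with the mean, then return z-scores."""
    mu, sigma = mean_and_std(values)
    # Keep missing entries for imputation; drop observations far from the mean.
    kept = [v for v in values if v is None or abs(v - mu) / sigma <= z_threshold]
    # Recompute the mean without the outliers and use it to fill the gaps.
    mu, _ = mean_and_std(kept)
    filled = [mu if v is None else v for v in kept]
    mu, sigma = mean_and_std(filled)
    return [(v - mu) / sigma for v in filled]

# Hypothetical BMI column with two missing values and one implausible reading.
bmi = [22.1, 24.3, None, 26.0, 23.5, 25.2, 24.8, 95.0, 23.9, None, 25.5, 24.1]
z_scores = preprocess(bmi)
```

With only a handful of records a single extreme value inflates the standard deviation, which is why the threshold in this small example is lower than the common value of 3.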

Bayesian classification is based on Bayes' theorem. Naïve Bayesian classifiers are simple classifiers whose performance is comparable to that of decision trees and selected neural networks when applied to large databases. Naive Bayes classification allows representation between a subset of dependent attributes. In this method, the posterior probability

where the
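
As an illustration of how a posterior is obtained from Bayes' theorem, P(c | x) ∝ P(x | c) P(c), the sketch below implements a small Gaussian Naïve Bayes classifier. It assumes continuous features modeled with per-class normal densities, which the paper does not state explicitly. The function names and the (BMI, glucose) records are hypothetical.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(rows, labels):
    """Estimate the class prior and the per-feature mean and variance of each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        columns = list(zip(*members))
        means = [sum(col) / n for col in columns]
        # A small constant keeps the variance strictly positive.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(rows), means, variances)
    return model

def log_posterior(model, row):
    """Unnormalized log posterior: log prior plus the sum of log likelihoods."""
    scores = {}
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for x, m, var in zip(row, means, variances):
            score += -0.5 * math.log(2.0 * math.pi * var) - (x - m) ** 2 / (2.0 * var)
        scores[label] = score
    return scores

def predict_nb(model, row):
    """Return the class with the highest posterior."""
    scores = log_posterior(model, row)
    return max(scores, key=scores.get)

# Hypothetical (BMI, glucose) records; label 1 marks a diabetic subject.
rows = [[22, 85], [24, 90], [23, 88], [25, 95], [31, 150], [33, 160], [30, 145], [35, 170]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
nb_model = fit_gaussian_nb(rows, labels)
```

Working in log space avoids numerical underflow when many feature likelihoods are multiplied together.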

Bagging stands for bootstrap aggregation; it combines multiple learners to decrease the variance of the estimates. For example, a Random Forest can train M different trees on different random subsets of the data and vote for the final prediction. To use this methodology, we must first develop numerous models on the data sets using the Bootstrap sampling method. A Bootstrap sampling approach involves the creation of numerous training sets from the original dataset. Each training set contains the same number of records (N) as the original dataset (M). The training set is made up of random subsamples of the main dataset that might comprise duplicate entries or be missing certain records. The test dataset will be the original dataset. M different trees can then be trained on different subsets of the data for classification, as expressed in the following equation.

where
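
The bootstrap-and-vote procedure can be sketched as follows. To keep the example short, one-level decision trees (stumps) stand in for the full trees of a Random Forest. That substitution, the function names, the number of trees, and the sample records are all assumptions made for illustration.

```python
import random
from collections import Counter

def bootstrap_sample(rows, labels, rng):
    """Draw len(rows) records with replacement, so duplicates and omissions occur."""
    indices = [rng.randrange(len(rows)) for _ in rows]
    return [rows[i] for i in indices], [labels[i] for i in indices]

def fit_stump(rows, labels):
    """One-level tree: pick the (feature, threshold) split with the best training accuracy."""
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            for above in (0, 1):
                predictions = [above if row[feature] > threshold else 1 - above
                               for row in rows]
                correct = sum(p == y for p, y in zip(predictions, labels))
                if best is None or correct > best[0]:
                    best = (correct, feature, threshold, above)
    _, feature, threshold, above = best
    return lambda row: above if row[feature] > threshold else 1 - above

def fit_bagging(rows, labels, n_trees=25, seed=0):
    """Train each tree on its own bootstrap sample."""
    rng = random.Random(seed)
    return [fit_stump(*bootstrap_sample(rows, labels, rng)) for _ in range(n_trees)]

def predict_bagging(trees, row):
    """Majority vote over the predictions of all trees."""
    votes = Counter(tree(row) for tree in trees)
    return votes.most_common(1)[0][0]

# Hypothetical (BMI, glucose) records; label 1 marks a diabetic subject.
rows = [[22, 85], [24, 90], [23, 88], [25, 95], [31, 150], [33, 160], [30, 145], [35, 170]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
trees = fit_bagging(rows, labels)
```

Because every tree sees a different resample, individual trees may err on a given record, and the vote averages those errors out.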

The chief factor that distinguishes AdaBoost from Bagging is that in bagging every iteration generates a tree independently and the prediction with the most votes wins, whereas in Adaboost, trees are created based on the classification errors of the previous trees.

While constructing the next model, each model learns from the flaws or mistakes made in the prior model and corrects them as shown in

where w_{i} is the weight matrix, N is the sample size, and y_{i} is the target variable.

The weighted error E in the previous equation is normalized, so it lies between 0 and 1.

Through this we can calculate the ‘importance of say’ with the help of the equation:

α is the importance of say, E is the weighted error

If this technique is further continued for n iterations, we will have n classifiers with weighted votes in the last iteration.

A weighted dataset is created in which the weight of each instance is determined by the results of the prior base classifier on that instance. If the classifier misclassifies an instance, the weight of that instance grows in subsequent models, while the weight remains the same if the classification is right.

The final decision is reached by a weighted vote of the base classifiers, where each model's weight depends on its misclassification rate.

A model with higher classification accuracy gets a higher weight, and a model with poor classification accuracy gets a lower weight.

where
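
The quantities used in this section can be illustrated with one boosting round. The sketch assumes the common AdaBoost formulation, in which the importance of say is α = ½ ln((1 − E)/E) and sample weights are rescaled by exp(±α) before renormalization. The paper's own equations are not reproduced here, and the function names and labels are hypothetical. Scaling misclassified samples up by exp(α) and correct ones down by exp(−α) gives the same normalized weights as leaving correct samples unchanged and scaling only the misclassified ones by exp(2α).

```python
import math

def weighted_error(weights, predictions, targets):
    """Normalized weighted error E, which lies between 0 and 1."""
    wrong = sum(w for w, p, y in zip(weights, predictions, targets) if p != y)
    return wrong / sum(weights)

def importance_of_say(error, eps=1e-10):
    """alpha = 0.5 * ln((1 - E) / E); a lower error yields a larger alpha."""
    error = min(max(error, eps), 1.0 - eps)
    return 0.5 * math.log((1.0 - error) / error)

def update_weights(weights, predictions, targets, alpha):
    """Raise the weights of misclassified samples, lower the rest, then renormalize."""
    updated = [w * math.exp(alpha if p != y else -alpha)
               for w, p, y in zip(weights, predictions, targets)]
    total = sum(updated)
    return [w / total for w in updated]

def weighted_vote(alphas, votes):
    """Combine classifier votes in {-1, +1}, each scaled by its importance of say."""
    score = sum(a * v for a, v in zip(alphas, votes))
    return 1 if score >= 0 else -1

# Hypothetical round: five equally weighted samples, one of them misclassified.
targets = [1, 1, -1, -1, 1]
predictions = [1, -1, -1, -1, 1]
weights = [0.2] * 5
E = weighted_error(weights, predictions, targets)
alpha = importance_of_say(E)
new_weights = update_weights(weights, predictions, targets, alpha)
```

In this round E is 0.2, so α equals ln 2, and the single misclassified sample ends up carrying half of the total weight for the next classifier.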

Type 1 diabetes mellitus is classified by using the same data set for training and test data, followed by 10-fold cross-validation. The predictions of each tree are compared with the actual labels in the training set. The tree training samples that correspond to this feature are categorized as the following trees in the forest. Let us assume initial weights of

The recurrent neural network is an approach mainly used for processing sequential data. Its defining attribute is that it extends feed-forward neural networks with cyclic connections. The network maps the complete input history to each output, so every prediction exploits the temporal relationships between the data at each instance of time.

The architecture diagram of a simple Convolutional Gated Recurrent Neural Network is shown in

The final weight matrix generated is then passed through a softmax function. The softmax function is given by

where
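
For reference, the standard softmax maps a vector of scores s to probabilities exp(s_k) / Σ_j exp(s_j). The sketch below is a generic implementation of that definition, not the authors' code. Subtracting the maximum score before exponentiation is a common numerical safeguard, and the example scores are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical output-layer scores for the classes (non-diabetic, diabetic).
class_scores = [0.4, 2.1]
probabilities = softmax(class_scores)
predicted_class = probabilities.index(max(probabilities))
```

The class with the largest score always receives the largest probability, so the predicted label is unchanged by the transformation.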

The CG-RNN can handle variable-length sequences because it has a recurrent hidden state whose activation at each time step depends on that of the previous time step. The architecture of LSTM is shown in

The forget gate takes the previous hidden state (h_{t−1}) and the content input (X_{t}) and outputs, for each number in the cell state C_{t−1}, a number between 0, which means omit this, and 1, which means keep this.

where C_{t} = Cell State,
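
The gate described above, which outputs a keep-or-omit value for each entry of the cell state C_{t−1}, matches the standard LSTM forget gate f_t = σ(W_f · [h_{t−1}, X_t] + b_f). A minimal sketch under that assumption follows. The weights, biases, and state values are hypothetical and chosen so that one cell entry is almost fully kept and the other almost fully dropped.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function with outputs in (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def forget_gate(w_f, b_f, h_prev, x_t):
    """f_t = sigmoid(W_f . [h_prev, x_t] + b_f), one value per cell-state entry."""
    concat = h_prev + x_t  # list concatenation of the two vectors
    return [sigmoid(sum(w * v for w, v in zip(row, concat)) + b)
            for row, b in zip(w_f, b_f)]

def apply_forget(f_t, c_prev):
    """Scale each entry of the previous cell state by its gate value."""
    return [f * c for f, c in zip(f_t, c_prev)]

# Hypothetical parameters: hidden size 2, input size 1.
w_f = [[10.0, 0.0, 0.0], [-10.0, 0.0, 0.0]]
b_f = [0.0, 0.0]
h_prev = [1.0, 0.5]
x_t = [0.3]
c_prev = [0.8, -0.6]
f_t = forget_gate(w_f, b_f, h_prev, x_t)
c_kept = apply_forget(f_t, c_prev)
```

A complete LSTM step would add an input gate and a candidate state to `c_kept`; only the forgetting part is shown here.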

A stacking ensemble learns how best to combine the predictions of multiple machine learning models. The meta-model is a predictive model trained on the predictions that the base models make on sample data.

Data that was not used to train the base models is fed to them. Their predictions, paired with the expected outputs, form the input and output pairs used to fit the Metamodel. As compared with NB, BRF, and the Adaboost decision tree, the proposed CG-RNN gives a better result.
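
The construction of the meta-model's training set can be sketched with out-of-fold predictions: each base model predicts only the records it did not see during training. This is a generic stacking sketch rather than the authors' pipeline. The fold scheme, the toy threshold learners standing in for the base models, and the sample records are hypothetical.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds

def make_threshold_learner(feature):
    """Toy base learner: split one feature at the midpoint of the two class means."""
    def fit(rows, labels):
        pos = [r[feature] for r, y in zip(rows, labels) if y == 1]
        neg = [r[feature] for r, y in zip(rows, labels) if y == 0]
        cut = (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2.0
        return lambda row: 1 if row[feature] > cut else 0
    return fit

def out_of_fold_predictions(fit_fns, rows, labels, k):
    """One meta-feature row per record, predicted by models that never saw it."""
    meta = [[None] * len(fit_fns) for _ in rows]
    for held_out in k_fold_indices(len(rows), k):
        held = set(held_out)
        train_rows = [r for i, r in enumerate(rows) if i not in held]
        train_labels = [y for i, y in enumerate(labels) if i not in held]
        for j, fit in enumerate(fit_fns):
            model = fit(train_rows, train_labels)
            for i in held_out:
                meta[i][j] = model(rows[i])
    return meta

# Hypothetical (BMI, glucose) records, interleaved so every fold holds both classes.
rows = [[22, 85], [31, 150], [24, 90], [33, 160], [23, 88], [30, 145], [25, 95], [35, 170]]
labels = [0, 1, 0, 1, 0, 1, 0, 1]
base_learners = [make_threshold_learner(0), make_threshold_learner(1)]
meta_features = out_of_fold_predictions(base_learners, rows, labels, k=4)
# The pairs (meta_features[i], labels[i]) are what the meta-model is fitted on.
```

Using held-out predictions keeps the meta-model from simply learning which base model memorized the training data best.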

We used the Austin Public Health Diabetes database for classification and analysis. It is a multivariate, time-series data set from the UCI repository consisting of 10000 records of different people. The dataset consists of several medical predictor (independent) variables, target (association) variables, and results. Independent variables include the patient's BMI, insulin level, age, etc. The analysis uses five different ratios of training to testing data, labeled A, B, C, D, and E: 85/15, 80/20, 75/25, 70/30, and 60/40.
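
The five train/test ratios can be reproduced with a simple shuffled split, as sketched below. The record identifiers are placeholders for the 10000 dataset rows, and the function name and random seed are hypothetical choices made so the split is repeatable.

```python
import random

# Training fractions for the five configurations named in the text.
SPLITS = {"A": 0.85, "B": 0.80, "C": 0.75, "D": 0.70, "E": 0.60}

def train_test_split(records, train_fraction, seed=42):
    """Shuffle a copy of the records and cut it at the requested fraction."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]

records = list(range(10000))  # placeholder identifiers for the dataset rows
split_sizes = {}
for name, fraction in SPLITS.items():
    train, test = train_test_split(records, fraction)
    split_sizes[name] = (len(train), len(test))
```

Fixing the seed means every configuration shuffles the records in the same order, so the larger training sets contain the smaller ones.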

The proposed Naive Bayes method has an accuracy of 0.82, an f1-score of 0.70, a recall of 0.68, and a precision of 0.72 under 10-fold data validation. The result comparison is illustrated in

The above analysis shows that the proposed Adaboost decision tree model provides an accuracy of 0.87, an f1-score of 0.78, a recall of 0.74, and a precision of 0.82 under 10-fold data validation. The different k-fold cross-validation results of the Adaboost decision tree method are shown in

Proposed method performance analysis is given in

Parameters | Proposed CG-RNN |
---|---|
Precision (%) | 87.52 |
Sensitivity (%) | 88.0 |
Specificity (%) | 85.67 |
Accuracy (%) | 91.33 |
FNR | 0.080 |
FDR | 0.0548 |
MCC | 0.8670 |
F1-score | 0.9124 |
Time complexity: training time (Sec) | 6 |
Time complexity: testing time (Sec) | 3 |
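
The measures listed in the table follow from the four cells of a binary confusion matrix. The sketch below uses their standard definitions. The counts are hypothetical and are not the counts behind the reported results.

```python
import math

def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification measures from confusion-matrix counts."""
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)  # also called recall
    specificity = tn / (tn + fp)
    mcc_denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "fnr": fn / (fn + tp),  # false negative rate = 1 - sensitivity
        "fdr": fp / (fp + tp),  # false discovery rate = 1 - precision
        "mcc": (tp * tn - fp * fn) / mcc_denominator,
        "f1": 2 * precision * sensitivity / (precision + sensitivity),
    }

# Hypothetical counts for 100 test records.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
```

FNR and FDR are the complements of sensitivity and precision respectively, which gives a quick consistency check on any reported set of values.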

Methods | Precision | Recall | f1-score | Accuracy (%) |
---|---|---|---|---|
K-NN | 0.72 | 0.46 | 0.44 | 63 |
SVM | 0.82 | 0.57 | 0.67 | 77.3 |
NB | 0.716 | 0.652 | 0.68 | 82.9 |
ADT | 0.752 | 0.752 | 0.753 | 81.1 |
BRF | 0.836 | 0.804 | 0.814 | 89.8 |
CGRNN | 0.87 | 0.89 | 0.895 | 91.33 |

Type 1 diabetes mellitus (T1DM) is a serious and fast-growing health problem worldwide. T1DM is a long-term, progressive metabolic disorder primarily attributed to hyperglycemia due to inhibited secretion of insulin and resistance to insulin action. The proposed CGRNN Meta model method predicts diabetes mellitus more accurately than other methods. Initially, the Gaussian distribution method is applied to detect and remove outlier data. The preprocessing step evaluates the mean, the standard deviation, and the likelihood values of every parameter in order to fill in missing values. Stacking ensemble machine learning algorithms, namely Naïve Bayes, Adaboost decision tree, and bagging with Random Forest, are used for training and for updating the weights of the diabetes mellitus datasets. The proposed CGRNN Meta model method selects the best classifier model based on the accuracy voting method for the accurate prediction of Type 1 diabetes mellitus (T1DM). The overall classification accuracy of the proposed NB method is 81.2%, BRF provides 88.8%, ADT provides 82.9%, and CGRNN produces the best accuracy of 91.33%. The accuracy of the system can be further improved if the number of records in the dataset is scaled up.

The authors thank the supervisor, with a deep sense of gratitude, for his guidance and constant support during this research.