Currently, the timing of aortic valve replacement surgery for asymptomatic patients with moderate-to-severe aortic stenosis (AS) is decided by healthcare professionals based on the patient’s clinical biometric records. A delay in surgical aortic valve replacement (SAVR) can potentially affect patients’ quality of life. By using machine learning (ML) algorithms, this study aims to predict the optimal SAVR timing and to determine the resulting improvement in the survival of moderate-to-severe AS patients following surgery. This study represents a novel approach that has the potential to improve decision-making and, ultimately, patient outcomes. We analyzed data from 176 patients with moderate-to-severe aortic stenosis who had undergone or were indicated for SAVR. We divided the data into two groups: those who died within the first year after SAVR and those who survived for more than one year or were still alive at the last follow-up. We then used six different ML algorithms, Support Vector Machine (SVM), Classification and Regression Tree (C and R tree), Generalized Linear (GL), Chi-Square Automatic Interaction Detector (CHAID), Artificial Neural Network (ANN), and Linear Regression (LR), to generate predictions of the best timing for SAVR. The results showed that the SVM algorithm was the best model for predicting both the optimal timing for SAVR and the post-surgery survival period. By optimizing the timing of SAVR using the SVM algorithm, we observed a significant improvement in the survival period after SAVR. Our study demonstrates that ML algorithms can generate reliable models for predicting the optimal timing of SAVR in asymptomatic patients with moderate-to-severe AS.
AS is one of the most common valvular heart diseases and cardiovascular diseases in North America and Europe [
Elderly asymptomatic patients with moderate-to-severe AS (effective orifice area less than 1.5 cm^{2}) may suffer from reduced systemic arterial compliance (SAC) with preserved ejection fraction (EF) [
Cardiologists are frequently confronted with the dilemma of whether to perform surgery on asymptomatic patients with severe AS. Lowering operative risk by completing valve replacement sooner, avoiding sudden cardiac death, and preventing irreversible myocardial damage are the main arguments for recommending surgical intervention in asymptomatic patients. Unfortunately, for elderly patients, the risk of surgery may outweigh its benefits [
SAVR and transcatheter aortic valve replacement (TAVR) are surgical treatment procedures for patients [
Because many patients are examined as outpatients, the delay between the cardiologist’s recommendation for SAVR and the actual operation date may put the patient at risk of heart failure progression or death. Furthermore, some individuals may postpone SAVR until their symptoms worsen or new technology becomes available [
Almost half of all patients with severe aortic stenosis are asymptomatic at the time of diagnosis. It remains uncertain and challenging to determine the optimal timing of intervention to minimize early morbidity and mortality in these patients. Numerous techniques for optimizing valve replacement surgery time in asymptomatic individuals with AS have been developed for this purpose [
Kang et al. recently published the results of an eight-year follow-up study in which patients with asymptomatic, very severe aortic stenosis were randomly assigned to surgical aortic valve replacement or conservative therapy. The researchers found that the patients who underwent early surgical aortic valve replacement had a much lower incidence of morbidity and/or mortality than those assigned to conservative therapy [
With the progress and enhancement of artificial intelligence (AI) and ML algorithms over the years, these technologies have evolved from concepts to practice. Numerous applications of AI have been demonstrated in medicine. AI technology has emerged as a significant component that may influence the growth of the medical industry and enhance the quality of healthcare services. AI can assist doctors in disease diagnosis and improve therapeutic quality. When integrated into standard medical processes, AI has the potential to minimize misdiagnoses and enhance diagnostic effectiveness. Currently, AI technologies are used in cardiology for purposes such as precision medicine, clinical forecasting, cardiac imaging assessment, and intelligent robotics. The application of AI in cardiovascular therapy has promising potential [
Clinicians often deal with binary outcomes in which patient-specific cases are not generally considered. This issue is so prevalent in the biomedical literature that statisticians refer to it as “dichotomania”. In essence, dichotomizing continuous data leads to the loss of critical information regarding the strength of associations. It is better to estimate an individual patient’s probability than to generate binary categories. ML algorithms may provide clinical benefits by delivering more accurate predictions of the outcome likelihood [
Many studies have used ML algorithms in cardiology over the last decade, such as in forecasting the fatality rate of coronary artery disease, utilizing an SVM for the forecasting and computation of the American College of Cardiology and American Heart Association risk scores, implementing ML algorithms for echocardiographic variables to differentiate hypertrophic cardiomyopathy from physiological hypertrophy in athletes, evaluating left ventricular diastolic dysfunction, and estimating 12 kinds of heart rhythms [
To enhance the reliability of decision-making on the best SAVR timing for asymptomatic patients with moderate-to-severe AS, several ML algorithms were tested in the current study using 24 parameters representing patients’ clinical characteristics, Doppler echocardiographic data of the left ventricle (LV) geometry and function, and systemic arterial indices.
The main contribution of this study is to optimize SAVR time, which will help asymptomatic AS patients live longer. In other words, the patient may live longer if the surgery is performed at the right time.
Different combinations of two ML models were created in a novel way to accomplish our goal. Six ML algorithms were used in the first ML model to predict the best time for SAVR for the patient population. We then applied a second ML model (which also contained six ML algorithms) on the predicted SAVR to determine how the optimized SAVR timing affected patients’ ability to survive longer after surgery.
To our knowledge, this is the first study to investigate the connection between the prediction of SAVR operation time and the impact of the predicted SAVR timing on a patient’s survival period. The significance of this study lies in its potential to improve decision-making and enhance patient outcomes after valve replacement surgery. By using ML algorithms to predict the optimal timing for SAVR, we have developed a more accurate and objective approach that has the potential to overcome the limitations of traditional methods based on clinical biometric records and healthcare professional judgment. Our findings have important implications for patients, healthcare professionals, and the broader scientific community, as they demonstrate the potential of ML algorithms to improve patient outcomes and optimize medical decision-making.
The data in this study were retrospectively collected and reviewed for 1,154 individuals referred to the Quebec Heart Institute echocardiographic laboratory between August 1999 and March 2005 for AS examination [
The data were divided into two groups based on the survival period after SAVR. Patients who underwent SAVR and survived for more than one year were included in the modeling group (111 patients). In contrast, those who died less than one year after SAVR were included in the implementation group (65 patients).
As shown in
The modeling group was initially separated into a training set (70 percent of the data) and a testing set (the remaining 30 percent of the data). Six different machine learning algorithms were then applied to the modeling group to predict the optimal time of SAVR (ML model 1). In parallel, ML model 2 (which also contained six ML algorithms) was applied to the same modeling group to predict the survival period after SAVR.
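The 70/30 partition described above can be sketched in a few lines. This is a minimal illustration rather than the procedure used by the IBM SPSS Modeler: the function name `split_train_test`, the fixed seed, and the integer patient identifiers are assumptions introduced here.

```python
import random


def split_train_test(records, train_fraction=0.7, seed=0):
    """Shuffle the records and split them into training and testing sets."""
    shuffled = list(records)  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical identifiers standing in for the 111 modeling-group patients.
modeling_group = list(range(111))
train, test = split_train_test(modeling_group)
print(len(train), len(test))  # 78 33
```

With 111 patients, a 70 percent share rounds to 78 training records and leaves 33 for testing; the exact counts in the study may differ depending on how the software rounds or samples.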
The ML algorithm with the best correlation from model 1 was applied to the implementation group (
Finally, the impact ratio (the ratio of the number of patients with updated SAVR times (
The central illustration in
The dataset used in this study contained 26 variables. Variables with more than 20% missing values, namely diabetes and dyslipidemia, were excluded from the ML algorithms, leaving 24 variables. The IBM Statistical Package for the Social Sciences (SPSS) Modeler was used to perform feature ranking.
Extracting or ranking features is an effective way to reduce the complexity of ML algorithms. ML applications often include hundreds if not thousands of fields that can be used as inputs. Consequently, considerable time and effort are often spent deciding which fields or variables to include in the algorithm. The Feature Selection technique in the IBM SPSS Modeler was used to identify the most important fields. This stage was disabled by the IBM SPSS Modeler for the SVM algorithm.
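The missing-value screen can be illustrated as follows. This is a hedged sketch, not the SPSS Modeler implementation: the function `drop_sparse_variables`, the use of `None` to mark a missing value, and the toy records are all assumptions made for illustration.

```python
def drop_sparse_variables(rows, max_missing_fraction=0.20):
    """Return the variable names whose share of missing values is at most the threshold."""
    kept = []
    for name in rows[0].keys():
        missing = sum(1 for row in rows if row[name] is None)
        # A variable is excluded only when MORE than 20% of its values are missing.
        if missing / len(rows) <= max_missing_fraction:
            kept.append(name)
    return kept


# Toy records (hypothetical): "diabetes" is missing in 2 of 5 rows (40%).
rows = [
    {"age": 70, "ava": 0.9, "diabetes": None},
    {"age": 65, "ava": 1.1, "diabetes": 1},
    {"age": 80, "ava": None, "diabetes": None},
    {"age": 59, "ava": 0.8, "diabetes": 0},
    {"age": 72, "ava": 1.0, "diabetes": 1},
]
print(drop_sparse_variables(rows))  # ['age', 'ava']
```

In this toy case `ava` is kept because exactly 20% of its values are missing, which does not exceed the threshold.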
The data were then divided into different categories. The first category included clinical characteristics such as age, sex, body surface area (BSA), obesity (Body Mass Index (BMI) >30 (kg/m^{2})), hypertension (blood pressure greater than 140/90 mm Hg), and the presence of CAD.
The second category included Doppler echocardiographic and systemic arterial indices such as aortic valve area (AVA), peak transvalvular pressure gradient (PG), mean transvalvular pressure gradient (MG) less than 30 mm Hg, systolic arterial pressure, diastolic arterial pressure, heart rate (HR), SAC, calculated as the stroke volume index divided by the brachial pulse pressure, and systemic vascular resistance, calculated as the mean arterial pressure multiplied by 80 and divided by the cardiac output (CO).
The third category comprised Doppler echocardiographic data of LV geometry and function, such as relative wall thickness (RWTH), calculated as twice the ratio of posterior wall thickness to LV diastolic diameter; LV mass (determined using the American Society of Echocardiography’s revised formula and then indexed for body surface area); CO; stroke volume (SV); mean volume flow rate (Qmean); overall EF, a measure of how much blood the left ventricle pumps into the body with each heartbeat; paradoxical low-flow (PLF), defined as a low-flow state despite a normal LV ejection fraction (stroke volume index ≤35 mL/m^{2}); transaortic peak instantaneous velocity (Vmax); LV hypertrophy (defined as an LV mass index greater than 134 g/m^{2} for males and greater than 110 g/m^{2} for females); and valvuloarterial impedance (Zva), calculated as the sum of the systolic arterial pressure and the mean transvalvular gradient divided by the stroke volume index.
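The derived indices defined in the two preceding paragraphs reduce to simple ratios. The sketch below restates them as functions; the function names and the input values are hypothetical and chosen only to make the arithmetic easy to follow, and the relative wall thickness is returned as a fraction rather than a percentage.

```python
def systemic_arterial_compliance(stroke_volume_index, pulse_pressure):
    """SAC = stroke volume index / brachial pulse pressure."""
    return stroke_volume_index / pulse_pressure


def systemic_vascular_resistance(mean_arterial_pressure, cardiac_output):
    """Systemic vascular resistance = 80 * mean arterial pressure / cardiac output."""
    return 80.0 * mean_arterial_pressure / cardiac_output


def relative_wall_thickness(posterior_wall_thickness, lv_diastolic_diameter):
    """RWT = 2 * posterior wall thickness / LV diastolic diameter (as a fraction)."""
    return 2.0 * posterior_wall_thickness / lv_diastolic_diameter


def valvuloarterial_impedance(systolic_arterial_pressure, mean_gradient, stroke_volume_index):
    """Zva = (systolic arterial pressure + mean transvalvular gradient) / stroke volume index."""
    return (systolic_arterial_pressure + mean_gradient) / stroke_volume_index


# Hypothetical patient values, for illustration only.
print(systemic_arterial_compliance(42.0, 60.0))      # 0.7
print(systemic_vascular_resistance(95.0, 5.0))       # 1520.0
print(relative_wall_thickness(1.1, 4.4))             # 0.5
print(valvuloarterial_impedance(130.0, 38.0, 42.0))  # 4.0
```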
Patients in the implementation group appeared to have a significantly higher Vmax, PG, MG, systemic vascular resistance, and LV mass. However, there were no significant differences among the other variables used in this study (
No.  Variable  Overall population  Implementation group  Modeling group  p-value
Clinical characteristics  
1  Age (years)  66.4 ± 12.4  66.6 ± 12.4  66.3 ± 12.4  NS 
2  Gender (female/male)  176 (63/113)  (22/43)  (41/70)  NS 
3  Body surface area (BSA), m^{2}  1.86 ± 0.21  1.88 ± 0.2  1.84 ± 0.19  NS 
4  Obesity (BMI > 30)  54  23  31  NS 
5  Coronary artery disease (CAD)  113  40  73  NS 
6  Hypertension  70  26  44  NS 
Doppler echocardiographic and systemic arterial indices  
7  Aortic valve area (AVA), cm^{2}  0.93 ± 0.3  0.87 ± 0.23  0.96 ± 0.31  NS 
8  Peak gradient (PG), mm Hg  60 ± 21.1  64.9 ± 23.9  57.1 ± 20.6  <0.001 
9  Mean gradient (MG), mm Hg  35.7 ± 14  38.8 ± 15.4  33.8 ± 12.8  <0.001 
10  Systolic arterial pressure, mm Hg  133.5 ± 21.6  133.4 ± 20.1  133.5 ± 22.4  NS 
11  Diastolic arterial pressure, mm Hg  73.2 ± 10.9  73.3 ± 11  73.2 ± 10.9  NS 
12  Heart rate (HR), (beats per minute)  66.9 ± 11  65.9 ± 9.9  67.6 ± 11.8  NS 
13  Systemic arterial compliance (SAC), mL.m^{−2}.mmHg^{−1}  1.41 ± 0.47  1.4 ± 0.46  1.42 ± 0.47  NS 
14  Systemic vascular resistance (RESISTvao), dyne.s.cm^{−5}  197.1 ± 907  2211 ± 1078  1871 ± 766  0.016 
Doppler echocardiographic data of LV geometry and function  
15  Relative wall thickness, %  46.3 ± 10.5  47 ± 10  46.0 ± 10.9  NS 
16  LV mass, g  217.5 ± 67.8  233 ± 77.5  208.9 ± 60.4  0.029 
17  Cardiac output, L/min  5.13 ± 1.2  5.1 ± 1.16  5.15 ± 1.34  NS 
18  Left ventricle hypertrophy  97  36  61  NS 
19  Stroke volume, mL  79 ± 17.5  78.9 ± 16.9  79.1 ± 18  NS 
20  Qmean, mL/s  248.7 ± 59.3  245.56 ± 56.3  250.6 ± 61.3  NS 
21  Overall ejection fraction, %  66.5 ± 6.6  65.6 ± 6.1  66 ± 7.01  NS 
22  Paradoxical lowflow  33  13  20  NS 
23  Transaortic peak instantaneous velocity, m/s  3.79 ± 0.69  3.95 ± 0.72  3.7 ± 0.65  <0.008 
24  Valvuloarterial impedance (Zva)  4.13 ± 1  4.23 ± 1  4.05 ± 1  NS 
Six of the most frequently used supervised ML algorithms were selected to predict the optimal operation time and the expected survival period after optimizing the SAVR operation time. These algorithms are regression, CHAID, SVM, ANN, C and R trees, and GL.
The regression algorithm aims to directly describe the relationship between the input variables and outputs, mainly in mathematical functions with variables derived from the data. These methods are often very effective in predicting the correlations between specific inputs and outputs.
CHAID algorithms are built on criterion variables with two or more categories and support both descriptive and predictive analyses. The number of independent variable categories is determined by the results of chi-squared tests. The most significant independent variable appears in the first node of the resulting classification tree. When no significant association remains between the variables, node generation and segment construction stop [
SVMs are a family of related supervised learning methods used to solve regression and classification tasks. Owing to its robust theoretical underpinning, SVM has grown in popularity since its introduction a quarter-century ago [
ANN algorithms have recently gained popularity in various fields as valuable models for categorization, segmentation, pattern classification, and forecasting. ANNs are now commonly used for general function approximation in quantitative approaches owing to their self-learning, adaptability, reliability, nonlinearity, and strength in input-to-output modeling [
C and R trees are well-known statistical learning approaches that have been used in many applications owing to their model precision, adaptability to large datasets, and link to guideline decisions. These characteristics are regarded as necessary in domains such as healthcare. C and R trees begin by systematically splitting the dataset and assigning to each partition a predicted class for classification models or a numeric value for regression models [
It is possible to utilize GL in the linear case and with a nonlinear probabilistic output such as logistic regression. Although GL is theoretically simple to understand and apply, determining its variables without using computational equipment becomes increasingly difficult as the number of variables increases [
In this study, a number of existing methods were integrated to combine their strengths and to address problems that the individual methods did not solve well, such as providing guidance for decision-making and enhancing patient outcomes after valve replacement surgery. To demonstrate the effectiveness of the integrated methods, simulations were conducted to evaluate the performance of the integrated system on a specific task or set of tasks. The performance of the integrated system was also compared with that of its individual components.
The two-tailed Student’s t-test was used to examine continuous variables. Statistical analyses were performed using IBM SPSS version 25 and a
For ML model 1, six machine learning algorithms were used for the patient modeling group as listed in
ML algorithm  Correlation coefficient (R)  Number of used variables  Mean absolute error (MAE) 

SVM  0.81  24  0.45 
CHAID  0.77  8  0.58 
GL  0.64  24  0.77 
Regression  0.64  23  0.77 
C and R tree  0.58  15  0.75 
ANN  0.5  24  0.88 
The SVM algorithm used 24 variables and showed the highest correlation (
ML model 2 (
ML algorithm  Correlation coefficient (R)  Number of used variables  MAE 

SVM  0.80  24  0.67 
CHAID  0.74  3  0.95 
GL  0.7  24  1 
Regression  0.69  23  1.05 
ANN  0.63  23  1.09 
C and R tree  0.0  24  1.57 
The SVM algorithm was used to generate the optimal SAVR time (ML model 1) and to predict the survival period following SAVR surgery (ML model 2).
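The two evaluation measures reported for each algorithm, the correlation coefficient (R) and the mean absolute error (MAE), can be computed as in the following sketch. It assumes that R is the Pearson correlation between actual and predicted values, which the text does not state explicitly; the function names and the toy values are hypothetical.

```python
import math


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between actual and predicted values."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov / math.sqrt(var_a * var_p)


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Toy SAVR times in years (hypothetical values).
actual = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 1.5, 3.5, 3.5]
print(round(pearson_r(actual, predicted), 3))  # 0.894
print(mean_absolute_error(actual, predicted))  # 0.5
```

A higher R together with a lower MAE indicates closer agreement between predicted and actual times, which is the basis on which the algorithms are ranked in the tables.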
After applying the implementation group dataset to ML model 1, the SVM algorithm predicted new SAVR times, and the average predicted SAVR time was earlier than the average actual SAVR time (actual: 1.5 ± 1.46 years; predicted: 1.2 ± 0.76 years). The algorithm predicted a new SAVR time for 89 percent of patients, which means that 58 of the 65 patients would have a revised SAVR schedule.
To confirm that the revised SAVR time predicted by ML model 1 was the best time for surgery, ML model 2 was used to estimate the survival period after the revised intervention time for these 58 patients.
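The 89 percent figure follows from 58 revised schedules among the 65 patients of the implementation group. The sketch below reproduces that arithmetic and shows one possible way to count revised schedules; it assumes that a schedule counts as updated when the predicted time differs from the actual time by more than a tolerance, a criterion that is not spelled out in the text. The function name and toy values are hypothetical.

```python
def impact_ratio(actual_times, predicted_times, tolerance=0.0):
    """Share of patients whose predicted SAVR time differs from the actual time."""
    updated = sum(
        1 for a, p in zip(actual_times, predicted_times) if abs(a - p) > tolerance
    )
    return updated / len(actual_times)


# Toy SAVR times in years (hypothetical): two of four schedules change.
actual_times = [1.5, 2.0, 0.5, 3.0]
predicted_times = [1.0, 2.0, 0.5, 1.5]
print(impact_ratio(actual_times, predicted_times))  # 0.5

# Arithmetic behind the reported figure: 58 revised schedules out of 65 patients.
print(round(58 / 65 * 100))  # 89
```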
As shown in
Overall model number  ML algorithm in Model 1  ML algorithm in Model 2  Survival period enhancement ratio (= number of patients with predicted time of death later than the actual time of death / total number of patients)  Average enhancement time (years)  Is the enhancement ratio of Overall model 1 significantly different from the other overall models? (p-value)

1  SVM  SVM  93%  3.15  
2  SVM  CHAID  88%  2.51  No (0.131) 
3  SVM  GL  69%  2.21  Yes (<0.001) 
4  SVM  Regression  69%  2.23  Yes (<0.001) 
5  SVM  ANN  69%  1.95  Yes (<0.001) 
6  CHAID  SVM  90%  2.67  No (0.349) 
7  CHAID  CHAID  77%  2.51  Yes (<0.001) 
8  CHAID  GL  69%  2.21  Yes (<0.001) 
9  CHAID  Regression  69%  2.25  Yes (<0.001) 
10  CHAID  ANN  69%  1.95  Yes (<0.001) 
11  GL  SVM  69%  2.81  Yes (<0.001) 
12  GL  CHAID  77%  2.51  Yes (<0.001) 
13  GL  GL  69%  2.25  Yes (<0.001) 
14  GL  Regression  69%  2.22  Yes (<0.001) 
15  GL  ANN  69%  1.89  Yes (<0.001) 
16  Regression  SVM  77%  2.21  Yes (<0.001) 
17  Regression  CHAID  69%  2.23  Yes (<0.001) 
18  Regression  GL  77%  2.62  Yes (<0.001) 
19  Regression  Regression  69%  2.29  Yes (<0.001) 
20  Regression  ANN  69%  1.95  Yes (<0.001) 
21  ANN  SVM  69%  2.02  Yes (<0.001) 
22  ANN  CHAID  64%  2.32  Yes (<0.001) 
23  ANN  GL  69%  2.32  Yes (<0.001) 
24  ANN  Regression  69%  2.33  Yes (<0.001) 
25  ANN  ANN  69%  2.01  Yes (<0.001) 
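The 25 overall models in the table are the pairings of five algorithms for model 1 with the same five for model 2, and the enhancement ratio is the share of patients whose predicted survival exceeds their actual survival. The following sketch illustrates both. It assumes that the average enhancement time is taken over the patients whose survival improved, which the table does not state; the function name and the toy survival values are hypothetical.

```python
from itertools import product

# The five algorithms that appear in the table of overall models.
algorithms = ["SVM", "CHAID", "GL", "Regression", "ANN"]
overall_models = list(product(algorithms, repeat=2))  # (model 1, model 2) pairs
print(len(overall_models))  # 25
print(overall_models[0])    # ('SVM', 'SVM')


def survival_enhancement(actual_survival, predicted_survival):
    """Return (enhancement ratio, average gain in years among improved patients)."""
    gains = [p - a for a, p in zip(actual_survival, predicted_survival)]
    improved = [g for g in gains if g > 0]
    ratio = len(improved) / len(gains)
    average_gain = sum(improved) / len(improved) if improved else 0.0
    return ratio, average_gain


# Toy survival periods in years (hypothetical values).
actual_survival = [0.5, 0.75, 0.25, 1.0]
predicted_survival = [3.5, 2.75, 0.125, 5.0]
print(survival_enhancement(actual_survival, predicted_survival))  # (0.75, 3.0)
```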
According to recent research, ML algorithms may effectively predict mortality after cardiac procedures such as transcatheter aortic valve replacement (TAVR). In 2018, Hernandez-Suarez et al. predicted in-hospital deaths following a TAVR operation with an area under the curve (AUC) value of 0.92 [
This study describes the ML algorithms used in the verification and validation of the enhanced survival period based on predicting the optimal SAVR operation time for moderate-to-severe asymptomatic AS patients.
By using different ML algorithms, a modeling group (training and testing) was created to evaluate the potential of different ML algorithms to predict the actual time of SAVR and actual survival period. After selecting the ML algorithm with the best correlation, the implementation group was tested to explore the effect of the newly predicted SAVR time on the enhancement of the survival period.
In the modeling group, SVM, CHAID, and GL had the strongest capacity to predict the actual SAVR operation time (ML model 1), using 24, 8, and 24 variables, respectively. The SVM algorithm is preferred over the CHAID and GL algorithms because it has the highest correlation coefficient, can be used for both linear and nonlinear ML regression, and uses all important AS predictor variables.
Moreover, despite having a high correlation coefficient, CHAID uses fewer variables, which may affect the accuracy of the resulting model when new patients are evaluated because the unused variables may carry key information that strengthens the forecasting results [
For predicting the actual survival period in the modeling group (ML model 2), among the other methods, SVM also had the highest correlation coefficient and utilized 24 variables. Therefore, it was selected as the preferred method to predict survival.
To evaluate the potential of the ML algorithms in predicting the optimal time of the SAVR operation, different combinations of the proposed ML algorithms were tested on the implementation group to predict the time of SAVR and the related enhancement in the survival period. We found that using the SVM algorithm for both models 1 and 2 yielded a significant improvement in the survival period following the newly predicted time of SAVR compared with most other combinations.
One factor that can affect the complexity of ML algorithms is the number of variables or features used in the model. More variables provide more information and increase the complexity of the model, which can improve accuracy but may also increase the risk of overfitting. Conversely, a model that uses few variables, as CHAID did here, may lose accuracy on new patients despite a high correlation coefficient.
This suggests that the SVM algorithm may be effective at predicting survival outcomes while also being able to handle a relatively large number of variables. In general, the number of variables used in the ML algorithms can impact the complexity and performance of the resulting models.
Several models have been proposed and developed using ML methods and algorithms to estimate the best timing for SAVR in asymptomatic patients with moderate-to-severe AS (ML model 1) and to forecast the survival period for the same patients following the newly predicted SAVR operation time (ML model 2). Among the tested ML algorithms, using the SVM algorithm for both models 1 and 2 yielded the most promising results in forecasting the optimal timing of SAVR. The postoperative survival time was significantly longer, as demonstrated by ML model 2. The results showed that ML algorithms have great potential to help clinicians make decisions that will help patients with complicated cases live longer.
We would like to express our sincere appreciation to Dr. Philippe Pibarot (Canada Research Chair in Valvular Heart Diseases and Head of Cardiology Research at the Institut Universitaire de Cardiologie et de Pneumologie de Québec (IUCPQ), Université Laval) for granting access to the institute’s dataset.
The authors received no specific funding for this study.
The authors declare that they have no conflicts of interest to report regarding the present study.