This paper presents a novel approach that uses supervised learning with a shallow neural network to increase the efficiency of the finite element analysis of holes under biaxial load. With this approach, the number of elements in the finite element analysis can be reduced while maintaining good accuracy. The neural network is used to predict the maximum stress for holes of different configurations: holes in a finite-width plate (2D), multiple holes (2D), staggered holes (2D), and holes in an infinite plate (3D). The predictions are based on the respective coarse meshes with only 2 elements along the quarter-hole perimeter. The results show that the prediction errors are under 5% for all the listed hole configurations. In contrast, conventional FEM with the same coarse meshes has errors above 20%. To achieve similar accuracy, conventional FEM would require a finer mesh with at least 6 elements along the quarter-hole perimeter. Furthermore, this setup is also effective in predicting the maximum stress for the 3D problem. With the aid of supervised learning, the FEM analysis of a coarse mesh with only 2 elements along the quarter-hole perimeter can attain prediction errors of less than 2%. For the same coarse mesh, conventional FEM has errors above 35%. To achieve similar accuracy for the 3D problem, conventional FEM would require a finer mesh with more than 8 elements along the quarter-hole perimeter. These results show that supervised learning has great potential to enhance the efficiency of finite element analysis, using fewer elements while attaining satisfactory results.

The use of finite element analysis (FEA) to solve engineering problems is popular in many industries due to its ability to represent complex geometry and capture local stress concentration effects. Furthermore, with increasing computational power and decreasing costs of FEA software, the use of FEA has grown significantly over the years. Yet, modeling a large segment of a component can still incur high computational costs. To keep the computational costs practical, coarser meshes and linear elements are often used. As a result, the accuracy of the model is poor and insufficient for fatigue analysis. To achieve the accuracy needed, sub-modeling of the FEA model is performed in high-stress areas. Sub-modeling is the process of creating a solid model of the local geometry at the region of interest within the global model. Subsequently, the local solid model must undergo mesh refinement together with a transfer of boundary conditions from the global model. Unfortunately, sub-modeling is usually performed manually. It is therefore often a labor-intensive process, especially when the part has many features that require many sub-models. As a result, performing complex 3D design iterations with local geometry changes or topology optimizations often requires significant time and resources. Moreover, even with sub-modeling, the 3D elements chosen are often linear due to computational limitations. Hence, a huge amount of effort and resources could be saved if machine learning techniques were applied to increase the accuracy of the finite element analysis of coarse 3D meshes and reduce the need for sub-modeling.

Machine learning is the use of models that learn from data and make assessments without being explicitly programmed. The goal is to create a trained neural network model based on the pattern of the training data that can make predictions on different sets of data with similar patterns. Improvements in machine learning techniques over the years have made it an effective tool across many different industries such as finance, social research, and medicine. These improvements can be categorized into two areas, software and hardware. On the software side, for instance, the Scaled Conjugate Gradient (SCG) algorithm is a more robust variant of the original conjugate gradient (CG) method. It is less computationally complex and faster. SCG is used as a backpropagation algorithm to minimize errors in the neural network. Moller [

As for hardware advancements, neural processing units (NPUs) are increasingly adopted for machine learning tasks that are traditionally performed by central processing units (CPUs) or graphics processing units (GPUs). NPUs are dedicated processing units optimized for power and area efficiency in matrix math. For instance, the 700 MHz NPU used by Google, known as a Tensor Processing Unit (TPU), has a matrix multiplication unit with over 65,000 arithmetic logic units and can process 92 trillion 8-bit operations per second [

Hashash et al. [

The main objective of this paper is to provide an optimal approach for using supervised learning to optimize the FEA of holes under biaxial load in a diverse set of problems. This study also evaluates the extent to which supervised learning reduces the errors. In this paper, the Scaled Conjugate Gradient (SCG), Levenberg-Marquardt (LM), and Bayesian Regularization (BR) algorithms are used to train the neural network (NN). In addition, pure linear and tangent sigmoid transfer functions are evaluated. The nodal displacement solutions at the 6 nodes nearest to the quarter-hole edge are used as the training parameters. The input data set used to train the NN consists of a small set of 20 different coarse meshes of a hole in an infinite-width plate. Each coarse mesh has only 2 elements along the quarter-hole perimeter. The analytical solution of a hole under biaxial load is used as the output target data. Subsequently, the NN is used to predict the maximum stress of holes in a more diverse range of problems, such as holes in a finite-width plate, multiple holes, staggered holes, and holes in a 3D plate.

An artificial neural network consists of neurons, weights, and biases. Using an iterative minimization procedure based on the different back-propagation techniques, the weights are adjusted to reduce the error [

Many studies have been done to optimize the weight initializations to curb problems of vanishing gradients and exploding gradients [

Weight initialization will instead be focused on reducing training time. The Nguyen-Widrow algorithm will be used to initialize the weights and biases for all three backpropagation methods. This eliminates most of the weight adjustments, so only small adjustments are needed during training. The Nguyen-Widrow algorithm has been demonstrated to accelerate the training process and improve testing accuracy [
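As an illustration of the initialization step, the following Python sketch implements the classic Nguyen-Widrow scheme for a single hidden layer. It is a minimal sketch and not the MATLAB routine used in this study: the function name `nguyen_widrow_init`, the input count of 12, and the assumption that the inputs are scaled to [-1, 1] are illustrative choices that do not come from the paper.

```python
import numpy as np

def nguyen_widrow_init(n_inputs, n_hidden, rng):
    """Classic Nguyen-Widrow initialization for one hidden layer.

    Assumes inputs are scaled to [-1, 1]. Returns (W, b) with
    W of shape (n_hidden, n_inputs) and b of shape (n_hidden,).
    """
    # Scale factor that spreads the active regions of the neurons
    beta = 0.7 * n_hidden ** (1.0 / n_inputs)
    # Random directions for each neuron's weight vector
    W = rng.uniform(-0.5, 0.5, size=(n_hidden, n_inputs))
    # Rescale every row so that its Euclidean norm equals beta
    W *= beta / np.linalg.norm(W, axis=1, keepdims=True)
    # Biases are drawn uniformly from [-beta, beta]
    b = rng.uniform(-beta, beta, size=n_hidden)
    return W, b

# Hypothetical sizes: 12 inputs (assumed) and 10 hidden neurons
n_inputs, n_hidden = 12, 10
beta = 0.7 * n_hidden ** (1.0 / n_inputs)
W, b = nguyen_widrow_init(n_inputs, n_hidden, np.random.default_rng(0))
```

Because every weight vector starts with the same magnitude and the biases are spread over a matching range, the hidden neurons begin with their active regions distributed across the input space, which is the property credited with shortening training.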

The following three sections present the three different back-propagation algorithms used in the study. These three backpropagation methods were chosen as they are the most common methods used in MATLAB machine learning.

The Conjugate Gradient (CG) method is an iterative algorithm for minimization problems. It can be regarded as lying between the gradient descent algorithm and Newton’s method. Gradient descent requires only the first-order derivative (the gradient) to find the minimum of the cost function, but it converges slowly because subsequent steps often contradict previous steps. Newton’s method, on the other hand, requires the second-order derivative, which has a high computational cost. The CG method accelerates the convergence of steepest descent by searching along conjugate directions, such that subsequent steps never contradict previous steps. It also has a lower computational cost than Newton’s method. The Scaled Conjugate Gradient (SCG) method, which is faster and more robust, will be used [

Consider a quadratic function:

where

The algorithm is started by choosing the initial weight vector

The weight vector and residual are then updated:

The direction vector of the next step,

such that

The CG method will fail and converge to a nonstationary point if matrix

where
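The conjugate direction iteration outlined above can be sketched in Python for a small quadratic. This is the plain linear CG method with a Fletcher-Reeves update, not SCG itself. The symbols `A`, `b`, and `w`, as well as the 2×2 example, are assumptions chosen for illustration, and `A` is assumed to be symmetric positive definite.

```python
import numpy as np

def conjugate_gradient(A, b, w0, tol=1e-10, max_iter=None):
    """Minimize f(w) = 0.5 * w^T A w - b^T w for symmetric positive-definite A."""
    w = np.asarray(w0, dtype=float).copy()
    r = b - A @ w                 # residual, equal to the negative gradient
    p = r.copy()                  # first direction is the steepest descent direction
    rs_old = r @ r
    max_iter = len(b) if max_iter is None else max_iter
    n_iter = 0
    for _ in range(max_iter):
        if np.sqrt(rs_old) < tol:
            break
        Ap = A @ p
        alpha = rs_old / (p @ Ap)     # exact step length along p
        w = w + alpha * p             # update the weight vector
        r = r - alpha * Ap            # update the residual
        rs_new = r @ r
        beta = rs_new / rs_old        # Fletcher-Reeves coefficient
        p = r + beta * p              # next direction, conjugate to the previous ones
        rs_old = rs_new
        n_iter += 1
    return w, n_iter

A = np.array([[4.0, 1.0], [1.0, 3.0]])
b = np.array([1.0, 2.0])
w_min, n_iter = conjugate_gradient(A, b, np.zeros(2))
```

For an n-dimensional quadratic with a positive-definite matrix, the iteration reaches the minimum in at most n steps in exact arithmetic, so the 2×2 example needs no more than two. When the matrix is not positive definite, the step length `alpha` can become negative or undefined, which is the failure case that SCG is designed to handle.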

Similar to SCG, the Levenberg-Marquardt (LM) method is an iterative process such that the final weight is obtained through iterations with step

Consider the cost function:

where

where

where

The minimum point of the cost function has a gradient of 0. By taking the derivative of

where

To reduce overfitting, this method uses regularization parameters

When

The application of Bayesian Regularization (BR) within the framework of the LM algorithm will be used [
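The LM iteration that BR builds on can be sketched as follows. This is a minimal illustration of a damped Gauss-Newton step with a simple rule for adapting the damping parameter. It is not the MATLAB implementation used in this study, and it omits the Bayesian re-estimation of the regularization parameters. The function name `lm_fit`, the damping schedule, and the toy exponential model are all assumptions.

```python
import numpy as np

def lm_fit(residual_fn, jacobian_fn, w0, mu=1e-2, mu_up=10.0, mu_down=0.1,
           max_iter=100, tol=1e-12, mu_max=1e10):
    """Minimize 0.5 * ||e(w)||^2 with a basic Levenberg-Marquardt loop."""
    w = np.asarray(w0, dtype=float).copy()
    e = residual_fn(w)
    cost = 0.5 * (e @ e)
    for _ in range(max_iter):
        J = jacobian_fn(w)
        g = J.T @ e                      # gradient of the cost
        if np.linalg.norm(g) < tol:
            break
        H = J.T @ J                      # Gauss-Newton approximation of the Hessian
        while mu < mu_max:
            # Damped step: small mu behaves like Gauss-Newton,
            # large mu behaves like gradient descent with a short step
            step = np.linalg.solve(H + mu * np.eye(len(w)), -g)
            e_new = residual_fn(w + step)
            cost_new = 0.5 * (e_new @ e_new)
            if cost_new < cost:          # accept the step and relax the damping
                w, e, cost = w + step, e_new, cost_new
                mu *= mu_down
                break
            mu *= mu_up                  # reject the step and increase the damping
        else:
            break                        # no acceptable step was found
    return w, cost

# Toy problem (assumed): recover a = 2, b = -1 in y = a * exp(b * x)
x = np.linspace(0.0, 1.0, 20)
y = 2.0 * np.exp(-1.0 * x)

def residual(w):
    return w[0] * np.exp(w[1] * x) - y

def jacobian(w):
    return np.column_stack([np.exp(w[1] * x), w[0] * x * np.exp(w[1] * x)])

w_fit, final_cost = lm_fit(residual, jacobian, [1.0, 0.0])
```

In a Bayesian-regularized variant, the cost would also include a penalty on the squared weights, which adds a multiple of the identity to `H` and a multiple of `w` to `g`; the loop structure stays the same.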

This section will first present the analytical stresses and displacement of a hole in an infinite-width plate under uniaxial load. Next, it will introduce the approach to how the analytical solution can be integrated with supervised learning for both uniaxial load and biaxial load.
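For orientation, the classical Kirsch solution gives the tangential stress on the edge of a circular hole in an infinite plate, and the biaxial case follows by superposing two uniaxial solutions. The sketch below assumes that this classical result is the analytical solution meant here. The function name and the angle convention (theta measured from the x-axis) are illustrative.

```python
import numpy as np

def hoop_stress_at_hole_edge(theta, sigma_x, sigma_y=0.0):
    """Tangential stress on the edge (r = a) of a circular hole in an
    infinite plate under remote stresses sigma_x and sigma_y.

    theta is measured from the x-axis, in radians.
    """
    # Uniaxial Kirsch solution for the load along x
    from_x_load = sigma_x * (1.0 - 2.0 * np.cos(2.0 * theta))
    # Same solution rotated by 90 degrees for the load along y
    from_y_load = sigma_y * (1.0 + 2.0 * np.cos(2.0 * theta))
    # Linear response, so the two load cases superpose
    return from_x_load + from_y_load

# Stress concentration at theta = 90 degrees for three load ratios
kt_uniaxial = hoop_stress_at_hole_edge(np.pi / 2, 1.0, 0.0)
kt_equal_biaxial = hoop_stress_at_hole_edge(np.pi / 2, 1.0, 1.0)
kt_pure_shear = hoop_stress_at_hole_edge(np.pi / 2, 1.0, -1.0)
```

With a unit load along x, the three cases give stress concentration factors of 3, 2, and 4, which shows how strongly the maximum stress depends on the biaxial load ratio.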

In

where

The input data set used to train the NN will consist of a set of 20 coarse meshes of a hole in an infinite-width plate. An analytical solution of a hole under biaxial load will be used as the output target data. A coarse mesh of the quarter-hole is shown in

The material properties are based on aluminum, where

where

Total sampling size* | Parameters | Min | Max | Interval |
---|---|---|---|---|

40 | 1.5 | 3 | 0.5 | |

0.6 | 1.4 | 0.2 |

Note: *The total sampling size is obtained after applying a linear scale of 70% and 150% as shown in

The neural network consists of a single hidden layer as shown in

Ten neurons will be used for the SCG and LM methods, while two neurons will be used for the BR method, since the BR method requires fewer neurons to perform effectively, as demonstrated by Kayri [
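The single-hidden-layer architecture can be sketched as a forward pass in Python. This is only an illustration under stated assumptions: the input size of 12, the random placeholder weights, and the linear output layer are not taken from the paper. The tangent sigmoid transfer function is mathematically equivalent to the hyperbolic tangent.

```python
import numpy as np

def tansig(x):
    """Tangent sigmoid transfer function (equivalent to tanh)."""
    return np.tanh(x)

def purelin(x):
    """Pure linear transfer function."""
    return x

def forward(x, W1, b1, W2, b2, hidden_fn=tansig):
    """Forward pass of a network with one hidden layer and a linear output."""
    hidden = hidden_fn(W1 @ x + b1)     # hidden layer activations
    return purelin(W2 @ hidden + b2)    # predicted output (assumed linear)

# Hypothetical sizes: 12 displacement inputs, 10 hidden neurons, 1 output
rng = np.random.default_rng(0)
n_inputs, n_hidden = 12, 10
W1 = rng.normal(size=(n_hidden, n_inputs))
b1 = rng.normal(size=n_hidden)
W2 = rng.normal(size=(1, n_hidden))
b2 = rng.normal(size=1)
x = rng.uniform(-1.0, 1.0, size=n_inputs)

y_tansig = forward(x, W1, b1, W2, b2, hidden_fn=tansig)
y_purelin = forward(x, W1, b1, W2, b2, hidden_fn=purelin)
```

With the pure linear transfer function in the hidden layer, the whole network collapses to a single linear map of the inputs, whereas the tangent sigmoid allows a nonlinear response.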

where

where

Since the deformation is small and the response is assumed to be linear, the principle of superposition can be applied to the uniaxial load solution [

where

By applying superposition on plane stress displacements under uniaxial load given by

where

Aside from plane stress conditions, this paper also evaluates the effectiveness of a 2D coarse mesh training data set generated under general plane strain conditions. Under general plane strain conditions, the strain in the z-direction

Using

Notice that

To obtain

The neural network is trained with the same set of 2D coarse meshes used in the uniaxial load case.
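The total sampling sizes for the biaxial training sets can be checked by enumerating the parameter grid. The sketch below assumes that the two parameters with ranges 1.5 to 3 and 0.6 to 1.4 define the 20 mesh variations, that the parameter ranging from -1 to 1 is a load ratio, and that the 70% and 150% linear scales act as two additional load variations. The variable names are hypothetical, since the parameter symbols are not reproduced here.

```python
from itertools import product

def grid(lo, hi, step):
    """Evenly spaced values from lo to hi inclusive."""
    n = round((hi - lo) / step) + 1
    return [round(lo + i * step, 10) for i in range(n)]

mesh_param_1 = grid(1.5, 3.0, 0.5)   # 4 values
mesh_param_2 = grid(0.6, 1.4, 0.2)   # 5 values
scales = (0.7, 1.5)                  # the two linear scales

def sampling_size(load_interval):
    """Number of training samples for a given load ratio interval."""
    load_ratios = grid(-1.0, 1.0, load_interval)
    combos = product(mesh_param_1, mesh_param_2, load_ratios, scales)
    return sum(1 for _ in combos)

n_meshes = len(mesh_param_1) * len(mesh_param_2)
sizes = {step: sampling_size(step) for step in (0.5, 0.2, 0.1)}
```

Under these assumptions the grid reproduces the three sampling sizes of 200, 440, and 840, corresponding to load ratio intervals of 0.5, 0.2, and 0.1.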

Total sampling size* | Parameters | Min | Max | Interval |
---|---|---|---|---|

1.5 | 3 | 0.5 | ||

0.6 | 1.4 | 0.2 | ||

200 | −1 | 1 | 0.5 | |

440 | −1 | 1 | 0.2 | |

840 | −1 | 1 | 0.1 |

Note: *The total sampling size is obtained after applying a linear scale of 70% and 150% as shown in

The target data consists of analytical solutions of

where

When substituting

Finally, by using

In this study, MATLAB’s library [

For predictions on 2D meshes, the plane stress condition is used to generate the 2D coarse mesh training data set. For predictions on 3D meshes, the general plane strain condition is used to generate the 2D coarse mesh training data set [

The accuracy of the test result based on the displacement at all 6 nodes as training parameters is shown in

Model | RMSE (%), uniaxial approach*, tangent sigmoid | RMSE (%), uniaxial approach*, pure linear | RMSE (%), biaxial approach**, tangent sigmoid | RMSE (%), biaxial approach**, pure linear |
---|---|---|---|---|
SCG | 45.87 | 34.52 | 15.54 | 13.08 |
BR | 2.32 | 60.98 | 1.65 | 19.23 |
LM | 6.26 | 2.80 | 2.10 | 1.52 |

Note: *Based on 20 mesh variations and 2 load variations, training sample size of 40; **Based on 20 mesh variations and 10 load variations, training sample size of 200; Maximum error with 300 trials.

To further improve the result, the study evaluates the displacement field on the element attached to the maximum stress location at Node 1. In this element, Node 2 and Node 5 have non-zero

Model | RMSE (%), uniaxial approach*, tangent sigmoid | RMSE (%), uniaxial approach*, pure linear | RMSE (%), biaxial approach**, tangent sigmoid | RMSE (%), biaxial approach**, pure linear |
---|---|---|---|---|
SCG | 22.56 | 19.96 | 14.55 | 7.93 |
BR | 2.52 | 51.19 | 2.16 | 13.76 |
LM | 9.47 | 5.42 | 2.28 | 2.23 |

Note: *Based on 20 mesh variations and 2 load variations, training sample size of 40; **Based on 20 mesh variations and 10 load variations, training sample size of 200; Maximum error with 300 trials.

The neural network model trained on the hole in an infinite-width plate problem will be used to predict the maximum stresses for a hole in a finite-width plate. In the infinite-width model, the side wall is far from the hole, so there is no interaction between the stress concentration of the hole and the free edge on the side. With the finite-width geometry, however, the hole is closer to the free edge. As a result of the interaction between the hole and the free edge, the stress concentration of the hole increases. The finite-width model is analyzed to evaluate whether supervised learning can support the finite element method in calculating stress accurately for an alternate geometry with a coarse mesh. Tests are conducted for a range of

The mesh is constructed from linear elements and has only 2 elements at the quarter-hole perimeter. The result is compared against Roark’s solution [

where

The result in

Model | Absolute error (%), uniaxial approach*, tangent sigmoid | Absolute error (%), uniaxial approach*, pure linear | Absolute error (%), biaxial approach**, tangent sigmoid | Absolute error (%), biaxial approach**, pure linear |
---|---|---|---|---|
SCG | 87.40 | 77.51 | 81.51 | 19.80 |
BR | 34.97 | 18.39 | 18.27 | 13.62 |
LM | 74.79 | 21.40 | 112.30 | 16.45 |

Note: *Based on 20 mesh variations and 2 load variations, training sample size of 40; **Based on 20 mesh variations and 10 load variations, training sample size of 200; Maximum error with 300 trials, conducted for a range of

To evaluate if the trained model can be further improved by focusing the training on just the element with the highest stress, the training is based on

Model | Absolute error (%), uniaxial approach*, tangent sigmoid | Absolute error (%), uniaxial approach*, pure linear | Absolute error (%), biaxial approach**, tangent sigmoid | Absolute error (%), biaxial approach**, pure linear |
---|---|---|---|---|
SCG | 66.17 | 27.40 | 57.70 | 6.39 |
BR | 55.81 | 13.96 | 4.26 | 3.98 |
LM | 86.82 | 13.02 | 325.27 | 3.55 |

Note: *Based on 20 mesh variations and 2 load variations, training sample size of 40; **Based on 20 mesh variations and 10 load variations, training sample size of 200; Maximum error with 300 trials, conducted for a range of

In

Method | Elements along quarter-hole perimeter | Kt | Error | Kt | Error | Kt | Error | Kt | Error |
---|---|---|---|---|---|---|---|---|---|
Roark | | 3.460 | | 3.230 | | 3.135 | | 3.088 | |
Biaxial approach* | 2 | 3.340 | −3.47% | 3.115 | −3.55% | 3.112 | −0.73% | 3.031 | −0.85% |
FEM | 2 | 2.552 | −26.2% | 2.420 | −25.1% | 2.404 | −23.3% | 2.377 | −23.0% |
FEM | 4 | 3.281 | −5.2% | 3.072 | −4.9% | 2.984 | −4.8% | 2.940 | −4.8% |
FEM | 6 | 3.388 | −2.08% | 3.188 | −1.30% | 3.099 | −1.15% | 3.057 | −1.85% |

Note: *Using

Both the LM and BR methods offer comparable results. Since the LM method has lower computational costs, it is the preferred method. The LM method will thus be applied to meshes with very different configurations, such as multiple holes and staggered holes in finite-width plates, for further tests. The biaxial approach will be used since it offers lower error than the uniaxial approach and has the advantage of being sensitive to biaxial loads.

The holes are separated by distance,

Min | Max | Interval | |
---|---|---|---|

2 | 4 | 1 |

Prediction error (%) under Biaxial approach* | |||
---|---|---|---|

Sampling size of 200 | −13.47 | −4.88 | −3.02 |

Sampling size of 440 | 0.48 | 3.01 | 2.07 |

Sampling size of 840 | 0.37 | 2.89 | 1.97 |

Note: *Using

To ensure that the NN is also effective in predicting similar problems, the prediction error of a quarter finite plate with 3 holes as shown in

Prediction error (%) under Biaxial approach* | |||
---|---|---|---|

Sampling size of 440 | −0.74 | 3.27 | 2.14 |

Note: *Using

A sensitivity study will be conducted for the case of 5 holes since it has higher prediction errors.

Method | Elements along quarter-hole perimeter | Kt | Error | Kt | Error | Kt | Error |
---|---|---|---|---|---|---|---|
FEM (quadratic element, Plane 183) | 16 | 2.512 | | 2.693 | | 2.827 | |
Biaxial approach* (linear element, Plane 182) | 2 | 2.493 | −0.74% | 2.781 | 3.27% | 2.887 | 2.14% |
FEM (linear element, Plane 182) | 2 | 1.940 | −22.77% | 2.132 | −20.85% | 2.227 | −21.25% |
FEM (linear element, Plane 182) | 4 | 2.344 | −6.67% | 2.528 | −6.15% | 2.661 | −5.90% |
FEM (linear element, Plane 182) | 6 | 2.469 | −1.68% | 2.664 | −1.11% | 2.796 | −1.11% |

Note: *Using

There is a stagger of

Min | Max | Interval | |
---|---|---|---|

1 | 3 | 1 |

Since Node 5 is in the vicinity of the other hole, its stresses may be affected by the stress field of the other hole. This may increase discrepancies from that of a single hole.

Prediction error (%) under Biaxial approach* | |||
---|---|---|---|

8.12 | 6.81 | 4.38 | |

4.61 | 4.07 | 3.11 |

Note: *Based on the training sampling size of 440, Levenberg-Marquardt algorithm, pure linear function, maximum absolute error over 300 trials.

Similar to

Prediction error (%) under Biaxial approach* | |||
---|---|---|---|

Sampling size of 200 | 31.95 | 16.96 | 5.35 |

Sampling size of 440 | 4.61 | 4.07 | 3.11 |

Sampling size of 840 | 4.55 | 3.97 | 2.98 |

Note: *Using

Method | Elements along quarter-hole perimeter | Kt | Error | Kt | Error | Kt | Error |
---|---|---|---|---|---|---|---|
FEM (quadratic element, Plane 183) | 16 | 3.465 | | 2.988 | | 2.878 | |
Biaxial approach* (linear element, Plane 182) | 2 | 3.625 | 4.61% | 3.110 | 4.07% | 2.968 | 3.11% |
FEM (linear element, Plane 182) | 2 | 2.773 | −19.98% | 2.386 | −20.14% | 2.289 | −20.47% |
FEM (linear element, Plane 182) | 4 | 3.394 | −2.06% | 2.857 | −4.40% | 2.739 | −4.84% |
FEM (linear element, Plane 182) | 6 | 3.512 | 1.36% | 2.988 | 0.01% | 2.864 | −0.47% |

Note: * Using

The 3D mesh model is a quarter-hole in an infinite plate that is bisected along the z-axis. The coarse mesh is constructed with the linear element Solid 185. Since this is the problem of predicting the stresses of a single hole in an infinite-width plate using a biaxial load trained NN like in

Min | Max | Interval | |
---|---|---|---|

6 | 18 | 4 |

Method | Elements along quarter-hole perimeter | Kt | Error | Kt | Error | Kt | Error | Kt | Error |
---|---|---|---|---|---|---|---|---|---|
FEM (quadratic element, Solid 186) | 12 | 3.053 | | 3.038 | | 3.033 | | 3.032 | |
Biaxial approach* (linear element, Solid 185) | 2 | 3.112 | 1.93% | 3.091 | 1.74% | 3.082 | 1.62% | 3.079 | 1.55% |
FEM (linear element, Solid 185) | 2 | 1.963 | −35.70% | 1.958 | −35.55% | 1.957 | −35.48% | 1.956 | −35.49% |
FEM (linear element, Solid 185) | 4 | 2.586 | −15.30% | 2.573 | −15.31% | 2.569 | −15.30% | 2.568 | −15.30% |
FEM (linear element, Solid 185) | 8 | 2.844 | −6.85% | 2.827 | −6.95% | 2.825 | −6.86% | 2.824 | −6.87% |
FEM (quadratic element, Solid 186) | 4 | 2.95 | −3.37% | 2.935 | −3.39% | 2.93 | −3.40% | 2.93 | −3.36% |
FEM (quadratic element, Solid 186) | 8 | 3.043 | −0.33% | 3.028 | −0.33% | 3.023 | −0.33% | 3.022 | −0.33% |

Note: *Using

The results show that, with the aid of supervised learning, an error of less than 5% can be obtained even for a coarse mesh with only 2 elements along the quarter-hole perimeter. Training is done using the analytical solution for a coarse 2D mesh of a single hole in an infinite-width plate. The prediction is successfully applied to different geometries such as holes in finite-width plates (2D), multiple holes without stagger (2D), and staggered holes (2D). A preliminary study in this paper also demonstrates that the approach can be applied to 3D models with minor adjustments.

The Levenberg-Marquardt (LM) method with a pure linear transfer function and Bayesian Regularization (BR) with a tangent sigmoid transfer function can both achieve low prediction errors for the hole in infinite-width and finite-width problems. However, LM is faster and less computationally complex than BR. Thus, LM is extensively tested in this study. The results show good prediction accuracy for the different problems.

Nevertheless, various factors can affect the prediction results. Firstly, to train neural networks to be sufficiently sensitive to biaxial loading, the interval of

For staggered holes (2D), the prediction errors are higher due to the influence of shear stress, which the neural network is not trained to model. In this case, the interaction of nearby holes may affect the prediction accuracy. The nodes chosen should preferably be farther away from the other holes to reduce the influence of the stress fields of adjacent holes. For example, for the staggered holes problem, removing the nodes in proximity to the adjacent hole from the training parameters drastically reduced the prediction errors to 4.61%. In contrast, conventional FEM with a coarse mesh of 2 elements at the quarter-hole perimeter has an error of 20.47%. To achieve similar accuracy by FEM without the aid of supervised learning, at least 6 elements at the quarter-hole perimeter would be necessary.

This paper has also demonstrated that the 3D problem consisting of a hole in an infinite-width 3D plate can be accurately predicted using supervised learning. The neural network training is performed using the 2D coarse mesh training set with generalized plane strain condition. When applied to the 3D plate that has a coarse mesh with just 2 elements at the quarter-hole perimeter, the maximum prediction error is 1.93%. In contrast, using the conventional FEM without any supervised training gives an error of 35.7%. To achieve similar accuracy by conventional FEM without supervised learning, more than 8 elements at the quarter-hole perimeter would be necessary.
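The two error figures quoted above follow directly from the tabulated stress concentration factors for the 3D plate. The short check below uses the first geometry column of that table, with the 12-element quadratic Solid 186 result as the reference; the function name `percent_error` is illustrative.

```python
def percent_error(kt, kt_ref):
    """Signed error of kt relative to the reference value, in percent."""
    return 100.0 * (kt - kt_ref) / kt_ref

kt_ref = 3.053            # FEM, quadratic Solid 186, 12 elements
kt_nn = 3.112             # biaxial approach, linear Solid 185, 2 elements
kt_fem_coarse = 1.963     # conventional FEM, linear Solid 185, 2 elements

err_nn = percent_error(kt_nn, kt_ref)
err_fem_coarse = percent_error(kt_fem_coarse, kt_ref)
print(f"{err_nn:.2f}%  {err_fem_coarse:.2f}%")
```

The sign convention matches the tables: a negative error means that the method underestimates the reference stress concentration factor.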

Hence, it is demonstrated that supervised learning can be used effectively to aid finite element analysis such that even a coarse mesh can offer satisfactory accuracy. With this approach, the need to perform many sub-modeling analyses for greater accuracy is significantly reduced when performing complex 3D design iterations with many holes.

The authors thank Prof. S.N. Atluri for guidance and encouragement over the years.

The authors received no specific funding for this study.

The authors confirm their contribution to the paper as follows: Study conception and design: W.T. Chow; analysis and interpretation of results: J.T. Lau, W.T. Chow; draft manuscript preparation: J.T. Lau. All authors reviewed the results and approved the final version of the manuscript.

Data will be available upon request.

The authors declare that they have no conflicts of interest to report regarding the present study.