The cost of a highway is affected by many factors; its composition and calculation are complicated and highly ambiguous. Calculating highway cost with the traditional highway engineering estimation method is an extremely tedious task. Constructing a highway cost prediction model can forecast the value promptly and improve the accuracy of highway engineering cost estimates. This work sorts out and collects 60 sets of measured highway engineering data; establishes an expressway cost index system based on 10 factors, including main route mileage, roadbed width, roadbed earthwork, and number of bridges; and processes the data through principal component analysis (PCA) and hierarchical cluster analysis. Particle swarm optimization (PSO) is used to obtain the optimal combination of the regularization parameter and kernel parameter of the least squares support vector machine (LSSVM).

According to the traditional highway engineering estimation method, calculating highway cost is an extremely complex task. With the rapid development of mathematical modeling methods and computer technology, experts at home and abroad have studied various mathematical models and computer simulation methods for project cost forecasting. Regression analysis and big-data methods have commonly been used for the cost prediction of engineering projects.

A large number of studies apply the BP neural network to engineering cost prediction.

The particle swarm optimization (PSO) algorithm uses real numbers to search for optimal parameters. The algorithm is highly versatile, converges quickly, and escapes local optima more easily, so it has been widely used in parameter optimization. Consequently, the PSO algorithm is used here to determine the optimal parameters of the LSSVM and improve calculation accuracy.

Through preliminary research on the aforementioned algorithms, this work sorts out and collects data on existing highways, establishes a sample set, processes the samples through hierarchical cluster analysis and principal component analysis (PCA), and builds a PCA-PSO-LSSVM highway cost prediction model.

PCA is an index dimensionality reduction method based on mathematical ideas. It uses an orthogonal transformation from linear algebra to reduce a set of correlated variables to a small number of uncorrelated comprehensive variables. These new comprehensive variables carry most of the important information of the original indicators, and the complex correlation structure is simplified to achieve the dimensionality reduction of indicators. The main steps are as follows.

Step 1: Select the initial sample. Assume the sample set contains $n$ observations of $p$ indicator variables, forming the original data matrix $X=(x_{ij})_{n\times p}$.

Step 2: Standardize the original data. The formula is expressed as follows:

$$z_{ij} = \frac{x_{ij} - \bar{x}_j}{s_j}, \quad i = 1,\dots,n;\; j = 1,\dots,p$$

where $\bar{x}_j$ is the mean of the $j$th variable and $s_j$ is the standard deviation of the $j$th variable.

Step 3: Calculate the correlation coefficient matrix $R=(r_{jk})_{p\times p}$ of the standardized data.

Step 4: Obtain the eigenvalues of $R$, sorted as $\lambda_1 \ge \lambda_2 \ge \dots \ge \lambda_p$, together with their corresponding eigenvectors.

Step 5: Calculate the principal component contribution rate and cumulative contribution rate. The contribution rate of the $j$th principal component is $\lambda_j / \sum_{k=1}^{p}\lambda_k$, and the cumulative contribution rate of the first $m$ components is $\sum_{k=1}^{m}\lambda_k / \sum_{k=1}^{p}\lambda_k$; the first $m$ components whose cumulative contribution rate reaches a preset threshold are retained.
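The PCA steps above can be sketched directly with NumPy. This is an illustrative example on synthetic data (the 85% threshold and the injected correlation are assumptions, not values from the paper):

```python
# Step-by-step PCA mirroring Steps 1-5 (correlation-matrix form).
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(50, 10))                   # Step 1: 50 samples, 10 indicators
X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=50)   # inject correlation between two indicators

# Step 2: standardize each column (zero mean, unit sample standard deviation)
Z = (X - X.mean(0)) / X.std(0, ddof=1)
# Step 3: correlation coefficient matrix of the standardized data
R = (Z.T @ Z) / (len(Z) - 1)
# Step 4: eigenvalues / eigenvectors of R, sorted in descending order
lam, vec = np.linalg.eigh(R)
order = lam.argsort()[::-1]
lam, vec = lam[order], vec[:, order]
# Step 5: contribution and cumulative contribution rates
contrib = lam / lam.sum()
cum = contrib.cumsum()
m = int(np.searchsorted(cum, 0.85) + 1)         # components covering >= 85% (assumed threshold)
F = Z @ vec[:, :m]                              # principal component scores
print(m, cum[m - 1])
```

The trace of a correlation matrix equals the number of variables, so the eigenvalues always sum to 10 here, exactly as in the eigenvalue table later in the paper.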

Kennedy and Eberhart proposed PSO in 1995. The algorithm has the advantages of simplicity, easy implementation, no need for gradient information, and few parameters, and it is particularly suitable for real-number optimization problems. It also has a deep background in swarm intelligence that suits scientific research and, particularly, engineering applications.

In each iteration, the velocity and position of particle $i$ are updated as

$$v_i^{t+1} = \omega v_i^{t} + c_1 r_1 \left( p_i - x_i^{t} \right) + c_2 r_2 \left( g - x_i^{t} \right), \qquad x_i^{t+1} = x_i^{t} + v_i^{t+1}$$

where $\omega$ is the inertia weight; $c_1$ and $c_2$ are the learning factors; $r_1$ and $r_2$ are random numbers in $[0,1]$; $p_i$ is the individual best position of particle $i$; and $g$ is the global best position of the swarm.
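A minimal NumPy implementation of these standard PSO updates might look as follows; the sphere objective, bounds, and parameter values (`w=0.7`, `c1=c2=1.5`) are illustrative choices, not the paper's settings:

```python
# Minimal particle swarm optimization over a box-bounded search space.
import numpy as np

def pso(f, bounds, n_particles=20, iters=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds).T
    x = rng.uniform(lo, hi, (n_particles, len(lo)))   # positions
    v = np.zeros_like(x)                              # velocities
    pbest = x.copy()                                  # individual best positions
    pbest_f = np.array([f(p) for p in x])
    g = pbest[pbest_f.argmin()].copy()                # global best position
    for _ in range(iters):
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        fx = np.array([f(p) for p in x])
        better = fx < pbest_f
        pbest[better], pbest_f[better] = x[better], fx[better]
        g = pbest[pbest_f.argmin()].copy()
    return g, pbest_f.min()

# toy check: minimize the 2-D sphere function
best, val = pso(lambda p: (p ** 2).sum(), bounds=[(-5, 5), (-5, 5)])
print(best, val)
```

On this toy problem the swarm converges to the origin within a few dozen iterations; in the paper's setting the objective would instead be the LSSVM training error as a function of the two model parameters.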

The main principle of the mathematical model of the LSSVM regression algorithm is presented as follows. Given the training sample set $\{(x_i, y_i)\}_{i=1}^{N}$ with inputs $x_i \in \mathbb{R}^n$ and outputs $y_i \in \mathbb{R}$, the regression function is

$$y(x) = w^{T} \varphi(x) + b$$

where $\varphi(\cdot)$ is a nonlinear mapping to a high-dimensional feature space, $w$ is the weight vector, and $b$ is the bias.

Different from the standard SVM, LSSVM uses the square of the error $e_i$ as the loss and replaces the inequality constraints with equality constraints, giving the optimization problem

$$\min_{w,b,e}\; \frac{1}{2}\|w\|^{2} + \frac{\gamma}{2}\sum_{i=1}^{N} e_i^{2} \quad \text{s.t.} \quad y_i = w^{T}\varphi(x_i) + b + e_i,\; i=1,\dots,N$$

where $\gamma > 0$ is the regularization parameter and $e_i$ is the fitting error of the $i$th sample.

The Lagrangian function is established to solve the above-mentioned problem:

$$L(w,b,e,\alpha) = \frac{1}{2}\|w\|^{2} + \frac{\gamma}{2}\sum_{i=1}^{N} e_i^{2} - \sum_{i=1}^{N} \alpha_i \left( w^{T}\varphi(x_i) + b + e_i - y_i \right)$$

The optimal solution satisfies the KKT conditions: setting the partial derivatives of $L$ with respect to $w$, $b$, $e_i$, and $\alpha_i$ to zero gives $w = \sum_i \alpha_i \varphi(x_i)$, $\sum_i \alpha_i = 0$, $\alpha_i = \gamma e_i$, and the equality constraints.

Eliminating the variables $w$ and $e$ from these conditions yields the linear system

$$\begin{bmatrix} 0 & \mathbf{1}^{T} \\ \mathbf{1} & \Omega + \gamma^{-1} I \end{bmatrix} \begin{bmatrix} b \\ \alpha \end{bmatrix} = \begin{bmatrix} 0 \\ y \end{bmatrix}$$

where $\mathbf{1} = (1,\dots,1)^{T}$, $\alpha = (\alpha_1,\dots,\alpha_N)^{T}$, $y = (y_1,\dots,y_N)^{T}$, and $\Omega_{ij} = \varphi(x_i)^{T}\varphi(x_j) = K(x_i, x_j)$ is the kernel matrix.

The final decision function of LSSVM is:

$$y(x) = \sum_{i=1}^{N} \alpha_i K(x, x_i) + b$$

The kernel function adopts the Gaussian radial basis kernel function and is expressed as:

$$K(x, x_i) = \exp\left( -\frac{\|x - x_i\|^{2}}{2\sigma^{2}} \right)$$

where $\sigma$ is the kernel width.

The PSO algorithm is used to determine the optimal values of the two key parameters: the regularization parameter $\gamma$ and the kernel width $\sigma$.
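Training an LSSVM then reduces to solving the linear system above. The following NumPy sketch is a minimal illustration (the toy 1-D data and the values `gamma=100`, `sigma=0.5` are assumptions; in the paper these two parameters are supplied by PSO):

```python
# Minimal LSSVM regression with a Gaussian RBF kernel.
import numpy as np

def rbf(A, B, sigma):
    # pairwise squared distances -> Gaussian kernel matrix
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm_fit(X, y, gamma, sigma):
    n = len(y)
    # KKT system: [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = rbf(X, X, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return sol[1:], sol[0]                      # alpha, b

def lssvm_predict(Xtr, alpha, b, Xte, sigma):
    return rbf(Xte, Xtr, sigma) @ alpha + b

# quick check on a smooth 1-D function
X = np.linspace(0, 3, 40).reshape(-1, 1)
y = np.sin(2 * X[:, 0])
alpha, b = lssvm_fit(X, y, gamma=100.0, sigma=0.5)
pred = lssvm_predict(X, alpha, b, X, sigma=0.5)
print(np.abs(pred - y).max())
```

Note that the first row of the system enforces the KKT condition $\sum_i \alpha_i = 0$, so the recovered multipliers always sum to zero.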

The steps, which are based on the PCA-PSO-LSSVM model, are presented as follows:

Step 1: Sort and collect samples and perform systematic cluster and principal component analyses on the data.

Step 2: Initialize the particle swarm. The regularization parameter $\gamma$ and kernel width $\sigma$ are encoded as particle positions, and the population size, inertia weight, learning factors, and maximum number of iterations are set.

Step 3: Train the LSSVM with the parameter combination carried by each particle of each generation and compute the corresponding fitness value.

Step 4: Compare the current fitness value of each particle with its individual best and the global best, update both, and repeat Steps 3 and 4 until the termination condition (maximum number of iterations or fitness convergence) is met.

Step 5: Construct the PCA-PSO-LSSVM training model, the fitness graph, and the sample regression curve figure.

Step 6: Input the test sample and obtain the prediction result.
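Steps 2-6 can be condensed into a single sketch in which PSO searches $(\gamma, \sigma)$ for an RBF-kernel LSSVM. Everything concrete here is a stand-in: the synthetic two-feature data, the search bounds, the swarm settings, and the use of a held-out split as the fitness are illustrative assumptions, not the paper's MATLAB implementation:

```python
# Hypothetical end-to-end sketch: PSO tunes (gamma, sigma) of an LSSVM.
import numpy as np

rng = np.random.default_rng(3)

def rbf(A, B, sigma):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def lssvm(Xtr, ytr, Xte, gamma, sigma):
    n = len(ytr)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = A[1:, 0] = 1.0
    A[1:, 1:] = rbf(Xtr, Xtr, sigma) + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate(([0.0], ytr)))
    return rbf(Xte, Xtr, sigma) @ sol[1:] + sol[0]

def fitness(p, Xtr, ytr, Xva, yva):
    # fitness of a particle = validation MSE of the LSSVM it encodes
    pred = lssvm(Xtr, ytr, Xva, gamma=p[0], sigma=p[1])
    return np.mean((pred - yva) ** 2)

# toy samples: output depends smoothly on two reduced features
X = rng.uniform(-1, 1, (60, 2))
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2
Xtr, ytr, Xva, yva = X[:45], y[:45], X[45:], y[45:]

# Step 2: initialize swarm over (gamma, sigma); Steps 3-4: iterate
lo, hi = np.array([1.0, 0.05]), np.array([500.0, 3.0])
x = rng.uniform(lo, hi, (15, 2))
v = np.zeros_like(x)
pb = x.copy()
pbf = np.array([fitness(p, Xtr, ytr, Xva, yva) for p in x])
g = pb[pbf.argmin()].copy()
for _ in range(30):
    r1, r2 = rng.random(x.shape), rng.random(x.shape)
    v = 0.7 * v + 1.5 * r1 * (pb - x) + 1.5 * r2 * (g - x)
    x = np.clip(x + v, lo, hi)
    f = np.array([fitness(p, Xtr, ytr, Xva, yva) for p in x])
    better = f < pbf
    pb[better], pbf[better] = x[better], f[better]
    g = pb[pbf.argmin()].copy()

# Steps 5-6: fit with the best parameters, then predict the test samples
pred = lssvm(Xtr, ytr, Xva, gamma=g[0], sigma=g[1])
print(g, np.mean((pred - yva) ** 2))
```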

Sixty groups of highway data from different regions are sorted out and collected. The main factors that affect highway project cost, namely the 10 indicators of the index system (including main route mileage, roadbed width, roadbed earthwork, and number of bridges), are taken as model inputs, and the unit cost is taken as the output.

First, hierarchical cluster analysis is used to classify the samples so that projects with higher similarity can be selected to improve prediction accuracy. The 60 groups of highway engineering data are standardized in the SPSS software; the standardized data are shown below.

(X1–X10 denote the standardized values of the 10 cost indicators; Y is the unit cost in 10 million yuan·km⁻¹.)

| No. | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 | Y |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | –1.49427 | 2.05733 | –0.57764 | –0.33949 | 6.66921 | 1.38474 | –0.86545 | 0.388 | –1.28160 | 1.98326 | 5.115842 |
| 2 | –1.41891 | 5.60477 | 0.73310 | 3.57918 | –0.30064 | –1.13625 | –0.86724 | 0.388 | –1.10422 | –0.49582 | 11.32167 |
| 3 | –1.22782 | 2.06324 | 0.35260 | 0.24037 | 0.28477 | 1.27352 | –0.86622 | 0.388 | –1.28160 | –0.49582 | 3.72445 |
| 4 | –0.9374 | 0.29247 | 0.35261 | –0.70322 | 0.45566 | 1.01169 | –0.86549 | 0.388 | –1.28160 | –0.49582 | 3.62909 |
| 5 | –0.92144 | 0.29837 | –1.00874 | –0.70322 | –0.30336 | 0.53900 | –0.86587 | 0.388 | –1.28160 | 1.98326 | 3.89426 |
| 6 | –0.85565 | –0.44535 | –0.17051 | 0.95306 | –0.30264 | –0.60322 | 3.09739 | 0.388 | 1.37916 | –0.49582 | 8.89118 |
| 7 | –0.77088 | –0.44535 | 0.55095 | 1.18386 | –0.30246 | –0.87488 | 0.70101 | 0.388 | 1.11309 | –0.49582 | 8.224901 |
| … | … | … | … | … | … | … | … | … | … | … | … |
| 60 | 0.40435 | –0.44535 | 1.72603 | 0.71118 | –0.30267 | 0.15205 | 0.6807 | 0.388 | 0.49224 | 1.98326 | 8.78174 |

After hierarchical cluster analysis, 10 sets of data (namely samples 1, 2, 43, 15, 29, 23, 28, 27, 36, and 16) were screened out, and the remaining 50 sets of data were re-standardized, giving the data below.

| No. | X1 | X2 | X3 | X4 | X5 | X6 | X7 | X8 | X9 | X10 | Y |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | –1.3900 | 3.5456 | 0.4081 | 0.3647 | 0.8043 | 2.1664 | –0.9197 | 0.2021 | –1.4069 | –0.4321 | 3.72445 |
| 2 | –1.0321 | 0.7406 | 0.4081 | –0.8649 | 1.1534 | 1.7735 | –0.9190 | 0.2021 | –1.4069 | –0.4321 | 3.62909 |
| 3 | –1.0124 | 0.7410 | –1.0432 | –0.8649 | –0.3969 | 1.0641 | –0.9194 | 0.2021 | –1.4069 | 2.2683 | 3.89426 |
| 4 | –0.9314 | –0.4281 | –0.1499 | 1.234 | –0.3542 | –0.6502 | 2.8866 | 0.2021 | 1.2678 | –0.4321 | 8.89118 |
| 5 | –0.8269 | –0.4281 | 0.6199 | 1.5942 | –0.3951 | –1.0578 | 0.5853 | 0.2021 | 1.0003 | –0.4321 | 8.224901 |
| 6 | –0.7716 | –0.4281 | –0.1570 | 1.0171 | 0.3951 | –1.1075 | 1.2213 | 0.2021 | 0.9112 | –0.4321 | 9.680698 |
| 7 | –0.7195 | –0.4235 | –0.1750 | –1.2092 | –0.3968 | 0.3478 | –0.9184 | 0.2021 | 0.5993 | –0.4321 | 4.39933 |
| 8 | –0.6843 | –0.4281 | –0.0703 | 1.6030 | –0.3951 | –1.3109 | 1.4166 | 0.2021 | –1.4069 | –0.4321 | 6.79503 |
| 9 | –0.3762 | –0.4281 | 0.3579 | –1.1465 | 0.3969 | –0.1773 | –0.9190 | 0.2021 | –1.4069 | –0.4321 | 4.46452 |
| 10 | –0.1870 | –0.1944 | –1.1110 | –1.1634 | 0.3969 | –0.4068 | –0.9192 | 0.2021 | –1.4069 | –0.4321 | 4.07081 |
| … | … | … | … | … | … | … | … | … | … | … | … |
| 50 | 0.6215 | –0.4281 | 1.8723 | 0.9782 | –0.3955 | 0.4834 | 0.5658 | 0.2021 | 0.3763 | 2.26826 | 8.78174 |
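The screening step above can be sketched with SciPy's hierarchical clustering. The synthetic data, the Ward linkage, and the two-cluster cut are hypothetical stand-ins for the paper's SPSS workflow; the idea is simply that atypical projects end up in a small, distant cluster that is dropped:

```python
# Hypothetical sketch of outlier screening via Ward hierarchical clustering.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(2)
X = rng.normal(size=(60, 10))       # 60 projects, 10 indicators
X[:3] += 8.0                        # three deliberately atypical projects

# standardize, cluster, and cut the dendrogram into two groups
Z = (X - X.mean(0)) / X.std(0, ddof=1)
labels = fcluster(linkage(Z, method="ward"), t=2, criterion="maxclust")
sizes = np.bincount(labels)[1:]                 # cluster sizes (labels are 1-based)
minority = np.argmin(sizes) + 1
keep = np.flatnonzero(labels != minority)       # retain the main, similar cluster
print(len(keep))
```

In the paper the analogous cut removed 10 of the 60 projects before the remaining 50 were re-standardized.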

| Component | Eigenvalue | Contribution rate/% | Cumulative contribution rate/% |
|---|---|---|---|
| 1 | 2.774 | 27.744 | 27.744 |
| 2 | 1.520 | 15.197 | 42.941 |
| 3 | 1.476 | 14.756 | 57.697 |
| 4 | 1.225 | 12.250 | 69.947 |
| 5 | 0.811 | 8.106 | 78.052 |
| 6 | 0.724 | 7.243 | 85.296 |
| 7 | 0.548 | 5.475 | 90.771 |
| 8 | 0.450 | 4.501 | 95.271 |
| 9 | 0.297 | 2.970 | 98.241 |

| Indicator | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| X1 | –0.03660 | 0.24245 | 0.60285 | –0.16337 | 0.53826 | –0.15261 |
| X2 | –0.29835 | –0.14402 | –0.37438 | 0.44428 | 0.13241 | 0.00973 |
| X3 | 0.20789 | 0.37533 | 0.22249 | 0.48797 | –0.20796 | 0.52819 |
| X4 | 0.46073 | –0.02943 | –0.03134 | 0.41220 | –0.11068 | –0.26665 |
| X5 | –0.25574 | 0.18524 | 0.44497 | 0.25020 | –0.49556 | –0.43301 |
| X6 | –0.32196 | 0.18671 | –0.05762 | 0.47734 | 0.51664 | –0.16687 |
| X7 | 0.48084 | –0.24122 | –0.04715 | 0.18820 | 0.26038 | –0.27967 |
| X8 | –0.13289 | –0.50348 | 0.36945 | 0.17009 | 0.12506 | 0.52798 |
| X9 | 0.47095 | –0.03821 | 0.16807 | –0.00255 | 0.14270 | 0.04872 |
| X10 | 0.12073 | 0.62947 | –0.27820 | –0.11432 | 0.14287 | 0.22860 |

| No. | F1 | F2 | F3 | F4 | F5 | F6 | Y |
|---|---|---|---|---|---|---|---|
| 1 | –2.84115 | –0.24979 | –1.85119 | 3.301447 | –0.15993 | –0.14825 | 3.72445 |
| 2 | –2.54631 | 0.268266 | –0.36882 | 1.389823 | –0.57841 | 0.011945 | 3.62909 |
| 3 | –1.90084 | 1.007213 | –2.08352 | –0.35267 | 0.522654 | 0.649525 | 3.89426 |
| 4 | 2.943145 | –1.57143 | –0.34164 | 0.636575 | 0.190138 | –0.7438 | 8.89118 |
| 5 | 2.136404 | –0.77696 | –0.02976 | 0.49193 | –0.89906 | 0.264725 | 8.224901 |
| 6 | 1.986912 | –1.19738 | –0.19329 | –0.03775 | –0.51672 | –0.1739 | 9.680698 |
| 7 | –0.68971 | –0.32717 | –0.13401 | –0.67825 | –0.08684 | 0.743458 | 4.39933 |
| 8 | 2.515087 | –1.25283 | –0.10724 | 0.171007 | –0.58139 | –0.30968 | 6.79503 |
| 9 | –0.39249 | –0.14302 | 0.221569 | –0.70136 | –0.29185 | 1.043547 | 4.46452 |
| 10 | –1.65322 | –0.64778 | –0.4021 | –1.45658 | –0.25661 | 0.186208 | 4.07081 |
| … | … | … | … | … | … | … | … |
| 50 | 1.586654 | 2.078496 | 0.197199 | 1.037475 | 0.776276 | 1.204973 | 8.78174 |

The PCA-PSO-LSSVM prediction model is established on the MATLAB 2016a simulation platform, and the initialization parameters of the prediction model (population size, learning factors, inertia weight, and maximum number of iterations) are set before training.

The regression fitting of the training samples proves that the PCA-PSO-LSSVM model has good learning ability. To verify whether the model also has excellent generalization ability, the prediction is performed by inputting 10 sets of test sample data and comparing the results with those of the unoptimized LSSVM model and a BP neural network model.


Highway cost is given in 10 million yuan·km⁻¹.

| Actual value | PCA-PSO-LSSVM predicted | Relative error/% | LSSVM predicted | Relative error/% | BP predicted | Relative error/% |
|---|---|---|---|---|---|---|
| 6.49211 | 6.21329 | 4.2947 | 6.41092 | 1.2506 | 6.61376 | 1.8738 |
| 4.06781 | 4.11126 | 1.0681 | 4.19473 | 3.1202 | 3.65821 | 10.0693 |
| 8.260201 | 8.21326 | 0.5682 | 8.26570 | 0.0665 | 8.34940 | 1.0798 |
| 4.89237 | 4.89725 | 0.0998 | 4.74419 | 3.0287 | 4.83629 | 1.1463 |
| 6.79466 | 6.79510 | 0.0065 | 8.03372 | 18.2358 | 7.07926 | 4.1886 |
| 3.62897 | 3.63017 | 0.0331 | 3.58667 | 1.1657 | 4.48618 | 23.6212 |
| 3.70859 | 3.70905 | 0.0123 | 3.72601 | 0.4698 | 3.11486 | 16.0097 |
| 3.54613 | 3.55405 | 0.2232 | 3.45936 | 2.4470 | 3.55175 | 0.1585 |
| 5.3321 | 5.32938 | 0.0511 | 4.55616 | 14.5522 | 3.97144 | 25.5182 |
| 8.78174 | 8.64644 | 1.5407 | 8.55743 | 2.5543 | 8.62063 | 1.8346 |

| Predictive model | Unit cost MRE | Unit cost RMSE |
|---|---|---|
| BP neural network | 8.55% | 56.92% |
| LSSVM model | 4.69% | 47.35% |
| PCA-PSO-LSSVM model | 0.79% | 10.01% |
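The error measures for the PCA-PSO-LSSVM model can be recomputed from the test-sample table. MRE is the mean of the per-sample relative errors; the reported "RMSE 10.01%" matches the root mean square of the absolute errors, i.e., 0.1001 in units of 10 million yuan·km⁻¹, so it is computed that way here:

```python
# Recompute MRE and RMSE from the PCA-PSO-LSSVM test predictions above.
import numpy as np

actual = np.array([6.49211, 4.06781, 8.260201, 4.89237, 6.79466,
                   3.62897, 3.70859, 3.54613, 5.3321, 8.78174])
pred = np.array([6.21329, 4.11126, 8.21326, 4.89725, 6.79510,
                 3.63017, 3.70905, 3.55405, 5.32938, 8.64644])

mre = np.mean(np.abs(pred - actual) / actual)       # mean relative error
rmse = np.sqrt(np.mean((pred - actual) ** 2))       # RMSE of absolute errors
print(round(mre * 100, 2), round(rmse, 4))          # 0.79 and 0.1001
```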

Based on the principal component analysis method, a least squares support vector machine prediction model is established and combined with the PSO algorithm to optimize the regularization parameter and the kernel parameter.

Through the predictive analysis of highway engineering, the PCA-PSO-LSSVM model achieves a mean relative error of 0.79% and a root mean square error of 10.01%. Compared with the BP neural network and the unoptimized LSSVM model, the PCA-PSO-LSSVM model has better generalization ability and prediction accuracy.