In this paper, we propose an intrusion detection algorithm based on an auto-encoder and three-way decisions (AE-3WD) to address the security problem of industrial control networks. Deep learning is well suited to intrusion detection: it learns automatically from data and uses that learning to improve its dynamic classification capability. We use deep learning to raise the intrusion detection rate and reduce the false alarm rate, combining a denoising AutoEncoder with three-way decisions to improve detection accuracy. During training, the AutoEncoder extracts features from high-dimensional data by combining a coefficient penalty on the encoding layer with the reconstruction loss function. Multiple feature extractions by the AutoEncoder construct a multi-feature space, and three-way decisions then classify each behavior as intrusion or normal. Experiments are conducted on the NSL-KDD data set. The experimental results show that the proposed method extracts meaningful features and effectively improves intrusion detection performance.

Intrusion detection systems are important for preventing security threats and protecting networks from attacks. In recent years, industrial control network security has become a hot topic in network security. Industrial control network systems are widely used in enterprises that are the lifeblood of the national economy, such as petroleum, electric power, transportation, and water conservancy. The information security of industrial control networks is therefore related to Internet security and social stability. With the continuing integration of normalization and industrialization, industrial control networks interact more and more closely with the Internet and other external networks. Traditional industrial control networks, once considered closed and isolated, are no longer so, and the security problems of the Internet and external networks now affect industrial control network systems with increasing severity.

Stuxnet Virus [

The deep learning concept was proposed in 2006 by Hinton et al. (Hinton, GE, Osindero, S, & Teh, YW, 2006; Hinton, GE, & Salakhutdinov, RR, 2006). Deep learning builds multi-level models that imitate human thinking patterns. Machine learning is one of the main research fields of artificial intelligence. It uses a variety of algorithms to enable machines to learn underlying rules from large amounts of data and to use these rules as a basis for identifying and classifying new samples. Deep learning has a strong ability to solve complex problems, so we introduce a deep learning method to address network intrusion detection.

Intrusion detection for industrial control networks has become a hotspot in industrial control security research. In recent years, Professor Geoffrey Hinton [

The AutoEncoder is a representative deep learning model. Like traditional machine learning, deep learning can be divided into supervised and unsupervised learning, and the AutoEncoder is a common unsupervised method. It can be used for feature extraction and data generation. By learning the best parameters, the AutoEncoder reconstructs its input as closely as possible to the original data. Compared with the linear features obtained by traditional shallow feature extraction methods such as principal component analysis (PCA), the features obtained from the encoder are non-linear and more expressive. The AutoEncoder [

The AutoEncoder is trained mainly by unsupervised learning with backpropagation; the optimization goal is to make the output as close as possible to the model input. The AutoEncoder process consists of two steps, encoding and decoding, and it is usually not used alone. Encoding can be expressed as the encoder

The encoder maps the input vector x to the representation h of the encoder hidden layer (x is mapped to a set of binary hidden expressions v, v (i) ∈ [0, 1]) with the following mapping:

Here the

In the process of using AutoEncoder, we choose the denoising AutoEncoder [

Adding noise to the input data relieves the overfitting problem of the AutoEncoder;

Adding noise to the input data prevents the AutoEncoder from learning a simple mapping function;

The denoising AutoEncoder can learn robust representations.
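The role of input noise listed above can be illustrated with a minimal sketch. It assumes masking noise (randomly zeroing input entries), a single sigmoid hidden layer, and a plain squared reconstruction error measured against the clean input. The function names, layer size, noise level, and learning rate are illustrative assumptions rather than settings from this paper, and the coefficient penalty on the encoding layer is omitted.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def corrupt(x, drop_prob, rng):
    """Masking noise: zero each entry independently with probability drop_prob."""
    return x * (rng.random(x.shape) >= drop_prob)


def train_dae(x, n_hidden=4, drop_prob=0.2, lr=0.5, epochs=200, seed=0):
    """Train a one-hidden-layer denoising autoencoder by batch gradient descent.

    All hyperparameters are illustrative assumptions, not values from the paper.
    Returns the learned parameters and the reconstruction loss of every epoch.
    """
    rng = np.random.default_rng(seed)
    n, d = x.shape
    w1 = rng.normal(0.0, 0.1, (d, n_hidden))
    b1 = np.zeros(n_hidden)
    w2 = rng.normal(0.0, 0.1, (n_hidden, d))
    b2 = np.zeros(d)
    losses = []
    for _ in range(epochs):
        x_noisy = corrupt(x, drop_prob, rng)   # encoder sees the noisy input
        h = sigmoid(x_noisy @ w1 + b1)         # encode
        x_rec = sigmoid(h @ w2 + b2)           # decode
        err = x_rec - x                        # compare with the CLEAN input
        losses.append(float(np.mean(np.sum(err ** 2, axis=1))))
        # Backpropagate the mean (over samples) of the summed squared error.
        g_out = (2.0 / n) * err * x_rec * (1.0 - x_rec)
        g_hid = (g_out @ w2.T) * h * (1.0 - h)
        w2 -= lr * (h.T @ g_out)
        b2 -= lr * g_out.sum(axis=0)
        w1 -= lr * (x_noisy.T @ g_hid)
        b1 -= lr * g_hid.sum(axis=0)
    return (w1, b1, w2, b2), losses


def encode(x, params):
    """Hidden-layer features of clean data, used as the reduced representation."""
    w1, b1, _, _ = params
    return sigmoid(x @ w1 + b1)


# Toy data in [0, 0.3]: 64 samples with 8 attributes each.
data = 0.3 * np.random.default_rng(1).random((64, 8))
params, losses = train_dae(data)
features = encode(data, params)  # shape (64, 4)
```

Because the network is asked to reproduce the clean input from a corrupted copy, copying the input through is not a solution, which is the intuition behind the second point in the list.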

Therefore, we choose the denoising AutoEncoder to perform dimensionality reduction in our approach. The denoising AutoEncoder reconstructs the data vector

Three-way decisions theory comes from rough sets theory [

Let λ_{PP}, λ_{BP}, and λ_{NP} denote the loss or cost incurred by the three actions ɑP, ɑB, and ɑN when x belongs to Χ, and let λ_{PN}, λ_{BN}, and λ_{NN} denote the loss of the same three actions when x does not belong to Χ. The expected losses of the three actions ɑP, ɑB, and ɑN are therefore expressed as follows:
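For reference, in the decision-theoretic rough set literature these expected losses usually take the form below, where P(Χ|x) is the probability that x belongs to Χ and P(¬Χ|x) = 1 − P(Χ|x). This is the standard formulation, stated here as an assumption about the intended equations.

```latex
\begin{aligned}
R(a_P \mid x) &= \lambda_{PP}\,P(X \mid x) + \lambda_{PN}\,P(\neg X \mid x),\\
R(a_B \mid x) &= \lambda_{BP}\,P(X \mid x) + \lambda_{BN}\,P(\neg X \mid x),\\
R(a_N \mid x) &= \lambda_{NP}\,P(X \mid x) + \lambda_{NN}\,P(\neg X \mid x).
\end{aligned}
```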

According to the Bayesian criterion, the action in the set A = {ɑP, ɑB, ɑN} with the smallest expected loss is selected as the best decision. We use POS (Χ), BND (Χ), and NEG (Χ) to denote the positive domain, the boundary domain, and the negative domain, respectively. It is generally assumed that 0≤λ_{PP} ≤λ_{BP} <λ_{NP} and 0≤λ_{NN} ≤λ_{BN} <λ_{PN}; the conditions of the three decision criteria (POS), (BND), (NEG) are then as shown in

Decision criterion | Condition 1 | Condition 2 |
---|---|---|
(POS) | | |
(BND) | | |
(NEG) | | |

For a sample or event, there are three possible decisions: acceptance (positive domain, POS), rejection (negative domain, NEG), and deferred decision (boundary domain, BND). There are two possible states: the sample belongs to the positive domain (POS), or it does not and belongs to the negative domain (NEG). According to three-way decisions theory, the relevant costs λ for these three possible decisions and two possible states are shown in

Decisions | State: POS | State: NEG |
---|---|---|
POS | λ_{PP} | λ_{PN} |
BND | λ_{BP} | λ_{BN} |
NEG | λ_{NP} | λ_{NN} |

The probability that a sample x belongs to a set C is P (C|x). If we define C as the positive domain, then P (Cc|x) is the probability that x does not belong to C, that is, the probability that x belongs to the negative domain Cc. According to the relevant theorem [

If

Where α, β are threshold parameters that can be defined by the following formulas:

Intrusion detection is a decision-making system, and it is also a typical three-way decision-making system. Three-way decisions theory therefore provides a useful method for intrusion detection systems.
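As a minimal sketch of how a pair of thresholds turns a probability into one of the three decisions, the function below assumes the usual three-way rule: accept when P (C|x) ≥ α, reject when P (C|x) ≤ β, and defer otherwise. The function name and the example threshold values are illustrative assumptions.

```python
def three_way_decide(p, alpha, beta):
    """Map a probability p = P(C|x) to "POS", "BND", or "NEG".

    Assumes the usual ordering 0 <= beta < alpha <= 1.
    """
    if not 0.0 <= beta < alpha <= 1.0:
        raise ValueError("thresholds must satisfy 0 <= beta < alpha <= 1")
    if p >= alpha:
        return "POS"   # accept: x is placed in the positive domain
    if p <= beta:
        return "NEG"   # reject: x is placed in the negative domain
    return "BND"       # defer: wait for additional information


# Example with hypothetical thresholds alpha = 0.6 and beta = 0.3.
decisions = [three_way_decide(p, 0.6, 0.3) for p in (0.9, 0.45, 0.1)]
```

The deferred option is what distinguishes this rule from a two-way classifier, which would have to commit on the middle sample immediately.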

In the process of three-way decisions, the key is to set the decision threshold

Decision | State: POS | State: NEG |
---|---|---|
POS | 0 | 0.7 |
BND | 0.3 | 0.3 |
NEG | 1 | 0 |
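To show how loss values of this kind translate into thresholds, the sketch below applies the threshold formulas commonly used in the decision-theoretic rough set literature. Both the formulas and the function name are assumptions made for illustration; each argument is read from the table as λ with the decision first and the state second.

```python
def thresholds(l_pp, l_bp, l_np, l_pn, l_bn, l_nn):
    """Compute (alpha, beta) from the six loss values.

    Assumed formulas (standard decision-theoretic rough set form):
      alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
      beta  = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    """
    alpha = (l_pn - l_bn) / ((l_pn - l_bn) + (l_bp - l_pp))
    beta = (l_bn - l_nn) / ((l_bn - l_nn) + (l_np - l_bp))
    return alpha, beta


# Loss values taken from the table: rows are decisions, columns are states.
alpha, beta = thresholds(l_pp=0.0, l_bp=0.3, l_np=1.0,
                         l_pn=0.7, l_bn=0.3, l_nn=0.0)
```

Under these assumed formulas the table yields α = 0.4/0.7 ≈ 0.57 and β = 0.3, so β < α and the boundary domain is non-empty.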

Once the loss function has been determined, the relevant threshold

The intrusion detection method based on the AutoEncoder and three-way decisions proposed in this paper can be divided into two parts: a feature extraction part and an intrusion detection part. The overall flowchart of the intrusion detection algorithm is shown in

As

The dataset used in intrusion detection and industrial control network intrusion detection is the NSL-KDD dataset. The data must be preprocessed before they are used in the intrusion detection program. First, character data are converted into numeric data; for example, the character values of the network protocol attribute (TCP, UDP, ICMP) can be expressed as the numeric values (1, 2, 3). Second, the data are standardized to eliminate the errors caused by different dimensions or large differences in magnitude. In this paper, we use

In the equation, x is a value in the i-th attribute column, min_{i} is the minimum value of the i-th attribute column, and max_{i} is the maximum value of the i-th attribute column.
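The two preprocessing steps can be sketched as follows. The sketch assumes the scaling is min-max normalization, (x − min_{i}) / (max_{i} − min_{i}), as the definitions of min_{i} and max_{i} suggest. The helper names and the handling of constant columns (mapped to 0) are illustrative assumptions.

```python
import numpy as np

# Mapping from the example in the text: (TCP, UDP, ICMP) -> (1, 2, 3).
PROTOCOL_CODES = {"tcp": 1, "udp": 2, "icmp": 3}


def encode_protocol(values):
    """Convert protocol names to their numeric codes (case-insensitive)."""
    return np.array([PROTOCOL_CODES[v.lower()] for v in values], dtype=float)


def min_max_scale(x):
    """Scale every attribute column of x into [0, 1].

    Constant columns (max == min) are mapped to 0 to avoid division by zero;
    that choice is an assumption, not something the text specifies.
    """
    x = np.asarray(x, dtype=float)
    col_min = x.min(axis=0)
    span = x.max(axis=0) - col_min
    span = np.where(span == 0, 1.0, span)
    return (x - col_min) / span


codes = encode_protocol(["TCP", "udp", "ICMP"])
scaled = min_max_scale([[1, 10], [3, 10], [5, 10]])
```

In practice the minimum and maximum would normally be computed on the training set and reused for the test set so that both are scaled consistently.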

In the intrusion detection algorithm AE-3WD, we assume that the input data set is X = {x_{1}, x_{2},…, x_{n}} and the reconstructed data set is X′ = {x_{1}′, x_{2}′,…, x_{n}′}. The objective of the AutoEncoder is to make the result X′ as close as possible to the original input data set X, so that the features of the original data can be extracted by the AutoEncoder in the hidden layer. The AutoEncoder therefore minimizes the reconstruction error function to obtain optimized network weights and offsets. The error function is shown in

The gradient descent method is used to update the weight parameters of the whole network, and the optimal solution of the objective function is obtained [

The algorithm AE-3WD is described as follows.

The final decision assigns each input sample to the positive domain (POS) or the negative domain (NEG). Samples in the boundary domain (BND) must be decided again after additional information is obtained. If some samples fall into the boundary domain again, the decision process continues until all samples are assigned to the positive or negative domain.
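The repeated treatment of the boundary domain can be sketched as a loop in which each round brings in a new source of information. In this sketch `scorers` is a hypothetical list of functions, one per round, each returning an estimate of the probability that a sample belongs to the positive domain. The final fallback to a two-way decision once the scorers are exhausted is an assumption added so that the loop always terminates; it is not a step described in the paper.

```python
def sequential_three_way(samples, scorers, alpha, beta, fallback=0.5):
    """Label every sample "POS" or "NEG", re-deciding BND samples each round."""
    if not scorers:
        raise ValueError("at least one scorer is required")
    labels = [None] * len(samples)
    last_p = [None] * len(samples)
    pending = list(range(len(samples)))
    for score in scorers:
        deferred = []
        for i in pending:
            p = score(samples[i])
            last_p[i] = p
            if p >= alpha:
                labels[i] = "POS"
            elif p <= beta:
                labels[i] = "NEG"
            else:
                deferred.append(i)   # still in the boundary domain
        pending = deferred
        if not pending:
            break
    # Assumption: samples still undecided after the last round are forced
    # into POS or NEG with a plain two-way threshold.
    for i in pending:
        labels[i] = "POS" if last_p[i] >= fallback else "NEG"
    return labels


# Toy example: each sample carries two scores, one per round.
samples = [(0.9, 0.0), (0.1, 0.0), (0.5, 0.8), (0.5, 0.2), (0.5, 0.45)]
scorers = [lambda s: s[0], lambda s: s[1]]
labels = sequential_three_way(samples, scorers, alpha=0.6, beta=0.3)
```

Each extra round costs additional computation for the deferred samples only, which is the time cost associated with processing the boundary domain.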

This section evaluates the performance of the proposed algorithm. All experiments are run on a PC with Windows 10, an Intel (R) Core (TM) i5-8250 CPU @ 1.60 GHz 1.80 GHz, and 8 GB DDR2-DRAM. The algorithm is implemented in Python 3.7.

The dataset used in our experiments is the intrusion detection dataset NSL-KDD, which has 41 feature attributes and 1 class label. NSL-KDD includes a training set and a test set covering the different types of network behavior, as shown in

Dataset | Behavior type | Training set | Test set |
---|---|---|---|
NSL-KDD | Normal | 13449 | 9711 |
NSL-KDD | DoS | 9234 | 7458 |
NSL-KDD | Probe | 2289 | 2421 |
NSL-KDD | R2L | 209 | 2754 |
NSL-KDD | U2R | 11 | 200 |

The NSL-KDD dataset contains normal data and attack data; the attack data are divided into four types: DoS (Denial of Service), Probe, U2R (User to Root), and R2L (Remote to Local). In the experiments, we used 20% of the training set and all of the test set.

Because the data set is unevenly distributed, accuracy alone is not an appropriate measure of an algorithm's strengths and weaknesses. In an intrusion detection system, two important evaluation indicators are the false alarm rate and the false negative rate. Precision indicates how many of the network behaviors predicted to be abnormal are really abnormal; the F1 score combines precision and recall and is an important indicator of the quality of the algorithm. Therefore, in our experiments, we choose five indicators commonly used to judge machine learning algorithms to evaluate intrusion detection performance: Accuracy (ACC = (TP + TN)/(TP + FP + TN + FN)), Detection Rate (DR = TP/(TP + FN)), Precision (PR = TP/(TP + FP)), False Positive Rate (FPR = FP/(TN + FP)), and F1-score (F1 = 2TP/(2TP + FP + FN)). Here TP and TN are the numbers of attack records and normal records, respectively, that are correctly classified; FP is the number of normal records misclassified as attacks; and FN is the number of attack records misclassified as normal.
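The five indicators follow directly from the four confusion counts. The sketch below restates the formulas from the text; the function name and the example counts are hypothetical and are not results from the experiments.

```python
def ids_metrics(tp, tn, fp, fn):
    """Return ACC, DR, PR, FPR, and F1 as fractions in [0, 1].

    Assumes every denominator is non-zero, i.e. the evaluation data contain
    both attack and normal records and at least one record is flagged.
    """
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "DR": tp / (tp + fn),
        "PR": tp / (tp + fp),
        "FPR": fp / (tn + fp),
        "F1": 2 * tp / (2 * tp + fp + fn),
    }


# Hypothetical counts: 100 attack records and 100 normal records.
metrics = ids_metrics(tp=80, tn=90, fp=10, fn=20)
```

For these counts ACC is 0.85, DR is 0.80, and FPR is 0.10; multiplying by 100 gives percentages.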

When comparing with other algorithms, we mainly consider the performance of the AutoEncoder and of three-way decisions. We conducted two experiments to verify the availability and effectiveness of our algorithm AE-3WD. Experiment 1 verifies whether the AutoEncoder is better than traditional feature extraction methods and whether classification based on three-way decisions is better than the traditional two-way decisions approach. Experiment 2 compares the performance of AE-3WD with other intrusion detection algorithms. Each experiment is repeated 10 times to reduce the impact of randomness, and the mean value of each indicator is recorded.

Experiment 1 mainly explores the impact of introducing three-way decisions theory into the field of intrusion detection. With the AutoEncoder used for feature extraction in both cases, it compares three-way decisions with traditional two-way classification methods. We chose PCA (Principal Component Analysis) and ICA (Independent Component Analysis) as reference feature extraction methods for comparison with the AutoEncoder; both are followed by the same classifier based on three-way decisions. SVM (Support Vector Machine) and KNN (K-Nearest Neighbors) are compared with the classifier based on three-way decisions, with the denoising AutoEncoder extracting the features of the input data in each case. The results of the different methods are shown in

Method | ACC (%) | DR (%) | FPR (%) | PR (%) | F1 (%) |
---|---|---|---|---|---|
PCA-3WD | 92.75 | 86.62 | 4.34 | 91.10 | 88.74 |
ICA-3WD | 92.65 | 90.53 | 4.67 | 96.31 | 93.35 |
DAE-SVM | 84.18 | 76.35 | 5.45 | 94.94 | 84.57 |
DAE-KNN | 80.43 | 68.94 | 4.46 | 95.44 | 80.00 |

From the

From the

Experiment 2 is mainly to compare the performance of the AE-3WD algorithm with other intrusion detection algorithms. So we choose the typical intrusion detection algorithms SFID [

Method | ACC (%) | DR (%) | FPR (%) | PR (%) | F1 (%) |
---|---|---|---|---|---|
ICA-DNN | 92.28 | 86.26 | 4.85 | 89.86 | 88.10 |
SFID | 92.39 | 85.80 | 3.06 | 93.45 | 89.43 |
SNADE | 92.65 | 90.55 | 4.65 | 96.28 | 93.35 |

The experimental results show that the AE-3WD algorithm is better than the other algorithms in Accuracy, Detection Rate, and F1 score, especially in Detection Rate. Although AE-3WD does not perform as well as some methods in PR, overall it is still better than the others.

The ROC curves of these algorithms are shown in the

From the

Building on existing research, this paper proposes an intrusion detection algorithm based on the AutoEncoder and three-way decisions for the security problem of industrial control networks. The denoising AutoEncoder is used to extract features from the original input data, and three-way decisions are then used to make the classification decisions. The simulation results show that our algorithm performs better than the other algorithms.

Three-way decisions theory is used in the classification decision-making process. The AE-3WD algorithm does not consider the time cost of repeatedly processing the boundary domain when three-way decisions theory is used for classification. In future research, this time cost can be included in the handling of the boundary domain.

The authors would like to show their deepest gratitude to the anonymous reviewers for their constructive comments to improve the quality of the paper.