Collaborative filtering is the most classical traditional recommendation algorithm and the one most widely used in industry, and most book recommendation systems rely on it. However, collaborative filtering handles data sparsity poorly. It uses only shallow features of the interaction between readers and books, so it cannot learn high-level abstractions of reader and book attributes, which degrades recommendation performance. To address these problems, this study uses deep learning to model readers’ book borrowing probability. It builds a recommendation model with a multi-layer neural network, feeds the features extracted from readers and books into the network, and deeply fuses them there to explore the hidden interactions between readers and books, thereby improving book recommendation quality. In the experiments, the deep neural network recommendation model constructed in this paper achieves higher HR@10, MRR, and NDCG than the traditional recommendation algorithms, which verifies the effectiveness of the model for book recommendation.

With the spread of new information technologies such as big data and artificial intelligence, business informatization has developed continuously. Most modern libraries have abandoned manual registration, borrowing, and returning. They now use book information systems to handle borrowing and returning so that readers can more conveniently check and borrow books [

The recommendation system first collects readers’ historical borrowing data and then recommends books that readers may be interested in after analysis. The traditional recommendation algorithm represented by the collaborative filtering algorithm is the most classic and widely used in practical industrial applications. Most book recommendation systems also use this algorithm [

Given the above problems, this study first converts user information and book information into embedded representations that serve as the neural network input. A text convolutional neural network extracts text information from book titles, and deep learning is used to build a network model that predicts recommendations. This model addresses the shortcomings of traditional recommendation algorithms and improves book recommendation quality. The rest of this paper is arranged as follows: Chapter 2 reviews relevant studies, Chapter 3 describes how the model is constructed, Chapter 4 presents the experiments conducted to verify the model’s effectiveness, and Chapter 5 summarizes.

The recommendation system is a helpful method of dealing with information overload, providing an efficient way for users to find their favorite items. In addition to being widely used in e-commerce, the recommendation system is also valued and applied by libraries. In 1998, the Cornell University Library developed a recommendation system called Mylibrary [

Traditional recommendation system algorithms include collaborative filtering, content-based, and hybrid recommendation algorithms. The collaborative filtering algorithm is the most widely used in all areas. It can be seen in personalized recommendation service providers of various e-commerce websites. Still, the collaborative filtering algorithm also shows problems, such as data sparsity and cold-start. In addition, the traditional collaborative filtering algorithm cannot learn the in-depth features of users and items. Although the academic community has proposed some methods to alleviate these difficulties [

Because of its revolutionary progress, deep learning has received extensive attention in many fields in recent years [

This study constructs a deep learning book recommendation model using a deep neural network to solve the traditional recommendation algorithm’s problem of data sparsity and cold-start. Based on the library’s lending data, this model uses reader feature attributes and book feature attributes to learn the hidden features between readers and books and deeply explore the interaction between readers and books to improve recommendation performance.

After data preprocessing, according to the characteristics of book borrowing data, this study builds a deep learning book recommendation network based on various features. The overall framework of the model is shown in

First, the reader attribute data and the book attribute data are input into the embedding layer to obtain the embedded vectors of the reader attributes and the book attributes. Embedding is a successful technique in deep learning that represents discrete variables as continuous vectors. An embedded representation uses a low-dimensional vector to describe an object, which can be a word, a commodity, a book, etc. The key property of embedded vectors is that objects whose vectors lie close together have similar meanings.
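As a minimal sketch of this idea, an embedding layer can be viewed as a trainable lookup table that maps each discrete attribute value to a dense vector. The class name `EmbeddingTable`, the attribute values, and the dimension below are illustrative assumptions, and the vectors are randomly initialized rather than learned.

```python
import math
import random


class EmbeddingTable:
    """Maps each discrete value to a low-dimensional vector (hypothetical helper)."""

    def __init__(self, vocabulary, dim, seed=0):
        rng = random.Random(seed)
        self.index = {value: i for i, value in enumerate(vocabulary)}
        # In a real model these weights are learned; here they are random.
        self.weights = [
            [rng.uniform(-0.05, 0.05) for _ in range(dim)] for _ in vocabulary
        ]

    def lookup(self, value):
        return self.weights[self.index[value]]


def cosine_similarity(u, v):
    """Closeness of two embedded vectors; higher means more similar."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


# Illustrative attribute values, not taken from the dataset.
category_embedding = EmbeddingTable(["T", "I", "H"], dim=4)
vec_t = category_embedding.lookup("T")
vec_i = category_embedding.lookup("I")
similarity = cosine_similarity(vec_t, vec_i)
```

After training, attribute values that play similar roles in borrowing behavior would be expected to end up with a higher cosine similarity than unrelated ones.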

Second, the embedded vector representations of reader and book attributes are spliced and fused, respectively. Among these, the text information features of book titles and other embedded features of book attributes obtained after being processed by a convolutional neural network [

Third, the fused reader features and book features are input into a multi-layer neural network to predict the probability of a reader borrowing a book. The model is trained by minimizing the cross-entropy loss, and the Adam optimizer is used to update the parameters.

Last, the predicted probability values are sorted, and books are recommended to readers in descending order of probability.
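The final ranking step can be sketched as follows. The function name `recommend_top_n` is hypothetical, and excluding books the reader has already borrowed is an assumption that the text does not state.

```python
def recommend_top_n(scores, already_borrowed, n=10):
    """Return the n unborrowed book IDs with the highest predicted probability.

    scores: dict mapping book_id -> predicted borrowing probability.
    already_borrowed: set of book_ids to exclude (an assumption of this sketch).
    """
    candidates = [
        (book, probability)
        for book, probability in scores.items()
        if book not in already_borrowed
    ]
    candidates.sort(key=lambda pair: pair[1], reverse=True)
    return [book for book, _ in candidates[:n]]


# Illustrative probabilities only.
top_books = recommend_top_n({"a": 0.9, "b": 0.2, "c": 0.7}, {"a"}, n=2)
```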

Integrate various attributes of users to obtain the features of readers

Integrate various attributes of books to obtain the features of books

Among them,

After acquiring the features of readers and books from the above steps, the features of readers and books are fused and input into a multi-layer neural network to predict whether readers are going to borrow books. According to the model framework shown in

The function

The first hidden output value expression is:

Among them

The activation function of the output layer is the sigmoid function.

The objective function is the cross-entropy loss function:

The model is then trained, and books are recommended to readers according to the borrowing probability it predicts.
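The prediction path described above (fused features, ReLU hidden layers, a sigmoid output, and a cross-entropy objective) can be sketched in plain Python. This is only an illustration under stated assumptions: fusion is taken to be concatenation, the 32/16/8 hidden sizes follow the configuration used in the experiments, all function names are hypothetical, and the weights are random and untrained (no Adam update step is shown).

```python
import math
import random


def relu(vector):
    return [max(0.0, x) for x in vector]


def sigmoid(x):
    # Numerically stable for large negative inputs.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def dense(vector, weights, bias):
    """Fully connected layer: weights holds one row per output unit."""
    return [
        sum(w * x for w, x in zip(row, vector)) + b
        for row, b in zip(weights, bias)
    ]


def init_layer(n_in, n_out, rng):
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def build_network(n_in, hidden_sizes=(32, 16, 8), seed=0):
    rng = random.Random(seed)
    sizes = [n_in, *hidden_sizes]
    hidden = [init_layer(a, b, rng) for a, b in zip(sizes, sizes[1:])]
    output = init_layer(sizes[-1], 1, rng)
    return hidden, output


def predict_borrow_probability(reader_features, book_features, network):
    hidden, (out_weights, out_bias) = network
    # Feature fusion by concatenation (an assumption of this sketch).
    h = list(reader_features) + list(book_features)
    for weights, bias in hidden:
        h = relu(dense(h, weights, bias))
    return sigmoid(dense(h, out_weights, out_bias)[0])


def binary_cross_entropy(label, probability, eps=1e-12):
    p = min(max(probability, eps), 1.0 - eps)
    return -(label * math.log(p) + (1 - label) * math.log(1.0 - p))


network = build_network(n_in=6)
p = predict_borrow_probability([0.1, 0.2, 0.3], [0.4, 0.5, 0.6], network)
loss = binary_cross_entropy(1, p)
```

In a framework such as TensorFlow, the same structure would be expressed with built-in layers and an optimizer instead of hand-written loops.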

The dataset used in this study is the borrowing data of the readers of Nanning University Library. After processing and exporting from the library information system, the data information shown in

To advance the experiment, this study uses the “leave-one-out” evaluation method, which has been widely used in several algorithm verification experiments [
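A leave-one-out split can be sketched as below. Which borrowing is held out for each reader is not specified here; this sketch assumes the most recent one, as is common for this protocol, and the record layout and function name are hypothetical.

```python
from collections import defaultdict


def leave_one_out_split(records):
    """Hold out each reader's latest borrowing as the test item.

    records: iterable of (reader_id, book_id, timestamp) tuples.
    Returns (train, test): train is a list of (reader_id, book_id) pairs,
    test maps reader_id -> held-out book_id.
    """
    by_reader = defaultdict(list)
    for reader, book, timestamp in records:
        by_reader[reader].append((timestamp, book))

    train, test = [], {}
    for reader, events in by_reader.items():
        events.sort()  # chronological order
        *history, (_, held_out) = events
        test[reader] = held_out
        train.extend((reader, book) for _, book in history)
    return train, test


# Illustrative records only.
records = [("r1", "b1", 1), ("r1", "b2", 3), ("r1", "b3", 2), ("r2", "b9", 5)]
train, test = leave_one_out_split(records)
```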

To guarantee the quality of the recommendation results, the data must be tidy and free of anomalies. For this reason, we clean the readers’ borrowing data and remove out-of-spec and invalid records. Data cleaning follows the steps below:

First, we identify and filter out records with incomplete information in the readers’ borrowing data. In particular, we pay attention to the completeness of the user’s number and the ISBN attribute dictionary of the book.

Second, we filter out the sample data of readers who seldom borrow books. It is difficult to capture readers’ behavioral preferences and interests from too few borrowings. Therefore, after an overall analysis of the borrowing data, samples of readers with few borrowings are removed.

Third, we sort out the classification numbers. The book classification numbers reflect the classification types of books and play an essential role in representing the features of books. Generally, books are classified according to the “Chinese Library Classification” [

Fourth, since some author fields contain role words attached to the author’s name, such as ‘Author’ and ‘Editor’ before or after the name, the author data must first be filtered with regular-expression rules. We then remove any other symbols and words before and after the name so that only the author’s name remains. The processing effect is shown in

Fifth, since the text of the book title contains rich information, it is preprocessed as follows and then used as the input of the neural network to extract rich features of the book’s content. We first segment the book title text with the jieba tokenizer. After segmentation, stop words are removed from the title text to discard useless or meaningless words, such as auxiliary words and modal particles. The remaining tokens are used as the input data of the neural network.
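The title preprocessing in the fifth step can be sketched as below. The function name and the stop-word set are illustrative assumptions. The segmenter is passed in as a parameter: with jieba installed it could be `jieba.lcut`, while plain whitespace splitting is used here so that the example has no third-party dependency.

```python
def preprocess_title(title, segment, stop_words):
    """Segment a title, then drop stop words and empty tokens."""
    tokens = (token.strip() for token in segment(title))
    return [token for token in tokens if token and token not in stop_words]


# Hypothetical stop-word set; a real list would be much longer.
example_stop_words = {"的", "了", "of", "the"}
tokens = preprocess_title("the history of art", str.split, example_stop_words)
```

The resulting token list would then be mapped to word indices before being fed to the text convolutional network.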

Code | Category | Code | Category |
---|---|---|---|

A | Marxism, Leninism, Mao Zedong Thought, Deng Xiaoping Theory | B | Philosophy, Religion |

C | Social Sciences | D | Politics, Law |

E | Military Science | F | Economy |

G | Culture, Science, Education, and Physical Education | H | Languages |

I | Literature | J | Art |

K | History, Geography | N | Natural Science |

O | Mathematical Sciences, Chemistry | P | Astronomy, Earth Science |

Q | Biology | R | Medicine, Health |

S | Agricultural Science | T | Industrial Technology |

U | Transportation | V | Aerospace |

X | Environmental Science, Safety Science | Z | Comprehensive Books |

Before processing | After processing |
---|---|

Author Zhang Lijun | Zhang Lijun |

Editors Wang Hailin, Zhang Yuxiang, et al. | Wang Hailin, Zhang Yuxiang |

Editor An Hongzhang | An Hongzhang |

Authors Liu Jicai, Tang Sisi, Liu Yongsheng | Liu Jicai, Tang Sisi, Liu Yongsheng |

Author Charles Dickens (England) | Charles Dickens (England) |

Editor Mirror Photography | Mirror Photography |

Editor Chen Weizhen, Wei Yuping | Chen Weizhen, Wei Yuping |

Editor Huang Daqing, Yin Xueyun | Huang Daqing, Yin Xueyun |

Author Osamu Dazai (Japan) | Osamu Dazai (Japan) |

Editors Zhan Xuejun et al. | Zhan Xuejun et al. |
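The role-word removal illustrated in the table can be sketched with a regular expression. This is an assumption-laden illustration: it operates on the translated English forms shown above (the original catalogue data would use the corresponding Chinese role words), it only strips a leading role word, and it leaves any trailing “et al.” untouched. The names `ROLE_PATTERN` and `clean_author` are hypothetical.

```python
import re

# Role words as they appear in the translated examples (assumption).
ROLE_PATTERN = re.compile(r"^\s*(?:Authors?|Editors?)\b[\s:,]*", re.IGNORECASE)


def clean_author(raw):
    """Strip a leading role word plus surrounding whitespace and separators."""
    name = ROLE_PATTERN.sub("", raw)
    return name.strip(" ,;")


cleaned = [
    clean_author("Author Zhang Lijun"),
    clean_author("Editor Chen Weizhen, Wei Yuping"),
    clean_author("Author Charles Dickens (England)"),
]
```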

(1) The hardware

A single laptop: CPU: i7-7700HQ; graphics card: GTX 1050 Ti; memory: 8 GB; hard disk: 1 TB.

(2) The software

Python version: Python 3.6.3

jieba version: jieba 0.39

Anaconda version: Anaconda 5.0.1

Windows version: Windows 10 64-bit

TensorFlow version: TensorFlow 1.9.0 GPU

The construction and training of multi-layer neural networks use the TensorFlow framework, one of the most popular machine learning and deep learning frameworks [

To verify the validity and superiority of the model proposed in this study, we compare it with the following typical traditional recommendation algorithms, which are widely used in developing recommendation systems.

The first is the popularity-based algorithm, which was used in the early stage of the development of recommendation systems. The popularity-based algorithm is simple and crude, similar to the way major news is pushed. It recommends items to users based on their popularity, such as hot news, trending topics on Weibo, etc. [

The second is the item-based collaborative filtering algorithm. This algorithm is currently the most widely used in practice. It exploits the co-occurrence of items in user behavior, analyzes that behavior, and recommends items similar to those the user liked before [

To verify the effectiveness of auxiliary information in the deep neural network model, this study also uses a deep neural network recommendation model that contains only the behavioral interaction between users and items. No auxiliary information, such as attribute features of users or items, is added to this model [
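The two traditional baselines can be sketched as follows. The exact similarity measure and scoring rule used in the experiments are not given, so this sketch assumes cosine similarity over co-borrowing counts and a summed-similarity score; all function names are hypothetical.

```python
import math
from collections import Counter, defaultdict


def most_popular(borrowings, n=10):
    """borrowings: dict reader_id -> set of book_ids. Returns the n most-borrowed books."""
    counts = Counter(book for books in borrowings.values() for book in books)
    return [book for book, _ in counts.most_common(n)]


def item_similarities(borrowings):
    """Cosine similarity between books based on how often they are co-borrowed."""
    popularity = Counter()
    co_counts = defaultdict(Counter)
    for books in borrowings.values():
        for a in books:
            popularity[a] += 1
            for b in books:
                if a != b:
                    co_counts[a][b] += 1
    return {
        a: {
            b: count / math.sqrt(popularity[a] * popularity[b])
            for b, count in neighbours.items()
        }
        for a, neighbours in co_counts.items()
    }


def item_cf_recommend(reader, borrowings, similarities, n=10):
    """Score unborrowed books by summed similarity to the reader's borrowed books."""
    borrowed = borrowings[reader]
    scores = Counter()
    for book in borrowed:
        for other, similarity in similarities.get(book, {}).items():
            if other not in borrowed:
                scores[other] += similarity
    return [book for book, _ in scores.most_common(n)]


# Illustrative borrowing history only.
borrowings = {"r1": {"a", "b"}, "r2": {"a", "b", "c"}, "r3": {"a"}}
similarities = item_similarities(borrowings)
popular = most_popular(borrowings, n=1)
for_r3 = item_cf_recommend("r3", borrowings, similarities, n=2)
```

Unlike the neural model, both baselines rely only on borrowing counts, so a book with no borrowing history can never be recommended by either of them.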

For evaluation, this study uses HR (Hit Ratio), MRR (Mean Reciprocal Rank), and NDCG (Normalized Discounted Cumulative Gain). HR reflects the accuracy of the recommendation list, while MRR and NDCG reflect its ranking quality: both reward lists in which the items a user is interested in appear near the top.

(1) Hit Ratio. This indicator is commonly used to measure the recall rate in top-K recommendations [

The denominator

(2) Mean reciprocal rank. Its core idea is that the quality of the recommendation list is related to the position of the first item that correctly matches the user’s interest [

(3) Normalized Discounted Cumulative Gain. NDCG is an evaluation indicator to measure the quality of sorting, which considers the correlation of all elements. The formula is as follows:

r_{i} represents the graded relevance at the i-th position, which is generally treated as binary (0/1). If the item at this position is in the test set, then r_{i} = 1; otherwise, r_{i} = 0. In addition, Z_{K} is the normalization coefficient: the reciprocal of the cumulative sum obtained in the best case, in which r_{i} = 1 is satisfied. It ensures that the value of NDCG lies within 0–1.
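The three indicators can be sketched for the leave-one-out setting, assuming a single held-out book per reader and binary relevance. Under that assumption the ideal cumulative gain is 1, so NDCG@K reduces to 1 / log2(rank + 1) when the held-out book appears at position `rank` in the top K. The function names are hypothetical.

```python
import math


def rank_of(target, ranked_items, k):
    """1-based rank of target within the top k, or None if it is absent."""
    top_k = ranked_items[:k]
    return top_k.index(target) + 1 if target in top_k else None


def hit_ratio(target, ranked_items, k=10):
    return 1.0 if rank_of(target, ranked_items, k) else 0.0


def reciprocal_rank(target, ranked_items, k=10):
    rank = rank_of(target, ranked_items, k)
    return 1.0 / rank if rank else 0.0


def ndcg(target, ranked_items, k=10):
    # One relevant item: ideal DCG is 1, so NDCG = 1 / log2(rank + 1).
    rank = rank_of(target, ranked_items, k)
    return 1.0 / math.log2(rank + 1) if rank else 0.0


def evaluate(test_items, rankings, k=10):
    """Average HR@k, MRR@k, and NDCG@k over all readers in test_items."""
    n = len(test_items)
    totals = [0.0, 0.0, 0.0]
    for reader, target in test_items.items():
        ranked = rankings[reader]
        totals[0] += hit_ratio(target, ranked, k)
        totals[1] += reciprocal_rank(target, ranked, k)
        totals[2] += ndcg(target, ranked, k)
    return tuple(total / n for total in totals) if n else (0.0, 0.0, 0.0)


# Illustrative rankings only.
hr, mrr, ndcg_value = evaluate(
    {"r1": "b", "r2": "z"},
    {"r1": ["a", "b", "c"], "r2": ["a", "b", "c"]},
    k=10,
)
```

With a single held-out item, all three measures coincide at K = 1, since a hit at rank 1 contributes exactly 1 to each of them.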

The dataset used in this experiment consists of implicit feedback. The books a reader has borrowed are positive samples, and negative samples must be selected from the books the reader has not borrowed. In proportion to each reader’s number of borrowings, we randomly choose books that the reader has not borrowed as negative samples and train for 15 iterations. The hidden layers are configured as 32 ReLU + 16 ReLU + 8 ReLU, and the ratio of positive to negative samples is varied from 1:1 to 1:10. We observe the impact of the negative sampling ratio on the model’s performance with a recommendation list length of 10. Experiments were carried out on the dataset, and the HR@10 values of the experiments are shown in

The experimental results suggest that taking only one negative sample per positive sample is not enough to achieve the best performance on this dataset, and that more negative samples help improve system performance. However, beyond six negative samples per positive sample, the recommendation effect of the model flattens and further gains are no longer apparent. This shows that excessive negative sampling does not significantly improve the recommendation effect and may even harm it. In addition, excessive negative sampling increases the number of training samples and the training time. Weighing the improvement brought by additional negative samples against the computational cost, a positive-to-negative ratio of 1:6 is a cost-effective choice.
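The negative sampling procedure can be sketched as below. Sampling without replacement, capping the draw at the number of available unborrowed books, and the function names are assumptions of this sketch.

```python
import random


def sample_negatives(borrowed, all_books, ratio, rng):
    """Draw `ratio` unborrowed books for every borrowed book of one reader."""
    pool = [book for book in all_books if book not in borrowed]
    wanted = min(len(pool), ratio * len(borrowed))
    return rng.sample(pool, wanted)


def build_training_samples(borrowings, all_books, ratio=6, seed=0):
    """Return (reader_id, book_id, label) triples with label 1 = borrowed, 0 = not."""
    rng = random.Random(seed)
    samples = []
    for reader, borrowed in borrowings.items():
        samples.extend((reader, book, 1) for book in borrowed)
        negatives = sample_negatives(borrowed, all_books, ratio, rng)
        samples.extend((reader, book, 0) for book in negatives)
    return samples


# Illustrative catalogue of 20 books and one reader with one borrowing.
catalogue = [f"b{i}" for i in range(1, 21)]
samples = build_training_samples({"r1": {"b1"}}, catalogue, ratio=6)
```

Raising `ratio` enlarges the training set roughly in proportion, which is the training-cost trade-off discussed above.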

To analyze the influence of the hidden layers of the multi-layer neural network on the recommendation effect, we compare different hidden-layer configurations. The negative sampling ratio is set to 1:6, and the number of iterations is 15. The results are shown in

Hidden layer | HR@10 |
---|---|

None | 0.262 |

8 ReLU | 0.405 |

16 ReLU | 0.420 |

32 ReLU | 0.425 |

16 ReLU + 8 ReLU | 0.427 |

32 ReLU + 16 ReLU | 0.430 |

32 ReLU + 16 ReLU + 8 ReLU | 0.452 |

The number of training iterations (epochs) may affect the model’s performance. In this study, iterative training is carried out with the ratio of positive to negative samples fixed at 1:6, and the impact of the number of iterations on the model effect is observed. The convergence of the loss function when the model is trained for 50 iterations is shown in

According to the training results of the experiment, the squared error is about 0.6× at the beginning. Over the 50 iterations, it drops to about 0.3× and then levels off. This suggests that after integrating reader features and book features, the feature representation of the data is complete and can be fitted well. It can be seen from

When observing the effect of different iterations of training on the performance of HR@10, the experimental values are shown in

On this dataset, HR@10 grows as the number of iterations increases up to 5, indicating that the recommendation effect improves with training. Beyond 5 iterations, HR@10 decreases as the number of iterations increases, which means that too many iterations cause overfitting and weaken the recommendation effect. Therefore, 5 training iterations is the best choice for the model.

To verify the recommendation performance of the proposed model, this paper compares it with the traditional recommendation algorithms commonly used in industry, with a recommendation list length of 10. Based on the previous experiments, the parameters are set as follows: the negative sampling ratio is 1:6, the number of iterations is 5, and the hidden layers are 32 ReLU + 16 ReLU + 8 ReLU. The popularity-based algorithm is referred to as most_TOP, and the item-based collaborative filtering algorithm is referred to as item_CF. The multi-layer neural network recommendation model proposed in this study is referred to as dnn_REM. To verify that integrating the auxiliary information of readers and books improves the effectiveness of the network model, we also set up a comparison model, dnn_ONLY, which uses only the interaction behavior between users and books as input. The parameter settings are the same for all models during the comparison.

The experimental results are shown in

Algorithm | HR@10 | MRR@10 | NDCG@10 |
---|---|---|---|

most_TOP | 0.274 | 0.113 | 0.150 |

item_CF | 0.342 | 0.202 | 0.235 |

dnn_ONLY | 0.449 | 0.304 | 0.338 |

dnn_REM | 0.476 | 0.220 | 0.280 |

The multi-layer neural network recommendation model proposed in this study performs better in MRR@10 and NDCG@10 than the traditional personalized recommendation algorithms. A likely reason is that it adds multiple auxiliary attributes of readers and books and fuses reader and book features non-linearly through the multi-layer neural network. This indicates that applying deep learning to personalized book recommendation is effective. Under the same parameter configuration, dnn_ONLY, which lacks reader and book auxiliary information, is compared with dnn_REM, which includes it. When the recommendation quality is roughly the same, the recommendation accuracy of the deep neural network model integrating reader and book auxiliary information is significantly higher than that of the model using only the interaction behavior of readers and books. It shows that integrating the auxiliary information of readers and books helps improve the recommendation effect of the deep neural network book recommendation model.

The length of the recommendation list also impacts the recommendation results. In this experiment, different recommendation lengths topN were set to 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. We observed the impact of different recommendation list lengths on the effect of recommendations. The experimental results are shown in

TopN | Top1 | Top2 | Top3 | Top4 | Top5 | Top6 | Top7 | Top8 | Top9 | Top10 |
---|---|---|---|---|---|---|---|---|---|---|

HR | 0.118 | 0.187 | 0.247 | 0.293 | 0.335 | 0.364 | 0.395 | 0.423 | 0.447 | 0.476 |

MRR | 0.118 | 0.153 | 0.173 | 0.184 | 0.193 | 0.197 | 0.202 | 0.205 | 0.208 | 0.220 |

NDCG | 0.118 | 0.162 | 0.192 | 0.211 | 0.228 | 0.238 | 0.248 | 0.257 | 0.264 | 0.280 |

As seen from

The current study uses deep learning to model the probability of readers borrowing books and constructs a multi-layer neural network recommendation model. After the information of readers and books is processed by the embedding layer and the text convolutional neural network, it is input into the multi-layer neural network to explore the hidden deep features relating readers and books. The probability of readers borrowing books is predicted, the recommendation list is generated, and the personalized recommendations are completed. The experimental results are compared and analyzed, which verifies the effectiveness and superiority of the deep learning multi-feature fusion recommendation model proposed in our study. It provides an effective approach to personalized recommendation for book borrowing.

However, the model proposed in this paper has only been verified on the operational data of our own institution, without using larger, general-purpose recommendation datasets. Future work could evaluate the model on more general and extensive data to establish its wider applicability.

