The Internet of Things (IoT) is applied across many domains, and its use in industry, agriculture, environmental monitoring, transportation, logistics, security, and other infrastructure has effectively promoted intelligent development in these fields. Although the IoT has grown steadily in recent years, many problems remain to be overcome in technology, management, cost, policy, and security. Users must constantly weigh the benefits of trusting IoT products against the risk of leaking private data. To avoid the leakage and loss of user data, this paper develops a hybrid algorithm that combines a kernel function and random perturbation with non-negative matrix factorization (NMF). The algorithm provides personalized recommendation while protecting users’ private data during the recommendation process. Compared with the NMF privacy-preserving algorithm, the new algorithm does not need the detailed values of the data, only the relationships between data points, and it can process data points with negative features. Experiments show that the new algorithm produces recommendation results of reasonable accuracy while preserving users’ personal privacy.

As an emerging technology, the Internet of Things has a complex architecture, no unified standards, and prominent security issues. To obtain better and more convenient services, users inevitably have to provide more and more detailed personal information. At the same time, more and more personal information is made public, which erodes personal privacy. In particular, government agencies and public service agencies are increasingly publishing data that contain personal information. Whether it is privacy in data distribution or location privacy in location services, the protection of users’ personal information is particularly important [

Personalized recommendation is a key supporting technology of today’s Internet. In 2014, Min et al. [

In terms of privacy preservation, in 2014, Wang et al. [

Given the above developments in personalized recommendation and privacy protection, privacy-preserving personalized recommendation for IoT mobile services is still at an early stage, and most existing results have not been examined within an effective privacy-preserving recommendation system [

The rest of this paper is organized as follows. The related work section introduces the preliminaries of the new algorithm, including the basic idea of the kernel method. The third section gives the main steps of the new kernel method. The fourth section presents the experimental results and their analysis. Finally, the fifth section summarizes the paper and briefly outlines follow-up research.

NMF seeks a non-negative basis matrix and a non-negative coefficient matrix that satisfy the equation

The raw matrix

where the choice of

The literature [

The kernel method first performs a data mapping:

Standard NMF is then performed in the new space:

Because the specific mapping function is unknown, the kernel method is used to obtain:

The left side of the above equation is the definition of the kernel. The detailed definition of the mapping is therefore not needed, only the inner products of the mapped data. The kernel can be chosen as a Gaussian kernel, a polynomial kernel, and so on. Substituting the kernel function into the above formula gives:

where,

A kernel function maps data from a low-dimensional space to a new space so that data that are not linearly separable become linearly separable. It is usually described as:

The resulting equation has the same form as ordinary non-negative matrix factorization, so the same method can be used to solve it.
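As an illustration of this observation, the following minimal sketch builds a Gaussian kernel matrix and factorizes it with the standard multiplicative update rules for NMF. It is only a sketch under stated assumptions: the function names, the bandwidth `sigma`, the rank, the iteration count, and the toy data are all hypothetical and are not taken from this paper. Because every entry of a Gaussian kernel matrix is non-negative, the input data may contain negative values.

```python
import numpy as np

def gaussian_kernel(X, sigma=1.0):
    """Pairwise Gaussian kernel between the rows of X (sigma is an assumed bandwidth)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative round-off
    return np.exp(-d2 / (2.0 * sigma ** 2))

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Multiplicative-update NMF: V ~ W @ H with W >= 0 and H >= 0."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, rank)) + eps
    H = rng.random((rank, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy data (hypothetical); note the negative entries.
X = np.array([[1.0, -2.0], [0.5, 0.3], [-1.0, 4.0], [2.0, 2.0]])
K = gaussian_kernel(X, sigma=2.0)  # non-negative even though X is not
W, H = nmf(K, rank=2)              # ordinary NMF applied to the kernel matrix
```

Only the kernel matrix `K` enters the factorization, so the mapping itself never has to be evaluated.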

Given a new matrix

where,

The new algorithm consists of two main parts: the data privacy protection process and the personalized recommendation process. The privacy protection process hides the data mainly by means of random perturbation, while the personalized recommendation process obtains recommendation results mainly from the similarity between data.

The flow diagram of the privacy-preserving process for user data is shown in

First, random perturbation techniques are used for privacy-preserving data mining. An intuitive method is to add a number of

Given a recommendation system with

The server determines the specific distribution (uniform distribution or normal distribution) of the disturbance data and the corresponding parameters

Each user fills the unrated items with the mean of their existing ratings.

For each user

Finally, the server uses the
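The user-side part of these steps can be sketched as follows. This is a minimal illustration, not the exact procedure of this paper: the function name `disguise_ratings`, the use of `None` for unrated items, and the single `scale` parameter (standard deviation for the normal distribution, half-width for the uniform distribution) are assumptions made for the example.

```python
import random

def disguise_ratings(ratings, dist="normal", scale=1.0, rng=None):
    """Fill unrated items (None) with the user's mean, then add random noise.

    `dist` and `scale` stand in for the distribution and parameters that the
    server announces; both names are hypothetical.
    """
    rng = rng or random.Random()
    rated = [r for r in ratings if r is not None]
    if not rated:
        raise ValueError("the user has no ratings to average")
    mean = sum(rated) / len(rated)
    filled = [mean if r is None else float(r) for r in ratings]
    if dist == "normal":
        noise = [rng.gauss(0.0, scale) for _ in filled]
    elif dist == "uniform":
        noise = [rng.uniform(-scale, scale) for _ in filled]
    else:
        raise ValueError("dist must be 'normal' or 'uniform'")
    return [f + z for f, z in zip(filled, noise)]

# One user's ratings; None marks an unrated item (toy values).
user = [5, None, 3, None, 4]
disguised = disguise_ratings(user, dist="uniform", scale=0.5, rng=random.Random(42))
```

Each user would run such a routine locally and send only the disguised vector, so the server never sees the raw ratings.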

The flow diagram of the recommendation process of user data is shown in the

The steps of NMF for privacy-preserving are as follows:

The rating matrix

Given a new user

Then use the new feature matrix

Finally, given the number of neighbors, the most similar users are found and, with reference to the raw data, their ratings are weighted to obtain the rating of the item
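The recommendation steps above can be illustrated with the following sketch. It rests on several assumptions that are not taken from this paper: the new user's feature vector is obtained by ordinary least squares, similarity is measured by the cosine of the feature vectors, and the names `project_user`, `predict_rating`, `U`, `V`, and `R` are hypothetical.

```python
import numpy as np

def project_user(r, V):
    """Least-squares features u with u @ V ~ r (an assumed projection step)."""
    u, *_ = np.linalg.lstsq(V.T, r, rcond=None)
    return u

def predict_rating(u_new, U, R, item, k=2):
    """Similarity-weighted average of the k most similar users' ratings of `item`."""
    norms = np.linalg.norm(U, axis=1) * np.linalg.norm(u_new) + 1e-12
    sims = (U @ u_new) / norms                   # cosine similarity to each user
    nearest = np.argsort(sims)[::-1][:k]         # indices of the k nearest users
    weights = np.clip(sims[nearest], 0.0, None)  # ignore negative similarities
    if weights.sum() == 0:
        return float(R[nearest, item].mean())
    return float(weights @ R[nearest, item] / weights.sum())

# Toy factors (hypothetical): 3 users, 2 latent features, 3 items.
U = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
V = np.array([[5.0, 4.0, 1.0], [1.0, 2.0, 5.0]])
R = U @ V                                        # ratings implied by the factors

r_new = np.array([5.0, 4.0, 1.0])                # ratings of a new user
u_new = project_user(r_new, V)
pred = predict_rating(u_new, U, R, item=2, k=2)
```

In this toy case the new user matches the first latent feature, so the two nearest neighbors are the first two users and the prediction lies between their ratings of the item.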

The kernel method improves on the basic idea of NMF. The privacy-preserving step is similar to that of NMF and yields a rating matrix with hidden information (the hidden matrix). The hidden matrix is then decomposed with the kernel method as follows.

The data are mapped to a new space through a nonlinear mapping. This paper selects the Gaussian kernel function for the mapping, and NMF is then performed to obtain the matrices

Given a new user and its rating information for the items

Then, using the new feature matrix

Finally, given the number of neighbors, the most similar users are found and, with reference to the raw data, their ratings are weighted to obtain the rating of the item

This algorithm consists of two parts: offline data preparation and online data processing. The first part, offline data preparation, mainly performs the non-negative matrix factorization of the hidden matrix. Because this part is carried out offline, in advance of the second part, it does not affect online recommendation generation, so its time complexity is not considered. The second part includes the calculation of

MovieLens is a long-running movie recommendation system. The MovieLens dataset contains ratings of multiple movies from multiple users, as well as movie information and user attribute information. It is often used as a test data set for recommendation systems and machine learning algorithms. The rating file contains the rating given by each user to each movie. The data in our experiments consist of 1,000,000 ratings from 6,040 users on 3,952 movies, and every user has rated at least 20 movies.

The mean absolute deviation is a statistic that describes the degree of data dispersion. The mean absolute error (MAE) is:

where,

It can be known from the
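For concreteness, the MAE can be computed as in the following minimal sketch. The function name and the toy values are illustrative only and are not results from this paper.

```python
def mean_absolute_error(predicted, actual):
    """Average absolute difference between predicted and real ratings."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

# Toy example: absolute errors of 1.0, 0.5 and 0.0 over three ratings.
mae = mean_absolute_error([4.0, 3.5, 2.0], [5.0, 3.0, 2.0])
```

A smaller MAE means that the predicted ratings are closer to the real ratings.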

This paper demonstrates the effectiveness of the proposed algorithm by comparing the kernel method privacy-preserving algorithm with the non-negative matrix factorization privacy-preserving algorithm. First, each user’s unrated movies are filled with the mean of that user’s existing ratings. Then, each user creates a new privacy-preserving rating matrix by creating

This experiment verifies the performance of the recommendation system by predicting the ratings of movies whose ratings are known and comparing the real ratings with the predicted ones. In essence, the rating of the movie to be predicted is set to null, and the rating is then predicted using the hidden matrix and the kernel method-based privacy-preserving algorithm proposed in this paper. Each matrix decomposition yields a result close to the hidden matrix, but the results differ from run to run. This paper therefore performs 20 matrix decompositions and takes the mean of the predictions as the predicted movie rating, which narrows the variation in the results.
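The effect of averaging over repeated decompositions can be sketched as follows. This is a simplified illustration under stated assumptions: it averages the reconstructed matrices of plain NMF runs that differ only in their random initialization, and the toy matrix, the rank, and the function names are hypothetical.

```python
import numpy as np

def nmf(V, rank, seed, n_iter=200, eps=1e-9):
    """Multiplicative-update NMF; different seeds give different factors."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def reconstructions(V, rank, runs=20):
    """One reconstruction W @ H per randomly initialized decomposition."""
    return [W @ H for W, H in (nmf(V, rank, seed=s) for s in range(runs))]

# Toy non-negative matrix standing in for the hidden rating matrix.
V = np.random.default_rng(123).random((6, 5))
recons = reconstructions(V, rank=2, runs=20)
averaged = np.mean(recons, axis=0)  # mean over the 20 decompositions

errors = [np.linalg.norm(V - r) for r in recons]
error_of_average = np.linalg.norm(V - averaged)
```

By the triangle inequality, the error of the averaged reconstruction cannot exceed the mean of the individual errors, which is one way to see why averaging stabilizes the result.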

First, this paper examines the influence of the value of

In

In

When

When

Because predictions with both non-negative matrix factorization and the kernel method rely mainly on the similarity between new users and existing rating users, the impact of the number of neighbors on prediction accuracy is investigated. Using the 6040 by 3952 data set, the number of neighbors is varied and the corresponding prediction results are obtained. The experimental results are shown in

In

In

When

When

To examine the influence of the dispersion of the disturbance data on prediction accuracy, the variance of the disturbance data is varied on the 6040 by 3952 data set and the corresponding prediction results are examined. The experimental results are shown in

The degree of perturbation applied to the raw data clearly has a considerable impact on prediction. The smaller the perturbation, the more accurate the predictions of the recommendation system.

In the above experiments, the results of the kernel method on the perturbed data are slightly better than those of non-negative matrix factorization. Moreover, according to the previous validity analysis, whatever the disturbance distribution, the mean absolute error of a given algorithm should converge to the error on the undisturbed data for large samples.

This paper proposed a kernel method-based privacy-preserving collaborative filtering algorithm that is easy to implement and can guarantee effectiveness of recommendation. This algorithm mainly involves two parameters, one is the dimension value

This paper developed an algorithm that combines kernel non-negative matrix factorization with random perturbation. The algorithm has a privacy-preserving function, which enables an Internet of Things service system to collect the data needed for personalized recommendation while protecting the privacy of users. The experimental analysis demonstrates that, while preserving users’ privacy, the kernel method is not sensitive to k or to the intermediate dimension size t. It can improve recommendation accuracy, achieve effective recommendation, and meet the needs of the recommendation system. Some problems in the algorithm still need further research, such as the value of users or items in kernel non-negative matrix factorization.