The number of mobile devices accessing wireless networks is skyrocketing due to the rapid advancement of sensors and wireless communication technology, and mobile data traffic is anticipated to rise even further in the coming years. The Internet of Things, smart homes, and increasingly sophisticated applications with higher data-rate and stricter latency requirements are driving the development of a new cellular network paradigm. The steady growth of smartphone devices and multimedia apps is rapidly consuming available resources. Offloading computation to distant clouds or to nearby mobile devices has consistently improved the performance of mobile devices, and computation latency can be further reduced by offloading computing tasks to edge servers with a certain level of computing power. Device-to-device (D2D) collaboration can help process small-scale, time-sensitive tasks to reduce task delays even more. However, task offloading performance degrades drastically when the performance capabilities of edge nodes vary. This paper addresses this problem and proposes a new method for D2D communication in which the time delay is reduced by enabling edge nodes to exchange data samples. Simulation results show that the proposed algorithm outperforms the traditional algorithm.

With the development of the Internet of Things (IoT) and Artificial Intelligence (AI) [

At the same time, various future intelligent applications rely on artificial intelligence technology (such as deep learning), using locally obtained data samples for AI model training and inference [

In edge intelligence technology, federated learning [

Efficient implementation of federated learning faces a series of technical challenges. On the one hand, the computing resources of edge nodes are relatively limited. On the other hand, federated learning relies on frequent parameter updates and aggregations between edge nodes and the edge server; as the number of edge nodes and the dimension of the AI model grow, these updates and aggregations incur very large communication overhead. Limited computing and communication resources are therefore the main bottlenecks for improving federated learning performance. In a real network, edge nodes are heterogeneous in computing power, and different edge nodes must process different numbers of data samples, so the execution time of a local model update differs across nodes. At the same time, because of differences in deployment location and the fading characteristics of wireless channels, the channel states between different edge nodes and the edge server also differ, leading to different upload and download times for the model parameters. Optimizing the allocation of the network's communication and computing resources is therefore an important means of improving federated learning performance. In the existing work, reference [
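The parameter update and aggregation step discussed above can be sketched as follows. This is an illustrative FedAvg-style weighted average; the function name, array shapes, and sample counts are assumptions for illustration, not values from this paper:

```python
import numpy as np

def fedavg_aggregate(local_params, sample_counts):
    """Global aggregation: weighted average of local model parameters,
    with weights proportional to each node's number of data samples."""
    weights = np.asarray(sample_counts, dtype=float)
    weights = weights / weights.sum()
    return sum(w * p for w, p in zip(weights, local_params))

# Three heterogeneous edge nodes with different local models and sample counts
params = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])]
counts = [100, 300, 100]
global_params = fedavg_aggregate(params, counts)  # array([0.4, 0.8])
```

Note that the payload exchanged each round has the dimension of the model itself, which is why communication overhead grows with model size and node count.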

In MEC systems, offloading is an important technology that can effectively increase the computing power available to edge nodes and alleviate the mismatch between their computing and communication capabilities and their task load. Generally, according to the offloading target, offloading techniques can be divided into two types, namely offloading between devices and infrastructure (such as base stations) [

This paper studies federated learning in MEC networks. The D2D computing task offloading for federated learning in the MEC network is shown in

In this paper, it is assumed that D2D task offloading between edge nodes and the uploading of model parameters on the uplink both adopt the frequency division multiple access (FDMA) protocol, so that different edge nodes do not interfere with each other during transmission. Based on this, the goal of this paper is to optimize the amount of computing tasks offloaded by each edge node so as to minimize the total time consumed by the computing-task offloading process and the federated learning training process. However, since the computing-task offloading amounts of the edge nodes are discrete variables, this problem is a non-convex optimization problem that is difficult to solve. To facilitate the solution, this paper relaxes the discrete variables to continuous ones, converting the non-convex problem into a convex one, and then rounds the resulting continuous solution to obtain a solution to the original problem. The simulation results show that the proposed D2D computing task offloading scheme reduces the impact of the heterogeneous computing and communication capabilities of edge nodes on federated learning model training: it greatly reduces the time consumed by model training and improves training efficiency, while also reducing the impact of the non-IID characteristics of the data and improving the accuracy of model training.

This paper studies the federated learning system based on MEC. As shown in the system-model figure, the system consists of an edge server and multiple heterogeneous edge nodes, each holding its own local dataset.

The research method in this paper can also be extended to other machine learning models. On edge node

Among them,

On edge servers, the global loss function is defined as:

Among them,

Finding the global model parameter
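The local and global objectives described above can be sketched as follows. The squared hinge loss is used here as an illustrative stand-in for the paper's smoothed-SVM objective, and all names and data are assumptions for illustration:

```python
import numpy as np

def local_loss(w, X, y):
    """Average per-sample loss on one edge node's local dataset.
    Squared hinge loss stands in for the paper's smoothed-SVM objective."""
    margins = np.maximum(0.0, 1.0 - y * (X @ w))
    return float(np.mean(margins ** 2))

def global_loss(w, datasets):
    """Global loss: weighted average of the local losses, with weights
    proportional to each node's number of data samples."""
    total = sum(len(y) for _, y in datasets)
    return sum(len(y) / total * local_loss(w, X, y) for X, y in datasets)

# Tiny illustrative dataset on a single edge node
X1, y1 = np.array([[1.0, 0.0], [0.0, 1.0]]), np.array([1.0, -1.0])
w0 = np.zeros(2)
```

The global minimizer of this weighted objective is what the federated training process approximates without ever centralizing the raw data.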

It is assumed that the FDMA access protocol is used when uploading model parameters from edge nodes to the edge server; different edge nodes use different frequency bands, so there is no mutual interference between them. In addition, the uplink/downlink for model parameter transmission uses Time Division Duplex (TDD) technology; due to channel reciprocity, the channel state information of the uplink and downlink is identical. This paper assumes that the wireless channel remains static throughout the training of the federated learning model.

At the beginning of each frame, that is, when the federated learning model training starts or after the edge server completes the global aggregation operation, the edge server needs to send the global model parameters to each edge node. The information transfer rate for global model parameter download is determined by the user with the worst channel gain [

Among them,

Therefore, the time it takes for the edge server to send the global model parameters to each edge node is:

Among them,
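The download step just described can be sketched as follows. The rate formula (Shannon capacity at the minimum SNR) and all symbols and numbers here, model size S, bandwidth B, transmit power, channel gains, and noise power, are my assumptions standing in for the paper's equation:

```python
import math

def broadcast_download_time(S_bits, B_hz, p_tx, gains, noise):
    """Time for the edge server to multicast the global model parameters.
    The rate is set by the worst channel gain so that every node can decode:
    R = B * log2(1 + p * min(g) / N0),  t = S / R."""
    snr_min = p_tx * min(gains) / noise
    rate = B_hz * math.log2(1.0 + snr_min)  # bits per second
    return S_bits / rate

# Illustrative numbers (not from the paper)
t_dl = broadcast_download_time(S_bits=1e6, B_hz=1e6, p_tx=1.0,
                               gains=[1e-6, 5e-7, 2e-6], noise=1e-9)
```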

After receiving the global model parameters, all edge nodes overwrite their original local model parameters with the global model parameters. The process in which an edge node uses its local dataset to update the local model parameters via gradient descent is called a local model update operation. Each edge node can perform one or more local model update operations. In this paper, all edge nodes use the batch gradient descent (BGD) method to train the model [
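A single BGD local update of the kind described above can be sketched as follows; the squared-hinge gradient is an illustrative stand-in for the paper's smoothed-SVM model, and the data and learning rate are assumed values:

```python
import numpy as np

def bgd_step(w, X, y, lr):
    """One batch gradient descent update over a node's full local dataset.
    Gradient of the mean squared-hinge loss mean(max(0, 1 - y*(X@w))^2),
    an illustrative stand-in for the paper's smoothed SVM."""
    margins = np.maximum(0.0, 1.0 - y * (X @ w))
    grad = -2.0 / len(y) * (X.T @ (y * margins))
    return w - lr * grad

X = np.array([[1.0, 0.0], [0.0, 1.0]])
y = np.array([1.0, -1.0])
w1 = bgd_step(np.zeros(2), X, y, lr=0.5)  # one local model update
```

Because BGD sweeps the entire local dataset, the cost of one update scales linearly with the node's sample count, which is what makes sample offloading effective later on.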

For any edge node

In order to analyze the latency of the local model update, it is assumed that processing one data sample in a local model update operation requires

During the training of the federated learning model, it is difficult to count the exact number of floating-point operations of the piecewise loss function, so this paper approximates the per-sample cost of one local model update as a polynomial in the model dimension d, on the order of 2d^2 floating-point operations for the smoothed SVM model used here.

The time it takes edge node i to complete its local model update operations is:

Among them,
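The local update latency just described can be sketched as total floating-point operations divided by the node's compute rate. The 4500 samples and 1.5 GHz CPU frequency appear in the paper's experimental setup; the 4 FLOPs-per-cycle figure is an assumed value:

```python
def local_update_time(n_samples, flops_per_sample, n_updates,
                      cpu_freq_hz, flops_per_cycle):
    """Latency of n_updates local model updates over n_samples samples:
    total floating-point operations divided by the node's compute rate."""
    total_flops = n_updates * n_samples * flops_per_sample
    return total_flops / (cpu_freq_hz * flops_per_cycle)

# Edge node I from the setup (1.5 GHz); flops_per_cycle=4 is an assumption
t_local = local_update_time(n_samples=4500, flops_per_sample=2 * 784**2,
                            n_updates=1, cpu_freq_hz=1.5e9, flops_per_cycle=4)
```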

When each edge node completes its local model update operations, it uploads the updated local model parameters to the edge server.

Each edge node uploads its local model parameters according to the FDMA access protocol, and each of the K edge nodes is allocated a transmission bandwidth of B/K. Therefore, the information transmission rate at which the edge server receives the local model parameters uploaded by edge node i is:

Among them,

Since the local model parameters have the same number of bits as the global model parameters, the time consumed by edge node i to upload its local model parameters is:
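The upload time under FDMA can be sketched as follows; the Shannon-rate formula over the B/K sub-band follows the paper's access model, while all numeric values are assumptions for illustration:

```python
import math

def upload_time(S_bits, B_hz, K_nodes, p_tx, gain, noise):
    """Time for one edge node to upload its local model parameters over
    its FDMA sub-band of width B/K: t = S / ((B/K) * log2(1 + p*g/N0))."""
    rate = (B_hz / K_nodes) * math.log2(1.0 + p_tx * gain / noise)
    return S_bits / rate

# Illustrative numbers (not from the paper)
t_ul = upload_time(S_bits=1e6, B_hz=1e6, K_nodes=3,
                   p_tx=0.1, gain=1e-6, noise=1e-9)
```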

After receiving the local model parameters uploaded by all edge nodes participating in the joint learning, the edge server performs a weighted average operation on all the local model parameters, and the process of obtaining new global model parameters is called global aggregation.

Since the computing power of the edge server is strong enough, and the global aggregation operation only performs a weighted average of the received local model parameters, which requires little computation, the time consumed by global aggregation is negligible.

After completing the model training of M-frame joint learning, the total time consumed by the system is:

Among them,
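The total time over M frames can be sketched as follows. The structure, per-frame download plus waiting for the slowest node's local updates and upload, with aggregation treated as negligible, follows the synchronous model described above; the numbers are assumptions:

```python
def total_training_time(M_frames, t_download, t_local, t_upload):
    """Total latency of M frames of synchronous federated training.
    Each frame: one global-model download, then the server waits for the
    slowest node to finish its local updates and upload; global aggregation
    itself is treated as negligible."""
    per_frame = t_download + max(tl + tu for tl, tu in zip(t_local, t_upload))
    return M_frames * per_frame

# Three heterogeneous nodes; the frame is gated by the slowest one (1.5 + 0.2 s)
T = total_training_time(M_frames=10, t_download=0.1,
                        t_local=[0.9, 1.5, 0.6], t_upload=[0.4, 0.2, 0.5])
```

The `max` over nodes is the mathematical expression of the straggler problem analyzed in the next paragraphs.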

Edge nodes are heterogeneous: they differ in CPU clock frequency, in the number of floating-point operations performed per CPU clock cycle, in the number of stored data samples, and in transmit power. As a result, the time taken to complete the local model update and the local model parameter upload within one frame differs from node to node.

It can be seen that, because of these differences, two edge nodes i and j generally take different amounts of time to complete one frame, so global aggregation is gated by the slowest node.

The greater the heterogeneity among edge nodes, the longer the edge nodes with stronger computing and communication capabilities (or fewer data samples) must wait, leaving their computing and communication resources idle. At the same time, this prolongs the federated learning model training process and reduces its training efficiency.

In order to reduce the impact of the heterogeneous computing and communication capabilities of edge nodes on the training efficiency of the federated learning model, this paper proposes a D2D computing task offloading scheme for federated learning. The process of performing D2D computing task offloading is defined as the 0th frame of federated learning. The D2D computing task offloading procedure is shown in

In machine learning, the number of data samples reflects the amount of computation required for model training. Therefore, D2D computing task offloading is in fact the offloading of data samples between edge nodes.

Let the number of data samples offloaded from edge node i to edge node j be denoted N_ij.

The FDMA access protocol is also used when computing tasks are offloaded between edge nodes, to avoid mutual interference between different links. Since computing tasks are not transferred between every pair of edge nodes, the bandwidth allocated to the communication link between any two edge nodes that do exchange tasks is

Let

Among them,

Therefore, the time it takes for edge node i to offload its data samples to edge node j is:

Among them,

After D2D computing task offloading, the time consumed by edge node i to complete its local model update operations is:

Substituting
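The two quantities involved in the substitution above, the D2D transfer time and the post-offload sample count, can be sketched as follows. The 6272 bits per sample (784 bytes per MNIST image) and all channel numbers are assumed values:

```python
import math

def d2d_transfer_time(n_offloaded, bits_per_sample, link_bw_hz, p_tx, gain, noise):
    """Time to push n_offloaded data samples over one D2D FDMA sub-link."""
    rate = link_bw_hz * math.log2(1.0 + p_tx * gain / noise)  # bits/s
    return n_offloaded * bits_per_sample / rate

def samples_after_offload(n_local, n_sent, n_received):
    """Samples a node trains on after D2D offloading:
    original count, minus samples sent out, plus samples received."""
    return n_local - n_sent + n_received

# A node starts with 4500 samples (as in the setup) and sends 1000 away
n_after = samples_after_offload(n_local=4500, n_sent=1000, n_received=0)
t_d2d = d2d_transfer_time(n_offloaded=1000, bits_per_sample=6272,
                          link_bw_hz=1e6, p_tx=0.1, gain=1e-6, noise=1e-9)
```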

Based on the above D2D computing task offloading scheme, this paper considers how computing tasks should be distributed among the edge nodes so as to minimize the total time consumed by the computing-task offloading process and the federated learning model training process, achieving the optimal trade-off between the two. Therefore, the goal of this paper is to minimize the total latency of the federated learning training process by optimizing the number of data samples each edge node offloads

Among them,

Although the D2D computing task offloading scheme for federated learning adds the time consumed by the D2D offloading process itself, it effectively reduces the impact of the heterogeneous computing and communication capabilities of the edge nodes, so that each edge node's computing task load matches its computing power. This reduces the time consumed by federated learning model training and thereby minimizes the total time consumed by the system, achieving the optimal trade-off between the time spent offloading computing tasks and the time spent training the model.

Since the variables

To facilitate the solution, the integer offloading variables are first relaxed to continuous ones.

Since problem (P1.1) is a convex optimization problem, the mature convex optimization tool CVX can be used to solve it, and the resulting continuous solution is then rounded to obtain an integer solution to the original problem.
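The rounding step of this relax-and-round approach can be sketched as follows; the rounding rule (floor everything, then hand the leftover units to the largest fractional parts so the total offload amount is preserved) is one reasonable choice, not necessarily the exact rule used in the paper, and the continuous solution shown is a made-up example:

```python
import numpy as np

def round_preserving_sum(x_cont):
    """Round a continuous offloading solution to integers while keeping the
    total number of offloaded samples unchanged: floor every entry, then give
    the leftover units to the entries with the largest fractional parts."""
    floors = np.floor(x_cont).astype(int)
    remainder = int(round(x_cont.sum())) - int(floors.sum())
    order = np.argsort(-(x_cont - floors))  # largest fractional parts first
    floors[order[:remainder]] += 1
    return floors

x_cont = np.array([1200.6, 830.7, 468.7])  # e.g. a continuous solution from CVX
x_int = round_preserving_sum(x_cont)
```

Preserving the sum matters here because the offloaded samples sent by one node must equal the samples received by the others.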

In this section, simulation experiments verify that the D2D computing task offloading scheme proposed in this paper achieves a large performance gain in the time consumed by the federated learning process.

This paper simulates a MEC environment on a cellular network consisting of an edge server and several edge devices. The distribution of edge nodes in the MEC system is shown in

In the MEC system, three heterogeneous edge nodes, corresponding to three different types of smartphone and named edge node I, edge node II, and edge node III, participate in the federated learning model training. The CPU clock frequency of edge node I is 1.5 GHz, and the number of floating-point operations

The public dataset used for federated learning model training is the MNIST dataset, comprising a total of 70,000 images of handwritten digits (white digits on a black background), of which 60,000 images are training samples and 10,000 are test samples [

Before the experiment starts, each edge node holds 4500 data samples, and the dataset owned by each edge node is assumed to be non-IID; that is, the data samples owned by each edge node cover only a subset of the MNIST label classes.

The model trained in the experiments is the SSVM. Performing a local model update operation with one data sample requires 2 × 784^{2} = 1,229,312 floating-point operations, that is,
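The per-sample operation count quoted above follows directly from the MNIST image size; the 2*d^2 cost model is taken from the setup described here:

```python
# Per-sample cost of one SSVM local update on MNIST: each 28x28 image is
# flattened to d = 784 features, and one update over one sample is modeled
# as costing 2 * d^2 floating-point operations.
d = 28 * 28
flops_per_sample = 2 * d ** 2  # 1,229,312
```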

Given the number of global aggregations

It can be seen from

The training accuracy of the federated learning model before and after D2D computing task offloading is shown in

It can be seen from

This paper considers a federated learning model jointly involving an edge server and multiple edge nodes in an MEC system, studies the influence of edge-node communication and computing heterogeneity on the training efficiency of the federated learning model, and proposes a D2D computing task offloading scheme for federated learning. By reallocating the data samples of the edge nodes participating in federated learning, the scheme achieves the optimal trade-off between the time consumed by data sample offloading and the time consumed by model training, thereby minimizing the total time consumed by the system. The simulation results verify that the proposed scheme effectively reduces the impact of the heterogeneous computing and communication capabilities of edge nodes, significantly improves the training efficiency of the federated learning model, and at the same time mitigates the influence of the non-IID characteristics of the data, improving the training accuracy of the model. Future work will consider other parameter settings and evaluate the proposed method further.

The authors would like to thank the editors and reviewers for their review and recommendations.

The authors received no specific funding for this study.

The authors declare that they have no conflicts of interest to report regarding the present study.