As shown in Fig. 1, we consider a wireless edge cache-enabled network, in which the base station (BS) is equipped with storage and connected through backhaul links to M relays {Rm|m=1,2,…,M}, each with cache space. Each relay node covers N mobile devices (MDs) {MDn|n=1,2,…,N}, and the buffer space of each relay node is C. The MEC server is located at the BS; it periodically predicts the popularity of files from the collected historical data and updates the caching strategies of the relay nodes.
2.1 System model
Let {Ti(αi,βi,γi)|i=1,2,…,I} denote the set of computational tasks stored at the BS, where αi is the input data size of task Ti, βi is the number of CPU cycles required to accomplish the task, and γi denotes the size of the computation result. To make the best use of the limited storage at the BS and relays, and to satisfy the needs of most MDs, we need to accurately predict the contents that MDs around the BS will request, and then compute these contents in advance at the BS. Therefore, we use the cache hit rate to measure the prediction performance. In particular, the cache hit rate Phit is defined as
$$\begin{array}{*{20}l} P_{hit}=\frac{\sum_{n=1}^{N}\sum_{i=1}^{I} x_{n,i}}{U}, \end{array} $$
(1)
where U is the total number of requests sent by users around the BS, and xn,i is the caching strategy defined as
$$ x_{n,i}= \left\{ \begin{array}{ll} 1 & \text{if the result of file}\ i\ \text{requested by user}\ n\ \text{is cached at the BS}, \\ 0 & \text{otherwise}. \end{array} \right. $$
(2)
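For concreteness, the following sketch (in Python, with hypothetical request data) shows how the hit rate in (1) can be evaluated from a given set of caching decisions (2):

```python
import numpy as np

# Hypothetical example: N = 3 users and I = 5 task results.
# x[n, i] = 1 if the result of task i requested by user n is cached (Eq. (2)).
x = np.array([[1, 0, 1, 0, 0],
              [0, 1, 0, 0, 0],
              [1, 0, 0, 1, 0]])

U = 10  # total number of requests sent by users around the BS

# Cache hit rate as in Eq. (1): the fraction of requests served from the cache.
P_hit = x.sum() / U
print(f"P_hit = {P_hit:.2f}")  # -> P_hit = 0.50
```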
The BS obtains the files that users are likely to request by predicting file popularity, computes the corresponding tasks on the edge server, and then sends the results to the appropriate relay nodes for caching. We assume that the data rate of the wireless link between the BS and relay Rm follows the Shannon capacity and is given by
$$\begin{array}{*{20}l} C_{B,m}=W_{B,m}\log_{2}\left(1+\frac{P_{B,m}|h_{B,m}|^{2}}{\sigma_{B,m}^{2}}\right), \end{array} $$
(3)
where WB,m denotes the wireless bandwidth and \(h_{B,m}\sim \mathcal {CN}(0,\epsilon _{B,m})\) denotes the channel gain between the BS and relay Rm [20–22]. PB,m is the transmit power at the BS and \(\sigma _{B,m}^{2}\) is the variance of the additive white Gaussian noise at the receiving relay Rm [23–25]. Let fB denote the computational capability of the BS; the computation latency of task Ti can then be expressed as
$$\begin{array}{*{20}l} L_{compute}^{i} = \frac{\beta_{i}}{f_{B}}. \end{array} $$
(4)
The transmission latency, incurred when the BS sends the result of task Ti to relay Rm, can be calculated as
$$\begin{array}{*{20}l} L_{B,m}^{i} = \frac{\gamma_{i}}{C_{B,m}}. \end{array} $$
(5)
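As a quick numerical illustration of (3)–(5) (a sketch with assumed parameter values, not values from the paper):

```python
import math

# Assumed link and task parameters (illustrative only).
W_Bm = 10e6        # bandwidth W_{B,m} in Hz
P_Bm = 1.0         # transmit power P_{B,m} in W
h_Bm = 1e-3        # channel gain magnitude |h_{B,m}|
sigma2_Bm = 1e-9   # noise variance sigma_{B,m}^2 in W
f_B = 5e9          # computational capability f_B in CPU cycles/s
beta_i = 1e9       # CPU cycles required by task T_i
gamma_i = 2e6      # size of the result of T_i in bits

# Eq. (3): Shannon rate of the BS-to-relay link (bits/s).
C_Bm = W_Bm * math.log2(1 + P_Bm * abs(h_Bm) ** 2 / sigma2_Bm)

# Eq. (4): computation latency at the BS.
L_compute = beta_i / f_B

# Eq. (5): transmission latency from the BS to relay R_m.
L_Bm = gamma_i / C_Bm

print(f"C_Bm = {C_Bm / 1e6:.2f} Mbit/s")   # -> 99.67 Mbit/s
print(f"L_compute = {L_compute:.3f} s")    # -> 0.200 s
print(f"L_Bm = {L_Bm:.3f} s")              # -> 0.020 s
```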
Similarly, we assume that the data rate of the wireless link between relay Rm and user MDn follows the Shannon capacity and is given by
$$\begin{array}{*{20}l} C_{m,n}=W_{m,n}\log_{2}\left(1+\frac{P_{m,n}|h_{m,n}|^{2}}{\sigma_{m,n}^{2}}\right), \end{array} $$
(6)
where Wm,n denotes the wireless bandwidth and \(h_{m,n}\sim \mathcal {CN}(0,\epsilon _{m,n})\) denotes the channel gain between relay Rm and user MDn. Pm,n is the transmit power at relay Rm and \(\sigma _{m,n}^{2}\) is the variance of the additive white Gaussian noise at the receiving user MDn. The transmission latency, incurred when relay Rm sends the result of task Ti to user MDn, can be calculated as
$$\begin{array}{*{20}l} L_{m,n}^{i} = \frac{\gamma_{i}}{C_{m,n}}. \end{array} $$
(7)
In the proposed scenario, the BS uses its idle time to compute tasks and to transmit the results to nearby relays in advance, according to the prediction. When a user requests a task result, it can obtain the corresponding result from a nearby relay without waiting. The waiting latency mainly consists of the computation latency at the BS and the transmission latency from the BS to the relay. We reduce the waiting latency of the user by increasing the predictive cache hit rate. Hence, the latency reduction can be expressed as
$$\begin{array}{*{20}l} L_{re}^{i} = x_{n,i} \left(L_{compute}^{i} + L_{B,m}^{i} \right). \end{array} $$
(8)
A higher \(L_{re}^{i}\) indicates a higher cache hit rate.
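Continuing the sketch above (with assumed per-task latencies), the total latency reduction simply aggregates Eq. (8) over the cached results:

```python
import numpy as np

# Assumed per-task latencies (s): computation at the BS and BS-to-relay transmission.
L_compute = np.array([0.20, 0.15, 0.30, 0.10, 0.25])
L_Bm = np.array([0.02, 0.01, 0.03, 0.02, 0.02])

# Caching decisions x_{n,i} for one user n (Eq. (2)).
x_n = np.array([1, 0, 1, 0, 1])

# Eq. (8) summed over tasks: latency saved because cached results require
# neither on-demand computation nor BS-to-relay transmission.
L_re_total = np.sum(x_n * (L_compute + L_Bm))
print(f"Total latency reduction: {L_re_total:.2f} s")  # -> 0.82 s
```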
2.2 Problem formulation
The problem in this study consists of two subproblems: maximizing the cache hit rate and minimizing the request latency.
2.2.1 Maximizing cache hit rate
The problem of making the best use of the cache space of the BS can be translated into the problem of maximizing the cache hit rate, which can be written as
$$\begin{array}{*{20}l} \max_{} \quad &P_{hit} \end{array} $$
(9a)
$$\begin{array}{*{20}l} \text{s.t.} \quad &C_{1}: \sum_{n=1}^{N}\sum_{i=1}^{I} x_{n,i}\,\gamma_{i} \leq L, \end{array} $$
(9b)
where L is the maximum cache space of the BS. C1 indicates that the total size of the task results cached at the BS cannot exceed its cache space limit. Note that Phit gets higher as \(L_{re} = \sum_{i} L_{re}^{i}\) gets higher.
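Problem (9) is a knapsack-style selection. As an illustration (not the algorithm proposed in this paper), a simple greedy sketch caches the results with the highest predicted request count per unit of space until L is exhausted:

```python
def greedy_cache(pred_requests, sizes, L):
    """Greedily cache task results to maximize predicted hits under space L.

    pred_requests[i]: predicted number of requests for the result of task i.
    sizes[i]: size gamma_i of the result of task i.
    L: maximum cache space of the BS.
    Returns the indices of the cached results (an approximation to problem (9)).
    """
    # Rank results by predicted hits per unit of cache space.
    order = sorted(range(len(sizes)),
                   key=lambda i: pred_requests[i] / sizes[i],
                   reverse=True)
    cached, used = set(), 0
    for i in order:
        if used + sizes[i] <= L:
            cached.add(i)
            used += sizes[i]
    return cached

# Illustrative numbers (assumed): 5 task results.
print(greedy_cache([50, 10, 40, 5, 30], [2, 1, 4, 1, 3], L=6))  # -> {0, 1, 4}
```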
2.2.2 Minimizing request latency
When there are multiple relays around a user, the BS needs to decide which relay should receive the predicted results, so that the user spends less time obtaining them. From (6), we can see that by accounting for the channel condition and bandwidth between the relays and the user, the time for user MDn to obtain the result of task Ti can be reduced. For the considered system, the latency objective in the content placement process can be expressed as
$$\begin{array}{*{20}l} \min \quad L_{pl} = \sum_{m=1}^{M}\sum_{n=1}^{N}\sum_{i=1}^{I} L_{m,n}^{i}. \end{array} $$
(10)
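A sketch of the relay selection implied by (6) and (10), with assumed downlink rates: each result is placed at the relay that offers its requesting user the highest rate, i.e., the lowest \(L_{m,n}^{i}\):

```python
import numpy as np

# Assumed downlink rates C_{m,n} (bits/s) for M = 3 relays and N = 2 users.
C = np.array([[8e6, 2e6],
              [5e6, 6e6],
              [1e6, 9e6]])
gamma = np.array([2e6, 4e6])  # result sizes gamma_i (bits) for I = 2 tasks

# Eq. (7): transmission latency for every (relay, user, task) triple, shape (M, N, I).
latency = gamma[None, None, :] / C[:, :, None]

# Place each result at the relay that minimizes the latency for its user.
best_relay = latency.argmin(axis=0)   # shape (N, I): chosen relay per (user, task)
L_pl = latency.min(axis=0).sum()      # the minimized objective in (10)
print(best_relay)                     # -> [[0 0] [2 2]]
print(f"L_pl = {L_pl:.2f} s")         # -> 1.42 s
```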
The conventional approach to predictive modeling involves complex hand-crafted feature engineering and deep manual analysis of the data. In this paper, we aim to improve the hit rate by learning user preferences from large amounts of historical data. To this end, a deep neural network (DNN) [26–28] based predictive framework is developed to learn user preferences. The data can be fed directly into the network without manual processing, allowing more information to be mined from it. Hence, we use a DNN to solve the content popularity prediction problem, which is given as follows.
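As a sketch of such a predictor (the layer sizes, feature encoding, and usage below are illustrative assumptions, not the exact architecture used in this paper), a feed-forward DNN mapping historical request features to a per-file popularity distribution could look like:

```python
import torch
import torch.nn as nn

class PopularityDNN(nn.Module):
    """Feed-forward DNN mapping historical request features to per-file
    popularity scores (illustrative architecture)."""
    def __init__(self, n_features: int, n_files: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, n_files),   # one score per file
        )

    def forward(self, x):
        # Softmax turns raw scores into a predicted request distribution.
        return torch.softmax(self.net(x), dim=-1)

# Hypothetical usage: 32-dimensional history features, I = 100 files.
model = PopularityDNN(n_features=32, n_files=100)
x = torch.randn(16, 32)                      # a batch of historical feature vectors
popularity = model(x)                        # shape (16, 100); rows sum to 1
top_files = popularity.mean(dim=0).topk(10)  # 10 most popular files to precompute
```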