Joint optimization of dynamic resource allocation and packet scheduling for virtual switches in cognitive internet of vehicles

The rapidly evolving machine learning technologies have reshaped the transportation system and play an essential role in the Cognitive Internet of Vehicles (CIoV). Most cognitive services are computation-intensive or storage-intensive, and thus they are usually deployed in edge or cloud data centers. In today’s data center networks, the virtual machines hosted on a server are connected to a virtual switch that forwards all packets for the cognitive services deployed on those virtual machines. Without an efficient resource allocation and packet scheduling strategy, the virtual switches therefore become a performance bottleneck for cognitive services. However, the highly dynamic characteristics of cognitive services make the resource allocation and packet scheduling problem for virtual switches challenging. To guarantee the performance of cognitive services, we investigate the joint optimization of dynamic resource allocation and packet scheduling for virtual switches. We first formulate it as a mathematical optimization problem. Then, we analyze the problem with the Lyapunov Optimization Framework and derive efficient optimization algorithms with performance tradeoff bounds. Finally, we evaluate these algorithms on a testbed and a network-wide simulation platform. Experimental results show that our algorithms outperform other designs and meet the theoretical performance bound.

We illustrate the architecture of the CIoV framework in Fig. 1. The CIoV architecture consists of cognitive engines, cloud data centers, edge data centers, and vehicles. The cognitive engine is a set of software systems running in the cloud and edge data centers, which provides the software runtime environment, service management, and task scheduling for machine learning tasks. Cloud and edge data centers offer centralized and distributed hardware environments for cognitive engines, respectively. Cloud data centers centrally host high-performance servers and accelerators for machine learning model training and inference. Edge data centers are deployed at the edge of networks and are geographically distributed in the transportation system. Servers in cloud data centers are connected through the data center networks, and servers in edge data centers are connected to cloud data centers via high-speed backhaul links. ❶ All data transmissions of cognitive services rely on the CIoV network system. The cognitive service request generated by a vehicle is first received by an edge data center server. ❷ Edge data centers handle emerging and latency-sensitive service requests in local virtual machines and forward other requests to cloud data centers. ❸ When a service request demands a larger amount of computation and storage resources, the service request needs to traverse the data center networks and acquire resources on other servers. The request is then forwarded through the data center network to a server that hosts the cognitive service in a virtual machine. After that, the cognitive service replies to the request along the reverse path to the vehicle.
All cognitive service requests and responses are transmitted through the edge or cloud data center networks, which therefore play an irreplaceable role in CIoV. Although these data center networks are well equipped with high-performance ASIC-based switches and sophisticated network scheduling algorithms to handle dynamic and urgent requests, a critical and often overlooked component, the software-based virtual switch, becomes a hindrance to system optimization. Servers in edge and cloud data centers typically host all machine learning services in virtual machines through a virtualized environment. In Fig. 2, we illustrate the architecture of a virtualized server in edge and cloud data centers. The virtualized server hosts multiple virtual machines (VMs), a hypervisor, and a vswitch from the application layer to the system layer. Cognitive service providers rent the compute, storage, and bandwidth resources of edge and cloud data centers to deploy machine learning services in this architecture. The cloud manager virtualizes the hardware resources and manages the virtual machines through the hypervisor. Furthermore, the cloud manager usually deploys a vswitch on each virtualized server to provide numerous network services, such as Access-Control List (ACL), Quality of Service (QoS), and Virtual Private Cloud (VPC). The vswitch is system software that takes over the physical Network Interface Cards (NICs) plugged into the server and the virtual (software) NICs created by virtual machines. According to the cloud manager's configuration, the vswitch forwards packets between these NICs by forwarding rules. All cognitive service requests and replies are forwarded by virtual switches, which affects the quality of service of each cognitive service.
The virtual machines, the hypervisor, and the vswitch share hardware resources on a server. The virtual machines occupy most of the CPU cores on the server and bring significant revenue for the cloud manager. The hypervisor occasionally takes up a small number of CPU cores to maintain the virtualization system. The vswitch takes up a considerable amount of CPU resource to provide line-rate packet forwarding for the VMs. In addition, with the evolution of data center networks, the number of NICs and the NIC bandwidth are growing substantially. Thus, the cloud manager has to allocate more CPU resource to the vswitch. However, allocating more CPU resource to the vswitch reduces the revenue of cloud services and increases operational expenses (OpEx).
To tackle this problem, we need to efficiently allocate resource to each vswitch in edge and cloud data centers. However, the rapid development of edge and data center networks in CIoV also creates enormous challenges in resource allocation for vswitches. On the one hand, the cloud manager can hardly predict the users' requirements accurately in real time. Virtual machines deployed by cognitive service providers may receive and transmit data at arbitrary times. For example, cognitive service requests may be generated at any time when vehicles pass through an edge network node, which makes requests unpredictable. Thus, it is difficult for the cloud manager to accurately predict when virtual machines receive requests and send replies. Although it is possible to summarize traffic patterns statistically and allocate resource to vswitches based on prior knowledge, such prediction and resource allocation are coarse-grained and inaccurate, which leads to improper resource allocation in CIoV.
On the other hand, the emerging hardware architecture of virtualized servers in data centers also complicates the dynamic resource allocation problem for vswitches. In recent years, cloud managers have installed multiple physical NICs on each server to increase physical network bandwidth, achieve load balance, and improve server access reliability. These physical NICs are connected to Top-of-Rack (ToR) switches, and the vswitch takes over the NICs in the same server. However, deploying multiple NICs on each server further complicates the dynamic resource allocation of vswitches in data centers. First, the packet scheduling decisions of a vswitch should be made based on the amount of CPU resource allocated to the vswitch. The vswitch executes packet scheduling on CPU cores at runtime to classify and forward each packet to the NICs. The packet scheduling decision determines which NIC a packet is forwarded to, and it affects both the incoming workload of a NIC and the number of CPU cores allocated to that NIC. For example, the vswitch could schedule more packets to a NIC allocated with more CPU cores to reduce CPU idling. Therefore, to achieve better packet delivery performance, we need to jointly optimize the dynamic resource allocation problem and the packet scheduling problem. Second, to enhance the packet forwarding throughput for cognitive services with high QoS requirements, the resource allocation and packet scheduling decisions should be made in a very short time according to the dynamic traffic pattern, which makes the joint optimization problem more difficult.
To address the above challenges, we jointly model the dynamic vswitch resource allocation problem and the packet scheduling problem in CIoV. The model jointly optimizes the resource allocation strategy for each vswitch NIC and the packet scheduling strategy for each packet in each time slot. To solve the joint optimization problem efficiently, we propose dynamic resource allocation and packet scheduling algorithms for vswitches based on Lyapunov optimization.
The main contributions of this paper can be summarized as follows.
• We model the problem as an optimization problem, which minimizes the time expenditure of resource allocation on vswitches under resource constraints and network constraints.
• We apply the Lyapunov Optimization Framework [16][17][18][19] to transform the problem into a discrete-time queueing system, which converts the solution of the optimization problem into a queue stability problem. Lyapunov optimization is an optimization framework for dynamic systems, and it can make decisions using dynamic queue backlogs (lengths) to stabilize the entire system in real time. It does not require prior knowledge about the input data and only needs queue backlogs to optimize the system in real time.
• We derive the performance tradeoff O(1/V) and O(V) between the time-average expectation of resource allocation and queue backlogs and design low-complexity distributed scheduling algorithms to solve the problem. The factor V is a positive penalty factor, which indicates the relation between the system stability and the objective of the problem. The Lyapunov optimization framework also provides a performance bound between the optimization objective and queue backlogs, enabling us to balance the stability of the system and the optimization objective. We also propose extended algorithms to improve queue backlogs without breaking the performance tradeoff.
• We evaluate our packet scheduling and resource allocation algorithms on virtualized servers and network-wide simulations. Experimental results show that our algorithms outperform other methods in real-world scenarios and workloads.

Related works
CIoV empowers IoV with artificial intelligence by integrating data mining, reinforcement learning, and deep learning into its architecture. CIoV frameworks usually provide cognitive services by deploying services in edge and cloud data centers. Cloud data centers provide high-powered computation and deploy more accurate machine learning models. They also allow global resource optimization to provide more efficient and dynamic cognition services. On the other hand, edge data centers handle a part of the lightweight computation and storage tasks, which reduces transmission latencies and network congestion. Chen et al. [20] first introduce a Cognition Layer to enhance intelligence in the IoV architecture. The Cognition Layer handles vehicle tasks, including traffic condition quality analysis, driver behavior modeling, etc. It is deployed in edge and cloud data centers: it processes latency-sensitive tasks in edge data centers and forwards non-latency-sensitive or computation-intensive tasks to cloud data centers. Qian et al. [21] present a three-layer CIoV design that deploys cognitive services in a Cognitive Engine Layer. The Cognitive Engine Layer handles cognition requests in data centers, which are generated from Road Side Units (RSUs) and vehicles. Hasan et al. [22] propose a five-layer CIoV model, which includes an Edge Computing and Data Pre-Processing Layer and a Cognition and Control Layer. The Edge Computing and Data Pre-Processing Layer is deployed in edge data centers and collects data through intra-vehicular and inter-vehicular communications. It forwards the preprocessed data to the Cognition and Control Layer. The Cognition and Control Layer is deployed in cloud data centers and provides data storage, computing, and machine learning-based data processing for cognition services. Lu et al. [11] design a CIoV architecture for autonomous driving. It preprocesses real-time cognitive requests in the Fog Layer and offloads other requests to the Cloud Layer.
All of the above CIoV architectures therefore rely on networks between edge and cloud data centers, and the optimization of the network resource allocation is essential in CIoV. The data networks of edge and cloud carry all cognitive service requests and responses and thus play a critical role in CIoV. However, as a key component in data center networks, software-based vswitches require a significant amount of computing resources on servers to provide high-performance packet processing. Some works have studied the dynamic resource allocation of a single vswitch. Shenango [23] designs a congestion detection algorithm that tracks how long each packet stays in the ingress queue of the vswitch as an indicator for core allocation. It reduces the granularity of core allocation time intervals to microsecond timescales with a polling kernel module. Arachne [24] provides a dynamic CPU core allocation algorithm for the vswitch, which scales up the number of allocated CPU cores according to a CPU load factor and scales down according to a hysteresis factor. Snap [25] allocates CPU cores to the vswitch through a centralized resource controller, which supports allocating CPU cores to reduce the tail latency of packet forwarding for vswitches. In contrast, our network model and algorithms dynamically allocate CPU cores to virtual switches based on the analysis of the Lyapunov optimization framework and provide performance bounds on queue length and resource allocation.
On the other hand, network-wide vswitch resource allocation also reduces the resource overhead of cognitive services and improves their performance from a global perspective. Many studies have investigated network-wide vswitch resource optimization on edge and cloud data centers. Yang et al. [26] jointly optimize the vswitch deployment and network routing to improve the network load balancing in Software Defined Networks (SDNs). They formulate the optimization problem as an Integer Linear Problem (ILP) and solve it with approximation algorithms. However, the algorithm only optimizes resource allocation on virtual switches statically, whereas cognitive service requests in CIoV networks are dynamic and hard to predict and collect in real time, which prevents the deployment of these methods in real-world scenarios. Yang et al. [27,28] optimize the resource allocation of virtual switches and the network embedding problem together in an online model and design a heuristic algorithm to solve the relaxed problem. The real-time resource allocation algorithm requires solving the resource allocation problem for each vswitch in a centralized network controller. CIoV networks are large-scale and geographically distributed, which complicates the solution of the problem and introduces significant transmission latency. Instead, our virtual switch resource allocation algorithm can dynamically allocate resources to virtual switches without prior knowledge. Moreover, we can decompose the network-wide virtual switch resource allocation into distributed algorithms through the Lyapunov Optimization Framework. Each virtual switch only executes the algorithm on its own server to adjust the local resource allocation, which does not require centralized resource scheduling.

Network model and performance analysis
In this section, we propose a mathematical model of the dynamic vswitch resource allocation problem and the packet scheduling problem in CIoV. It optimizes the time expenditure of resource allocation on vswitches and packet scheduling for each network connection. Besides, we apply Lyapunov Optimization Framework [16][17][18] to transform the joint optimization problem into a discrete-time queuing system. It converts the model constraints to virtual queues and defines the Lyapunov Function to represent the stabilization of the system. To solve the optimization problem and to stabilize the queue system, we minimize the drift of the Lyapunov Function and derive the optimization policies from the function. In addition, we deduce the performance tradeoff of the policies from the theoretical analysis, and it shows that the time-average expectation of vswitch resource allocation and queue backlogs con-

Network model
We provide an instance of the dynamic vswitch resource allocation problem and the packet scheduling problem in CIoV in Fig. 3. All cognitive services are deployed in virtual machines on virtualized servers, and service requests and responses can be considered as network connections in this model. The data center operator manages the system in slotted time. ❶ The vswitch in server $s_1$ creates a vswitch virtual port $u$ for the virtual machine VM1 and binds NICs as vswitch physical ports. The vswitch in server $s_2$ also creates a vswitch virtual port $v$ for the virtual machine VM2 and binds NICs as vswitch physical ports. ❷ Each vswitch port $p$ enqueues received packets to a Receive Queue (Rx-Queue), and a packet classifier dequeues packets at the rate of $f(r_p(t))$. The vswitch resource allocation decision variable $r_p(t)$ denotes the resource that the cloud operator allocates to vswitch port $p$ on slot $t$. The performance function $f(\cdot)$ describes the relationship between resource allocation and packet processing rate, which can be measured on servers. Cache misses and the data bus bandwidth limit the forwarding performance of the vswitch [29][30][31], which makes $f(\cdot)$ a nondecreasing concave function [32]. The cloud operator allocates CPU resource to port $u$ by setting the vswitch resource allocation variable $r_u(t)$ for forwarding packets on server $s_1$.
In time slot $t$, ❹ VM1 creates a connection $m$ to VM2 that requires network bandwidth $A_m(t)$. The required bandwidth $A_m(t)$ is a stochastic variable and is i.i.d. over slots. The cloud operator should not only allocate the resource to vswitches but also make packet scheduling decisions on this slot. For the packet scheduling decisions, the cloud operator needs to choose a forwarding path that connects $s_1$ to $s_2$ for the connection $m$. In this example, the cloud operator selects the forwarding path $y^{ij}_m(t)$, which connects port $i$ in $s_1$ to port $j$ in $s_2$. The connection occupies the bandwidth $A_m(t)$ on the forwarding path $(i, j)$. The capacity of the forwarding path is $\beta_{ij}$. ❺ The vswitch in server $s_2$ handles the connection $m$ on port $j$ and enqueues packets in the queue $Q_j(t)$. ❻ The cloud operator also needs to allocate the resource $r_j(t)$ to port $j$ for forwarding packets. ❼ VM2 receives packets of the connection $m$ from vswitch port $v$ and handles the cognition service request. The cognition service response can be described by a similar process. Without loss of generality, we model each bidirectional service request and response as two connections (Table 1).
The packets in connection $m$ traverse port $u$ and are enqueued in Rx-Queue $Q_u(t)$. The classifier dequeues packets from Rx-Queue $Q_u(t)$ and forwards them to port $i$. The packets are forwarded through the path $(i, j)$ in the data center network (DCN). After being received by port $j$, the packets are enqueued in Rx-Queue $Q_j(t)$ and forwarded to port $v$. We define the vswitch virtual port backlog update Eq. (1) and the vswitch physical port backlog update Eq. (2), which describe the backlog of vswitch queues in each slot.
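A standard discrete-time form consistent with this description, given here as an assumption rather than as the exact Eqs. (1) and (2), serves each queue at rate $f(\cdot)$ and then adds the arrivals of the slot. The index sets $M_u$ (connections originating at virtual port $u$) and $M$ (all connections) are notation assumed for illustration.

```latex
\begin{align*}
Q_u(t+1) &= \max\bigl[Q_u(t) - f(r_u(t)),\, 0\bigr] + \sum_{m \in M_u} A_m(t), \\
Q_j(t+1) &= \max\bigl[Q_j(t) - f(r_j(t)),\, 0\bigr] + \sum_{m \in M} \sum_{i} y^{ij}_m(t)\, A_m(t).
\end{align*}
```

In this sketch the physical port $j$ only receives the traffic of connections whose selected path ends at $j$, which is how the packet scheduling decision couples to the backlog of each NIC.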
To optimize the dynamic vswitch resource allocation problem and the packet scheduling problem on time average, we define the optimization problem (3-12). The objective (3) is to minimize the time-average expectation of resource allocation on vswitches, subject to the resource capacity constraint (4), the network path capacity constraint (5), the flow conservation constraint (6), the single path constraints (7)(8), and the queue stability constraint (9). The resource capacity constraint (4) requires that the resource allocated to all ports of a vswitch is not greater than the capacity of the server. The forwarding path capacity constraint (5) guarantees that the traffic on each forwarding path does not exceed the bandwidth of the path. The flow conservation constraint (6) ensures that the incoming traffic is equal to the outgoing network traffic for each forwarding path. The single path constraints (7)(8) indicate that the network does not support multi-path forwarding. The queue stability constraint (9) states that the queue backlog does not grow to infinity. A queue $Q(t)$ is mean rate stable if $\lim_{t \to \infty} \mathbb{E}[|Q(t)|]/t = 0$, and the time-average expectation of a process $x(t)$ is $\bar{x} = \lim_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \mathbb{E}[x(\tau)]$.

Table 1 (symbol: description)
$\beta_{ij}$: The bandwidth of network path $(i, j)$, $i, j \in P_{NIC}$
$r_u(t)$, $r_j(t)$: The vswitch resource allocation decision variables, which denote the resource allocated to vswitch port $u \in P^s_{NIC}$, $j \in P^s_{VM}$ on time slot $t$, $s \in S$
$y^{ij}_m(t)$: The packet scheduling decision variable, which denotes that the connection $m$ traverses the network path $(i, j)$ on time slot $t$
$f(\cdot)$: A nondecreasing, concave function that describes the relationship between the resource allocation on a vswitch port and the forwarding rate of the port
$V$: The tradeoff factor, which indicates the relation between the system stability and the objective of the problem, $V > 0$

Transformed problem
The constraints of the above optimization can also be transformed as queue backlog update equations, which have similar forms as the queue update equation of vswitch ports. We transform the constraints (4-7) to virtual queues (13)(14)(15)(16). And if a virtual queue is mean rate stable, it denotes that the corresponding constraint holds in the optimization problem.
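To illustrate how a constraint becomes a virtual queue, the following minimal sketch updates a resource capacity virtual queue for one server. The update rule is the standard one from Lyapunov optimization and is assumed here rather than copied from (13); the names `update_capacity_queue`, `allocated`, and `capacity` are hypothetical.

```python
def update_capacity_queue(h, allocated, capacity):
    """One-slot update of a virtual queue for sum(allocated) <= capacity.

    h         -- current virtual queue backlog H_s(t), nonnegative
    allocated -- per-port resource allocations r_p(t) on the server
    capacity  -- resource capacity of the server (hypothetical symbol)
    """
    # The backlog grows when the slot over-allocates and drains otherwise.
    return max(h + sum(allocated) - capacity, 0.0)


if __name__ == "__main__":
    h = 0.0
    # Three slots on a 12-core server: over-allocate once, then under-allocate.
    for slot_allocation in ([6, 8], [4, 5], [3, 3]):
        h = update_capacity_queue(h, slot_allocation, capacity=12)
        print(h)
```

If this backlog grows more slowly than $t$ (mean rate stability), the time-average allocation cannot exceed the capacity, which is why stabilizing the virtual queue enforces the constraint.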
We define a vector (17) which includes all vswitch queues and virtual queues. We can define the Lyapunov function (18) and the Lyapunov drift (19) from the vector. The Lyapunov function is the sum of the squares of all vswitch queue backlogs and virtual queue backlogs, and it is always nonnegative. If the Lyapunov function is large, at least one queue in the vector is large. This indicates that too many resources are allocated to the virtual switches or that some constraints are not satisfied in the optimization problem. We define the Lyapunov drift as the difference between the Lyapunov function on the current slot and on the previous slot.
To reduce the expenditure of resource allocation and to stabilize the system, we apply the Lyapunov Optimization Framework [16][17][18][19], which minimizes the drift-plus-penalty (20) on each slot. The drift-plus-penalty is defined as the weighted sum of the Lyapunov drift and the objective of the optimization problem. The factor V is a positive penalty factor, which indicates the relation between the system stability and the objective of the problem. Applying the fact that if $a = \max(b, 0)$, then $a^2 \le b^2$ to the definition of queues (1-2, 13-16), we obtain the inequality (21), which is the bound of the drift-plus-penalty. Because these stochastic variables are bounded (10-11), B is a constant independent of the factor V, which contains the remaining square terms of the inequality (21).
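The step from the queue definitions to the bound can be made concrete for a single queue. The sketch below assumes a queue of the form $Q(t+1) = \max[Q(t) - b(t), 0] + a(t)$ with service $b(t) = f(r(t)) \ge 0$ and arrivals $a(t) \ge 0$, and takes the Lyapunov function as $\frac{1}{2} Q(t)^2$. The factor $\frac{1}{2}$ and the symbols $a(t)$, $b(t)$ are conventions assumed here for illustration, not the exact notation of (21).

```latex
\begin{align*}
Q(t+1)^2 &= \max[Q(t) - b(t), 0]^2 + a(t)^2 + 2a(t)\max[Q(t) - b(t), 0] \\
&\le (Q(t) - b(t))^2 + a(t)^2 + 2a(t)Q(t) \\
&= Q(t)^2 + a(t)^2 + b(t)^2 + 2Q(t)\,(a(t) - b(t)), \\
\tfrac{1}{2}\bigl[Q(t+1)^2 - Q(t)^2\bigr] &\le \tfrac{1}{2}\bigl[a(t)^2 + b(t)^2\bigr] + Q(t)\,(a(t) - b(t)).
\end{align*}
```

Taking the conditional expectation given $Q(t)$ and adding $V\,\mathbb{E}[r(t) \mid Q(t)]$ to both sides gives a one-queue version of the drift-plus-penalty bound, in which the bounded term $\frac{1}{2}\mathbb{E}[a(t)^2 + b(t)^2 \mid Q(t)]$ plays the role of the constant $B$.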
To minimize the drift-plus-penalty, we minimize the right side of the inequality (21) on each slot, which yields the vswitch resource allocation problem (22)(23) and the packet scheduling problem (24). According to the above analysis, we can design distributed resource allocation policies for the dynamic vswitch resource allocation problem and the packet scheduling problem. On each slot, the vswitch on each server can solve the optimization problems (22)(23) independently to allocate resources for each vswitch virtual port and vswitch physical port. The cloud operator solves the optimization problem (24) to select the physical path for each connection. In addition, the resource allocation and packet scheduling decisions only depend on the vswitch queue backlogs and virtual queue backlogs on the current slot, so they do not require prior information about the distribution of the traffic.

Performance analysis
Based on the above transformation and optimization policies, we provide a theorem that reveals the [O(1/V), O(V)] tradeoff between the time-average expectation of resource allocation and queue backlogs. By tuning the factor V, we can balance the system's stability against the optimization objective, that is, the vswitch resource allocation against the satisfaction of the optimization constraints.

Theorem 1 Assume the optimization problem (3-8) is feasible, the boundedness constraints (10)(11) hold, the initial queue backlogs are finite ($\mathbb{E}[L(0)] < \infty$), and Slater's condition holds. If a C-additive algorithm is applied on every slot to minimize the right side of inequality (21), then the time-average expected resource allocation satisfies the bound (25) and the time-average expected queue backlogs satisfy the bound (26).
Proof 1 We insert Slater's condition into the right side of (21) and apply the C-additive algorithm on each slot to get the optimal solution, which yields (27). We get (28) by taking the expectation of (27) with the law of total expectation and summing the telescoping series from slot 0 to t − 1. Then, we rearrange inequality (28) into (29), divide (29) by Vt, and take the limit t → ∞ to prove (25). Similarly, we rearrange inequality (28) into (30), divide (30) by ηt, and take the limit t → ∞ to prove (26).
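For orientation, drift-plus-penalty analyses of this kind typically end in bounds of the following shape. This is a generic sketch, not a restatement of (25) and (26): $r^{opt}$ (the optimal time-average allocation), $r^{\eta}$ (the allocation of a policy that satisfies Slater's condition with slack $\eta > 0$), and the index set $K$ of all actual and virtual queues $\Theta_k$ are symbols assumed here.

```latex
\begin{align*}
\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{p \in P} \mathbb{E}[r_p(\tau)] &\le r^{opt} + \frac{B + C}{V}, \\
\limsup_{t \to \infty} \frac{1}{t} \sum_{\tau=0}^{t-1} \sum_{k \in K} \mathbb{E}[\Theta_k(\tau)] &\le \frac{B + C + V\,(r^{\eta} - r^{opt})}{\eta}.
\end{align*}
```

The gap to the optimum in the first bound shrinks as O(1/V), while the backlog bound grows as O(V), which is the tradeoff controlled by the factor V.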

Algorithm design
According to the performance analysis in Sect. 3.3, we can derive algorithms to solve the problems (22)(23)(24). We first design Algorithm 1 to solve the dynamic vswitch resource allocation problem and the packet scheduling problem. The cloud operator initializes the vswitch queue backlogs (1-2), the virtual queue backlogs (13)(14)(15)(16), the resource allocation decision variables $r_p(t)$, and the packet scheduling decision variables $y^{ij}_m(t)$. On each slot, vswitches execute Algorithm 2 to allocate the resource individually, and the cloud operator executes Algorithm 3 to schedule packets for each connection. Finally, the cloud operator updates the vswitch queue backlogs (1-2) and the virtual queue backlogs (13)(14)(15)(16). Algorithm 2 solves the vswitch resource allocation problem (22)(23). On slot $t$, each vswitch $s \in S$ calculates $f'(r_p(t))$ for each vswitch physical port and vswitch virtual port independently. Because $f(\cdot)$ is a nondecreasing concave function, we can set $r_p(t)$ from this derivative to minimize the problem (22)(23). The time complexity of Algorithm 2 is O(n) for each vswitch, where n is the number of vswitch physical ports and vswitch virtual ports on a vswitch.
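The per-port step of Algorithm 2 can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the per-port objective is taken to be $(V + H_s(t))\,r - Q_p(t) f(r)$, the performance function is a hypothetical $f(r) = a \log(1 + r)$ (nondecreasing and concave), and the names `allocate_port`, `allocate_vswitch`, `a`, and `r_max` are introduced here for illustration.

```python
import math


def f(r, a=1.0):
    """Hypothetical performance function: nondecreasing and concave in r."""
    return a * math.log1p(r)


def allocate_port(q, h, v, a=1.0, r_max=12.0):
    """Minimise (v + h) * r - q * f(r) over 0 <= r <= r_max for one port.

    q -- backlog of the port's Rx-Queue
    h -- resource capacity virtual queue backlog of the server (assumed form)
    v -- tradeoff factor V > 0
    """
    price = v + h
    # Derivative at r = 0 is price - q * a. If it is nonnegative, the convex
    # objective is nondecreasing on [0, r_max], so allocating nothing is optimal.
    if price - q * a >= 0:
        return 0.0
    # Otherwise solve q * a / (1 + r) = price and clip to the port limit.
    return min(q * a / price - 1.0, r_max)


def allocate_vswitch(backlogs, h, v, a=1.0, r_max=12.0):
    """One pass over all ports of a vswitch, O(n) in the number of ports."""
    return {port: allocate_port(q, h, v, a, r_max) for port, q in backlogs.items()}


if __name__ == "__main__":
    allocation = allocate_vswitch({"u": 10.0, "i": 1.0, "k": 1000.0}, h=0.0, v=2.0)
    rates = {port: f(r) for port, r in allocation.items()}
    print(allocation)
    print(rates)
```

A port with a short queue receives no resource in this sketch, because the marginal forwarding gain does not cover the price $V + H_s(t)$; a longer queue or a smaller V raises the allocation.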
Finally, we design Algorithm 3 to schedule packets (24) for each connection. For each connection, the cloud operator updates the factor (31) on each forwarding path. If the factor is positive, the cloud operator does not select the corresponding forwarding path for the connection; otherwise, the path can be selected. The algorithm's time complexity for each connection is O(l), where l is the number of possible forwarding paths for a connection. Besides, our dynamic vswitch resource allocation and packet scheduling algorithm is compatible with the existing control plane protocols of data center networks. For example, OpenFlow provides a standard network control plane abstraction. On the one hand, it supports per-flow meters to measure each flow size at a fine granularity on each vswitch. Algorithm 2 can update the vswitch queue backlogs by utilizing the measured flow size. On the other hand, Algorithm 3 can configure the forwarding path of each flow by setting flow forwarding rules in the OpenFlow protocol. Thus, our algorithm can be deployed in data center networks with existing control plane protocols.
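The path selection rule can be sketched as follows. The factor (31) is treated as an already computed number per candidate path; choosing the most negative factor when several are nonpositive is an assumption made here to respect the single path constraint, and `select_path` and `factors` are hypothetical names.

```python
def select_path(factors):
    """Pick one forwarding path for a connection in O(l) time.

    factors -- dict mapping a candidate path (i, j) to its factor value.
    Returns the chosen path, or None when every factor is positive.
    """
    # Paths with a positive factor are never selected.
    candidates = {path: w for path, w in factors.items() if w <= 0}
    if not candidates:
        return None
    # Assumption: among the remaining paths, take the smallest factor.
    return min(candidates, key=candidates.get)


if __name__ == "__main__":
    factors = {("i1", "j1"): 3.0, ("i2", "j1"): -2.0, ("i1", "j2"): -5.0}
    print(select_path(factors))
```

In an OpenFlow deployment, the returned path would be installed as a flow forwarding rule for the connection; that step is outside this sketch.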

Extended algorithm
In a real-world CIoV network, if the derivative of the optimization problem (22)(23) is greater than 0 (32) for a specific vswitch port $u$, the vswitch will not allocate any CPU resource to port $u$ according to Algorithm 2. The packets in $Q_u(t)$ cannot be dequeued until the vswitch allocates resource to port $u$, which delays packet forwarding and increases the queue backlog of port $u$.
To decrease the queue backlog of port $u$, we insert the queue backlog place-holder [33] (33) into the derivative of the optimization problem (22)(23). The queue backlog place-holder $P_u(t)$ is a nonpositive parameter that decreases the derivative of (22-23) until the derivative is not greater than 0.
The queue backlog place-holder of port $u$ does not break the performance tradeoff (25)(26). A queue backlog place-holder can be considered as an invisible queue backlog (34), which decreases the resource capacity virtual queue (13) for port $u$. As a result, the cloud operator allocates more resources to port $u$. To stabilize the resource capacity virtual queue, Algorithm 2 decreases the resource allocation for other ports on the vswitch. The dynamic vswitch resource allocation and packet scheduling algorithm can therefore use the extended resource allocation algorithm to decrease queue backlogs without violating the performance tradeoff.
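A minimal sketch of the place-holder idea, assuming the derivative of the per-port objective at zero allocation has the form $V + H_s(t) - Q_u(t) f'(0)$. This form, the function name `placeholder`, and the parameter `f_prime_0` are assumptions for illustration. The sketch returns the smallest-magnitude nonpositive $P_u(t)$ that brings the derivative down to zero when it is added to $H_s(t)$.

```python
def placeholder(v, h, q, f_prime_0):
    """Return a nonpositive place-holder P_u(t) for one port.

    v         -- tradeoff factor V > 0
    h         -- resource capacity virtual queue backlog H_s(t)
    q         -- Rx-Queue backlog Q_u(t) of the port
    f_prime_0 -- slope of the performance function at zero allocation
    """
    # Assumed derivative of the per-port objective at r = 0.
    derivative = v + h - q * f_prime_0
    # Only a positive derivative needs correcting; otherwise P_u(t) is zero.
    return -max(derivative, 0.0)


if __name__ == "__main__":
    p = placeholder(v=2.0, h=0.0, q=1.0, f_prime_0=1.0)
    # Derivative after adding the place-holder to the virtual queue backlog.
    print(p, 2.0 + (0.0 + p) - 1.0 * 1.0)
```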

Experiments and evaluation
In this section, we first measure the performance of vswitches on a virtualized server and evaluate the vswitch resource allocation algorithm on the testbed. Second, we evaluate the dynamic vswitch resource allocation and the packet scheduling algorithm in the network-wide simulation on a real-world data center network topology. Finally, we verify the algorithm performance under various experiment settings and validate our performance bounds.

Single-node evaluation
We first evaluate the performance of the virtual switch on a hardware testbed. The hardware testbed consists of a packet generation server and an under-test server. Each server is equipped with two CPUs (E5-2620v2, 12 cores), and the servers are connected to each other with 10 Gbps NICs (Intel 82599). We implement the vswitch resource allocation and packet scheduling algorithms on Open vSwitch (OVS) [ Fig. 4a. First, we find that the vswitch can process more network packets as the number of allocated CPU cores increases. The throughput of the vswitch is a nondecreasing concave function of the allocated CPU resources, which is consistent with our assumptions in Sect. 3.1. Then, we find that the virtual switch requires at least five CPU cores to process line-rate network traffic with the current hardware settings. In that case, the vswitch occupies 41.7% (5/12) of the total CPU cores. Without dynamic resource allocation algorithms, data center providers would have to statically allocate a significant portion of CPU resources to virtual switches to provide line-rate forwarding.
We also evaluate the vswitch resource allocation algorithm on the testbed for 30 time slots, and the interval of each time slot is 60 s. In each time slot, we generate a random packet flow with uniformly distributed packet sizes and collect the averaged queue backlogs and the number of allocated CPU cores on the vswitch in Fig. 4b and c. The current implementation of OVS-DPDK only supports static resource allocation, and we statically allocate three cores and five cores in this experiment. On our testbed, the vswitch achieves line-rate forwarding with five CPU cores. When only three CPU cores are allocated to the vswitch, it cannot achieve peak performance and delays packet forwarding. Our dynamic allocation algorithm adapts the CPU allocation to the incoming packets, allowing the vswitch to handle peak and lower network traffic dynamically. Experiments show that our dynamic virtual switch resource allocation algorithm can reduce the CPU resource allocation compared to static allocation while providing sufficient forwarding throughput.
(34) $H_s(t) = H_s(t) + P_u(t)$

Network-wide evaluation
To evaluate our dynamic vswitch resource allocation and packet scheduling algorithms, we build an event-driven simulation environment. The network-wide simulation environment includes an edge data center architecture [36,37] and a cloud data center architecture [38,39]. We equip each server in the edge data center with 20 CPU cores and a 25 Gbps NIC, and each server in the cloud data center with 40 CPU cores and a 40 Gbps NIC. We generate cognitive service workloads from a real-world edge data center network distribution [40] and cloud data center network traces [38]. The tradeoff factor V is set to 1.5 × 10^7. We implement three resource allocation algorithms as baselines for the experiment. (1) Static resource allocation. It allocates resources to virtual switches according to the mean value of the input data. (2) Shenango-like [23] resource allocation. It allocates resources directly according to the vswitch queue backlogs: when the queue backlogs increase, it increases the CPU allocation, and vice versa. (3) Snap-like [25] resource allocation. It also allocates resources according to queue backlogs, but employs a more aggressive strategy that always over-allocates CPUs to reduce tail latency.
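The three baselines can be sketched as simple per-slot policies. This is a minimal illustration under our own assumptions, not the implementation used in the experiments: the function names, the one-core step size, the `step_up` parameter, and the rule that the aggressive policy releases a core only when the queue is empty are all hypothetical choices made to mirror the descriptions above.

```python
import math


def static_allocation(mean_arrival_rate, per_core_rate):
    """Static baseline: size the allocation once from the mean input rate."""
    return max(1, math.ceil(mean_arrival_rate / per_core_rate))


def shenango_like(cores, backlog, prev_backlog, max_cores):
    """Backlog-driven baseline: add a core when the backlog grows,
    release one when it shrinks (assumed step size of one core)."""
    if backlog > prev_backlog:
        return min(max_cores, cores + 1)
    if backlog < prev_backlog:
        return max(1, cores - 1)
    return cores


def snap_like(cores, backlog, prev_backlog, max_cores, step_up=2):
    """Aggressive baseline: scale up faster than needed and release a core
    only when the queue is empty (assumed reading of 'over-allocates')."""
    if backlog > prev_backlog:
        return min(max_cores, cores + step_up)
    if backlog == 0:
        return max(1, cores - 1)
    return cores
```

Under these assumptions, a growing backlog moves the Shenango-like policy from three to four cores and the Snap-like policy from three to five, which illustrates why the latter trades CPU consumption for shorter queues.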
We show the performance of these four algorithms in Fig. 5 for the edge and cloud data centers, respectively. We collect the number of allocated CPUs, the average queue length, and the throughput of the virtual switches. Since the static resource allocation is based on prior knowledge of the input data, it can achieve higher throughput and lower queue backlogs with fewer CPU resources. However, it is difficult to obtain the precise data distribution in real-world CIoV scenarios, which makes this algorithm hard to apply in practice. Both the Shenango-like and Snap-like algorithms allocate resources according to the queue backlogs of the virtual switches. The Shenango-like algorithm occupies fewer resources and achieves lower throughput than the Snap-like algorithm. The Snap-like algorithm achieves the shortest queue backlogs and the highest throughput through aggressive resource allocation, but it consumes a large number of CPU resources. In contrast to these algorithms, our resource allocation algorithm uses not only the vswitch queue backlogs but also the virtual queue backlogs, which are transformed from the optimization constraints by the Lyapunov Optimization Framework. It balances resource allocation and queue backlogs in the network-wide scenario and achieves better performance.
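To illustrate how actual and virtual queue backlogs can jointly drive the allocation, the following is a minimal sketch of a textbook drift-plus-penalty step. It is not Algorithm 1 itself: the square-root service curve, the way the two backlogs are summed into a single weight, and all function names are assumptions made for illustration.

```python
import math


def service_rate(cores, per_core_rate=4.0):
    """Hypothetical non-decreasing concave throughput curve (packets per slot)."""
    return per_core_rate * math.sqrt(cores)


def drift_plus_penalty_cores(queue_backlog, virtual_backlog, V, max_cores):
    """Pick the core count minimizing V * cores - (Q + Z) * service_rate(cores).

    V weights the CPU cost (penalty); the backlogs weight the service (drift).
    """
    weight = queue_backlog + virtual_backlog
    return min(range(max_cores + 1),
               key=lambda c: V * c - weight * service_rate(c))


def update_queue(backlog, arrivals, served):
    """Standard queue update: Q(t+1) = max(Q(t) - served, 0) + arrivals."""
    return max(backlog - served, 0.0) + arrivals
```

In this sketch, empty queues yield zero cores, larger backlogs pull the allocation up, and a larger V pushes it down, which is the qualitative behavior the tradeoff factor is meant to control.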

Theoretical performance tradeoff evaluation
To validate the [O(1/V), O(V)] performance tradeoff, we execute Algorithm 1 for 10^3 time slots and tune the tradeoff factor V from 1.2 to 1.3 × 10^7 under the same experiment configurations as in Sect. 5.2. We collect the time-average expectation of queue backlogs and resource allocation for each V in Fig. 6a and b. In Fig. 6a, the time-average expectation of resource allocation decreases and approaches the optimal solution as V increases. We also evaluate the extended Algorithm 1, which inserts queue backlog place-holders into the virtual queues. We execute the extended algorithm under the same configuration and collect the time-average expectation of resource allocation and the time-average expectation of queue backlogs in Fig. 6a and b. Figure 6a and b reveal that the extended algorithm reduces queue backlogs and slightly increases the resource allocation on vswitches due to the additional backlog place-holders. The results validate that the extended algorithm also satisfies the [O(1/V), O(V)] tradeoff.
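For reference, an [O(1/V), O(V)] tradeoff of this kind usually takes the following generic form in Lyapunov optimization. The symbols are assumptions introduced here for illustration and may differ from the notation and constants of the theorem in the analysis section: $\bar{c}$ is the time-average expected resource allocation, $c^{*}$ its optimal value, $\bar{Q}$ the time-average expected queue backlog, $B$ a finite constant bounding the second moments of arrivals and service, $\Delta$ a bound on the range of the per-slot allocation cost, and $\epsilon > 0$ a slackness constant.

```latex
\begin{align}
  \bar{c} &\le c^{*} + \frac{B}{V}, \\
  \bar{Q} &\le \frac{B + V\,\Delta}{\epsilon}.
\end{align}
```

Read this way, increasing V shrinks the gap between the achieved and optimal resource allocation at rate 1/V, while the bound on the queue backlog grows linearly in V.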

Conclusions
In this paper, we study the dynamic vswitch resource allocation problem and the packet scheduling problem in CIoV. We first formulate a joint optimization problem to optimize resource allocation on vswitches and analyze it with the Lyapunov Optimization Framework. Then, we transform the optimization problem into a discrete-time queueing system and reduce it to a queue stability problem. Next, we prove the [O(1/V), O(V)] performance tradeoff of the policies between the time-average expectation of resource allocation and queue backlog. Based on this analysis, we design low-complexity vswitch resource allocation and packet scheduling algorithms. Finally, we evaluate the performance of our algorithms on a real-world testbed and in a network-wide simulation. The results show that our algorithms satisfy the performance tradeoff and outperform other allocation algorithms.