3.1 A heuristic method for service selection of the workflow
As mentioned above, the first sub-problem of the considered workflow scheduling problem is mode assignment. Each task can choose among several services with different execution times and costs. For a workflow with \(n + 2\) tasks, there may be \(\prod_{i = 0}^{n} \left| {{\rm M}_{i} } \right|\) service mode selection schemas. For a given deadline, some of them are feasible and the others are infeasible. We propose a heuristic method for obtaining an optimized solution \({\text{Mode}} = (M_{1}^{*} ,M_{2}^{*} , \cdots ,M_{n}^{*} )\) that satisfies the deadline at lower cost. The main idea of the heuristic is as follows. First, the service with the shortest execution time is selected for each task. Provided the deadline is feasible at all, this schema meets it, and its cost is the highest. Then, the schema is improved iteratively by switching some tasks to cheaper service modes while the deadline is still satisfied.
In the service pool of each task, the service modes are ranked in ascending order of execution time and descending order of cost. That is, \(d_{i}^{k + 1} > d_{i}^{k}\) and \(c_{i}^{k + 1} < c_{i}^{k}\), \(0 \le k \le \left| {{\rm M}_{i} } \right| - 1\). For a given mode selection schema in which the \(k{\text{th}}\) mode is selected for task \(i\), the earliest start time and finish time are obtained by \(est_{i} = \mathop {\max }\limits_{{j \in Dpre_{i} }} \{ eft_{j} \}\), \(eft_{i} = est_{i} + d_{i}^{k}\). With the given deadline \(\delta\), the latest finish time and start time are obtained by \(lft_{i} = \mathop {\min }\limits_{{j \in Dsucc_{i} }} \{ lst_{j} \}\), \(lst_{i} = lft_{i} - d_{i}^{k}\). If \(lft_{i} - est_{i} \ge d_{i}^{k + 1}\), the cheaper service mode \(M_{i}^{k + 1}\) can be selected for task \(i\). Several tasks may be eligible for such an adjustment; they form the set \(\Psi\). In each iteration, the most appropriate task is selected to switch to a cheaper service. Appropriateness is measured by the cost reduction per unit of added execution time, \(\tau_{i} = (c_{i}^{k} - c_{i}^{k + 1} )/(d_{i}^{k + 1} - d_{i}^{k} )\). Furthermore, the degree centrality of the node in the workflow is considered, since the smaller the degree of the selected task, the smaller its influence on other nodes. Therefore, \(\tau^{\prime}_{i} = \tau_{i} /(\left| {Dsucc_{i} } \right| + \left| {Dpre_{i} } \right|)\) is computed for each task \(i \in \Psi\). The algorithm is described below.
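The procedure can be illustrated with a minimal Python sketch. It is not the authors' implementation and rests on several assumptions that the text does not state: tasks are supplied in a topological order, source tasks start at time 0, sink tasks have \(lft_{i} = \delta\), "most appropriate" is read as the largest \(\tau^{\prime}_{i}\), and ties are broken by task order. The names `forward_pass`, `backward_pass` and `select_modes`, as well as the three-task workflow at the end, are hypothetical.

```python
from collections import defaultdict


def forward_pass(order, preds, dur):
    """est_i = max eft_j over direct predecessors; eft_i = est_i + d_i."""
    est, eft = {}, {}
    for i in order:
        est[i] = max((eft[j] for j in preds[i]), default=0)  # assumption: sources start at 0
        eft[i] = est[i] + dur[i]
    return est, eft


def backward_pass(order, succs, dur, deadline):
    """lft_i = min lst_j over direct successors; lst_i = lft_i - d_i."""
    lst, lft = {}, {}
    for i in reversed(order):
        lft[i] = min((lst[j] for j in succs[i]), default=deadline)  # assumption: sinks end at deadline
        lst[i] = lft[i] - dur[i]
    return lst, lft


def select_modes(order, preds, modes, deadline):
    """Return the selected mode index of every task.

    modes[i] is a list of (duration, cost) pairs with strictly increasing
    duration and strictly decreasing cost; order is a topological order.
    """
    succs = defaultdict(list)
    for i in order:
        for j in preds[i]:
            succs[j].append(i)

    k = {i: 0 for i in order}  # start from the fastest, most expensive mode
    while True:
        dur = {i: modes[i][k[i]][0] for i in order}
        est, eft = forward_pass(order, preds, dur)
        if max(eft.values()) > deadline:
            raise ValueError("deadline cannot be met even with the fastest modes")
        _, lft = backward_pass(order, succs, dur, deadline)

        best, best_score = None, -1.0
        for i in order:
            if k[i] + 1 >= len(modes[i]):
                continue  # no cheaper mode left for this task
            d_cur, c_cur = modes[i][k[i]]
            d_next, c_next = modes[i][k[i] + 1]
            if lft[i] - est[i] >= d_next:  # task i belongs to the candidate set
                tau = (c_cur - c_next) / (d_next - d_cur)
                degree = len(preds[i]) + len(succs[i])
                score = tau / max(degree, 1)  # guard for an isolated task (assumption)
                if score > best_score:
                    best, best_score = i, score
        if best is None:
            return k  # no task can be switched without violating the deadline
        k[best] += 1  # adjust one task per iteration, then recompute all times


# Hypothetical 3-task workflow: task 1 precedes tasks 2 and 3.
order = [1, 2, 3]
preds = {1: [], 2: [1], 3: [1]}
modes = {1: [(2, 10), (4, 6)], 2: [(3, 9), (5, 4)], 3: [(1, 5), (6, 2)]}
chosen = select_modes(order, preds, modes, deadline=8)
print(chosen, sum(modes[i][chosen[i]][1] for i in order))
```

Only one task is switched per iteration because changing one duration shifts the time windows of its neighbours; recomputing both passes before the next choice keeps every accepted switch within the deadline.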
3.2 An example of the scheduling strategies
Traditional service scheduling problems assume that services are rented in a non-shareable manner. However, cloud services can mostly be shared among tasks during the renting interval to decrease the total renting cost, since minimizing the service renting cost is more important to cloud users than the system performance of service providers.
As described in the second section, if tasks that choose the same service are scheduled on the same service instance and the service is rented by whole time intervals, the renting cost is discounted. However, it is time-consuming to test every time point to check whether two tasks can share a renting interval. Since cloud services are assumed to be unlimited and available at any time, the workflow can run as soon as it is submitted. The earliest start time of each task is obtained from the workflow structure through forward scheduling. We then give the renting strategy used to schedule the tasks efficiently.
A workflow with 12 tasks is shown in Fig. 2, and the deadline is set to 30. Based on forward scheduling, the schema of the workflow is obtained as in Fig. 5, where the service instances are not shared. To improve the scheduling efficiency, the starting and finishing points of each renting interval are determined according to whether the service instance can be shared among the tasks.
The considered workflow has four paths, as shown in Fig. 3, and the critical path is 1-4-5-8. The completion time of the workflow is 27, which is 3 time units before the deadline; that is, the workflow has 3 units of slack time. We can therefore delay the tasks on path-3 (3,9) to share the existing Service4 instance1. Task 5 can also be delayed to share the renting interval of Service2. Task 9 should be rescheduled to share the existing Service3 instance1. Note that once two tasks on the same path select the same service, they should rent the service by time interval, as tasks 1 and 4 do.
The final schedule of the workflow is shown in Fig. 4. As can be seen, tasks 3, 5, 6, 7, 8 and 9 are delayed, but the completion time of the workflow still satisfies the deadline. Six instances must be rented to execute the workflow. For each instance, it must be decided which option is more cost-efficient: renting the service instance by whole time intervals, or renting it on demand per unit of time. For example, task \(v_{8}\) needs 8 time units to finish on \(S^{{1}}\), so per-unit renting should be chosen for energy conservation. Task \(v_{5}\) rents \(S^{{2}}\) instance1 for one interval and shares the remaining 5-unit slot with \(v_{10}\). In this case, task \(v_{10}\) still needs 1 more time unit. The renting method mentioned above is then adopted to reduce the cost.
3.3 Heuristic scheduling method proposed for optimizing the cost under shareable instances
Given the deadline \(\delta\) and the selected modes \({\text{Mode}} = (M_{1}^{*} ,M_{2}^{*} , \cdots ,M_{n}^{*} )\) obtained by HSI, the proposed heuristic scheduling method schedules the tasks sequentially while satisfying the precedence constraints, and services can be shared within the same time interval to reduce the cost.
According to the mode assignment schema, each task has selected a service, so its execution time is fixed. The earliest start time \(est_{i}\) and finish time \(eft_{i}\) of task \(i\) are computed by forward scheduling, and the latest start time \(lst_{i}\) and finish time \(lft_{i}\) are obtained by backward scheduling. For each type of service \(k\) in the service set \(S = (S^{1} ,S^{2} , \cdots ,S^{w} )\), we construct the set of tasks that have selected service \(k\), ordered by non-descending earliest start time, \(\Re_{k} = \{ a_{1} ,a_{2} , \cdots ,a_{m} \}\). The proposed heuristic scheduling method tries to reschedule the tasks in \(\Re_{k}\) to start times at which they can share a renting interval of the service, thereby reducing the total cost.
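The construction of the ordered sets \(\Re_{k}\) can be sketched in a few lines of Python. The function name `group_by_service` and the input dictionaries are hypothetical; the earliest start times are assumed to come from a forward scheduling pass that has already been run.

```python
from collections import defaultdict


def group_by_service(service_of, est):
    """Build one task list per service, sorted by non-descending earliest start time.

    service_of maps task -> selected service; est maps task -> earliest start time.
    """
    groups = defaultdict(list)
    for task, service in service_of.items():
        groups[service].append(task)
    # sorted() is stable, so tasks with equal est keep their input order
    return {s: sorted(tasks, key=lambda t: est[t]) for s, tasks in groups.items()}


# Hypothetical data, not taken from the example workflow
service_of = {"a": "S1", "b": "S1", "c": "S2", "d": "S2", "e": "S4"}
est = {"a": 0, "b": 5, "c": 12, "d": 9, "e": 0}
print(group_by_service(service_of, est))
```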
To describe the heuristic scheduling method under shareable instances in detail, we first define some variables. The number of existing instances of service \(S^{k}\) is denoted by \(NIns_{k}\). We then define two sets, \(TInterval_{k}^{t}\) and \(TSlot_{k}^{t}\), which record the renting time intervals and the remaining time slots of the \(t{\text{th}}\) existing instance of service \(S^{k}\). The \(r{\text{th}}\) renting interval in set \(TInterval_{k}^{t}\) is written as \([st_{r}^{TI} ,ft_{r}^{TI} ]\), where \(st_{r}^{TI}\) and \(ft_{r}^{TI}\) are the starting time and finishing time of the renting interval, respectively. The \(s{\text{th}}\) slot in set \(TSlot_{k}^{t}\) is written as \([st_{s}^{TS} ,ft_{s}^{TS} ]\), where \(st_{s}^{TS}\) and \(ft_{s}^{TS}\) are the starting time and finishing time of the slot, respectively. Each task in \(\Re_{k}\) is scheduled, where possible, at the best start time that lets it share an existing service instance. A task rents a new service instance only when none of the remaining time slots of the existing instances can satisfy its time constraints.
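The bookkeeping described above can be sketched as follows. This is a simplified illustration, not the proposed method itself: it uses a first-fit rule instead of searching for the best start time, it does not let a task use a slot that covers only part of its execution, and it assumes that a new instance is rented from the task's earliest start time in whole intervals of a hypothetical length `interval_len`. The class `Instance` and the function `place_task` are names introduced here for illustration.

```python
import math
from dataclasses import dataclass, field


@dataclass
class Instance:
    intervals: list = field(default_factory=list)  # plays the role of TInterval: rented [st, ft]
    slots: list = field(default_factory=list)      # plays the role of TSlot: idle [st, ft]


def place_task(instances, est, lft, dur, interval_len):
    """Place a task with time window [est, lft] and duration dur.

    Returns (instance index, start time). The list `instances` holds the
    existing instances of one service and is updated in place.
    """
    for idx, inst in enumerate(instances):
        for s, (st, ft) in enumerate(inst.slots):
            start = max(st, est)
            if start + dur <= min(ft, lft):
                # The task fits: replace the slot by what is left around the task
                pieces = [(a, b) for a, b in ((st, start), (start + dur, ft)) if b > a]
                inst.slots[s:s + 1] = pieces
                return idx, start

    # No existing slot satisfies the time constraints: rent a new instance
    n_intervals = math.ceil(dur / interval_len)
    rent_end = est + n_intervals * interval_len
    inst = Instance(intervals=[(est, rent_end)])
    if rent_end > est + dur:
        inst.slots.append((est + dur, rent_end))  # unused tail stays shareable
    instances.append(inst)
    return len(instances) - 1, est


# Hypothetical usage: three tasks of the same service, interval length 5
pool = []
print(place_task(pool, est=0, lft=10, dur=3, interval_len=5))
print(place_task(pool, est=2, lft=6, dur=2, interval_len=5))
print(place_task(pool, est=4, lft=9, dur=2, interval_len=5))
```

In this sketch the length of `pool` corresponds to \(NIns_{k}\) for the service being processed.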
The final total cost of the workflow is computed from the renting time intervals of the service instances recorded in \(TInterval_{k}^{ins}\), using the hybrid renting manner.
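A possible form of this cost computation is sketched below. The pricing model is an assumption made only for illustration, since the actual prices are defined elsewhere in the paper: each service is taken to have a hypothetical interval length `interval_len`, a discounted price `interval_price` for one whole interval, and an on-demand price `unit_price` per time unit. For each recorded span, the cheaper of pure per-unit renting and whole intervals plus a per-unit or whole-interval tail is charged.

```python
def span_cost(length, interval_len, interval_price, unit_price):
    """Cheapest hybrid charge for holding an instance for `length` time units."""
    full, rest = divmod(length, interval_len)
    # The leftover part is billed per unit or as one more whole interval
    tail = min(rest * unit_price, interval_price) if rest else 0
    return min(length * unit_price, full * interval_price + tail)


def total_cost(rented, pricing):
    """Sum the cost over all services, instances and recorded spans.

    rented:  {service: [[(st, ft), ...] for each instance]}
    pricing: {service: (interval_len, interval_price, unit_price)}  (hypothetical)
    """
    total = 0
    for service, instances in rented.items():
        interval_len, interval_price, unit_price = pricing[service]
        for spans in instances:
            for st, ft in spans:
                total += span_cost(ft - st, interval_len, interval_price, unit_price)
    return total


# Hypothetical prices and spans, not taken from the example workflow
pricing = {"S1": (10, 9, 1), "S2": (10, 6, 1)}
rented = {"S1": [[(0, 8)]], "S2": [[(0, 11)]]}
print(total_cost(rented, pricing))
```

Because the charge is piecewise linear in the number of whole intervals bought, comparing zero intervals, the largest number of intervals that fits, and one more than that is enough to find the cheapest mix under this pricing assumption.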