DRL-based Task Offloading for Deadline-sensitive Applications in Multi-access Edge Computing
Abstract
The proliferation of computation-intensive applications, such as online gaming and autonomous driving, has pressed resource-constrained mobile devices to relieve their computation and energy burdens with the aid of external computing resources. Recently, a cutting-edge computing paradigm, Multi-access Edge Computing (MEC), has emerged as a promising solution that mitigates the resource shortage of mobile devices by selectively offloading a portion of computation-intensive tasks to physically close edge servers. Over the past years, a series of task offloading schemes, both non-learning-based and learning-based, have been extensively studied. However, many computation-intensive applications are also deadline-sensitive; that is, their computation tasks often have deadlines to satisfy. The existing task offloading schemes face several limitations that hinder their applicability in deadline-sensitive MEC systems. In this thesis, we employ Deep Reinforcement Learning (DRL) to address task offloading problems in multi-tier, deadline-sensitive MEC systems. First, we propose an innovative task offloading scheme for partially observable MEC systems, referred to as PDMO, which combines partially observable DRL with Dynamic Voltage and Frequency Scaling (DVFS) to minimize the energy consumption of mobile devices while guaranteeing deadline satisfaction. Second, we devise a novel Multi-access Edge-assisted Learning-based Offloading (MELO) scheme to effectively optimize task completion time in a highly dynamic MEC system. Lastly, we propose a unique offloading scheme for safety-critical tasks, Constrained Reinforcement Learning based Offloading (CRLO). In CRLO, a safety layer is integrated into the policy network of the learning-based policy generator, effectively eliminating risky offloading decisions that could lead to deadline misses. Additionally, to achieve more efficient offloading decisions, Informer, a computationally efficient long-sequence forecasting model, is utilized to forecast temporally dependent system states over the upcoming time window. The experimental results indicate that all of the proposed learning-based offloading schemes outperform the baseline methods in terms of energy consumption or task completion time.