Mine Internet of Things (MIoT) devices in intelligent mines often face substantial signal attenuation due to challenging operating conditions. The openness of wireless communication also makes it susceptible to smart attackers, such as active eavesdroppers. The attackers can disrupt equipment operations, compromise production safety, and exfiltrate sensitive environmental data. To address these challenges, we propose an intelligent reflecting surface (IRS)-assisted secure transmission system for an MIoT device that enhances the security and reliability of wireless communication in challenging mining environments. We formulate a joint optimization problem for the IRS phase shifts and transmit power, with the goal of enhancing legitimate transmission while suppressing eavesdropping. To accommodate time-varying channel conditions, we propose a reinforcement learning (RL)-based IRS-assisted secure transmission scheme that enables the MIoT device to jointly optimize the IRS reflecting coefficients and the transmit power and thereby learn the optimal transmission policy in dynamic environments. We adopt the deep deterministic policy gradient (DDPG) algorithm to explore the optimal transmission policy in a continuous action space, which reduces the discretization error introduced by traditional RL methods. The simulation results indicate that our proposed scheme achieves superior system utility compared with both the IRS-free (IF) scheme and the IRS randomly configured (IRC) scheme. These results demonstrate the effectiveness and practical relevance of our contributions, showing that implementing IRS in MIoT wireless communication can enhance safety, security, and efficiency in the mining industry.
Location-based services (LBS) in vehicular ad hoc networks (VANETs) must protect users' privacy and address the threat of exposing sensitive locations during LBS requests. Users release not only the geographical but also the semantic information of the places they visit (e.g., a hospital). This sensitive information enables an inference attacker to exploit the users' preferences and life patterns. In this paper, we propose a reinforcement learning (RL)-based sensitive semantic location privacy protection scheme. This scheme uses the idea of differential privacy to randomize the released vehicle locations and adaptively selects the perturbation policy based on the sensitivity of the semantic location and the attack history. It enables a vehicle to optimize the perturbation policy in terms of the privacy and the quality of service (QoS) loss without being aware of the current inference attack model in a dynamic privacy protection process. To solve the location protection problem with high-dimensional and continuous-valued perturbation policy variables, a deep deterministic policy gradient-based semantic location perturbation scheme (DSLP) is developed. The actor part is used to generate a continuous privacy budget and perturbation angle, and the critic part is used to estimate the performance of the policy. Simulations demonstrate that the DSLP-based scheme outperforms the benchmark schemes: it increases the privacy, reduces the QoS loss, and increases the utility of the vehicle.
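The perturbation step described in the abstract above can be illustrated with a minimal sketch. The abstract does not specify the noise distribution, so this example assumes the radius follows the planar Laplace mechanism (radius drawn from a Gamma distribution with shape 2 and scale 1/epsilon), while the angle is passed in as if it came from the actor network. The function name `perturb_location` and the use of planar coordinates in metres are hypothetical choices made for illustration only.

```python
import math
import random


def perturb_location(x, y, epsilon, angle, rng=random):
    """Shift (x, y) by a random radius along a given angle.

    Assumption: the radius follows the radial part of a planar Laplace
    distribution, i.e. Gamma(shape=2, scale=1/epsilon). A smaller privacy
    budget epsilon therefore gives a larger expected displacement (2/epsilon).
    The angle (radians) is supplied by the caller, standing in for the
    perturbation angle that the policy would output.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    radius = rng.gammavariate(2.0, 1.0 / epsilon)
    return x + radius * math.cos(angle), y + radius * math.sin(angle)


# Example: a stricter budget (smaller epsilon) tends to move the point further.
rng = random.Random(42)
print(perturb_location(100.0, 200.0, epsilon=0.05, angle=math.pi / 4, rng=rng))
print(perturb_location(100.0, 200.0, epsilon=1.0, angle=math.pi / 4, rng=rng))
```

Note that the standard planar Laplace mechanism draws the angle uniformly at random; a policy-chosen angle, as in this sketch, would need its own privacy analysis.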
Mobile edge computing (MEC) integration with 5G/6G technologies is an essential direction in mobile communications and computing. However, task offloading in MEC scenarios has potential privacy implications, specifically the leakage of user location information. To address this issue, this paper proposes a location-privacy-preserved task offloading (LPTO) scheme based on safe reinforcement learning to balance computational cost and privacy protection. This scheme uses the differential privacy technique to perturb the user’s actual location to achieve location privacy protection. We model the privacy-preserving location perturbation problem as a Markov decision process (MDP), and we develop a safe deep Q-network (DQN)-based LPTO (SDLPTO) scheme to select the offloading policy and location perturbation policy dynamically. This approach conducts a risk assessment for each state–action pair and thereby reduces the selection of high-risk state–action pairs. Simulation results show that the proposed SDLPTO scheme has a lower computational cost and location privacy leakage than the benchmarks. These results highlight the significance of our approach in protecting user location privacy while achieving improved performance in MEC environments.
Orthogonal frequency division multiplexing (OFDM) in 5G has many advantages; however, one of the disadvantages is that the superposition of a large number of subcarriers leads to a high peak-to-average power ratio (PAPR) of the transmit signal. A high PAPR results in distortion in the high-power amplifier and performance degradation. The partial transmit sequence (PTS) algorithm is commonly used for PAPR reduction. It enumerates all combinations of phase factors, weights the signal using each phase factor combination, and finds the set of phase factors that minimizes the PAPR value of the OFDM signal. The advantage of the PTS is that it determines the optimal solution through enumeration; however, its major drawback is the higher complexity caused by the use of enumeration. Some studies have introduced the discrete particle swarm optimization (DPSO) algorithm instead of enumeration to determine the optimal solution of the PTS algorithm. The DPSO algorithm treats each individual in the population as a candidate solution. Through iterative updates of the initial population, individuals in the population continuously move closer to the optimal solution. This approach significantly reduces complexity compared with the exhaustive enumeration used in the traditional PTS algorithm. However, the disadvantage of the general DPSO algorithm is that it can converge prematurely, which degrades the PAPR reduction performance. In this study, we propose an improved method based on the general DPSO-based PTS algorithm. The improved algorithm, MDPSO-PTS, adopts dynamic time-varying learning factors, which can find the optimal combination of phase factors more efficiently. The MDPSO-PTS algorithm expands the search space when seeking the optimal combination of phase factors. This avoids the premature convergence commonly observed in general DPSO-PTS algorithms, in which a local optimum is mistaken for the global optimum too early. A comparative simulation of the improved MDPSO-PTS algorithm with the general DPSO-PTS algorithm shows that the improved algorithm has stronger PAPR reduction, whereas the complexity remains basically unchanged. A comparative simulation with the traditional PTS algorithm shows a significant reduction in complexity, with only a slight, acceptable loss of reduction performance.
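The exhaustive PTS search described in the abstract above can be sketched in a few lines. This is an illustrative baseline only, not the MDPSO-PTS algorithm itself. It assumes an adjacent sub-block partition, a phase factor set of {1, -1}, a subcarrier count divisible by the number of sub-blocks, and a naive inverse DFT that is only practical for small sizes. The function names `idft`, `papr`, and `pts_exhaustive` are hypothetical.

```python
import cmath
import itertools
import random


def idft(freq):
    """Naive inverse DFT, O(N^2); adequate for a small illustrative N."""
    n = len(freq)
    return [
        sum(freq[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def papr(signal):
    """Peak-to-average power ratio (linear scale) of a time-domain signal."""
    powers = [abs(s) ** 2 for s in signal]
    return max(powers) / (sum(powers) / len(powers))


def pts_exhaustive(symbols, num_blocks=4, phases=(1, -1)):
    """Try every phase factor combination and return (best_papr, best_phases).

    Assumptions: len(symbols) is divisible by num_blocks, and sub-blocks
    are formed from adjacent subcarriers.
    """
    n = len(symbols)
    block_len = n // num_blocks
    # Each partial sequence is the IDFT of one sub-block, zeros elsewhere.
    partial = []
    for v in range(num_blocks):
        lo, hi = v * block_len, (v + 1) * block_len
        masked = [symbols[k] if lo <= k < hi else 0 for k in range(n)]
        partial.append(idft(masked))
    best = None
    # Enumeration costs len(phases) ** num_blocks PAPR evaluations.
    for combo in itertools.product(phases, repeat=num_blocks):
        candidate = [
            sum(b * partial[v][t] for v, b in enumerate(combo)) for t in range(n)
        ]
        value = papr(candidate)
        if best is None or value < best[0]:
            best = (value, combo)
    return best


# Example with 16 random QPSK symbols.
rng = random.Random(0)
qpsk = [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(16)]
print("PAPR without PTS:", papr(idft(qpsk)))
print("PAPR with PTS and chosen phases:", pts_exhaustive(qpsk))
```

The number of candidate combinations grows exponentially with the number of sub-blocks, which is the complexity that swarm-based searches such as DPSO aim to avoid.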
Internet of Things (IoT) devices can apply mobile edge computing (MEC) and energy harvesting (EH) to provide high-level experiences for computation-intensive applications and, at the same time, to prolong the lifetime of the battery. In this paper, we propose a reinforcement learning (RL)-based offloading scheme for an IoT device with EH to select the edge device and the offloading rate according to the current battery level, the previous radio transmission rate to each edge device, and the predicted amount of the harvested energy. This scheme enables the IoT device to optimize the offloading policy without knowledge of the MEC model, the energy consumption model, and the computation latency model. We also present a deep RL-based offloading scheme to further accelerate learning. The performance bounds of both schemes in terms of the energy consumption, computation latency, and utility are provided for three typical offloading scenarios and verified via simulations for an IoT device that uses wireless power transfer for energy harvesting. Simulation results show that the proposed RL-based offloading scheme reduces the energy consumption, computation latency, and task drop rate, and thus increases the utility of the IoT device in the dynamic MEC in comparison with the benchmark offloading schemes.
An Advanced Persistent Threat (APT) attacker applies multiple sophisticated methods to continuously and stealthily attack targeted cyber systems. In this paper, the interactions between an APT attacker and a cloud system defender in their allocation of the Central Processing Units (CPUs) over multiple devices are formulated as a Colonel Blotto game (CBG), which models the competition of two players under given resource constraints over multiple battlefields. The Nash equilibria (NEs) of the CBG-based APT defense game are derived for the case with symmetric players and the case with asymmetric players that have different total numbers of CPUs. The expected data protection level and the utility of the defender are provided for each game at the NE. An APT defense strategy based on the policy hill-climbing (PHC) algorithm is proposed for the defender to achieve the optimal CPU allocation distribution over the devices in the dynamic defense game without being aware of the APT attack model. Simulation results verify the efficacy of our proposed algorithm, showing that both the data protection level and the utility of the defender are improved compared with the benchmark greedy allocation algorithm.
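To make the Colonel Blotto setting above concrete, the following sketch scores one round of CPU allocation and brute-forces the defender's best response to a known attacker allocation. It is a toy illustration under stated assumptions, not the game model or the PHC-based strategy of the paper: a device is assumed to be protected when the defender assigns strictly more CPUs than the attacker, and a tie is assumed to count as half protected. The function names `protection_level`, `allocations`, and `best_response` are hypothetical.

```python
def protection_level(defender_cpus, attacker_cpus):
    """Fraction of devices held by the defender in one round.

    Assumptions: strictly more CPUs wins a device; a tie counts as 0.5.
    """
    if len(defender_cpus) != len(attacker_cpus):
        raise ValueError("both players must allocate over the same devices")
    score = 0.0
    for d, a in zip(defender_cpus, attacker_cpus):
        if d > a:
            score += 1.0
        elif d == a:
            score += 0.5
    return score / len(defender_cpus)


def allocations(total, devices):
    """Yield every way to split `total` CPUs over `devices` devices."""
    if devices == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in allocations(total - first, devices - 1):
            yield (first,) + rest


def best_response(total, attacker_cpus):
    """Defender allocation that maximizes protection against a known attack."""
    return max(
        allocations(total, len(attacker_cpus)),
        key=lambda alloc: protection_level(alloc, attacker_cpus),
    )


# Example: 4 defender CPUs against an attacker who spreads 4 CPUs as (2, 1, 1).
attack = (2, 1, 1)
choice = best_response(4, attack)
print(choice, protection_level(choice, attack))
```

In the dynamic game the defender does not observe the attacker's allocation in advance, which is why a learning strategy is used instead of this kind of direct best response.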
Unmanned aerial vehicle (UAV) systems are vulnerable to smart attackers, who are selfish and subjective end-users and use smart radio devices to change their attack types and policies based on the ongoing UAV transmission and network states. In this paper, we apply prospect theory to formulate a subjective smart attack game for the UAV transmission, in which a smart attacker Eve makes subjective decisions to choose the attack type, such as jamming, spoofing, or eavesdropping, without knowing the attack detection accuracy of the UAV system, and the UAV transmit power on multiple radio channels is chosen to resist smart attacks. Reinforcement-learning-based UAV power allocation strategies are proposed to achieve the optimal power allocation against smart attacks without knowing the attack model and the channel model in the dynamic game. A deep Q-learning-based UAV power allocation strategy combines Q-learning and deep learning to accelerate learning for the case with a large number of channel states and attack modes. Simulation results show that our proposed UAV power allocation strategy can suppress the attack motivation of subjective smart attackers and increase the secrecy capacity and the utility of the UAV system.
Fog computing is an energy-efficient and cost-effective paradigm to help alleviate the pressure of resource-constrained mobile devices (MDs) running computation-intensive applications. In this article, we investigate the joint task partitioning and power control problem in a fog computing network with multiple MDs and fog devices (FDs), where each MD has to complete a periodic computation task under the constraints of delay and energy consumption. Each task can be partitioned into multiple subtasks and offloaded to the FDs according to the task partition strategy and transmission power strategy to reduce task execution delay and energy consumption. To this end, we present a multiagent deep deterministic policy gradient (MADDPG)-based task offloading algorithm for MDs to maximize the long-term system utility, which accounts for the execution delay and energy consumption. Each MD inputs its local information, e.g., the task requirements, the available communication and computation resources of the FDs, and the computation resources and battery level of the MD, into a distributed actor network to generate a task offloading policy, while a centralized critic network is used to update the weights of the actor networks to improve offloading performance. Numerical simulation results demonstrate the effectiveness of the proposed scheme in improving the system utility and reducing both the average execution delay and the average energy consumption.
Mobile edge computing helps healthcare Internet of Things (IoT) devices with energy harvesting provide satisfactory quality of experience for computation-intensive applications. We propose a reinforcement learning (RL)-based privacy-aware offloading scheme to help healthcare IoT devices protect both the user location privacy and the usage pattern privacy. More specifically, this scheme enables a healthcare IoT device to choose the offloading rate that improves the computation performance, protects user privacy, and saves the energy of the IoT device without being aware of the privacy leakage, IoT energy consumption, and edge computation model. This scheme uses transfer learning to reduce the random exploration in the initial learning process and applies a Dyna architecture that provides simulated offloading experiences to accelerate the learning process. A post-decision state learning method uses the known channel state model to further improve the offloading performance. We provide the performance bound of this scheme regarding the privacy level, the energy consumption, and the computation latency for three typical healthcare IoT offloading scenarios. Simulation results show that this scheme can reduce the computation latency, reduce the energy consumption, and improve the privacy level of the healthcare IoT device compared with the benchmark scheme.
Federated learning (FL) represents a promising distributed machine learning paradigm that allows smart devices to collaboratively train a shared model via providing local data sets. However, problems considering multiple co-existing FL services and different types of service providers are rarely studied. In this paper, we investigate a multiple FL service trading problem in Unmanned Aerial Vehicle (UAV)-aided networks, where FL service demanders (FLSDs) aim to purchase various data sets from feasible clients (smart devices, e.g., smartphones, smart vehicles), and model aggregation services from UAVs, to fulfill their requirements. An auction-based trading market is established to facilitate the trading among three parties, i.e., FLSDs acting as buyers, distributed client groups acting as data-sellers, and UAVs acting as UAV-sellers. The proposed auction is formalized as a 0-1 integer programming problem, aiming to maximize the overall buyers' revenue via investigating winner determination and payment rule design. Specifically, since two seller types (data-sellers and UAV-sellers) are considered, an idea integrating seller pair and joint bid is introduced, which turns diverse sellers into virtual seller pairs. Vickrey-Clarke-Groves (VCG)-based and one-sided matching-based mechanisms are proposed. The former achieves optimal solutions but is computationally intractable, whereas the latter obtains suboptimal solutions that approach the optimal ones with low computational complexity, especially when the number of participants is large. Significant properties such as truthfulness and individual rationality are comprehensively analyzed for both mechanisms. Extensive experimental results verify the properties and demonstrate that our proposed mechanisms outperform representative methods significantly.