Effective sharing of physical data facilitates the functionality of Industrial IoTs (IIoTs), which is believed to be a primary basis for Industry 4.0. These physical data, while providing pivotal information for multiple components of a production system, also raise severe privacy issues for both workers and manufacturers, thus aggravating the challenges of data sharing. Current designs tend to simplify the behaviors of participants for the sake of theoretical analysis, and they cannot properly handle the challenges in IIoTs, where the behaviors are more complicated and correlated. Therefore, this paper proposes a privacy-preserved data sharing framework for IIoTs, where multiple competing data consumers exist in different stages of the system. The framework allows data contributors to share their contents upon request. The uploaded contents are perturbed to protect the sensitive status of contributors, and differential privacy is adopted in the perturbation to guarantee privacy preservation. The data collector then processes the contents and relays them to subsequent data consumers, gaining both its own data utility and extra profits from the data relay. Two algorithms are proposed for data sharing in different scenarios, based on whether the service provider will further process the contents to retain its exclusive utility. For both algorithms, this work also provides a comprehensive consideration of privacy, data utility, bandwidth efficiency, payment, and rationality for data sharing. Finally, the evaluation on real-world datasets demonstrates the effectiveness of the proposed methods and offers clues for data sharing towards Industry 4.0.
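As a rough illustration of the perturbation step, the sketch below applies the standard Laplace mechanism to a single numeric reading. The abstract does not say which differentially private mechanism the framework uses, so the Laplace mechanism, the function names `laplace_noise` and `perturb`, and the parameters `sensitivity` and `epsilon` are all assumptions made for this example.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    # The difference of two i.i.d. Exp(rate = 1/scale) variables is Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def perturb(value, sensitivity, epsilon, rng):
    """Return value plus Laplace noise calibrated to sensitivity / epsilon.

    Hypothetical helper: `sensitivity` bounds how much one contributor can
    change the value, and `epsilon` is the privacy budget (smaller = noisier).
    """
    if sensitivity <= 0 or epsilon <= 0:
        raise ValueError("sensitivity and epsilon must be positive")
    return value + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(0)
    reading = 42.0  # made-up sensor reading
    print(perturb(reading, sensitivity=1.0, epsilon=0.5, rng=rng))
```

A smaller `epsilon` gives stronger privacy but a noisier reading, which is the privacy versus data utility trade-off the abstract refers to.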
Releasing social network data could seriously breach user privacy. User profiles and friendship relations are inherently private. Unfortunately, sensitive information may be inferred from released data through data mining techniques. Therefore, sanitizing network data prior to release is necessary. In this paper, we explore how to launch an inference attack that exploits social networks with a mixture of non-sensitive attributes and social relationships. We map this issue to a collective classification problem and propose a collective inference model. In our model, an attacker utilizes user profiles and social relationships in a collective manner to predict sensitive information of related victims in a released social network dataset. To protect against such attacks, we propose a data sanitization method that collectively manipulates user profiles and friendship relations. Besides sanitizing friendship relations, the proposed method can take advantage of various data-manipulating methods. We show on three social network datasets that we can easily reduce the adversary's prediction accuracy on sensitive information while causing a smaller accuracy decrease on non-sensitive information. This is the first work to employ collective methods involving various data-manipulating methods and social relationships to protect against inference attacks in social networks.
Battery-Free Wireless Sensor Networks (BF-WSNs) have attracted increasing interest in recent years. To reduce the latency in BF-WSNs, the Minimum Latency Aggregation Scheduling (MLAS) problem with coverage requirement $q$ was recently proposed, which tries to choose $q$ percent of nodes for communication and aggregation. The existing method selects nodes adaptively according to their energy status and schedules these nodes to achieve the minimum latency. Unfortunately, it cannot guarantee the distribution of the aggregated nodes and may leave these nodes squeezed into a small area, resulting in poor aggregation quality. Thus, we re-investigate the $q$-coverage MLAS problem in this article, which can guarantee that the aggregated nodes are distributed evenly. Firstly, the 1-coverage MLAS problem, in which each node can be covered by at least one aggregated node, is studied. To reduce the latency, we intertwine the selection of aggregated nodes with the computation of a collision-free communication schedule. Two algorithms are proposed that schedule the communication tasks in the bottom-up and top-down manners, respectively. Secondly, to satisfy an arbitrary coverage requirement $q$, three algorithms are proposed to guarantee that the aggregated nodes are evenly distributed in the network with a low latency. Additionally, a method to extend the proposed algorithms to BF-WSNs with multiple channels is also studied. The theoretical analysis and simulation results verify that the proposed algorithms have high performance in terms of latency.
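To make the 1-coverage requirement concrete, the sketch below selects nodes so that every node is either selected or adjacent to a selected node. It uses the textbook greedy dominating-set heuristic, not the bottom-up or top-down algorithms of the article, and it ignores energy status and collision-free scheduling. The interpretation of "covered" as "selected or a neighbor of a selected node", the function name `greedy_one_coverage`, and the adjacency-dictionary input are assumptions made for this example.

```python
def greedy_one_coverage(adjacency):
    """Greedily pick nodes until every node is selected or has a selected neighbor.

    `adjacency` maps each node to an iterable of its neighbors (assumed symmetric).
    """
    uncovered = set(adjacency)
    selected = []
    while uncovered:
        # Pick the node whose closed neighborhood covers the most uncovered nodes;
        # iterating in sorted order makes tie-breaking deterministic.
        best = max(
            sorted(adjacency),
            key=lambda v: len(({v} | set(adjacency[v])) & uncovered),
        )
        selected.append(best)
        uncovered -= {best} | set(adjacency[best])
    return selected


if __name__ == "__main__":
    # A made-up five-node path topology: 0 - 1 - 2 - 3 - 4
    path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
    print(greedy_one_coverage(path))  # [1, 3]
```

On the path example, nodes 1 and 3 are chosen, so every other node has a selected neighbor. A latency-aware method would additionally have to decide when each selected node transmits.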
Owing to the prominent development of public transportation systems, taxi flows can nowadays serve as a reasonable reference for urban population trends. Awareness of this knowledge will significantly benefit regular individuals, city planners, and the taxi companies themselves. However, publishing such contents indiscriminately will severely threaten the private information of taxi companies: both their own market ratios and the sensitive information of passengers and drivers will be revealed. Consequently, we propose in this paper a novel framework for privacy-preserved traffic sharing among taxi companies, which jointly considers privacy, profits, and fairness for participants. The framework allows companies to share the scales of their taxi flows, and common knowledge is derived from these statistics. Two algorithms are proposed for the derivation of sharing schemes in different scenarios, depending on whether the common knowledge can be accessed by third parties such as individuals and governments. Differential privacy is utilized in both cases to protect the sensitive information of taxi companies. Finally, both algorithms are validated on real-world data traces under multiple market distributions.
The massive amount of data generated by Internet-of-Things (IoT) devices places enormous pressure on sensory data query processing. Due to the limited computation and data transmission capabilities of traditional wireless sensor networks (WSNs), current query processing methods are no longer effective. Furthermore, processing vast amounts of sensory data also overloads the cloud. To address these problems, we investigate query processing in an edge-assisted IoT data monitoring system (EDMS). Multiaccess edge computing (MEC) is an emerging topic in IoTs. Unlike WSNs, the edge servers in an EDMS can deploy computation and storage resources to nearby IoT devices and offer data processing services. Therefore, queries toward massive sensory data can be processed in an EDMS in a distributed manner, with the edge servers handling the sensory data and thereby reducing the workload of the cloud. In this article, we define a query processing problem in an EDMS, which aims to derive a distributed query plan with the minimum query response latency. We prove that this problem is NP-hard and propose a corresponding approximation algorithm whose performance is bounded. Furthermore, we evaluate the performance of the proposed algorithm through extensive simulations.
The amount of sensory data is growing explosively due to the increasing popularity of Wireless Sensor Networks (WSNs). The scale of sensory data in many applications has already exceeded several petabytes annually, which is beyond the computation and transmission capabilities of conventional WSNs. On the other hand, the information carried by big sensory data has high redundancy because of the strong correlation among sensory data. In this paper, we introduce the novel concept of the ϵ-Kernel Dataset, which is only a small data subset and can represent the vast information carried by big sensory data with an information loss rate of less than ϵ, where ϵ can be arbitrarily small. We prove that drawing the minimum ϵ-Kernel Dataset is polynomial-time solvable and provide a centralized algorithm with O(n^3) time complexity. Furthermore, a distributed algorithm with constant complexity O(1) is designed. It is shown that the result returned by the distributed algorithm can satisfy the ϵ requirement with a near-optimal size. In addition, two distributed algorithms for maintaining the correlation coefficients among sensor nodes are developed. Finally, extensive real experiment results and simulation results are presented. The results indicate that all the proposed algorithms have high performance in terms of accuracy and energy efficiency.
Smart mobile devices and mobile apps have been rolling out at swift speeds over the last decade, turning these devices into convenient and general-purpose computing platforms. Sensory data from smart devices are important resources to nourish mobile services, and they are regarded as innocuous information that can be obtained without user permissions. In this article, we show that this seemingly innocuous information could cause serious privacy issues. First, we demonstrate that users' tap positions on the screens of smart devices can be identified based on sensory data by employing deep learning techniques. Second, we show that tap stream profiles for each type of app can be collected, so that a user's app usage habits can be accurately inferred. In our experiments, the sensory data and mobile app usage information of 102 volunteers are collected. The experiment results demonstrate that the prediction accuracy of tap position inference can be at least 90 percent by utilizing convolutional neural networks. Furthermore, based on the inferred tap position information, users' app usage habits and passwords may be inferred with high accuracy.
Most existing query processing algorithms for wireless sensor networks (WSNs) can only deal with discrete values. However, since the monitored environment always changes continuously with time, discrete values cannot describe the environment accurately and, hence, may not satisfy a variety of query requirements, such as queries for the maximal, minimal, and inflection points. It is, therefore, of great interest to introduce new queries capable of processing time-continuous data. This paper investigates curve query processing for WSNs, since a curve is an effective way to represent continuous sensed data. Specifically, a sensed curve derivation algorithm to support curve query processing in WSNs is first proposed. Then, the aggregation operation is employed as an example to illustrate curve query processing, and the corresponding accurate and approximate aggregation algorithms are devised. We demonstrate that the energy cost of the approximate aggregation algorithm is optimal, provided that the required precision is satisfied. The theoretical analysis and experimental results indicate that the proposed algorithms can achieve high performance in terms of accuracy and energy efficiency.
Deep learning-based models have been widely developed for atrial fibrillation (AF) detection in electrocardiogram (ECG) signals. However, owing to the inevitable over-fitting problem, the classification accuracy of the developed models differs severely when they are applied to independent test datasets. This situation is more significant for AF detection from dynamic ECGs. In this study, we explored two potential training strategies to address the over-fitting problem in AF detection. The first is to use the Fast Fourier transform (FFT) and a Hanning-window-based filter to suppress the influence of individual differences. The second is to train the model on wearable ECG data to improve the robustness of the model. Wearable ECG data from 29 patients with arrhythmia were collected for at least 24 h. To verify the effectiveness of the training strategies, a Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN)-based model was proposed and tested. We tested the model on the independent wearable ECG data set, as well as the MIT-BIH Atrial Fibrillation database and the PhysioNet/Computing in Cardiology Challenge 2017 database. The model achieved 96.23%, 95.44%, and 95.28% accuracy rates on the three databases, respectively. Compared with the accuracy rates on each training set, the accuracy of the model trained with the proposed training strategies dropped by only 2%, whereas the accuracy of the model trained without them decreased by approximately 15%. Therefore, the proposed training strategies serve as effective mechanisms for devising a robust AF detector and significantly enhance the detection accuracy rates of the resulting deep networks.
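The sketch below shows only the generic building blocks named in the first strategy: tapering a signal segment with a Hann (Hanning) window and taking its FFT. The abstract does not describe the actual filter design, so this is not the study's preprocessing pipeline. The function names, the pure-Python radix-2 FFT, the power-of-two segment length, and the synthetic test signal are all assumptions made for this example.

```python
import cmath
import math


def hann_window(n):
    """Symmetric Hann window of length n (n >= 2)."""
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2]), fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def windowed_spectrum(signal, fs):
    """Return (frequencies in Hz, magnitudes) of the Hann-windowed segment."""
    n = len(signal)
    if n < 2 or n & (n - 1):
        raise ValueError("segment length must be a power of two >= 2")
    window = hann_window(n)
    spectrum = fft([s * w for s, w in zip(signal, window)])
    half = n // 2 + 1  # keep the non-negative frequencies only
    freqs = [k * fs / n for k in range(half)]
    return freqs, [abs(c) for c in spectrum[:half]]


def dominant_frequency(signal, fs):
    """Frequency (Hz) of the largest spectral magnitude."""
    freqs, mags = windowed_spectrum(signal, fs)
    return freqs[max(range(len(mags)), key=mags.__getitem__)]


if __name__ == "__main__":
    fs = 128.0  # made-up sampling rate in Hz
    tone = [math.sin(2.0 * math.pi * 8.0 * i / fs) for i in range(256)]
    print(dominant_frequency(tone, fs))
```

The window reduces spectral leakage at the segment boundaries, which is the usual reason for pairing a Hann window with an FFT. How the resulting spectrum is filtered to suppress individual differences is specific to the study and is not reproduced here.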