With the revelation that Facebook handed over personally identifiable information of more than 87 million users to Cambridge Analytica, it is now imperative that comprehensive privacy legislation be developed. Technologists, researchers, and innovators should meaningfully contribute to the development of these policies.
Privacy has become a major concern for consumers in electricity consumption activities, as privacy disclosure may cause losses to individuals. Since the information exchange and updates in distributed energy management (DEM) of smart grids give eavesdroppers an opportunity to obtain private information, it is worth studying privacy disclosure in DEM and designing effective privacy-preserving schemes. In this paper, we investigate the privacy concerns of a consensus-based DEM algorithm in which generation units and responsive consumers cooperatively maximize social welfare. First, we reveal that the private information of consumers, including their electricity consumption and their sensitivity to the electricity price, can be disclosed under traditional consensus-based DEM. Then, we propose a secret-function-based privacy-preserving algorithm in which each node adds zero-sum, exponentially decaying noise to the original data before communication. It is assumed that the local secret function is known only to neighboring nodes. To relax this assumption, we propose a second privacy-preserving algorithm in which each node uses its real information for the state update but broadcasts a noisy version. We show that both proposed algorithms preserve privacy, and we analyze the privacy degree through (ε,δ)-data-privacy. At the same time, the convergence and optimality of the final solution are maintained. Extensive simulations demonstrate the effectiveness of the proposed algorithms.
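The masking idea in this abstract (zero-sum, exponentially decaying noise added to the states that nodes broadcast) can be illustrated with a minimal average-consensus sketch. This is a sketch of the general telescoping-noise technique, not the paper's algorithm: the function name, the doubly stochastic weight matrix, and the decay parameter `rho` are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def private_consensus(x0, A, rho=0.5, steps=50):
    """Average consensus where each node masks its broadcast state with
    exponentially decaying, telescoping (zero-sum in the limit) noise.

    A is a doubly stochastic weight matrix and x0 the initial states.
    The noise at step k is n_k = w_k - w_{k-1} with w_0 = 0 and
    w_k ~ rho**k * N(0, 1), so the cumulative injected noise telescopes
    to w_K -> 0 and the average of the initial states is preserved.
    """
    x = np.asarray(x0, dtype=float).copy()
    n = len(x)
    w_prev = np.zeros(n)
    for k in range(1, steps + 1):
        w = (rho ** k) * rng.standard_normal(n)
        noise = w - w_prev            # telescoping: partial sums equal w_k
        w_prev = w
        x = A @ (x + noise)           # neighbors only ever observe x + noise
    return x
```

Because early broadcasts carry large noise, an eavesdropper cannot recover a node's initial state from any single message, yet the decaying, telescoping construction lets all nodes still converge to the true average.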
Towards Personalized Federated Learning. Tan, Alysa Ziying; Yu, Han; Cui, Lizhen; et al. IEEE Transactions on Neural Networks and Learning Systems, vol. 34, no. 12, Dec. 2023. Journal article; open access.
In parallel with the rapid adoption of artificial intelligence (AI) empowered by advances in AI research, there has been growing awareness of, and concern about, data privacy. Recent significant developments in the data regulation landscape have prompted a seismic shift of interest toward privacy-preserving AI. This has contributed to the popularity of Federated Learning (FL), the leading paradigm for training machine learning models on data silos in a privacy-preserving manner. In this survey, we explore the domain of personalized FL (PFL) to address the fundamental challenges of FL on heterogeneous data, a universal characteristic of real-world datasets. We analyze the key motivations for PFL and present a unique taxonomy of PFL techniques categorized according to the key challenges and personalization strategies in PFL. We highlight their key ideas, challenges, and opportunities, and envision promising future trajectories of research toward new PFL architectural designs, realistic PFL benchmarking, and trustworthy PFL approaches.
With the outbreak of COVID-19, contact tracing has become a widely used intervention to control the spread of this highly infectious disease. This article explores an individual's intention to adopt COVID-19 digital contact tracing (DCT) apps. The conceptual framework developed for this article combines procedural fairness theory, dual calculus theory, protection motivation theory, the theory of planned behavior, and Hofstede's cultural dimension theory. The study adopts a quantitative approach, collecting data from 714 respondents using a random sampling technique. The proposed model is tested using structural equation modeling. The empirical results show that the perceived effectiveness of the privacy policy negatively influenced privacy concerns, whereas perceived vulnerability had a positive influence. Expected personal and community-related outcomes of sharing information positively influenced attitudes toward DCT apps, while privacy concerns had a negative effect. The intention to adopt DCT apps was positively influenced by attitude, subjective norms, and privacy self-efficacy. This article is the first to empirically test the adoption of DCT apps during the COVID-19 pandemic and contributes both theoretically and practically toward understanding the factors influencing their widespread adoption.
Social network data can help with obtaining valuable insight into social behaviors and revealing their underlying benefits. New big data technologies are emerging to make it easier to discover meaningful social information, from market analysis to counterterrorism. Unfortunately, both diverse social datasets and big data technologies raise stringent privacy concerns. Adversaries can launch inference attacks to predict sensitive latent information that social users are unwilling to publish. Therefore, there is a tradeoff between data benefits and privacy concerns. In this paper, we investigate how to optimize the tradeoff between latent-data privacy and customized data utility. We propose a data sanitization strategy that does not greatly reduce the benefits brought by social network data, while sensitive latent information can still be protected. Even against powerful adversaries mounting optimal inference attacks, the proposed data sanitization strategy preserves both data benefits and social structure while guaranteeing optimal latent-data privacy. To the best of our knowledge, this is the first work that preserves both data benefits and social structure simultaneously while combating powerful adversaries.
Edge computing and distributed machine learning have advanced to a level that can revolutionize an organization. Distributed devices such as the Internet of Things (IoT) often produce large amounts of data, eventually resulting in big data that can be vital in uncovering hidden patterns and other insights in numerous fields such as healthcare, banking, and policing. Data related to areas such as healthcare and banking can contain potentially sensitive information that can become public if it is not appropriately sanitized. Federated learning (FedML) is a recently developed distributed machine learning (DML) approach that tries to preserve privacy by bringing the learning of an ML model to data owners' devices. However, the literature describes attack methods, such as membership inference, that exploit the vulnerabilities of ML models as well as of coordinating servers to retrieve private data. Hence, FedML needs additional measures to guarantee data privacy. Furthermore, big data often requires more resources than are available in a standard computer. This paper addresses these issues by proposing a distributed perturbation algorithm, named DISTPAB, for privacy preservation of horizontally partitioned data. DISTPAB alleviates computational bottlenecks by distributing the task of privacy preservation, exploiting the asymmetry of resources in a distributed environment, which can include resource-constrained devices as well as high-performance computers. Experiments show that DISTPAB provides high accuracy, high efficiency, high scalability, and high attack resistance. Further experiments on privacy-preserving FedML show that DISTPAB is an excellent solution for stopping privacy leaks in DML while preserving high data utility.
Digitisation is arguably an inevitable feature of contemporary urban development, yet privacy issues arising from the mass data collection, transmission and processing it entails continue to be a poorly understood and contentious issue for people living in cities. This article uses a case study approach to provide new evidence of the detailed perspectives of citizens and policy makers on data privacy in rapidly digitising urban environments, with a focus on one of the UK's most prominent smart cities: Manchester. It adds to the literature on smart cities by applying complementary scholarship from two areas – trust and participation – to comparatively analyse citizens' views and concerns about data-gathering activity in their city alongside policy makers' efforts to incorporate data privacy matters into their digital city planning. The article finds a clear – but reparable – data privacy disconnect between people and digital policy makers and explores how citizens' data privacy concerns may be addressed through a lens of trust and participation.
High-quality private machine learning (ML) data stored in local data centers has become a key competitive factor for AI corporations. In this paper, we present a novel insider attack called Matryoshka to reveal the possibility of breaking the privacy of ML data even with no exposed interface. Our attack employs a scheduled-to-publish DNN model as a carrier model for the covert transmission of secret models that memorize the information of private ML data which otherwise has no interface to the outsider. At the core of our attack is a novel parameter-sharing approach that exploits the learning capacity of the carrier model for information hiding. Our approach simultaneously achieves: (i) High capacity: with almost no utility loss to the carrier model, Matryoshka can transmit over 10,000 real-world data samples within a carrier model that has 220× fewer parameters than the total size of the stolen data, and can simultaneously transmit multiple heterogeneous datasets or models within a single carrier model at a trivial distortion rate, neither of which can be done with existing steganography techniques; (ii) Decoding efficiency: once the published carrier model is downloaded, an outside colluder can exclusively decode the hidden models from the carrier model with only several integer secrets and knowledge of the hidden model architecture; (iii) Effectiveness: almost all the recovered models either perform similarly to models trained independently on the private data or can be further used to extract memorized raw training data with low error; (iv) Robustness: information redundancy is naturally implemented to achieve resilience against common post-processing techniques applied to the carrier before publishing; (v) Covertness: a model inspector with different levels of prior knowledge could hardly differentiate a carrier model from a normal model.
Recent transient-execution attacks, such as RIDL, Fallout, and ZombieLoad, demonstrated that attackers can leak information while it transits through microarchitectural buffers. Named Microarchitectural Data Sampling (MDS) by Intel, these attacks are likened to "drinking from the firehose", as the attacker has little control over what data is observed and from what origin. Unable to prevent the buffers from leaking, Intel issued countermeasures via microcode updates that overwrite the buffers when the CPU changes security domains. In this work we present CacheOut, a new microarchitectural attack that is capable of bypassing Intel's buffer-overwrite countermeasures. We observe that as data is evicted from the CPU's L1 cache, it is often transferred back to the leaky CPU buffers, where it can be recovered by the attacker. CacheOut improves over previous MDS attacks by allowing the attacker to choose which data to leak from the CPU's L1 cache, as well as which part of a cache line to leak. We demonstrate that CacheOut can leak information across multiple security boundaries, including those between processes, between virtual machines, between user and kernel space, and from SGX enclaves.
Robust Aggregation for Federated Learning. Pillutla, Krishna; Kakade, Sham M.; Harchaoui, Zaid. IEEE Transactions on Signal Processing, vol. 70, 2022. Journal article; peer reviewed; open access.
We present a novel approach to federated learning that endows its aggregation process with greater robustness to potential poisoning of local data or model parameters of participating devices. The proposed approach, Robust Federated Aggregation (RFA), relies on the aggregation of updates using the geometric median, which can be computed efficiently using a Weiszfeld-type algorithm. RFA is agnostic to the level of corruption and aggregates model updates without revealing each device's individual contribution. We establish the convergence of the robust federated learning algorithm for the stochastic learning of additive models with least squares. We also offer two variants of RFA: a faster one with one-step robust aggregation, and another one with on-device personalization. We present experimental results with additive models and deep networks for three tasks in computer vision and natural language processing. The experiments show that RFA is competitive with the classical aggregation when the level of corruption is low, while demonstrating greater robustness under high corruption.
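The geometric-median aggregation at the heart of RFA can be illustrated with a standard smoothed Weiszfeld iteration. This is a generic sketch, not the authors' implementation: the smoothing constant `eps`, the stopping rule, the unweighted averaging, and the function name are all illustrative assumptions.

```python
import numpy as np

def geometric_median(points, eps=1e-6, max_iter=100):
    """Weiszfeld-type iteration for the geometric median.

    `points` is an (n, d) array, e.g. one flattened model update per row.
    Returns the point minimizing the sum of Euclidean distances to all
    rows, which down-weights outlying (possibly poisoned) updates.
    """
    points = np.asarray(points, dtype=float)
    z = points.mean(axis=0)              # initialize at the arithmetic mean
    for _ in range(max_iter):
        dists = np.linalg.norm(points - z, axis=1)
        dists = np.maximum(dists, eps)   # smooth to avoid division by zero
        w = 1.0 / dists                  # inverse-distance weights
        z_new = (w[:, None] * points).sum(axis=0) / w.sum()
        if np.linalg.norm(z_new - z) < eps:
            return z_new
        z = z_new
    return z
```

With one corrupted update among several honest ones, the geometric median stays near the honest cluster, whereas the arithmetic mean is dragged toward the outlier, which is the robustness property the abstract describes.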