Time-sequence data is high dimensional and contains a lot of information, which can be utilized in various fields such as insurance, finance, and advertising. Personal data, including time-sequence data, is converted to anonymized datasets, which need to strike a balance between privacy and utility. In this paper, we consider low-rank matrix factorization as an anonymization method and evaluate its efficiency. We convert time-sequence datasets to matrices and evaluate both privacy and utility. The record IDs in time-sequence data are changed at regular intervals to reduce the re-identification risk. However, since individuals tend to behave in a similar fashion over periods of time, there remains a risk of record linkage even if record IDs are different. Hence, we evaluate the re-identification and linkage risks as privacy risks of time-sequence data. Our experimental results show that matrix factorization is a viable anonymization method and can achieve better utility than existing anonymization methods.
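As a concrete illustration of the approach described above, the following is a minimal sketch of low-rank anonymization via truncated SVD, assuming the time-sequence data has already been converted to a records-by-time-steps matrix; the paper's exact factorization procedure and rank selection may differ.

```python
# Minimal sketch: anonymizing a time-sequence matrix via truncated SVD
# (one possible low-rank factorization; the paper's exact procedure may differ).
import numpy as np

def low_rank_anonymize(X, rank):
    """Return a rank-`rank` approximation of X (rows = records, cols = time steps)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    # Keep only the top-`rank` singular components; fine detail that could
    # identify individual records is discarded, coarse trends are retained.
    return U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]

rng = np.random.default_rng(0)
X = rng.random((100, 48))          # e.g., 100 records x 48 time steps (toy data)
X_anon = low_rank_anonymize(X, rank=5)
print(np.linalg.norm(X - X_anon) / np.linalg.norm(X))  # utility loss (relative error)
```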
The number of IT services that use machine learning (ML) algorithms is continuously and rapidly growing, and many of them are used in practice to make some type of prediction from personal data. Not surprisingly, due to this sudden boom in ML, the way personal data are handled in ML systems is starting to raise serious privacy concerns that were previously unconsidered. Recently, Fredrikson et al. (USENIX Security 2014, CCS 2015) proposed a novel attack against ML systems called the model inversion attack that aims to infer sensitive attribute values of a target user. In their work, for the model inversion attack to be successful, the adversary is required to obtain two types of information concerning the target user prior to the attack: the output value (i.e., prediction) of the ML system and all of the non-sensitive values used to learn the output. Therefore, although the attack does raise new privacy concerns, since the adversary is required to know all of the non-sensitive values in advance, it is not completely clear how much risk is incurred by the attack. In particular, even though the users may regard these values as non-sensitive, it may be difficult for the adversary to obtain all of the non-sensitive attribute values prior to the attack, hence making the attack invalid. The goal of this paper is to quantify the risk of model inversion attacks in the case when non-sensitive attributes of a target user are not available to the adversary. To this end, we first propose a general model inversion (GMI) framework, which models the amount of auxiliary information available to the adversary. Our framework captures the model inversion attack of Fredrikson et al. as a special case, while also capturing model inversion attacks that infer sensitive attributes without knowledge of the non-sensitive attributes. For the latter attack, we provide a general methodology for inferring sensitive attributes of a target user without knowledge of non-sensitive attributes. At a high level, we use the data poisoning paradigm in a conceptually novel way and inject malicious data into the ML system in order to modify the internal ML model into a target ML model: a special type of ML model that allows one to perform model inversion attacks without knowledge of the non-sensitive attributes. Finally, following our general methodology, we cast ML systems that internally use linear regression models into our GMI framework and propose a concrete algorithm for model inversion attacks that does not require knowledge of the non-sensitive attributes. We show the effectiveness of our model inversion attack through experimental evaluation using two real datasets.
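To make the attack setting concrete, here is a minimal sketch of the baseline model inversion step on a linear regression model, where the adversary knows the prediction and all non-sensitive attribute values and solves for the single sensitive attribute; the variable names and toy model are illustrative, not the paper's GMI algorithm.

```python
# Sketch of the baseline model inversion setting on a linear regression model:
# the adversary knows the prediction y and all non-sensitive attributes, and
# solves for the single sensitive attribute. Names and values are illustrative.
import numpy as np

def invert_sensitive(y, x_nonsensitive, w_nonsensitive, w_sensitive, bias):
    """Recover the sensitive attribute from y = w_s*x_s + w_ns.x_ns + b."""
    return (y - bias - w_nonsensitive @ x_nonsensitive) / w_sensitive

# Toy model and target record.
w_ns, w_s, b = np.array([0.5, -1.2, 0.3]), 2.0, 0.1
x_ns, x_s = np.array([1.0, 0.4, -0.7]), 0.8
y = w_s * x_s + w_ns @ x_ns + b                 # prediction observed by the adversary
print(invert_sensitive(y, x_ns, w_ns, w_s, b))  # -> 0.8 (exactly recovers x_s)
```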
Hardware Trojans (HTs) have become a serious problem, and their elimination is strongly required to enhance the security and safety of integrated circuits. An effective solution is to identify HTs at the gate level via machine learning techniques. However, machine learning has specific vulnerabilities, such as adversarial examples. In fact, it has been reported that adversarially modified HTs greatly degrade the performance of a machine learning-based HT detection method. Therefore, we propose a robust HT detection method using adversarial training (R-HTDetector). We formally describe the robustness of R-HTDetector against HT modifications. Our work gives the first adversarial training for HT detection with theoretical backgrounds. We show through experiments with Trust-HUB benchmarks that R-HTDetector overcomes adversarial examples while maintaining its original accuracy.
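As a rough illustration of the adversarial training idea, the sketch below shows a generic FGSM-style adversarial training loop over hypothetical gate-level feature vectors in PyTorch; R-HTDetector's actual features, architecture, and training formulation are not reproduced here.

```python
# Generic FGSM-based adversarial training loop over (hypothetical) netlist
# feature vectors; R-HTDetector's exact formulation is not reproduced here.
import torch, torch.nn as nn

model = nn.Sequential(nn.Linear(51, 64), nn.ReLU(), nn.Linear(64, 2))  # 51 = assumed feature dim
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def fgsm(x, y, eps=0.05):
    x = x.clone().detach().requires_grad_(True)
    loss_fn(model(x), y).backward()
    return (x + eps * x.grad.sign()).detach()      # perturb features toward higher loss

def train_step(x, y):
    x_adv = fgsm(x, y)                             # craft adversarial examples
    opt.zero_grad()
    loss = loss_fn(model(x), y) + loss_fn(model(x_adv), y)  # clean + adversarial loss
    loss.backward()
    opt.step()
    return loss.item()

# Toy batch: 32 nets with 51 features each, labels 0 = normal, 1 = Trojan.
x, y = torch.randn(32, 51), torch.randint(0, 2, (32,))
print(train_step(x, y))
```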
In the fourth industrial revolution, securing supply chains has become an ever-growing concern. One such cyber threat is a hardware Trojan (HT), a malicious modification to an IC. HTs are often identified during the hardware manufacturing process but should be removed earlier, in the design process. Machine learning-based HT detection in gate-level netlists is an efficient approach to identifying HTs at an early stage. However, feature-based modeling has limitations in terms of discovering an appropriate set of HT features. We thus propose NHTD-GL in this paper, a novel node-wise HT detection method based on graph learning (GL). Given the formal analysis of the HT features obtained from domain knowledge, NHTD-GL bridges the gap between graph representation learning and feature-based HT detection. The experimental results demonstrate that NHTD-GL achieves 0.998 detection accuracy and 0.921 F1-score, outperforming state-of-the-art node-wise HT detection methods. NHTD-GL extracts HT features without heuristic feature engineering.
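For intuition about node-wise detection via graph learning, here is a minimal two-layer GCN for per-node classification written in plain PyTorch; the graph, feature dimensions, and architecture are toy assumptions and do not correspond to NHTD-GL's actual design.

```python
# Minimal two-layer GCN for node-wise classification on a netlist graph
# (plain PyTorch; NHTD-GL's actual architecture and features are not reproduced).
import torch, torch.nn as nn

class TinyGCN(nn.Module):
    def __init__(self, in_dim, hidden, n_classes):
        super().__init__()
        self.l1, self.l2 = nn.Linear(in_dim, hidden), nn.Linear(hidden, n_classes)

    def forward(self, A_hat, X):
        # A_hat: symmetrically normalized adjacency with self-loops, X: node features.
        H = torch.relu(self.l1(A_hat @ X))
        return self.l2(A_hat @ H)                  # per-node logits (normal vs Trojan)

def normalize_adj(A):
    A = A + torch.eye(A.size(0))                   # add self-loops
    d = A.sum(1)
    D_inv_sqrt = torch.diag(d.pow(-0.5))
    return D_inv_sqrt @ A @ D_inv_sqrt

# Toy graph: 5 gates, random features, 2 classes.
A = torch.tensor([[0,1,0,0,1],[1,0,1,0,0],[0,1,0,1,0],[0,0,1,0,1],[1,0,0,1,0]], dtype=torch.float)
X = torch.randn(5, 8)
logits = TinyGCN(8, 16, 2)(normalize_adj(A), X)
print(logits.argmax(dim=1))                        # predicted label per node
```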
Privacy risks of collaborative filtering (CF) have been widely studied. The current state-of-the-art inference attack on user behaviors (e.g., ratings/purchases on sensitive items) for CF is by Calandrino et al. (S&P, 2011). They showed that if an adversary obtained a moderate amount of a user's public behavior before some time T, she can infer the user's private behavior after time T. However, the existence of an attack that infers a user's private behavior before time T remains open. In this paper, we propose the first inference attack that reveals past private user behaviors. Our attack departs from previous techniques and is based on model inversion (MI). In particular, we propose the first MI attack on factorization-based CF systems by leveraging data poisoning by Li et al. (NIPS, 2016) in a novel way. We inject malicious users into the CF system so that adversarially chosen "decoy" items are linked with the user's private behaviors. We also show how to weaken the assumption made by Li et al. on the information available to the adversary from the whole rating matrix to only the item profile, and how to create malicious ratings effectively. We validate the effectiveness of our inference algorithm using two real-world datasets.
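The sketch below illustrates only the factorization-based structure the attack exploits: given the item profiles and a few of the target's public ratings, the user's latent vector can be estimated and used to score private items. The poisoning step that links decoy items to private behaviors is omitted, and all names and data are toy assumptions.

```python
# Sketch of the intuition behind model inversion on factorization-based CF:
# with the item profiles Q and a few of the target's public ratings, the user's
# latent vector can be estimated and then used to score unrated (private) items.
# The paper's poisoning step, which links decoy items to private items, is omitted.
import numpy as np

rng = np.random.default_rng(1)
n_items, k = 20, 4
Q = rng.normal(size=(n_items, k))        # item profiles (assumed known to the adversary)
p_true = rng.normal(size=k)              # target user's latent vector (hidden)
ratings = Q @ p_true                     # full rating vector (mostly unobserved)

public = np.arange(10)                   # indices of the user's public behaviors
p_est, *_ = np.linalg.lstsq(Q[public], ratings[public], rcond=None)

private_item = 15
print(ratings[private_item], Q[private_item] @ p_est)  # true vs inferred private rating
```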
Recently, the great demand for integrated circuits (ICs) has driven third parties to become involved in IC design and manufacturing steps. At the same time, the threat of a malicious circuit, called a hardware Trojan, being injected by third parties has been increasing. Machine learning is one of the powerful solutions for detecting hardware Trojans. However, a weakness of such machine-learning-based classification methods against adversarial examples (AEs), which cause misclassification by adding perturbations to input samples, has been reported. This paper first proposes a framework for generating adversarial examples against hardware-Trojan detection at gate-level netlists utilizing neural networks. The proposed framework replaces hardware Trojan circuits with logically equivalent ones, making them difficult to detect. Second, we propose a Trojan-net concealment degree (TCD) and a modification evaluating value (MEV) as measures of the amount of modification. Finally, based on the MEV, we select adversarial modification patterns to apply to circuits against hardware-Trojan detection. The experimental results using benchmarks demonstrate that the proposed framework successfully decreases the true positive rate (TPR) by a maximum of 30.15 points.
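As a toy illustration of a functionality-preserving replacement of the kind the framework applies, the snippet below rewrites an AND gate via De Morgan's law and checks logical equivalence over the full truth table; the TCD and MEV measures from the paper are not modeled.

```python
# Toy illustration of a functionality-preserving gate replacement (De Morgan),
# the kind of logically equivalent rewrite applied to Trojan nets. This is only
# a sketch; the TCD/MEV measures from the paper are not modeled.
from itertools import product

def and_gate(a, b):
    return a & b

def and_demorgan(a, b):
    # AND(a, b) == NOT(OR(NOT a, NOT b)): same function, different structure,
    # hence different gate-level features seen by a feature-based detector.
    return 1 - ((1 - a) | (1 - b))

assert all(and_gate(a, b) == and_demorgan(a, b) for a, b in product([0, 1], repeat=2))
print("replacement is logically equivalent")
```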
Differentially private GNNs (Graph Neural Networks) have been recently studied to provide high accuracy in various tasks on graph data while strongly protecting user privacy. In particular, a recent study proposes an algorithm to protect each user's feature vector in an attributed graph, which includes feature vectors along with node IDs and edges, with LDP (Local Differential Privacy), a strong privacy notion without a trusted third party. However, this algorithm does not protect edges (friendships) in a social graph and hence cannot protect user privacy in unattributed graphs, which include only node IDs and edges. How to provide strong privacy with high accuracy in unattributed graphs remains open. In this paper, we propose a novel LDP algorithm called the DPRR (Degree-Preserving Randomized Response) to provide LDP for edges in GNNs. Our DPRR preserves each user's degree, and hence the graph structure, while providing edge LDP. Technically, our DPRR uses Warner's RR (Randomized Response) and strategic edge sampling, where each user's sampling probability is automatically tuned using the Laplacian mechanism to preserve the degree information under edge LDP. We also propose a privacy budget allocation method to make the noise in both Warner's RR and the Laplacian mechanism small. We focus on graph classification as a task of GNNs and evaluate the DPRR using three social graph datasets. Our experimental results show that the DPRR significantly outperforms three baselines and provides accuracy close to a non-private algorithm in all datasets with a reasonable privacy budget, e.g., epsilon = 1. Finally, we introduce data poisoning attacks against our DPRR and a defense against these attacks. We evaluate them using the three social graph datasets and discuss the experimental results.
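A rough sketch of the DPRR idea as described above: each user applies Warner's RR to her neighbor bit vector, perturbs her degree with Laplace noise, and samples reported edges with a probability tuned so that the expected reported degree matches the noised degree. The exact tuning rule below is an assumption for illustration, not the paper's formula.

```python
# Sketch of the DPRR idea from the abstract: Warner's RR on each user's neighbor
# bits, a Laplace-noised degree, and an edge-sampling probability tuned so the
# expected degree after sampling matches the noised degree. The tuning rule is
# an assumption for illustration only.
import numpy as np

def dprr_report(neighbor_bits, eps_rr, eps_deg, rng):
    n = len(neighbor_bits)
    q = np.exp(eps_rr) / (np.exp(eps_rr) + 1.0)        # prob. of reporting the true bit
    rr_bits = np.where(rng.random(n) < q, neighbor_bits, 1 - neighbor_bits)

    noisy_deg = max(neighbor_bits.sum() + rng.laplace(scale=1.0 / eps_deg), 0.0)
    # Expected #ones after RR if the true degree were noisy_deg (assumed tuning).
    expected_ones = noisy_deg * q + (n - noisy_deg) * (1.0 - q)
    s = min(1.0, noisy_deg / max(expected_ones, 1e-9))  # edge-sampling probability

    return np.where((rr_bits == 1) & (rng.random(n) < s), 1, 0)

rng = np.random.default_rng(0)
bits = (rng.random(1000) < 0.02).astype(int)           # toy neighbor vector, degree ~20
report = dprr_report(bits, eps_rr=1.0, eps_deg=1.0, rng=rng)
print(bits.sum(), report.sum())                        # true degree vs reported degree
```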
Time-sequence data is high dimensional and contains a lot of information, which can be utilized in various fields such as insurance, finance, and advertising. Personal data, including time-sequence data, is often converted to anonymized datasets, which need to strike a balance between privacy and utility. In this paper, we consider low-rank matrix decomposition as one of the anonymization methods and evaluate its efficiency. We convert time-sequence datasets to matrices and evaluate both privacy and utility. The record IDs in time-sequence data are changed at regular intervals to reduce the re-identification risk. However, since individuals tend to behave in a similar fashion over periods of time, there remains a risk of record linkage even if record IDs are different. Hence, we evaluate the re-identification and linkage risks as privacy risks of time-sequence data. Our experimental results show that matrix decomposition is a viable anonymization method and can achieve better utility than existing anonymization methods.
Active Attack Against Oblivious RAM. Nakano, Yuto; Hidano, Seira; Kiyomoto, Shinsaku. 2018 IEEE 32nd International Conference on Advanced Information Networking and Applications (AINA), conference proceedings.
When a user consumes encrypted digital content (for example, video and music files), the player application accesses the secret key to decrypt the content. If the user is malicious, he can analyse the access pattern of the player application to extract the secret key efficiently. Oblivious RAMs (ORAMs) are an effective solution to such threats. However, ORAMs are only effective against "passive" attackers, who can observe the RAM accesses made by the application but cannot alter data stored in RAM. Attackers with the ability to alter data in RAM can be called "active" attackers. In this paper, we evaluate the security of ORAM schemes against active adversaries who alter data on RAM and try to efficiently extract the secret information. We also propose countermeasures against active adversaries.