Privacy protection is an important problem in learning-based systems. Recently, various works based on differential privacy have been proposed to protect an individual's privacy in machine learning and deep learning contexts. One of the state-of-the-art approaches is Private Aggregation of Teacher Ensembles (PATE), a generic framework that can be successfully applied to many different learning algorithms. In PATE, we split the private dataset into many disjoint subsets and train an ensemble of teachers on these subsets; noisy predictions from the teacher ensemble are then transferred to a student model. In this paper, we show that for complex datasets and tasks, such as natural image classification, the training set allocated to each teacher may be too small for the corresponding task to achieve ideal performance. To alleviate this problem, we propose the TrPATE framework, which extends PATE with transfer learning. Building on PATE, we transfer and share knowledge extracted from a publicly available non-private dataset with the teachers. Extensive experiments are conducted on various datasets, and the empirical results demonstrate the effectiveness of our method.
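To make the aggregation step PATE relies on concrete, here is a minimal sketch of noisy-max label aggregation over teacher votes; the teacher predictions, the class count, and the Laplace noise parameter `gamma` are illustrative assumptions, not the paper's exact implementation:

```python
import numpy as np

def pate_noisy_labels(teacher_preds, num_classes, gamma=0.1, rng=None):
    """Aggregate teacher votes with Laplace noise (noisy-max), as in PATE.

    teacher_preds: (num_teachers, num_queries) array of predicted class ids.
    gamma: inverse noise scale; smaller gamma -> more noise, stronger privacy.
    """
    rng = rng or np.random.default_rng()
    num_queries = teacher_preds.shape[1]
    labels = np.empty(num_queries, dtype=int)
    for q in range(num_queries):
        votes = np.bincount(teacher_preds[:, q], minlength=num_classes)
        noisy_votes = votes + rng.laplace(0.0, 1.0 / gamma, size=num_classes)
        labels[q] = int(np.argmax(noisy_votes))
    return labels  # noisy labels used to train the student model
```

The student never sees the private data directly, only these noisy aggregate labels, which is what bounds the privacy loss per query.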
Recently, membership inference attacks (MIAs) against machine learning models have been proposed. Using MIAs, adversaries can infer whether a data record is in the training set of the target model. Defense methods that use differential privacy mechanisms or adversarial training cannot handle the trade-off between privacy and utility well. Other methods based on knowledge transfer to improve model utility need public unlabeled data from the same distribution as the private data, and this requirement may not be satisfied in some scenarios. To handle the trade-off between privacy and utility better, we propose two deep learning algorithms: complementary knowledge distillation (CKD) and pseudo complementary knowledge distillation (PCKD). In CKD, the transfer data for knowledge distillation all come from the private training set, but their soft targets are generated by a teacher model trained on their complementary set. With a similar idea, we propose PCKD, which reduces the training set of each teacher model and uses model averaging to generate soft targets for the transfer data. Because a smaller training set leads to lower utility, PCKD uses pre-training to improve the utility of the teacher models. Experimental results on widely used datasets show that CKD and PCKD both decrease attack accuracy by nearly 25% on average with negligible utility loss, and the training time of PCKD is nearly 40% lower than that of CKD. Compared with existing defense methods such as DMP, adversarial regularization, dropout, and DP-SGD, CKD and PCKD have clear advantages in handling the trade-off between privacy and utility.
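A minimal sketch of the CKD idea using a fold-based complementary split: each sample's soft target comes from a teacher that never saw that sample. The `build_model` factory, fold count, and temperature softening are assumptions for illustration, not the paper's exact procedure:

```python
import numpy as np
from sklearn.model_selection import KFold

def soften(p, T, eps=1e-12):
    """Re-temper probabilities: p^(1/T) renormalized (illustrative choice)."""
    logits = np.log(p + eps) / T
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def ckd_soft_targets(X, y, build_model, num_folds=5, temperature=4.0):
    """Generate complementary soft targets: each fold is labeled by a teacher
    trained only on the complementary folds, so no target encodes membership
    of the sample it labels. `build_model` is an assumed fit/predict_proba
    model factory."""
    soft = np.zeros((len(X), len(np.unique(y))))
    for train_idx, hold_idx in KFold(num_folds, shuffle=True).split(X):
        teacher = build_model()
        teacher.fit(X[train_idx], y[train_idx])     # complementary set only
        probs = teacher.predict_proba(X[hold_idx])  # targets for held-out fold
        soft[hold_idx] = soften(probs, temperature)
    return soft  # train the student on (X, soft) to limit memorization
```

The design intuition is that membership leakage comes from a model's overconfidence on its own training points; complementary targets remove that signal before distillation.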
For automatic and intelligent checking of operation tickets in power grid scheduling, a scheduling operation ticket checking and analysis method based on bidirectional GRU (gated recurrent unit) neural networks and multiple verification rules is proposed. Semantic analysis based on a bidirectional GRU neural network is combined with the intelligent checking method to pre-process dispatching instructions and maintenance application forms. The state of the safety protection measures is compared with the final state of the equipment on the operation ticket. In the intelligent rule checking stage, format checking, logic checking, and safety checking are performed in turn, and finally the ticket errors are output. The proposed method is tested on historical dispatching operation tickets from the intelligent outage control system of the Zhejiang power grid. The results show that the method can improve checking safety and efficiency and realize intelligent operation ticket checking for power grid dispatching.
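As a sketch of the semantic-analysis component, here is a minimal bidirectional-GRU text classifier in Keras; the vocabulary size, embedding width, and label count are assumptions, since the paper's exact architecture is not given here:

```python
import tensorflow as tf
from tensorflow.keras import layers

# Assumed hyperparameters for an instruction-text classifier.
VOCAB_SIZE, NUM_LABELS = 5000, 8

model = tf.keras.Sequential([
    layers.Embedding(VOCAB_SIZE, 128),        # token ids -> dense vectors
    layers.Bidirectional(layers.GRU(64)),     # reads the sequence both ways
    layers.Dense(64, activation="relu"),
    layers.Dense(NUM_LABELS, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# Inputs: integer-encoded dispatching-instruction tokens padded to one length;
# outputs: a semantic label fed into the downstream rule-checking stages.
```

In a pipeline like the one described, the network's output would feed the format, logic, and safety rule checks rather than replace them.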
Physically Unclonable Functions (PUFs), as an alternative hardware-based security method, have been challenged by modeling attacks. It is well known that samples are significant in modeling attacks on PUFs, and thus some efforts have been made to expand the sample sets used in such attacks. A closer examination, however, reveals that not all samples contribute to modeling attacks equally. Therefore, in this article, we introduce the concept of sample essentiality to describe the contribution of a sample to a modeling attack, and we point out that any sample without sample essentiality cannot enhance some modeling attacks on PUFs. As a by-product, we find theoretically and empirically that the samples expanded by the procedures proposed by Chatterjee et al. do not satisfy our sample essentiality. Furthermore, we propose the notion of essential sample sets for datasets and discuss its basic properties. Finally, we demonstrate that our results on sample essentiality can be used to reduce samples efficiently and to benefit sample selection in modeling attacks on arbiter PUFs.
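For context, an arbiter PUF is commonly modeled as a linear threshold function over a parity transform of the challenge bits. The sketch below shows that standard feature map plus a toy redundancy check of our own (a sample whose feature vector adds no rank adds no new constraint to a linear model); this illustrates the flavor of sample essentiality, not the paper's formal definition:

```python
import numpy as np

def parity_features(challenges):
    """Standard arbiter-PUF feature map: for 0/1 challenge bits c_j,
    phi_i = prod_{j >= i} (1 - 2 c_j), plus a constant bias feature."""
    signs = 1 - 2 * challenges                        # 0/1 bits -> +1/-1
    phi = np.cumprod(signs[:, ::-1], axis=1)[:, ::-1] # suffix products
    return np.hstack([phi, np.ones((len(challenges), 1))])

def adds_no_constraint(phi_new, Phi):
    """Toy illustration only: a new sample whose feature vector lies in the
    span of the already-collected set cannot change a linear model's fit."""
    stacked = np.vstack([Phi, phi_new])
    return np.linalg.matrix_rank(stacked) == np.linalg.matrix_rank(Phi)
```

A check of this kind hints at why expanded samples can be useless to an attacker: if they are linear combinations of existing ones in feature space, the model learns nothing new from them.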
This paper proposes an internet-based personal identity verification system using lossless data hiding and fingerprint recognition technologies. At the client side, the SHA-256 hash of the original fingerprint image and sensitive personal information are encrypted and embedded into the fingerprint image using an advanced lossless data hiding scheme. At the service provider side, after the hidden data are extracted, the fingerprint image can be recovered without any distortion thanks to the lossless data hiding scheme, so the originality of the fingerprint image can be verified via a hash check. The extracted personal information is used to retrieve the correct fingerprint feature from the database, and fingerprint matching finally verifies the client's identity. The experimental results demonstrate that the proposed system is effective; it can find wide application in e-banking and e-government systems, to name a few.
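A minimal sketch of the hash-check protocol around the (scheme-specific, omitted) embedding and extraction steps; the payload layout and function names are assumptions for illustration, and encryption is assumed to be handled externally:

```python
import hashlib
import json

def client_payload(fingerprint_bytes, personal_info):
    """Client side: bundle the image's SHA-256 hash with personal data.
    This payload would then be encrypted and losslessly embedded into
    the fingerprint image by the hiding scheme."""
    return json.dumps({
        "sha256": hashlib.sha256(fingerprint_bytes).hexdigest(),
        "info": personal_info,
    }).encode()

def provider_check(recovered_bytes, extracted_payload):
    """Provider side: after lossless extraction, the recovered image is
    original iff its hash matches the embedded one."""
    payload = json.loads(extracted_payload)
    return hashlib.sha256(recovered_bytes).hexdigest() == payload["sha256"]
```

The lossless property is what makes this work: because extraction restores the image bit-for-bit, a single hash comparison suffices to prove the fingerprint was not tampered with in transit.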
This paper proposes a novel approach to high-capacity lossless data hiding based on the integer wavelet transform, which embeds high-capacity data into the most insensitive bit-planes of the wavelet coefficients. Specifically, three high-capacity lossless data hiding methods, namely Methods A, B, and C, are proposed. Method A is a traditional lossless data hiding technique that can losslessly recover the original image; its capacity can reach 1/10 of the data volume the original image occupies, and histogram modification is used to prevent overflow and underflow. Method B is not a traditional lossless data hiding technique: it can only losslessly recover the pre-processed image instead of the original image, but its capacity can reach 1/2 of the data volume the original image occupies, and it offers better visual quality than replacing the four least significant bit-planes in the spatial domain. Method C has not only a larger capacity but also better visual quality than Method B; however, it can only losslessly recover the hidden data. All three methods were tested on all 1096 images of the CorelDraw database. These techniques can be applied to e-government, e-business, e-medical data systems, e-law enforcement, and military systems.
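The integer wavelet transform is what makes lossless recovery possible: its lifting steps map integers to integers and invert exactly. Below is a minimal sketch of one lifting level of the integer Haar transform (a common choice used here for illustration; the paper's exact filter is not specified in this abstract):

```python
import numpy as np

def haar_iwt_rows(x):
    """One-level integer Haar lifting along rows (assumes even row length).
    Integer-to-integer and exactly invertible, so bit-plane embedding in
    the coefficients can remain lossless."""
    x = x.astype(np.int64)
    even, odd = x[:, 0::2], x[:, 1::2]
    detail = odd - even              # high-pass (detail) coefficients
    approx = even + (detail >> 1)    # low-pass, using floor division
    return approx, detail

def haar_iwt_rows_inverse(approx, detail):
    """Exact inverse of the lifting steps above."""
    even = approx - (detail >> 1)
    odd = detail + even
    x = np.empty((approx.shape[0], approx.shape[1] * 2), dtype=np.int64)
    x[:, 0::2], x[:, 1::2] = even, odd
    return x
```

Because the rounding in the forward step is undone exactly by the inverse step, any payload written into chosen coefficient bit-planes can be removed and the pre-embedding image restored bit-for-bit, which is the property all three methods build on.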