•Simple yet powerful method to copy a black-box CNN model with random natural images.
•Some constraints are waived and copy attacks are performed with less information.
•Understanding copy attacks ...with random natural images.
•Thorough evaluation of copycat models created with random natural images.
Convolutional neural networks have recently been successful, enabling companies to develop neural-based products. Building such products is expensive: it involves data acquisition and annotation, as well as model generation, which usually requires experts. Given these costs, companies are concerned about the security of their models against copying and deliver them as black boxes accessed through APIs. Nonetheless, we argue that even black-box models still have some vulnerabilities. In a preliminary work, we presented a simple yet powerful method to copy black-box models by querying them with random natural images. In this work, we consolidate and extend the copycat method: (i) some constraints are waived; (ii) an extensive evaluation on several problems is performed; (iii) models are copied between different architectures; and (iv) a deeper analysis is performed by examining the copycat behavior. Results show that random natural images are effective for generating copycats for several problems.
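As an illustration of the copy attack outlined above, the following is a minimal toy sketch, not the authors' implementation. It assumes a hypothetical `black_box` function that returns only hard labels, uses random 2-D points as a stand-in for random natural images, and fits a small logistic-regression copy in place of a CNN. All names and parameters are illustrative assumptions.

```python
import math
import random


def black_box(x):
    """Hypothetical target model: returns hard labels only, internals hidden."""
    return 1 if 2.0 * x[0] - 1.0 * x[1] + 0.5 > 0 else 0


def random_queries(n, rng):
    """Toy stand-in for random natural images: inputs drawn without
    any knowledge of the target model's training data."""
    return [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]


def train_copycat(queries, stolen_labels, epochs=300, lr=0.5):
    """Fit a logistic-regression copy on (query, stolen label) pairs
    with full-batch gradient descent."""
    w0, w1, b = 0.0, 0.0, 0.0
    n = len(queries)
    for _ in range(epochs):
        g0 = g1 = gb = 0.0
        for (x0, x1), y in zip(queries, stolen_labels):
            p = 1.0 / (1.0 + math.exp(-(w0 * x0 + w1 * x1 + b)))
            g0 += (p - y) * x0
            g1 += (p - y) * x1
            gb += p - y
        w0 -= lr * g0 / n
        w1 -= lr * g1 / n
        b -= lr * gb / n
    return lambda x: 1 if w0 * x[0] + w1 * x[1] + b > 0 else 0


def agreement(copy, target, samples):
    """Fraction of samples on which the copy reproduces the target's label."""
    return sum(copy(x) == target(x) for x in samples) / len(samples)


rng = random.Random(0)
queries = random_queries(500, rng)
stolen_labels = [black_box(q) for q in queries]  # only API outputs are used
copycat = train_copycat(queries, stolen_labels)
print(agreement(copycat, black_box, random_queries(200, rng)))
```

The attacker never sees the target's parameters or training data; the only supervision is the label returned for each query.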
•An interactive framework for reconstruction of strip-shredded documents.
•The user locks and forbids pairs automatically selected by the recommender module.
•Four query strategies for recommending the ...pairs of shreds to be annotated.
•A novel methodology to assess the human impact on the quality of a reconstruction.
•Annotating 25% of the shreds can yield an error reduction of more than 40%.
Advances in machine learning, particularly in deep learning, have enabled automating the reconstruction of shredded documents with significant accuracy. However, despite recent remarkable results, the state of the art in fully automatic reconstruction still has room for improvement, mainly due to imprecision in evaluating how well the shreds fit each other (compatibility/cost evaluation). To tackle this problem, we propose a human-in-the-loop reconstruction framework that takes user input to improve the solutions (permutations of shreds). In our approach, the user verifies whether adjacent shreds in a solution are also adjacent in the original document. Unlike the current literature, our framework includes a recommender module that automatically selects the pairs of shreds to be analyzed by a human. Four recommendation strategies are proposed and evaluated. Results achieved by coupling deep learning reconstruction methods with our framework show that introducing a human in the loop can reduce errors by more than 40%.
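The lock/forbid interaction can be pictured with a short sketch. This is a hypothetical illustration only: the four recommendation strategies are not detailed here, so the code assumes one plausible strategy (query the adjacent pairs with the highest cost in the current solution) and assumes feedback is encoded by editing a pairwise cost matrix. The names `recommend_pairs` and `apply_feedback` are invented for this example.

```python
import math


def recommend_pairs(solution, cost, k):
    """Assumed strategy: return the k adjacent pairs of the current
    solution that have the highest (least trusted) pairwise cost."""
    pairs = list(zip(solution, solution[1:]))
    return sorted(pairs, key=lambda p: cost[p[0]][p[1]], reverse=True)[:k]


def apply_feedback(cost, pair, is_adjacent):
    """Encode the user's answer in the cost matrix (modified in place).

    Lock: shred i may only be followed by j, and j only preceded by i.
    Forbid: shred i may never be directly followed by j.
    """
    i, j = pair
    if is_adjacent:
        for m in range(len(cost)):
            if m != j:
                cost[i][m] = math.inf
            if m != i:
                cost[m][j] = math.inf
        cost[i][j] = 0.0
    else:
        cost[i][j] = math.inf


# cost[i][j]: cost of placing shred j immediately to the right of shred i
cost = [
    [math.inf, 4.0, 5.0],
    [3.0, math.inf, 2.0],
    [6.0, 1.0, math.inf],
]
solution = [0, 2, 1]
queried = recommend_pairs(solution, cost, k=1)   # pairs shown to the user
apply_feedback(cost, queried[0], is_adjacent=False)  # user rejects the pair
```

After the feedback is applied, any reconstruction method that minimizes the total pairwise cost can be re-run on the updated matrix, so user answers constrain the next solution without changing the underlying solver.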
•Paper shreds matching via self-supervised deep learning.
•Training with simulated cuts is effective for real-shredded documents.
•A new public dataset with 100 strip-shredded documents (2292 ...shreds).
•Accurate (over 90% accuracy) reconstruction of 100 mixed shredded documents.
The reconstruction of shredded documents consists of coherently arranging fragments of paper (shreds) to recover the original document(s). A major challenge in computational reconstruction is to properly evaluate the compatibility between shreds. While traditional pixel-based approaches are not robust to real shredding, more sophisticated solutions significantly compromise time performance. The solution presented in this work extends our previous deep learning method for single-page reconstruction to a more realistic and complex scenario: the reconstruction of several mixed shredded documents at once. In our approach, the compatibility evaluation is modeled as a two-class (valid or invalid) pattern recognition problem. The model is trained in a self-supervised manner on samples extracted from simulated-shredded documents, which obviates manual annotation. Experimental results on three datasets, including a new collection of 100 strip-shredded documents produced for this work, show that the proposed method outperforms competing methods in complex scenarios, achieving accuracy above 90%.
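The self-supervised labeling idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's sampling procedure: a page is a list of pixel rows, cuts are perfectly vertical and equally spaced, and each sample is the junction between the right edge of one strip and the left edge of another. The function names and the `margin` parameter are hypothetical.

```python
def simulate_shredding(page, num_strips):
    """Cut a page (list of pixel rows) into vertical strips of equal width."""
    strip_width = len(page[0]) // num_strips
    return [
        [row[s * strip_width:(s + 1) * strip_width] for row in page]
        for s in range(num_strips)
    ]


def boundary_sample(left, right, margin=1):
    """Join the last `margin` columns of `left` with the first of `right`."""
    return [l_row[-margin:] + r_row[:margin] for l_row, r_row in zip(left, right)]


def make_training_set(strips):
    """Label every ordered pair of strips: 1 (valid) when the second strip
    directly follows the first in the original page, 0 (invalid) otherwise.
    The labels come from the simulated cut itself, so no manual
    annotation is needed."""
    samples = []
    for i, left in enumerate(strips):
        for j, right in enumerate(strips):
            if i != j:
                samples.append((boundary_sample(left, right), int(j == i + 1)))
    return samples


page = [
    [0, 1, 2, 3, 4, 5],
    [6, 7, 8, 9, 10, 11],
]
strips = simulate_shredding(page, num_strips=3)
training_set = make_training_set(strips)  # (junction pixels, label) pairs
```

A two-class classifier trained on such pairs can then score the compatibility of any two shreds; the classifier itself is outside the scope of this sketch.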