"GrabCut" Rother, Carsten; Kolmogorov, Vladimir; Blake, Andrew
ACM transactions on graphics, 08/2004, Volume: 23, Issue: 3
Journal Article
Peer reviewed
Open access
The problem of efficient, interactive foreground/background segmentation in still images is of great practical importance in image editing. Classical image segmentation tools use either texture (colour) information, e.g. Magic Wand, or edge (contrast) information, e.g. Intelligent Scissors. Recently, an approach based on optimization by graph-cut has been developed which successfully combines both types of information. In this paper we extend the graph-cut approach in three respects. First, we have developed a more powerful, iterative version of the optimisation. Secondly, the power of the iterative algorithm is used to simplify substantially the user interaction needed for a given quality of result. Thirdly, a robust algorithm for "border matting" has been developed to estimate simultaneously the alpha-matte around an object boundary and the colours of foreground pixels. We show that for moderately difficult examples the proposed method outperforms competitive tools.
Due to the ease of manipulating digital images with existing photo editing tools and software, authenticating digital images and establishing their reliability has become a major concern. Copy-move forgery detection (CMFD) is therefore known to be one of the key domains in the latest digital image authentication research. Copy-move forgery is a passive approach in which a portion of an image is copied and then pasted onto the same image, resulting in a tampered image. In this paper, current developments in CMFD are surveyed and a comparative evaluation of recent CMFD techniques is given, along with their advantages and limitations. A detailed description of relevant copy-move forgery detection datasets is also provided, which will help researchers decide which dataset to choose for a given CMFD approach.
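As a rough illustration of the copy-move idea described in this abstract, the sketch below flags pairs of identical pixel blocks inside one image. It is a minimal exact-match baseline and not a technique taken from the surveyed paper; the names `find_duplicate_blocks` and `make_forged_example` are hypothetical, and the image is assumed to be a plain 2D list of grayscale values.

```python
from itertools import combinations


def find_duplicate_blocks(image, block=2):
    """Return pairs of top-left (row, col) positions whose block x block
    patches are pixel-for-pixel identical (exact-match baseline only)."""
    height, width = len(image), len(image[0])
    seen = {}
    for r in range(height - block + 1):
        for c in range(width - block + 1):
            # Use the patch contents as a dictionary key.
            patch = tuple(tuple(image[r + dr][c:c + block]) for dr in range(block))
            seen.setdefault(patch, []).append((r, c))
    pairs = []
    for positions in seen.values():
        # Every pair of positions sharing a patch is a candidate forgery.
        pairs.extend(combinations(positions, 2))
    return pairs


def make_forged_example():
    """Hypothetical 6x6 test image with unique values, where the 2x2 block
    at (0, 0) has been copied and pasted at (3, 3)."""
    image = [[r * 6 + c for c in range(6)] for r in range(6)]
    for dr in range(2):
        for dc in range(2):
            image[3 + dr][3 + dc] = image[dr][dc]
    return image
```

Exact matching breaks down under compression, noise, or rotation, and flat regions such as sky produce many false pairs, which is why practical detectors compare robust block or keypoint features instead of raw pixels.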
Existing forensic techniques for image manipulation localization crucially assume that probe pixels belong to one of exactly two classes, genuine or manipulated. This letter argues that this convention fuels mis-labeling particularly in unsupervised settings, where singular but genuine content or the presence of multiple distinct manipulations may easily induce non-optimal partitions of the feature space. We propose to relax constraints via a greedy n-ary clustering approach, which we instantiate exemplarily in the popular pixel descriptor space of residual co-occurrences. Experimental results on widely used public benchmark datasets highlight the benefits of our approach.
Digital steganography is becoming a common tool for protecting sensitive communications in various applications, such as crime/terrorism prevention, whereby law enforcement personnel need to remotely compare facial images captured at the scene of a crime with face databases of known criminals/suspects; exchanging military maps or surveillance video in hostile environments/situations; privacy preservation in healthcare systems when storing or exchanging patients’ medical images/records; and preventing bank customers’ accounts/records from being accessed illegally by unauthorized users. Existing digital steganography schemes for embedding secret images in cover image files tend not to exploit various redundancies in the secret image bit-stream to deal with the various conflicting requirements on embedding capacity, stego-image quality, and undetectability. This paper is concerned with the development of innovative image procedures and data hiding schemes that exploit, as well as increase, similarities between the secret image bit-stream and the cover image LSB plane. This will be achieved in two novel steps involving manipulating both the secret and the cover images, prior to embedding, to achieve a higher 0:1 ratio in both the secret image bit-stream and the cover image LSB plane. This two-step strategy has been exploited to use a bit-plane(s) mapping technique, instead of bit-plane(s) replacement, to make each cover pixel usable for secret embedding. This paper will demonstrate that this strategy produces stego-images that have minimal distortion, high embedding efficiency, reasonably good stego-image quality and robustness against 3 well-known targeted steganalysis tools.
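To make the role of "similarity between the secret bit-stream and the cover LSB plane" concrete, here is a minimal sketch of plain LSB replacement, the baseline that this abstract contrasts with its mapping technique. It does not implement the paper's bit-plane mapping, whose details are not given here. All function names are hypothetical, and pixels are assumed to be 8-bit integers in a flat list.

```python
def embed_lsb(pixels, bits):
    """Overwrite the least significant bit of the first len(bits) pixels."""
    if len(bits) > len(pixels):
        raise ValueError("secret is longer than the cover capacity")
    stego = list(pixels)
    for i, bit in enumerate(bits):
        stego[i] = (stego[i] & ~1) | bit
    return stego


def extract_lsb(pixels, n_bits):
    """Read back the first n_bits least significant bits."""
    return [p & 1 for p in pixels[:n_bits]]


def match_rate(pixels, bits):
    """Fraction of secret bits that already equal the cover's LSBs.
    Matching bits cost no pixel change when embedded."""
    hits = sum(1 for p, b in zip(pixels, bits) if (p & 1) == b)
    return hits / len(bits)


def changed_pixels(cover, stego):
    """Number of pixels modified by the embedding."""
    return sum(1 for c, s in zip(cover, stego) if c != s)
```

With cover `[10, 11, 12, 13]` and secret bits `[0, 1, 1, 0]`, half of the bits already match the cover LSBs, so only two pixels change. Raising that match rate before embedding is the kind of gain the abstract attributes to pre-processing both images.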
We present a procedural framework for modeling the annual ring pattern of solid wood with knots. Although wood texturing is a well-studied topic, there have been few previous attempts at modeling knots inside the wood texture. Our method takes the skeletal structure of a tree log as input and produces a three-dimensional scalar field representing the time of added growth, which defines the volumetric annual ring pattern. First, separate fields are computed around each strand of the skeleton, i.e., the stem and each knot. The strands are then merged into a single field using smooth minimums. We further suggest techniques for controlling the smooth minimum to adjust the balance of smoothness and reproduce the distortion effects observed around dead knots. Our method is implemented as a shader program running on a GPU with computation times of approximately 0.5 s per image and an input data size of 600 KB. We present rendered images of solid wood from pine and spruce as well as plywood and cross-laminated timber (CLT). Our results were evaluated by wood experts, who confirmed the plausibility of the rendered annual ring patterns. Link to code: https://github.com/marialarsson/procedural_knots.
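The merging step in this abstract relies on a smooth minimum. The sketch below uses the common quadratic polynomial form as an assumption; the abstract does not say which formulation the authors use, and the names `smooth_min` and `merge_fields` are hypothetical. The parameter `k` sets the width of the blend region: values further apart than `k` are left untouched.

```python
from functools import reduce


def smooth_min(a, b, k):
    """Quadratic polynomial smooth minimum of two field values.
    Falls back to the hard minimum when k <= 0."""
    if k <= 0:
        return min(a, b)
    # h is 0 when the values differ by k or more, 1 when they are equal.
    h = max(k - abs(a - b), 0.0) / k
    return min(a, b) - h * h * k * 0.25


def merge_fields(values, k):
    """Fold per-strand field values at one sample point into one value.
    Note: this pairwise fold is order-dependent for three or more inputs."""
    return reduce(lambda acc, v: smooth_min(acc, v, k), values)
```

For example, `smooth_min(1.0, 1.0, 1.0)` gives `0.75`, slightly below both inputs, which is what rounds off the crease where two growth fields meet. `smooth_min(1.0, 5.0, 1.0)` stays at `1.0` because the inputs are further apart than `k`.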
Semantic colorization with internet images Chia, Alex Yong-Sang; Zhuo, Shaojie; Gupta, Raj Kumar ...
Proceedings of the 2011 SIGGRAPH Asia Conference,
12/2011
Conference Proceeding
Peer reviewed
Colorization of a grayscale photograph often requires considerable effort from the user, either by placing numerous color scribbles over the image to initialize a color propagation algorithm, or by looking for a suitable reference image from which color information can be transferred. Even with this user supplied data, colorized images may appear unnatural as a result of limited user skill or inaccurate transfer of colors. To address these problems, we propose a colorization system that leverages the rich image content on the internet. As input, the user needs only to provide a semantic text label and segmentation cues for major foreground objects in the scene. With this information, images are downloaded from photo sharing websites and filtered to obtain suitable reference images that are reliable for color transfer to the given grayscale photo. Different image colorizations are generated from the various reference images, and a graphical user interface is provided to easily select the desired result. Our experiments and user study demonstrate the greater effectiveness of this system in comparison to previous techniques.
Adversarial attacks have been demonstrated to fool the deep classification networks. There are two key characteristics of these attacks: firstly, these perturbations are mostly additive noises carefully crafted from the deep neural network itself. Secondly, the noises are added to the whole image, not considering them as the combination of multiple components from which they are made. Motivated by these observations, in this research, we first study the role of various image components and the impact of these components on the classification of the images. These manipulations do not require the knowledge of the networks and external noise to function effectively and hence have the potential to be one of the most practical options for real-world attacks. Based on the significance of the particular image components, we also propose a transferable adversarial attack against unseen deep networks. The proposed attack utilizes the projected gradient descent strategy to add the adversarial perturbation to the manipulated component image. The experiments are conducted on a wide range of networks and four databases including ImageNet and CIFAR-100. The experiments show that the proposed attack achieved better transferability and hence gives an upper hand to an attacker. On the ImageNet database, the success rate of the proposed attack is up to 88.5%, while the current state-of-the-art attack success rate on the database is 53.8%. We have further tested the resiliency of the attack against one of the most successful defenses namely adversarial training to measure its strength. The comparison with several challenging attacks shows that: (i) the proposed attack has a higher transferability rate against multiple unseen networks and (ii) it is hard to mitigate its impact. We claim that based on the understanding of the image components, the proposed research has been able to identify a newer adversarial attack unseen so far and unsolvable using the current defense mechanisms.
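The projected gradient descent strategy mentioned in this abstract can be sketched on a toy model. The code below is a generic L-infinity PGD loop against a logistic-regression classifier, chosen here only because its gradient can be written by hand; it is not the paper's component-based attack. The model, the function names, and the parameters `eps`, `alpha`, and `steps` are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def score(w, b, x):
    """Linear logit w.x + b; positive means class 1."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


def sign(v):
    return (v > 0) - (v < 0)


def pgd_attack(w, b, x, y, eps, alpha, steps):
    """L-infinity PGD: step along the sign of the loss gradient, then
    project back into the eps-ball around x and the valid range [0, 1]."""
    x_adv = list(x)
    for _ in range(steps):
        p = sigmoid(score(w, b, x_adv))
        # Gradient of the cross-entropy loss with respect to the input.
        grad = [(p - y) * wi for wi in w]
        x_adv = [xa + alpha * sign(g) for xa, g in zip(x_adv, grad)]
        # Projection onto the eps-ball centred on the original input.
        x_adv = [min(max(xa, x0 - eps), x0 + eps) for xa, x0 in zip(x_adv, x)]
        # Keep pixel values in the valid range.
        x_adv = [min(max(xa, 0.0), 1.0) for xa in x_adv]
    return x_adv
```

With `w = [1.0, -1.0]`, `b = 0.0`, input `[0.6, 0.4]` and label 1, a budget of `eps = 0.2` is enough to push the logit from positive to negative while no coordinate moves by more than `eps`.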
Learning the similarity between fashion items is essential for many fashion-related tasks. Most methods based on global or local image similarity cannot meet the fine-grained retrieval requirements related to attributes. We are the first to clearly distinguish the concepts of attribute name and their values and divide fashion retrieval tasks that combine images and text into: attribute-guided retrieval and attribute-manipulated retrieval. We propose a hierarchical attribute-aware embedding network (HAEN) that takes images and attributes as input, learns multiple attribute-specific embedding spaces, and measures fine-grained similarity in the corresponding spaces. It can accurately map different attributes to the corresponding areas of the image, thereby facilitating the feature fusion of two different modalities of text and image, including enhancement and replacement. Then on this basis, we propose three attribute-manipulated similarity learning methods, HAEN_Avg, HAEN_Rec, and HAEN_Cmb. With comprehensive validation on two real-world fashion datasets, we demonstrate that our methods can effectively leverage semantic knowledge to improve image retrieval performance, including attribute-guided and attribute-manipulated retrieval tasks.
Metasurfaces have recently been incorporated with waveguides for in‐plane wave manipulation and extraction, enhancing the capability and design flexibility of integrated photonic devices. However, the unique guided‐wave‐driven scheme poses challenges in achieving multiplexed functionality for on‐chip integrated metasurfaces. Here, an on‐chip direction‐multiplexing strategy is proposed and demonstrated for high‐capacity 3D multiplane projection in free space. By exploring and utilizing the direction‐dependence of the guided‐wave‐driven detour phase manipulation, a 16‐channel on‐chip holography is realized with different images independently encoded at distinct z planes for all four illumination directions. Moreover, the on‐chip full‐space 3D holography is demonstrated by exploiting the conjugated relation of the optical responses driven by the opposite on‐chip illuminations, enabling observation of a total of 16 holographic images in full‐space. The demonstrated on‐chip direction‐multiplexed holography exhibits low background noise or crosstalk and offers high information density and quality, which shows promise for practical applications in data storage, 3D display, and virtual/augmented reality.
An on‐chip direction‐multiplexing strategy is proposed and demonstrated for high‐capacity 3D multiplane projection in free space. By utilizing the direction‐dependence of the guided‐wave‐driven detour phase manipulation, the direction‐multiplexed 16‐channel holography and the on‐chip full‐space 3D holography are realized. Such a feasible and robust direction‐multiplexing method is expected to benefit the on‐chip applications with an increasing demand for information capacity.
Currently, there are several widely used commercial cloud-based services that attempt to recognize an individual’s emotions based on their facial expressions. Most research into facial emotion recognition has used high-resolution, front-oriented, full-face images. However, when images are collected in naturalistic settings (e.g., using a smartphone’s frontal camera), these images are likely to be far from ideal due to camera positioning, lighting conditions, and camera shake. The impact these conditions have on the accuracy of commercial emotion recognition services has not been studied in full detail. To fill this gap, we selected five prominent commercial emotion recognition systems (Amazon Rekognition, Baidu Research, Face++, Microsoft Azure, and Affectiva) and evaluated their performance via two experiments. In Experiment 1, we compared the systems’ accuracy at classifying images drawn from three standardized facial expression databases. In Experiment 2, we first identified several common scenarios (e.g., partially visible face) that can lead to poor-quality pictures during smartphone use, and manipulated the same set of images used in Experiment 1 to simulate these scenarios. We used the manipulated images to again compare the systems’ classification performance, finding that the systems varied in how well they handled manipulated images that simulate realistic image distortion. Based on our findings, we offer recommendations for developers and researchers who would like to use commercial facial emotion recognition technologies in their applications.