The determination of the regularization parameter is an important issue in regularized image restoration, since it controls the trade-off between fidelity to the data and smoothness of the solution. A number of approaches have been developed for determining this parameter. In this paper, a new paradigm is adopted, according to which the required prior information is extracted from the available data at the previous iteration step, i.e., the partially restored image at each step. We propose the use of a regularization functional instead of a constant regularization parameter. The properties such a regularization functional should satisfy are investigated, and two specific forms of it are proposed. An iterative algorithm is proposed for obtaining a restored image. The regularization functional is defined in terms of the restored image at each iteration step, therefore allowing for the simultaneous determination of its value and the restoration of the degraded image. Both proposed iteration-adaptive regularization functionals are shown to result in a smoothing functional with a global minimum, so that its iterative optimization does not depend on the initial conditions. The convergence of the algorithm is established and experimental results are shown.
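As a rough illustration of the iteration-adaptive idea, the Python/NumPy sketch below recomputes a regularization value from the partially restored image at every gradient step; the specific ratio-of-energies functional used here is an assumed illustrative choice, not necessarily one of the two forms proposed in the paper.

```python
import numpy as np

def restore(y, H, C, n_iter=200, beta=1e-3):
    """Iterative restoration with an iteration-adaptive regularization value.

    Minimizes ||y - H x||^2 + lam(x) * ||C x||^2 by gradient descent, where the
    regularization value lam(x) is recomputed from the partially restored image
    at every step.  The ratio-of-energies form of lam(x) below is only an
    illustrative assumption, not necessarily the functional proposed in the paper.
    y: observed (vectorized) image, H: blur matrix, C: high-pass operator."""
    x = H.T @ y                                  # simple initial estimate
    for _ in range(n_iter):
        resid = y - H @ x
        prior = C @ x
        lam = (resid @ resid) / (2.0 * (prior @ prior) + 1e-12)  # adaptive value
        grad = -H.T @ resid + lam * (C.T @ prior)
        x = x - beta * grad                      # gradient-descent update
    return x
```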
Video super-resolution (VSR) has become one of the most critical problems in video processing. In the deep learning literature, recent works have shown the benefits of using adversarial and perceptual losses to improve the performance on various image restoration tasks; however, these have yet to be applied to video super-resolution. In this paper, we propose a generative adversarial network (GAN)-based formulation for VSR. We introduce a new generator network optimized for the VSR problem, named VSRResNet, along with a new discriminator architecture to properly guide VSRResNet during the GAN training. We further enhance our VSR GAN formulation with two regularizers, a distance loss in feature space and pixel space, to obtain our final VSRResFeatGAN model. We show that pre-training our generator with the mean-squared-error loss alone already quantitatively surpasses the current state-of-the-art VSR models. In addition, we employ the PercepDist metric to compare the state-of-the-art VSR models, and show that this metric evaluates the perceptual quality of SR solutions obtained from neural networks more accurately than the commonly used PSNR/SSIM metrics. Finally, we show that our proposed model, VSRResFeatGAN, outperforms the current state-of-the-art SR models, both quantitatively and qualitatively.
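A minimal PyTorch-style sketch of such a composite generator objective is given below; the weight values, the feature extractor `feat_extractor` (e.g., a pretrained VGG slice), and the use of a standard non-saturating adversarial term are assumptions, not the exact VSRResFeatGAN settings.

```python
import torch
import torch.nn.functional as F

def generator_loss(discriminator, feat_extractor, sr, hr,
                   w_adv=1e-3, w_feat=1.0, w_pix=1.0):
    """Composite generator objective (sketch): adversarial term plus distance
    losses in feature space and pixel space.  Weights and the choice of
    feat_extractor are illustrative assumptions."""
    # Adversarial term: push the discriminator to label SR frames as real.
    logits = discriminator(sr)
    adv = F.binary_cross_entropy_with_logits(logits, torch.ones_like(logits))
    # The two regularizers: distances in feature space and in pixel space.
    feat = F.mse_loss(feat_extractor(sr), feat_extractor(hr))
    pix = F.mse_loss(sr, hr)
    return w_adv * adv + w_feat * feat + w_pix * pix
```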
Given a collection of images or a short video sequence, we define a thematic object as the key object that frequently appears and is representative of the visual contents. Successful discovery of the thematic object is helpful for object search and tagging, video summarization and understanding, etc. However, this task is challenging because 1) a priori knowledge of the thematic objects, such as their shapes, scales, locations, and times of re-occurrence, is lacking, and 2) the thematic object of interest can undergo severe appearance variations due to viewpoint and lighting condition changes, scale variations, etc. Instead of using a top-down generative model to discover thematic visual patterns, we propose a novel bottom-up approach to gradually prune uncommon local visual primitives and recover the thematic objects. A multilayer candidate pruning procedure is designed to accelerate the image data mining process. Our solution can efficiently locate thematic objects of various sizes and can tolerate large appearance variations of the same thematic object. Experiments on challenging image and video data sets and comparisons with existing methods validate the effectiveness of our method.
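A toy Python sketch of the bottom-up pruning idea follows; the matching predicate `match_fn` and the per-layer thresholds are illustrative assumptions and stand in for the paper's actual multilayer pruning criteria.

```python
import numpy as np

def prune_uncommon(descriptors, match_fn, layers=(0.5, 0.3, 0.1)):
    """Bottom-up pruning sketch: local visual primitives that re-occur in few
    other images are removed layer by layer, so that mostly thematic-object
    primitives survive.  match_fn(d, other_descs) -> bool and the per-layer
    thresholds are illustrative assumptions, not the paper's criteria.
    descriptors: list (one entry per image) of arrays of local descriptors."""
    keep = [np.ones(len(d), dtype=bool) for d in descriptors]
    n_images = len(descriptors)
    for thresh in layers:                              # multilayer pruning
        for i, descs in enumerate(descriptors):
            for j in np.flatnonzero(keep[i]):
                hits = sum(match_fn(descs[j], descriptors[k][keep[k]])
                           for k in range(n_images) if k != i)
                if hits < thresh * (n_images - 1):
                    keep[i][j] = False                 # uncommon primitive: prune
    return keep
```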
In this paper, benefiting from the strong ability of deep neural networks to estimate non-linear functions, we propose a discriminative embedding function to be used as a feature extractor for clustering tasks. The trained embedding function transfers knowledge from the domain of a labeled set of morphologically distinct images, known as classes, to a new domain within which new classes can potentially be isolated and identified. Our target application in this paper is the Gravity Spy Project, which is an effort to characterize transient, non-Gaussian noise present in data from the Advanced Laser Interferometer Gravitational-wave Observatory, or LIGO. Accumulating large, labeled sets of noise features and identifying new classes of noise lead to a better understanding of their origin, which makes their removal from the data and/or detectors possible.
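The sketch below (PyTorch plus scikit-learn, with placeholder sizes) illustrates the general pattern of training an embedding on the labeled known-class domain and clustering the new domain in that embedding space; the actual Gravity Spy architecture and training objective are not reproduced here.

```python
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

class EmbeddingNet(nn.Module):
    """Small classifier whose penultimate layer serves as the embedding.
    Architecture and dimensions are placeholders, not the Gravity Spy model."""
    def __init__(self, in_dim, emb_dim, n_known_classes):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, 256), nn.ReLU(),
                                      nn.Linear(256, emb_dim), nn.ReLU())
        self.head = nn.Linear(emb_dim, n_known_classes)

    def forward(self, x):
        return self.head(self.backbone(x))

def cluster_new_domain(model, new_data, n_clusters):
    """After supervised training on the labeled (known-class) set, transfer the
    embedding to the unlabeled domain and look for new classes by clustering."""
    with torch.no_grad():
        emb = model.backbone(new_data).numpy()
    return KMeans(n_clusters=n_clusters, n_init=10).fit_predict(emb)
```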
The introduction of Video Objects (VOs) is one of the innovations of MPEG-4. The α-plane of a VO defines its shape at a given instant in time and hence determines the boundary of its texture. In packet-based networks, shape, motion, and texture are subject to loss. While considerable attention has been paid to the concealment of texture and motion errors, little has been done in the field of shape error concealment. In this paper we propose a post-processing shape error concealment technique that uses the motion-compensated boundary information of the previously received α-plane. The proposed approach is based on matching received boundary segments in the current frame to the boundary in the previous frame. This matching is achieved by finding a maximally smooth motion vector field. After the current boundary segments are matched to the previous boundary, the missing boundary pieces are reconstructed by motion compensation. Experimental results demonstrating the performance of the proposed motion-compensated shape error concealment method, and comparing it with the previously proposed weighted side-matching method, are presented.
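The following Python sketch conveys the matching step in a heavily simplified, greedy form: each received boundary point of the current α-plane is assigned the motion vector that balances distance to the previous boundary against smoothness with respect to the preceding point's vector. The search range, the weight, and the greedy scan are assumptions; the paper solves this matching more carefully.

```python
import numpy as np

def match_boundary(prev_pts, cur_pts, search=4, lam=1.0):
    """Greedy sketch of boundary matching for shape error concealment.
    Data term: distance to the nearest previous-boundary point; smoothness term:
    deviation from the previous point's motion vector.  The greedy scan and the
    weight lam are simplifications of the paper's optimization.
    prev_pts, cur_pts: (K, 2) arrays of boundary coordinates."""
    offsets = [np.array([dy, dx]) for dy in range(-search, search + 1)
                                  for dx in range(-search, search + 1)]
    mvs, prev_mv = [], np.zeros(2)
    for p in cur_pts:
        def cost(mv):
            data = np.min(np.linalg.norm(prev_pts - (p + mv), axis=1))
            return data + lam * np.linalg.norm(mv - prev_mv)
        prev_mv = min(offsets, key=cost)
        mvs.append(prev_mv)
    return np.array(mvs)   # missing segments can then be motion-compensated
```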
In object-based video encoding, the encoding of the video data is decoupled into the encoding of shape, motion, and texture information, which enables certain functionalities, such as content-based interactivity and content-based scalability. The fundamental problem, however, of how to jointly encode this separate information to reach the best coding efficiency has not been studied thoroughly. In this paper, we present an operational rate-distortion optimal scheme for the allocation of bits among shape, motion, and texture in object-based video encoding. Our approach is based on Lagrangian relaxation and dynamic programming. We implement our algorithm on the MPEG-4 video verification model, although it is applicable to any object-based video encoding scheme. The performance is assessed using a proposed metric that jointly captures the distortion due to the encoding of shape and texture. Experimental results demonstrate that the gain of lossy shape encoding depends on the percentage of the total bit budget occupied by the shape bits; this gain may be small, or may only be realized at very low bit rates, for certain typical scenes.
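A minimal sketch of the Lagrangian-relaxation part is shown below (the dynamic-programming handling of dependencies between coding units is omitted); the bisection on the multiplier and the per-unit operating-point lists are generic assumptions rather than the exact MPEG-4 implementation.

```python
def allocate_bits(options, budget, lam_lo=0.0, lam_hi=1e6, iters=40):
    """Lagrangian-relaxation sketch of rate allocation among shape, motion and
    texture.  options[i] is the list of admissible (rate, distortion) operating
    points for coding unit i; dependencies between units, handled by dynamic
    programming in the paper, are ignored here."""
    def pick(lam):
        choice = [min(pts, key=lambda p: p[1] + lam * p[0]) for pts in options]
        return choice, sum(r for r, _ in choice)

    for _ in range(iters):                    # bisection on the multiplier
        lam = 0.5 * (lam_lo + lam_hi)
        _, rate = pick(lam)
        if rate > budget:
            lam_lo = lam                      # too many bits: penalize rate more
        else:
            lam_hi = lam
    return pick(lam_hi)[0]                    # feasible allocation
```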
In this manuscript, we address the problem of studying the layer structure in X-ray Fluorescence (XRF) elemental maps of paintings through the incorporation of reflectance imaging spectroscopy (RIS) data in the visible or near-IR range. We propose a conceptually flexible approach, which involves an initial clustering step on the visible hyperspectral RIS data and the formation of a synthetic surface XRF image. Considering the difference between the full and synthetic surface XRF images, surface- and subsurface-correlated features are then identified. Results are demonstrated on real and simulated data.
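A compact Python sketch of the two-step idea is given below, assuming k-means clustering of the RIS cube and a per-cluster mean XRF response as the synthetic surface image; both choices are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def surface_subsurface_split(ris, xrf, n_clusters=8):
    """Cluster the reflectance (RIS) cube, build a synthetic 'surface' XRF image
    by assigning each cluster its mean XRF response, and treat the residual of
    the measured XRF as subsurface-related.  Using k-means and the per-cluster
    mean is an illustrative assumption, not necessarily the paper's choice.
    ris: (H, W, B_ris) reflectance cube, xrf: (H, W, B_xrf) elemental maps."""
    H, W, _ = ris.shape
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(
        ris.reshape(H * W, -1)).reshape(H, W)
    synthetic = np.zeros(xrf.shape)
    for c in range(n_clusters):
        mask = labels == c
        synthetic[mask] = xrf[mask].mean(axis=0)      # surface-correlated part
    subsurface = xrf - synthetic                       # residual features
    return synthetic, subsurface
```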
Recovery of low-rank matrices has recently seen significant activity in many areas of science and engineering, motivated by recent theoretical results for exact reconstruction guarantees and interesting practical applications. In this paper, we present novel recovery algorithms for estimating low-rank matrices in matrix completion and robust principal component analysis based on sparse Bayesian learning (SBL) principles. Starting from a matrix factorization formulation and enforcing the low-rank constraint in the estimates as a sparsity constraint, we develop an approach that is very effective in determining the correct rank while providing high recovery performance. We provide connections with existing methods for similar problems, as well as empirical results and comparisons with current state-of-the-art methods that illustrate the effectiveness of this approach.
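The point-estimate caricature below conveys how a factorization with per-column precisions can determine the rank by pruning columns whose energy collapses; it assumes a fully observed matrix and a fixed noise variance, and is not the Bayesian algorithm developed in the paper.

```python
import numpy as np

def sbl_lowrank(Y, max_rank=20, n_iter=100, noise_var=1e-2, prune_tol=1e-8):
    """Simplified SBL-style factorization sketch: Y ~ A @ B.T with a shared
    per-column precision gamma_k on the columns of A and B.  Columns whose
    energy collapses are pruned, which is how the rank is determined.  This is
    a point-estimate caricature (fully observed Y, fixed noise variance), not
    the algorithm in the paper."""
    m, n = Y.shape
    rng = np.random.default_rng(0)
    A = rng.standard_normal((m, max_rank))
    B = rng.standard_normal((n, max_rank))
    gamma = np.ones(max_rank)
    col_energy = np.ones(max_rank)
    for _ in range(n_iter):
        # Alternating least squares with column-wise (ARD-like) regularization.
        A = Y @ B @ np.linalg.inv(B.T @ B + noise_var * np.diag(gamma))
        B = Y.T @ A @ np.linalg.inv(A.T @ A + noise_var * np.diag(gamma))
        col_energy = (A ** 2).sum(0) + (B ** 2).sum(0)
        gamma = (m + n) / (col_energy + prune_tol)    # large gamma: dead column
    keep = col_energy > prune_tol * col_energy.max()
    return A[:, keep], B[:, keep]
```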
Respiratory diseases constitute one of the leading causes of death worldwide and directly affect the patient's quality of life. Early diagnosis and patient monitoring, which conventionally include lung auscultation, are essential for the efficient management of respiratory diseases. Manual lung sound interpretation is a subjective and time-consuming process that requires high medical expertise. The capabilities that deep learning offers can be exploited to design robust lung sound classification models. In this paper, we propose a novel hybrid neural model that employs the focal loss (FL) function to deal with training data imbalance. Features initially extracted from short-time Fourier transform (STFT) spectrograms via a convolutional neural network (CNN) are given as input to a long short-term memory (LSTM) network that captures the temporal dependencies between data and classifies four types of lung sounds: normal, crackles, wheezes, and both crackles and wheezes. The model was trained and tested on the ICBHI 2017 Respiratory Sound Database and achieved state-of-the-art results using three different data-splitting strategies: sensitivity 47.37%, specificity 82.46%, score 64.92%, and accuracy 73.69% for the official 60/40 split; sensitivity 52.78%, specificity 84.26%, score 68.52%, and accuracy 76.39% using inter-patient 10-fold cross-validation; and sensitivity 60.29% and accuracy 74.57% using leave-one-out cross-validation.
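A toy PyTorch sketch of the hybrid CNN-LSTM pattern and a multi-class focal loss follows; the layer sizes, the 1-D convolutions over spectrogram frames, and gamma = 2 are placeholder assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def focal_loss(logits, targets, gamma=2.0):
    """Multi-class focal loss: down-weights easy examples to counter class
    imbalance.  gamma=2 is a common default, not necessarily the paper's value."""
    logp = F.log_softmax(logits, dim=1)
    logp_t = logp.gather(1, targets.unsqueeze(1)).squeeze(1)
    pt = logp_t.exp()
    return (-(1 - pt) ** gamma * logp_t).mean()

class CNNLSTM(nn.Module):
    """Toy CNN-LSTM: convolutional features are extracted per time frame of the
    STFT spectrogram and fed to an LSTM; all sizes are placeholders."""
    def __init__(self, n_bins=64, n_classes=4):
        super().__init__()
        self.cnn = nn.Sequential(nn.Conv1d(n_bins, 64, 3, padding=1), nn.ReLU(),
                                 nn.Conv1d(64, 64, 3, padding=1), nn.ReLU())
        self.lstm = nn.LSTM(64, 128, batch_first=True)
        self.fc = nn.Linear(128, n_classes)

    def forward(self, spec):                     # spec: (batch, n_bins, time)
        feats = self.cnn(spec).transpose(1, 2)   # -> (batch, time, 64)
        _, (h, _) = self.lstm(feats)
        return self.fc(h[-1])                    # logits for the four classes
```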
A new, reduced-complexity algorithm is proposed for compensating the Inter-Carrier Interference (ICI) caused by severe PHase Noise (PHN) and Residual Frequency Offset (RFO) in OFDM systems. The algorithm estimates and compensates the most significant terms of the frequency-domain ICI process, which are optimally selected via a Minimum Mean Squared Error (MMSE) criterion. The algorithm requires minimal knowledge of the phase-process statistics, the estimation of which is also considered. The scheme outperforms previously proposed compensation methods of similar complexity when severe phase impairments are present.
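The NumPy sketch below illustrates only the compensation step, assuming an estimate J_hat of the spectrum of the multiplicative phase process is already available: the L most significant terms are kept and the resulting circular convolution is undone. The MMSE selection of the terms and the estimation of the phase-process statistics, which are the core of the paper, are not shown.

```python
import numpy as np

def compensate_ici(Y, J_hat, L):
    """ICI compensation sketch for one OFDM symbol.
    Y: received frequency-domain symbols (length N).
    J_hat: length-N estimate of np.fft.fft(exp(j*phi(n))), the spectrum of the
    multiplicative phase process (PHN plus RFO).  Keeping only the L most
    significant terms and how J_hat is obtained are assumptions here."""
    N = len(Y)
    idx = np.argsort(np.abs(J_hat))[::-1][:L]      # most significant ICI terms
    J_sig = np.zeros(N, dtype=complex)
    J_sig[idx] = J_hat[idx]
    # Circular convolution with J_sig is a pointwise multiplication in the time
    # domain, so divide there and return to the frequency domain.
    y_time = np.fft.ifft(Y)
    phase_time = np.fft.ifft(J_sig)                # approx exp(j*phi(n))
    x_time = y_time / (phase_time + 1e-12)
    return np.fft.fft(x_time)                      # compensated subcarriers
```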