Efficient indexing is key to content-based video retrieval. In this paper we propose a new dynamic indexing scheme based on the kd-tree structure. Video sequences are first represented as traces in an appropriate low-dimensional space via luminance field scaling and PCA projection. The indexing scheme is then applied to give the video database a manageable structure. The ability to handle dynamic video clip insertions and deletions is an essential part of this solution. Initially, an ordinary kd-tree is created for the initial database. As new video traces are added to the database, they are also added to the indexing tree. A tree node is split if its size exceeds a certain threshold; if the imbalance of the tree exceeds a threshold, merging and re-splitting are performed. Preliminary experiments show that merging and re-splitting ensure the efficiency of the indexing scheme.
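The insert-then-split behaviour described above can be sketched as follows. This is a minimal illustrative kd-tree with leaf splitting at a size threshold; the constant `MAX_LEAF`, the median split rule, and all names are assumptions, not the paper's exact design (the paper's merging/re-splitting step is omitted here):

```python
MAX_LEAF = 4  # split threshold (illustrative; the paper's value may differ)

class Node:
    def __init__(self, points):
        self.points = points            # only leaves store points
        self.dim = self.split = None    # split dimension and value (internal nodes)
        self.left = self.right = None

    def is_leaf(self):
        return self.left is None

def insert(node, p):
    """Dynamically insert point p, splitting any leaf that grows too large."""
    if node.is_leaf():
        node.points.append(p)
        if len(node.points) > MAX_LEAF:
            _split(node)
    elif p[node.dim] <= node.split:
        insert(node.left, p)
    else:
        insert(node.right, p)

def _split(node):
    pts = node.points
    # split along the dimension with the largest spread, at the median
    d = max(range(len(pts[0])),
            key=lambda i: max(q[i] for q in pts) - min(q[i] for q in pts))
    pts.sort(key=lambda q: q[d])
    mid = len(pts) // 2
    node.dim, node.split = d, pts[mid - 1][d]
    node.left, node.right = Node(pts[:mid]), Node(pts[mid:])
    node.points = None                  # node is now internal
```

Insertion stays O(depth) per point, and the split threshold bounds the work done at any leaf, which is what keeps the structure manageable as clips arrive.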
Photographs acquired under low-light conditions require long exposure times and therefore exhibit significant blurring due to camera shake. Using shorter exposure times yields sharper images but with a very high level of noise. In this paper we address this problem and present a novel blind deconvolution algorithm for a pair of differently exposed images. We formulate the problem in a hierarchical Bayesian framework, utilizing prior knowledge of the unknown image and blur, as well as of the dependency between the two observed images. By incorporating a fully Bayesian analysis, the developed algorithm estimates all necessary algorithm parameters along with the unknowns, so that no user intervention is needed. Moreover, we employ a variational Bayesian inference procedure, which allows statistical compensation of errors occurring at different stages of the restoration and also provides uncertainties of the estimates. Experimental results demonstrate the high restoration performance of the proposed algorithm.
We apply reinforcement learning to video compressive sensing to adapt the compression ratio. Specifically, we consider video snapshot compressive imaging (SCI), which captures high-speed video using a low-speed camera and reconstructs multiple (B) video frames from a single snapshot measurement. A gap left open by previous studies is how to adapt B in the video SCI system to different scenes. In this paper, we fill this gap using reinforcement learning (RL). An RL model, together with various convolutional neural networks for reconstruction, is learned to achieve adaptive sensing in video SCI systems. Furthermore, the performance of an object detection network operating directly on the video SCI measurements, without reconstruction, is also used to drive RL-based adaptive video compressive sensing. Our proposed adaptive SCI method can thus be implemented at low cost and in real time, taking the technology one step further towards real applications of video SCI.
Blind image restoration using local bound constraints May, K.; Stathaki, T.; Katsaggelos, A.K.
Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181), 1998, Volume: 5
Conference Proceeding
A new method of incorporating local image characteristics into blind image restoration is proposed. The local variance of the degraded image is used as a measure of spatial activity, from which individual pixel bounds are determined. A user-defined parameter controls the degree of smoothing. The local bounds define the solution more precisely than smoothness constraints on the image (including spatially adaptive ones), reducing the number of possible solutions and leading to faster convergence. Experimental results demonstrate the potential of this method as an alternative or supplement to smoothing constraints in blind image restoration.
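The mapping from local variance to per-pixel bounds might look like the following sketch. This is not the paper's exact formulation: the window size, the square-root mapping, and the parameter `alpha` (standing in for the user-defined smoothing parameter) are illustrative assumptions. The idea it shows is that the bound interval is narrow in smooth regions (low variance) and wide near edges (high variance):

```python
import numpy as np

def local_bounds(g, win=3, alpha=1.0):
    """Return per-pixel (lower, upper) bounds around the degraded image g.

    The half-width of each interval grows with the local variance, so the
    restored pixel is constrained tightly where the image is smooth.
    """
    pad = win // 2
    gp = np.pad(g.astype(float), pad, mode='reflect')
    var = np.empty(g.shape, dtype=float)
    for i in range(g.shape[0]):
        for j in range(g.shape[1]):
            var[i, j] = gp[i:i + win, j:j + win].var()  # local spatial activity
    half = alpha * np.sqrt(var)        # interval half-width per pixel
    return g - half, g + half
```

A restoration iterate would then be clipped (projected) onto these bounds at each step, which is how the constraint shrinks the solution set.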
In this paper, we propose EveRestNet, a convolutional neural network designed to remove blocking artifacts in video streams using events from neuromorphic sensors. We first degrade the video frames using a quadtree structure to produce the blocking artifacts, simulating transmission of a video under heavily constrained bandwidth. Events from the neuromorphic sensor are also simulated, but are transmitted in full. Using the distorted frames and the event stream, EveRestNet is able to improve the image quality.
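A quadtree-style degradation of the kind described above can be sketched as follows. This is an assumed implementation, not the paper's: a block is flattened to its mean (producing the blocking artifact) unless its variance exceeds a threshold, in which case it is recursively subdivided down to a minimum block size. The threshold and size constants are illustrative.

```python
import numpy as np

def quadtree_degrade(img, thresh=25.0, min_block=2):
    """Simulate bandwidth-constrained transmission via quadtree block flattening."""
    out = img.astype(float).copy()

    def recurse(r0, r1, c0, c1):
        block = out[r0:r1, c0:c1]
        if block.var() <= thresh or (r1 - r0) <= min_block:
            block[...] = block.mean()   # flatten the block -> blocking artifact
        else:                           # detailed block: subdivide into quadrants
            rm, cm = (r0 + r1) // 2, (c0 + c1) // 2
            recurse(r0, rm, c0, cm); recurse(r0, rm, cm, c1)
            recurse(rm, r1, c0, cm); recurse(rm, r1, cm, c1)

    recurse(0, out.shape[0], 0, out.shape[1])
    return out
```

Smooth regions collapse into large flat blocks while textured regions keep smaller blocks, which mimics the spatially varying quality of a heavily compressed frame.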
Purpose: To present DeepCOVID-Fuse, a deep learning fusion model that predicts risk levels in patients with confirmed coronavirus disease 2019 (COVID-19), and to evaluate the performance of pre-trained fusion models on full or partial combinations of chest radiographs (CXRs) and clinical variables. Materials and Methods: The initial CXRs, clinical variables, and outcomes (i.e., mortality, intubation, hospital length of stay, ICU admission) were collected from February 2020 to April 2020, with reverse-transcription polymerase chain reaction (RT-PCR) test results as the reference standard. The risk level was determined by the outcome. The fusion model was trained on 1657 patients (age: 58.30 +/- 17.74; female: 807) and validated on 428 patients (56.41 +/- 17.03; 190) from the Northwestern Memorial HealthCare system, and was tested on 439 patients (56.51 +/- 17.78; 205) from a single holdout hospital. The performance of pre-trained fusion models on full or partial modalities was compared on the test set using the DeLong test for the area under the receiver operating characteristic curve (AUC) and the McNemar test for accuracy, precision, recall, and F1. Results: The accuracy of DeepCOVID-Fuse trained on CXRs and clinical variables was 0.658, with an AUC of 0.842, significantly outperforming (p < 0.05) models trained only on CXRs (accuracy 0.621, AUC 0.807) or only on clinical variables (accuracy 0.440, AUC 0.502). The pre-trained fusion model with only CXRs as input increased accuracy to 0.632 and AUC to 0.813, and with only clinical variables as input increased accuracy to 0.539 and AUC to 0.733. Conclusion: The fusion model learns better feature representations across modalities during training and achieves good outcome predictions even when only some of the modalities are available at test time.
We solve the problem of uplink video streaming in CDMA cellular networks by jointly designing the rate control and scheduling algorithms. In the pricing-based distributed rate control algorithm, the base station announces a price per unit of average rate it can support, and each mobile device chooses its desired average transmission rate by balancing its video quality against its cost of transmission. Each mobile device then determines the specific video frames to transmit via a video summarization process. In the time-division-multiplexing (TDM) scheduling algorithm, the base station collects information on the frames to be transmitted from all devices within the current time window, sorts them in increasing order of deadline, and schedules the transmissions in a TDM fashion. This joint algorithm exploits multi-user content diversity and maximizes the total network utility (i.e., minimizes the total network distortion) while satisfying the delivery deadline constraints. Simulations show that the proposed algorithm significantly outperforms the constant-rate provisioning algorithm.
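The deadline-sorted TDM scheduling step can be sketched as below. This is a minimal earliest-deadline-first sketch under assumptions not stated in the abstract: each frame is described by an (id, transmission time, deadline) tuple, frames are sent back to back in one TDM window, and a frame whose transmission would finish after its deadline is dropped.

```python
def tdm_schedule(frames, t0=0.0):
    """frames: list of (frame_id, tx_time, deadline) gathered in one window.

    Sorts by deadline (earliest first) and packs transmissions back to back,
    dropping any frame that can no longer meet its deadline.
    Returns a list of (frame_id, start_time, end_time).
    """
    schedule, t = [], t0
    for fid, tx, deadline in sorted(frames, key=lambda f: f[2]):
        if t + tx <= deadline:            # frame meets its deadline: schedule it
            schedule.append((fid, t, t + tx))
            t += tx
        # otherwise the frame is dropped; its deadline cannot be met
    return schedule
```

Sorting by deadline is the classical earliest-deadline-first rule: for a single shared channel it maximizes the number of frames delivered on time among greedy orderings.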