3D volumetric image processing has attracted increasing attention in recent decades, and one major research direction is the development of efficient lossless volumetric image compression techniques to better store and transmit such images with their massive amounts of information. In this work, we propose the first end-to-end optimized learning framework for losslessly compressing 3D volumetric data. Our approach builds upon a hierarchical compression scheme by additionally introducing intra-slice auxiliary features and estimating the entropy model based on both intra-slice and inter-slice latent priors. Specifically, we first extract hierarchical intra-slice auxiliary features through multi-scale feature extraction modules. Then, an Intra-slice and Inter-slice Conditional Entropy Coding module is proposed to fuse the intra-slice and inter-slice information from different scales as the context information. Based on this context information, we can predict the distributions of both the intra-slice auxiliary features and the slice images. To further improve lossless compression performance, we also introduce two new gating mechanisms, called Intra-Gate and Inter-Gate, to generate optimal feature representations for better information fusion. Eventually, we produce the bitstream for losslessly compressing volumetric images based on the estimated entropy model. Unlike existing lossless volumetric image codecs, our end-to-end optimized framework jointly learns both intra-slice auxiliary features at different scales for each slice and inter-slice latent features from previously encoded slices for better entropy estimation. Extensive experimental results indicate that our framework outperforms state-of-the-art hand-crafted lossless volumetric image codecs (e.g., JP3D) and learning-based lossless image compression methods on four volumetric image benchmarks covering both 3D medical images and hyper-spectral images.
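The abstract describes the Intra-Gate and Inter-Gate only at a high level. A minimal numpy sketch of one plausible sigmoid-gated fusion of intra-slice and inter-slice features follows; the weight shapes, channel count, and single-vector form are illustrative assumptions, not the authors' architecture:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(intra_feat, inter_feat, W, b):
    """Fuse intra-slice and inter-slice features with a learned gate.

    gate = sigmoid(W @ [intra; inter] + b) decides, per channel, how much
    of each source enters the fused context used for entropy estimation.
    """
    joint = np.concatenate([intra_feat, inter_feat], axis=-1)
    gate = sigmoid(joint @ W + b)                 # each entry in (0, 1)
    return gate * intra_feat + (1.0 - gate) * inter_feat

rng = np.random.default_rng(0)
C = 8                                             # illustrative channel count
intra = rng.standard_normal(C)                    # intra-slice auxiliary feature
inter = rng.standard_normal(C)                    # inter-slice latent feature
W = rng.standard_normal((2 * C, C)) * 0.1
b = np.zeros(C)
fused = gated_fusion(intra, inter, W, b)
print(fused.shape)  # (8,)
```

Because the gate lies in (0, 1), the fused feature is a per-channel convex combination of the two sources, which is what lets the network learn to lean on whichever prior is more informative.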
This paper introduces an end-to-end learned image compression system, termed ANFIC, based on Augmented Normalizing Flows (ANF). ANF is a new type of flow model that stacks multiple variational autoencoders (VAEs) for greater model expressiveness. VAE-based image compression has gone mainstream, showing promising compression performance. Our work presents the first attempt to leverage VAE-based compression in a flow-based framework. ANFIC advances compression efficiency further by hierarchically stacking and extending multiple VAEs. The invertibility of ANF, together with our training strategies, enables ANFIC to support a wide range of quality levels without changing the encoding and decoding networks. Extensive experimental results show that, in terms of PSNR-RGB, ANFIC performs comparably to or better than state-of-the-art learned image compression. Moreover, it performs close to VVC intra coding, from low-rate compression up to perceptually lossless compression. In particular, ANFIC achieves state-of-the-art performance when extended with conditional convolution for variable-rate compression with a single model. The source code of ANFIC can be found at https://github.com/dororojames/ANFIC .
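The invertibility the abstract relies on comes from the coupling structure of normalizing flows. A toy numpy sketch of one additive augmented-flow step, invertible by construction, is shown below; the tiny tanh "networks" and the zero-initialized augmented latent are illustrative stand-ins, not ANFIC's actual transforms:

```python
import numpy as np

rng = np.random.default_rng(1)
W_e = rng.standard_normal((4, 4)) * 0.5
W_d = rng.standard_normal((4, 4)) * 0.5

def e(x):  # "encoding" shift network (illustrative)
    return np.tanh(x @ W_e)

def d(z):  # "decoding" shift network (illustrative)
    return np.tanh(z @ W_d)

def anf_step(x, z):
    """One additive coupling step: each update shifts one variable by a
    function of the other, so it can be undone exactly."""
    z1 = z + e(x)
    x1 = x - d(z1)
    return x1, z1

def anf_step_inv(x1, z1):
    x = x1 + d(z1)
    z = z1 - e(x)
    return x, z

x = rng.standard_normal(4)      # "image" signal
z = np.zeros(4)                 # augmented latent
x1, z1 = anf_step(x, z)
x_rec, z_rec = anf_step_inv(x1, z1)
print(np.allclose(x, x_rec), np.allclose(z, z_rec))  # True True
```

Stacking several such steps increases expressiveness while the whole chain stays exactly invertible, which is the property that lets a single model cover a wide range of quality levels.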
Existing hyperspectral image data contain significant local and non-local spatial redundancy, as well as a large amount of spectral redundancy. However, current algorithms inadequately exploit this redundancy, limiting compression performance. To address this issue, this paper introduces a lossy compression algorithm for hyperspectral images, named THSIC (Transformer-based HyperSpectral Image Compression). The algorithm first utilizes a channel-spatial attention module to fully exploit spatial and spectral redundancies in hyperspectral images, resulting in a better latent representation. Subsequently, Transformer- and CNN-based hyperprior branches are employed to extract non-local and local redundant information from the latent representation, respectively. These two hyperpriors, along with the contextual prior extracted from the local context, are fused to construct multiple priors. A more accurate entropy model is then built using these priors, thereby enhancing the rate–distortion performance of lossy compression for hyperspectral images.
•The paper introduces a hyperspectral image compression algorithm based on multiple residual modules and channel-spatial attention to enhance the compression performance of the backbone network.
•To explore local and non-local redundancy in the latent representation of hyperspectral images, the paper proposes a dual-branch hybrid hyperprior network utilizing both Transformer- and CNN-based hyperpriors.
•Experimental results on three hyperspectral remote sensing image datasets show that the proposed Transformer-hyperprior-based compression algorithm achieves strong performance.
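The channel-spatial attention module named above is not detailed in the abstract; a minimal numpy sketch of one common CBAM-style formulation (channel attention followed by spatial attention) is given below. The weight matrices, pooling choices, and tensor sizes are assumptions for illustration only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_spatial_attention(x, Wc, Ws):
    """CBAM-style channel then spatial attention (illustrative weights).

    x: (C, H, W) feature cube, with C spectral channels.
    """
    C, H, W = x.shape
    # channel attention from global average pooling over space
    ca = sigmoid(x.mean(axis=(1, 2)) @ Wc)                       # (C,)
    x = x * ca[:, None, None]
    # spatial attention from the channel-wise mean map
    sa = sigmoid(x.mean(axis=0).reshape(-1) @ Ws).reshape(H, W)  # (H, W)
    return x * sa[None, :, :]

rng = np.random.default_rng(2)
C, H, W = 4, 5, 5
x = rng.standard_normal((C, H, W))
Wc = rng.standard_normal((C, C)) * 0.1
Ws = rng.standard_normal((H * W, H * W)) * 0.1
y = channel_spatial_attention(x, Wc, Ws)
print(y.shape)  # (4, 5, 5)
```

The channel branch reweights spectral bands (spectral redundancy) while the spatial branch reweights pixel positions (spatial redundancy), which matches the stated goal of exploiting both.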
Light field (LF) imaging enables new possibilities for digital imaging, such as digital refocusing, changing of focus plane, changing of viewpoint, scene-depth estimation, and 3D scene reconstruction, by capturing both spatial and angular information of light rays. However, one main problem in dealing with LF data is its sheer volume. In this context, efficient compression methods are needed for such a particular type of content. In this paper, we propose a content-based LF image-compression method with Gaussian process regression to improve the compression efficiency and accelerate the prediction procedure. First, the LF image is fed to the intra-frame codec of HEVC. In the prediction procedure, the prediction units (PUs) are classified as non-homogenous texture units, homogenous texture units, and visually flat units, based on the content properties of the LF image. For each category, we design a corresponding Gaussian process regression (GPR)-based prediction method. Moreover, we propose a classification mechanism to decide exactly which category the current PU belongs to, so as to adjust the trade-off between the computational burden and the LF image coding efficiency. Experimental results demonstrate that the proposed LF image compression method is superior to several other state-of-the-art compression methods in terms of different quality metrics. Furthermore, the proposed method can also achieve good visual quality for views rendered from decoded LF contents.
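GPR-based prediction as described above boils down to computing a Gaussian process posterior mean at the pixels of the current PU from already-decoded reference pixels. A toy numpy sketch with an RBF kernel follows; the kernel choice, length scale, and the four-neighbour layout are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """RBF kernel between two sets of 2D positions."""
    d = a[:, None, :] - b[None, :, :]
    return np.exp(-0.5 * np.sum(d ** 2, axis=-1) / ell ** 2)

def gpr_predict(X_train, y_train, X_test, noise=1e-3):
    """GP posterior mean, with targets centred so the prior mean is sensible."""
    mu = y_train.mean()
    K = rbf(X_train, X_train) + noise * np.eye(len(X_train))
    k_star = rbf(X_test, X_train)
    return mu + k_star @ np.linalg.solve(K, y_train - mu)

# toy PU prediction: infer the centre pixel from four reference neighbours
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y_train = np.array([10.0, 12.0, 11.0, 13.0])
X_test = np.array([[0.5, 0.5]])
pred = gpr_predict(X_train, y_train, X_test)
print(pred)  # by symmetry, the centre prediction equals the neighbours' mean
```

The O(n^3) kernel solve is why the paper's per-category designs and classification mechanism matter: they control how much GPR work is spent on each PU type.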
The synthetic aperture radar (SAR) image is widely used in many remote sensing applications. In order to store and transmit the growing volume of SAR image data, more efficient compression algorithms are needed. The purpose of this paper is to introduce a new framework for compressing SAR images. First, we propose a novel analysis and synthesis transform based on multi-Resblocks for transforming the original SAR image into a compact latent representation. Then, a Gaussian mixture model is used to estimate the latent representation's distribution. In order to exploit the redundancy within the latent representation, the entropy model parameters are estimated by combining the local context, global context, and hyperprior information. To evaluate the performance of the proposed algorithm, we conduct experiments on a dataset of SAR images. The results show that the proposed algorithm outperforms JPEG2000 and some state-of-the-art learned image compression schemes in terms of compression performance.
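The Gaussian mixture entropy model mentioned above assigns each quantized latent a probability equal to the mixture mass on a unit-width interval, and the ideal code length is the negative log of that mass. A small sketch of that computation follows; the component weights, means, and scales are made-up numbers standing in for what the context/hyperprior networks would predict:

```python
from math import erf, sqrt, log2

def Phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def gmm_bits(y_hat, weights, means, scales):
    """Bits to code one quantized latent under a K-component Gaussian mixture.

    The probability of the integer symbol y_hat is the mixture mass on
    [y_hat - 0.5, y_hat + 0.5]; the ideal code length is -log2 of that mass.
    """
    p = sum(w * (Phi((y_hat + 0.5 - m) / s) - Phi((y_hat - 0.5 - m) / s))
            for w, m, s in zip(weights, means, scales))
    return -log2(max(p, 1e-12))

# a 3-component mixture predicted by the entropy model (illustrative numbers)
weights = [0.5, 0.3, 0.2]
means = [0.0, 2.0, -1.0]
scales = [1.0, 0.5, 2.0]
b_likely = gmm_bits(0, weights, means, scales)    # symbol near a mode: cheap
b_unlikely = gmm_bits(10, weights, means, scales)  # far-tail symbol: expensive
print(b_likely, b_unlikely)
```

Sharper context/hyperprior predictions concentrate the mixture on the true symbol, which is exactly how better parameter estimation translates into fewer bits.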
While deep-learning image compression methods have shown impressive coding performance, most of them output a single optimized compression rate from a specifically trained network. In practice, however, it is essential to support variable-rate compression or to meet a target rate with high coding performance. This paper proposes a novel image compression method that enables a single convolutional neural network (CNN) model to generate variable rates efficiently with optimized rate-distortion (RD) performance. The method consists of a CNN-based multi-scale decomposition transform and content-adaptive rate allocation. Specifically, the transform network is learned to decompose the input image into several scales of representations while optimizing the RD performance for all scales. Rate allocation algorithms for two typical scenarios are provided to determine the optimal scale of each image block for a given target rate or quality factor. For a target rate, the allocation adapts to content complexity. For a target quality factor, which indicates a trade-off between rate and quality, the optimal scale is determined by minimizing the RD cost. Experimental results show that our method outperforms the JPEG2000 and BPG standards with high efficiency and state-of-the-art RD performance as measured by the multi-scale structural similarity index metric. Moreover, our method can strictly control the rate to generate the target compression result.
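For the target-quality scenario, minimizing the RD cost per block reduces to an argmin over the per-scale costs J = D + lambda * R. A minimal sketch of that selection follows; the per-scale rate/distortion numbers and the lambda values are illustrative, not measurements from the paper:

```python
import numpy as np

def select_scale(rates, dists, lam):
    """Pick the decomposition scale minimizing J = D + lam * R for one block."""
    costs = np.asarray(dists) + lam * np.asarray(rates)
    return int(np.argmin(costs))

# per-scale (rate in bpp, distortion in MSE) for one block -- illustrative
rates = [0.2, 0.5, 1.0, 2.0]
dists = [80.0, 30.0, 12.0, 5.0]
coarse = select_scale(rates, dists, lam=100.0)  # large lam favours low rate
fine = select_scale(rates, dists, lam=1.0)      # small lam favours low distortion
print(coarse, fine)
```

For the target-rate scenario the same per-scale tables would instead feed an allocation loop that picks coarser scales for simple blocks until the bit budget is met, which is the content-adaptive behaviour the abstract describes.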
Most image encryption algorithms based on low-dimensional chaos systems carry security risks and suffer encrypted-data expansion when adopting nonlinear transformations directly. To overcome these weaknesses and reduce the transmission burden, an efficient image compression–encryption scheme based on a hyper-chaotic system and 2D compressive sensing is proposed. The original image is measured by measurement matrices in two directions to achieve compression and encryption simultaneously, and the resulting image is then re-encrypted by a cycle shift operation controlled by a hyper-chaotic system. The cycle shift operation changes the pixel values efficiently. As a nonlinear encryption system, the proposed cryptosystem simultaneously decreases the volume of data to be transmitted and simplifies key distribution. Simulation results verify the validity and reliability of the proposed algorithm, with acceptable compression and security performance.
•A simultaneous image encryption–compression scheme based on 2D CS and hyperchaos is proposed.
•The nonlinear cycle shift operation controlled by a hyper-chaotic system is used for diffusion.
•The hyper-chaotic system, with good randomness and sensitivity, is used to generate random sequences.
•The ciphertext data volume is decreased and security is improved due to the double encryption.
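The pipeline above (2D measurement in both directions, then chaos-driven cyclic shifting) can be sketched in a few lines of numpy. The logistic map below is a simplified stand-in for the paper's hyper-chaotic generator, and the matrix sizes are toy values:

```python
import numpy as np

def measure_2d(X, A, B):
    """2D compressive sensing: Y = A @ X @ B.T measures both directions."""
    return A @ X @ B.T

def chaotic_sequence(n, x0=0.7, r=3.99):
    """Logistic map as a simplified stand-in for a hyper-chaotic generator."""
    seq, x = [], x0
    for _ in range(n):
        x = r * x * (1.0 - x)
        seq.append(x)
    return np.array(seq)

def cycle_shift_rows(Y, shifts):
    """Diffusion: cyclically shift each row by a chaos-derived amount."""
    return np.stack([np.roll(row, s) for row, s in zip(Y, shifts)])

rng = np.random.default_rng(3)
N, M = 8, 4                                   # compress an 8x8 image to 4x4
X = rng.integers(0, 256, size=(N, N)).astype(float)
A = rng.standard_normal((M, N)) / np.sqrt(M)  # row-direction measurement matrix
B = rng.standard_normal((M, N)) / np.sqrt(M)  # column-direction measurement matrix
Y = measure_2d(X, A, B)
shifts = (chaotic_sequence(M) * M).astype(int)
C = cycle_shift_rows(Y, shifts)
print(X.size, "->", C.size)  # 64 -> 16
```

The shifts are keyed by the chaotic initial condition, so a receiver with the key can undo them exactly (roll by the negated shifts) before CS reconstruction; the compression comes entirely from the two-sided measurement.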
This paper addresses the problem of lossy image compression, a fundamental problem in image processing and information theory that is involved in many real-world applications. We start by reviewing the framework of variational autoencoders (VAEs), a powerful class of generative probabilistic models that has a deep connection to lossy compression. Based on VAEs, we develop a new scheme for lossy image compression, which we name quantization-aware ResNet VAE (QARV). Our method incorporates a hierarchical VAE architecture integrated with test-time quantization and quantization-aware training, without which efficient entropy coding would not be possible. In addition, we design the neural network architecture of QARV specifically for fast decoding and propose an adaptive normalization operation for variable-rate compression. Extensive experiments are conducted, and results show that QARV achieves variable-rate compression, high-speed decoding, and better rate-distortion performance than existing baseline methods.
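The quantization-aware training mentioned above typically contrasts a differentiable training-time proxy with hard rounding at test time; a common choice, used here purely as an illustration of the idea rather than as QARV's exact method, is additive uniform noise during training:

```python
import numpy as np

rng = np.random.default_rng(4)

def quantize_train(y):
    """Training-time proxy: additive uniform noise in [-0.5, 0.5] mimics
    rounding error while keeping the operation differentiable in y."""
    return y + rng.uniform(-0.5, 0.5, size=y.shape)

def quantize_test(y):
    """Test time: hard rounding to integers so symbols can be entropy-coded."""
    return np.round(y)

y = rng.standard_normal(1000) * 5.0          # toy latent values
err_train = np.abs(quantize_train(y) - y)
err_test = np.abs(quantize_test(y) - y)
print(err_train.max() <= 0.5, err_test.max() <= 0.5)  # True True
```

Both operations perturb each latent by at most 0.5, which is why the noisy proxy gives the entropy model a faithful picture of the rounded symbols it must code at test time.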