Dynamic adaptive video streaming over hypertext transfer protocol (HTTP), known as DASH, has become the de facto video delivery mechanism, taking advantage of existing low-cost and widespread HTTP platforms. Standards such as MPEG-DASH define the bitstream conformance and decoding process, while leaving the bitrate adaptation algorithm open for research. So far, most DASH research has focused on constant bitrate video delivery. In this paper, variable bitrate (VBR) video delivery is investigated in the on-demand streaming scenario. The detailed instantaneous bitrates of future segments are exploited in the proposed adaptation method to capture the fluctuation characteristics of VBR video. The adaptation problem is formulated as an optimization process with the proposed internal QoE goal function, which balances the various requirements. In addition, the parameters of the internal QoE function can be tuned to accommodate different preferences. The experimental results demonstrate that the proposed QoE-based video adaptation method outperforms the state-of-the-art method by a good margin.
Forward error correction (FEC) codes are widely studied as a means of protecting streamed video over unreliable networks. Typically, enlarging the FEC coding block size improves the error correction performance. For video streaming applications, this can be implemented by grouping more than one video frame into one FEC coding block. However, doing so introduces decoding delay, which is not tolerable for real-time video streaming applications. In this paper, to resolve this dilemma, a real-time video streaming scheme using a randomized expanding Reed-Solomon (RS) code is proposed. In this scheme, the RS coding block includes not only the video packets of the current frame, but can also include all the video packets of previous frames in the current group of pictures. At the decoding side, the parity-check equations of the current frame are solved jointly with all the parity-check equations of the previous frames. Since video packets of the following frames are not included in the RS coding block, neither the encoder nor the decoder has to wait for the video or parity packets of following frames, so no delay is introduced. Experimental results show that the proposed scheme significantly outperforms other real-time error-resilient video streaming approaches. Specifically, for the Foreman sequence, the proposed scheme provides a 1.5 dB average gain over the state-of-the-art approach at a 10% i.i.d. packet loss rate, whereas for the burst loss case, the average gain is more than 3 dB. MATLAB code for this paper is available for download at http://www.mmtlab.com.
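The block-size effect described in the abstract above can be illustrated with a small calculation. The sketch below is not the paper's scheme: it assumes an ideal RS(n, k) erasure code (any k of the n packets suffice to recover the block) and i.i.d. packet loss, and the function name `block_failure_prob` is hypothetical. It compares blocks of growing size at the same code rate.

```python
from math import comb


def block_failure_prob(n: int, k: int, p: float) -> float:
    """Probability that an ideal RS(n, k) erasure block cannot be recovered.

    Assumes i.i.d. packet loss with probability p. The block fails when
    more than n - k of its n packets are lost.
    """
    return sum(
        comb(n, j) * p**j * (1 - p) ** (n - j)
        for j in range(n - k + 1, n + 1)
    )


if __name__ == "__main__":
    # Same code rate k/n = 0.8 at 10% loss, with growing block size.
    for k in (4, 8, 16, 32):
        n = k * 5 // 4
        print(f"RS({n},{k}): failure probability = {block_failure_prob(n, k, 0.10):.4f}")
```

Under these assumptions the failure probability falls as the block grows, which is the gain that larger blocks offer at the cost of waiting for more packets.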
Dynamic adaptive streaming over HTTP (DASH) has emerged as an efficient technology for video streaming. In a DASH system, a common case is that multiple users compete for limited server bandwidth. To improve user quality of experience (QoE) and guarantee fairness, we propose to use game theory in a proxy server to allocate the bandwidth collaboratively among multiple users. By taking user buffer length, received video bit rates, video qualities, etc., into account, the bandwidth allocation problem is formulated as a cooperative bargaining problem, and the Nash bargaining solution (NBS) is obtained by convex optimization. When a user's requested bit rate is larger than the bit rate calculated by the proxy (i.e., the NBS), the request is rewritten to the proxy-calculated bit rate. Experimental results demonstrate that, compared with existing methods, the proposed method significantly improves user QoE and fairness: the delay frequency and duration are lower, and the received video qualities are higher and more stable.
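As a rough illustration of the bargaining formulation in the abstract above, the sketch below computes a Nash bargaining solution for a deliberately simplified model. It assumes each user's utility is simply the bandwidth received above a minimum rate `d[i]` (the disagreement point) and that no user receives more than the requested rate `cap[i]`. The paper's actual utility also involves buffer length and video quality, so the function `nbs_allocate` and its parameters are hypothetical. Under these assumptions, maximizing the product of utilities reduces to sharing the surplus equally, subject to the caps.

```python
from typing import List


def nbs_allocate(total: float, d: List[float], cap: List[float]) -> List[float]:
    """Maximize prod(x[i] - d[i]) s.t. sum(x) <= total and d[i] <= x[i] <= cap[i].

    From the KKT conditions, every uncapped user gets the same surplus above
    d[i], so the surplus is split equally and capped users are frozen in turn.
    Assumes cap[i] >= d[i] for every user.
    """
    surplus = total - sum(d)
    if surplus < 0:
        raise ValueError("total bandwidth cannot cover the minimum rates")
    x = list(d)
    active = set(range(len(d)))
    while active and surplus > 1e-12:
        share = surplus / len(active)
        # Users whose headroom is below the equal share hit their cap.
        saturated = [i for i in active if cap[i] - x[i] <= share]
        if not saturated:
            for i in active:
                x[i] += share
            break
        for i in saturated:
            surplus -= cap[i] - x[i]
            x[i] = cap[i]
            active.remove(i)
    return x


if __name__ == "__main__":
    # Three users, 10 Mbps in total, 1 Mbps minimum each; user 0 requests only 2 Mbps.
    print(nbs_allocate(10.0, [1.0, 1.0, 1.0], [2.0, 10.0, 10.0]))
```

In the usage example, the first user is held at the requested 2 Mbps and the remaining bandwidth is divided equally between the other two users.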
Multiview video streaming continues to gain popularity due to the great viewing experience it offers, as well as its availability, which has been enabled by increased network throughput and other recent technical developments. User demand for interactive multiview video streaming that provides seamless view switching upon request is also increasing. However, it is highly challenging to stream stable, high-quality video that allows real-time scene navigation within the bandwidth constraint. In this paper, a convolutional neural network (ConvNet)-assisted seamless multiview video streaming system is proposed to tackle this challenge. The proposed method addresses the problem from two perspectives. First, a ConvNet-assisted multiview representation method is proposed, which provides flexible interactivity without compromising multiview video compression efficiency. Second, a bit allocation mechanism guided by a navigation model is developed to provide seamless navigation while adapting to network bandwidth fluctuations. These two blocks work closely together to provide an optimized viewing experience to users, and they can be integrated into any existing multiview video streaming framework to enhance overall performance. Experimental results demonstrate the effectiveness of the proposed method for seamless multiview streaming.
A 3-D multiview video gives users an experience that is different from that provided by a traditional video; however, it places a heavy burden on limited bandwidth resources. Mixed-resolution video in a multiview system can alleviate this problem by using different video resolutions for different views. However, to reduce visual discomfort and to make this video format more suitable for free-viewpoint television, the low-resolution (LR) views need to be super-resolved to the target full resolution. In this paper, we propose a virtual-view-assisted super-resolution algorithm, in which the inter-view similarity is used to determine whether the missing pixels in the super-resolved frame should be filled by virtual-view pixels or by spatially interpolated pixels. The decision mechanism is steered by the texture characteristics of the neighbors of each missing pixel. Furthermore, the inter-view similarity is used, on the one hand, to enhance the quality of the virtual-view-copied pixels by compensating for the luminance difference between views and, on the other hand, to enhance the original LR pixels in the super-resolved frame by reducing their compression distortion. Thus, the proposed method can recover the details in regions with edges while maintaining good quality in smooth areas by properly exploiting the high-quality virtual-view pixels and the directional correlation of pixels. The experimental results demonstrate the effectiveness of the proposed approach, with a peak signal-to-noise ratio gain of up to 3.85 dB.
Depth-image-based rendering is a key technique for realizing free viewpoint television. However, one critical problem in these systems is filling the disocclusions caused by the 3-D warping process. This paper exploits the temporal correlation of texture and depth information to generate a background reference image. This image is then used to fill the holes associated with the dynamic parts of the scene, whereas for static parts the traditional inpainting method is used. To generate the background reference image, the Gaussian mixture model is applied to the texture information, whereas depth map information is used to detect moving objects so as to enhance the background reference image. The proposed hole-filling approach is particularly useful for the single-view-plus-depth format, where, in contrast to the multi-view-plus-depth format, information from only one view can be used for this task. The experimental results show that both objective and subjective gains can be achieved, with the objective gain ranging from 1 to 3 dB over the inpainting method.
In video coding, traditional motion estimation methods work well for videos with translational camera motion, but their efficiency drops for other motions, such as rotational and dolly motions. In this paper, a motion-information-based three-dimensional (3D) video coding method is proposed for texture-plus-depth 3D video. The synchronized global motion information of the camera is obtained to help the encoder improve its rate-distortion performance by projecting the temporally neighboring texture and depth frames into the position of the current frame, using the depth and camera motion information. The projected frames are then added to the reference buffer list as virtual reference frames. As these virtual reference frames can be more similar to the current to-be-encoded frame than the conventional reference frames, fewer bits are required to represent the residual. The experimental results demonstrate that the proposed scheme enhances the coding performance for all camera motion types and for various scene settings and resolutions, using both the H.264 and HEVC standards. With the computer graphics sequences, for H.264, the average gains of texture and depth coding are up to 2 dB and 1 dB, respectively. For HEVC and HD resolution sequences, the gain of texture coding reaches 0.4 dB. For realistic sequences, up to 0.5 dB gain (H.264) is achieved for the texture video, while up to 0.7 dB gain is achieved for the depth sequences.
Long Term Evolution (LTE) networks are widely used for video transmission because of their high-speed communication capacity. In this paper, an end-to-end distortion-based bandwidth allocation method for multiple users is proposed to enhance the video transmission performance of the LTE network. In an LTE network, throughput is observed to be an independent variable that affects the packet loss ratio and thus the transmission distortion, so the end-to-end distortion model is first derived by taking throughput into account. Then, based on the derived end-to-end distortion model, the bandwidth allocation problem is formulated as a convex optimization problem and solved using the Karush-Kuhn-Tucker (KKT) conditions. Simulation results demonstrate the effectiveness of the proposed method. The rate-distortion performance of the proposed method is better than that of the average bandwidth allocation method under different bandwidth utilizations, and is very close to that of the exhaustive search-based method.
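The way KKT conditions yield a closed-form allocation, as mentioned in the abstract above, can be sketched on a toy problem. The distortion model here is an assumption made only for illustration and is not the end-to-end model derived in the paper: each user i has distortion a_i / r_i at rate r_i, and the total rate R is fixed. Setting the derivative of the Lagrangian to zero gives -a_i / r_i^2 + lambda = 0, so r_i is proportional to sqrt(a_i). The names `kkt_allocate` and `total_distortion` are hypothetical.

```python
from math import sqrt
from typing import List


def kkt_allocate(total_rate: float, a: List[float]) -> List[float]:
    """Minimize sum(a[i] / r[i]) subject to sum(r) == total_rate, r[i] > 0.

    Stationarity of the Lagrangian gives r[i] = sqrt(a[i] / lam); the sum
    constraint then fixes lam, leaving r[i] proportional to sqrt(a[i]).
    Assumes every a[i] is positive (toy model, not the paper's).
    """
    roots = [sqrt(v) for v in a]
    scale = total_rate / sum(roots)
    return [scale * r for r in roots]


def total_distortion(a: List[float], r: List[float]) -> float:
    """Sum of the assumed per-user distortions a[i] / r[i]."""
    return sum(ai / ri for ai, ri in zip(a, r))


if __name__ == "__main__":
    a = [1.0, 4.0]   # hypothetical per-user distortion coefficients
    R = 3.0          # total rate to share
    opt = kkt_allocate(R, a)
    equal = [R / len(a)] * len(a)
    print("KKT allocation:", opt, "distortion:", total_distortion(a, opt))
    print("equal split:   ", equal, "distortion:", total_distortion(a, equal))
```

In this toy setting the KKT allocation gives the user with the larger coefficient more rate and achieves lower total distortion than an equal split.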
Due to the prediction structures employed in video coding, the loss of one packet affects many following frames. In this paper, a multiple description coding scheme with staggered frame order is proposed for stereoscopic 3-D videos. First, the reference and auxiliary views in stereoscopic sequences are asymmetrically encoded into one description, whereas the other description is formed in the same way with a delay of one dummy frame. Because of the staggered frame order, the coarsely encoded B frames are inserted into different positions of the two descriptions. If a frame encoded in I/P mode is lost, its corresponding B-frame version is used to compensate for the loss. In each description, the quantization steps of B frames are tuned based on a closed-form solution that considers the video content, network status, frame positions in the group of pictures, and the layer of the views. For further improvement, a fusing scheme is provided. The experimental results demonstrate that the proposed scheme outperforms state-of-the-art schemes. Specifically, up to 1.3-dB gain is achieved in the case of packet loss, and 2-dB gain is obtained for the side/central performance.
Reed-Solomon erasure codes are commonly studied as a method of protecting video streams transmitted over unreliable networks. Because Reed-Solomon (RS) codes are block-based error-correcting codes, enlarging the block size enhances their performance on the one hand; on the other hand, a large block size leads to a long delay, which is not tolerable for real-time video applications. In this paper, a novel Dynamic Sub-GOP FEC (DSGF) approach is proposed to improve the performance of Reed-Solomon codes for video applications. With the proposed approach, the Sub-GOP, which contains more than one video frame, is dynamically tuned and used as the RS coding block, yet no delay is introduced. For a fixed number of extra packets introduced for protection, choosing the length of each Sub-GOP and the redundancy devoted to it becomes a constrained optimization problem. To solve this problem, a fast greedy algorithm is proposed. Experimental results show that the proposed approach outperforms other real-time error-resilient video coding technologies.
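The kind of greedy redundancy allocation mentioned in the abstract above can be sketched as follows. This is not the DSGF algorithm itself: the sketch assumes the Sub-GOP lengths are already fixed, assumes an ideal RS erasure code with i.i.d. packet loss, and uses a hypothetical cost (the number of source packets in a block times the probability that the block is unrecoverable). The names `fail_prob` and `greedy_parity` are illustrative. Each parity packet goes to the block where it reduces that cost the most.

```python
from math import comb
from typing import List


def fail_prob(k: int, m: int, p: float) -> float:
    """Probability that a block of k source + m parity packets is unrecoverable.

    Assumes an ideal erasure code and i.i.d. loss p: the block fails when
    more than m of its k + m packets are lost.
    """
    n = k + m
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(m + 1, n + 1))


def greedy_parity(ks: List[int], budget: int, p: float) -> List[int]:
    """Assign `budget` parity packets to blocks with ks[i] source packets each.

    A greedy heuristic: each packet goes to the block whose cost
    k * fail_prob drops the most. Not guaranteed to be optimal.
    """
    parity = [0] * len(ks)
    for _ in range(budget):
        gains = [
            k * (fail_prob(k, m, p) - fail_prob(k, m + 1, p))
            for k, m in zip(ks, parity)
        ]
        best = max(range(len(ks)), key=gains.__getitem__)
        parity[best] += 1
    return parity


if __name__ == "__main__":
    # Three hypothetical Sub-GOPs of 4, 8 and 12 source packets, 6 parity packets.
    print(greedy_parity([4, 8, 12], budget=6, p=0.10))
```

A greedy pass like this needs only one cost evaluation per block per parity packet, which is why such heuristics are attractive when the allocation must be recomputed on the fly.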