Journal for Research | Volume 02| Issue 03 | May 2016 ISSN: 2395-7549
Responsive Video Format for Adaptive Streaming Stenal P Jolly UG Student Department of Computer Science & Engineering Athanasius College of Engineering Kothamangalam, Ernakulam, Kerala, India
Sarath Sasikumar UG Student Department of Mathematics Narnarayan Shastri Institute of Technology, Jetalpur, Ahmedabad, GTU-Gujarat, India
Santhanu K UG Student Department of Computer Science & Engineering Athanasius College of Engineering Kothamangalam, Ernakulam, Kerala, India
Ebin Jose UG Student Department of Computer Science & Engineering Athanasius College of Engineering Kothamangalam, Ernakulam, Kerala, India
Joby George Assistant Professor Department of Computer Science & Engineering Athanasius College of Engineering Kothamangalam, Ernakulam, Kerala, India
Abstract: The increasing use of digital video has put into perspective the storage requirements for large numbers of video files. Videos are stored at several quality levels, and each level requires its own storage. A wide variety of devices, with different pixel densities and correspondingly different video processing capabilities, consume these files. This paper presents a novel multimedia video encoding/decoding algorithm that is responsive to the capabilities of the playback device. It takes a lightweight, multi-layered approach in which the video stream is divided into layers that are differentially loaded according to the quality requirements. Differential loading relies on a progressive decoding algorithm for the video stream, so a single file can serve different quality requirements by varying the bit rate according to factors such as device processing capability, available memory, screen width, and pixel density. By incorporating this method into the WebRTC framework, the video quality can also be varied in response to network congestion. Our analysis suggests that this method is considerably more efficient than the transcoding-on-the-fly approach used by multimedia servers, which requires special hardware support. For ordinary servers that use HTML5 for video playback, this method reduces the total bandwidth required and saves a large amount of storage space, resulting in smoother streaming over the web. The new approach makes video files adaptive to the different devices on which they run.

Keywords: adaptive streaming, codec, multilayer, responsive, video format, frame adaptation

I. INTRODUCTION
As multimedia data becomes more prominent in our day-to-day lives, the need for efficient compression algorithms for multimedia data becomes increasingly important. Examples of such applications include video e-mail and video-on-demand. In particular, as more commercial sites offer video- and audio-based services, efficient compression of video data will probably determine the acceptance or rejection of these applications. One primary problem is the large amount of storage consumed by the different versions of a video file, together with the inability to adapt dynamically to network conditions.

Video streams are usually delivered via UDP, where there is no transport-layer congestion control. UDP is the transport protocol of choice for video streaming platforms mainly because the fully reliable, strict in-order delivery semantics of TCP do not suit the real-time nature of video transmission. Video streams are loss tolerant and delay sensitive. Retransmissions by TCP to ensure reliability introduce latency in the delivery of data to the application, which in turn degrades video image quality. Additionally, the steady-state behaviour of TCP involves the repeated halving and growth of its congestion window, following the well-known additive increase/multiplicative decrease (AIMD) algorithm. Hence, the throughput observed by a TCP receiver oscillates under normal conditions. This presents another difficulty, since video is usually streamed at a constant rate.

Our contributions are in two areas: video compression and network streaming. The video component of our system makes compressed video streaming-friendly through support for priority drop. We describe a video format, called SPEG (Scalable MPEG), to illustrate how current video compression techniques can be extended to support priority drop.
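The AIMD oscillation mentioned above can be visualised with a toy simulation. All names and constants below are illustrative choices of ours, not part of any real TCP stack:

```python
# Toy model of TCP's additive increase / multiplicative decrease (AIMD):
# the congestion window grows linearly each round and is halved whenever
# a (simulated) loss occurs, so the observed throughput forms a sawtooth.

def aimd_windows(rounds, initial=10.0, increase=1.0, loss_every=8):
    """Return the congestion-window size seen at each round."""
    cwnd = initial
    history = []
    for r in range(1, rounds + 1):
        history.append(cwnd)
        if r % loss_every == 0:
            cwnd /= 2           # multiplicative decrease on a loss event
        else:
            cwnd += increase    # additive increase otherwise
    return history

if __name__ == "__main__":
    print([round(w, 1) for w in aimd_windows(24)])
```

The sawtooth visible in the printed sequence is exactly the oscillation that makes constant-rate video delivery over TCP awkward.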
In contrast to random dropping, which results in unusable video at dropping levels of just a few percent, priority drop is informed and can achieve graceful degradation over more than an order of magnitude in rate. One of the main questions that arises when considering such
a range of target rates is what aspect or aspects of the video to degrade. The answer can be influenced by several factors, such as the nature of the content, the nature of the viewing device, and the personal preferences of authors and viewers.

In this paper we present a novel multi-layered approach in which the video stream is split into multiple fragments and similar fragments are grouped into separate layers. Based on the quality requirements, the video is differentially processed, so the same video file can be used for different qualities. First we introduce the FLIF image format, which is used for the individual image frames. The second section describes how the video is divided into different layers and how similar layers are grouped together. The third section introduces the differential processing of the video based on the requirements. The last section analyses the work and gives an insight into its future prospects.

Another aspect that is given a wholly new dimension by the Responsive Video Format is adaptability to different devices such as desktops, smart TVs, laptops, tablets and smartphones. The video format adjusts the quality in "response" to the device on which it is being played. It automatically adjusts to the requirements of the current device and loads only the required number of pixels, rather than the entire video file as in the prevalent video formats. This in turn yields much better efficiency on the device playing the video.

II. PROPOSED SYSTEM

The proposed method implements the FLIF image format for the video frames. The main advantage of the FLIF image format is its responsiveness to quality: FLIF need not load the entire image file to display the content. This property encouraged us to use the format for video files, which makes a file adaptable to the quality requirements and to specific devices.
It can dynamically adjust to the device and the network conditions. The same idea is adopted in our system to develop a video format that adapts dynamically to the network and to the quality requirements. The details of the FLIF image format are given below.

A. FLIF

FLIF is a novel lossless image format which outperforms PNG, lossless WebP, lossless BPG, lossless JPEG 2000, and lossless JPEG XR in terms of compression ratio. FLIF files are on average: 14% smaller than lossless WebP, 22% smaller than lossless BPG, 33% smaller than brute-force crushed PNG files (using ZopfliPNG), 43% smaller than typical PNG files, 46% smaller than optimized Adam7-interlaced PNG files, 53% smaller than lossless JPEG 2000, and 74% smaller than lossless JPEG XR. Even if the best image format were picked out of PNG, JPEG 2000, WebP or BPG for a given image corpus, depending on the type of images (photograph, line art, 8-bit or higher bit depth, etc.), FLIF still beats that by 12% on a median corpus (or 19% on average, including 16-bit images, which are not supported by WebP and BPG).

1) Advantages

Some of the key advantages of FLIF:

1) Best compression: in a compression test similar to the WebP study, FLIF clearly beats the other image compression algorithms.

2) Works on any kind of image: FLIF does away with having to know which image format performs best for a given task. One is supposed to know that PNG works well for line art but not for photographs; that JPEG can be used for regular photographs where some quality loss is acceptable; and that for medical images lossless JPEG 2000 may be preferable. This can be tricky for non-technical end-users. More recent formats like WebP and BPG do not solve this problem, since they still have their own strengths and weaknesses. FLIF works well on any kind of image, so the end-user does not need to try different algorithms and parameters.
Consider a selection of different kinds of images and how each image format performs on them; FLIF beats everything else in all categories. An example illustrates the point. On photographs, PNG performs poorly while WebP, BPG and JPEG 2000 compress well (see the plot on the left). On medical images, PNG and WebP perform relatively poorly (although the most recent development version of WebP appears to perform much better) while BPG and JPEG 2000 work well (see the middle plot). On geographical maps, BPG and JPEG 2000 perform extremely poorly while PNG and WebP work well (see the plot on the right). In each of these three examples, FLIF performs well, indeed better than any of the others.

3) Progressive and lossless: FLIF is lossless, but can still be used in low-bandwidth situations, since only the first part of a file is needed for a reasonable preview of the image. Lossy compression is useful when network bandwidth or disk space is limited and a visually acceptable image is enough. The disadvantages of lossy compression are obvious: information is lost forever, compression artifacts can be noticeable, and transcoding or editing can cause generation loss. With better lossless compression, the need for lossy compression is lessened.
4) Responsive by design: a FLIF image can be loaded in different 'variations' from the same source file by loading the file only partially. This makes it a very appropriate file format for responsive web design, and it is the very property that forms the basis of this paper. The FLIF image frames are used in the responsive video format to adjust to the requirements, allowing the system to load only what is required and hence providing much better efficiency.

B. FFmpeg

FFmpeg is a free software project that produces libraries and programs for handling multimedia data. It is essentially a library for creating video applications and general-purpose utilities, and it performs encoding, decoding, muxing, demuxing and filtering. It is mostly written in the C programming language, with parts in assembly language. In building the player, we use SDL to output the audio and video of the media file; SDL is an excellent cross-platform multimedia library used in MPEG playback software, emulators, and many video games. FFmpeg includes libavcodec, an audio/video codec library used by several other projects; libavformat, an audio/video container mux and demux library; and the ffmpeg command-line program for transcoding multimedia files.
C. Theora

Theora is a free lossy video compression format developed by the Xiph.Org Foundation and distributed without licensing fees alongside its other free and open media projects, including the Vorbis audio format and the Ogg container. The libtheora video codec is the reference implementation of the Theora format. Like Xiph.Org's other multimedia technology, it can be used to distribute film and video online and on disc without the licensing and royalty fees or vendor lock-in associated with other formats. Theora scales from postage-stamp to HD resolution and is considered particularly competitive at low bitrates. It is in the same class as MPEG-4/DivX, and like the Vorbis audio codec it has plenty of room for improvement as encoder technology develops.

III. FRAME ADAPTATION

Video compression is generally lossy: video degradation is exchanged in proportion to the amount of compression desired. We cannot use the entire set of frames for the modified requirements, and therefore need to restrict the number of frames. Many methods have been proposed to limit the frames.

Displayed frame rate adaptation: there are two major techniques for reducing the rate at which video frames are processed at the receiver: frame dropping and playback dilation. With frame dropping, a subset of the frames associated with a video clip is discarded at the server end. These frames are thus skipped by the player, reducing the player's frame processing rate. Note that frame dropping is in reality another form of SNR adaptation, since the receiver obtains less accurate data about the video sequence and must either repeat preceding frames or interpolate, both of which reduce perceptual quality.
In a generalization of this technique, the server may drop data at the macroblock level, so a better name for the technique is macroblock filtering. With playback dilation, all the frames of the source video are delivered to the receiver, but the rate at which video frames are processed at the receiver is intentionally reduced below the encoded frame rate of the video. This results in the dilation, or expansion, of the playback time of the video. Displayed frame rate adaptation, using macroblock filtering or playback dilation or a combination of the two, can handle reductions in the data rate supported by the server or the network for the video stream, as well as reductions in the frame processing rate supported by the receiver.

Although any of the four major techniques (spectral filtering, quantization filtering, macroblock filtering, and playback dilation) may be used individually or in combination to adapt the transmitted video stream to changes in the QoS delivered by the server, the network, and the receiver, the impact of each choice on the perceptual quality of the delivered video may differ significantly. Different video clips may also undergo different levels of perceptual quality degradation for a given adaptation choice. Furthermore, perceptual quality is a subjective phenomenon which may vary among human observers. The choice of adaptation technique also has a varying impact on the server/network throughput and the receiver frame processing rate required to support the adapted video stream.
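As a rough sketch of server-side frame dropping, one of the displayed-frame-rate techniques above, evenly spaced frames can be retained to hit a target ratio. The function below is our own illustrative stand-in, not an algorithm from this paper or any specific server:

```python
def drop_frames(frames, keep_ratio):
    """Keep approximately keep_ratio of the frames, evenly spaced.

    The skipped frames would be repeated or interpolated at the player,
    trading perceptual quality for a lower delivered data rate.
    """
    if not 0 < keep_ratio <= 1:
        raise ValueError("keep_ratio must be in (0, 1]")
    step = 1.0 / keep_ratio
    kept, next_keep = [], 0.0
    for i, frame in enumerate(frames):
        if i >= next_keep:      # this frame crosses the next keep threshold
            kept.append(frame)
            next_keep += step
    return kept

# Keeping half the frames of a 30-frame clip leaves 15 evenly spaced frames.
half = drop_frames(list(range(30)), 0.5)
```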
Fig. 1: Pixels in various video qualities
IV. MODIFIED FRAME ADAPTATION

We adopt a modified approach to frame dropping, in which the dropping of frames depends on the quality of the video file required. The entire video file is segmented based on the colour space: a single colour space is divided into different fragments, and similar fragments are grouped into one.

A. Encoding

The encoding process consists of encoding the individual frames with the algorithm below.
Fig. 2: Algorithm 1 (frame encoding)
The procedure encodeFrame takes a frame and returns the layers of the frame as output. A set of stored quality levels determines which layer is to be extracted: the layers are formed by extracting the layer corresponding to each level. These are grouped together and returned as the set of layers.
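A minimal Python sketch of this encoding path follows, under our own assumptions: a frame is modelled as a flat list of pixel values, there are six quality levels, and the 'differential' handling of every third pixel (detailed under Algorithm 2 below) is simplified to plain dropping. It mirrors the spirit of Algorithms 1 and 2 rather than reproducing them exactly:

```python
MAX_LEVEL = 6  # total number of stored quality levels, per the paper

def extract_layer(frame, level):
    """Reduce a frame for the given quality level (sketch of Algorithm 2).

    Below MAX_LEVEL, walk the pixels in groups of three and keep the
    first two of each group, so each level retains roughly 2/3 of the
    pixels of the level above it. At MAX_LEVEL the frame passes through.
    """
    if level >= MAX_LEVEL:
        return list(frame)
    reduced = list(frame)
    for _ in range(MAX_LEVEL - level):     # one 2/3 reduction per level below max
        temp = []
        for i in range(0, len(reduced), 3):
            temp.extend(reduced[i:i + 2])  # first two pixels of each group
        reduced = temp
    return reduced

def encode_frame(frame):
    """Split one frame into a list of layers, one per quality level."""
    return [extract_layer(frame, level) for level in range(1, MAX_LEVEL + 1)]
```

On a 9-pixel frame, level 6 keeps all 9 pixels, level 5 keeps 6, and level 4 keeps 4, matching the approximate two-thirds ratio between successive qualities.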
Fig. 3: Algorithm 2 (extracting layers)
The extractLayer algorithm takes the frame and the required quality level as input. The frame's size and data are read into the variables size and data. If the level is less than 6, the total number of available qualities, the pixels are extracted differentially according to the level; if the level is 6 or more, no pixels need be extracted and the entire frame is passed on. To extract the pixels, we move through the data in groups of three and keep the first two pixels of each group, while the third pixel is added differentially. We have approximated that the number of pixels (in millions) at each quality is two-thirds of the number at the previous quality, as illustrated in Fig. 1. Finally, the temporary frame temp is passed on as the frame, and its memory is freed for the next frame extraction.

B. Decoding
Fig. 4: Algorithm 3 (frame decoding)
The decoding, after the lossy encoding of the video frames, performs differential streaming of the encoded files according to the quality requirements and specifications. The decodeFrame procedure produces the decoded version of the qualitatively encoded frame. It takes as input the layers and, optionally, a maxLevel parameter; if no value is given, the highest quality level is selected as the required quality. The number of layers is stored in levelSize, and levelSize is assigned to maxLevel if no value was given for maxLevel or if maxLevel exceeds levelSize. MAXSIZE contains the size of the frame.
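The decoding loop just described can be sketched in Python under our own simplified model: layers are flat lists, the group-of-three differential merge of the layer-appending step (Algorithm 4, described below) is reduced to plain concatenation, and all names are illustrative:

```python
def append_layer(frame, layer, max_size):
    """Merge one layer into the frame being rebuilt (sketch of Algorithm 4).

    On the first call frame is None, standing in for the NULL check and
    the allocation of a maxSize buffer; the real algorithm interleaves
    blocks in groups of three, which is simplified here to concatenation.
    """
    if frame is None:
        frame = []
    frame = frame + list(layer)
    return frame[:max_size]     # never grow past the full frame size

def decode_frame(layers, max_level=None, max_size=None):
    """Rebuild a frame up to the requested quality (sketch of Algorithm 3)."""
    level_size = len(layers)
    if max_level is None or max_level > level_size:
        max_level = level_size              # default to the highest quality
    if max_size is None:
        max_size = sum(len(l) for l in layers)
    frame = None
    for layer in layers[:max_level]:
        frame = append_layer(frame, layer, max_size)
    return frame
```

Requesting a lower maxLevel simply stops the loop early, which is what lets one stored file serve several quality targets.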
Fig. 5: Algorithm 4 (appending layers)
The appendLayer function decodes the layers of the frame during each iteration; a detailed description follows. The procedure appendLayer appends the layers and combines them to form the frame. When the very first layer calls appendLayer, the value of the frame is NULL, so the algorithm creates a new memory area of maxSize; the layer is copied onto the frame, and the frame with its single layer is returned. When the algorithm is called for subsequent layers, the size and data inside the frame are extracted and the layer data is stored in a new newData integer array. Just as in encoding, decoding proceeds in groups of three: two blocks of the original data are added along with a block of the new data. This temporary variable is copied onto the frame, which is then returned.

V. ANALYSIS

The analysis of the Responsive Video Format shows memory efficiency, since a large amount of storage can be saved by not storing the poorer-quality streams; this in itself leads to cost efficiency. The format also helps under network congestion, where the quality of the video is adjusted to the network situation: the video format is "responsive" to the quality requirements and the network conditions. The overall message of our results is that priority drop is very effective: a single video can be streamed across a wide range of network bandwidths, on networks heavily saturated with competing traffic, while maintaining real-time performance and gracefully adapting quality. In the adaptive streaming field, this video format is a completely new direction which has shown comparatively better results than the development of other protocols, in both computing and storage efficiency.

VI. CONCLUSION

The basic idea of the suggested Responsive Video Format is the adaptiveness of the system to the requirements, thereby increasing processing capability and, although less critical in recent times, saving a great deal of storage. This would in turn increase the proficiency of networks, especially as the whole world moves towards more and more video-oriented websites. It increases the efficiency of storage on the server side and the speed of processing on the client side, so the systems perform more efficiently. The most evident advantage is the maintenance of quality of service: users never feel they are being served an inferior quality, as they are provided with whatever their system can handle. The analysis of the Responsive Video Format shows a marked improvement in the efficiency of the systems, based on comparison with the established video formats currently in use.

ACKNOWLEDGMENT

First and foremost, we sincerely thank God Almighty for His grace in the successful and timely completion of the project.
We express sincere gratitude and thanks to Dr. Soosan George T, our Principal, and Dr. Surekha Mariam Varghese, Head of the Department of Computer Science and Engineering, for providing the facilities and all the encouragement and support. We would also like to thank Mr. Joby George, Assistant Professor, Computer Science and Engineering, for his invaluable support during the completion of this project, and the entire faculty of the Computer Science and Engineering department for their valuable suggestions and help. Finally, we would like to acknowledge the heartfelt efforts, comments, criticism, co-operation and tremendous support given to us by our dear friends during the preparation and the presentation of this work, without whose support it would have been much more difficult to accomplish.