
Poster Paper Proc. of Int. Colloquiums on Computer Electronics Electrical Mechanical and Civil 2011

Speech Compression Using JPEG

Munmun Baisantry1, Himanshu Arora2, Jyotsna Singh2

1 Image Analysis Centre, Defense Electronic Applications Laboratory, Dehradun-248001, India
Email: munmunbaisantry@gmail.com
2 Division of Electronics and Communications, Netaji Subhas Institute of Technology, Faculty of Technology (University of Delhi), New Delhi - 110078

Abstract— When a video is streamed over the internet, different types of data (e.g., moving images, audio and control signals) are multiplexed into individual sub-channels and sent over the channel. Because these streams have different bit rates, the channel capacity is poorly utilized. This paper proposes a novel technique that converts WAVE speech signals into BITMAP images and compresses them using a JPEG module. The technique not only maximizes channel utilization by using the same channel for image and speech signals, but also reduces computational resources and time by enabling the same processing tools to be used for both types of signals. In addition, because JPEG achieves a better compression ratio than MP3, it facilitates faster transfer of large volumes of data. The multiplexers used to segregate the audio and image parts of videos can also be omitted in such a communication system. Results show that the compression ratio of the proposed technique is better than that of speech signals compressed by conventional techniques. PESQ values further indicate that the converted and compressed speech signals remain readily reproducible.

Index Terms— channel, compression, Bitmap, JPEG, MP3, PESQ, Q-factor, speech, Wave.

I. INTRODUCTION

When a video is streamed over the internet, the moving images, audio, data (if any) and control signals are sent simultaneously over separate sub-channels. Because their bit rates are unequal, some sub-channels are still transmitting while others sit idle, and no new data can be assigned to the idle sub-channels until the previously sent data has been streamed completely. If a single channel is used instead of multiplexed sub-channels, the capacity of the channel can be fully utilized. As different data types cannot be transferred simultaneously over the same channel, it is preferable to convert the audio data into image form and then transmit it. Along with maximal utilization of the channel, the proposed idea has several other advantages:

1) Converting audio data into an image format allows the same software and hardware to be developed for both speech and image signals.
2) It removes the need for the multiplexer otherwise required to segregate the audio and image parts.
3) Experimental results show that an audio file converted into an image and then compressed with JPEG has a better compression ratio than MP3, saving bandwidth as well as time.

A diagrammatic overview of the proposed technique is shown in Fig.1.

Fig.1 Overview of the proposed module

In this paper, we propose a novel technique to convert WAVE files into BITMAP files, using the Microsoft .NET Framework as the platform and Visual C# for programming [1], [2]. For compression of the converted signals, the JPEG module of Paint.NET [3] was extracted and used. The rest of this paper is organized as follows. Section II briefly discusses the WAVE and BITMAP formats as well as JPEG compression. Section III gives a detailed description of the conversion of WAVE files into BMP images and back. Section IV presents results that validate the proposed algorithm. Concluding remarks are given in Section V.

II. WAVE AND BITMAP FILE FORMATS

The WAVE format, commonly used for multimedia files, is discussed in Subsection A. The BITMAP format, used for storing digital images in an uncompressed form, is discussed in Subsection B.

A. WAVE File Format

A WAVE file is often just a RIFF file [4] with a single “WAVE” chunk, which consists of two sub-chunks: a “fmt” chunk specifying the data format and a “data” chunk containing the actual sample data. The elements of the RIFF chunk are shown in Table 1.

TABLE 1: RIFF CHUNK FIELDS AND THEIR DESCRIPTION

The “WAVE” format consists of two sub-chunks, “fmt” and “data”. The “fmt” sub-chunk describes the format of the sound data, as shown in Table 2.
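The chunk layout described in this subsection can be illustrated with a short sketch. It is written in Python for brevity and is not the authors' C# implementation. The function names `make_wave_bytes` and `parse_wave` are hypothetical, and the sketch assumes a canonical uncompressed PCM file with little-endian fields. It builds a tiny WAVE file in memory with Python's standard `wave` module, then walks the RIFF chunk and its “fmt” and “data” sub-chunks by hand.

```python
import io
import struct
import wave


def make_wave_bytes(num_samples=8, sample_rate=8000):
    """Build a tiny mono 16-bit PCM WAVE file in memory (illustrative data only)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as writer:
        writer.setnchannels(1)
        writer.setsampwidth(2)  # 2 bytes per sample = 16 bits
        writer.setframerate(sample_rate)
        samples = struct.pack("<%dh" % num_samples, *range(num_samples))
        writer.writeframes(samples)
    return buffer.getvalue()


def parse_wave(data):
    """Read the RIFF chunk, then the "fmt " and "data" sub-chunks."""
    # RIFF chunk: 4-byte ID, 4-byte size, 4-byte format tag.
    chunk_id, chunk_size, riff_format = struct.unpack_from("<4sI4s", data, 0)
    if chunk_id != b"RIFF" or riff_format != b"WAVE":
        raise ValueError("not a RIFF/WAVE file")

    info = {"riff_size": chunk_size}
    offset = 12
    while offset + 8 <= len(data):
        # Every sub-chunk starts with a 4-byte ID and a 4-byte size.
        sub_id, sub_size = struct.unpack_from("<4sI", data, offset)
        body = offset + 8
        if sub_id == b"fmt ":
            (audio_format, channels, sample_rate, byte_rate,
             block_align, bits_per_sample) = struct.unpack_from("<HHIIHH", data, body)
            info.update(
                audio_format=audio_format,
                channels=channels,
                sample_rate=sample_rate,
                byte_rate=byte_rate,
                block_align=block_align,
                bits_per_sample=bits_per_sample,
            )
        elif sub_id == b"data":
            info["data_offset"] = body
            info["data_size"] = sub_size
        # Sub-chunks are padded to an even number of bytes.
        offset = body + sub_size + (sub_size & 1)
    return info


if __name__ == "__main__":
    for key, value in parse_wave(make_wave_bytes()).items():
        print(key, value)
```

The parser skips any sub-chunk it does not recognise, so the sketch also tolerates files that carry extra sub-chunks besides “fmt” and “data”.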