IJIEEB-V8-N6



INTERNATIONAL JOURNAL OF INFORMATION ENGINEERING AND ELECTRONIC BUSINESS (IJIEEB)
ISSN Print: 2074-9023, ISSN Online: 2074-9031

Editor-in-Chief
Prof. Anatoliy Sachenko, Ternopil National Economic University, Ukraine

Associate Editors
Dr. P. Ma, National Technical University of Ukraine "KPI", Ukraine
Prof. Abdel-Badeh M Salem, Ain Shams University, Egypt
Prof. H. S. Guruprasad, BMS College of Engineering, India

Members of Editorial and Reviewer Board
Dr. W.B. Hu, Wuhan University, China
Dr. Sanjay Kumar Dubey, Amity University, India
Dr. Ahmed Fouad, Suez Canal University, Egypt
Dr. Beladgham Mohammed, University of Bechar, Algeria
Dr. Jitendra Singh, PGDAV College, University of Delhi, India
Dr. Kumar Rajnish, Birla Institute of Technology, India
Dr. Imtiaz Hussain Khan, King Abdulaziz University, Saudi Arabia
Dr. Pinaki Majumdar, M.U.C Women's College, India
Dr. Nivet Chirawichitchai, Sripatum University, Thailand
Dr. Carlo Ciulla, University for Information Science and Technology, “St. Paul the Apostle”, Republic of Macedonia
Prof. Akella. V. S. N. Murty, Aditya Engineering College, India

International Journal of Information Engineering and Electronic Business (IJIEEB, ISSN Print: 2074-9023, ISSN Online: 2074-9031) is published quarterly by the MECS Publisher, Unit B 13/F PRAT COMM’L BLDG, 17-19 PRAT AVENUE, TSIMSHATSUI KLN, Hong Kong, E-mail: ijieeb@mecs-press.org, Website: www.mecs-press.org. The current and past issues are made available on-line at www.mecs-press.org/ijieeb. Opinions expressed in the papers are those of the author(s) and do not necessarily express the opinions of the editors or the MECS publisher. The papers are published as presented and without change, in the interests of timely dissemination. Copyright © by MECS Publisher. All rights reserved. No part of the material protected by this copyright notice may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording or by any information storage and retrieval system, without written permission from the copyright owner.


International Journal of Information Engineering and Electronic Business (IJIEEB) ISSN Print: 2074-9023, ISSN Online: 2074-9031 Volume 8, Number 6, November 2016

Contents

REGULAR PAPERS

Heterogeneous Cellular Interworking of WIMAX and UMTS for Sensitive Real-Time Services
Mina Malekzadeh

Framework for an E-Voting System Applicable in Developing Economies
Lauretta O. Osho, Muhammad B. Abdullahi, Oluwafemi Osho

M-Commerce in Bangladesh – Status, Potential and Constraints
Biman Barua

Methodology of Compiling Web-Applications into Executables, Obtaining Seamless Server Installations and GUI Navigations through Qt and C++ Process Communications
Emmanuel C. Paul

The Impact of Dots Representation in Recognition of Isolated Arabic Characters
Nehad H A Hammad, Mohammed Elhafiz Musa

An Efficient Feature Selection based on Bayes Theorem, Self Information and Sequential Forward Selection
K. Mani, P. Kalpana

Design & Optimization of Reversible Logic Based ALU Using ACO
Shaveta Thakral, Dipali Bansal

An Analysis of Fuzzy and Spatial Methods for Edge Detection
Pushpa Mamoria, Deepa Raj



I.J. Information Engineering and Electronic Business, 2016, 6, 1-8 Published Online November 2016 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijieeb.2016.06.01

Heterogeneous Cellular Interworking of WIMAX and UMTS for Sensitive Real-Time Services

Mina Malekzadeh
Electrical and Computer Faculty of Hakim Sabzevari University, Sabzevar, Iran
Email: m.malekzadeh@hsu.ac.ir

Abstract—Since internet connections are provided through a variety of disparate networks, connecting these networks and supporting heterogeneous interworking is a major issue, particularly in cellular networks. The interworking issue becomes even more challenging when it comes to providing QoS for real-time services. To improve the QoS of real-time data and to increase inter-cellular mobility, we propose a cost-effective framework consisting of four separate cellular models. The framework includes a heterogeneous interworking model that integrates cellular WiMAX and UMTS. In addition, the framework sets up a pure WiMAX network model and two pure UMTS network models. The performance of the heterogeneous model is evaluated via simulation and analyzed against the measured metrics of the pure models to quantify the level of improvement in QoS for real-time packets. Based on the results, recommendations are made on the most appropriate model for better VoIP QoS in cellular networks.

Index Terms—Cellular interworking, Heterogeneous networks, WiMAX, UMTS, VoIP

I. INTRODUCTION

Nowadays, end users connect to the internet through many different types of access networks. Users must therefore be provided with seamless network connectivity so that they stay connected while moving from one place to another. This seamless connectivity is achieved by connecting different types of networks, which is called heterogeneous internetworking. By integrating different network technologies into one common heterogeneous network architecture, the technologies can coexist and interoperate with each other and improve network performance in terms of Quality of Service (QoS) [15].

Heterogeneous wireless networks may incorporate wireless local area networks (WLAN), wireless personal area networks (WPAN), wireless metropolitan area networks (WMAN), and wireless wide area networks (WWAN), including cellular and satellite networks [2]. Among these architectures, heterogeneous cellular interworking offers a substantial increase in network data rate and coverage area, making it more suitable for mobile end users. Heterogeneous cellular interworking is provided by integrating different radio access technologies such as the universal mobile telecommunications system (UMTS), IEEE 802.16 worldwide interoperability for microwave access (WiMAX), long term evolution (LTE), WLAN, etc.

While the combination of cellular networks provides better service to end users, it also has its own issues. The networks that are integrated into a heterogeneous architecture have each been developed separately, with their own specific characteristics. These differing requirements make the interworking between cellular networks a challenging task, particularly for QoS-sensitive real-time data.

In this work we propose a cellular framework to increase inter-cellular mobility while improving the QoS of multimedia data. The framework consists of four models. The first is a heterogeneous model that integrates WiMAX [13,14] and UMTS cellular networks. Along with the heterogeneous model, the framework includes three pure models for WiMAX and UMTS networks. The performance of the proposed heterogeneous network model is evaluated and analyzed against the measured metrics of the pure network models, with special attention paid to VoIP packets. By studying several cellular scenarios, the main objective of this work is to analyze various alternatives for interworking of UMTS and WiMAX networks in order to optimize QoS for real-time VoIP data and achieve better network performance under the demand for higher data rates.

The rest of this work is organized as follows. Section 2 provides the related works. Section 3 describes the design structure of the proposed framework for interworking. Section 4 presents results and discussions. The conclusion is provided in Section 5.

II. RELATED WORKS

Because of its importance, the interworking of different types of networks has attracted the attention of many researchers. WiMAX and UMTS interworking is investigated in [1] for IP Multimedia Subsystem (IMS) based networks. Two network models, Hybrid Coupled WiMAX-UMTS with and without QoS provisioning, are created and compared with each other; however, the effect of having a merged network on the performance of IMS is not taken into account in that work.


In [2] the authors emphasize the importance of heterogeneous networks and develop a model and simulation environment to present and analyze the performance of WLAN and UMTS integration. The aim is to identify an architecture that can provide better overall performance and flexible interworking for handover. The authors mainly focus on integrating Wi-Fi and UMTS networks, while WiMAX networks are not considered.

Heterogeneous networks are also considered in [3]. Noting that the usage of wireless networks is growing day by day, the authors create two heterogeneous network architectures by integrating Wi-Fi with WiMAX and Wi-Fi with UMTS. The two architectures are simulated and the results are compared in terms of voice quality. Integrating WLAN with cellular networks is also discussed in [4]. A distance-based path-loss model is considered in an IP Multimedia Subsystem (IMS) integrating UMTS-WiMAX-WLAN. The integrated model is used to select the efficient radio mode with the expected QoS. The integration of WLAN, WiMAX and UMTS is also discussed in [5, 12].

In order to execute intersystem handover, the authors of [6] implement an interworking architecture between WiMAX, GPRS and UMTS. However, the roaming performance of the interworking architecture is not compared against pure cellular networks. The authors of [7] propose a framework for interworking of the UMB, WLAN and WiMAX technologies. The download and upload response times of email, along with voice jitter and packet end-to-end delay, are simulated to evaluate the proposed algorithm. Loose and tight coupling, the two main interworking architectures, are investigated in [8]. The authors highlight the objectives, features, and challenges of internetworking. However, no implementation is provided to evaluate the overall performance. Surveys of interworking architectures are also provided in [9, 11].

A handover framework is introduced in [10] for interworking of LTE, WiMAX and WLAN technologies. The objective is to offer seamless, high-quality IP-based multimedia services to users at any time during the handoff process.

Based on the related works, there is no existing research on a framework that models and implements cellular interworking alongside the pure network cores against which it is compared. This need is important because such a framework can give a guideline to the telecommunication industry for developing new techniques under the demand for higher data rates. Taking this need into account, this paper contributes to the identification of the most appropriate model for better VoIP QoS in cellular networks.

III. PROPOSED FRAMEWORK

The structure of the proposed framework consists of four separate network models: a cellular interworking model that integrates WiMAX and UMTS, a pure WiMAX network, a pure UMTS network with two independent cores, and a pure UMTS network with one common core. The following assumptions are set out for all four models:

- To avoid side effects of network complexity and to simplify the simulation models, point-to-point real-time traffic is exchanged between one transmitter and one receiver.
- The real-time VoIP traffic is transmitted from the calling Node1 to the called Node2 in all four models in the framework.
- To provide similar network conditions for a fair comparison, the calling node in all models transmits an equal amount of real-time traffic to the called node.
- The simulation time for each experiment is one hour.

The first model in the framework is the heterogeneous network model. In this model, the radio access network at the calling node side is WiMAX, while it is UMTS at the called node side. The access service network (ASN) comprises one cell supported by a wireless base station and one ASN gateway (ASN-GW), which together form the radio access network at the calling node side. The gateway controls and aggregates the traffic from the WiMAX base station. The radio access network at the called node side is the UMTS terrestrial radio access network (UTRAN), which handles all radio-related functionality. It consists of a Node B directly connected to the radio network controller (RNC), which is responsible for radio resource management and control of the Node B. To handle packet-switched services in the UMTS core network, the RNC is in turn connected to the serving GPRS support node (SGSN) and the gateway GPRS support node (GGSN). Finally, to merge these two disparate networks and provide interworking, the GGSN and the ASN gateway are connected through the internet backbone. This heterogeneous model is presented in Fig. 1.

Fig.1. Heterogeneous network model
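As a reading aid, the end-to-end path of the heterogeneous model described in the text can be written down as an ordered list of network elements. The following Python sketch is only an illustration of that description: the names `HETEROGENEOUS_PATH`, `hop_count` and `elements_in` are hypothetical, and the grouping labels are our own shorthand rather than simulator objects.

```python
# Ordered path of a VoIP packet in the heterogeneous model, from the calling
# node (WiMAX side) to the called node (UMTS side), as described in the text.
# The second field is an illustrative grouping label, not a simulator object.
HETEROGENEOUS_PATH = [
    ("Node1", "calling node"),
    ("WiMAX base station", "WiMAX ASN"),
    ("ASN gateway", "WiMAX ASN"),
    ("Internet backbone", "interworking link"),
    ("GGSN", "UMTS core"),
    ("SGSN", "UMTS core"),
    ("RNC", "UTRAN"),
    ("Node B", "UTRAN"),
    ("Node2", "called node"),
]


def hop_count(path):
    """Number of links traversed between the first and last element."""
    return len(path) - 1


def elements_in(path, group):
    """Names of the elements that belong to one grouping label."""
    return [name for name, label in path if label == group]


if __name__ == "__main__":
    print(" -> ".join(name for name, _ in HETEROGENEOUS_PATH))
    print("links on the path:", hop_count(HETEROGENEOUS_PATH))
    print("UMTS core elements:", elements_in(HETEROGENEOUS_PATH, "UMTS core"))
```

Listing the elements this way makes it easy to see that the WiMAX and UMTS segments meet only at the internet backbone between the ASN gateway and the GGSN.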


The next model in the framework is a pure WiMAX model consisting of two separate cells, each supported by its own wireless base station. The two cells are connected to a common ASN gateway through the internet backbone. This model is presented in Fig. 2.

Fig.2. Pure WiMAX model

As the third and fourth models, the framework includes two pure UMTS models. In a complex pure UMTS network there may be various strategies for combining the network components to achieve different purposes, based on the particular requirements of the network. Thus, by considering two different configurations of the pure UMTS network, we design two separate models. The goal of designing two different pure UMTS models in the framework is to verify the parameters that can affect network performance and to identify which configuration provides better QoS support. Identifying the pros and cons of different cellular strategies can help network designers and operators to adopt and optimize the right models for the specific services of their network. The first pure UMTS model consists of two independent cores that support the called and calling nodes independently through the internet backbone. For simplicity we refer to this model as UMTS-dual, which is presented in Fig. 3.

The second pure UMTS model is designed with a reduced number of network elements compared to the first pure UMTS model. It consists of one common core that supports both the called and the calling nodes. For simplicity we refer to this model as UMTS-single, which is presented in Fig. 4.

Fig.4. Pure UMTS model with a common core (UMTS-single)

The simulation models are developed using the OPNET 14.5 simulator environment. The ability of the four models to sustain the expected QoS for real-time VoIP traffic is measured independently in terms of throughput, average end-to-end delay, and delay variation. Based on these metrics, a variety of experiments are simulated for each model in the framework, while attention is also paid to the codec used by the VoIP packets.
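The three metrics named above can be illustrated with a small, simulator-independent computation. The Python sketch below is not the OPNET implementation; it assumes hypothetical per-packet records (send time, size, receive time) and takes delay variation to be the variance of the per-packet end-to-end delays, which is an assumption made here for illustration.

```python
from statistics import mean, pvariance


def voip_metrics(sent, received, duration_s):
    """Compute throughput, mean end-to-end delay and delay variation.

    sent:       {packet_id: (send_time_s, size_bytes)}  (hypothetical record format)
    received:   {packet_id: receive_time_s}
    duration_s: length of the observation window in seconds
    """
    # Only packets that actually arrived contribute to delay and throughput.
    delivered = [pid for pid in received if pid in sent]
    delays = [received[pid] - sent[pid][0] for pid in delivered]
    received_bytes = sum(sent[pid][1] for pid in delivered)
    return {
        "throughput_Bps": received_bytes / duration_s,
        "mean_delay_s": mean(delays) if delays else None,
        # Assumption: delay variation = population variance of the delays.
        "delay_variation_s2": pvariance(delays) if delays else None,
    }


if __name__ == "__main__":
    sent = {1: (0.00, 160), 2: (0.02, 160), 3: (0.04, 160)}
    received = {1: 0.10, 2: 0.14}  # packet 3 is lost in this toy trace
    print(voip_metrics(sent, received, duration_s=1.0))
```

Comparing `throughput_Bps` with the offered load (the bytes per second sent by the calling node) gives the share of traffic that each model actually delivers.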

IV. SIMULATION RESULTS

We need to verify how the differences in the characteristics and internal structure of the four models in the framework affect the transmission of the audio streams. Thus, we run each simulation model for one hour to measure the performance metrics. This section provides the results obtained by conducting multiple scenarios and simulation runs of the proposed framework.

A. Throughput of audio streams

To measure the bit rate of the audio streams, we first measure the bytes per second of VoIP data sent by the calling node in each model, which is presented in Fig. 5. As Fig. 5 shows, the calling nodes load the same amount of audio data into all four models. This similarity is necessary to provide fair comparative conditions for the models and to indicate how much of the sent data is in fact received by the called nodes in each model.

Fig.3. Pure UMTS model with two independent cores (UMTS-dual)



Fig.5. Overall audio stream load sent by the calling node in all four models

In other words, comparing the amounts of sent and received audio data describes the ability of each model to handle the same offered load. From the results in Fig. 5, we also observe that the type of codec has a direct impact on the amount of traffic sent from a device. To simplify the analysis, we redraw the graph by type of codec, which conveys a much clearer picture of the codec impact. The results are provided in Fig. 6 A-C for G.711, G.729, and G.723, respectively.

Fig.6. Overall audio stream load sent by the calling node based on type of codec: (A) G.711, (B) G.729, (C) G.723

As we can see more clearly now, regardless of the structure of the cellular radio access mode, the transmitted traffic for G.729 and G.723 is much lower than for G.711, and there is only a very small difference between the traffic sent with G.729 and with G.723. The reason is that G.711, unlike G.729 and G.723, uses no compression. Typically, compressing and decompressing the voice at both ends of the call is a time-consuming process that adds latency to the entire voice transmission. The small difference between G.729 and G.723 arises because G.729 uses less compression than G.723.

Having identified the amount of sent traffic, we now measure how much of the sent data is actually received by the called nodes in each model. The throughput measurements for the heterogeneous, pure WiMAX, UMTS-single, and UMTS-dual models are provided in Fig. 7 A-D, respectively.
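The ordering of the codec loads follows from the nominal codec bit rates. As a rough illustration, the Python sketch below uses the standard nominal rates (64 kbit/s for G.711, 8 kbit/s for G.729, and 6.3 or 5.3 kbit/s for G.723.1) to compute the voice payload per second and the compression relative to G.711. Which G.723.1 rate the simulations used is not stated, so both are listed, and packet header overhead is ignored, so these numbers are not expected to match the simulated loads in Figs. 5 and 6.

```python
# Nominal codec bit rates in kbit/s (payload only, no RTP/UDP/IP headers).
CODEC_KBPS = {
    "G.711": 64.0,
    "G.729": 8.0,
    "G.723.1 (6.3k)": 6.3,
    "G.723.1 (5.3k)": 5.3,
}


def payload_bytes_per_second(kbps):
    """Voice payload generated per second for a codec of the given bit rate."""
    return kbps * 1000 / 8


def compression_ratio(codec):
    """How many times smaller the codec's bit rate is than uncompressed G.711."""
    return CODEC_KBPS["G.711"] / CODEC_KBPS[codec]


if __name__ == "__main__":
    for name, kbps in CODEC_KBPS.items():
        print(f"{name:16s} {payload_bytes_per_second(kbps):8.1f} B/s  "
              f"ratio vs G.711: {compression_ratio(name):.1f}x")
```

The gap between G.711 and the two compressed codecs is large, while the gap between G.729 and G.723.1 is small, which matches the qualitative picture described in the text.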


Fig.7. Throughput measurements of the four models: (A) heterogeneous model, (B) pure WiMAX model, (C) UMTS-single model, (D) UMTS-dual model

Several observations follow from the results in Fig. 7. First, regardless of the type of radio access, the amount of received traffic in all models is much less than the amount of sent traffic shown in Figs. 5 and 6. Moreover, the throughput remains at almost the same level in the four models, with slight differences. Both UMTS models achieve the same throughput as each other, which indicates that having a common core or two independent cores does not affect the throughput of the voice traffic. Looking at the details, the throughput received by each called node in the UMTS models is slightly higher than in either the heterogeneous interworking model or the WiMAX model. Ignoring this slight difference, the simulation results confirm that, despite the differences in the radio access networks, the four models do not differ significantly in voice throughput when the same codec is used. The simulation results also suggest that the G.711 codec is beneficial for enhancing the throughput in all the models.

B. Average of end-to-end delay for audio streams

In order to study the performance stability of each model in the proposed framework, we measure the average delay experienced by the voice traffic. The mean end-to-end delay results for the heterogeneous, pure WiMAX, UMTS-single, and UMTS-dual models are depicted in Fig. 8 A-D, respectively.


Fig.8. Mean end-to-end delay results of the four models: (A) heterogeneous model, (B) pure WiMAX model, (C) UMTS-single model, (D) UMTS-dual model

Observing the arrival time of the packets at the called nodes shows a significant difference between the models. The delay in both UMTS models is nearly the same and low, reaching 3 s at most. However, for both the heterogeneous and the pure WiMAX model the delay is at least seven times higher. The lower delay in the UMTS models and the very high delay in the WiMAX model indicate that the high delay in the heterogeneous model is related to merging in the WiMAX radio access. Thus, interworking of WiMAX and UMTS can increase the delay in UMTS networks. Considering that real-time traffic tends to be quite sensitive to delay and delay variation, interconnecting WiMAX and UMTS in such cases may not be desirable to end users or voice providers. The results also show that the G.723 codec has the least delay in the four models.

Since connecting the WiMAX radio access mode to UMTS increases the delay, we also examined the side effect on the mean opinion score (MOS) of the voice packets. We therefore ran the interworking model again to measure the MOS, which is shown in Fig. 9.


Fig.9. MOS in the interworking model


The MOS values presented in Table 1 are used as the metric to evaluate the level of call quality.

Table 1. Voice Quality Levels for VoIP

MOS   Quality
5     Excellent
4     Good
3     Fair
2     Poor
1     Bad

We notice from the interworking results in Fig. 9 that the MOS value is 2, which is considered poor perceived voice quality. These results confirm our expectations, as the delay of the voice packets strongly affects the achieved MOS: the higher the delay, the more it degrades the quality of the calls. Thus, while pure UMTS networks offer a good level of call quality, their interworking with WiMAX results in low call quality for end users.
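The mapping from a MOS value to the quality levels of Table 1 can be expressed as a small lookup. In the Python sketch below, the function name `quality_label` is hypothetical, and rounding a fractional MOS to the nearest listed level is an assumption made for illustration, since Table 1 only lists the integer levels.

```python
# Quality levels as listed in Table 1.
MOS_LABELS = {5: "Excellent", 4: "Good", 3: "Fair", 2: "Poor", 1: "Bad"}


def quality_label(mos):
    """Return the Table 1 quality level for a MOS value between 1 and 5.

    Assumption: fractional scores are rounded to the nearest integer level.
    """
    if not 1 <= mos <= 5:
        raise ValueError("MOS must lie between 1 and 5")
    return MOS_LABELS[int(round(mos))]


if __name__ == "__main__":
    # MOS of 2 is the value reported for the interworking model in Fig. 9.
    print(quality_label(2))
```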


C. Delay variations of audio streams

To further demonstrate the benefits and constraints of the models in the proposed framework, the behavior of the delay variation is studied using the simulated results for each model. The variations in delay introduced by the internal structure of the heterogeneous, pure WiMAX, UMTS-single, and UMTS-dual models are depicted in Fig. 10 A-D, respectively.

Fig.10. Delay variations results of the four models: (A) heterogeneous model, (B) pure WiMAX model, (C) UMTS-single model, (D) UMTS-dual model

The analysis of the delay variation results shows that the delay variations reach their highest values in the pure WiMAX model, while they are lowest in both UMTS models. Consequently, the high delay variations in the WiMAX model directly affect the interworking model by greatly increasing its variations in delay. Comparing the two UMTS models also shows higher delay variation when the UMTS design includes two independent cores, one for the calling party and the other for the called party. Thus, to get the best performance from UMTS networks in terms of delay variation, it is better to have both call parties in the same core to avoid the propagation delay of the core network components. Also based on the results, avoiding the G.729 codec in WiMAX networks can significantly improve the delay variations of the voice packets.

V. CONCLUSION


In this work, we presented a framework consisting of four separate cellular network models for developing new techniques under the demand for higher data rates, particularly for cellular interworking. From the simulation results, it is concluded that while the models provide similar data rates, the pure UMTS models achieve greater QoS performance for call services in terms of lower delay and delay variation. Due to the complexity of the internal structure of the WiMAX core, end users experience poor call quality in these networks and in any other network that is merged with them. Additionally, due to the high compression rate of the G.723 codec, applying this codec to compress and decompress the voice packets in cellular-based networks is not recommended, especially under bandwidth limitations, as it leads to a high degradation of network performance.

REFERENCES

[1] G. Vijayalakshmy and G. Sivaradje, WiMAX-UMTS Converging Architecture with IMS Signaling analysis to achieve QoS, Elsevier 2nd International Conference on Communication, Computing & Security (CCCS-2012), 2012.
[2] I.I. Mohamed, A.A. Hadi, R. Othman, A. Oudah, Performance Analysis of Seamless Vertical Handover in 4G Networks, Journal of Theoretical and Applied Information Technology, Vol. 79, No. 2, 2015.
[3] A.L. Karthika, M.G. Sumithra, and A. Shanmugam, Performance of voice in integrated WiMAX-WLAN and UMTS-WLAN, International Journal of Innovative Research in Science, Engineering and Technology, Vol. 2, No. 4, 2013.
[4] V. Bharathi and L. Nithyanandan, Efficient Cooperative Relaying in UMTS-WiMAX-WLAN Overlaid Heterogeneous Networks, Elsevier Proceedings of International Conference on Advances in Communication, Network, and Computing, CNC, 2014.
[5] P. Mehta and S. Baghla, Performance Evaluation of Heterogeneous Networks for Various Applications Using OPNET Modeler, International Journal on Recent and Innovation Trends in Computing and Communication, Vol. 3, No. 6, 2015.
[6] O. Arafat, M.A. Gregory, and M.M.A. Khan, Interworking Architecture between 3GPP IMS, Mobile IP and WiMAX in OPNET, IEEE 2nd International Conference on Electrical, Electronics and System Engineering (ICEESE), 2014.
[7] E. Kalaiselvi, S. Kokila, and G. Sivaradje, A Novel Resource Provisioning Algorithm for the Integrated UMB-WiMAX-WLAN Overlay Networks, International Journal on Applications in Information and Communication Engineering, Vol. 1, No. 12, 2015.
[8] O. Khattab and O. Alani, An Overview of Interworking Architectures in Heterogeneous Wireless Networks: Objectives, Features and Challenges, Proceedings of the Tenth International Network Conference (INC2014), 2014.
[9] A.A. Atayero and E.I. Adegoke, Interworking Architectures in Heterogeneous Wireless Networks: An Algorithmic Overview, International Journal of Computer Applications, Vol. 48, No. 9, 2012.
[10] R.A. Hamada, S.A. Hanaa and M.I. Abdalla, SIP-Based Mobility Management for LTE-WiMAX-WLAN Interworking Using IMS Architecture, International Journal of Computer Networks (IJCN), Vol. 6, No. 1, 2014.
[11] O. Khattab, Improving Initiation Phase for Vertical Handover in Heterogeneous Mobile Networks, International Journal of Engineering Trends and Technology (IJETT), Vol. 29, No. 3, 2015.
[12] G. Vijayalakshmy and G. Sivaradje, Loosely Coupled Heterogeneous Networks Convergence using IMS-SIP-AAA, International Journal of Computer Applications, Vol. 85, No. 16, 2014.
[13] B.S.K. Reddy and B. Lakshmi, BER Analysis with Adaptive Modulation Coding in MIMO-OFDM for WiMAX using GNU Radio, I.J. Wireless and Microwave Technologies, Vol. 4, 2014.
[14] A.A. Subail, M. Ahsan, and M. Enayetullah, Development of A New Efficient Routing Scheme for WiMAX Mesh Networks, I.J. Modern Education and Computer Science, Vol. 10, 2013.
[15] K. Jakimoski and T. Janevski, Priority Based Uplink Scheduling Scheme for WiMAX Service Classes, I.J. Information Technology and Computer Science, Vol. 8, 2013.

Authors’ Profiles Mina Malekzadeh is an assistant professor and lecturer in the department of computer science at Hakim Sabzevari University. Her research interests include communication networks, network security, VoIP, and system development programming. She holds a Doctoral degree in computer security from UPM, MSc in software engineering from UPM, BSc in computer engineering from SBU.

How to cite this paper: Mina Malekzadeh,"Heterogeneous Cellular Interworking of WIMAX and UMTS for Sensitive Real-Time Services", International Journal of Information Engineering and Electronic Business(IJIEEB), Vol.8, No.6, pp.1-8, 2016. DOI: 10.5815/ijieeb.2016.06.01



I.J. Information Engineering and Electronic Business, 2016, 6, 9-21 Published Online November 2016 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijieeb.2016.06.02

Framework for an E-Voting System Applicable in Developing Economies

Lauretta O. Osho
Email: laurettachristi@gmail.com

Muhammad B. Abdullahi (1) and Oluwafemi Osho (2)
(1) Department of Computer Science, (2) Department of Cyber Security Science
Federal University of Technology, Minna, 920001, Nigeria
Email: {el.bashir02, femi.osho}@futminna.edu.ng

Abstract—Information technology has pervaded virtually every facet of human life. Even in the delivery of governance, information technology has gradually found a place. One of its applications is the use of electronic voting, also known as e-voting, as opposed to the traditional manual method of voting. This form of voting, however, is not immune to challenges generally associated with voting. Two of these include guaranteeing voting access to all eligible voters, and providing necessary voting security. The challenge of accessibility is especially peculiar to developing countries where IT adoption is still relatively low. This paper proposes a framework for an e-voting system that would most benefit developing economies. It ensures availability of the system to only eligible voters and integrity of the voting process through its capacity to identify and prevent ineligible voters and multiple voting. To guarantee accessibility to all eligible voters, it supports both online and offline voting capabilities. Adopting electronic form of voting would provide a more robust, easier to use, and reliable system of voting, which, consequently, would contribute towards enhancing the delivery of democratic dividends.

Index Terms—e-voting, cloud computing, democracy, election, security, availability.

I. INTRODUCTION

E-voting is seen as a means for a nation to improve its electoral process. Non-electronic voting systems are often replete with flaws, including high cost and easy manipulation. Paper ballots, direct counting, and other manual electoral processes have proved unreliable due to rigging, misplacement of ballot papers, overcounting or undercounting of votes, mixing up of votes, and alteration of ballot papers and results. Most of these flaws could be avoided through the adoption of an e-voting system.

Electronic voting provides many advantages. These include ensuring access to the voting process for people living with disabilities, reducing the cost and time of elections, increasing turnout at elections, and providing services that can be trusted [1], [2]. The adoption of e-voting has the potential to boost voters' confidence [3].

Many countries, especially underdeveloped and developing ones, have found it almost impossible to adopt e-voting in their electoral processes due to a barrage of problems. Few, if any, have been able to make substantial use of electronic means of voting. This contrasts with countries in advanced economies, including Austria, Australia, Brazil, Canada, Switzerland, Estonia, France, Great Britain, Japan, Russia, Sweden, and USA, that have gone beyond holding pilots and trials to legally binding e-voting or remote e-voting implementation [4].

Generally, there is a wide gap between developed and developing countries in terms of information and communication technology (ICT) usage. For instance, according to the International Telecommunication Union (ITU), in 2014 internet usage per 100 inhabitants was 32.4 in developing countries, compared to 78.3 in developed countries. Fig. 1 presents the statistics from 2005 to 2014 for developed and developing countries and the world average for each year [5]. These statistics are particularly alarming for some countries. Countries like Eritrea, Timor-Leste, Myanmar, Burundi, Somalia, Guinea, Niger, Sierra-Leone, and Ethiopia all have fewer than 2 persons per 100 inhabitants using the internet. The statistics for the thirteen countries with the lowest internet penetration are displayed in Fig. 2 [6].

An e-voting system must be accessible to every eligible voter and provide a high level of security. However, such systems have been found to be vulnerable to various security challenges and threats, including leakage or disclosure of centrally stored data, selling of votes, and the presence of malware on the voter's machine, to mention but a few [7]. Although there are strong encryption schemes that address confidentiality, integrity, and authenticity, further technological implementations are needed to address availability, which in turn enhances overall security [2]. Essentially, electronic voting requires a higher level of security than other electronic services, including e-commerce [4]. It is a complex system in which every stage of implementation must be secured.

From the foregoing, the implication of low internet penetration, a challenge common to developing countries, is that an e-voting system that is strictly internet-based would not provide equality of access to all eligible voters. This implies a need for a system that can function both online and offline, to provide voting opportunity for eligible voters in areas without internet infrastructure, whilst ensuring the level of security required for an e-voting system.

This paper presents a framework for an e-voting system that supports both online and offline voting capabilities, ensures accessibility and security, and is suitable within the setting of a developing economy to guarantee a free and fair election. The security components focused on in the study are the capacity of the system to identify and prevent ineligible voters and multiple voting, thereby ensuring availability of the system to only eligible voters and integrity of the voting process.

The rest of the paper is organized as follows: section two reviews existing e-voting frameworks and systems. The generic, security, and functional requirements to be satisfied by the proposed system are defined in section three. In section four, the proposed framework is presented. The system components and process design are thereafter elaborated. Other components of the framework, including result synchronization and the addressing of security requirements, are then discussed.

Fig.1. Number of internet users per 100 inhabitants for years 2005 to 2014 in developed and developing countries, and the world (line chart; series: Developed, Developing, World)

Fig.2. Countries with least internet penetration rate

II. REVIEW OF EXISTING E-VOTING FRAMEWORKS/SYSTEMS

In this section, some existing e-voting systems are reviewed. The bases for evaluating the systems are the level of ubiquity (the capacity to provide equality of access) and the security supported.

In the development of any system, e-voting systems inclusive, certain requirements must be considered as primary objectives, since having a perfect system is almost infeasible. One common requirement is security. Most voting data and process security are based on encryption schemes via public key infrastructure and certificates, e.g. in [4], [8], [9], [10]. This can be used in conjunction with biometrics and/or smart cards for secure authentication, as in [11], [12], [13] and [14]. On the other hand, [15] mainly focused on security at the authentication level via implementation of both biometric and smart card technology. Security must be given high priority to ensure voters' trust and the success of the election. Voting data, the voting process, voting channels (including the access control channel and the communication channel), and all voting tools and technologies must be secured. The implication is that security at only one level or aspect of the voting process leaves other levels vulnerable. Attackers could easily leverage this to compromise the integrity of the entire process.

Another important factor, crucial for a successful voting process, is developing a system architecture that actually supports security. This is known as security by architecture. For instance, using a centralized server architecture, as in the case of [10], [14], and [15], creates a single point of attack. A successful attack against the server compromises the entire voting process. For [16], relying solely on a mobile platform to deploy elections in a country like Nigeria is not reliable, because the success of the elections would depend largely on the country's mobile providers. The system proposed by [17] also depends largely on mobile technology, which is used for receiving the one-time password required for authentication. Attacks against communication infrastructures would render the election unsuccessful. For [11], one major drawback is allowing prospective voters to install the voting application on their local computer systems. A compromised system can be used to launch attacks against the remote voting server.
One of the determinants used to assess the level of success in any election process is the percentage of eligible voters actually enfranchised. One method of achieving this lies in the use of multiple voting channels, e.g. in [11], [18]. Typically, e-voting systems, on a large scale, are deployed for online voting. However, considering the challenge of disenfranchising voters in areas without internet, which is common in most developing and underdeveloped nations, some authors developed their systems either only for offline use, for instance in [19], or for both online and offline use, e.g. [13].

Most developing countries use the manual voting system during elections. This possibly could have motivated some authors to develop e-voting systems applicable only to their respective countries. Examples include e-voting systems developed by [19], applicable for elections in Lebanon; [20] for use in Ghana; [16], which incorporated an integrated multilingual voting service platform for online elections in Nigeria; and [14], which integrated India's Integrated Unique Identification (UID) mechanism as one of the requirements for authentication of prospective voters.

Recently, some authors have highlighted the possibilities of deploying e-voting systems on the cloud, for example in [4], [14]. While the e-voting systems proposed in both studies combined distributed server and cloud architectures, they differed in the components to be deployed in the cloud. For [4], the cloud desktop, service distributor, validating server, and publishing server were all located in the cloud. On the other hand, [14] proposed hosting the technologies for handling user requests, authentication and service management, and the vote counting server on the cloud. Table 1 summarises the features of some existing e-voting systems, with the methodologies used, their strengths and weaknesses.

A review of existing proposed e-voting systems reveals some characteristics that make them unsuitable for implementation, especially in developing countries. Basically, the existing e-voting solutions possess either or both of two problems: limitation in inherent capacity to provide voting security, and inability to guarantee equality of access to all eligible voters.

A potential taxonomy of e-voting security could consider security from two perspectives, namely security by architecture and security supportable by the cryptographic scheme used. Security by architecture focuses on the choice, arrangement, and integration of the e-voting infrastructure, including voting platforms, servers and communication channels. This inherently contributes to the security of the voting process. Many of the existing e-voting systems adopt a client-server architecture. The location(s) of the servers used in the system can easily become targets of attack. On the other hand, the cryptographic scheme used determines the security of voting data and channels.

Lack of universality, that is, non-guarantee of voting access for every eligible voter regardless of location, implies that some eligible voters are bound to be disenfranchised. For instance, in Nigeria, many rural areas do not have internet facilities. High speed broadband internet penetration in Nigeria is between 4% and 6% [21].
Moreover, the digital divide between urban and rural areas is very wide. In 2006, out of a population of more than 140 million, more than 72 million were eligible voters [22]. If this is extrapolated to the end of 2012, when the population was estimated to be 166.21 million, and the same proportion of eligible voters is assumed, approximately 86 million of the country's population were eligible voters. By the end of 2012, there were 32.88 internet subscribers per 100 inhabitants in Nigeria [23]. This implies that an e-voting system that is strictly internet-based could potentially disenfranchise around 57.7 million eligible voters, that is, the 67.12% of the estimated 86 million eligible voters who are not internet subscribers.
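The estimate above can be reproduced with a short calculation. The sketch below is illustrative only: the function names are hypothetical, the 2006 inputs are the rounded lower bounds quoted in the text ("more than 140 million" and "more than 72 million"), and it assumes that the share of eligible voters in the population stays constant and that internet subscription is spread evenly across eligible and non-eligible inhabitants.

```python
def extrapolate_eligible(eligible_then, population_then, population_now):
    """Scale the eligible-voter count, assuming a constant eligible share."""
    return eligible_then / population_then * population_now


def voters_without_internet(eligible, subscribers_per_100):
    """Eligible voters who are not internet subscribers (assumed uniform spread)."""
    return eligible * (1 - subscribers_per_100 / 100)


# Rounded 2006 figures from [22] and the 2012 population estimate.
eligible_2012 = extrapolate_eligible(72e6, 140e6, 166.21e6)
print(f"Estimated eligible voters, 2012: {eligible_2012 / 1e6:.1f} million")

# The text rounds the estimate up to 86 million before applying [23].
offline = voters_without_internet(86e6, 32.88)
print(f"Potentially disenfranchised: {offline / 1e6:.1f} million")
```

With the rounded inputs the extrapolation gives roughly 85.5 million, which the text reports as approximately 86 million; applying the non-subscriber share to 86 million gives the 57.7 million figure.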

III. DEFINITION OF REQUIREMENTS OF PROPOSED SYSTEM

Building an e-voting system entails an in-depth understanding of a host of requirements, ranging from legal to regulatory, and from functional to security. It is essential to ensure that voter authentication, voting process security, and security of voting data [13] are given top priority throughout the development life cycle.


Table 1. Summary of characteristics of existing e-voting systems

S/No. | Author(s) and Year | Methodology/Features | Strength of Method | Limitation of Method
1. | Ray, Ray and Narasimhamurthi (2001) [8] | System employs three agents: a ballot distributor, certifying authority, and vote compiler; and public-key cryptography. | Supports adequately requirements including accuracy, democracy, privacy and verifiability. | Suitable only for online (internet) voting; cannot prevent vote buying; poor fraud detection; cast ballot can be traced back to IP address.
2. | Diehl and Weddeling (2006) [9] | Use of blind signature, and public key infrastructure. | Ensures public-ness and transparency of voting. | Suitable only for online voting.
3. | Malkawi, Khasawneh, and Al-Jarrah (2009) [15] | Client and server side architecture; incorporates biometric and smart card authentication. | Increased level of authentication security. Capable of handling elections with multiple scopes simultaneously. | Voting process would not be secure since local voting station and central DB server were placed on the same local network.
4. | Zissis (2011) [4] | Combination of cloud and distributed server architectures, and public-key cryptography. | Enhanced security of voting process and data. | Suitable only for online voting.
5. | Okediran, Omidiora, Olabiyisi, Ganiyu, and Alo (2011) [11] | Uses a 3-tier architecture: client, server, and database tiers, biometric for authentication, and RSA algorithm and SSL/TLS for securing communication. | The system accommodates voting via mobile terminals, remote personal computers, and on-site polling stations. | Installing voting application on client's computer could be exploited by malicious voters; the architecture would not support the principle of secrecy of elections.
6. | Ofori-Dumfuo and Paatey (2011) [20] | Uses a 3-tier architecture: client, server, and database tiers, with authentication achieved via ID and password. | Ease of implementation of system for a small scale election. | Suitable only for online voting; encryption scheme used not discussed.
7. | Olaniyi, Adewumi, Oluwatosin, Bashorun, and Arulogun (2011) [16] | Uses a 3-tier architecture which consists of the front end, middle tier and back end. | Provides a platform that would mainly benefit rural and suburban communities using voter's mother tongue. | Suitable only for online voting; Google Android OS cannot provide adequate security required for e-voting; application of mobile platform only would imply that the success of the election would have to rely primarily on mobile service providers.
8. | Visvalingam and Chandrasekaran (2011) [12] | Uses biometric token and iris pattern for authentication, and employs the internet as the sole medium of communication, in which case entire voting and verification processes are achieved in a single transaction. | Reduced voting process time. | Suitable only for online voting; complete reliance on internet makes the system susceptible to many attacks.
9. | Alaguvel and Gnanavel (2013) [13] | Uses facial recognition, RFID smart card and finger veins sensing for offline e-voting system, and GSM one-time password for online voting. | Multi-level authentication ensures security against unauthorized voters. | The combination of facial recognition, fingerprint, and smart card technology for authentication will significantly increase voting time.
10. | Olaniyi, Arulogun, Omidiora, and Oludotun (2013) [10] | Uses multifactor authentication and cryptographic hash function. | Ensures the integrity of voting process and data. | Using a centralized server creates a single point of attack against voting data.
11. | Gupta, Dhyani, and Rishi (2013) [14] | Implemented using cloud architecture, with country's Integrated Unique Identification (UID) mechanism and biometric for authentication. | Reduced operational cost. Increased security of voting process. | Suitable only for online voting; using a centralized server creates a single point of attack against voting data.
12. | Njogu (2014) [18] | Uses client-server architecture. | Provides platform for multiple voting channels. | Suitable only for online voting; one server is responsible for storing, processing and securing data.
13. | Biswas (2015) [17] | Uses one-time password sent to voter's mobile phone. | Ensures voter's anonymity and provides secure authentication. | Dependent largely on mobile technology.


A. Generic Requirements

These are baseline requirements that an e-voting system must satisfy. Some of the generic requirements include [4], [24]:

• Scalability: ability of the system to be expanded to meet growing demands, whilst still maintaining its performance level.
• Flexibility: ability of a system to be compatible with different standard technologies and platforms. This is also used to describe a system that is accessible to the disabled.
• Mobility: ability of a system to place no restrictions on the location where prospective voters can cast their votes.
• Robustness: ability of the system to cope with execution errors, or continue to operate correctly despite incorrect inputs.
• Democracy: the system must ensure the principle of 'one man-one vote,' that is, no one can vote more than once. It must also ensure no voting by proxy.

B. Security Requirements

Zissis [4] highlighted some security requirements for the different phases of elections. Some of these requirements include:

• The system shall identify and authenticate the voter before accepting and storing the e-vote.
• The system shall protect the confidentiality of the transmitted e-votes.
• The system shall verify the integrity and authenticity of e-votes.
• The system shall protect the integrity and authenticity of e-votes.
• The system shall communicate only with the authentic and unaltered client-side voting system.
• The system should be tamper-resistant and tamper-evident.

C. Functional Requirements

These are system-specific requirements peculiar to the e-voting system. They cover functionality, required input, expected output, and what is to be stored. Some of these requirements are as follows:

• Every eligible voter must be able to access the system.
• An ineligible voter must not be able to access the system.
• An eligible voter must not be able to vote more than once.
• The system must provide alternative accessibility platforms for voters.
• An eligible voter should be able to select alternative voting platforms within a specified deadline.
• More than one voter should be able to vote simultaneously.

IV. PROPOSED FRAMEWORK FOR A SECURE E-VOTING SYSTEM

The main focus of this study is to develop an e-voting system that supports universality of access for all eligible voters, provides security by architecture, and ensures integrity of the voting process. We propose a hybrid electronic voting system that combines the capabilities of the Direct-Recording Electronic (DRE) and online electronic voting systems, offering a platform for both online and offline voting. Specifically, it supports voting via direct-recording, online poll-site, and remote e-voting systems. To support online voting, it is implemented as a cloud application. This implies a Platform as a Service (PaaS) arrangement, where operating system, servers, network, memory and other hardware resources are provided by the cloud provider. As a cloud application, voting is done online either remotely through a PC connected to the internet, or at an e-voting polling kiosk. For eligible voters who reside or intend to cast their votes in locations where the internet is inaccessible, the DRE, administered at a polling kiosk, is implemented as a computer with a keyboard or touch screen for casting votes. The vote tally is stored on the computer's storage, after which the results are synchronized with the cloud. The system also provides an audit system for logging every process. The architecture of the system is depicted in Fig. 3.

The e-voting system provides flexibility in the choice of voting mechanism. For instance, a voter who registers to vote via one voting platform has the opportunity to select any other platform within a stipulated deadline, subject to satisfying stipulated requirements. The use of multiple servers, with separation of duties, ensures that an attack against any one of the servers will not necessarily bring the election to an end.

Fig.3. Architecture of proposed e-voting system

V. SYSTEM COMPONENTS

The architecture of the proposed e-voting system possesses an underlying structure that supports security. This is accomplished by the segregation of the different components of the voting process, with each process handled by a sub-system or server. This inherently protects the entire process, as a single point of failure or susceptibility to attack is avoided.

The e-voting system is made up of various components, each with one or more functions. These include the following:

• Offline Voting Application: this is a stand-alone component of the entire e-voting system, the Direct-Recording Electronic (DRE) machine, made up of a computer with keyboard or touch screen for casting votes, and other essential peripheral devices, with the vote tally stored on the computer's storage.
• Online Voting Interface: this is a cloud-based desktop interface used at the online poll-site and by remote voters to access the e-voting system located in the cloud.
• Validating Server: it contains the registration details of all eligible voters. Hence, a voter must be validated on this server before access to the Voting Server is granted.
• Voting Server: this provides the functionality for casting of votes.
• Vote Storing Server: each vote cast is forwarded by the Voting Server to this server for storage.
• Voters' List Server: the details of voters are separated from the votes cast. The votes are stored in the Vote Storing Server, while the details of the voters are stored on this Voters' List Server.
• Vote Counting Server: used for counting the votes. Votes for each candidate vying in the election are separated in this server.
• Publishing Server: publishes the results of the election(s).
• Audit: this is an audit service that provides for auditing of the entire process.


VI. PROCESS DESIGN

Generally speaking, voting can be divided into three phases: registration (pre-election), voting (election), and tallying (post-election) [4]. Part of the activities under the first phase is registration of voters. This section discusses these phases.

A. Registration Phase

This is the period for registration of prospective voters. During this phase, the e-voting system (both online and offline) is configured only for registration. This means all functionalities that support the election and post-election phases are disabled. Likewise, during the election phase, and throughout the stipulated election period, functionalities that support other phases are disabled. Once the stipulated election period elapses, all functionalities other than those for the post-election phase are disabled.

Registration Requirements

Registration can be done either online (only at an e-voting polling kiosk) or offline, depending on the location of the individual being registered. The following must be provided during registration:

• Bio-data. This includes full name, date of birth, sex, nationality, state of origin, local government area, and other personal details.
• Contact address.
• Biometric (fingerprint) details.
• Facial image.
• Preferred voting medium. This determines the voting category of the voter being registered.
• Mobile number. This is applicable for those registering online.

Fig. 4 is a use-case diagram of the registration requirements.

Fig.4. Registration use case

Registration Procedures

Once a prospective voter satisfies the registration requirements, two unique identification numbers are generated by the system. One is a permanent voter ID number, which combines the state, local government and registration center IDs, and a unique serial number. For instance, using Nigeria as an example, suppose a prospective voter registers in Gwagwalada local government area of the Federal Capital Territory (FCT), in a polling kiosk with ID number 001. Assuming FCT is assigned the code FT, and Gwagwalada the code GWA, the permanent voter ID number of the prospective voter could be FT/GWA/001/0001. The second is a voting ID that is generated based on the medium the voter indicated for voting. For instance, if a prospective voter, during registration, selects to vote online using an e-voting polling kiosk, and this medium is assigned the code ONK, the voting ID number could be ONK/0001. The codes ONP and OFF are assumed for online voting via PC and offline voting respectively. This mechanism would help prevent multiple voting.

A prospective voter can update his or her registration details and/or preferred voting platform not later than a stipulated time before the election date. Once the intended voting medium is changed, a new voting ID, which overrides the previous one, is generated for the individual. Once registration is completed (or updated), a voter e-card is generated, downloadable in PDF format. This contains all registered details of the prospective voter.

B. Voting Phase

This is the main election phase of the entire exercise. It begins with authentication of voters. Only those who are authenticated are allowed to cast ballots. Due to the availability of multiple voting platforms, voters fall into different categories. Each voter category has different authentication requirements that a prospective voter in that category must satisfy before being allowed to cast a vote.

Voter Categories

The e-voting system presents three modes and corresponding platforms for voting, viz. offline, via a polling kiosk; online, via a polling kiosk; and online, via a PC remotely connected to the system. Consequently, these form the three different categories of voters.

Voting Requirements

The requirements for authentication of a voter before being allowed to vote depend on the category of the voter. Based on the foregoing, authentication under the three mechanisms of voting is considered:

• Offline, via a polling kiosk,
• Online, via a polling kiosk, and
• Online, via a PC.
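The two identifiers described under Registration Procedures can be illustrated with a short sketch. The paper does not specify how serial numbers are allocated or how an overridden voting ID is invalidated, so the per-medium counters and the `VotingIdRegistry` class below are assumptions made purely for illustration; only the ID formats and the codes ONK, ONP and OFF come from the text.

```python
# Codes for the three voting media, as assumed in the text.
VOTING_MEDIUM_CODES = {"online_kiosk": "ONK", "online_pc": "ONP", "offline": "OFF"}


def permanent_voter_id(state_code, lga_code, centre_id, serial):
    """Build an ID of the form STATE/LGA/CENTRE/SERIAL, e.g. FT/GWA/001/0001."""
    return f"{state_code}/{lga_code}/{centre_id:03d}/{serial:04d}"


class VotingIdRegistry:
    """Hypothetical registry that keeps only the latest voting ID per voter."""

    def __init__(self):
        self._current = {}  # permanent voter ID -> current voting ID
        self._next_serial = {code: 1 for code in VOTING_MEDIUM_CODES.values()}

    def assign(self, voter_id, medium):
        """Issue a voting ID for the chosen medium, overriding any earlier one."""
        code = VOTING_MEDIUM_CODES[medium]
        voting_id = f"{code}/{self._next_serial[code]:04d}"
        self._next_serial[code] += 1
        self._current[voter_id] = voting_id
        return voting_id

    def is_current(self, voter_id, voting_id):
        """True only for the most recently issued voting ID of this voter."""
        return self._current.get(voter_id) == voting_id


registry = VotingIdRegistry()
voter = permanent_voter_id("FT", "GWA", 1, 1)
first = registry.assign(voter, "online_kiosk")
second = registry.assign(voter, "offline")  # voter changes preferred medium
print(voter, first, second, registry.is_current(voter, first))
```

Because only the latest voting ID is accepted for a given permanent voter ID, a voter who switches medium cannot present the superseded ID on the old platform, which is one way the scheme could help prevent multiple voting.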

The authentication requirements for each voter category, which stem directly from the registration category, are summarized in Fig. 5. It must be stated that while a voter's registration category is expected to determine the voter category, the proposed system allows for flexibility in the choice of voter category. For instance, a prospective voter who registered offline may choose, immediately or at a later time within the permitted time frame, to vote under either of the other two categories.

Authentication Process

For offline and online voting via polling kiosk, the system uses a 3-level authentication. A prospective voter supplies the voter and voting IDs. The fingerprint is then scanned. The system verifies whether these details correspond to those in the database or cloud (in the case of online voting). If the details do not correspond, the prospective voter is allowed to repeat the steps. If failure is still recorded after three attempts, the voter is rejected, and the processes are logged. However, if the captured details are correct, other details of the voter, including bio-data, address, and facial image, are displayed. The voting administrator physically verifies whether the displayed image matches the prospective voter. If it does, the voting administrator confirms the identity of the voter (by clicking on a button), and the voting interface is then presented for the voter to cast his/her ballot. If the displayed image does not correspond to the face of the voter, the voting administrator rejects the voter, and the voting interface is not presented. All system processes are logged.
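The kiosk authentication flow just described can be sketched as follows. This is a simplified illustration, not the authors' implementation: the names `Credentials` and `authenticate_at_kiosk` are hypothetical, the administrator's visual check is modelled as a callback, and fingerprints are compared by exact equality, whereas a real biometric matcher would use a similarity threshold.

```python
from dataclasses import dataclass
from itertools import islice

MAX_ATTEMPTS = 3  # the text allows three attempts before rejection


@dataclass(frozen=True)
class Credentials:
    voter_id: str
    voting_id: str
    fingerprint: bytes  # stand-in for a fingerprint template


def authenticate_at_kiosk(attempts, registered, admin_confirms_face, log):
    """Return True if the voter may proceed to the voting interface."""
    for number, supplied in enumerate(islice(attempts, MAX_ATTEMPTS), start=1):
        if supplied == registered:
            log.append(f"attempt {number}: IDs and fingerprint verified")
            approved = admin_confirms_face()  # administrator compares face to image
            log.append("administrator approved" if approved else "administrator rejected")
            return approved
        log.append(f"attempt {number}: details do not match")
    log.append("voter rejected after failed attempts")
    return False


registered = Credentials("FT/GWA/001/0001", "ONK/0001", b"template-A")
wrong = Credentials("FT/GWA/001/0001", "ONK/0001", b"template-B")
audit_log = []
allowed = authenticate_at_kiosk([wrong, registered], registered, lambda: True, audit_log)
print(allowed, audit_log)
```

Every branch appends to the log, mirroring the requirement that all system processes are logged.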


On the other hand, for remote online voting, the system also uses a 3-level authentication. The same process of filling in the voter and voting IDs and supplying a fingerprint image is followed. In the event that the captured details are correct, a random 6-digit token is generated and sent to the prospective voter's mobile phone (which acts as a token device). The voter enters the value. If it is incorrect, the prospective voter is rejected. If the entered value is correct, other details of the voter, including bio-data, address, and facial image, are displayed. The voter can then proceed to cast his/her vote. All system processes are likewise logged. The authentication processes for all categories are presented in the flowchart depicted in Fig. 6.
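The random 6-digit token used for remote voters could be generated and checked along the following lines. The paper does not say how the token is produced or compared, so the use of Python's `secrets` module and a constant-time comparison is an assumption of this sketch, and delivery of the token to the mobile phone is left out entirely.

```python
import hmac
import secrets


def generate_token(digits=6):
    """Return a zero-padded random numeric token from a cryptographic RNG."""
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


def token_matches(expected, entered):
    """Compare tokens in constant time to avoid leaking digits via timing."""
    return hmac.compare_digest(expected.encode(), entered.encode())


token = generate_token()
print(len(token), token.isdigit(), token_matches(token, token))
```

In a deployment the token would normally also carry an expiry time and be invalidated after one use; neither is specified in the text, so they are omitted here.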

Fig.5. Registration and voting requirements, categories of registration and voters, and respective voting media

Voting Process

The process of voting varies depending on whether voting is done online or offline. For online voting, either via polling kiosk or PC, the following steps are involved:

• A voter
a. Is authenticated by the Validating Server.
b. Is connected to the Voting server to be able to vote.
c. Makes a choice of candidate. In the case of multiple elections, such a voter indicates the elections he/she wants to be part of, and goes on to select candidates accordingly.
d. Encrypts the ballot and personal data using the public key of the Vote Storing server, and sends it to the Voting server.
• The Voting server
a. Signs the ballot.
b. Forwards the signed ballot to the Vote Storing server.
• The Vote Storing server
a. Separates the digital signature and the ballot.
b. Decrypts the ballot, and separates the voter's details from the vote.
c. Encrypts the voter's details with the public key of the Voters' List server, and
d. Forwards these details to the same server.
e. Encrypts the vote with the public key of the Vote Counting server, and
f. Forwards the vote to the same server.

For offline voting, voting consists of the following:

• A voter
a. Is authenticated.
b. Makes a choice of candidate. In the case of multiple elections, such a voter indicates the elections he/she wants to be part of, and goes on to select candidates accordingly.
• The system
a. Encrypts the ballot and voter's personal data using the public key of the cloud Vote Storing server, and
b. Stores them in the system's database.

In both methods of voting, each activity throughout the entire process is logged. Fig. 7 is a sequence diagram for online voting.
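The routing in the online steps above, and in particular the separation of voter details from votes, can be sketched as below. Encryption and signing are deliberately left out so that the flow of data between servers stays visible; the class and attribute names are hypothetical and the in-memory lists merely stand in for each server's storage.

```python
from dataclasses import dataclass, field


@dataclass
class VotersListServer:
    voters: list = field(default_factory=list)  # holds voter details only


@dataclass
class VoteCountingServer:
    votes: list = field(default_factory=list)  # holds anonymous votes only


@dataclass
class VoteStoringServer:
    voters_list: VotersListServer
    counting: VoteCountingServer

    def receive(self, ballot):
        # Separate the voter's details from the vote, then route each part
        # to its own server so that no single store links voter and choice.
        self.voters_list.voters.append(ballot["voter_id"])
        self.counting.votes.append(ballot["candidate"])


@dataclass
class VotingServer:
    storing: VoteStoringServer

    def submit(self, voter_id, candidate):
        # In the proposed system the ballot arrives encrypted and is signed
        # here before forwarding; both steps are omitted in this sketch.
        self.storing.receive({"voter_id": voter_id, "candidate": candidate})


voters_list = VotersListServer()
counting = VoteCountingServer()
voting = VotingServer(VoteStoringServer(voters_list, counting))
voting.submit("FT/GWA/001/0001", "Candidate A")
print(voters_list.voters, counting.votes)
```

After submission the Vote Counting server holds only the candidate choice and the Voters' List server holds only the voter identifier, which is the separation of duties the architecture relies on.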


Fig.6. Voting authentication process

C. Tallying Phase

Tallying, also known as counting, involves collation of the number of votes for each candidate, and publishing the results of the election. For electronic voting, tallying is usually done automatically, even while voting is still ongoing. For offline voting, results generation is synonymous with report generation. The number of votes for each candidate is pulled from the database by the system and published. For online voting, the Vote Storing server separates the votes from the voters' details. The votes are sent to the Vote Counting server. The summary of results for the candidates is sent to the Publishing server. However, before the final results are published, results from offline systems are synchronized with results on the cloud, for final collation.
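A minimal sketch of the final collation step follows, assuming each offline system and the cloud each produce a per-candidate count. The function names and the sample candidate labels are illustrative and not part of the proposed framework.

```python
from collections import Counter


def tally(votes):
    """Count the votes for each candidate from a sequence of choices."""
    return Counter(votes)


def collate(online_tally, offline_tallies):
    """Merge the cloud tally with the tallies from synchronized offline systems."""
    total = Counter(online_tally)
    for offline in offline_tallies:
        total.update(offline)  # adds counts candidate by candidate
    return total


online = tally(["A", "B", "A"])
kiosk_1 = tally(["B", "B"])
kiosk_2 = tally(["A"])
final = collate(online, [kiosk_1, kiosk_2])
print(dict(final))
```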

VII. SYNCHRONIZING WITH THE CLOUD

Synchronization takes place between the offline systems used for voting and the cloud. After completion of voting, an offline e-voting system is brought to the collation center, which must have internet access. In order to synchronize cast ballots with the cloud, the system is first authenticated by the Validating server. Using a combination of the globally unique identifier of the computer system and the voting application certificate, a secure SSL connection is created between the system and the Validating server, thus encrypting exchanged data and guaranteeing their security through the cloud infrastructure. Upon validation, each ballot, together with the voter's personal data, is popped from the database stack, signed by the Voting Server and then forwarded to the Vote Storing Server. Subsequent procedures are similar to those for online voting.
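The synchronization loop can be sketched as follows. The sketch assumes the outcome of machine authentication is already known and passes it in as a flag; the SSL connection, the signing by the Voting Server, and the structure of stored records are not modelled, and the function and parameter names are hypothetical.

```python
def synchronize(offline_store, machine_validated, forward):
    """Pop each stored ballot record and hand it on; return how many were sent."""
    if not machine_validated:
        # The Validating server must accept the offline system first.
        raise PermissionError("offline system failed validation")
    forwarded = 0
    while offline_store:
        record = offline_store.pop()  # stack behaviour: last stored, first sent
        forward(record)               # stands in for signing and forwarding
        forwarded += 1
    return forwarded


stored = [b"encrypted-ballot-1", b"encrypted-ballot-2"]
received = []
count = synchronize(stored, True, received.append)
print(count, stored, received)
```

Popping records as they are forwarded leaves the local store empty after a successful run, so the same ballot cannot be uploaded twice from that machine.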


Fig.7. Online voting process sequence diagram

VIII. ADDRESSING SECURITY REQUIREMENTS

This study mainly tackles security from the architecture point of view, which aids the security of the voting process. However, a successful election is closely tied to satisfying as many security requirements as possible. The implication is that the security of voting data and communication channels cannot be neglected. In the course of elections deployed through electronic means, data, in digital form, is transmitted from one system to another. For instance, the Voting server transmits a signed copy of the ballot to the Vote Storing server. An attacker could gain unauthorized access to this data in transit. It is therefore necessary to make it as difficult as possible for the attacker to make sense of the accessed data.

A. Public Key Cryptographic Scheme

For an e-voting system, cryptographic implementation contributes to the security of the voting data and channels. However, the cryptographic scheme adopted must balance the security, usability, and accessibility of the system. For the proposed e-voting system, the RSA encryption scheme is proposed. RSA was invented by R. Rivest, A. Shamir, and L. Adleman, and has over the years become one of the most widely used encryption schemes. Its security is based on the intractability of the integer factorization problem [25]. It is used to provide secrecy and digital signatures, has been shown to be significantly faster in encrypting and decrypting than some other encryption schemes, including ElGamal [26], and can be implemented with Extensible Authentication Protocols (EAP) to secure cloud data [27]. For instance, for online voting, one of the procedures involves the Vote Storing server encrypting the voter's details with the public key of the Voters' List server and forwarding the details to the same server. Using RSA, the encryption and decryption processes are as follows:

Encryption:
- The Vote Storing server obtains the Voters' List server's authentic public key (n, e), where e is the encryption exponent and n is the modulus.
- The Vote Storing server represents the voter's details as an integer m in the interval [0, n – 1].
- It then computes $c = m^{e} \bmod n$.
- It sends the ciphertext c to the Voters' List server.

Decryption:
- The Voters' List server uses its private key d to recover $m = c^{d} \bmod n$.

B. Randomized Authentication Token

This provides an added layer of security during authentication of online voters accessing the voting system remotely. It is essentially a non-cryptographic solution [4] that involves tokens generated randomly and sent to voters' mobile devices.

C. Digital Signature

A digital signature is the electronic equivalent of the traditional signing technique [4]. It is a number that depends on some secret known only to the signer and on the content of the signed message [25]. In the proposed system, the voter encrypts the ballot and personal data using the public key of the Vote Storing server and sends them to the Voting server. The Voting server then signs the ballot and forwards it to the Vote Storing server. Thereafter, the Vote Storing server separates the digital signature from the ballot. The Voting server uses the RSA algorithm for signing the votes.
The signing process is as follows: the Voting server has a public key (n, e) and a private key d, b is the ballot to be signed, and k is a random number between 1 and n.
- The Voting server computes $\tilde{b}$ from the ballot b, an integer in the interval [0, n – 1].
- It then computes $s = \tilde{b}^{d} \bmod n$.
- The signature of the Voting server is s.

The Vote Storing server verifies the Voting server's signature s and recovers the ballot b using the following steps:
- It obtains the Voting server's public key (n, e).
- It computes $\tilde{b} = s^{e} \bmod n$.
- It then recovers b from $\tilde{b}$.

D. Separation of Duty

This is an architectural approach in which functions are separated and a server is dedicated to each function. The technique supports the security of voting. For instance, to ensure privacy of votes, the Vote Storing server separates voters' details from ballots: voters' details are sent to the Voters' List server, while the ballots are sent to the Vote Counting server.
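The RSA operations used in this section (encryption and decryption of voter details, signing and recovery of a ballot) can be illustrated with a toy numeric example. The sketch below is not the system's implementation: the function names are hypothetical, the primes 61 and 53 are far too small for real use, and no padding is applied. A deployed system would rely on a vetted cryptographic library with key sizes of 2048 bits or more. The modular inverse via pow(e, -1, phi) assumes Python 3.8 or later.

```python
def make_toy_keypair(p=61, q=53, e=17):
    """Build a tiny RSA key pair from two small primes (illustration only)."""
    n = p * q                      # modulus
    phi = (p - 1) * (q - 1)        # Euler's totient of n
    d = pow(e, -1, phi)            # private exponent: e * d = 1 (mod phi)
    return (n, e), d


def encrypt(m, public_key):
    """Compute c = m^e mod n for an integer m in [0, n - 1]."""
    n, e = public_key
    if not 0 <= m < n:
        raise ValueError("message must lie in [0, n - 1]")
    return pow(m, e, n)


def decrypt(c, d, n):
    """Recover m = c^d mod n with the private key d."""
    return pow(c, d, n)


def sign(b_tilde, d, n):
    """Compute the signature s = b_tilde^d mod n."""
    return pow(b_tilde, d, n)


def recover(s, public_key):
    """Recover b_tilde = s^e mod n from a signature using the public key."""
    n, e = public_key
    return pow(s, e, n)
```

With the default parameters, the key pair is n = 3233, e = 17, d = 2753; encrypting m = 65 gives c = 2790, and decrypting c returns 65. Signing and recovery are the same modular exponentiations with the roles of d and e exchanged, which is why a single key pair can serve both purposes in the textbook scheme.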

IX. CONCLUSION

The benefits of electronic voting over manual voting cannot be overemphasized. This study has contributed to existing knowledge primarily by presenting a system with an architectural framework that makes voting accessible to virtually all categories of voters, so that they can be enfranchised, and that supports the security of voting data and processes. The architecture of the system inherently supports the security of voting data by separating duties and assigning them to different servers. However, this assertion was not evaluated; hence, this area is open for further study. This study has focused on two elements of information security: integrity and availability. Future research could consider confidentiality across the entire process. Another area worthy of further exploration is the comparison of different cryptographic schemes to determine which would be most suitable for the system, especially considering its cloud nature.

REFERENCES

[1] V. Gupta, “e-Voting: Move to intelligence suffrage,” SETLabs Briefings, vol. 9(2), pp. 3–8, 2011.
[2] D. Zissis and D. Lekkas, “Securing e-Government and e-Voting with an Open Cloud Computing Architecture,” Government Information Quarterly, vol. 28, pp. 239–251, 2011.
[3] O. Osho, V. L. Yisa, and O. J. Jebutu, “E-Voting in Nigeria: A Survey of Voters' Perception on Security and Other Trust Factors,” Proceedings of the International Conference on Cyberspace Governance, Abuja, FCT, pp. 202–211, November 2015. doi: 10.1109/CYBERAbuja.2015.7360511.
[4] D. Zissis, “Methodologies and Technologies for Designing Secure Electronic Voting Information Systems,” 2011. Unpublished PhD thesis, University of Aegean.
[5] ITU. “ITU Key 2005 – 2014 ICT Data,” 2015. Retrieved May 17, 2015 from http://www.itu.int/en/ITUD/Statistics/Documents/statistics/2014/ITU_Key_20052014_ICT_data.xls
[6] Internet Society. “Global Internet Penetration.” Retrieved May 17, 2015 from http://www.internetsociety.org/map/global-internetreport/?gclid=CPi5mqrwxsUCF UT n wgodrxUALg
[7] S. Chaeikar, M. Jafari, H. Taherdoost, and N. Chaeikar, “Definitions and Criteria of CIA Security Triangle in Electronic Voting System,” International Journal of Advanced Computer Science and Information Technology, vol. 1(1), pp. 14–23, 2012.
[8] I. Ray, I. Ray, and N. Narasimhamurthi, “An Anonymous Electronic Voting Protocol for Voting over the Internet,” Proceedings of the Third International Workshop on Advanced Issues of E-Commerce and Web-Based Information Systems, San Juan, CA, pp. 188–190, June 2001.
[9] K. Diehl and S. Weddeling, “Online Voting Project-New Developments in the Voting System and Consequently Implemented Improvement in the Representation of Legal Principles,” Electronic Voting, pp. 213–222, 2006.
[10] O. M. Olaniyi, O. T. Arulogun, and E. O. Omidiora, “Design of Secure Electronic Voting System Using Multifactor Authentication and Cryptographic Hash Functions,” International Journal of Computer and Information Technology, vol. 2(6), pp. 1122–1130, 2013.
[11] O. O. Okediran, E. O. Omidiora, S. O. Olabiyisi, R. A. Ganiyu, and O. O. Alo, “A Framework for a Multifaceted Electronic Voting System,” International Journal of Applied Science and Technology, vol. 1(4), pp. 135–142, 2011.
[12] K. Visvalingam and R. M. Chandrasekaran, “Secured Electronic Voting Protocol Using Biometric Authentication,” Advances in Internet of Things, vol. 1, pp. 38–50, 2011. doi:10.4236/ait.2011.12006.
[13] R. Alaguvel and G. Gnanavel, “Offline and Online E-Voting System with Embedded Security for Real Time Application,” International Journal of Engineering Research, vol. 2(2), pp. 76–82, 2013.
[14] A. Gupta, P. Dhyani, and O. P. Rishi, “Cloud based e-Voting: One Step Ahead for Good Governance in India,” International Journal of Computer Applications, vol. 67(6), pp. 29–32, 2013.
[15] M. Malkawi, M. Khasawneh, O. Al-Jarrah, and L. Barakat, “Modeling and Simulation of a Robust e-Voting System,” Communications of the IBIMA, vol. 8, pp. 198–206, 2009.
[16] O. M. Olaniyi, D. O. Adewumi, E. A. Oluwatosin, M. A. Bashorun, and O. T. Arulogun, “Framework for Multilingual Mobile E-Voting Service Infrastructure for Democratic Governance,” African Journal of Computing and ICT, vol. 4(2), pp. 23–32, 2011.
[17] S. Biswas, “GSM Verification Based Secure E-Voting Framework,” International Journal of u- and e-Service, Science, and Technology, vol. 8(1), pp. 231–238, 2015.
[18] J. I. Njogu, “E-Voting System: A Simulation Case Study of Kenya,” 2014. Unpublished MSc. Thesis, University of Nairobi.
[19] M. Hajjar, B. Daya, A. Ismail, and H. Hajjar, “An E-Voting System for Lebanese Elections,” Journal of Theoretical and Applied Information Technology, vol. 2(1), pp. 21–29, 2006.
[20] G. Ofori-Dwumfuo and E. Paatey, “The Design of Electronic Voting System,” Research Journal of Information Technology, vol. 3(2), pp. 91–98, 2011.
[21] Presidential Committee on Broadband. “Nigeria's National Broadband Plan 2013 – 2018.” Retrieved February 27, 2014 from: http://www.phase3telecom.com/The% 20Nigerian%20National%20Broadband%20Plan%202013 _19May2013%20FINAL.pdf
[22] National Population Commission. “2006 Population and Housing Census. Priority Table, Vol. 4,” 2010. Retrieved February 10, 2014 from http://www.population.gov.ng/ images /Priority%20table%20Vol %204.pdf
[23] ITU. “Percentage of Individuals using the Internet,” 2013. Retrieved February 10, 2014 from http://www.itu.int/en/ITUD/Statistics/Documents/statistics/2013/Individuals_ Internet_2000-2012.xls
[24] G. Z. Qadah and R. Taha, “Electronic Voting Systems: Requirements, Design, and Implementation,” Computer Standards and Interfaces, vol. 29, pp. 378–386, 2007.
[25] A. J. Menezes, P. C. Van Oorschot, and S. A. Vanstone, “Handbook of Applied Cryptography,” CRC Press, 1996.
[26] T. Z. Nwe and S. W. Phyo, “Performance Analysis of RSA and ElGamal for Audio Security,” International Journal of Scientific Engineering and Technology Research, vol. 3(11), pp. 2494–2498, 2014.
[27] S. Marium, Q. Nazir, A. Ahmed, S. Ahthasham, and M. A. Mehmood, “Implementation of EAP with RSA for Enhancing the Security of Cloud Computing,” International Journal of Basic and Applied Science, vol. 1(3), pp. 177–183, 2012.

Authors’ Profiles Lauretta Oluwafemi Osho holds a B.Tech. degree in Mathematics/Computer Science and an M.Tech degree in Computer Science. Her research interests include cloud computing and software development.

Muhammad Bashir Abdullahi received B.Tech (Honors) in Mathematics/Computer Science from Federal University of Technology, Minna, Nigeria, and Ph.D. in Computer Science and Technology from Central South University, Changsha, Hunan, P. R. China. His current research interests include trust, security and privacy issues in data management for wireless sensor and ad hoc networks, cloud computing, big data technology and information and communication security.

Oluwafemi Osho is currently a lecturer in the Department of Cyber Security Science, Federal University of Technology, Minna, Nigeria. He holds a B.Tech. degree in Mathematics/Computer Science and an M.Tech. degree in Mathematics. Before joining the institution, he served as Head of the IT Department of one of the leading mortgage banks in Nigeria. His current research interests include cybersecurity, mobile security, and security analysis. He is a Certified Ethical Hacker (CEH).


How to cite this paper: Lauretta O. Osho, Muhammad B. Abdullahi, Oluwafemi Osho,"Framework for an E-Voting System Applicable in Developing Economies", International Journal of Information Engineering and Electronic Business(IJIEEB), Vol.8, No.6, pp.9-21, 2016. DOI: 10.5815/ijieeb.2016.06.02

Copyright © 2016 MECS



I.J. Information Engineering and Electronic Business, 2016, 6, 22-27 Published Online November 2016 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijieeb.2016.06.03

M-Commerce in Bangladesh – Status, Potential and Constraints

Biman Barua
Senior Lecturer, Department of AMT, BGMEA University of Fashion and Technology, Dhaka-1230, Bangladesh
Email: biman@buft.edu.bd

Abstract—Mobile commerce, often referred to as “M-Commerce” or “mCommerce”, is a new dimension or extension of e-commerce that is performed on mobile devices and Personal Digital Assistants (PDAs) using mobile phone networks. As the number of mobile users increases dramatically, the prospects of m-commerce are also growing day by day in developing countries like Bangladesh. Although many researchers have written about the prospects and adoption of m-commerce, this research presents a statistical analysis of mobile users, mobile internet users, and the current status of m-commerce in Bangladesh. It also examines the number of visitors to stakeholders' sites using ranking tools, the use of mobile apps by customers, and the limitations of mobile commerce adoption in Bangladesh, which were not discussed earlier. I collected data through the web and phone calls from various stakeholders in top-line m-commerce businesses, studied and identified the problems, showed the current status and major barriers of m-commerce, and suggested a methodological framework.

Index Terms—M-Commerce, Barriers, Mobile Commerce, Mobile operators, Bangladesh.

I. INTRODUCTION

The expansion of technology and the revolution in mobile communication in the rural areas of Bangladesh have been a motivating factor in the development of M-Commerce. In present-day business organizations, mobile commerce has been introduced in accounting, administration, retail, telecommunication, and information technology services. In these areas, M-Commerce is not only being broadly accepted but is also increasingly used as a popular means of business and commerce. In this paper, I attempt to give an overview of the essentials of m-commerce, its present statistics, and its limitations.

M-commerce is on a growth track. It is gaining acceptance across the various regions of Bangladesh. This growth can be traced to technological and demographic developments that have affected vital aspects of socioeconomic conditions in today's world. The need for flexibility appears to be a primary driver behind M-Commerce applications, for example, Mobile Entertainment, Mobile Banking, and Mobile Marketing.

In Bangladesh, the adoption of wireless technologies in M-commerce is growing remarkably, as more consumers have a mobile phone than have a PC at home. This is also clear from the figures reported by the Ministry of Finance, Bangladesh, which show that the number of mobile users in Bangladesh is increasing rapidly. Nevertheless, M-commerce is still a relatively new phenomenon compared with other markets in Europe and the Asia Pacific, namely Japan, Hong Kong, Taiwan, and Singapore. Most organizations in Bangladesh are still slow in providing M-commerce services because of the lack of timely and reliable systems for the delivery of physical goods, low bank account and credit card penetration, low income, and the low rate of computer and internet use in Bangladesh.

The main objectives of the research are to discover the current statistics of m-commerce, its future prospects, and the main barriers to M-Commerce adoption in Bangladesh. The paper also offers recommendations for implementing M-Commerce successfully in Bangladesh. The research further considers the opportunity for online payment methods and delivery systems that would support more of the business transaction process in a developing country such as Bangladesh.

II. LITERATURE REVIEW

A. M-Commerce History and Current Statistics

With the rapid growth of mobile phone users and mobile applications over the years, the mobile phone has become a vital part of human life. With the rise in mobile users, mobile commerce has become a high priority for doing business. The use of mobile technology as a payment gateway started in 1997, when Coca-Cola introduced the first two mobile-phone-enabled vending machines in Finland. Customers could send mobile payments to the vending machines via SMS text messages. At the same time and in the same country, an M-Commerce-based banking service was introduced as well. The first online mobile commerce service was started in 1999 in Japan under the name I-mode. I-mode gave customers the ability to browse the net, read email, download entertainment, and access other services. In the U.S., unlimited mobile phone plans were nonexistent among major carriers until a few years ago, whereas in the European markets they have been the standard and, in some cases, the law. While Japan and Europe started 3G in 2001, the U.S. did not introduce 3G until 2003, and Bangladesh not until 2012.

B. M-Commerce in Bangladesh

By the end of 2008, Bangladesh had 44.6 million mobile subscribers across six operators: Grameenphone (47%), Warid (5%), Aktel (18%), Banglalink (23%), Citycell (4%), and Teletalk (2%) (BTRC, n.d.). As of February 2016, the shares were Grameenphone (43%), Airtel (8%), Robi (Aktel) (21%), Banglalink (24%), Citycell (0.64%), and Teletalk (3.25%) (BTRC, n.d.) [11].

Table 1. Number of subscribers in Bangladesh

Operator              No. of Subscribers (millions)
Grameen Phone (GP)    56.132
Banglalink            31.960
Robi                  27.553
Airtel                10.351
Citycell              0.833
Teletalk              4.257
Total                 131.085

The data are shown in the graph below, based on the above subscriber numbers of the different mobile service providers.

Fig.1. No of mobile phone subscribers as on February 2016

III. PROSPECT OF M-COMMERCE

A. Increasing Number of Mobile Subscribers

Currently, six mobile operators provide mobile services in the country. At the end of June 2008, the number of subscribers was 43.7 million (BTRC, Annual Report, 2007-2008). It reached 133.05 million at the end of February 2016 [11], up from 122.657 million in the same period of the previous year, an increase of about 10.4 million subscribers (roughly 8.5 percent) in one year. The telecom base of any nation directly influences online services, because they depend on it to a great extent. The telecom adoption rate has increased considerably in recent years, which means more people are on the verge of using e-business or taking part in e-commerce activities. The comparative scenario of mobile phone subscribers until February 2016 [11] is shown in Fig. 2.

Fig.2. Comparative scenarios of mobile phone subscribers

B. Increasing Number of Internet Users

According to BTRC, the number of mobile internet users reached 53.431 million at the end of January 2016, whereas ISP and PSTN internet users numbered 2.594 million. This expansion in web usage drives the use of the various e-business sites mentioned in this paper.

IV. TYPES OF M-COMMERCE SERVICES OFFERED IN BANGLADESH

The central bank issued guidelines on “Mobile Financial Service for Bank” in September 2011. It has given 10 licenses to banks to perform full-fledged financial services using mobile phones. Two leaders have emerged with the largest customer bases and agent networks: bKash, provided by BRAC Bank, and DBBL Mobile Banking, provided by Dutch-Bangla Bank. Mobile banking has now become a very popular medium for financial activity. Mobile transactions have expanded to the following areas:
- Bill payment through mobile phone
- Mobile remittance/banking
- Buying tickets for bus, train, launch, movie, etc.
- Booking hotels and restaurants
- Mobile marketing and shopping
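The subscriber shares and the year-on-year growth quoted in Sections II and III can be checked directly from the published figures. The short Python sketch below uses the Table 1 values and the two February totals from Section III; the function and variable names are hypothetical and serve only this illustration.

```python
# Subscribers in millions, as listed in Table 1 (February 2016).
subscribers_millions = {
    "Grameenphone": 56.132,
    "Banglalink": 31.960,
    "Robi": 27.553,
    "Airtel": 10.351,
    "Citycell": 0.833,
    "Teletalk": 4.257,
}


def market_shares(counts):
    """Return each operator's share of the total, in percent."""
    total = sum(counts.values())
    return {name: 100.0 * value / total for name, value in counts.items()}


def growth_percent(previous, current):
    """Return the relative change from previous to current, in percent."""
    return 100.0 * (current - previous) / previous


shares = market_shares(subscribers_millions)
# Totals quoted in Section III: February 2015 and February 2016 (millions).
yearly_growth = growth_percent(122.657, 133.05)
```

Rounded to whole percentages, the computed shares match the figures quoted for February 2016, and the Citycell share comes out at about 0.64 percent. The growth between the two February totals is a little under 8.5 percent, which corresponds to roughly 10.4 million additional subscribers.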

V. THE PROBLEM OF THE RESEARCH

The internet first became operational in Bangladesh in 1997 (Debnath, 2007). Initially, the young population of the upper class used the internet for chatting and web surfing. The internet was out of reach for poor people. On the other hand, around 32% of the total population use cellular telephones, so the potential for the development of mobile commerce is greater than that of e-commerce. However, the general public cannot yet get the benefits of the latest m-commerce or e-commerce technology. This research was undertaken to find suitable guidelines for a payment system that would bring many facilities to the general public. It is also important to identify the constraints on implementing m-commerce business in Bangladesh. As mobile subscribers are increasing rapidly, the government should also implement new rules and regulations to expand m-commerce business, so that all customers can shop anywhere at any time without any hassle.

VI. CURRENT SCENARIOS OF M-COMMERCE IN BANGLADESH

A. Survey and Data Analysis among Different Providers

A number of online businesses in Bangladesh are doing M-Commerce. Most of the vendors use the same policies for payment and other aspects. The Alexa rankings of a few Bangladeshi sites are listed below.

Table 2. Subscribers site rank statistics

Provider          Alexa Rank    BD Rank
bikroy.com        5,671         28
ekhanei.com       17,136        98
kenakata.com      9,800,265     no
daraz.com.bd      14,555        77
shoppersbd.com    168,855       794
busbd.com.bd      437,464       2,231
shohoz.com        118,286       523

According to the data, the highest number of users use mobile commerce to buy tickets of different categories, the second highest use it for online shopping, and the third highest use C2C sites to buy or sell products online via mobile. The site rankings from Alexa are presented graphically below.

Fig.3. Alexa rank statistics

Fig.4. Bangladesh rank using Alexa

VII. RESEARCH METHODOLOGY

Statistical data analysis was carried out for the service providers below according to the following criteria. The findings show that most of the companies use almost the same payment system and policies.
- Model (B2C model)
- Target audience
- Payment system
- Coverage area
- Language support
- Revenue model (sales of goods, ads, subscription fees, transaction fees)

VIII. DATA ANALYSIS

A. ekhanei.com

Table 3. ekhanei.com statistics

Model               C2C model
Target Audience     All Populations
Coverage Area       Six Divisions
Payment System      Hand to Hand when delivering the product.
Site Access         No need to register.
Language Support    English, Bengali
Shipping Policy     Hand to Hand

B. bikroy.com

Table 4. bikroy.com statistics

Model               C2C model
Target Audience     All Populations
Coverage Area       Six Divisions
Payment System      Hand to Hand when delivering the product.
Site Access         No need to register.
Language Support    English, Bengali
Shipping Policy     Hand to Hand

C. daraz.com.bd

Table 5. daraz.com.bd statistics

Model               B2B, B2C, C2C model
Target Audience     Bangladesh, Pakistan, Myanmar
Coverage Area       International provider (Bangladesh, Pakistan, Myanmar)
Payment System      Credit Card, bKash, Home pay, Courier pay.
Site Access         No need to register.
Language Support    English, Bengali
Shipping Policy     Hand to Hand, Courier

D. shoppersbd.com

Table 6. shoppersbd.com statistics

Model               B2B, B2C model
Target Audience     All Populations
Coverage Area       Six Divisions
Payment System      Hand to Hand when delivering the product.
Site Access         No need to register.
Language Support    English
Shipping Policy     Hand to Hand, Courier.

E. chaldal.com

Table 7. chaldal.com statistics

Model               B2B, B2C model
Target Audience     All Populations
Coverage Area       Dhaka city, except certain parts of Old town.
Payment System      Hand to Hand when delivering the product, credit card, bKash
Site Access         Need to register.
Language Support    English, Bengali.
Shipping Policy     Hand to Hand, Courier.
Products            Food, Groceries, Home appliances etc.

F. shohoz.com

Table 8. shohoz.com statistics

Model               B2C model
Target Audience     All Populations
Coverage Area       Whole Bangladesh
Payment System      Credit Card, Mobile bank bKash and DBBL, SureCash, MTB
Site Access         Registration required
Language Support    English
Shipping Policy     Hand to Hand, Courier.
Products            Bus, train and launch, event and movie tickets

G. busbd.com.bd

Table 9. busbd.com statistics

Model               B2C model
Target Audience     All Populations
Coverage Area       Whole Bangladesh
Payment System      Credit Card, Mobile bank bKash and DBBL, SureCash.
Site Access         Registration required
Language Support    English
Shipping Policy     Hand to Hand, Courier.
Products            Bus ticket.

IX. MOBILE APPS STATISTICS OF M-COMMERCE PROVIDERS

To explore M-commerce business, the first step is to develop mobile apps that let clients browse and place orders through a mobile phone. Good mobile apps should be user-friendly, so that clients can place an order without any hassle. For statistical analysis, I collected some data from the Google Play store; a few points are shown below in tabular format.

Table 10. Apps statistics from Google play store

Providers           Google Play Data
bikroy.com          Updated: April 1, 2016; Size: 4.5M; Installs: 500,000 - 1,000,000; Current Version: 0.9.57; Requires Android: 4.0 and up; Content Rating: Rated for 3+; Offered By: Bikroy
ekhanei.com         Updated: June 9, 2015; Size: 5.5M; Installs: 500,000 - 1,000,000; Current Version: 10.1.3.28; Requires Android: 2.3.3 and up; Content Rating: Rated for 3+; Interactive Elements: Users Interact, Shares Info
kenakata.com        Updated: December 5, 2014; Size: 4.5M; Installs: 10,000 - 50,000; Current Version: 2.0; Requires Android: 3.0 and up; Content Rating: Rated for 3+; Offered By: Tech Fiesta
daraz.com.bd        Updated: March 23, 2016; Size: 8.0M; Installs: 1,000,000 - 5,000,000; Current Version: 1.8.1; Requires Android: 4.0.3 and up; Content Rating: Rated for 3+; Interactive Elements: Digital Purchases
bdonlineshop.com    Updated: September 13, 2015; Size: 13M; Installs: 1,000 - 5,000; Current Version: 3.0.1; Requires Android: 2.3.3 and up; Content Rating: Rated for 3+; Offered By: NameLess
shoppersbd.com      Updated: January 8, 2016; Size: 2.6M; Installs: 1,000 - 5,000; Current Version: 1.0; Requires Android: 2.3 and up; Content Rating: Rated for 3+; Offered By: DigiWebApps
chaldal.com         Updated: December 16, 2015; Size: 17M; Installs: 5,000 - 10,000; Current Version: 1.1.3; Requires Android: 4.0.3 and up; Content Rating: Rated for 3+; Offered By: Chaldal
busbd.com.bd        Updated: December 29, 2015; Size: 2.2M; Installs: 1,000 - 5,000; Current Version: 1.2; Requires Android: 4.0 and up; Content Rating: Rated for 3+; Offered By: MR Soft BD
Shohoz.com          Updated: March 9, 2016; Size: 6.0M; Installs: 1,000 - 5,000; Current Version: 2.2.0; Requires Android: 4.1 and up; Content Rating: Rated for 3+; Offered By: Shohoz.com
bdtickets.com       Updated: July 24, 2015; Size: 4.7M; Installs: 500 - 1,000; Current Version: 1.1; Requires Android: 3.0 and up; Content Rating: Rated for 3+; Interactive Elements: Digital Purchases; Offered By: Boss Devs

[Data collected from Google play store as on 01st April 2016]

According to the downloaded and installed mobile apps of specific stakeholders, a statistical analysis has been done.

Table 11. Downloaded apps by users

Provider          Installed mobile apps by No. Users
bikroy.com        500,000 - 1,000,000
ekhanei.com       500,000 - 1,000,000
kenakata.com      10,000 - 50,000
daraz.com.bd      1,000,000 - 5,000,000
shoppersbd.com    1,000 - 5,000
chaldal.com       10,000 - 50,000
busbd.com.bd      1,000 - 5,000
Shohoz.com        1,000 - 5,000

X. CHALLENGES AND OPPORTUNITIES

There is no doubt that the main cause of the growth of mobile commerce is customer demand. Shoppers like to have the flexibility to shop online anytime and anywhere. Consumers are increasingly using smartphones to compare products and prices across different online stores. At the same time, this creates demand for retailers to satisfy customers by providing good services and building trust. The reality for most customers in Bangladesh is that they prefer to inspect products in person before buying. Although many M-commerce providers are doing business around the country, many challenges must still be overcome to realize the potential of this business. This research found the following constraints or challenges:
- High cost of internet use
- Poor knowledge of internet marketing
- Absence of government and private companies' involvement
- Lack of ICT education and training
- Poor literacy
- Absence of shipping policy
- Lack of privacy policy
- Restriction of online payment gateway
- Achieving client trust
- Lack of awareness

XI. CONCLUSION

This research has examined some important factors of M-Commerce that had not been covered earlier, such as the current growth of mobile and mobile internet users in Bangladesh. Research was also done on the site ranks of m-commerce providers using ranking tools and on statistics of mobile app use by customers. It also covered M-commerce business scenarios and limitations, which include the absence of secure payment systems, lack of awareness, lack of client trust, and poor ICT knowledge.

XII. LIMITATION AND FUTURE RESEARCH

This research is neither a technical study of m-commerce nor a discussion of government policies; rather, it identifies the opportunities and challenges of m-commerce in Bangladesh based on an empirical study. It would have been better to collect data using survey questionnaires, which could have helped address specific issues that emerged later in the study. Future research could quantify the financial impact of m-commerce. An investigation of governmental strategies in this field could also be worthwhile. Another significant study could determine the total amount of revenue generated from mobile commerce.

REFERENCES

[1] Md. Aminul Islam, Tunku Salha Binti Ahmad, Mohammad Aktaruzzaman Khan & Mohammad Hasmat Ali, Adoption Of M-Commerce Services: The Case Of Bangladesh, World Journal of Management, Vol.2 No.1 March 2010, Pp. 37-54
[2] Ohidujjaman, Mahmudul Hasan and Mohammad Nurul Huda, “E-commerce Challenges, Solutions and Effectiveness Perspective Bangladesh” in International Journal of Computer Applications (0975 – 8887), Volume 70– No.9, May 2013
[3] Mohammed Mizanur Rahman, “Barriers to M-commerce Adoption in Developing Countries – A Qualitative Study among the Stakeholders of Bangladesh”, The International Technology Management Review, Vol. 3 (2013), No. 2, 80-91
[4] The web information company: http://www.alexa.com/siteinfo/clickbd.com
[5] Online shopping, selling or buying: http://www.bikroy.com/
[6] Online shopping, selling or buying: http://www. khanei.com
[7] Online shopping, selling or buying: daraz.com.bd
[8] Online shopping, selling or buying: shoppersbd.com
[9] Online bus reservation service: busbd.com.bd
[10] Online bus, train and hotel reservation service: Shohoz.com
[11] Advance financial inclusion to improve the lives of the poor. Available at- http://www.cgap.org/blog/growthmobile-financial-services-bangladesh
[12] Bangladesh Telecommunication Regulatory Commission [BTRC], Accessed on 1st April 2016 - http://www.btrc.gov.bd/content/mobile-phone-subscribersbangladesh-february-2016
[13] A. Smith, Exploring m-commerce in terms of viability growth and challenges. International Journal of Mobile Communication, 4(6) (2006) pp.682-703
[14] Zhang, J. (2009). Exploring Drivers in the Adoption of Mobile Commerce in China. Journal of American Academy of Business, 15, 64-69.
[15] Swilley, E. (2007). An Empirical Examination of the Intent of Firms to Adopt Mobile Commerce as a Marketing Strategy [dissertation]. Florida (US). Florida State University
[16] Abdel Nasser H. Zaied, Barriers to E-Commerce Adoption in Egyptian SME, I.J. Information Engineering and Electronic Business, 2012, 3, 9-18

Authors' Profiles
Biman Barua received his B.Sc. in Computer Science in 1999 from the University of Madras, India, followed by an M.Sc. in Communication Engineering and an M.Sc. in Computer Science and Engineering. He has been working as a Senior Lecturer at BGMEA University of Fashion and Technology (BUFT), Bangladesh, since 2008; prior to that he worked in various multinational organizations as an IT Manager. His areas of teaching include Electronic Commerce, E-Business and e-Professional.

How to cite this paper: Biman Barua,"M-Commerce in Bangladesh –Status, Potential and Constraints", International Journal of Information Engineering and Electronic Business(IJIEEB), Vol.8, No.6, pp.22-27, 2016. DOI: 10.5815/ijieeb.2016.06.03

Copyright © 2016 MECS

I.J. Information Engineering and Electronic Business, 2016, 6, 22-27


I.J. Information Engineering and Electronic Business, 2016, 6, 28-36 Published Online November 2016 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijieeb.2016.06.04

Methodology of Compiling Web-Applications into Executables, Obtaining Seamless Server Installations and GUI Navigations through Qt and C++ Process Communications Emmanuel C. Paul Department of Mathematics, University of Ilorin, Ilorin, Nigeria Email: ewebstech@gmail.com

Abstract: Server-side scripting languages such as Hyper-Text Preprocessor (PHP) and Active Server Pages, together with their interaction with databases, are among the most popular foundations for large, database-intensive enterprise software today. In this paper, a detailed and efficient method is proposed for compiling Web-Applications into executable formats, easing software distribution where internet access is limited. First, Hyper-Text Preprocessor, a server-side scripting language and one of the most popular web-application languages in the world, is employed as a case study. Second, the methodology of using C++ for writing server installation scripts and creating Graphics User Interface Applications with Qt is shown, with tested applications. Third, Inno Setup Compiler scripts are written and used to compile installation and uninstallation setup files. Finally, we discuss the relevance of offline Web-Applications for solving scientific problems, the enhancement of C++ code for scientific computation with a Graphics User Interface through inter-channel communication using Qt, and the steps required to overcome the challenges faced when installing Web-Applications' servers and databases such as MySQL. The approach is demonstrated by confirming this computational potential in the installation and use of offline web-applications.

Index Terms: Compiling Web-Applications, Qt, C++, GUI, Executables, PHP, Seamless Server Installations, Web-Applications' Servers and Databases.

I. INTRODUCTION
The importance and use of server-side driven Web-Applications have undoubtedly affected the working structure of our world today. According to Lee Babin (2007), Internet scripting technology has come along at a very brisk pace. While its roots are lodged in text-based displays (due to very limited amounts of storage space and memory), over the years it has rapidly evolved into a visual and highly functional medium [2].


It is generally believed that web programming/scripting languages like Hyper Text Markup Language (HTML), Hyper-Text Preprocessor (PHP) and JavaScript are mostly used for website development projects, while other languages like C, C++, Visual Basic or even Python are used to create desktop applications. However, Web-Applications can also be built and distributed as executables (.exe files) without requiring any knowledge of web server configuration from the software users. The software is then navigated and managed by the user just like any other desktop application. The flexibility and ease of developing and using web applications for large network-dependent software projects, such as Computer Based Test (CBT) software and enterprise management software, have made web programming relevant both to the native web developer and to the scientific researcher, especially in developing countries. JavaScript and its frameworks such as jQuery have improved web development, creating avenues for comfortable, fast and efficient client-side programming. PHP is known for its robust nature, viable documentation, stable version releases and frequent updates. PHP's integration with cURL, Perl and its GD library has placed it among the finest tools in the programming field. In this treatise, we present practical examples of how Web Applications can be run locally on computer systems, introduce the use of Qt with C++ to write commands for the automatic installation of the necessary servers on the client's computer, and finally write Inno Setup Compiler scripts that compile everything into .exe files, so that the web application installs and uninstalls like a system application.
The remainder of this paper is organized as follows: Section II gives an overview of the languages used in this paper, Section III contains the C++ methodologies and needed configuration procedures, Section IV entails the GUI application techniques, Section V contains the web application compilation procedures, and the discussion of results is done in Section VI. The Conclusion is given in the final section.


II. KEY LANGUAGES/TECHNOLOGIES USED

A. Hyper-Text Preprocessor (PHP)
PHP is one of the best free open-source server-side scripting languages used by many web-application developers. According to Steve Suehring, Tim Converse and Joyce Park (2009), PHP is a server-side scripting language, usually used to create web applications in combination with a web server, such as Apache. PHP can also be used to create command-line scripts akin to Perl or shell scripts, but such use is much less common than PHP's use as a web language [10]. Matt Zandstra (2000) also noted that PHP's support for Apache and MySQL further secured its popularity. Apache is now the most-used Web server in the world, and PHP can be compiled as an Apache module. MySQL is a powerful free SQL database, and PHP provides a comprehensive set of functions for working with it. The combination of Apache, MySQL, and PHP is all but unbeatable [9]. Tutorials on the working principles of PHP are outside the scope of this work, so no PHP scripts are provided here. Web sites like http://www.hotscripts.com can be browsed to obtain working PHP scripts for testing the procedures in this work.

B. Qt
Qt is a complete C++ application development framework. It includes a comprehensive C++ class library, a RAD GUI development tool (Qt Designer), an internationalization tool (Qt Linguist), a help browser (Qt Assistant) and comprehensive documentation. Qt is very comprehensive in the sense that it possesses:
• 400+ fully documented classes,
• Core libs such as GUI, Utility, Events, File, Print, Network, Plugins, Threads, Date and Time, Image processing, Styles and Standard dialogs,
• Modules like Canvas, Iconview, Network, OpenGL, SQL, Table, Workspace and XML,
• Tools such as Designer, Assistant and Linguist, and finally,
• Extensions like ActiveQt, Motif migration and MFC migration.
The version of the Qt IDE used at the time of this publication is Qt 5.5.1 (MSVC 2013, 32 bit).

C. C++
C++ is a general purpose programming language with a bias towards systems programming. According to the inventor of C++, Bjarne Stroustrup (1997), C++:
• is a better C,
• supports data abstraction,
• supports object-oriented programming, and
• supports generic programming.
This section gives an insight into what this means without going into the finer details of the language definition. Its purpose is to give a general overview of C++ and the key techniques for using it, not to provide the detailed information necessary to start programming in C++.

D. Inno Setup Compiler
Inno Setup Compiler is a free installer for Windows programs. First introduced in 1997, Inno Setup today rivals and even surpasses many commercial installers in feature set and stability. Key features:
• Support for every Windows release since 2000, including: Windows 10, Windows 8.1, Windows 8, Windows Server 2012, Windows 7, Windows Server 2008 R2, Windows Vista, Windows Server 2008, Windows XP, Windows Server 2003, and Windows 2000. (No service packs are required.)
• Extensive support for installation of 64-bit applications on the 64-bit editions of Windows. Both the x64 and Itanium architectures are supported. (On the Itanium architecture, Service Pack 1 or later is required on Windows Server 2003 to install in 64-bit mode.)
• Supports creation of a single EXE to install your program for easy online distribution. Disk spanning is also supported.
• Standard Windows wizard interface.
• Customizable setup types, e.g. Full, Minimal, Custom.
• Complete uninstall capabilities, etc.

III. LET'S BEGIN

A. Setting up the Web-Application running environment
For the purpose of the subsequent discussions, we make the following assumptions:
• That we have created a Web-Application written in PHP to compute the exact solution of some Differential Algebraic Equation (DAE) problems, in which some members of the system are differential equations and the others are purely algebraic, having no derivatives in them.
• That there exist some interested users who need to run this program on their computers and who know nothing about PHP or about servers and their installation. This raises the question of how to distribute the application conveniently, so that the user can install and open it without any help.
• That the web-application scripts are written and kept in a folder in the system path C://


We now go over the procedure for installing Apache 2.4, PHP 5.4 and MySQL (if needed) on your local machine. Installing a Web server (and its related programs) is not as simple as installing commercial applications: there are many variables involved and many things that can go wrong. With patience, however, it can be done without errors. Several components are needed to build a standalone PHP development system. PHP development is often done with a stack called either LAMP (Linux, Apache, MySQL, and PHP) or WAMP (Windows, Apache, MySQL, and PHP).

B. Needed Files
For the purpose of this discussion, pre-configured versions of Apache 2.4, PHP 5.4 and MySQL 5.5 can be downloaded at https://drive.google.com/open?id=0B2ORI93vyUo8RGlmeVVadmVLdjg. Please note that all necessary basic configuration has been done in the Apache conf files, and the php.ini settings have been set appropriately for use on any machine on which they are installed. If a different Apache, PHP or MySQL server is used, we suggest also downloading a copy of these files and cross-checking against them whenever installation problems are encountered. We will configure the downloaded files when we deal with the compiling procedures.

C. Methodology
We simply create a GUI application using the Qt framework to help the user navigate to the address where the web-application is stored on the server. The GUI application thus acts as a bridge between the web-app and the user. We discuss and show in practice how to write the configuration script in C++ and how to make the GUI application run that process asynchronously, communicate with it and present the results generated by the C++ server installation process to the user. Therefore, after compiling, the user runs the setup file and gets access to an (exe) application that either lets him install the servers by clicking a button or installs them automatically when the (exe) application is opened, and that can also perform other functions useful for the web-application, such as navigating to it. Although the whole process is a bit tricky, and can be burdensome and time-consuming for non-Qt programmers to understand, the examples and explanations here should make it easy to develop and implement.

D. Writing the Server Installation Program with C++
We now write a simple C++ program to access the server configuration files and install the servers as services. The purpose of installing them as services is to enable automatic start-up of the servers during system start-up.


We now create the C++ file named installation.cpp
 1 #include <iostream>
 2 #include <stdlib.h>
 3 #include <stdio.h>
 4 using namespace std;
 5 void mysqlserver();
 6 void startapache();
 7 void startmysql();
 8 int main()
 9 {
10   system("C:\\MySoftwareFolder\\Apache24\\bin\\httpd -k install");
11   mysqlserver();
12   startapache();
13   startmysql();
14 }
15 void startapache()
16 {
17   system("C:\\MySoftwareFolder\\Apache24\\bin\\httpd -k start");
18 }
19 void startmysql()
20 {
21   system("C:\\MySoftwareFolder\\mysql\\bin\\mysqld");
22   cout << "Program Complete" << endl; exit(0);
23 }
24 void mysqlserver()
25 {
26   system("C:\\MySoftwareFolder\\mysql\\bin\\mysqld --install");
27 }
Lines 1 to 3 include the standard header files needed for the task. Lines 5, 6 and 7 declare prototypes for the functions mysqlserver(), startapache() and startmysql(). This is needed because these functions are defined after main(); calling them on Lines 11, 12 and 13 without a prior declaration would produce a compilation error. Line 10, inside the main() function, performs a system command that installs the Apache server as a service, provided the path in its argument is correct. Line 11 installs the MySQL server, while Lines 12 and 13 start the Apache and MySQL servers respectively by performing the system commands on Lines 17 and 21. Once installed as services and started successfully the first time, these servers require no further effort to start again, because they start automatically during system start-up. For this discussion we take MySoftwareFolder as the folder into which the intended software is to be installed, i.e. C:/MySoftwareFolder/.../. This is where the earlier downloaded servers are extracted. The files can also be extracted to any other folder, as long as the path passed as the argument to the system() function matches it. The system() function then performs an automated command-line shell/batch operation by looking for the path specified as its argument and, if found, running the command. If successful, a success message shows up; otherwise, an error message is shown. The Apache 2.4 server does not install itself as a service twice: once installed, it will not install again when next told to, unless it no longer exists on the local disk or has been stopped manually. The same goes for the MySQL server. Now compile installation.cpp with a C++ compiler to get an executable called installation.exe. Move installation.exe into a folder called bin inside MySoftwareFolder to produce the path C:/MySoftwareFolder/bin/installation.exe. When this is done, we are set to move on to the next process.
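The installation program above ignores the value returned by system(), so a failed step (for example, a wrong path) goes unnoticed and the following steps still run. A minimal sketch of a stricter variant is given below, assuming the same folder layout as installation.cpp. The names Step, installation_steps, run_steps and system_runner are hypothetical helpers introduced only for illustration, and the runner parameter is an assumption that lets the command executor (normally std::system) be replaced by a stub when testing.

```cpp
#include <cstddef>
#include <cstdlib>
#include <functional>
#include <string>
#include <vector>

// One installation step: a human-readable label and the shell command to run.
// (Hypothetical helper type, not part of the original listing.)
struct Step {
    std::string label;
    std::string command;
};

// Builds the same four commands as installation.cpp for a given install root,
// e.g. "C:\\MySoftwareFolder".
std::vector<Step> installation_steps(const std::string& root) {
    return {
        {"install Apache service", root + "\\Apache24\\bin\\httpd -k install"},
        {"install MySQL service",  root + "\\mysql\\bin\\mysqld --install"},
        {"start Apache",           root + "\\Apache24\\bin\\httpd -k start"},
        {"start MySQL",            root + "\\mysql\\bin\\mysqld"},
    };
}

// Runs the steps in order and stops at the first one whose runner result is
// non-zero. Returns the index of the failing step, or -1 if all succeeded.
int run_steps(const std::vector<Step>& steps,
              const std::function<int(const std::string&)>& runner) {
    for (std::size_t i = 0; i < steps.size(); ++i) {
        if (runner(steps[i].command) != 0) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Default runner: forwards the command to std::system, as installation.cpp does.
int system_runner(const std::string& command) {
    return std::system(command.c_str());
}
```

A caller could invoke run_steps(installation_steps("C:\\MySoftwareFolder"), system_runner) and, when the result is not -1, print the label of the failing step so that the GUI can show which part of the installation went wrong.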

IV. DEVELOPING THE GUI INTERFACE USING QT

A. Getting and setting up Qt IDE for C++
The Qt IDE for C++ can be downloaded from its website at https://www.qt.io/download-open-source/section-2. Problems with installation might be encountered if the wrong bit version for the development system is downloaded. The version used for this work is 64-bit Qt 5.5.1. Most functions covered during this discussion are explained for the benefit of the novice Qt user and to extend the knowledge of Qt developers. The example given here can serve as a starting point for further development along similar lines. Graphics User Interface applications can be developed in Qt either by using the built-in Qt Visual Designer or by hand-coding the designs you need. For large projects, many Qt developers prefer the Qt Visual Designer because they find it more natural and faster than hand-coding, and they want to be able to experiment with and change designs more quickly and easily than is possible with hand-coded forms. Using the Qt Designer, here is how we create a dialog with two buttons in a horizontal layout and a label that displays 'Welcome to My App'. If the Qt IDE is configured properly, create a new project: choose Application - Qt Widgets Application - enter the project name - set the class name to Myguiapp and change the base class to QDialog - Save. Once the project has been created, click on the "Design" tab, drag two Push Buttons from the widget box (the panel between the toolbar and the dialog's UI form) onto the form and rename them Start Application and Quit Application respectively. Highlight them both and press Ctrl+H on your keyboard to lay them out horizontally. Drag a Label onto the form above the buttons and set its text to "Welcome to My App". Highlight them all and press Ctrl+L on your keyboard to lay them out vertically. If all goes well, the result should be identical to the image below


Fig.1. Qt Designer’s Window

Here is the source code. First we create the dialog's header file myguiapp.h
 1 #ifndef MYGUIAPP_H
 2 #define MYGUIAPP_H
 3 #include <QProcess>
 4 #include <QDialog>
 5 namespace Ui {
 6 class Myguiapp;
 7 }
Lines 1 and 2 protect the header file against multiple inclusion. Line 3 includes the QProcess class, which is used to start external programs and to communicate with them. Line 4 includes the definition of QDialog, the base class for dialogs in Qt. QDialog inherits QWidget. Next, we define Myguiapp as a subclass of QDialog
 8 class Myguiapp : public QDialog
 9 {
10   Q_OBJECT
11 public:
12   explicit Myguiapp(QWidget *parent = 0);
13   ~Myguiapp();
The Q_OBJECT macro at the beginning of the class definition on Line 10 is necessary for all classes that define signals or slots. The Myguiapp constructor is typical of Qt widget classes. The parent parameter specifies the parent widget. The default is a null pointer, meaning that the dialog has no parent. Line 13 declares the destructor, which releases the resources held by the dialog.
14 private slots:
15   void begin_installations();
16   void on_action_started();
17 private:
18   Ui::Myguiapp *ui;
19   QProcess myProcess;
20 };
21 #endif // MYGUIAPP_H


In the class's private section, we declare two slots. The first slot is called whenever the start button emits a clicked signal, while the second slot is called whenever the process emits the readyRead() signal. The slots keyword is, like signals, a macro that expands into a construct that the C++ compiler can digest. Line 19 declares a QProcess object which will be used in the Myguiapp.cpp implementation file. We now look at the Myguiapp.cpp file, which implements the Myguiapp dialog class.
22 #include "myguiapp.h"
23 #include "ui_myguiapp.h"
24 #include <QDesktopServices>
25 #include <QUrl>
26 #include <QString>
27 #include <QMessageBox>
28 #include <QProcess>
29 #include <QFile>
30 #include <QDir>
31 Myguiapp::Myguiapp(QWidget *parent) :
32   QDialog(parent),
33   ui(new Ui::Myguiapp)
34 {
35   ui->setupUi(this);
36   connect(ui->pushButton, SIGNAL(clicked(bool)), this, SLOT(begin_installations()));
37   connect(ui->pushButton_2, SIGNAL(clicked(bool)), this, SLOT(close()));
38   connect(&myProcess, SIGNAL(readyRead()), this, SLOT(on_action_started()));
39 }
Lines 24 to 30 include the necessary header files. On Line 32, we pass the parent parameter on to the base class constructor. On Line 36, we connect the clicked() signal of the first pushbutton to the slot that runs the installation and reports its result. Line 37 connects the clicked() signal of the second pushbutton to a slot that closes the GUI application. Line 38 connects the readyRead() signal of the process to a slot that handles its output: when QProcess executes an external process, it emits readyRead() whenever data is available to be read from that process.
40 Myguiapp::~Myguiapp()
41 {
42   delete ui;
43 }
44 void Myguiapp::begin_installations()
45 {
46   QString program;
47   program = "C:/Easytrades/bin/syscommands.exe";
48   myProcess.setProcessChannelMode(QProcess::MergedChannels);
49   myProcess.start(program);
50 }
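What QProcess does for the dialog in MergedChannels mode (start a program, read its standard output and standard error as one stream, then inspect the exit code) can be illustrated without Qt. The sketch below is only an analogy under stated assumptions: it uses the POSIX popen()/pclose() calls (mapped to _popen()/_pclose() on Windows) and blocks until the command finishes, whereas QProcess runs the program asynchronously. The names ProcessResult and run_and_capture are hypothetical.

```cpp
#include <cstdio>
#include <string>

#ifdef _WIN32
#define POPEN _popen
#define PCLOSE _pclose
#else
#include <sys/wait.h>
#define POPEN popen
#define PCLOSE pclose
#endif

// Output and exit code of a finished child process (hypothetical helper type).
struct ProcessResult {
    std::string output;  // standard output and standard error, merged
    int exit_code;       // -1 if the process could not be started
};

// Runs `command` through the system shell and captures everything it prints.
// Appending "2>&1" redirects stderr into stdout, which mirrors the effect of
// QProcess::MergedChannels.
ProcessResult run_and_capture(const std::string& command) {
    ProcessResult result{std::string(), -1};
    FILE* pipe = POPEN((command + " 2>&1").c_str(), "r");
    if (pipe == nullptr) {
        return result;
    }
    char buffer[256];
    while (std::fgets(buffer, sizeof buffer, pipe) != nullptr) {
        result.output += buffer;
    }
    int status = PCLOSE(pipe);
#ifdef _WIN32
    result.exit_code = status;  // _pclose returns the exit code directly
#else
    // pclose returns a wait status; extract the exit code on a normal exit.
    result.exit_code =
        (status != -1 && WIFEXITED(status)) ? WEXITSTATUS(status) : -1;
#endif
    return result;
}
```

The pair (output, exit_code) corresponds to what the dialog later obtains from readAll() and exitCode(); the Qt version remains preferable inside a GUI because it does not block the event loop while the installer runs.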

The slot on Line 44 is called whenever the first pushbutton's clicked() signal is emitted. The function declares a QString variable on Line 46 and sets it to the path of the C++ installation program created earlier. Line 48 sets the process channel mode of the QProcess object 'myProcess' to MergedChannels, so that standard output and standard error are read through one channel. Line 49 starts the program.
51 void Myguiapp::on_action_started()
52 {
53   QFile file;
54   QDir::setCurrent("C:/Easytrades");
55   file.setFileName("donotdelete.xml");
56   if(file.exists())
57   {
58     if(!myProcess.waitForFinished()){
59       QMessageBox::warning(this, tr("Error"), tr("<p>Error has occurred. App won't start"));
60     }
The slot on Line 51 is executed whenever the QProcess object emits a readyRead() signal. To prevent multiple installation attempts, we create a file called 'donotdelete.xml' and place it in the installation folder of the application. This file can be of any type. The idea is simply to look for the file in the stipulated directory and install the servers only if the file is present. Once the install program has been executed, the object 'file' deletes that file from its directory, which keeps the installation from happening multiple times. This method is only one of many ways to prevent multiple installations. Line 58 calls the waitForFinished() function to see whether the process has started and ended; if it returns false, a message box is displayed with an error message.
61     else{
62       if(myProcess.Running)
63       {
64         QByteArray newData = myProcess.readAll();
65         int exitstatus = myProcess.exitCode();
66         if(exitstatus == 0)
67         {
68           QMessageBox::information(this, tr("Server Installation"),
69             tr("<h3>Installation Status</h3>" + newData + "\n"));
70           QString link="http://localhost/appurl/";
71           QDesktopServices::openUrl(QUrl(link));
72         } else{
73           QApplication::quit();
74         }
75       }
76     }
77     file.remove();
78   }
79   else{
80     QString link="http://localhost/appurl/";
81     QDesktopServices::openUrl(QUrl(link));
82   }
83 }
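The run-once guard described above can be sketched independently of Qt. The sketch below assumes the marker is an ordinary file whose presence means "installation still pending"; the function names marker_exists and consume_marker are hypothetical, and any file name (such as the donotdelete.xml used in this paper) can be passed in.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Returns true if the marker file can be opened for reading, i.e. it exists.
bool marker_exists(const std::string& path) {
    std::ifstream marker(path.c_str());
    return marker.good();
}

// Returns true at most once per marker: when the file exists it is deleted and
// the caller should run the installation; on later calls the file is gone and
// the function returns false, so the installation is skipped.
bool consume_marker(const std::string& path) {
    if (!marker_exists(path)) {
        return false;
    }
    // std::remove returns 0 on success.
    return std::remove(path.c_str()) == 0;
}
```

Unlike the dialog code, which removes the file after the installer has run, this variant removes it before; which order is preferable depends on whether a failed installation should be retried on the next start.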


If waitForFinished() returns true, execution continues at Line 61. The readAll() function returns its result as a QByteArray; hence, by declaring newData as a QByteArray, we can store the value of myProcess.readAll(). Line 65 stores the process's exit code in exitstatus. An exit code of 0 indicates that the program completed properly, in which case a message is displayed to the user on the status of the installation. Line 71 then opens the link set on Line 70, which is the URL pointing to where the web application is stored in the server's htdocs folder. Otherwise, if the process did not finish properly and did not return an exit status of 0, the application quits. The file (donotdelete.xml) is then deleted from its directory on Line 77. Lines 79 to 82 are executed whenever the file (donotdelete.xml) is not found. We then create the main.cpp file.
85 #include "myguiapp.h"
86 #include <QApplication>
87 int main(int argc, char *argv[])
88 {
89   QApplication a(argc, argv);
90   Myguiapp w;
91   w.show();
92   return a.exec();
93 }
Line 85 includes the myguiapp header file, while Line 86 includes the definition of the QApplication class. For every Qt class, there is a header file with the same name (and capitalization) as the class that contains the class's definition. Line 89 creates a QApplication object to manage application-wide resources. The QApplication constructor requires argc and argv because Qt supports a few command-line arguments of its own. Line 91 makes the object created from the Myguiapp widget visible, and Line 92 passes control of the application on to Qt. At this point, the program enters the event loop. This is a kind of standby mode where the program waits for user actions such as mouse clicks and key presses. Running the application should produce a result similar to the figure below:

Fig.2. Simple GUI Dialog for navigating Web-Application with Qt

B. Qt Applications Running Environment
In order to run a Qt application successfully on a computer other than the one on which it was compiled, or from a path where the Qt installation is not accessible, we need to ship some dynamic-link library (DLL) files that provide the code and other resources the Qt application needs at run time. Some fundamental Qt DLLs must be distributed alongside your application so that it can run on almost any computer system on which it is installed.
• Open the path c:/your-path/Qt5.5.1/5.5/msvc2013_64/
• All contents of the folders 'bin' and 'plugins' need to be copied as they are into the root folder of the application, i.e. MySoftwareFolder. This ensures that all files needed by the application are included without omission.
Next, we run myguiapp.exe and, while it is running, highlight all the files we copied from the bin and plugins folders, which are now in the application's root folder, and delete them. Once the delete button is pressed, all selected DLL files and folders that are not in use by the application are deleted, and the files that are in use are left behind. This way, we keep exactly the DLL files our application needs.

V. COMPILING WITH INNO SETUP COMPILER
For the purpose of this work, the version of Inno Setup Compiler used is 5.5.8. Once it is downloaded and installed, a new script can be created with Ctrl+N. On the Inno Setup Script Wizard dialog, check the 'create a new empty script' box and click Finish. Please note that the wizard can also be used, but it does not provide all available options.

A. Setting & Configuring Server Files
Before we dive into compilation, we need to make sure all files are where they should be and that the paths in the Apache configuration file are correct. Some settings have to be changed in the Apache configuration file C:/your-path/Apache24/conf/httpd.conf. The folder that contains the web application must also be stored inside the htdocs folder, which can be found in the path C:/your-path/Apache24/. First, the ServerRoot has to be changed. For this demonstration, we want the compiled setup to extract its contents during setup into a specified folder called MySoftwareFolder. Since the path to the Apache server files has changed, we need to change some values in the configuration file:
ServerRoot "c:/MySoftwareFolder/Apache24"
Next, we need to change the path to the PHP module to match the directory path on the user's computer.
LoadModule php5_module "c:/MySoftwareFolder/php/php5apache2_4.dll"
Then we do the same for the DocumentRoot, i.e. the directory out of which you will serve your documents.
DocumentRoot "c:/MySoftwareFolder/Apache24/htdocs"
<Directory "c:/MySoftwareFolder/Apache24/htdocs">
Then the CGI directory
<Directory "c:/MySoftwareFolder/Apache24/cgi-bin">
And lastly, the PHP ini directory
PHPIniDir C:/MySoftwareFolder/php
Next, we need to correct some paths in the php.ini file to match the installation directory.
extension_dir = "C:\MySoftwareFolder\php\ext"
session.save_path = "C:\MySoftwareFolder\php\tmp"

B. Compiling
We then create a file with the name compilemyapp.iss. Here is the source code
 1 #define MyAppName "Your Software Name"
 2 #define MyAppVersion "1.0"
 3 #define MyAppPublisher "Publisher's Name"
 4 #define MyAppURL "http://www.example.com"
 5 #define MyAppExeName "Myguiapp.exe"
Lines 1 to 5 define the identity of the to-be-compiled software.
 6 ; NOTE: The value of AppId uniquely identifies this application.
 7 ; Do not use the same AppId value in installers for other applications.
 8 [Setup]
 9 AppId={{2EDA0D0A-43B1-4B39-9A6C-47EA23F4D2E5}
10 AppName={#MyAppName}
11 AppVersion={#MyAppVersion}
12 ;AppVerName={#MyAppName} {#MyAppVersion}
13 AppPublisher={#MyAppPublisher}
14 AppPublisherURL={#MyAppURL}
15 AppSupportURL={#MyAppURL}
16 AppUpdatesURL={#MyAppURL}
17 DefaultDirName=C:/MySoftwareFolder
18 DisableProgramGroupPage=yes

19 LicenseFile=C:\your-path\license.txt ; 20 OutputDir=C:\your-path\Desktop 21 OutputBaseFilename=myguiapp_setup_3.01 22 SetupIconFile=C:\your-path\icon.ico 23 Compression=lzma 24 SolidCompression=yes 25 ;Additional Options 26 AppContact=ewebstech@gmail.com 27 AppCopyright=Copyright (C) 2014-2016 company, Inc. 28 AppSupportPhone=+234 000 000 000 29 ChangesEnvironment=yes 30 CloseApplications=Force 31 VersionInfoVersion=3.0 Line 21 defines the name of the setup file after compilation. Compression lzma is the method of compression employed by the 7-Zip LZMA compressor. It typically compresses significantly better than the zip and bzip methods. Line 29, When set to yes, at the end of the installation Setup will notify other running applications (notably Windows Explorer) that they should reload their environment variables from the registry. Line 30, If set to yes or force and Setup is not running silently, Setup will pause on the Preparing to Install wizard page if it detects applications using files that need to be updated by the [Files] or [InstallDelete] sections, showing the applications and asking the user if Setup should automatically close the applications and restart them after the installation has been completed. If set to yes or force and Setup is running silently, Setup will always close and restart such applications, unless told not to via the command line. If set to force Setup will force close when closing applications, unless told not to via the command line. Use with care since this may cause the user to lose unsaved works. If your installation creates or changes an environment variable but doesn't have ChangesEnvironment set to yes, the new/changed environment variable will not be seen by applications launched from Explorer until the user logs off or restarts the computer. 
32 [Languages]
33 Name: "english"; MessagesFile: "compiler:Default.isl"
34 [Tasks]
35 Name: "desktopicon"; Description: "{cm:CreateDesktopIcon}";
36 GroupDescription: "{cm:AdditionalIcons}"; Flags: unchecked
37 Name: "quicklaunchicon"; Description: "{cm:CreateQuickLaunchIcon}";
38 GroupDescription: "{cm:AdditionalIcons}"; Flags: unchecked; OnlyBelowVersion: 0,6.1

Line 35 gives the user the option of creating a desktop shortcut icon, and Line 37 a quick launch icon in the case of older operating systems. Lines 36 and 38 continue the entries begun on Lines 35 and 37.


39 [Files]
40 Source: "C:\path-where-Myguiapp.exe-app-is-stored\Myguiapp.exe";
41 DestDir: "{app}"; Flags: ignoreversion
42 Source: "C:\path-to-downloaded-apacheserverfolder\Apache24\*";
43 DestDir: "{app}\Apache24"; Flags: ignoreversion recursesubdirs createallsubdirs
44 Source: "C:\Users\EWEBS\Documents\Easytrades\imageformats\*";
45 DestDir: "{app}\imageformats";
46 Flags: ignoreversion recursesubdirs createallsubdirs
47 Source: "C:\Users\EWEBS\Documents\Easytrades\mysql\*";
48 DestDir: "{app}\mysql"; Flags: ignoreversion recursesubdirs createallsubdirs
49 Source: "C:\Users\EWEBS\Documents\Easytrades\php\*";
50 DestDir: "{app}\php";
51 Flags: ignoreversion recursesubdirs createallsubdirs

The user also gets an opportunity to uninstall the application, without the risk of leaving the servers behind.

Source is the path where the files to be compiled reside, while DestDir is the destination folder for those files in the installed application. If the files of a folder are to be extracted into their own separate folder inside the application folder MySoftwareFolder, a '*' is placed after the folder's name in Source, as shown above, and the flags recursesubdirs createallsubdirs are added to the Flags of that entry. Otherwise, Setup would put all the files and sub-folders into the same directory as every other file in MySoftwareFolder. More files can be added by following the same procedure for Source, DestDir and Flags. Note: Don't use "Flags: ignoreversion" on any shared system files.
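As a minimal sketch of the pattern just described, the hypothetical Python helper below formats one [Files] entry for a folder that should keep its own sub-folder under {app}. The function name and the example path are illustrative assumptions and are not part of the original script.

```python
def files_entry(source_dir, dest_subdir):
    """Format one Inno Setup [Files] entry that copies a whole folder tree."""
    # Hypothetical helper: the trailing \* selects everything inside source_dir.
    source = source_dir.rstrip("\\") + "\\*"
    return (
        f'Source: "{source}"; DestDir: "{{app}}\\{dest_subdir}"; '
        "Flags: ignoreversion recursesubdirs createallsubdirs"
    )


# Example with a placeholder path (an assumption, not a path from the paper).
print(files_entry("C:\\your-path\\php", "php"))
```

The generated entry is written on a single line, which is how an entry appears in the .iss file itself.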

With the steps and algorithms outlined above, we can successfully create a system through which web applications can be compiled, distributed and installed easily by anyone. This also gives an opportunity for navigating to the software through a GUI application that acts as a bridge between the Web application, the server and the user. Fig. 2 shows a simple, typical example of a GUI dialog created using Qt that can function as this bridge. Any type of GUI application can be created for this purpose, i.e. any style could be adopted. Therefore, if the outlined procedures are followed, Web applications can be transferred as compiled (exe) applications, their servers can be installed silently through Qt and C++ process communications, and they can be navigated through Qt GUI applications.

52 [Icons]
53 Name: "{commonprograms}\{#MyAppName}"; Filename: "{app}\{#MyAppExeName}"
54 Name: "{commondesktop}\{#MyAppName}"; Filename: "{app}\{#MyAppExeName}";
55 Tasks: desktopicon
56 Name: "{userappdata}\Microsoft\Internet Explorer\Quick Launch\{#MyAppName}";
57 Filename: "{app}\{#MyAppExeName}"; Tasks: quicklaunchicon
59
60 [Run]
61 Filename: "{app}\{#MyAppExeName}";
62 Description: "{cm:LaunchProgram,{#StringChange(MyAppName, '&', '&&')}}";
63 Flags: nowait postinstall skipifsilent

Running this script compiles the files into our setup file, which contains our assumed Web application for solving DAE problems and can now be distributed easily at a compressed size.


Fig.3. Compilation

However, to install the servers, the GUI application has to be opened in administrator mode so that it has the access rights needed for installation. This is typical for Windows operating systems.

VI. DISCUSSION OF RESULT

VII. CONCLUSION In this paper, a methodology is proposed for compiling web applications into executable file formats and for obtaining seamless server installations and GUI navigation through Qt and C++ process communications. The approach shows good potential for the offline installation and usage of web applications. An expanded use of web applications has been illustrated through easy-to-implement procedures that describe the methods for the silent and automated installation of a web server using native C++, for its distribution and offline usage; the steps to be taken when creating basic GUI applications that run the installation and navigation of the web application without complexities on the user's end; and finally, the method for compilation into exes at a well-compressed size. This approach would aid the distribution of offline web applications and enhance opportunities for their usage in developing countries, especially in


countries with little or no internet access. Most lines in the displayed source code were also explained, for clarity on the author's style of implementation.

ACKNOWLEDGMENT

The author wishes to thank the Almighty God for bringing him this far. Also, many thanks to Professor Christopher Thron, University of Edinburgh, U.S.A, Dr. O.T. Olotu, Dept. of Mathematics, University of Ilorin and J.B. Okeowo, University of Ilorin, for their support and constructive reviews.

REFERENCES

[1] Gabe Rudy, Cross-platform C++ Development Using Qt, 2005.
[2] Lee Babin, Beginning Ajax with PHP: From Novice to Professional, 2007.
[3] Wolfram Mathematica, Differential Equation Solving With Dsolve.
[4] Jasmin Blanchette and Mark Summerfield, C++ GUI Programming with Qt, 2006.
[5] Juan Souli, The C++ Language Tutorial.
[6] Harry Fuecks, The Php Anthology, Volume 1: Foundations.
[7] Quentin Zervaas, Practical Web 2.0 Applications with PHP, 2008.
[8] Bjarne Stroustrup, The C++ Programming Language, Third Edition, AT&T Labs, Murray Hill, New Jersey, 1997.
[9] Brian Schaffner, Matt Zandstra, Teach Yourself PHP4 in 24 Hours, 2000.
[10] Steve Suehring, Tim Converse and Joyce Park, PHP6 and MySQL, 2009.
[11] Koch, N., Wirsing, M.: Software Engineering for Adaptative Hypermedia Applications. In: 3rd Workshop on Adaptative Hypertext and Hypermedia (2001).
[12] Murugesan, S., Deshpande, Y.: Web Engineering. Software Engineering and Web Application Development. Springer LNCS, Hot Topics (2001).
[13] Chen, J.Q., & Heath, R. D. (2005). Web application development methodologies. In W. Suh (Ed.), Web Engineering: Principles and techniques. Hershey, PA: Idea Group Publishing.
[14] Coda, F., Ghezzi, C., Vigna, G., & Garzotto, F. (1998, April 16-18). Towards a software engineering approach to Web site development. Paper presented at the Ninth International Workshop on Software Specification and Design (IWSSD-9), Ise-shima, Japan.
[15] Deborah Kurata, Doing Web Development: Client-Side Techniques, 2008.
[16] Make a windows installer file (.exe file) using Inno Setup Compiler, published on 31st July 2015, https://www.scirra.com/tutorials/4779/make-a-windows-installer-file-exe-file-using-inno-setup-compiler/page-3.

Authors' Profiles Emmanuel C. Paul, born in Lagos, Nigeria on 28th April 1994, is an undergraduate student pursuing a bachelor's degree in Mathematics at the University of Ilorin, Ilorin, Nigeria. He majors in computational and applied mathematics. His current research is on the mathematical modelling of bacterial resistance to multiple antibiotics and the immune system response.

How to cite this paper: Emmanuel C. Paul,"Methodology of Compiling Web-Applications into Executables, Obtaining Seamless Server Installations and GUI Navigations through Qt and C++ Process Communications", International Journal of Information Engineering and Electronic Business(IJIEEB), Vol.8, No.6, pp.28-36, 2016. DOI: 10.5815/ijieeb.2016.06.04



I.J. Information Engineering and Electronic Business, 2016, 6, 37-45 Published Online November 2016 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijieeb.2016.06.05

The Impact of Dots Representation in Recognition of Isolated Arabic Characters

Nehad H A Hammad
Palestine Technical College, P.O. Box 6037, Gaza, Palestine
E-mail: nehadhh@hotmail.com

Mohammed Elhafiz Musa
Sudan University for Science and Technology, P.O. Box 382, Mokren, Khartoum, Sudan
E-mail: hafiz85@hotmail.com

Abstract—Arabic Optical Character Recognition (AOCR) is one of the challenging recognition tasks nowadays, as Arabic handwriting is cursive and contains many dots. Dots are a big challenge for Arabic recognizers, as writers sometimes connect them. Moreover, dots are prone to being treated as noise. This paper proposes a new divide-and-conquer-based approach that tries to overcome the dots problem. The novelty of the proposed approach is in its feature extraction method. The extracted features are used to train a feed-forward Artificial Neural Network (ANN). The result is interesting and shows that this method should be further investigated. Index Terms—Arabic Character Recognition, Feature Extraction, Artificial Neural Network, Image Normalization, Image Resizing.

I. INTRODUCTION Handwriting recognition refers to the identification of written characters. The problem can be viewed as a classification problem in which the input sample must be assigned to the appropriate class (each character is a class with an infinite number of samples). The difficulty of the recognition task increases with the number of classes. One of the big challenges for AOCR is segmentation, because Arabic handwriting is cursive; however, this problem is not addressed here, because this paper addresses another challenge (dots). The main challenge in isolated Arabic character recognition is the recognition of the dots (their number and position). In the literature some authors use the term diacritics for these dots; the authors of this paper prefer the term dots. Dots are crucial in Arabic handwriting, as the only difference between the shapes of many characters is a few dots. For example, the letters Baa (ب), Taa (ت) and Thaa (ث) are three different letters with a similar body shape; they differ only in the number and position of their dots (one, two or three dots under or above the body of the character). Likewise, the letters Jeem (ج), Hhaa (ح) and Khaa (خ) differ only in one dot, and the same situation occurs with other characters [1].

The method proposed here uses a "divide and conquer" technique to divide Arabic characters into four groups based on the number of components in each character. Each character may have up to four components. The number of components is not fixed, as some writers connect some components (e.g. some writers draw two or three dots as one continuous line). Omer et al. explored similar ideas: they broke the recognition into two stages, and in their experiments they used the same datasets as this paper. However, the results achieved here are much better [14]. The connected component labeling algorithm, which is used to determine which group a character belongs to, works by scanning an image pixel by pixel (from top to bottom and left to right) in order to identify connected pixel regions. In this paper the authors present a method in which handwritten characters pass through several processes, from preprocessing to recognition: binarization, resizing, noise removal, thinning, feature extraction and classification. The data used were collected from a large group of people, and a random sample was selected from the dataset. A multi-layer Artificial Neural Network was used to classify the Arabic handwritten characters.

II. RELATED WORK Ahmed Sahlol and Cheng Suen (2013) proposed new methods for handwritten Arabic character recognition based on novel preprocessing operations, including different kinds of noise removal, and on different kinds of features (structural, statistical and morphological) extracted from the main body of the character and from its secondary components. The accuracy of the selected features was evaluated. The system was trained and tested with a back-propagation neural network on the CENPARMI dataset. The proposed algorithm obtained promising results, recognizing 88% of the test set accurately [11]. Ved Prakash Agnihotri (2012) proposed a recognition system using a neural network. Diagonal based feature


extraction is used to extract 54 features from each handwritten Devanagari character image. These features are then converted into a chromosome bit string of length 378. In the recognition step, a fitness function finds the chromosome difference between the unknown character and the chromosomes stored in the database, so that the power of a genetic algorithm is used to recognize the character. The experiment was conducted on more than 1000 characters, separated into a training set of 904 characters and a testing set of 204 characters. The offline Devanagari system reached 85.78% match and 13.35% mismatch [15]. Khaoula Addakiri and Mohamed Bahaj (2012) presented an approach for the recognition of on-line Arabic handwritten characters. The method involves three phases: first, preprocessing, in which the original image is transformed into a binary image; second, training neural networks with the feed-forward back-propagation algorithm; finally, recognition of the character using neural network techniques. Character recognition has served as one of the principal proving grounds for neural network methods and has emerged as one of the most successful applications of this technology. The experimental accuracy of the character recognizer averages about 83% [12]. Alaei et al. (2010) proposed a two-stage approach for isolated handwritten Persian character recognition. They extracted features based on modified chain code directional frequencies and employed an SVM for classification.
They obtained 98.1% and 96.6% recognition accuracy with 8-class and 32-class problems, respectively [4]. Desai (2010) presented a technique for Gujarati handwritten numeral recognition. The author used features abstracted from four different profiles of digits with a multilayered feed-forward neural network and achieved approximately 82% recognition accuracy for Gujarati handwritten digit identification [5]. Sharma and Jhajj (2010) extracted zoning features for handwritten Gurmukhi character recognition. They employed two classifiers, namely k-NN and SVM, and achieved a maximum recognition accuracy of about 72.5% and 72.0% with k-NN and SVM, respectively [6].

1. Preprocessing

The feature extraction step requires a one-pixel-wide trace, whereas the input is a grayscale letter image. Therefore, standard preprocessing methods are used here to generate these traces. The preprocessing steps are binarization, noise removal, normalization (cropping and resizing) and thinning. These steps are depicted in Fig. 1, Fig. 2, Fig. 3 and Fig. 4 respectively.

Fig.1. Binarization Step using Threshold.

Fig.2. Noise Removing using Median Filter.

Fig.3. Binary Image Normalization(Cropping).

Fig.4. Arabic Isolated Character Thinning.
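The first three preprocessing steps can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the threshold of 128, the 3x3 median window and all function names are hypothetical, and thinning is omitted to keep the sketch short.

```python
from statistics import median


def binarize(gray, threshold=128):
    """Map a grayscale image (rows of 0-255 ints) to 1 = ink, 0 = background."""
    return [[1 if px < threshold else 0 for px in row] for row in gray]


def median_filter(img):
    """3x3 median filter; border pixels are left unchanged."""
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            window = [img[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = int(median(window))
    return out


def crop(img):
    """Remove the empty rows and columns around the ink (normalization by cropping)."""
    rows = [y for y, row in enumerate(img) if any(row)]
    cols = [x for x in range(len(img[0])) if any(row[x] for row in img)]
    if not rows:
        return img
    return [row[cols[0]:cols[-1] + 1] for row in img[rows[0]:rows[-1] + 1]]


# A 7x7 white image with a dark 3x3 block and one isolated dark noise pixel.
gray = [[255] * 7 for _ in range(7)]
for y in range(2, 5):
    for x in range(3, 6):
        gray[y][x] = 0
gray[1][1] = 0  # noise pixel, removed by the median filter
clean = crop(median_filter(binarize(gray)))
```

Note that a median filter this small also rounds the corners of the block, which is one reason real dots can be mistaken for noise.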

2. Connected Components

A pixel p at coordinates (x,y) has four adjacent (horizontal and vertical) neighbors whose coordinates are (x+1,y), (x-1,y), (x,y+1) and (x,y-1). This set of 4 neighbors of p, denoted N4(p), is illustrated in Fig. 5(a). The four diagonal neighbors of p, shown in Fig. 5(b), have coordinates (x+1,y+1), (x+1,y-1), (x-1,y+1) and (x-1,y-1).
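The neighbor definitions above are what a connected component labeling pass relies on. The following Python sketch scans a binary image from top to bottom and left to right and groups ink pixels by breadth-first search. It is an illustrative assumption rather than the authors' code: the function name is hypothetical, and whether the diagonal neighbors are included is left as a parameter because the source does not state it.

```python
from collections import deque

N4 = [(1, 0), (-1, 0), (0, 1), (0, -1)]       # adjacent neighbors
ND = [(1, 1), (1, -1), (-1, 1), (-1, -1)]     # diagonal neighbors


def label_components(img, use_diagonals=True):
    """Return a list of components, each a list of (x, y) ink pixels."""
    offsets = N4 + ND if use_diagonals else N4
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    components = []
    for y in range(h):          # scan top to bottom
        for x in range(w):      # and left to right
            if img[y][x] == 0 or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(x, y)]), []
            while queue:
                cx, cy = queue.popleft()
                pixels.append((cx, cy))
                for dx, dy in offsets:
                    nx, ny = cx + dx, cy + dy
                    if 0 <= nx < w and 0 <= ny < h and img[ny][nx] and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            components.append(pixels)
    return components


# A small body with one dot above it: two components.
img = [
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
sizes = [len(c) for c in label_components(img)]
```

In this example the dot is found first because it lies higher in the scan order, and the larger component is the character body.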

III. THE PROPOSED METHOD As the novel part of the proposed method is its feature extraction, we describe that step in full, while standard steps such as preprocessing are only summarized.


Fig.5. (a) Adjacent neighbors (b) Diagonal neighbors


Connected component extraction is an important step in this method. Most Arabic characters contain more than one connected component, such as (أ, ب, ت, ث, خ, ذ, ض, غ, ق, ك, ي). Fig. 6 shows the groups of characters used in the proposed method. The proposed method identifies the main body easily, as it is usually the largest component; all other components are much smaller.

[Figure: tree dividing the Arabic characters into Group One, Group Two, Group Three and Group Four, with the dot positions Up, Mid and Down]

Fig.6. The main 4 groups and their constituent characters

3. Feature extraction

The proposed method generates a feature vector that contains 17 values: 15 represent the character's contour and 2 represent its dots, i.e. their position and number (Fig. 8).

\theta_i = \tan^{-1}\left( \frac{y_{i+1} - y_i}{x_{i+1} - x_i} \right)    (1)

After thinning, 15 points are taken from the character trace, and the angle between each two adjacent points, (x_i, y_i) and (x_{i+1}, y_{i+1}), is calculated using Equation (1). To find the most appropriate number of directions, 10 sets of directions are tested. Table 1 illustrates these directions.

Table 1. The Selected Directions to be used in Proposed Methods

[Table 1 content: direction diagrams labelled 3D, 4D, 5D, 6D, 7D, 8D, 9D, 10D, 12D and 15D]
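One way to turn the angles between adjacent trace points into direction features is sketched below in Python. The quantization rule (rounding each angle to the nearest of n equally spaced directions) and the function name are assumptions made for illustration; the source does not spell out how the angles are mapped to the 3 to 15 directions. The sketch uses atan2 rather than a plain inverse tangent so that all four quadrants are distinguished.

```python
import math


def direction_codes(points, n_directions=8):
    """Quantize the angle between consecutive (x, y) points into direction codes."""
    sector = 2 * math.pi / n_directions
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        angle = math.atan2(y1 - y0, x1 - x0)          # in (-pi, pi]
        # Assumed rule: round to the nearest direction, then wrap into 0..n-1.
        codes.append(round(angle / sector) % n_directions)
    return codes


# Four unit steps around a square: right, +y, left, -y.
square = [(0, 0), (1, 0), (1, 1), (0, 1), (0, 0)]
codes8 = direction_codes(square, 8)
codes4 = direction_codes(square, 4)
```

In this sketch n points yield n - 1 codes, and in image coordinates the y axis usually points down, so "+y" is a downward step. How the source obtains exactly 15 contour values from its sampled points is not stated.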

Table 2 illustrates how the first two features in the feature vector are generated (Fig. 8). The groups that appear in Fig. 6 can be described textually as follows:

group 1: one-component characters (Fig. 11).
group 2: two-component characters (Fig. 13).
group 3: three-component characters (Fig. 15).
group 4: four-component characters (Fig. 17).

There is an important, subtle point that we want to re-emphasize. Fig. 6 presents the typical situation (typewritten letters). However, as some writers connect some components (typically dots), some letters also appear in other groups. For instance, the letter (ث) appears in group 4 in Fig. 6, but it also appears in group 2 and group 3 (see the last character in Fig. 13 and Fig. 15): in group 2 the writer connects the three dots, and in group 3 the writer connects only two of them. Table 2. Dot categories and how the first two elements of the feature vector are filled.

Number of Dots | No Dots | Level 1 (Up) | Level 2 (Mid) | Level 3 (Bottom)
One Dot | [00] | [11] | [21] | [31]
Two Dots | [00] | [12] | - | [32]
Three Dots | [00] | [13] | - | -


Dot Detection process:
1. Calculate the number of isolated objects in the image.
2. If the number of objects = 1, then the pattern belongs to group one.
3. If the number of objects >= 2, then:
4. The longest object is the character body and the others are dot(s).
5. The character dots are recognized using another process that calculates the number of dots and their position level.
6. The number of dots is determined from the dot length, using thresholds, and from the dot shape.
7. The dot level is extracted after removing all the space around the character and dividing the character height into 3 areas on the Y axis.
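The dot detection steps can be sketched in Python as follows, producing the two-element [level, count] code in the layout of Table 2. The length thresholds (6 and 12 pixels), the use of the mean dot height to pick the level, and the function name are illustrative assumptions; the source only says that thresholds and dot shape are used.

```python
def dot_code(components, image_height, long_dot=6, very_long_dot=12):
    """Return [level, count] for the dots of one cropped character image.

    components: list of pixel lists [(x, y), ...], one per connected object.
    image_height: height of the cropped character image in pixels.
    """
    if len(components) == 1:
        return [0, 0]                       # group one: no dots
    parts = sorted(components, key=len, reverse=True)
    dots = parts[1:]                        # parts[0] is the character body
    count = 0
    for dot in dots:
        # Assumed rule: a longer stroke stands for two or three connected dots.
        if len(dot) >= very_long_dot:
            count += 3
        elif len(dot) >= long_dot:
            count += 2
        else:
            count += 1
    # Level from the mean y of all dot pixels: 1 = up, 2 = mid, 3 = bottom.
    ys = [y for dot in dots for (_, y) in dot]
    mean_y = sum(ys) / len(ys)
    level = min(3, int(3 * mean_y / image_height) + 1)
    return [level, min(count, 3)]


body = [(x, 5) for x in range(10)]          # a 10-pixel character body
one_dot_up = dot_code([body, [(2, 0)]], image_height=9)
two_dots_bottom = dot_code([body, [(2, 8)], [(4, 8)]], image_height=9)
```

Under these assumptions a writer who joins two dots into one short stroke still yields a count of two, which is the situation the grouping is meant to handle.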

Fig.7. Character Dots Levels (Level 1, Level 2, Level 3)

output Boolean values. The function generates outputs between 0 and 1 as the neuron's net input goes from negative to positive infinity (see Fig. 10). In building the network, the data was divided randomly into three sets: 70% of the data was used for training, and the remaining 30% was split into 15% for validation and 15% for testing. After many trials of eliminating, adding and modifying features and adjusting the network's hidden layers, a network with 100 neurons in the hidden layer was able to predict about 96.91% of the input characters correctly.
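The two ingredients described here, the log-sigmoid activation and the random 70/15/15 split, can be written down in a few lines of Python. This is a sketch only: the function names and the fixed random seed are assumptions, and the sample count in the example is the dataset size reported in this paper.

```python
import math
import random


def logsig(n):
    """Log-sigmoid: maps a real net input to the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))


def split_indices(n_samples, seed=0):
    """Shuffle sample indices and split them 70% / 15% / 15%."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = n_samples * 70 // 100     # integer arithmetic avoids rounding surprises
    n_val = n_samples * 15 // 100
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(39200)
print(len(train), len(val), len(test))
```

For 39200 samples this gives 27440 training, 5880 validation and 5880 testing samples.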

IV. DATASET A dataset of isolated Arabic characters was collected from students at the Sudan University for Science and Technology (SUST). After the papers were scanned, a manual classification process was performed to establish the desired standard dataset [13]. The dataset contains 28 isolated characters. Each character appears in various shapes, as the subjects were not asked to write in a specific way. The dataset is off-line. Each subject contributed the 28 basic characters. The total number of samples for one character is 1400, so the total number of samples in the dataset is 39200 (= 28 × 1400). The image resolution is 88x88 pixels [13].

V. EXPERIMENTS AND RESULT Here the results for each group are presented. As 10 sets of directions were tested with three numbers of points per character (10, 15 and 20), the result tables cover these 30 combinations (i.e. 10x3). For each group two result tables are presented: the first shows the overall recognition result and the second shows the result for each character.

Fig. 8. Feature Vector Structure Design

1. Group One

Group one contains the isolated Arabic characters that consist of one part, i.e. characters written as one connected line. This group contains 12 characters (ا, د, ح, ع, هـ, ص, ط, ر, و, س, ل, م).

Fig.9. Arabic Character Representation

4. Neural Network Architecture

The architecture of the proposed neural network has 17 inputs and 50 neurons in its output layer to identify the letters.

Fig.10. Log-sigmoid activation function.

The network is a two-layer network. The log-sigmoid [50] activation function at the output layer was picked because its output range (0 to 1) is perfect for learning to

Fig.11. Group One Isolated Arabic Characters Handwritten


Fig.12. Group One Represented by 10, 15 and 20 Points.

1.1 Accuracy Rate for Group One in Multi Directions

Table 3. Accuracy Rate for Group One Directions.

Direction | 10 Points | 15 Points | 20 Points | Best Accuracy
3 | 82.24 | 81.31 | 82.96 | 82.96%
4 | 82.90 | 85.75 | 87.09 | 87.09%
5 | 87.74 | 84.25 | 87.41 | 87.74%
6 | 85.37 | 84.89 | 85.65 | 85.65%
7 | 86.00 | 83.22 | 87.62 | 87.62%
8 | 90.63 | 88.11 | 90.43 | 90.63%
9 | 92.53 | 93.06 | 92.28 | 93.06%
10 | 90.57 | 94.34 | 90.89 | 94.34%
12 | 92.44 | 92.86 | 92.78 | 92.86%
15 | 92.25 | 91.85 | 92.80 | 92.80%

Therefore, the best representation for extracting features in group one is 10 directions with the characters represented by 15 points, which gives a recognition rate of 94.34% (see Table 3).

1.2 Accuracy Rate for Group One Characters

Character recognition is affected by the complexity of the character shape: the degree of curvature, crosses and overlaps. The Heh (هـ) character is the most difficult character to draw in group one, and Alef (ا) is the easiest. The recognition accuracy of the characters in group one therefore varies according to these factors. Table 4 describes in more detail the variation in recognition rate between the characters in group one when the characters are represented using 10, 15 and 20 points. Saad (ص), Heh (هـ) and Tah (ط) have lower recognition rates than characters that are easier to draw, such as Alef (ا), Lam (ل) and Reh (ر). These results are logical, as the recognition rate is inversely related to the difficulty of drawing the character.

Table 4. Group One Characters Recognition Accuracy Rate.

No | Character | Name | 20 Points | 15 Points | 10 Points | Best Accuracy
1 | ع | Ain | 86.60 | 86.56 | 87.48 | 87.48%
2 | ا | Alef | 98.54 | 98.38 | 93.74 | 98.54%
3 | د | Dal | 90.42 | 88.58 | 90.27 | 90.42%
4 | ح | Hah | 86.19 | 86.01 | 86.06 | 86.19%
5 | هـ | Heh | 77.22 | 75.17 | 73.95 | 77.22%
6 | ل | Lam | 93.96 | 91.74 | 93.44 | 93.96%
7 | م | Meem | 91.10 | 90.69 | 85.33 | 91.10%
8 | ر | Reh | 90.97 | 89.77 | 92.32 | 92.32%
9 | ص | Saad | 86.70 | 84.33 | 84.85 | 86.70%
10 | س | Seen | 88.16 | 88.72 | 88.58 | 88.72%
11 | ط | Tah | 84.15 | 82.54 | 80.71 | 84.15%
12 | و | Waw | 89.24 | 85.67 | 86.59 | 89.24%

2. Group Two

Group two contains the isolated Arabic characters that consist of two parts, i.e. characters written as two connected lines. Group two contains 16 characters (ب, ت, ث, ج, خ, ذ, ز, ش, ض, ظ, غ, ف, ق, ك, ن, ي).


Fig.13. Group Two Isolated Arabic Characters Handwritten.

Fig.14. Arabic Character Group Two Represented by 10, 15 and 20 Points.

2.1 Accuracy Rate for Group Two Multi Directions

Table 5. Accuracy Rate for Group Two Direction.

Direction | 10 Points | 15 Points | 20 Points | Best Accuracy
3 | 93.35 | 93.07 | 97.53 | 97.53%
4 | 93.77 | 94.40 | 93.60 | 94.40%
5 | 92.95 | 94.16 | 91.63 | 94.16%
6 | 95.04 | 94.58 | 92.29 | 95.04%
7 | 92.89 | 94.73 | 93.32 | 94.73%
8 | 92.35 | 95.48 | 93.59 | 95.48%
9 | 94.45 | 96.28 | 94.21 | 96.28%
10 | 92.75 | 95.86 | 95.26 | 95.86%
12 | 94.88 | 96.00 | 94.53 | 96.00%
15 | 96.71 | 97.90 | 95.28 | 97.90%

Therefore, the best representation for extracting features in group two is 15 directions with the characters represented by 15 points, which gives a recognition rate of 97.90% (see Table 5).

2.2 Accuracy Rate for Group Two Characters

Character recognition is affected by the complexity of the character shape (the degree of curvature, crosses and overlaps) and by the count and position of the dots. The characters in group two have one, two or three dots that are used to distinguish between similar characters. The Zeh (ظ) character, which has one dot in the upper level like most of the dotted characters in group two, is the most difficult to draw, whereas the Yeh (ي) character is relatively easy to draw, and the number and position of its dots distinguish it from the other characters in group two. It turns out that representing the dots by their number and location makes classification easier, because most of the isolated Arabic characters have dots; characters with a distinctive dot configuration obtain a high recognition rate, such as Jeem (ج) with one dot in the middle, Yeh (ي) with two dots at the bottom and Beh (ب) with one dot at the bottom. The recognition accuracy of the characters in group two therefore varies according to these factors. Table 6 describes in more detail the variation in recognition rate between the characters in group two when the characters are represented using 10, 15 and 20 points. Zah (ض), Zeh (ظ) and Khah (خ) have lower recognition rates than characters like Yeh (ي), Beh (ب), Jeem (ج) and Kaf (ك), because the latter have special characteristics that facilitate the recognition process. These results are logical, as the recognition rate is inversely related to the difficulty of drawing the character and directly related to the presence of distinct dot(s).

Table 6. Group Two Characters Recognition Accuracy Rate.

No | Character | Name | 20 Points | 15 Points | 10 Points | Best Accuracy
1 | ب | Beh | 97.94 | 98.71 | 98.83 | 98.83%
2 | ض | Zah | 90.57 | 89.65 | 86.38 | 90.57%
3 | ف | Feh | 94.35 | 93.57 | 93.35 | 94.35%
4 | غ | Ghen | 93.60 | 94.17 | 94.71 | 94.71%
5 | ج | Jeem | 96.96 | 98.45 | 97.73 | 98.45%
6 | ك | Kaf | 98.65 | 99.08 | 99.03 | 99.08%
7 | خ | Khah | 89.54 | 91.02 | 89.19 | 91.02%
8 | ن | Noon | 95.26 | 95.73 | 93.89 | 95.73%
9 | ق | Qaf | 96.57 | 98.27 | 97.23 | 98.27%
10 | ش | Sheen | 94.71 | 96.50 | 95.17 | 96.50%
11 | ت | Teh | 94.22 | 95.34 | 94.02 | 95.34%
12 | ذ | Theh | 94.45 | 94.85 | 93.09 | 94.85%
13 | ث | Thah | 93.96 | 95.51 | 92.89 | 95.51%
14 | ي | Yeh | 98.46 | 99.67 | 99.82 | 99.82%
15 | ظ | Zeh | 82.13 | 84.61 | 80.26 | 84.61%
16 | ز | Zen | 91.62 | 92.48 | 92.09 | 92.48%


3.

Group Three

Group three contains any isolated Arabic characters with three components so the character written using three connected line, This group contains five characters ( ‫ ق‬, ‫ ث‬, ‫ ش‬, ‫ ي‬, ‫)ت‬.

Fig.15. Group Three Isolated Arabic Characters Handwritten.

Fig.16. Arabic Character Group Three Represented by 20,15 and 20 Points.

I.J. Information Engineering and Electronic Business, 2016, 6, 37-45


The Impact of Dots Representation in Recognition of Isolated Arabic Characters

3.1

Accuracy Rate for Group Three Multi-Directions Table 7. Accuracy Rate for Group Three. Points Number

10

15

20

Best Accuracy

3

97.12

98.67

98.58

98.67%

4

96.51

97.75

96.13

97.75%

5

96.05

97.61

97.62

97.62%

6

97.44

97.97

98.23

98.23%

7

96.34

98.48

97.62

98.48%

8

95.98

97.58

97.69

97.69%

9

96.91

96.98

96.35

96.98%

10

97.29

97.44

95.95

97.44%

12

96.72

97.67

96.63

97.67%

15

97.01

98.19

97.40

98.19%

Direction

43

recognition process. These results are logical where the recognition rate inversely proportional to the difficulty of the character drawing and directly proportional to the characters that contain distinct dot(s). 4.

Group Four

The group four is contained any isolated Arabic characters with four parts so the character written using four connected line, This group has two characters (‫ش‬,‫) ث‬.

Fig.17. Group Four Isolated Arabic Characters Handwritten.

Therefore, the best representation of characters to extract features in the group three is 3-directions and represent the characters using 15 points to get the recognition rate 98.67% see Table 7. 3.2

Fig.18. Arabic Character Group Four Represented by 20,15 and 20 Points.

Accuracy Rate for Group Three Characters

Group three contain five characters Qaf(‫)ق‬, Sheen(‫)ش‬, Teh(‫)ت‬, Theh(‫)ث‬, Yeh(‫ )ي‬each member have three parts one for body and other two is dots. The character recognition is affected by the extent of a character complexity of character shape, the degree of curvature, crosses, overlaps, dots counts and dots position. The characters in group three have two or three dots used to distinguish between similar characters like Teh(‫)ت‬ and Theh(‫ )ث‬have same contour and dots level but different in dots count, this dots count able to distinguish between them. Therefore, the Sheen (‫ )ش‬and Theh(‫ )ث‬have low accuracy rate because they have the same dots counts and level in group three, but Yeh (‫ )ي‬and Qaf(‫ )ق‬have high accuracy rate because they have different dots level. Table 8. Group Three Characters Recognition Accuracy Rate. Character

20 Points

15 Points

10 Points

Best Accuracy

1

‫ق‬

Qaf

99.40

99.57

99.60

99.60%

2

‫ش‬

Sheen

94.19

94.96

92.69

94.96%

3

‫ت‬

Teh

98.51

98.96

98.10

98.96%

4

‫ث‬

Theh

92.89

94.73

91.94

94.73%

5

‫ي‬

Yeh

99.96

99.90

99.97

99.97%

No

Table 8. describe more detail the variation in recognition rate between characters in the group three, where represents characters using 10, 15 and 20 points. The Qaf(‫)ق‬, Teh(‫ )ت‬and Yeh(‫ )ي‬have high recognition rate than characters like Sheen(‫ )ش‬and Theh(‫ )ث‬because some characters have special characteristics facilitating Copyright © 2016 MECS

Therefore, the best way to represent the characters of group four for feature extraction is to use 3 directions with 10, 15 or 20 points per character, which gives a recognition rate of 100% (see Table 13).

4.1 Accuracy Rate for Group Four Characters

Group four contains two characters: Sheen (ش) and Theh (ث). Each member has four parts: one for the character body and three for the dots. Character recognition is affected by the complexity of the character shape, the degree of curvature, crosses, overlaps, the dot count and the dot position. Both characters in group four have three dots on the same level, so the dots cannot distinguish them and the main factor for classifying them is the contour of the character. Sheen (ش) and Theh (ث) therefore have a high accuracy rate, because their different contours make it easy for the classifier to tell them apart.

Table 9. Group Four Characters Recognition Accuracy Rate.

| No | Character | 20 Points (%) | 15 Points (%) | 10 Points (%) | Best Accuracy |
|---|---|---|---|---|---|
| 1 | ش (Sheen) | 96.19 | 98.28 | 96.75 | 98.28% |
| 2 | ث (Theh) | 95.57 | 97.90 | 95.86 | 97.90% |

Table 9 describes in more detail the variation in recognition rate between the characters of group four when each character is represented using 10, 15 and 20 points. Sheen (ش) and Theh (ث) have a very high recognition rate because the group contains only two characters, which makes them easier to classify using the character contour.

5. Proposed Approach Results Accuracy

The accuracy of the proposed approach combines the accuracy of four classifiers, one for each group of Arabic characters. The first neural network (NN) classifier, for group one, covers 12 characters and achieves a recognition rate of about 94.34%. The second NN classifier, for group two, covers 16 characters and achieves about 97.9%. The third NN classifier, for group three, covers 5 characters and achieves about 98.67%. The fourth NN classifier, for group four, covers 2 characters and achieves about 100%.

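$$\text{Rate Balance}_i = \text{Accuracy Rate}_i \times \text{Character No}_i \tag{2}$$

$$\text{Average Rate} = \frac{\sum_{i=1}^{4} \text{Rate Balance}_i}{\text{Total Character}} \tag{3}$$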

Table 10. Total Recognition Accuracy Applied on SUST Dataset.

| | Classifier1 | Classifier2 | Classifier3 | Classifier4 | Total |
|---|---|---|---|---|---|
| Character No | 12 | 16 | 5 | 2 | 35 (Total Character) |
| Accuracy Rate | 94.34 | 97.9 | 98.67 | 100 | |
| Rate Balance | 1132.08 | 1566.4 | 493.35 | 200 | 96.91% (Average Rate) |
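The average rate in Table 10 is the per-classifier accuracy weighted by the number of characters each classifier covers. The following minimal sketch reproduces that arithmetic from the table values; the names `GROUPS` and `weighted_average_accuracy` are illustrative and do not come from the paper.

```python
# Accuracy rate (%) and number of characters per classifier, from Table 10.
GROUPS = {
    "Classifier1": (94.34, 12),
    "Classifier2": (97.9, 16),
    "Classifier3": (98.67, 5),
    "Classifier4": (100.0, 2),
}


def weighted_average_accuracy(groups):
    """Weight each group's accuracy by its character count and average."""
    total_characters = sum(count for _, count in groups.values())
    rate_balance = sum(acc * count for acc, count in groups.values())
    return rate_balance / total_characters


overall = weighted_average_accuracy(GROUPS)
print(f"{overall:.2f}%")
```

With the table values, the sum of the rate balances is 3391.83 over 35 characters, which agrees with the reported 96.91%.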

[Figure: Distributed Directions Accuracy Rate. Series: Classifier 1, Classifier 2, Classifier 3 and Classifier 4; horizontal axis ticks: 3, 4, 5, 6, 7, 8, 9, 10, 12 and 15; vertical axis: 80 to 100.]

Fig.19. Arabic Characters Group Classifier Accuracy on Multi-Direction.

VI. CONCLUSION AND FURTHER WORK

Arabic character recognition methods are affected by several factors, such as the shape of the characters, the complexity of drawing a character and the amount of rotation of the character relative to the horizontal line. Moreover, Arabic characters contain many dots, which increases the challenge of building accurate Arabic classifiers. To overcome these problems, we propose a recognition system based on divide and conquer. The proposed method divides the Arabic characters into 4 groups, and a neural network has been designed to classify each group. The results are encouraging, and more accurate results could be achieved, for instance by increasing the dataset size and by testing other powerful classifiers such as SVM.


Authors’ Profiles

Nehad H A Hammad, Assistant Professor at Palestine Technical College, Gaza, Palestine. I obtained a Bachelor's degree from the Islamic University of Gaza, Palestine in 2002, a Master's degree from Near East University, Cyprus in 2005 and a Ph.D. degree from Omdurman Islamic University, Sudan in 2015. My research interests include pattern recognition, optical character recognition and image processing.

Mohamed Elhafiz Musa, Associate Professor at Sudan University for Science and Technology, Sudan-Khartoum. His research interests include Artificial Neural Network, pattern recognition, data mining and image processing.

How to cite this paper: Nehad H A Hammad, Mohammed Elhafiz Musa,"The Impact of Dots Representation in Recognition of Isolated Arabic Characters", International Journal of Information Engineering and Electronic Business(IJIEEB), Vol.8, No.6, pp.37-45, 2016. DOI: 10.5815/ijieeb.2016.06.05



I.J. Information Engineering and Electronic Business, 2016, 6, 46-54 Published Online November 2016 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijieeb.2016.06.06

An Efficient Feature Selection based on Bayes Theorem, Self Information and Sequential Forward Selection K.Mani Department of Computer Science, Nehru Memorial College, Puthanampatti, 621 007, Tiruchirappalli (DT), India Email: nitishmanik@gmail.com

P.Kalpana Department of Computer Science, Nehru Memorial College, Puthanampatti, 621 007, Tiruchirappalli (DT), India Email: parasuramankalpana@gmail.com

Abstract—Feature selection is an indispensable preprocessing technique for selecting more relevant features and eradicating the redundant attributes. Finding the more relevant features for the target is an essential activity to improve the predictive accuracy of the learning algorithms because more irrelevant features in the original feature space will cause more classification errors and consume more time for learning. Many methods have been proposed for feature relevance analysis but no work has been done using Bayes Theorem and Self Information. Thus this paper has been initiated to introduce a novel integrated approach for feature weighting using the measures viz., Bayes Theorem and Self Information and picks the high weighted attributes as the more relevant features using Sequential Forward Selection. The main objective of introducing this approach is to enhance the predictive accuracy of the Naive Bayesian Classifier. Index Terms—Feature Selection, Irrelevant and Redundant Attributes, Feature Relevance, Feature Weighting, Bayes Theorem, Self Information, Sequential Forward Selection and Naive Bayesian Classifier.

I. INTRODUCTION

Feature Selection (FS) is an effective pre-processing technique commonly used in data mining, machine learning and artificial intelligence for reducing dimensionality, removing irrelevant and redundant data, increasing learning accuracy and avoiding unnecessary computational cost [10]. A large number of features given as input to a classification algorithm may lead to insufficient memory and also requires more time for learning. A feature which does not have any influence on the target is said to be irrelevant. The irrelevant features present in the original feature space produce more classification errors and may sometimes produce even worse results. Thus it is essential to select the more relevant features, which provide useful information about the target, and this can be performed through FS.

Feature relevance is classified into three categories viz., strongly relevant, weakly relevant and irrelevant. A strongly relevant feature is always necessary for the optimal subset and cannot be removed. A feature is said to be weakly relevant if it is necessary for an optimal subset only under certain conditions. An irrelevant feature is one which is not necessary at all and hence must be removed. Thus an optimal subset of features should include all strongly relevant features, a subset of the weakly relevant features and none of the irrelevant features [2].

This paper focuses on finding the strongly relevant features in the feature space and comprises two phases viz., Feature Weighting (FW) and Feature Selection (FS). It integrates the metrics viz., Bayes Theorem (BT) and Self Information (SI) for FW and uses Sequential Forward Selection (SFS) for FS. SFS considers the optimal subset to be empty initially and adds features one by one until the best feature subset is obtained. The objective of the proposed work is to enhance the predictive accuracy of the Naive Bayesian Classifier (NBC) with a limited subset of selected features. The NBC is a statistical classifier based on BT. Since Bayesian analysis suffers from high computational cost, especially in models with a large number of features, this work uses BT in the pre-processing step to find the more relevant features and thereby reduce the model construction time of NBC.

The rest of the paper is organized as follows. Section 2 depicts the related work. The mathematical background necessary for understanding the proposed work is explained in Section 3. Section 4 describes the proposed methodology. The experimental study and its results are shown in Section 5. Finally, Section 6 ends with the conclusion.

II. RELATED WORK

Research on improving the accuracy of classifiers is common and in high demand today. An important problem in mining datasets that are large in both dimension and size is that of selecting a subset of the original features using FS. It is essential that the reduced set retains the salient characteristics of the data, so that FS not only decreases the processing time but also leads to more compact learned models and better generalization. Several FS algorithms have been proposed in the literature; this section presents a brief overview of those which provide a stronger lead to the proposed work.

Mark A. Hall (2000) described Correlation-based Feature Selection (CFS) for discrete and numeric class problems and showed that it drastically reduces the number of attributes and outperforms ReliefF [1]. Lei Yu and Huan Liu (2004) introduced a new framework for FS called the Fast Correlation Based Filter (FCBF), which combines feature relevance and redundancy analysis. It uses Symmetric Uncertainty (SU) for both relevance and redundancy analysis; they achieved a high degree of dimensionality reduction and demonstrated that the predictive accuracy with the selected features is either enhanced or maintained [2]. Jacek Biesiada and Wlodzislaw Duch (2007) proposed a FS method using the Pearson Chi-Square test for nominal (discretized) features, particularly for finding redundant features, and showed that it provides better accuracy [4]. Subramanian Appavu et al. (2009) proposed a FS method for finding dependent attributes in datasets using joint probabilities with BT and showed that it provides better accuracy [6]. Gauthier Doquire and Michel Verleysen (2011) described a FS method based on Mutual Information for handling mixed types of data using both wrapper and filter approaches and showed that it finds more relevant features than CFS [7].

Subramanian Appavu et al. (2011) proposed a FS method using BT and Information Gain (IG). They discovered the dependent attributes in the feature space using BT, removed the feature with high IG as the redundant attribute, and showed that the accuracy increased significantly for classifiers such as C4.5 and Naive Bayes [8]. John Peter and Somasundaram (2012) introduced a novel FS method combining CFS and BT. The CFS algorithm reduces the number of attributes using SU. The attributes selected by CFS are then fed into BT to select the optimal subset of features, and they showed that this provides better accuracy than the traditional algorithms [9]. Rajeshwari et al. (2013) introduced a FS method using Principal Component Analysis (PCA) and Apriori based association rule mining and showed that Apriori gives 100% accuracy for the selected features while PCA requires more time for building the model [10]. Mani and Kalpana (2015) proposed a filter based FS method using IG with Median Based Discretization (MBD) for continuous features and showed that it provides higher accuracy than IG with the standard unsupervised discretization methods viz., Equal Width Interval Discretization (EWID), Equal Frequency Interval Discretization (EFID) and Cluster Based Discretization (CBD), particularly for the Naive Bayesian Classifier [11].


Muhammad Atif Tahir et al. (2007) introduced a Tabu Search method for simultaneous feature selection and feature weighting using the K-NN rule and showed that it provides high classification accuracy and also reduces the size of the feature vector [15]. From the existing literature, it is noted that no authors have proposed a feature weighting method that combines BT and SI. Some authors have used BT, but only to detect dependency among the features and not to find feature relevance. Thus the proposed work uses MBD for continuous features, BT and SI for FW and SFS for FS.

III. MATHEMATICAL PRELIMINARIES

This section presents an overview of the mathematical concepts which are essential for the proposed work.

A. Bayes Theorem

Bayes theorem describes the use of conditional probability with a set of possible causes for a given observed event. It is computed from the knowledge of the probability of each cause and the conditional probability of the outcome given each cause. It relates the conditional and marginal probabilities of stochastic events A and B. It is stated as

$$P(A \mid B) = \frac{P(A)\,P(B \mid A)}{P(B)} \tag{1}$$

Where
i) P(A) and P(B) are the prior or marginal probabilities of A and B respectively.
ii) P(A|B) is the conditional probability of A given B; it is called the posterior probability because it is derived from B.
iii) P(B|A) is the conditional probability of B given A; it is called the likelihood.

In general, the BT is stated as [6]

$$P(A_i \mid B) = \frac{P(A_i)\,P(B \mid A_i)}{\sum_{j=1}^{n} P(A_j)\,P(B \mid A_j)} \tag{2}$$

B. Measure of Self Information

Let X be a discrete random variable with the possible outcomes X = xi, i = 1, 2, 3, ..., n. The measure of Self Information of the event X = xi is defined as [5]

$$I(x_i) = \log_2\left(\frac{1}{P(x_i)}\right) = -\log_2 P(x_i) \tag{3}$$

From (3), it is noted that a high probability event conveys less information than a low probability event and vice versa. For example, for an event with P(xi) = 1, I(xi) = 0.
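The two measures above can be illustrated with a minimal sketch. The function names `posterior` and `self_information`, and the example priors and likelihoods, are hypothetical and chosen only for illustration.

```python
import math


def posterior(priors, likelihoods):
    """Return P(A_i | B) for every cause A_i, following (2)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    evidence = sum(joint)  # P(B), the denominator of (2)
    return [j / evidence for j in joint]


def self_information(p):
    """Return I(x) = -log2 P(x) in bits, following (3)."""
    return -math.log2(p)


# Two equally likely causes; the observed event is four times as likely
# under the first cause as under the second (hypothetical numbers).
print(posterior([0.5, 0.5], [0.8, 0.2]))
# A certain event carries no information; a rarer event carries more.
print(self_information(1.0), self_information(0.25))
```

The second call shows the inverse relationship noted after (3): the lower the probability of an event, the larger its self information.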


C. Naive Bayesian Classifier

It is a statistical classifier based on the Bayes Theorem. Let D be a set of training tuples with class label C. Suppose there are m distinct classes C1, C2, …, Cm. The role of this classifier is to predict that a given tuple X belongs to the class having the highest posterior probability conditioned on X, i.e., the tuple X belongs to Ci iff P(Ci|X) > P(Cj|X) for 1 ≤ j ≤ m and j ≠ i. P(Ci|X) is computed as [3]

$$P(C_i \mid X) = \frac{P(C_i)\,P(X \mid C_i)}{P(X)} \tag{4}$$

D. Evaluation Measures

The performance of the classifier can be analyzed using the most widely used metrics viz., Accuracy, Precision and Recall [3]. Accuracy is the percentage of test tuples that are correctly classified by the classifier [3] [14]. Precision is a measure of exactness and Recall is a measure of completeness. These measures are computed as [3]

$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN} \tag{5}$$

$$\text{Precision} = \frac{TP}{TP + FP} \tag{6}$$

$$\text{Recall} = \frac{TP}{TP + FN} \tag{7}$$

Where True Positive (TP) and True Negative (TN) refer to the positive and negative tuples that are correctly identified by the classifier respectively, and False Positive (FP) and False Negative (FN) refer to the negative and positive tuples that are incorrectly classified by the classifier respectively [3] [14].

IV. PROPOSED WORK

The main idea of the proposed work is to find the strongly relevant features in the feature space so as to improve the predictive accuracy, precision and recall of the NBC. It consists of two phases viz., FW and FS. First, it converts the continuous attributes, if any, in the given dataset into discrete attributes using MBD. The resultant dataset is then fed into the FW process, which assigns different weights to the attributes based on the results of the computation of BT and SI. Finally, the weighted attributes are given to SFS, which selects the features whose weight is greater than the user defined threshold δ, i.e., the more relevant features have higher weight. The optimal feature subset SFopt obtained is fed into NBC, and the performance of the proposed work is analyzed using the measures viz., accuracy, precision and recall. The framework of the proposed work is shown in Fig. 1. The steps involved in the proposed work are shown in algorithms 1 and 2. Algorithm 1 calculates the feature weight for each attribute using BT and SI. Algorithm 2 selects the features whose weight exceeds the threshold to form the final subset SFopt.
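The measures (5) to (7), which are used to assess the proposed work, can be computed from confusion-matrix counts as in the minimal sketch below. The function name `evaluation_measures` and the counts in the usage example are hypothetical and are not results from the paper.

```python
def evaluation_measures(tp, tn, fp, fn):
    """Return (accuracy, precision, recall) as defined in (5), (6) and (7)."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return accuracy, precision, recall


# Hypothetical confusion-matrix counts for a 14-tuple test set.
accuracy, precision, recall = evaluation_measures(tp=8, tn=1, fp=4, fn=1)
print(f"accuracy={accuracy:.4f} precision={precision:.4f} recall={recall:.4f}")
```

The sketch assumes a binary problem with at least one predicted positive and one actual positive; otherwise the precision or recall denominator would be zero.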

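[Figure: flow diagram. The original features are passed to NBC directly and to the proposed method, where Phase 1 (FW) applies BT and SI to produce the weighted attributes and Phase 2 (FS) applies SFS to produce the optimal subset, which is passed to NBC. Accuracy, precision and recall are obtained for both inputs and the results are compared.]

Fig.1. Framework of the proposed work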


Algorithm 1: Feature Weighting using BT and SI

Input: Training set TS with 'n' attributes Fi, 1 ≤ i ≤ n, each with 'r' instances and 'y' distinct values, and a target attribute C with 'm' distinct values.

Output: Weighted feature list

Method:
1. For each continuous attribute Fi in TS, use MBD to convert it into discrete
   a) Compute Median M
      i. Sort the values of a continuous feature Fi in ascending order
      ii. For each unique value xi in Fi, calculate the frequency of occurrence f and cumulative frequency cf
      iii. Mid ← (N+1)/2 where N = ∑f
      iv. The item which has cf ≥ Mid is M.
   b) Perform discretization
      Fi_des ← {low, high}
      For each xi ∈ Fi
         if xi > M then xi ← Fi_des [1]
         else xi ← Fi_des [0]
2. For each feature Fi, 1 ≤ i ≤ n and the unique value in the class label Ck, 1 ≤ k ≤ m, compute P(C|Fi)
   a) $$P(C_k \mid F_i) = \frac{P(C_k) \times P(F_i \mid C_k)}{P(F_i)}$$
   b) $$P(C \mid F_i) = \frac{P(C_1 \mid F_i) + P(C_2 \mid F_i) + \dots + P(C_m \mid F_i)}{P(F_i)}$$
   Where, for each feature Fi, 1 ≤ i ≤ n, and for each unique instance ul in Fi, 1 ≤ l ≤ y,
   $$P(F_i^{u_l}) = \frac{\mathrm{count}(u_l)}{r}, \qquad \sum_{l} P(F_i^{u_l}) = 1$$
3. Compute Self Information for P(C|Fi), 1 ≤ i ≤ n
   a) $$W_{F_i} = \log_2\left(\frac{1}{P(C \mid F_i)}\right) = -\log_2 P(C \mid F_i)$$
   Where W_Fi is the weight of the feature Fi

Algorithm 2: Select the relevant features from the W_Fi's using SFS

Input: Weighted attribute list W_Fi's and the user defined threshold δ

Output: An optimal subset SFopt

Method:
1. SFopt ← Ø
2. For each attribute Fi ∈ TS
      if W_Fi ≥ δ then SFopt ← SFopt ∪ {Fi}

A. Proposed work - An Example

To show the relevance of the proposed work, the weather dataset has been taken from the UCI machine learning repository. The dataset comprises 5 fields, of which 2 are continuous and 3 are discrete, and contains 14 instances. The target attribute contains two distinct values, 'yes' and 'no'. The entire content of the weather dataset is shown in Table 1.

Table 1. Weather Dataset

| Outlook | Temperature | Humidity | Windy | Play |
|---|---|---|---|---|
| Sunny | 85 | 85 | FALSE | No |
| Sunny | 80 | 90 | TRUE | No |
| Overcast | 83 | 86 | FALSE | Yes |
| Rainy | 70 | 96 | FALSE | Yes |
| Rainy | 68 | 80 | FALSE | Yes |
| Rainy | 65 | 70 | TRUE | No |
| Overcast | 64 | 65 | TRUE | Yes |
| Sunny | 72 | 95 | FALSE | No |
| Sunny | 69 | 70 | FALSE | Yes |
| Rainy | 75 | 80 | FALSE | Yes |
| Sunny | 75 | 70 | TRUE | Yes |
| Overcast | 72 | 90 | TRUE | Yes |
| Overcast | 81 | 75 | FALSE | Yes |
| Rainy | 71 | 91 | TRUE | No |

The median of the attribute 'Temperature' is 72 based on step 1(a) of algorithm 1 and it is illustrated in Table 2. After finding the median, the 'Temperature' is discretized as {High, High, High, Low, Low, Low, Low, Low, Low, High, High, Low, High, Low} based on step 1(b) of algorithm 1. Similar calculations are performed for the other continuous attributes in the dataset. Table 3 shows the complete content of the weather dataset after MBD.

Table 2. Calculation of Median

| Unique values of Temperature | Frequency of occurrence | Cumulative frequency |
|---|---|---|
| 64 | 1 | 1 |
| 65 | 1 | 2 |
| 68 | 1 | 3 |
| 69 | 1 | 4 |
| 70 | 1 | 5 |
| 71 | 1 | 6 |
| 72 | 2 | 8 |
| 75 | 2 | 10 |
| 80 | 1 | 11 |
| 81 | 1 | 12 |
| 83 | 1 | 13 |
| 85 | 1 | 14 |
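Step 1 of algorithm 1 (Median Based Discretization) can be sketched as follows for the 'Temperature' column of Table 1. This is a minimal illustration rather than the authors' implementation; the function names `mbd_median` and `mbd_discretize` are hypothetical.

```python
from collections import Counter


def mbd_median(values):
    """Step 1(a): first value whose cumulative frequency reaches (N+1)/2."""
    frequency = Counter(values)
    mid = (len(values) + 1) / 2
    cumulative = 0
    for value in sorted(frequency):
        cumulative += frequency[value]
        if cumulative >= mid:
            return value
    raise ValueError("values must not be empty")


def mbd_discretize(values):
    """Step 1(b): label values above the median 'High', all others 'Low'."""
    m = mbd_median(values)
    return ["High" if v > m else "Low" for v in values]


# 'Temperature' column of the weather dataset (Table 1).
temperature = [85, 80, 83, 70, 68, 65, 64, 72, 69, 75, 75, 72, 81, 71]
print(mbd_median(temperature))
print(mbd_discretize(temperature))
```

For this column, Mid = (14+1)/2 = 7.5 and the first cumulative frequency that reaches it is 8, at the value 72, so values equal to the median fall in the 'Low' bin.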

Table 3. Weather Dataset after MBD

| Outlook | Temperature | Humidity | Windy | Play |
|---|---|---|---|---|
| Sunny | High | High | False | No |
| Sunny | High | High | True | No |
| Overcast | High | High | False | Yes |
| Rainy | Low | High | False | Yes |
| Rainy | Low | Low | False | Yes |
| Rainy | Low | Low | True | No |
| Overcast | Low | Low | True | Yes |
| Sunny | Low | High | False | No |
| Sunny | Low | Low | False | Yes |
| Rainy | High | Low | False | Yes |
| Sunny | High | Low | True | Yes |
| Overcast | Low | High | True | Yes |
| Overcast | High | Low | False | Yes |
| Rainy | Low | High | True | No |

Computation of W_Fi:

F = {Outlook, Temperature, Humidity, Windy, Play}
Outlook = {Sunny, Overcast, Rainy}
Temperature = {High, Low}
Humidity = {High, Low}
Windy = {TRUE, FALSE}
Play = {Yes, No}

P(Yes) = 9/14 = 0.642857
P(No) = 5/14 = 0.357143

Finding the feature weight for the feature 'Outlook' with the target 'Play':

i) Calculate P(Play|Outlook) using step 2 of algorithm 1
P(Outlook) = P(Sunny) + P(Overcast) + P(Rainy) = 5÷14 + 4÷14 + 5÷14 = 1
P(Sunny|Yes) = 2/9 = 0.222222
P(Sunny|No) = 3/5 = 0.6
P(Overcast|Yes) = 4/9 = 0.444444
P(Rainy|Yes) = 3/9 = 0.333333
P(Rainy|No) = 2/5 = 0.4
P(Outlook|Yes) = 0.222222 × 0.444444 × 0.333333 = 0.032922
P(Yes|Outlook) = 0.032922 × 0.642857 = 0.021164
P(Outlook|No) = 0.6 × 0.4 = 0.24
P(No|Outlook) = 0.24 × 0.357143 = 0.085714
P(Play|Outlook) = P(Yes|Outlook) + P(No|Outlook) = 0.021164 + 0.085714 = 0.106878

Overcast never occurs with Play = No in Table 3, so no P(Overcast|No) term appears in the product for P(Outlook|No).

ii) Compute I(Outlook) using step 3 of algorithm 1
I(Outlook) = log2(1 ÷ 0.106878) = 3.225959
W_Outlook = 3.225959

Finding the feature weight for the feature 'Temperature' with the target 'Play':

i) Calculate P(Play|Temperature)
P(Temperature) = P(High) + P(Low) = 6÷14 + 8÷14 = 1
P(High|Yes) = 4/9 = 0.444444
P(High|No) = 2/5 = 0.4
P(Low|Yes) = 5/9 = 0.555555
P(Low|No) = 3/5 = 0.6
P(Temperature|Yes) = 0.444444 × 0.555555 = 0.246914
P(Yes|Temperature) = 0.246914 × 0.642857 = 0.158730
P(Temperature|No) = 0.4 × 0.6 = 0.24
P(No|Temperature) = 0.24 × 0.357143 = 0.085714
P(Play|Temperature) = P(Yes|Temperature) + P(No|Temperature) = 0.158730 + 0.085714 = 0.244444

ii) Compute I(Temperature)
I(Temperature) = log2(1 ÷ 0.244444) = 2.032421
W_Temperature = 2.032421

Similar calculations are performed for the remaining fields of the weather dataset viz., Humidity and Windy. Table 4 shows the summary of the results for the weather dataset using the proposed work.

Table 4. Final Results of the Proposed Work for the Weather Dataset

| Fi | P(Play|Fi) | W_Fi |
|---|---|---|
| Outlook | 0.106878 | 3.225959 |
| Temperature | 0.244444 | 2.032421 |
| Humidity | 0.2 | 2.321928 |
| Windy | 0.228571 | 2.129283 |

From Table 4, it is evident that a feature with a lower posterior probability carries more information relevant to the target and vice versa. The work assigns the median of the W_Fi's as the threshold δ for each dataset and selects the features whose W_Fi is greater than δ. Accordingly, SFopt for the weather dataset contains 'Outlook' and 'Humidity'. Because the median is the midpoint of the weights, approximately 50% of the features are selected, and the proposed method enhances the performance of NBC with this limited number of selected features.
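The worked example can be reproduced with the minimal sketch below, which follows the arithmetic shown for 'Outlook' and 'Temperature' rather than any published code. Two points are assumptions read off that example: P(Fi|Ck) is taken as the product of P(u|Ck) over the values u that occur with class Ck, and the division by P(Fi) is omitted because P(Fi) sums to 1. The names `ROWS`, `feature_weight` and `select_features` are hypothetical.

```python
import math
from collections import Counter
from statistics import median

# Weather dataset after MBD (Table 3): Outlook, Temperature, Humidity, Windy, Play.
ROWS = [
    ("Sunny", "High", "High", "False", "No"),
    ("Sunny", "High", "High", "True", "No"),
    ("Overcast", "High", "High", "False", "Yes"),
    ("Rainy", "Low", "High", "False", "Yes"),
    ("Rainy", "Low", "Low", "False", "Yes"),
    ("Rainy", "Low", "Low", "True", "No"),
    ("Overcast", "Low", "Low", "True", "Yes"),
    ("Sunny", "Low", "High", "False", "No"),
    ("Sunny", "Low", "Low", "False", "Yes"),
    ("Rainy", "High", "Low", "False", "Yes"),
    ("Sunny", "High", "Low", "True", "Yes"),
    ("Overcast", "Low", "High", "True", "Yes"),
    ("Overcast", "High", "Low", "False", "Yes"),
    ("Rainy", "Low", "High", "True", "No"),
]
FEATURES = ["Outlook", "Temperature", "Humidity", "Windy"]


def feature_weight(column, labels):
    """Steps 2 and 3 of algorithm 1, as applied in the worked example."""
    n = len(labels)
    p_c_given_f = 0.0
    for cls, n_cls in Counter(labels).items():
        # P(Fi | Ck): product of P(u | Ck) over the values u seen with class Ck.
        seen = Counter(v for v, y in zip(column, labels) if y == cls)
        likelihood = math.prod(count / n_cls for count in seen.values())
        # P(Ck | Fi) = P(Ck) * P(Fi | Ck); P(Fi) sums to 1 and is omitted.
        p_c_given_f += (n_cls / n) * likelihood
    return -math.log2(p_c_given_f)


def select_features(weights, delta):
    """Algorithm 2: keep every feature whose weight reaches the threshold."""
    return [name for name, w in weights.items() if w >= delta]


labels = [row[-1] for row in ROWS]
weights = {
    name: feature_weight([row[i] for row in ROWS], labels)
    for i, name in enumerate(FEATURES)
}
delta = median(weights.values())  # threshold: median of the feature weights
selected = select_features(weights, delta)

for name, w in weights.items():
    print(f"{name}: {w:.4f}")
print("selected:", selected)
```

With the median of the four weights as δ, the sketch selects 'Outlook' and 'Humidity', in agreement with Table 4 and the text.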

V. EXPERIMENTAL RESULTS

In order to analyze the effectiveness of the proposed method, an empirical study has been performed with 6 datasets taken from the UCI machine learning repository [12]. Each dataset comprises both nominal and continuous features. A comprehensive description of the datasets is given in Table 5. The proposed method for selecting the more relevant features has been implemented in Python. The number of features selected and the selected features for each dataset are shown in Table 6, and the graphical representation is shown in Fig. 2. From Table 6, it is observed that the number of features in the optimal subset is approximately 50% of the original features, because the proposed algorithm considers only the top 50% of relevant features in the original feature space. As the optimal subset contains fewer features than the original feature space, the method results in dimensionality reduction.

The original dataset and the newly obtained dataset, containing only the attributes selected by the proposed algorithm, are fed into NBC using WEKA to determine the predictive accuracy with the 10-fold cross validation method, and the results are shown in Table 7. The corresponding graph is shown in Fig. 3. From Table 7, it is observed that the predictive accuracy of NBC for 4 datasets viz., Weather, Pima Indian Diabetes, Statlog Heart and Eeg is improved for the selected features. For the Ann-train dataset, the accuracy is decreased for the selected features. For Breast Cancer, the accuracy remains the same for the original and the selected features. Thus, on average, it has been found that the accuracy of NBC is improved with the subset of selected features. The reason is that the combination of BT and SI helps to identify the features which provide more information about the target, so that irrelevant and least relevant features are excluded from the resultant optimal subset. Hence it is concluded that the proposed FS using BT, SI and SFS enhances the accuracy of NBC with approximately 50% of the original features and the accuracy enhancement is 1.02%.

Fig.2. Original features vs. selected features

Fig.3. Accuracy comparison of NBC with original features and selected features

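Table 5. General Characteristics of Datasets

| S. No | Data Sets | #Features (Numeric) | #Features (Nominal) | #Features (Total) | #Classes | #Instances |
|---|---|---|---|---|---|---|
| 1 | Weather | 2 | 3 | 5 | 2 | 14 |
| 2 | Breast Cancer | 10 | 1 | 11 | 2 | 699 |
| 3 | Pima Indian Diabetes | 8 | 1 | 9 | 2 | 768 |
| 4 | Statlog Heart | 13 | 1 | 14 | 2 | 270 |
| 5 | Ann-train | 21 | 1 | 22 | 3 | 3772 |
| 6 | Eeg | 14 | 1 | 15 | 2 | 14979 |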

Table 6. Comparison of Original Features and Selected Features

| Datasets | Total no. of Features | No. of Features Selected | Selected Features |
|---|---|---|---|
| Weather | 5 | 2 | 1,3 |
| Breast Cancer | 11 | 6 | 3,6,7,8,9,10 |
| Pima Indian Diabetes | 9 | 5 | 1,2,6,7,8 |
| Statlog Heart | 14 | 7 | 3,6,9,10,11,12,13 |
| Ann-train | 22 | 11 | 1,4,5,9,12,14,17,18,19,20,21 |
| Eeg | 15 | 7 | 1,6,7,11,12,13,14 |
| Total (%) | 76 (100%) | 38 (50%) | |


Table 7. Accuracy Comparison of NBC with Original Features and Selected Features

| Dataset | Accuracy (%) With Original Features (All) | Accuracy (%) With Selected Features (Using Proposed Method) |
|---|---|---|
| Weather | 64.2857 | 71.4286 |
| Breast Cancer | 95.9943 | 95.9943 |
| Pima Indian Diabetes | 76.3021 | 76.6927 |
| Statlog Heart | 83.7037 | 84.0741 |
| Ann-train | 95.6522 | 95.1485 |
| Eeg | 48.0406 | 48.9285 |
| Average | 77.32 | 78.71 |

Further, this work also evaluates the performance of the proposed work by determining precision and recall of NBC for each datasets using (6) and (7) and they are shown in Tables 8 and 9 respectively. The corresponding graphs are shown in Fig. 4 and 5 respectively. From Tables 8 and 9, it has been found that the weighted average of precision and recall of NBC using the selected features have been considerably increased than that of the original features. Table 8. Precision Comparison of NBC with Original Features and Selected Features

Dataset Weather Breast Cancer Pima Indian Diabetes Statlog Heart Ann-train Eeg Average

Precision With Selected Features With Original (Using Proposed Features Method) 0.607 0.706 0.961 0.962 0.759

0.762

0.837 0.95 0.529 0.774

0.841 0.942 0.537 0.792

Table 9. Recall Comparison of NBC with Original Features and Selected Features

Dataset                 Recall with              Recall with
                        Original Features        Selected Features (Using Proposed Method)
Weather                 0.643                    0.714
Breast Cancer           0.96                     0.96
Pima Indian Diabetes    0.763                    0.767
Statlog Heart           0.837                    0.841
Ann-train               0.951                    0.957
Eeg                     0.48                     0.489
Average                 0.773                    0.787

As the accuracy, precision and recall of NBC increase for the features selected by the proposed work, the combination of BT, SI and SFS appears well suited to FS for enhancing the performance of NBC. The main advantage of the proposed work is that the values computed for BT in the pre-processing step, viz. P(Cj), P(Fi) and P(Fi|Cj), can reduce the time needed to construct the NBC model if they are reused in the learning phase. It is evident from the literature that the wrapper model uses the specific learning algorithm itself to assess the quality of the selected features, and that it generally performs better than the filter model because the feature selection process is optimized for the particular classification algorithm to be used [13]. As the proposed work has been used to improve the accuracy of NBC, it is recommended that the proposed framework be used as a wrapper model in future, with NBC as the learning algorithm. Wrapper models are normally more expensive and slower than filters, but with the recommended framework time and cost may be saved because BT is used in both FS and NBC.


Fig.4. Precision comparison of NBC with original features and selected features

Fig.5. Recall comparison of NBC with original features and selected features


VI. CONCLUSION

This paper presents a novel method that integrates BT and SI to measure feature weights. The weighted features are fed into SFS to select the strongly relevant features from the feature space. The features selected by the proposed method are then fed into NBC to determine the predictive accuracy, precision and recall. The experimental results show that the average predictive accuracy of NBC increases by approximately 1.4 percentage points (from 77.32% to 78.71%) when the subset of features chosen by the proposed method is used. Similarly, the average precision and recall of NBC increase from 0.774 to 0.792 and from 0.773 to 0.787 respectively. Further, the time taken to build the NBC model can be reduced if the values of BT computed during pre-processing are reused in the learning phase. The other main advantage of the proposed method is that it uses MBD to discretize the continuous features, which eliminates outliers and the need to specify the number of bins.

REFERENCES

[1] Mark A. Hall (2000) 'Correlation-based Feature Selection for Discrete and Numeric Class Machine Learning', in ICML 2000: Proceedings of the Seventeenth International Conference on Machine Learning, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp. 359-366.
[2] Lei Yu and Huan Liu (2004) 'Efficient Feature Selection via Analysis of Relevance and Redundancy', Journal of Machine Learning Research, pp. 1205-1224.
[3] Jiawei Han and Micheline Kamber (2006) Data Mining: Concepts and Techniques, 2nd ed., Morgan Kaufmann Publisher.
[4] Jacek Biesiada and Wlodzislaw Duch (2007) 'Feature Selection for High-Dimensional Data: A Pearson Redundancy Based Filter', Computer Recognition System 2, ASC, Vol. 45, pp. 242-249.
[5] Ranjan Bose (2008) Information Theory, Coding and Cryptography, 2nd edition, Tata McGraw-Hill Publishing Company Limited.
[6] Subramanian Appavu Alias Balamurugu et al. (2009) 'Effective and Efficient Feature Selection for Large-scale Data using Bayes Theorem', International Journal of Automation and Computing, pp. 62-71.
[7] Gauthier Doquire and Michel Verleysen (2011) 'Mutual information based feature selection for mixed data', ESANN proceedings, pp. 27-29.
[8] Subramanian Appavu et al. (2011) 'Bayes Theorem and Information Gain Based Feature Selection for maximizing the performance of classifier', Springer-Verlag Berlin Heidelberg, CCSIT, pp. 501-511.
[9] John Peter, T. and Somasundaram, K. (2012) 'Study and Development of Novel Feature Selection Framework for Heart Disease Prediction', International Journal of Scientific and Research Publication, Vol. 2, No. 10.
[10] Rajeshwari, K. et al. (2013) 'Improving efficiency of Classification using PCA and Apriori based attribute selection technique', Research Journal of Applied Sciences, Engineering and Technology, Maxwell Scientific Organisation, pp. 4681-4684.
[11] Mani, K. and Kalpana, P. (2015) 'A Filter-based Feature Selection using Information Gain with Median Based Discretization for Naive Bayesian Classifier', International Journal of Applied and Engineering Research, Vol. 10 No. 82, pp. 280-285.
[12] UCI Machine Learning Repository - Center for Machine Learning and Intelligent System. [online] http://archive.ics.uci.edu (Accessed 10 October 2015).
[13] Mr. Saptarsi Goswami and Dr. Amlan Chakrabarti (2014) 'Feature Selection: A Practitioner View', International Journal of Information Technology and Computer Science, Vol. 11, pp. 66-77.
[14] Eniafe Festus Ayetiran and Adesesan Barnabas Adeyemo (2012) 'A Data Mining-Based Response Model for Target Selection in Direct Marketing', International Journal of Information Technology and Computer Science, Vol. 1, pp. 9-18.
[15] Muhammad Atif Tahir, Ahmed Bouridane and Fatih Kurugollu (2007) 'Simultaneous feature selection and feature weighting using Hybrid Tabu Search/K-nearest neighbor classifier', Pattern Recognition Letters, Elsevier, pp. 438-446.

Authors’ Profiles

Mani. K received his MCA, M.Phil and M.Tech. degrees from Bharathidasan University, Trichy, India in Computer Applications, Computer Science and Advanced Information Technology respectively. He completed his Graduation in Operations Research from the Operational Research Society of India, Kolkata. Since 1989, he has been associated with the Department of Computer Science of Nehru Memorial College, affiliated to Bharathidasan University, where he is currently working as an Associate Professor. He completed his PhD in Cryptography with primary emphasis on the evolution of a framework for enhancing security and optimizing run time in cryptographic algorithms. He has published and presented around 20 research papers in international journals and conferences. His research areas include Cryptography, Data Mining, Coding Theory, Computer Simulation and Optimization of Algorithms.

P.Kalpana received her B.Sc and M.Sc degrees in Computer Science from Seethalakshmi Ramaswami College, affiliated to Bharathidasan University, Tiruchirappalli, India in 1999 and 2001 respectively. She received her M.Phil degree in Computer Science in 2004 from Bharathidasan University. She also received her MBA degree in Human Resource Management from Bharathidasan University in 2007. She is presently working as an Assistant Professor in the Department of Computer Science, Nehru Memorial College, Puthanampatti, Tiruchirappalli, India. She is currently pursuing a PhD degree in Computer Science at Bharathidasan University. Her research interests include Algorithms, Data Pre-processing and Data Mining techniques.


How to cite this paper: K. Mani, P. Kalpana, "An Efficient Feature Selection based on Bayes Theorem, Self Information and Sequential Forward Selection", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.8, No.6, pp.46-54, 2016. DOI: 10.5815/ijieeb.2016.06.06



I.J. Information Engineering and Electronic Business, 2016, 6, 55-61 Published Online November 2016 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijieeb.2016.06.07

Design & Optimization of Reversible Logic Based ALU Using ACO

Shaveta Thakral
Manav Rachna International University/ECE, Faridabad, 121004, India
Email: shaveta.fet@mriu.edu.in

Dipali Bansal
Manav Rachna International University/ECE, Faridabad, 121004, India
Email: dipali.fet@mriu.edu.in

Abstract—Portable consumer electronics is in demand in every segment of the electronics industry, and to satisfy the need for low-power electronics, comprehensive approaches and techniques have been proposed by various researchers. Reversible logic is one of the emerging and competent technologies, with applications in computer graphics, optical information processing, quantum computing, DNA computing, ultra-low-power CMOS design and communication. The ALU is a fundamental component of all processing units, and portability in computing systems creates a strong demand for reversible-logic-based ALUs. Many researchers have proposed exact synthesis approaches for ALU design based on reversible logic, but few have achieved reduced quantum cost without long computation overhead. In this paper a heuristic approach is used, which not only provides solutions for a large number of variables but also avoids long computation overhead. The main goal of this paper is to propose a reversible-logic-based ALU and to optimize it, in terms of reduced quantum cost, using the Ant Colony Optimization (ACO) algorithm combined with Depth First Search (DFS).

Index Terms—Ant colony optimization, Arithmetic logic unit, Depth first search, Quantum cost, Reversible logic

I. INTRODUCTION

Landauer proposed that "the amount of energy dissipated for every bit erased during an irreversible operation is KT ln 2 joules, where K is Boltzmann's constant and T is the operating temperature". Bennett answered Landauer's statement by showing that "the KT ln 2 energy dissipation would not occur if computation is done in a reversible manner, since the amount of energy dissipated in a system depends directly on the number of bits erased during computation". Classical two-input gates such as AND, OR, NAND, NOR, XOR and XNOR are irreversible, because the input states cannot be uniquely reconstructed from the output states.


Here, a two-bit input state is mapped to a one-bit output state, which erases one bit and consequently dissipates energy. This energy loss can be avoided by mapping n-bit input states to n-bit output states so that the input states can be uniquely recovered from the output states; under such circumstances a gate is said to be reversible. The main features of reversible logic are that the number of inputs must equal the number of outputs, no fan-out is allowed, and every output must be used only once. Section A describes reversible logic gates and Section B introduces the multi-control Toffoli gate.

A. Reversible Logic Gates

Of the four possible 1*1 one-qubit gates, only two are reversible: the trivial (identity) gate and the NOT gate. Similarly, of the 256 possible 2*2 two-qubit gates, only 24 are reversible. There exist 16777216 different 3*3 three-qubit gates, but the number of reversible 3*3 gates is much smaller, i.e. 40320.

B. Multi control Toffoli Gate (n*n)

An n*n reversible gate has n inputs and n outputs. In a multi-control Toffoli (MCT) gate, the first n-1 bits are known as control bits and the last bit, i.e. the nth bit, is generally known as the target bit, although the nth bit is chosen only for simplicity and the target could be any of the n bits. An MCT gate passes all the control bits to the output unchanged and inverts the target bit when all control bits are 1. An MCT gate may also have one or more negative controls; in that case the target bit is inverted when all positive control bits are 1 and all negative control bits are 0. The symbol ⊕ is put on the target line, a filled circle (●) is put for a positive control and an open circle (○) is put for a negative control.

Generally a Toffoli gate is represented by TOF(c: t), where c is the set of control bits and t is the target bit. A Toffoli network consists of three basic gates: TOF1(X), called the NOT gate; TOF2(X: Y), called the CNOT gate or Feynman gate; and TOF3(X, Y: Z), called the Toffoli gate. The primitives of a Toffoli network for reversible logic synthesis are shown in Fig.1.
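The gate counts and the MCT behaviour described above can be sketched in Python. This is an illustrative sketch, not the authors' implementation: the function names `count_gates` and `mct`, the integer encoding of bit vectors, and the bit indexing (0 = least significant bit) are assumptions made for the example.

```python
from math import factorial

def count_gates(n_bits):
    """Return (number of all n*n Boolean gates, number of reversible n*n gates)."""
    states = 2 ** n_bits
    total = states ** states        # each input state may map to any output state
    reversible = factorial(states)  # reversible gates are permutations of the states
    return total, reversible

def mct(state, target, pos_controls=(), neg_controls=()):
    """Apply a multi-control Toffoli gate to an integer-encoded bit vector.

    The target bit is inverted only when every positive control is 1
    and every negative control is 0; all other bits pass through unchanged.
    """
    fires = (all((state >> c) & 1 for c in pos_controls)
             and not any((state >> c) & 1 for c in neg_controls))
    return state ^ (1 << target) if fires else state

for n in (1, 2, 3):
    print(n, count_gates(n))

# TOF3(X, Y: Z) with controls on bits 0 and 1 and target on bit 2.
print(bin(mct(0b011, target=2, pos_controls=(0, 1))))
```

Because an MCT gate only toggles the target bit under a condition that does not depend on the target, applying the same gate twice restores the original state, which is why every MCT gate is its own inverse.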


Fig.1. Primitives of Toffoli network for reversible logic synthesis

Some popular reversible logic gates that are used in the proposed ALU design are given in Table 1 with their specification, expression, quantum cost and quantum implementation.

Table 1. Popular Reversible Logic Gates in ALU Design

Reversible Logic Gate        Specification    Quantum Cost
NOT gate                     1*1              1
CNOT Gate/Feynman Gate       2*2              1
Toffoli/CCNOT Gate           3*3              5
Fredkin Gate/CSWAP Gate      3*3              5

[Expression and Quantum Implementation columns: graphical content not reproduced]
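For readers without access to the graphical columns of Table 1, the four gates can be sketched with their standard textbook definitions. These definitions are the commonly used ones and are an assumption here, not a transcription of the table's Expression column; `is_reversible` and `and_gate` are hypothetical helpers added for illustration.

```python
from itertools import product

def not_gate(a):
    return (1 - a,)

def cnot(a, b):
    # Feynman gate: the second bit is XORed with the first.
    return (a, a ^ b)

def toffoli(a, b, c):
    # CCNOT: the third bit is inverted when both controls are 1.
    return (a, b, c ^ (a & b))

def fredkin(a, b, c):
    # CSWAP: the last two bits are swapped when the control is 1.
    return (a, c, b) if a else (a, b, c)

def and_gate(a, b):
    # Classical irreversible gate, included for contrast.
    return (a & b,)

def is_reversible(gate, n_bits):
    """A gate is reversible when distinct inputs always give distinct outputs
    and the output has as many bits as the input."""
    outputs = {gate(*bits) for bits in product((0, 1), repeat=n_bits)}
    return len(outputs) == 2 ** n_bits

for name, gate, n in [("NOT", not_gate, 1), ("CNOT", cnot, 2),
                      ("Toffoli", toffoli, 3), ("Fredkin", fredkin, 3),
                      ("AND", and_gate, 2)]:
    print(name, is_reversible(gate, n))
```

The check enumerates every input pattern, so it is only practical for small gates, which is sufficient for the 1*1 to 3*3 gates listed in Table 1.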

II. RELATED WORK

Ant Colony Optimization (ACO) is a nature-inspired heuristic based on the way ants find the shortest path from their nest to a food source, taking into account both visibility and feasibility. There may be several paths to the destination, but the ants choose the shortest, hurdle-free one. Ants lay down pheromone on the path they travel, and other ants follow the trail. Using a technique similar to the one ants use to reach their destination, the algorithm searches for the optimal solution or path for a given problem. Several parameters have to be taken into account to find the best path.

Several approaches have been proposed by various authors for reversible-logic-based ALU design. Min Li [1] focused on the search for the best path in reversible logic synthesis using ACO and reported that this technique is superior to previously proposed ones. Mayukh Sarkar [2] proposed a technique to synthesize reversible circuits: the classical Boolean logic is first converted using a synthesis method, and the Quine-McCluskey method is then used for the depth search and the breadth search in ACO. The results are reported to be very satisfactory. In [3], ant colony optimization is explained from two different perspectives; the authors of [1] also refer to this work, which gives a guarantee of an optimal or near-optimal solution for the synthesis problem. An ESOP-based method is proposed by K. Fazel [4] for cascaded structures, in which the function is converted into EX-OR form. This allows the reversible circuit to be formed as a cascade of Toffoli gates; the algorithm is found to be fast and uses a simple cost-metric heuristic based on the divide-and-conquer rule. Ravi Raj Singh [5] proposed an ALU circuit using nanotechnology and quantum computing with minimum power dissipation; a carry-save adder is used to implement the ALU design and to minimize the quantum cost. Paper [6] proposed and verified a reversible logic gate; an ALU design with a ripple carry adder has also been implemented. Zhijin Guan [7] discussed the construction of an ALU based on reversible logic gates, in which traditional OR, AND and other gates are replaced by reversible logic gates with the aim of reducing power consumption. A function generator is constructed to reduce the effort of the adder used in the circuit. A comparison between a classical ALU and their proposed ALU showed considerably lower power consumption for the latter. The ALU proposed in [8] consists of multiplexers and control signals, generated by a control unit, which select the operands to be manipulated.

Although several approaches have thus been proposed by various authors for reversible-logic-based ALU designs, there is still considerable scope to improve the quantum cost of the ALU. In this paper an ALU design is proposed and, after applying the ACO algorithm, it is optimized in terms of parameters such as number of gates, quantum cost, logic operations and garbage outputs. Section A describes popular reversible logic gates used in ALU designs. Section B describes the n*n multi-control Toffoli gate and Section C gives background on ant colony optimization. Section III presents the proposed methodology for reversible logic synthesis, i.e. ACO, the meta-heuristic method of search and optimization.


III. PROPOSED METHODOLOGY

In the proposed ACO approach, the path found by the artificial ants is the shortest path in the form of a synthesized reversible function. During the search, the ants choose a path, i.e. a Toffoli gate, from the gate library that has been created. The decisions of the ants are based on availability and on the pheromone quantity, which gradually increases as ants select a gate: the more ants decide to travel along a path, the stronger the pheromone on that path becomes. To guide the ants efficiently, a pheromone table is constructed in which the pheromone values are updated (either increased or decreased). When the ants have traveled the whole path and have reached, or come close to, the destination, the pheromone table is updated. The pheromone then biases the gate selection of the other ants.

A. Pheromone Table

This table is used to manage the pheromone levels of the ants traveling the path and also helps in defining the shortest path, i.e. the functions of the Toffoli gates and the number of gates to be used in implementing the circuit/function. The table starts with a predefined amount of pheromone; every time ACO updates the pheromone table, the pheromone value is decreased or increased, and the gate with the target value is updated. The artificial ants perform a randomized walk through the graph derived by ACO and lay down pheromone on the path by selecting favorable Toffoli gates. Pheromone trails are associated with the connections between components: the greater the quantity of pheromone, the greater the probability that other ants choose the trail. Models of the same type have been used before and have proven efficient. At the same time, pheromone evaporates if few ants choose the path. The pheromone update factor is given by:

$$P_p \leftarrow EVR \times P_p + P_n$$

where $P_p$ is the previous pheromone value, $P_n$ is the new pheromone value and $EVR$ is the evaporation rate.

B. ACO Algorithm

First, the pheromone table is initialized, followed by the global loop, which runs up to the upper limit of G_loop iterations. A fixed number of ants is then released; each ant i selects a path by selecting Toffoli gates, the output of each Toffoli gate being the function that provides the path, and the pheromone is evenly distributed by all the ants i. The target bit of each gate is selected by a probability function of

$$W_t(cs_i, k) = \sum_{i=0}^{N} \tau_{s_i,i}(t),$$

the sum of pheromone for the t-th bit. The probability of selecting the control bits of the selected Toffoli gate is calculated from

$$W_c(cs_i, k) = \sum_{i=0}^{N} \tau_{s_i,i}(t, m, c_m),$$

the sum of pheromone for the m-th bit set to value $c_m$ when the t-th bit is the target bit.

C. Proposed ALU Design

The ALU is designed using reversible logic gates and supports the following operations:
1. 7 arithmetic operations
2. 5 logical operations

The quantum cost of the proposed ALU is 37, which reflects the complexity of the design, so the design needs to be optimized. The efficiency of a design depends on many factors, such as size, number of gates, number of garbage outputs and quantum cost. This work does not take all of them into account, but optimizes as many as possible to obtain better and more effective performance.

D. ACO applied on ALU for Optimization

The goal of this optimization is to decrease the gate count, which ultimately decreases the quantum cost and increases the efficiency of the circuit. A total of i ants are employed. Each ant i searches for a route until the L_loop (local loop) terminates or dist_i becomes 0, which means that the ant has reached f_I, i.e. the identity function, at iteration index L_loop. The local search is implemented at steps 8-17 of Fig.2, which nest depth-first searches in a breadth-first search controlled by parameters BREADTH and DEPTH, respectively. Each ant i evaluates the breadth and depth of every function and the best route. Loops or repeated paths may be formed; these are deleted (steps 20-21), and after the removal of loops the route and the pheromone are updated in the pheromone table (step 23).

E. Working of Proposed ACO

ACO is employed to overcome the above-mentioned problem. This method has proven effective for solving many difficult combinatorial optimization problems. The algorithm works in reverse: it takes the output function of a circuit as its input, and the output of ACO is the input function of the circuit. Certain parameters are applied to achieve this goal. The algorithm is organized in linear phases: first a pheromone table is implemented, and then the ACO algorithm uses this pheromone table to update the ant paths and define the shortest path to the solution. The algorithm and the pheromone table are explained in the sections above. This can be understood with the help of an example. Let the function be f(n) = {4,1,7,5,2,6,0,3}, to which the optimization algorithm is to be applied. Table 2 shows the truth table of this reversible function.

Step1. initialize pheromone table
Step2. repeat K < G_loop
Step3. repeat for all ants i
Step4. initialize dist_i = Ham(CS_i, FS) // CS - Current State, FS - Final State
Step5. start CS of the function
Step6. CS_i as 'P' for ants and initialize the route_i
Step7. lif = route_i.end // local iteration function
Step8. repeat until (L_loops = 10 & dist_i ≠ 0)
Step9. j = 1 for all the children
Step10. CS_i = toffoli_func(lif)
Step11. CS_i = toffoli_func(CS_i)
Step12. add CS_i to route_ij
Step13. if (dist(CS_i) < dist_i)
Step14. dist_i = dist(CS_i)
Step15. best_route = j
Step16. end if
Step17. go to step8
Step18. go to step3
Step19. take the route_i
Step20. remove the loop, if formed
Step21. add the best_route to route_i
Step22. end for
Step23. do pheromone update
Step24. end

Fig.2. Proposed ACO Algorithm
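The two pheromone mechanisms used by the algorithm, updating a table entry and choosing a gate component with pheromone-weighted probability, can be sketched as follows. This is a hedged sketch under two assumptions: that the update takes the form P_p ← EVR × P_p + P_n, and that selection is proportional to the pheromone weight (the exact probability functions are not reproduced in the text). The names `update_pheromone`, `choose_by_pheromone` and the numeric values are hypothetical.

```python
import random

def update_pheromone(previous, new, evaporation_rate):
    """Assumed update rule: retain a fraction of the old pheromone, then add the new deposit."""
    return evaporation_rate * previous + new

def choose_by_pheromone(pheromone, rng):
    """Pick a key with probability proportional to its pheromone weight (assumed rule)."""
    keys = list(pheromone)
    weights = [pheromone[k] for k in keys]
    return rng.choices(keys, weights=weights, k=1)[0]

# Hypothetical pheromone weights for the three candidate target bits of a 3-bit gate.
target_pheromone = {0: 1.0, 1: 1.0, 2: 1.0}
rng = random.Random(0)  # seeded so that the walk is repeatable

chosen = choose_by_pheromone(target_pheromone, rng)
target_pheromone[chosen] = update_pheromone(
    target_pheromone[chosen], new=0.5, evaporation_rate=0.9)
print(chosen, target_pheromone)
```

With this kind of rule, a target bit that is reinforced becomes more likely to be chosen by later ants, while entries that receive no deposit decay whenever the evaporation rate is below 1.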

Each ant employed by the algorithm selects a promising Toffoli gate, having t as the target bit and c_k as control bits, where k is the number of control bits, through the probability function explained above. Every ant tries to find a path to its destination, which is the identity function f_I = {0, 1, 2, 3, 4, 5, 6, 7}; each ant carries a specific amount of pheromone, which it lays down as it moves towards the destination. The ants travel according to the Tour_graph shown in Fig.3, which contains virtual and non-virtual paths. Virtual and non-virtual paths are determined using the Hamming distance between the states. The path is selected according to the Tour_graph. When one ant completes its tour, its route is updated; the other ants go through the same process, following the trails that carry more pheromone. When all the ants have completed the tour, the pheromone table is updated for the next tour. By using the probabilities and the pheromone values, this technique helps to find the optimal path without searching the whole space. Once the ants reach the destination f_I, the shortest path is updated.
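The distance that drives the search can be illustrated with a short Python sketch. It assumes that Ham(CS, FS) is the total number of output bits in which the current function differs from the final state, here the identity function; the source does not spell out this definition, so treat it as one plausible reading. The helper name `hamming_to_identity` is hypothetical.

```python
def hamming_to_identity(function):
    """Total number of output bits that differ from the identity function."""
    return sum(bin(image ^ index).count("1")
               for index, image in enumerate(function))

f = [4, 1, 7, 5, 2, 6, 0, 3]   # example function from the text
identity = list(range(8))       # destination f_I

print(hamming_to_identity(f))
print(hamming_to_identity(identity))
```

Under this reading, an ant has reached the destination exactly when the distance is 0, which matches the termination condition dist_i = 0 in the algorithm.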

Fig.3. Tour_graph (graph with nodes 0 to 7)

Fig. 4 shows the paths searched by the ants to reach the destination. The shortest path traced by the ants, which also has the highest pheromone value, consists of 4 Toffoli gates. Several paths are shown, but the ants trace the shortest one. The advantage of the algorithm is that it does not search the whole space of possibilities, but only the most probable paths, using the heuristic.

Table 2. Reversible function f(n) = {4,1,7,5,2,6,0,3}

Inputs       Outputs
X  Y  Z      X' Y' Z'
0  0  0      1  0  0
0  0  1      0  0  1
0  1  0      1  1  1
0  1  1      1  0  1
1  0  0      0  1  0
1  0  1      1  1  0
1  1  0      0  0  0
1  1  1      0  1  1
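The truth table of Table 2 follows directly from the permutation f(n) = {4,1,7,5,2,6,0,3}. The sketch below regenerates it in Python, assuming X is the most significant bit and Z the least significant; the helper name `to_bits` is illustrative only.

```python
def to_bits(value, n_bits):
    """Return the bits of value, most significant first."""
    return tuple((value >> shift) & 1 for shift in reversed(range(n_bits)))

f = [4, 1, 7, 5, 2, 6, 0, 3]

# One row per input value: (input bits X Y Z, output bits).
rows = [(to_bits(index, 3), to_bits(image, 3)) for index, image in enumerate(f)]

# A function is reversible when it is a permutation of its input states.
is_permutation = sorted(f) == list(range(len(f)))

# Output columns, top to bottom, for comparison with the table.
output_columns = [list(column) for column in zip(*(out for _, out in rows))]

for inputs, outputs in rows:
    print(*inputs, "|", *outputs)
```

Reading the three output columns from top to bottom gives the same bit sequences as the output columns of Table 2.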


Fig.4. ACO Path

IV. ALU AFTER ACO IMPLEMENTATION

After ACO is applied, the shortest path is obtained; the traced path is given by Toffoli gates with their target bits, control bits and control polarities. It is then converted into a circuit. The circuit can be optimized further, because it contains 5x5 and 4x4 Toffoli gates, which prevent the quantum cost from decreasing as far as it otherwise could. These gates can be replaced by their equivalents, as shown in Fig.5. The ALU circuit consists of 10 reversible logic gates: four Feynman gates, three R gates, two Fredkin gates and one HNG gate. The circuit performs 12 operations in total, seven arithmetic and five logical.

Fig.5. Block Diagram of ALU

Fig. 6 shows the RTL view of the ALU implemented after applying ACO. The RTL schematic, drawn for the top-level module, gives a visual representation of the architecture of the implemented design and shows how its components are connected.

Fig.6. RTL view of ALU

V. SIMULATION & RESULT ANALYSIS

Fig. 7 shows the output waveform of the functional ALU design after the implementation of ACO. The Toffoli circuit is converted into other gates to make the ALU more efficient and to reduce the delay. The resulting design has a quantum cost of 32; applying ACO thus reduces the quantum cost by 5.

Fig.7. Simulation waveform

Table 3. Comparison Table

ALU Designs       No. of Gates   Quantum Cost   Logic Operations   Garbage O/Ps   Type of Gates Used
ALU before ACO    9              37             12                 12             Toffoli, Feynman, Fredkin
ALU after ACO     10             32             12                 11             Feynman, Fredkin, R-Gate, HNG


Table 3 compares the ALU before and after ACO synthesis in terms of the major parameters affecting ALU performance. Some Toffoli gates have been replaced by other gates to lower the quantum cost. Tables 4 and 5 give the function tables for the logical and the arithmetic operations respectively. Although ACO increases the gate count by one, it reduces the quantum cost and the number of garbage outputs, which are considered the two most important optimization metrics for reversible circuits.

Table 4. Function Table for Logical Operations

S0   S1   S2   Cin   Output   Function
0    0    1    x     A+B      OR
1    0    1    x     A.B      AND
0    1    1    x     A⊕B      XOR
1    1    1    x     A'       NOT
1    1    0    1     A        Transfer A

Table 5. Function Table for Arithmetic Operations

S0   S1   S2   Cin   Output    Function
0    0    0    0     A         Transfer A
0    0    0    1     A+1       Increment A
1    0    0    0     A+B       Addition
1    0    0    1     A+B+1     Add with Carry
0    1    0    0     A+B'      Subtract with Borrow
0    1    0    1     A+B'+1    Subtraction
1    1    0    0     A-1       Decrement A

VI. CONCLUSION

In this paper, the ant colony optimization algorithm has been applied to an ALU design; the optimization approach is probabilistic and is formulated for an ALU built from reversible gates. This method does not search the whole space but selects paths according to the decisions made by ACO, and it reduces the quantum cost from 37 to 32, as shown in Fig.8.

Fig.8. Optimization metrics comparison before & after ACO (bar chart of reversible logic gates, quantum cost, logic operations and garbage outputs for the ALU before and after ACO; vertical axis from 0 to 40)

REFERENCES

[1] M. Li, Y. Zheng, MS. Hsiao and C. Huang, "Reversible logic synthesis through ant colony optimization," Design, Automation & Test in Europe Conference & Exhibition; 8-12 March 2010; Dresden. Europe: IEEE, pp. 307-310.
[2] M. Sarkar, P. Ghosal, SP. Mohanty, "Reversible circuit synthesis using ACO and SA based Quine-McCluskey method," IEEE 2013 56th International Midwest Symposium on Circuits and Systems; 4-7 August 2013; Columbus, OH: IEEE, pp. 416-419.
[3] WJ. Gutjahr, "ACO algorithm with guaranteed convergence to the optimal solution," Information Processing Letters, 82(3):145-153, 2002.
[4] K. Fazel, M. A. Thornton, J. E. Rice, "ESOP-based Toffoli Gate Cascade Generation," IEEE 2007 Pacific Rim Conference on Communication, Computers and Signal Processing; 22-24 August 2007; Victoria, BC: IEEE, pp. 206-209.
[5] R. Singh, S. Upadhyay, S. Saranya S, Soumya, KB. Jagannath, SA. Hariprasad, "Efficient Design of Arithmetic Logic Unit using Reversible Logic Gates," IJARCET 2014, vol. 3, pp. 1474-1477.
[6] M. Matthew, L. Matthew, M. Richard and R. Nagarajan, "Design of a Novel Reversible ALU using an Enhanced Carry Look-Ahead Adder," 11th IEEE International Conference on Nanotechnology; 15-18 August 2011; Portland. Marlott: IEEE, pp. 1436-1440.
[7] Z. Guan, W. Li, W. Ding, Y. Hang, L. Ni, "An arithmetic logic unit design based on reversible logic gates," Pacific Rim Conference on Communication, Computers and Signal Processing; 23-26 August 2011; Victoria, BC: IEEE, pp. 925-931.
[8] Y. Syamala, A. V. N. Tilak, "Reversible arithmetic logic unit," 3rd International Conference on Electronics Computer Technology (ICECT); 8-10 April 2011; Kanyakumari: IEEE, pp. 207-211.
[9] MB. Ali, MM. Hossin and ME. Ullah, "Design of Reversible Sequential Circuit Using Reversible Logic Synthesis", International Journal of VLSI design & Communication Systems (VLSICS), Vol.2, No.4, December 2011.
[10] F. Sharmin, RK. Mitra, R. Hasan, A. Rahman, "Low cost reversible signed comparator", International Journal of VLSI design & Communication Systems, Vol.4, No.5, pp. 19-33, 2013.
[11] B. Dehghan, A. Roozbeh, J. Zare, "Design of low power comparator using DG gate", Scientific Research Circuits and Systems, doi.org/10.4236/cs.2013.51002, pp. 7-12, 2014.
[12] N. Pandey, N. Dadhich, MZ. Talha, "Realization of 2-to-4 reversible decoder and its applications", International Conference on Signal Processing and Integrated Networks (SPIN), pp. 349-353, 2014.
[13] V. Oklobdzija, "High-Speed VLSI Arithmetic Units: Adders and Multipliers", in "Design of High Performance Microprocessor Circuits", book chapter, book edited by A. Chandrakasan, IEEE Press, 2000.
[14] M. K. Thomson, Robert Gluck and Holger Bock Axelsen, "Reversible arithmetic logic unit for quantum arithmetic", Journal of Physics A: Mathematical and Theoretical, Vol.43, 2010.
[15] Robert Wille and Rolf Drechsler, Towards a Design Flow for Reversible Logic, ISBN: 978-90-481-9578-7, Springer, Dordrecht Heidelberg London New York.


Authors’ Profiles

Shaveta Thakral is presently working as an Associate Professor in the Electronics & Communication department, Faculty of Engineering and Technology, Manav Rachna International University, Faridabad. She obtained her BE in Electronics and Communication from Lingayas Institute of Management and Technology, Faridabad, and her MTECH from IASE Deemed University, Rajasthan. She is currently pursuing a PhD at Manav Rachna International University, Faridabad. Her current research areas include Analog and Digital circuits, VLSI and Microprocessors. She has 11.5 years of work experience and has published 18 research papers.

Dr. Dipali Bansal is presently Professor & Head of the Electronics & Communication department, Faculty of Engineering and Technology, Manav Rachna International University, Faridabad. She obtained her BE in Electronics and Telecommunication from BIT, indri; her ME in Instrumentation and Control Engineering from MDU, Rohtak; and her PhD from Jamia Milia Islamia, New Delhi. Her current research areas include Digital Signal Processing, Bio-signal acquisition and automated analysis. She has 19.5 years of work experience and has published 58 research papers.

How to cite this paper: Shaveta Thakral, Dipali Bansal, "Design & Optimization of Reversible Logic Based ALU Using ACO", International Journal of Information Engineering and Electronic Business (IJIEEB), Vol.8, No.6, pp.55-61, 2016. DOI: 10.5815/ijieeb.2016.06.07



I.J. Information Engineering and Electronic Business, 2016, 6, 62-68 Published Online November 2016 in MECS (http://www.mecs-press.org/) DOI: 10.5815/ijieeb.2016.06.08

An Analysis of Fuzzy and Spatial Methods for Edge Detection

Pushpa Mamoria
Department of Computer Science, Babasaheb Bhimrao Ambedkar University, Lucknow, India
Email: p.mat76@gmail.com

Deepa Raj Department of Computer Science, Babasaheb Bhimrao Ambedkar University, Lucknow, India Email: Deepa_raj200@yahoo.co.in

Abstract—Image segmentation subdivides an image into sub-regions in order to extract image characteristics that support analysis in various applications. Accurate segmentation depends on locating sharp changes of intensity, a task known as edge detection. In this paper, various spatial edge detection methods and fuzzy-based edge detection methods are described, and the spatial methods are compared with the fuzzy if-then-else method to determine which is more suitable for finding edges for image enhancement.

Index Terms—Image segmentation, Edge, Threshold, Fuzzy method.

I. INTRODUCTION
A method whose input is an image and whose output is a set of properties extracted from that image marks a transition in image processing, and segmentation lies at this transition. Image segmentation splits an image into sub-regions in order to collect the details found in each sub-region. These details are helpful in analyzing images for enhancement. The accuracy of enhancement depends on the accuracy of the segmentation method, so care should be taken to segment accurately. Segmentation is useful in many applications, such as object detection in military settings and industrial inspection. Image segmentation methods fall into two categories. The first partitions an image on the basis of sharp changes in intensity, known as edges; the second is based on a set of predefined criteria and includes thresholding, region growing, and region splitting and merging [1] [2]. With thresholding, edge detection is possible even under noisy conditions. In the Canny method, the image is convolved with the first-order derivative of a Gaussian filter [3]; however, the Canny detector can also produce some false edges. The structure of an image can also be found with a method known as USAN, which is helpful in edge detection [4]. Edges may also be found with the help of zero crossings [5]. Entropy has been used to measure the degree of fuzziness, producing 1-pixel-wide edges [6]. Segmenting an image into meaningful regions is a difficult task. With the Canny method, segmentation may yield many false edges, which complicates the image characteristics. Instead of the Canny method, a two-step Chan-Vese method helps to detect better edges [7]. Classical spatial methods such as the Canny edge detector, the Sobel method, Prewitt edge detection, and the Laplacian of Gaussian are not always able to detect correct and smooth edges in images. More recently, a method based on intuitionistic fuzzy set (IFS) theory was proposed to detect correct and smooth edges. The intuitionistic fuzzy method has used the concept of entropy in various clustering algorithms [9]. The remainder of the paper is organized as follows. Section II describes various classical edge detection techniques available in the literature. Section III describes fuzzy domain edge detection techniques. Section IV presents the experimental study and results. Section V draws conclusions.

II. SPATIAL DOMAIN EDGE DETECTION TECHNIQUES
Spatial filtering is used in edge detection to locate abrupt breaks (discontinuities) between gray levels. First-order and second-order derivatives are used for edge detection in images: first-order derivatives are computed using the gradient, and second-order derivatives are obtained using the Laplacian.

A. Gradient operator
The gradient of an image f(x, y) at a location (x, y) is defined as the vector:

$$\nabla f = \begin{bmatrix} G_x \\ G_y \end{bmatrix} = \begin{bmatrix} \partial f / \partial x \\ \partial f / \partial y \end{bmatrix} \tag{1}$$

Magnitude of the above vector:

$$|\nabla f| = \sqrt{G_x^2 + G_y^2} \tag{2}$$

The direction of the gradient vector:

$$\alpha(x, y) = \tan^{-1}\left(\frac{G_y}{G_x}\right) \tag{3}$$
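As a rough illustration of the gradient vector, its magnitude and its direction, the following Python sketch uses NumPy finite differences. The function name, the synthetic test image, and the convention that x indexes rows and y indexes columns are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def gradient_magnitude_direction(f):
    """Return (g_x, g_y, magnitude, direction) for a 2-D grayscale array."""
    f = np.asarray(f, dtype=float)
    # Assumption: x indexes rows and y indexes columns.
    # np.gradient uses central differences in the interior and
    # one-sided differences at the borders.
    g_x, g_y = np.gradient(f)
    magnitude = np.sqrt(g_x ** 2 + g_y ** 2)
    # arctan2 is a quadrant-aware version of tan^-1(G_y / G_x), in radians.
    direction = np.arctan2(g_y, g_x)
    return g_x, g_y, magnitude, direction

# Synthetic image: dark left half, bright right half (a vertical step edge).
image = np.zeros((5, 6))
image[:, 3:] = 100.0
g_x, g_y, magnitude, direction = gradient_magnitude_direction(image)
```

On this step image the magnitude is non-zero only in the two columns next to the step, which is where an edge detector is expected to respond.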

 

Edge detection methods are grouped into three categories:

B. Classical edge detection methods

i. Sobel Mask
Mask used by this method is:

Fig.1. Sobel Mask

Using the equations below, the Sobel mask is used for edge detection, where $z_1, \dots, z_9$ denote the gray levels of the 3×3 neighborhood numbered row by row:

$$G_x = (z_7 + 2z_8 + z_9) - (z_1 + 2z_2 + z_3) \tag{4}$$

$$G_y = (z_3 + 2z_6 + z_9) - (z_1 + 2z_4 + z_7) \tag{5}$$

ii. Prewitt Mask
Mask used by this method is:

Fig.2. Prewitt Mask

Using the equations below, the Prewitt mask is used for edge detection:

$$G_x = (z_7 + z_8 + z_9) - (z_1 + z_2 + z_3) \tag{6}$$

$$G_y = (z_3 + z_6 + z_9) - (z_1 + z_4 + z_7) \tag{7}$$

iii. Robert Mask
Mask used by this method is:

Fig.3. Robert mask

Using the equations below, the Robert mask is used for edge detection:

$$G_x = z_9 - z_5 \tag{8}$$

$$G_y = z_8 - z_6 \tag{9}$$

C. First order edge detection method

i. Canny edge detection
It uses smoothing to remove noise, computes the image gradient to find locations of large magnitude, and applies double thresholding.

D. Second order edge detection method

i. Laplacian
The Laplacian of a 2-D function f(x, y) is a second-order derivative. The Laplacian is combined with smoothing to find edges via zero-crossings.

Fig.4. Laplacian mask

In Laplacian filtering the enhanced image can be found by:

$$g(x, y) = f(x, y) + c\,\nabla^2 f(x, y) \tag{10}$$

where c = -1 if the center coefficient of the Laplacian mask is negative and c = +1 otherwise.

ii. Laplacian of a Gaussian (LOG)
The Laplacian of a Gaussian is sometimes called the Mexican hat function. The Gaussian function is used to smooth the image, and the Laplacian operator is used to establish the location of edges by finding zero-crossings. Figure 5 is a 5×5 mask that approximates the shape of the Mexican hat function.

Fig.5. Laplacian of Gaussian Filter
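As a rough illustration of the mask-based operators of Section II, the sketch below applies Sobel, Prewitt and Laplacian masks to a synthetic step image with plain NumPy. The mask coefficients are the common textbook ones and are an assumption here, since the mask figures are not reproduced in the text; the helper names and the edge-replicated border are also choices made only for this example.

```python
import numpy as np

# Common textbook masks (assumed; the paper's mask figures are not available).
SOBEL_X = np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T
PREWITT_X = np.array([[-1, -1, -1], [0, 0, 0], [1, 1, 1]], dtype=float)
PREWITT_Y = PREWITT_X.T
LAPLACIAN = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

def correlate3x3(image, mask):
    """Slide a 3x3 mask over the image, replicating the border pixels."""
    image = np.asarray(image, dtype=float)
    padded = np.pad(image, 1, mode="edge")
    rows, cols = image.shape
    out = np.zeros((rows, cols))
    for di in range(3):
        for dj in range(3):
            out += mask[di, dj] * padded[di:di + rows, dj:dj + cols]
    return out

def gradient_magnitude(image, mask_x, mask_y):
    """Combine the two directional responses into a gradient magnitude."""
    g_x = correlate3x3(image, mask_x)
    g_y = correlate3x3(image, mask_y)
    return np.sqrt(g_x ** 2 + g_y ** 2)

# Synthetic image with a vertical step edge between columns 2 and 3.
image = np.zeros((6, 6))
image[:, 3:] = 10.0

sobel_mag = gradient_magnitude(image, SOBEL_X, SOBEL_Y)
prewitt_mag = gradient_magnitude(image, PREWITT_X, PREWITT_Y)
laplacian = correlate3x3(image, LAPLACIAN)
```

The first-order masks respond with a peak on both sides of the step, while the Laplacian response changes sign across the step; that sign change is the zero-crossing used by second-order methods.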

III. FUZZY DOMAIN EDGE DETECTION TECHNIQUES
In many applications edge detection is critical; in medical imaging, for example, the diagnosis of disease depends on it. For this reason, images should be free of poor contrast, vagueness, and blurred or broken edges. Fuzzy methods are well suited to this task because they take into account the vagueness and ambiguity present in the image [17]. Several fuzzy edge detection methods are as follows:

A. Fuzzy Sobel edge detector [13]
In this method the image is divided into two regions:

i. Fuzzy edge region
Pixels whose gray level differs strongly from that of their neighborhood are assigned to the fuzzy edge region.

ii. Fuzzy smooth region
Pixels whose gray level differs little from that of their neighborhood are assigned to the fuzzy smooth region.

Using fuzzy reasoning, the modified fuzzy edge detector is generalized by the following fuzzy rules:

$$R(x, y) = \begin{cases} 255, & \text{if } G(x, y) \ge HT \\ 0, & \text{if } G(x, y) \le LT \\ G(x, y) \cdot \max\left(\mu_{sm}(x, y), \mu_{ed}(x, y)\right), & \text{otherwise} \end{cases} \tag{11}$$

where $\mu_{sm}$ and $\mu_{ed}$ are the membership functions of the image smooth and edge regions, G(x, y) is the gradient value obtained with the Sobel operator, and R(x, y) is the resultant pixel at location (x, y).

B. Entropy-Based Fuzzy Edge Detection [11]
As per information theory, the entropy is defined as:

$$H(t_1, t_2) = -P_{sm} \log(P_{sm}) - P_{ed} \log(P_{ed}) \tag{12}$$

where $P_{sm}$ and $P_{ed}$ are probability distributions:

$$P_{sm} = \sum_{k} \mu_{sm}(k) \cdot h(k) \tag{13}$$

$$P_{ed} = \sum_{k} \mu_{ed}(k) \cdot h(k) \tag{14}$$

Here $P_{sm}$ and $P_{ed}$ are weighted areas of the gradient histogram h(k), with the membership functions $\mu_{sm}$ and $\mu_{ed}$ as weights. In edge detection, the best parameter values give a compact edge representation of the image, which is why the parameters with minimum entropy $H(t_1, t_2)$ are selected. With $P_{sm} = 1 - P_{ed}$, the necessary conditions for an extremum (minimum or maximum) of the entropy are:

$$\frac{\partial H(t_1, t_2)}{\partial t_1} = 0 = -\left(\frac{\partial P_{ed}}{\partial t_1} \times \log\left(\frac{P_{ed}}{1 - P_{ed}}\right)\right) \tag{15}$$

$$\frac{\partial H(t_1, t_2)}{\partial t_2} = 0 = -\left(\frac{\partial P_{ed}}{\partial t_2} \times \log\left(\frac{P_{ed}}{1 - P_{ed}}\right)\right) \tag{16}$$

The entropy is at an extremum when

$$P_{sm} = P_{ed} = \tfrac{1}{2} \tag{17}$$

The best set of parameters $(\tilde{t}_1, \tilde{t}_2)$ will satisfy the following condition:

$$H(\tilde{t}_1, \tilde{t}_2) = \min_{t_1, t_2} H(t_1, t_2) \tag{18}$$

The edge image is calculated as:

Edge-image(x, y) = 1, … (19)

C. Fuzzy template based edge detector
Two types of fuzzy template based edge detectors are available. In the first method, fuzzy edge templates are designed and convolved with the image [12]. To determine the existence of an edge:

H(x, y) = … (20)

where n = number of templates. The new image is obtained by thresholding: values below the threshold are set to 0 and values above the threshold are set to 1. In the second method, the fuzzy divergence between the image window and a set of 16 fuzzy templates is used [8].

D. Fuzzy If-then Rules based Edge Detection
This fuzzy technique is based on rule-based fuzzy logic. The fuzzy rule-based concept is taken from fuzzy set theory [14] [15] because of its simplicity and effectiveness. It is a form of inference over uncertain knowledge that handles and analyzes information effectively. Different filters can be combined with fuzzy if-then rules to detect edges for the enhancement of images [16]. The fuzzy logic edge detection algorithm uses the following rules [1]:
(1) If a pixel belongs to the same region, then make it brighter; else make it darker, where the values brighter and darker are fuzzy sets.
(2) A 3x3 pixel neighborhood and the corresponding intensity differences between the center pixel and its neighbors are shown below.

Fig.6. 3x3 mask

The following are the if-then-else rules based on fuzzy values.
IF …
IF …
IF …
IF …

Membership functions ZE, BL, and WH are used for the fuzzy sets zero, black, and white.

IV. EXPERIMENTAL STUDY AND RESULTS
In this study, several images were taken from the internet and from Matlab for the edge detection experiments. The procedure for edge detection using fuzzy if-then rules is as follows:

Step-1: Apply the simple filters (convolution) to obtain the image gradients.
Step-2: Define the image gradients of the input image with respect to the x-axis and y-axis directions.
Step-3: Design a Fuzzy Inference System (FIS) with 2 inputs and 1 output.
Step-4: Set the fuzzy if-then rules in the FIS.
Step-5: Obtain the membership functions of the image gradients Ix and Iy.
Step-6: Apply Ix and Iy to the above FIS for evaluation.
Step-7: Store the result obtained from the output of the FIS. This result gives the edges of the image.

We compare various classical edge detection methods with the fuzzy if-then-else rule-based edge detection method on the basis of two performance parameters, PSNR and MSE. These values are reported for every output image, so that the methods can be compared quantitatively.

The mathematical formula of the PSNR is as follows:

$$PSNR = 20 \log_{10}\left(\frac{255}{\sqrt{MSE}}\right) \tag{21}$$

where MSE (mean square error) is as follows:

$$MSE = \frac{1}{mn} \sum_{i=1}^{m} \sum_{j=1}^{n} \left\| f(i, j) - g(i, j) \right\|^2 \tag{22}$$

where f = matrix data of the original image, g = matrix data of the degraded image, m = number of rows of pixels of an image, i = index of that row, n = number of columns of pixels of an image, and j = index of that column. Higher values of PSNR indicate better results; a quantitative measure is needed because visual inspection alone cannot reliably establish which method is better.

Table 1. MSE values of different images

Method                Image 1     Image 2     Image 3     Image 4
SOBEL                 12698.83    14161.36    17974.31    17702.02
PREWITT               12698.91    14161.36    17974.38    17702.12
ROBERT                12699.58    14163.92    17974.16    17702.16
CANNY                 12683.91    14161.25    17962.88    17693.80
FUZZY IF-THEN ELSE    0.34        0.34        0.24        0.27

Fig.7. MSE values of different images (bar chart: MSE value versus edge detection method, for Images 1 to 4)

INPUT IMAGES
(a) Kid image
(b) Rice image
(c) Cameraman image
(d) Lena image
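The fuzzy if-then procedure (gradients Ix and Iy fed to a two-input, one-output inference system) can be sketched in Python as below. This is only an illustrative approximation: the Gaussian "zero" membership function, its width `sigma`, the two rules, and the weighted-average defuzzification are all assumptions made for this example, since the rule base and membership function shapes are not recoverable from the text.

```python
import numpy as np

def zero_membership(g, sigma=0.1):
    """Degree to which a gradient value is 'zero' (assumed Gaussian shape)."""
    return np.exp(-(g ** 2) / (2.0 * sigma ** 2))

def fuzzy_edge_map(image, sigma=0.1):
    """Return an edge-strength map in [0, 1]; values near 1 indicate an edge."""
    img = np.asarray(image, dtype=float)
    span = img.max() - img.min()
    # Normalize intensities to [0, 1] so that sigma has a fixed meaning.
    img = (img - img.min()) / span if span > 0 else np.zeros_like(img)

    # Steps 1-2: image gradients along the x (columns) and y (rows) directions.
    i_x = np.gradient(img, axis=1)
    i_y = np.gradient(img, axis=0)

    # Step 5: membership of each gradient in the fuzzy set 'zero'.
    mu_x = zero_membership(i_x, sigma)
    mu_y = zero_membership(i_y, sigma)

    # Steps 3-4 (assumed rules):
    #   IF Ix is zero AND Iy is zero THEN output is white (uniform region)
    #   IF Ix is not zero OR Iy is not zero THEN output is black (edge)
    w_white = np.minimum(mu_x, mu_y)
    w_black = np.maximum(1.0 - mu_x, 1.0 - mu_y)

    # Step 6: weighted-average defuzzification with white = 1 and black = 0.
    output = (w_white * 1.0 + w_black * 0.0) / (w_white + w_black)

    # Step 7: report edge strength as the complement of the 'white' output.
    return 1.0 - output

# Synthetic image with a vertical step edge between columns 2 and 3.
image = np.zeros((6, 6))
image[:, 3:] = 200.0
edges = fuzzy_edge_map(image)
```

A full Mamdani-style system with triangular output sets and centroid defuzzification would follow the same steps; the weighted average is used here only to keep the sketch short.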


Fig.8. Results of various edge detection methods (rows: Sobel, Prewitt, Robert, Canny, Fuzzy If-then)

Table 2. PSNR values of different images

Method                Image 1    Image 2    Image 3    Image 4
SOBEL                 7.40       6.65       5.62       11.71
PREWITT               7.40       6.65       5.62       11.71
ROBERT                7.40       6.65       5.62       11.71
CANNY                 7.40       6.65       5.62       11.71
FUZZY IF-THEN ELSE    53.14      52.79      54.34      59.87

Fig.9. PSNR values of different images (bar chart: PSNR value versus edge detection method, for Images 1 to 4)
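The two performance measures used in the experiments, MSE and PSNR, can be computed with a few lines of NumPy. The sketch below assumes 8-bit images with a peak value of 255; the function names and the `peak` parameter are choices made for this example.

```python
import numpy as np

def mse(f, g):
    """Mean square error between an original image f and a processed image g."""
    f = np.asarray(f, dtype=float)
    g = np.asarray(g, dtype=float)
    return float(np.mean((f - g) ** 2))

def psnr(f, g, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit images."""
    err = mse(f, g)
    if err == 0.0:
        return float("inf")  # identical images
    return float(20.0 * np.log10(peak / np.sqrt(err)))

# Small example: every pixel of g differs from f by 5 gray levels.
f = np.zeros((4, 4))
g = f + 5.0
example_mse = mse(f, g)    # 25.0
example_psnr = psnr(f, g)  # 20 * log10(255 / 5)
```

Under the same assumption, an MSE of 0.24 corresponds to roughly 54.3 dB, which is in line with the fuzzy if-then entry for Image 3 in Tables 1 and 2.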

V. CONCLUSION
In this paper, edge detection for image segmentation using various spatial methods and fuzzy methods was discussed, and the results of several spatial edge detection techniques were compared with those of the fuzzy if-then method. The experimental study was performed on various images collected from the internet and Matlab. According to the experimental results, the fuzzy method gives better results than spatial methods such as Sobel, Prewitt, Robert and Canny, both visually and in terms of the quantitative performance parameters PSNR and MSE. The PSNR values of the fuzzy method are higher, and its MSE values lower, than those of the other methods. The graphs of MSE and PSNR values clearly show the better results of the fuzzy if-then-else method.

REFERENCES
[1] R. C. Gonzalez and R. E. Woods, "Digital Image Processing," 3rd ed. Prentice Hall, 2009.
[2] Jang, J.-S. R., C. T. Sun, and E. Mizutani, "Neuro-fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence," Prentice-Hall, Upper Saddle River, NJ, 1997.
[3] J. F. Canny, "A computational approach to edge detection," IEEE Trans. on Pattern Analysis and Machine Intelligence, 8(6), pp. 679-698, 1986.
[4] V. K. Madasu, S. Vasikarla, "Fuzzy edge detection in biometric systems," 36th Applied Imagery Pattern Recognition Workshop, IEEE, 2007.
[5] D. Marr and E. C. Hildreth, "Theory of edge detection," Proc. of the Royal Society of London, pp. 187-217, 1980.
[6] M. Hanmandlu, J. See, and S. Vasikarla, "Fuzzy edge detector using entropy optimization", Proc. ITCC, pp. 665-670, 2004.
[7] Y.-S. Chen, Y.-M. Chang, J.-C. Lin, "Comparing Intuitionistic Fuzzy Set Theory Method and Canny Algorithm for Edge Detection to Tongue Diagnosis in Traditional Chinese Medicine", Proc. of the International Conference of Information Application (ICCIA 2012), 2012.
[8] T. Chaira and A. K. Ray, "A new measure using intuitionistic fuzzy set theory and its application to edge detection," Applied Soft Computing, vol. 8-2, March 2008, pp. 919-927, doi:10.1016/j.asoc.2007.07.004.
[9] T. Chaira, "A novel intuitionistic fuzzy-C means clustering algorithm and its application to medical images," Applied Soft Computing, vol. 11-2, Mar. 2011, pp. 1711-1717, doi:10.1016/j.asoc.2010.05.005.
[10] P. R. Possa, S. A. Mahmoudi, Naim Harb, C. Valderrama, "A Multi-Resolution FPGA-Based Architecture for Real-Time Edge and Corner Detection", IEEE Transactions on Computers, January 2013.
[11] S. E. El-Khamy, I. Ghaleb, N. A. El-Yamany, "Fuzzy edge detection using minimum entropy", in Proceedings of 11th Mediterranean Electrotechnical Conference MELECON, Cairo, Egypt, 2002.
[12] Ho Kenneth, H. L., and Ohnishi, N., "FEDGE—fuzzy edge detection by fuzzy categorization and classification of edges", in Fuzzy Logic in Artificial Intelligence, IJCAI'95 Workshop, Selected Papers, pp. 182-196, 1995.
[13] Khamy, E. L. et al., "Modified Sobel fuzzy edge detector", in Proceedings of 17th National Radio Science Conference (NRSC 2000), C32-1-9, Minufia, Egypt, 2000.
[14] L. A. Zadeh, "Fuzzy sets", Information and Control 8, pp. 338-353, 1965.
[15] K. Pal and R. A. King, "Image Enhancement using Fuzzy Set", Electronics Letters, Vol. 16, No. 10, May 1980.
[16] P. Mamoria, D. Raj, "An Analysis of Images Using Fuzzy Contrast Enhancement Techniques", 3rd 2016 International Conference on Computing for Sustainable Global Development, INDIACom-2016 (IEEE Conference ID: 37465), BVICAM, New Delhi, India.
[17] S. K. Dubey, S. Panday, "Measurement of Usability of Office Application Using a Fuzzy Multi-Criteria Technique", IJITCS, MECS Publisher, Vol. 7, No. 4, March 2015.


Authors' Profiles
Pushpa Mamoria received her BE degree in Computer Science and Engineering from Shri G.S. Institute of Technology and Science, Indore (MP), India, and her M. Tech. degree in Computer Science from the School of Computer Science, DAVV, Indore. She is currently pursuing her Ph.D. degree in the Department of Computer Science, Babasaheb Bhimrao Ambedkar University, Lucknow, India. Her major research interests include digital image processing, fuzzy logic, neural networks, artificial intelligence, cognitive science, and wireless sensor networks.

Dr. Deepa Raj is working as an assistant professor in the Department of Computer Science, Babasaheb Bhim Rao Ambedkar University. She completed her post-graduation at J.K. Institute of Applied Physics and Technology, Allahabad University, and her Ph.D. at Babasaheb Bhim Rao Ambedkar University, Lucknow, in the field of software engineering. Her fields of interest are software engineering, computer graphics, and image processing. She has attended many national and international conferences and has published a number of research papers in her field.

How to cite this paper: Pushpa Mamoria, Deepa Raj,"An Analysis of Fuzzy and Spatial Methods for Edge Detection", International Journal of Information Engineering and Electronic Business(IJIEEB), Vol.8, No.6, pp.62-68, 2016. DOI: 10.5815/ijieeb.2016.06.08



Instructions for Authors

Manuscript Submission
We invite original, previously unpublished research papers, review, survey and tutorial papers, application papers, plus case studies, short research notes and letters, on both applied and theoretical aspects. Manuscripts should be written in English. All papers except surveys should ideally not exceed 18,000 words (15 pages) in length. Whenever applicable, submissions must include the following elements: title, authors, affiliations, contacts, abstract, index terms, introduction, main text, conclusions, appendixes, acknowledgement, references, and biographies. Papers should be formatted into A4-size (8.27″×11.69″) pages, with main text in 10-point Times New Roman, in single-spaced two-column format. Figures and tables must be sized as they are to appear in print. Figures should be placed exactly where they are to appear within the text. There is no strict requirement on the format of the manuscripts. However, authors are strongly recommended to follow the format of the final version. Papers should be submitted to the MECS Publisher, Unit B 13/F PRAT COMM’L BLDG, 17-19 PRAT AVENUE, TSIMSHATSUI KLN, Hong Kong (Email: ijieeb@mecs-press.org, Paper Submission System: www.mecs-press.org/ijieeb/submission.html), with a covering email clearly stating the name, address and affiliation of the corresponding author. Paper submissions are accepted only in PDF. Other formats are not acceptable. Each paper will be provided with a unique paper ID for further reference. Authors may suggest 2-4 reviewers when submitting their works, by providing us with the reviewers’ title, full name and contact information. The editor will decide whether the recommendations will be used or not.

Conference Version
Submissions previously published in conference proceedings are eligible for consideration provided that the author informs the Editors at the time of submission and that the submission has undergone substantial revision. In the new submission, authors are required to cite the previous publication and very clearly indicate how the new submission offers substantively novel or different contributions beyond those of the previously published work. The appropriate way to indicate that your paper has been revised substantially is for the new paper to have a new title. Authors should supply a copy of the previous version to the Editor, and provide a brief description of the differences between the submitted manuscript and the previous version. If the authors provide a previously published conference submission, Editors will check the submission to determine whether there has been sufficient new material added to warrant publication in the Journal. The MECS Publisher’s guidelines are that the submission should contain a significant amount of new material, that is, material that has not been published elsewhere. New results are not required; however, the submission should contain expansions of key ideas, examples, and so on, of the conference submission. The paper submitted to the journal should differ from the previously published material by at least 50 percent.

Review Process
Submissions are accepted for review with the understanding that the same work has been neither submitted to, nor published in, another publication. Concurrent submission to other publications will result in immediate rejection of the submission. All manuscripts will be subject to a well established, fair, unbiased peer review and refereeing procedure, and are considered on the basis of their significance, novelty and usefulness to the Journal's readership. The reviewing structure will always ensure the anonymity of the referees. The review output will be one of the following decisions: Accept, Accept with minor revision, Accept with major revision, Reject with a possibility of resubmitting, or Reject. The review process may take approximately three months to be completed. Should authors be requested by the editor to revise the text, the revised version should be submitted within three months for a major revision or one month for a minor revision. Authors who need more time are kindly requested to contact the Editor. The Editor reserves the right to reject a paper if it does not meet the aims and scope of the journal, it is not technically sound, it is not revised satisfactorily, or if it is inadequate in presentation.

Revised and Final Version Submission
The revised version should follow the same formatting requirements as the final version, plus a short summary of the modifications the authors have made and the authors' comments. Authors are requested to use the MECS Publisher Journal Style for preparing the final camera-ready version. A template in PDF and an MS Word template can be downloaded from the web site. Authors are requested to strictly follow the guidelines specified in the templates. Only PDF format is acceptable. The PDF document should be sent as an open file, i.e. without any data protection. Authors should submit their paper electronically through email to the Journal’s submission address. Please always refer to the paper ID in the submissions and any further enquiries. Please do not use the Adobe Acrobat PDFWriter to generate the PDF file. Use the Adobe Acrobat Distiller instead, which is contained in the same package as the Acrobat PDFWriter. Make sure that you have used Type 1 or True Type Fonts (check with the Acrobat Reader or Acrobat Writer by clicking on File>Document Properties>Fonts to see the list of fonts and their type used in the PDF document).

Copyright
Submission of your paper to this journal implies that the paper is not under submission for publication elsewhere. Material which has been previously copyrighted, published, or accepted for publication will not be considered for publication in this journal. Submission of a manuscript is interpreted as a statement of certification that no part of the manuscript is under review by any other formal publication. Submitted papers are assumed to contain no proprietary material unprotected by patent or patent application; responsibility for technical content and for protection of proprietary material rests solely with the author(s) and their organizations and is not the responsibility of the MECS Publisher or its editorial staff. The main author is responsible for ensuring that the article has been seen and approved by all the other authors. It is the responsibility of the author to obtain all necessary copyright release permissions for the use of any copyrighted materials in the manuscript prior to the submission. More information about permission requests can be found at the web site. Authors are asked to sign a warranty and copyright agreement upon acceptance of their manuscript, before the manuscript can be published. The Copyright Transfer Agreement can be downloaded from the web site.

Publication Charges and Re-print
There are no page charges for publications in this journal. Reprints of the paper can be ordered at a price of 150 USD. Electronic: freely available on www.mecs-press.org. To subscribe, please contact the Journal Subscriptions Department, E-mail: ijieeb@mecs-press.org. More information is available on the web site at http://www.mecs-press.org/ijieeb.


