Int. Journal of Electrical & Electronics Engg.
Vol. 2, Spl. Issue 1 (2015)
e-ISSN: 1694-2310 | p-ISSN: 1694-2426
Implementation of Back-Propagation Neural Network using Scilab and its Convergence Speed Improvement
Abstract: Artificial neural networks have been widely used for solving complex non-linear tasks. With the development of computer technology, machine learning techniques are becoming a good choice, and the selection of a technique depends on its viability for the particular application. Many non-linear problems have been solved using back-propagation based neural networks. The training time of a neural network is directly affected by its convergence speed, and several efforts have been made to improve the convergence speed of the back-propagation algorithm. This paper focuses on the implementation of the back-propagation algorithm and on an effort to improve its convergence speed. The algorithm is written in SCILAB, and a standard UCI data set is used for analysis. The proposed modification of the standard back-propagation algorithm reduces the number of iterations needed for convergence by roughly 2% to 10% on this data set.
Keywords: Back-propagation, SCILAB, Neural network
I. INTRODUCTION
Multilayer perceptron (MLP) based neural networks play a crucial role in artificial intelligence. Complex non-linear problems can be solved using an MLP system, and the gradient based back-propagation algorithm is commonly used to design the system to an optimal criterion. A back-propagation based neural network controller shows remarkable performance in the trajectory tracking problem of robotic arms[1]. Water pollution forecasting models, breast cancer detection[2], heart disease classification[3] and remote sensing image classification[4] are some of the wide range of applications that employ neural networks efficiently. Recent back-propagation based neural networks take the raw image directly as input, which efficiently handles problems such as selecting appropriate features and reducing the number of dimensions. Pedestrian gender classification, traffic sign recognition, character recognition, age estimation, vision based classification of cells and vehicle recognition are some of the applications that process raw images directly with the gradient based back-propagation algorithm[5]-[9].
Figure 1.1 shows the multilayer perceptron neural network architecture. The architecture consists of a single input layer, multiple hidden layers and an output layer. All layers except the input layer are processing layers. The system processes the data in two passes. In the first pass, the forward pass, input features are fed to the input layer, and the output and gradients are calculated for each layer. In the backward pass the free parameters are updated. The stochastic gradient (online update) and batch gradient (offline update) methods are used for updating the weights[10]. Continuous activation functions are used for gradient based system design. Back-propagation based systems are slow learners.
NITTTR, Chandigarh EDIT-2015
Figure 1.6 MLP architecture
II. SOFTWARE IMPLEMENTATION OF BPA
SCILAB is open source software providing functionality similar to that of MATLAB. The back-propagation algorithm (BPA) is written in SCILAB 5.5.1. The code follows this pseudo code:
1. Initialization
2. Hidden layer output calculation
3. Final layer output calculation
4. Error calculation and delta calculation for each processing unit
5. Calculate the loss function; if the stopping criterion is met, stop, otherwise go to the next step
6. Update the input-to-hidden layer weights
7. Update the hidden-to-output layer weights
8. Go to step 2
Initialization covers the parameters (learning constant, number of hidden neurons, input, desired output, weights). The output of a neuron is a function of the weighted sum of the inputs connected to that neuron. Mean square error is used as the loss function. The weights are updated using equation (i):
$$w(t+1) = w(t) + \eta \, r(t) \, x \qquad \text{(i)}$$
where $w(t+1)$ is the updated weight value, $w(t)$ is the previous weight value, $\eta$ is the learning constant, $r(t)$ is the learning signal and $x$ is the input signal. $r(t)$ is the delta signal, which is the product of the error and the gradient of the output.
The designed system accepts the number of hidden neurons from the user and contains a single hidden layer with one unit in the output layer. The system updates the weights in offline mode. The maximum mean square error over all samples is taken, and the gradient is calculated based on it. All samples are updated with the same learning signal. Minimizing the maximum mean square error reduces the errors of all input samples over time, and the system converges to the desired level. A convergence graph is plotted between the mean square error and the number of iterations (epochs). A single iteration corresponds to one calculation of the maximum mean square error over all samples. The developed system is tested on the Wisconsin breast cancer dataset. The dataset and the system implementation for it are explained in the next section.
III. DATA SET INFORMATION AND SYSTEM IMPLEMENTATION
The Wisconsin Breast Cancer (UCI) database of 699 patients is used for the experiments. This dataset contains 9 input features, namely Clump Thickness, Uniformity of Cell Size, Uniformity of Cell Shape, Marginal Adhesion, Single Epithelial Cell Size, Bare Nuclei, Bland Chromatin, Normal Nucleoli and Mitoses. Corresponding to these features there are two classes of data, benign and malignant, with 458 benign and 241 malignant instances. 16 instances in the dataset are of unknown category, and some feature vectors are replicated within the same class. After removing the replicated and unknown (16) instances, the dataset for experimentation is reduced to 445 instances, of which 209 are benign and 236 are malignant. The benign and malignant data are each divided into a training set (70%) and a testing set (30%): 146 benign instances are used for training and 63 for testing, while 165 malignant instances are used for training and 71 for testing[11].
An architecture of 9 input units, 6 hidden neurons and 1 output unit is used. The benign and malignant datasets are trained individually. The learning rate is 0.5 and the mean square error criterion is 0.001; training stops when the error falls below this criterion. The weights are initialized in the range 0 to 0.01, a choice made after several experiments on weight initialization.
IV. CONVERGENCE GRAPH FOR BENIGN AND MALIGNANT DATASET
Figure 1.2 shows the convergence graph for the benign dataset. The system reaches the stopping criterion after 1452 iterations. Figure 1.3 shows the convergence graph for the malignant dataset, which converges after 480 iterations.
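The training procedure of Section II (steps 1 to 8 with update rule (i)) can be sketched as follows. This is a minimal illustration in Python rather than the SCILAB code of the paper, and it rests on several assumptions that the text does not confirm: a logistic (sigmoid) activation, no bias terms, and one weight update per epoch driven by the sample with the largest squared error, which is only one possible reading of the offline update described above. The names `sigmoid`, `forward` and `train` are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Logistic activation (assumed); continuous, so its gradient exists."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(x, w_ih, w_ho):
    """Steps 2-3: hidden-layer outputs, then the single output unit."""
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_ih]
    o = sigmoid(sum(w * hj for w, hj in zip(w_ho, h)))
    return h, o


def train(samples, targets, n_hidden=6, eta=0.5, tol=0.001,
          max_epochs=5000, seed=0):
    """Train a one-hidden-layer network; returns weights and error history."""
    rng = random.Random(seed)
    n_in = len(samples[0])
    # Step 1: initialization with weights drawn from the range 0 to 0.01.
    w_ih = [[rng.uniform(0.0, 0.01) for _ in range(n_in)]
            for _ in range(n_hidden)]
    w_ho = [rng.uniform(0.0, 0.01) for _ in range(n_hidden)]
    history = []
    for _ in range(max_epochs):
        # Steps 2-4: forward pass over all samples, keeping the worst one.
        worst = None
        for x, d in zip(samples, targets):
            h, o = forward(x, w_ih, w_ho)
            err = (d - o) ** 2
            if worst is None or err > worst[0]:
                worst = (err, x, d, h, o)
        err, x, d, h, o = worst
        history.append(err)
        # Step 5: stop once the maximum squared error is below the criterion.
        if err < tol:
            break
        # Learning signals r(t): error times gradient of the unit output.
        r_out = (d - o) * o * (1.0 - o)
        r_hid = [r_out * w_ho[j] * h[j] * (1.0 - h[j])
                 for j in range(n_hidden)]
        # Step 6: input-to-hidden weights, w(t+1) = w(t) + eta * r(t) * x.
        for j in range(n_hidden):
            for i in range(n_in):
                w_ih[j][i] += eta * r_hid[j] * x[i]
        # Step 7: hidden-to-output weights, same rule with h as the input.
        for j in range(n_hidden):
            w_ho[j] += eta * r_out * h[j]
        # Step 8: the loop returns to step 2.
    return w_ih, w_ho, history
```

Calling `train(samples, targets)` on a list of feature vectors scaled to [0, 1] returns the two weight sets and the per-epoch maximum squared error, which is the quantity plotted in a convergence graph of this kind. With a single output unit, the mean square error of a sample reduces to its squared error, which is why the sketch tracks `(d - o) ** 2`.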
Figure 1.8 Malignant data convergence curve
V. PROPOSED MODIFICATION
The convergence speed of a neural network based system is a crucial parameter for good training. The generalization of the system depends on how well the system is trained, and training usually requires a lot of time for complex data. With training time in mind, an idea is developed to change the order of the weight updates. The standard back-propagation algorithm follows the 8 steps discussed above. If the weight update steps 6 and 7 are exchanged, the updated weight is multiplied by the delta error, and hence a modified error flows from the output layer to the hidden layer. This gives a chance of faster convergence without departing far from the standard procedure. The accuracy of the proposed system quantifies the validity of the new idea. Based on this assumption, the proposed modification is applied and the convergence graphs for the benign and malignant datasets are plotted. The proposed method is tested with the same initialization as the standard method; only steps 6 and 7 differ. Figure 1.4 shows the convergence graph for the benign dataset. The system reaches the stopping criterion after 1417 iterations. Figure 1.5 shows the convergence graph for the malignant dataset, which converges after 434 iterations.
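The effect of exchanging steps 6 and 7 can be illustrated with a single update on one sample. The sketch below is in Python, not the SCILAB code of the paper, and it assumes one particular reading of the proposal: when the hidden-to-output weights are updated first, the hidden layer learning signal is back-propagated through the already updated weights instead of the old ones. The sigmoid activation, the absence of bias terms, and the names `update_step` and `output_first` are assumptions made for illustration only.

```python
import math


def sigmoid(z):
    """Logistic activation (assumed)."""
    return 1.0 / (1.0 + math.exp(-z))


def update_step(x, d, w_ih, w_ho, eta=0.5, output_first=False):
    """One weight update on a single sample; returns new (w_ih, w_ho).

    output_first=False: standard order, step 6 then step 7.
    output_first=True:  exchanged order, step 7 then step 6 (assumed reading).
    """
    # Forward pass: hidden outputs, then the single output unit.
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_ih]
    o = sigmoid(sum(w * hj for w, hj in zip(w_ho, h)))
    # Output learning signal: error times gradient of the output.
    r_out = (d - o) * o * (1.0 - o)
    # Hidden-to-output update is the same in both orders.
    new_w_ho = [w + eta * r_out * hj for w, hj in zip(w_ho, h)]
    # The hidden learning signal flows back through either the old or the
    # freshly updated hidden-to-output weights.
    w_back = new_w_ho if output_first else w_ho
    r_hid = [r_out * wb * hj * (1.0 - hj) for wb, hj in zip(w_back, h)]
    new_w_ih = [[w + eta * rj * xi for w, xi in zip(row, x)]
                for row, rj in zip(w_ih, r_hid)]
    return new_w_ih, new_w_ho
```

Under this reading, both orders produce identical hidden-to-output weights after one step, and only the input-to-hidden update differs. With small positive initial weights and a target above the current output, the exchanged order yields a larger hidden layer step, which is consistent with, but does not by itself prove, the faster convergence reported here.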
Figure 1.7 Benign dataset convergence curve
Figure 1.9 Benign data convergence curve (modified system)
Figure 1.10 Malignant dataset convergence curve (modified system)
VI. RESULTS
Table 1 Standard and modified BPA convergence iterations

Dataset      Standard BPA convergence iteration   Proposed BPA convergence iteration
Benign       1452                                 1417
Malignant    480                                  434

Table 1 shows the number of iterations required for convergence. An improvement of 2.41% for benign data, (1452 - 1417)/1452, and 9.58% for malignant data, (480 - 434)/480, is observed. The trained system is also tested for accuracy using the test dataset. No change in accuracy is found owing to the proposed method.
VII. CONCLUSION
Open source software is beneficial where resources are concerned. SCILAB is evolving swiftly and is good software for researchers, as it avoids the constraint of costly licensed software. A back-propagation based multilayer perceptron is implemented in SCILAB, and the system works well. The proposed order of weight updates provides faster convergence while keeping the same accuracy.
FUTURE SCOPE
The proposed system is tested on a single dataset, with training samples belonging to the same class. Efforts will be made to test the effectiveness of the modified algorithm on more standard datasets and on multiple class datasets.
REFERENCES
[1] J. J. Rubio, “Modified optimal control with a backpropagation network for robotic arms”, IET Control Theory & Applications, vol. 6, pp. 2216–2225, 2012.
[2] Y. M. George, H. H. Zayed, M. I. Roushdy, and B. M. Elbagoury, “Remote Computer-Aided Breast Cancer Detection and Diagnosis System Based on Cytological Images”, IEEE JOURNAL, vol. 8, no. 3, pp. 949–964, 2014.
[3] K. U. Rani, “Analysis of heart diseases dataset using neural network approach”, International Journal of Data Mining & Knowledge Management Process, vol. 1, no. 5, pp. 1–8, 2011.
[4] J. Jiang, J. Zhang, G. Yang, D. Zhang, and L. Zhang, “Application of Back Propagation Neural Network in the Classification of High Resolution Remote Sensing Image”, 18th International Conference on Geoinformatics, pp. 1–6, 2010.
[5] C.-B. Ng, Y.-H. Tay, and B.-M. Goi, “A convolutional neural network for pedestrian gender recognition”, in Advances in Neural Networks - ISNN, Lecture Notes in Computer Science, vol. 7951, Springer International Publishing, pp. 558–564, 2013.
[6] S. A. Radzi and M. Khalil-Hani, “Character recognition of license plate number using convolutional neural network”, Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 7066 LNCS, pp. 45–55, 2011.
[7] P. Buyssens, A. Elmoataz, and L. Olivier, “Multiscale Convolutional Neural Networks for Vision-based Classification of Cells”, in Computer Vision - ACCV 2012, Lecture Notes in Computer Science, vol. 7725, Springer Berlin Heidelberg, pp. 342–352, 2013.
[8] J. Jin, K. Fu, and C. Zhang, “Traffic Sign Recognition With Hinge Loss Trained Convolutional Neural Networks”, IEEE Intelligent Transportation Systems Magazine, vol. 15, no. 5, pp. 1991–2000, 2014.
[9] C. Yan, C. Lang, T. Wang, X. Du, and C. Zhang, “Age Estimation Based on Convolutional Neural Network”, in Advances in Multimedia Information Processing - PCM, Lecture Notes in Computer Science, vol. 8879, Springer International Publishing, pp. 211–220, 2014.
[10] Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, “Gradient-based learning applied to document recognition”, Proceedings of the IEEE, vol. 86, pp. 2278–2323, 1998.
[11] O. L. Mangasarian, R. Setiono, and W. H. Wolberg, “Pattern recognition via linear programming: Theory and application to medical diagnosis”, in Large-scale numerical optimization, T. F. Coleman and Y. Li, editors, SIAM Publications, Philadelphia, pp. 22–30, 1990.
[12] P. M. Leod and B. Verma, “Variable hidden neuron ensemble for mass classification in digital mammograms”, IEEE Computational Intelligence Magazine, vol. 8, no. February, pp. 68–76, 2013.
[13] V. Baskaran, A. Guergachi, R. K. Bali, and R. N. Naguib, “Predicting breast screening attendance using machine learning techniques”, IEEE Trans. Inf. Technol. Biomed, vol. 15, no. 2, pp. 251–259, 2011.
[14] S. Ghosh, S. Mondal, and B. Ghosh, “A comparative study of breast cancer detection based on SVM and MLP BPN classifier”, in Proceedings of IEEE Automation, Control, Energy and Systems (ACES), pp. 1–4, 2014.