
The First Iranian Students Scientific Conference in Malaysia, Universiti Putra Malaysia (UPM)- Kuala Lumpur- Malaysia-2011

Evolutionary Algorithms for Neural Network Learning Enhancement

Zahra Beheshti*, Siti Mariyam Shamsuddin1

* Soft Computing Research Group, Faculty of Computer Science & Information System, Universiti Teknologi Malaysia, Skudai,

81310, Johor, Malaysia ([email protected])

1 Soft Computing Research Group, Faculty of Computer Science & Information System, Universiti Teknologi Malaysia, Skudai,

81310, Johor, Malaysia ([email protected])

Abstract

Artificial Neural Network (ANN) is one of the modern computational methods proposed to solve a wide range of real-world problems. The Back-Propagation (BP) algorithm, a gradient-descent method, is one of the most popular methods for ANN training. However, it has some unavoidable disadvantages, such as slow convergence, a tendency to become trapped in local extrema, and weak global search capability. Many solutions have been proposed to overcome these problems, and among them Evolutionary Algorithms (EAs) have shown good performance. EAs use mechanisms inspired by biological evolution to search for an optimal solution. Genetic Algorithm (GA), Particle Swarm Optimization (PSO) and the Imperialist Competitive Algorithm (ICA) belong to this class of algorithms. In this study, these optimization algorithms are applied to a feed-forward neural network to enhance the learning process in terms of convergence rate and classification accuracy.

1. Introduction

Artificial Neural Network (ANN) is an intelligent system inspired by biological nervous systems; its computation is highly complex, nonlinear and parallel. ANNs are powerful tools in classification (Chambayil et al., 2010), pattern recognition (Tamura et al., 2009), prediction (Patra et al., 2010), optimization (Yadav et al., 2010), stock market forecasting (Vaisla et al., 2010), function approximation (Heinen and Engel, 2010) and noise filtering (Jagannatha Reddy et al., 2010). One of the most popular ANN training techniques is the Back-Propagation (BP) algorithm (Rumelhart and McClelland, 1986).

The BP algorithm is a supervised (associative) training method that learns from target values, i.e., the desired outputs. Unfortunately, like many other algorithms, it suffers from some limitations. Its major shortcomings are temporary and local minima, resulting from the saturation behavior of the activation function, and slow convergence (Zweiri et al., 2003). Hence, many researchers have employed optimization algorithms such as Evolutionary Algorithms (EAs) to train ANNs.
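For reference, the core of a BP step is a gradient-descent weight update with a momentum term. The following is a minimal sketch of that update rule only, using the learning rate and momentum factor listed later in Table 1; the forward and backward passes that produce the gradient are omitted, and the function name is illustrative:

import numpy as np

def bp_update(weights, gradient, velocity, lr=0.5, momentum=0.9):
    # The momentum term reuses a fraction of the previous step to damp
    # oscillation and speed up convergence; lr and momentum follow Table 1.
    velocity = momentum * velocity - lr * gradient
    return weights + velocity, velocity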

EAs are a class of algorithms based on probabilistic adaptation inspired by the principles of natural evolution. They are robust and efficient at exploring the entire solution space of an optimization problem (Yi et al., 2008), and they have been successfully applied to evolve the weights, structure and training parameters of ANNs in recent years (Nadi et al., 2009). Although their search space can be larger than that of gradient-based algorithms, they tend to generate better solutions over successive generations and can quickly locate regions of high-quality solutions when the domain is very large or complex. So far, Differential Evolution (DE), Genetic Algorithm (GA), Simulated Annealing (SA), Particle Swarm Optimization (PSO) and the

Imperialist Competitive Algorithm (ICA) have been applied to ANN training. Junyou (2007) utilized PSO to train a feed-forward ANN for stock price forecasting, and an ANN trained by DE was designed for weather forecasting by Abdul-Kader (2009). Moreover, Ismail-Wdaa and Shamsuddin (2008) used the GA, PSO and DE algorithms to train ANNs for classification problems on the Cancer, Iris and Heart datasets. Chen (2009) proposed an improved GA for training ANN connection weights. Furthermore, GA and PSO have been applied to optimize feed-forward ANN structures and parameters by Palmes et al. (2005) and Yu et al. (2008). However, these approaches still have some unavoidable disadvantages; for instance, GA suffers from premature convergence and unpredictable results. Hence, researchers continue to design new optimization algorithms to overcome these problems. In this paper, BP, GA, PSO and a recently proposed algorithm, ICA, are used to optimize the weights of ANNs on the GLASS and CANCER datasets for classification problems.

3. Methodology

EAs can find appropriate weights (Nadi et al., 2009) that minimize the Mean Squared Error (MSE) of an ANN during training. As shown in the flowchart of Figure 1, the EAs are applied to a fixed network structure to obtain proper ANN weights. First, the population is initialized; each particle of the population has a dimension equal to the number of ANN weights. Then, the ANN is trained with these weights and the training error is calculated. The ANN weights are recalculated according to the update formulas of the EA, and this cycle is repeated until weights that minimize the training error are found. When the training process is finished, these weights are used to calculate the classification error on the training patterns.

Figure 1: The flowchart of EAs for ANN training
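In outline, the procedure of Figure 1 can be expressed as the following loop. This is a minimal sketch under stated assumptions: the EA's update rule (GA, PSO or ICA would each supply their own) is passed in as update_population, together with an mse routine that runs the fixed-structure network forward on the training patterns; both names are illustrative:

import numpy as np

def train_ann_with_ea(mse, update_population, dimension,
                      pop_size=30, max_iter=1000, target_error=1e-3):
    # Step 1: initialize the population; each individual is one complete
    # weight vector for the fixed-structure network (see Equation 1).
    population = np.random.uniform(-1.0, 1.0, (pop_size, dimension))
    best, best_err = None, np.inf
    for _ in range(max_iter):
        # Step 2: evaluate the training MSE of every candidate weight set.
        errors = np.array([mse(w) for w in population])
        i = int(errors.argmin())
        if errors[i] < best_err:
            best, best_err = population[i].copy(), errors[i]
        # Stop when the error (or the iteration budget) is acceptable.
        if best_err <= target_error:
            break
        # Step 3: apply the EA's own update formulas to generate new weights.
        population = update_population(population, errors)
    return best, best_err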


In EAs Neural Networks (EAsNN), the dimension of each particle is the number of weights and biases, which depends on the dataset and the ANN structure (Al-kazemi and Mohan, 2002). Equation 1 shows how the dimension is calculated for each particle; here input, hidden and output denote the number of neurons in the input, hidden and output layers of the ANN, respectively.

Dimension = (input × hidden) + (hidden × output) + hidden_bias + output_bias        (1)
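As a quick check, Equation 1 can be computed directly. A minimal sketch; the hidden-layer size of 5 in the example is only an illustrative assumption, not a setting taken from the paper:

def dimension(n_input, n_hidden, n_output):
    # (input x hidden) weights + (hidden x output) weights,
    # plus one bias per hidden neuron and one per output neuron.
    return n_input * n_hidden + n_hidden * n_output + n_hidden + n_output

# CANCER dataset (9 attributes, 2 classes) with an assumed 5 hidden neurons:
print(dimension(9, 5, 2))  # 9*5 + 5*2 + 5 + 2 = 62 weights and biases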

In this paper, the BP, GA, PSO and ICA algorithms are considered to evaluate the performance of EAsNN. The parameters used for ANN training are shown in Table 1. The GLASS and CANCER datasets were chosen for the classification problems. The GLASS dataset contains glass component analyses for glass pieces belonging to 6 classes, with 214 samples and 10 attributes. The CANCER dataset was collected at University of Wisconsin hospitals in Madison and includes 699 samples with 9 attributes belonging to 2 classes. The samples are divided into training and testing sets: if the number of samples is S, then the number of training samples is 2S/3 and the number of testing samples is S/3.
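The 2S/3 training and S/3 testing split can be reproduced as follows; a minimal sketch, where shuffling before the split is an assumption (the paper does not state how the samples are ordered):

import numpy as np

def split_dataset(samples, seed=0):
    samples = np.array(samples)                   # copy, so the caller's data is untouched
    np.random.default_rng(seed).shuffle(samples)  # assumed: shuffle before splitting
    n_train = 2 * len(samples) // 3               # 2S/3 samples for training
    return samples[:n_train], samples[n_train:]   # remaining S/3 for testing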

Table 1. Selected parameters for ANN training

Algorithm   Parameters
BP          Activation function = tanh, learning rate = 0.5, momentum factor = 0.9
GA          Mutation rate = 0.02, crossover rate = 0.6
PSO         w decreases linearly from 0.7 to 0.4, c1 = c2 = 2
ICA         Zeta = 0.02, revolution rate = 0.5, γ = 0.5, β = 2
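For PSO, the settings in Table 1 correspond to the standard velocity and position update with a linearly decreasing inertia weight. A minimal sketch of one swarm update, assuming this standard formulation is the one used in the paper:

import numpy as np

def pso_step(pos, vel, pbest, gbest, it, max_iter, c1=2.0, c2=2.0):
    # Inertia weight w decreases linearly from 0.7 to 0.4 over the run (Table 1).
    w = 0.7 - (0.7 - 0.4) * it / max_iter
    r1 = np.random.rand(*pos.shape)   # independent random factors per dimension
    r2 = np.random.rand(*pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    return pos + vel, vel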

4. Results

This section presents the results of EAsNN training. Figure 2 illustrates the test error (false classification percentage) and the correct classification percentage on training data for BP, GA, PSO and ICA on the GLASS and CANCER datasets. It can be seen that PSO has the lowest test misclassification rate on both the GLASS and CANCER datasets. Also, the correct classification rates on training data of PSO and ICA, 99.75% and 99.6% respectively on the CANCER dataset, are better than those of the other algorithms. BP has the worst performance among the algorithms in this regard.

Figure 2: Test error (false classification, %) and correct classification on training data (%) for BP, GA, PSO and ICA on the GLASS and CANCER datasets

5. Conclusion

In this paper, EAs were used to train ANNs for classification problems on the GLASS and CANCER datasets, and the training results were compared. The significant finding is that EAs perform better than the BP algorithm on these classification problems.

References

Al-kazemi, B. and Mohan, C.K. (2002). Training Feed-forward Neural Network Using Multi-phase Particle Swarm Optimization, Proceedings of the 9th International Conference on Neural Information Processing. New York.

Chambayil, B., Singla, R. and Jha, R. (2010). EEG Eye Blink Classification Using Neural Network, Proceedings of the World Congress on Engineering 2010 (WCE 2010), Vol. 1, London, U.K.

Ismail-Wdaa and Shamsuddin, S. M. (2008). Differential Evolution for Neural Networks Training Enhancement, M.Sc. Thesis, Universiti Teknologi Malaysia.

Heinen, M. R. and Engel, P. M. (2010). IPNN: An Incremental Probabilistic Neural Network for Function Approximation and Regression Tasks, Artificial Neural Networks – ICANN 2010, Lecture Notes in Computer Science, Vol. 6353/2010, pp. 170-179.

Jagannatha Reddy, M.V., Gupta, S. K. and Kavitha, B. (2010). Noise Load Adaptive Filter Using Neural Network, International Conference on Data Storage and Data Engineering (DSDE), pp. 197 – 200.

Junyou, B. (2007). Stock Price Forecasting Using PSO-trained Neural Networks, Proceedings of the IEEE Congress on Evolutionary Computation (CEC 2007), pp. 2879-2885.

Nadi, A., Tayarani-Bathaie, S. S. and Safabakhsh, R. (2009). Evolution of Neural Network Structure and Weights Using Mutation Based Genetic Algorithm, Proceedings of the 14th International CSI Computer Conference (CSICC'09).

Palmes, P. P., Hayasaka, T. C. and Usui, S. (2005). Mutation-based Genetic Neural Network, IEEE Transactions on Neural Networks, Vol. 16, No. 3, pp. 587-600.

Patra, P.K., Sahu, M., Mohapatra, S. and Samantray, R.K. (2010). File Access Prediction Using Neural Networks, IEEE Transactions on Neural Networks, Vol. 21, No. 6, pp. 869 – 882.

Rumelhart, D.E. and McClelland, J. L. (1986). Parallel Distributed Processing: Explorations in the Microstructure of Cognition, Vol. 1, MIT Press, Cambridge, MA.

Tamura, H., Gotoh, T. and Oku, D. (2009). A Study on the S-EMG Pattern Recognition using Neural Network, International Journal of Innovative Computing, Information and Control, Vol. 5, No. 12(B), pp. 4877–4884.

Vaisla, K., Bhatt, A. K. and Kumar, S. (2010). Stock Market Forecasting using Artificial Neural Network and Statistical Technique: A Comparison Report, International Journal of Computer and Network Security, Vol. 2, No. 8, pp. 50-55.

Yadav, S., Pathak, K. K. and Shrivastava, R. (2010). Shape Optimization of Cantilever Beams Using Neural Network, Applied Mathematical Sciences, Vol. 4, No. 32, pp. 1563-1572.

Zweiri, Y. H., Whidborne, J. F. and Seneviratne, L. D. (2003). A Three-term Back-Propagation Algorithm, Neurocomputing, Vol. 50, pp. 305-318.