www.seipub.org/aee
Advances in Energy Engineering (AEE) Volume 3, 2015 doi: 10.14355/aee.2015.03.002
The Fault Diagnosis of Wind Turbine Gearbox Based on Improved KNN
Long Peng (1), Bin Jiao (2), Hai Liu (3), Ting Zhang (4)
School of Electrical Engineering, Shanghai Dianji University, No. 1350, GanLan Road, LinGang New City, PuDong New District, Shanghai, China
(1) 375696898@qq.com; (2) jiaob@sdju.edu.cn; (3) 642993430@qq.com; (4) 1013753534@qq.com
Abstract
The K-Nearest Neighbor (KNN) algorithm is a fault pattern recognition method commonly used in fault diagnosis. Its main difficulty is choosing the value of K: if K is too small, the classification results are susceptible to noise; if K is too large, the nearest neighbors may include too many points from other classes. In this paper, we put forward an improved method for selecting K. We also designed a gearbox fault diagnosis experiment and used the experimental data to verify the improved KNN algorithm in Matlab. The experimental results indicate that the improved algorithm achieves a high recognition rate.
Key Words
K-Nearest Neighbor Algorithm; Wind Turbine Gearbox; Matlab; Fault Diagnosis
Introduction
Wind power is the fastest growing energy resource in the world and will continue to be developed for a long time. Over the next 20 years, wind power in the United States and Europe is expected to account for 20% of their total energy [1]. The gearbox accounts for about 30% of the cost of the whole wind turbine. It is also a component that fails easily and causes the turbine to shut down: gearbox downtime accounts for more than 60% of total turbine downtime. Much research therefore focuses on fault diagnosis and monitoring of the wind turbine gearbox, with the aim of judging the running state of the gearbox, reducing turbine faults, and lowering the cost of wind power through effective maintenance and repair [2].
Fault diagnosis of a wind turbine gearbox is essentially the recognition of its fault type. The process infers the current running state of the gearbox by extracting and classifying feature information from vibration signals. Widely applied methods of gearbox fault diagnosis currently include the decision tree, K-Nearest Neighbor, support vector machine (SVM), Bayes method, and back propagation neural network [3-9].
The KNN algorithm is relatively simple. It is easy to understand and implement, and it requires neither parameter estimation nor training. KNN is particularly suitable for multi-modal problems, in which an object has multiple class labels. For example, KNN performs better than SVM when classifying the function of a gene according to its characteristics.
K-Nearest Neighbor Algorithm
The Principle of K-Nearest Neighbor Algorithm
KNN is a relatively simple classification algorithm [10]. Its main idea is to calculate the differences between the sample to be classified and the training samples and to sort these differences in ascending order.
Then the KNN algorithm takes the categories of the K training samples with the smallest differences, identifies the category that occurs most frequently among them, and assigns the sample to be classified to that category. The process can be divided into five steps. First, calculate the distance between the sample to be classified and each training sample. Second, sort these distances in ascending order. Third, take the categories corresponding to the first K distances. Fourth, count how often each category occurs among these K points. Fifth, take the most frequent category as the predicted classification result. To measure the differences, KNN commonly uses the Euclidean distance [11]:
$$D = \sqrt{\sum_{i=1}^{n} (x_i - y_i)^2} \qquad (1)$$
In this formula, x_i and y_i are the i-th components of the training sample and the test sample, respectively, and n is the dimension of both samples.
The Improved Method of K-Nearest Neighbor Algorithm
The choice of K has a major influence on the KNN algorithm. If K is too small, the results are very sensitive to noise; if K is too large, the neighborhood includes too many points from other categories. Choosing K accurately and efficiently is therefore the key to building a successful KNN model. Existing ways to determine K include the empirical approach and the exhaustive method. The empirical approach relies entirely on experience, so the resulting K is largely arbitrary. The exhaustive method evaluates the model for each candidate K in turn, which is both tedious and time-consuming. We therefore put forward an improved method, described as follows.
First, determine a preliminary range K1≤K≤K2, using the empirical rule that K is generally no larger than the number of training samples.
Second, run KNN with K=K2 and obtain the recognition rate Accuracy2 on the test samples; the recognition rate on the test samples serves as the performance measure for selecting K. It is computed as Accuracy=sum(y==t)/length(y)*100, where y is the expected output of the test samples, t is the actual output of the test samples, sum(y==t) counts the test samples whose actual output matches the expected output, and length(y) is the number of test samples.
Third, set K=(K1+K2)/2, rounded to an integer, and compute the recognition rate Accuracy1 on the test samples for this K.
Fourth, if Accuracy1>Accuracy2, set K2=K and Accuracy2=Accuracy1; otherwise, set K1=K.
Fifth, if K1<K2, return to step 2; otherwise, terminate the algorithm.
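The five classification steps and the K-selection procedure above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' Matlab implementation: the function names and the toy data are hypothetical, the midpoint is rounded up (the text only says K is an integer) so that the loop always terminates, and ties in the vote go to the label that appears first among the nearest neighbors.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors, as in Equation (1)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_predict(train_x, train_y, sample, k):
    """Classify one sample by majority vote among its k nearest training samples."""
    # Steps 1-2: rank the training samples by distance to the sample, ascending.
    ranked = sorted(zip(train_x, train_y), key=lambda pair: euclidean(pair[0], sample))
    # Step 3: categories of the k nearest training samples.
    nearest = [label for _, label in ranked[:k]]
    # Steps 4-5: the most frequent category wins. On a tie, the label that
    # appears first among the nearest neighbors is returned (an assumption).
    return Counter(nearest).most_common(1)[0][0]


def recognition_rate(train_x, train_y, test_x, test_y, k):
    """Percentage of test samples whose predicted label equals the expected label."""
    hits = sum(knn_predict(train_x, train_y, s, k) == t for s, t in zip(test_x, test_y))
    return 100.0 * hits / len(test_y)


def select_k(train_x, train_y, test_x, test_y, k1, k2):
    """Interval-halving search for K following the five steps in the text."""
    acc2 = recognition_rate(train_x, train_y, test_x, test_y, k2)   # step 2
    while k1 < k2:                                                  # step 5
        k = (k1 + k2 + 1) // 2      # step 3, rounded up (assumption)
        acc1 = recognition_rate(train_x, train_y, test_x, test_y, k)
        if acc1 > acc2:                                             # step 4
            k2, acc2 = k, acc1
        else:
            k1 = k
    return k2, acc2


# Hypothetical one-dimensional toy data, not taken from the paper.
train_x = [[0.0], [0.1], [0.2], [1.0], [1.1], [1.2], [1.3], [1.4]]
train_y = ["a", "a", "a", "b", "b", "b", "b", "b"]
test_x = [[0.05], [1.15]]
test_y = ["a", "b"]

best_k, best_rate = select_k(train_x, train_y, test_x, test_y, k1=1, k2=len(train_x))
print(best_k, best_rate)
```

Because Accuracy2 is updated whenever K2 changes, it always equals the recognition rate at the current K2, so the sketch does not recompute it on each pass. Note that the search only evaluates a handful of K values, so it can miss a better K that lies outside the intervals it visits.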
Finally, the K value obtained from this procedure is taken as the improved value of K.
The Experimental Design and Modeling Simulation Analysis of Wind Turbine Gearbox
The Integral Design of the Experimental System
The hardware of the system consists of a three-phase asynchronous motor, a variable-speed gearbox, a variable-frequency driver, a set of sensors from Commtest (New Zealand), a vbOnline data acquisition card, and a terminal server. The software is Ascent, a full-featured data acquisition and analysis package that accompanies the Commtest data acquisition card. The structure is shown in Fig. 1.
[Figure 1: block diagram with the components Variable-frequency driver, Motor, Gearbox, Sensor, Data Acquisition Card, and Terminal Server]
FIG. 1 EXPERIMENTAL SYSTEM STRUCTURE
Fault Sample Data Acquisition
To verify the effectiveness of this method for fault diagnosis of the wind turbine gearbox, we analyze gearbox fault data collected with the system described above. Because the gearbox has a complex transmission system and its fault modes have a complicated nonlinear relationship with the characteristic quantities, it is necessary to analyze multiple sets of time-domain and frequency-domain fault characteristic parameters together. Table 1 lists 15 sets of data covering three typical sample types of the gearbox; the data have been denoised and normalized [14-15]. Sets 1-9 are the training samples and sets 10-15 are the test samples.
The Optimization of KNN Gearbox Fault Identification
We apply the data in Table 1 and the improved KNN algorithm to gearbox fault diagnosis, selecting K with the improved approach described above. The lower bound is K1=1. Following the rule that K should be no larger than the number of training samples, the upper bound is K2=9. The procedure then yields the optimal value K=5.
TABLE 1 VIBRATION SIGNAL CHARACTERISTIC PARAMETERS OF SAMPLES FOR DIAGNOSIS OF GEARBOX
Serial  Peak   Kurtosis  Margin  Skewness  Spectrum  Frequency Domain  Harmonic  Impulse
Number  Index  Index     Index   Index     Center    Variance          Factor    Factor
1       0.796  0.196     0.002   0.005     0         0                 0.001     0
2       0.772  0.22      0.002   0.005     0         0                 0         0
3       0.782  0.209     0.002   0.005     0         0                 0.001     0
4       0.369  0.117     0.128   0.168     0.003     0.005             0.127     0.083
5       0.36   0.105     0.143   0.165     0.004     0.004             0.147     0.072
6       0.377  0.106     0.133   0.161     0.003     0.005             0.147     0.068
7       0.675  0.167     0.04    0.049     0.001     0.001             0.043     0.024
8       0.667  0.175     0.038   0.051     0         0.001             0.044     0.023
9       0.649  0.163     0.046   0.06      0         0.002             0.049     0.031
10      0.348  0.121     0.136   0.166     0.003     0.004             0.137     0.084
11      0.768  0.224     0.002   0.005     0         0                 0         0
12      0.654  0.177     0.042   0.056     0.001     0.001             0.045     0.025
13      0.375  0.114     0.13    0.164     0.003     0.004             0.137     0.073
14      0.639  0.162     0.046   0.067     0.001     0.001             0.056     0.028
15      0.776  0.213     0.003   0.006     0         0                 0.002     0
Actual Fault Type: Normal, Gear Face Wear, Tooth Breakage (training samples 1-9); Normal, Gear Face Wear, Tooth Breakage (test samples 10-15)
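As a quick check of the distance and sorting steps on the data themselves, the following Python sketch ranks the training samples (1-9) by Euclidean distance to each test sample (10-15). The feature values are transcribed from Table 1; the names SAMPLES and nearest_training_samples are illustrative and do not come from the paper, and the sketch uses no fault-type labels.

```python
import math

# Feature vectors transcribed from Table 1, one row per sample (serial numbers 1-15).
SAMPLES = [
    [0.796, 0.196, 0.002, 0.005, 0.000, 0.000, 0.001, 0.000],  # 1
    [0.772, 0.220, 0.002, 0.005, 0.000, 0.000, 0.000, 0.000],  # 2
    [0.782, 0.209, 0.002, 0.005, 0.000, 0.000, 0.001, 0.000],  # 3
    [0.369, 0.117, 0.128, 0.168, 0.003, 0.005, 0.127, 0.083],  # 4
    [0.360, 0.105, 0.143, 0.165, 0.004, 0.004, 0.147, 0.072],  # 5
    [0.377, 0.106, 0.133, 0.161, 0.003, 0.005, 0.147, 0.068],  # 6
    [0.675, 0.167, 0.040, 0.049, 0.001, 0.001, 0.043, 0.024],  # 7
    [0.667, 0.175, 0.038, 0.051, 0.000, 0.001, 0.044, 0.023],  # 8
    [0.649, 0.163, 0.046, 0.060, 0.000, 0.002, 0.049, 0.031],  # 9
    [0.348, 0.121, 0.136, 0.166, 0.003, 0.004, 0.137, 0.084],  # 10
    [0.768, 0.224, 0.002, 0.005, 0.000, 0.000, 0.000, 0.000],  # 11
    [0.654, 0.177, 0.042, 0.056, 0.001, 0.001, 0.045, 0.025],  # 12
    [0.375, 0.114, 0.130, 0.164, 0.003, 0.004, 0.137, 0.073],  # 13
    [0.639, 0.162, 0.046, 0.067, 0.001, 0.001, 0.056, 0.028],  # 14
    [0.776, 0.213, 0.003, 0.006, 0.000, 0.000, 0.002, 0.000],  # 15
]
TRAIN_IDS = range(1, 10)    # training samples 1-9
TEST_IDS = range(10, 16)    # test samples 10-15


def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def nearest_training_samples(test_id, k):
    """Serial numbers of the k training samples closest to the given test sample."""
    target = SAMPLES[test_id - 1]
    ranked = sorted(TRAIN_IDS, key=lambda i: euclidean(SAMPLES[i - 1], target))
    return ranked[:k]


for test_id in TEST_IDS:
    print(test_id, nearest_training_samples(test_id, 3))
```

The fault types of the returned neighbors are what the KNN vote is taken over; as K grows beyond the size of a sample group, neighbors from other groups necessarily enter the vote.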
To illustrate the optimization of the K value obtained in this paper, we list the recognition rate under all K values. As shown in Fig. 2, when K is 5, the recognition rate on the test samples reaches 100%. As K increases from 5 to 9, the recognition rate gradually decreases. It is easy to see that a larger K is not necessarily better.
FIG. 2 THE TOTAL RECOGNITION RATE OF TEST SAMPLE UNDER DIFFERENT K VALUES
To show that the improved KNN algorithm in this paper has advantages over the traditional KNN algorithm, Fig. 3 compares the actual results with the forecast results that the improved KNN algorithm produces on the test set. As shown in Fig. 3, the fault categories forecast by the improved KNN algorithm for the six test samples, which cover three fault categories, are exactly the same as the expected output, so the recognition rate reaches 100%. Fig. 4 gives the same comparison for the traditional KNN algorithm. As shown in Fig. 4, the traditional KNN algorithm misclassifies the third and the fifth of the six test samples, so its recognition rate reaches only 66.7%.
FIG. 3 THE COMPARISON OF FORECAST RESULTS IN TEST SET OBTAINED BY THE IMPROVED KNN ALGORITHM
FIG. 4 THE COMPARISON OF FORECAST RESULTS IN TEST SET OBTAINED BY THE TRADITIONAL KNN ALGORITHM
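The two recognition rates quoted for Fig. 3 and Fig. 4 follow from dividing the number of correctly classified test samples by the total number of test samples. With six test samples, zero errors for the improved algorithm and two errors for the traditional algorithm give:

$$\text{Accuracy}_{\text{improved}} = \frac{6 - 0}{6} \times 100\% = 100\%, \qquad \text{Accuracy}_{\text{traditional}} = \frac{6 - 2}{6} \times 100\% \approx 66.7\%$$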
From the figures and analysis above, it is clear that the improved KNN algorithm reduces the amount of training and increases both the efficiency and the recognition rate in comparison with the traditional KNN algorithm.
Conclusions
In this paper, we briefly introduced the basic principle of KNN and put forward an improved method for identifying the K value quickly and effectively. We then designed a fault diagnosis experiment for a wind turbine gearbox and used it to verify fault diagnosis based on the improved KNN. The results showed that the improved KNN algorithm is more effective.
ACKNOWLEDGEMENT
This work is supported by the project of Shanghai Science and Technology Commission (Grant No. 13DZ0511300).
REFERENCES
[1] Zhu Xiao. Present Situation and Development of Wind Power Generation [J]. Guide of Sci-tech Magazine, 2014 (23): 153-154.
[2] Li Junyuan. Study of Gearbox Fault Diagnosis System of Wind Turbine [J]. Wind Energy Industry, 2014 (3): 32-36.
[3] Feng Yongxin, Yang Wenguang, Jiang Dongxiang. Fault Recognition Method for Gear Case of Wind Power Generator Based on Decision Tree Classification Algorithm and Expert System [J]. Guangdong Electric Power, 2013, 26(4): 17-21.
[4] Cai He, Zhang Rui. Analysis and Research on K-Nearest Neighbor Algorithm [J]. Gansu Science and Technology, 2012, 28(18): 15-16.
[5] Guo Hui, Liu Heping, Wang Ling. Method for Selecting Parameters of Least Squares Support Vector Machines and Application [J]. Journal of System Simulation, 2006, 18(7): 2033-2037.
[6] Vapnik V, Golowich S, Smola A. Support vector method for function approximation, regression estimation and signal processing [J]. Advances in Neural Information Processing Systems, 1996, 9(2): 281-287.
[7] Zhang Ru, Zhou Zhen. Fault Analysis and Diagnosis of the Wind Turbine Gearbox Based on Bayesian Network Model [D]. Heilongjiang: Harbin University of Science and Technology, 2014.
[8] Liu Tianshu, Wang Fulin. The Research and Application on BP Neural Network Improvement [D]. Heilongjiang: Northeast Agricultural University, 2011.
[9] Meik Schlechtingen, Ilmar Ferreira Santos. Comparative analysis of neural network and regression based condition monitoring approaches for wind turbine fault detection [J]. Mechanical Systems and Signal Processing, 2011, 25: 1849-1875.
[10] Feng Guohe, Wu Jingxue. The Improved Progress of KNN Classification Algorithm [J]. Library and Information Service, 2012, 56(21): 97-100.
[11] Bu Fanjun, Qian Xuezhong. The Improvement of KNN Algorithm and the Application in Text Classification [D]. Jiangsu: Jiangnan University, 2009.
[12] Huang Juanjuan. Research and Improvement Based on the Choice of Text Classification Feature and Classification Algorithm of KNN [D]. Fujian: Xiamen University, 2014.
[13] Hu Zhi, Duan Lixiang, Zhang Laibin. Application of Improved KNNC Method in Fault Pattern Recognition of Rolling Bearings [J]. Journal of Vibration and Shock, 2013, 32(22): 84-87.
[14] E. Al-Ahmar, M.E.H. Benbouzid, Y. Amirat and S.E. Ben Elghal. DFIG-Based Wind Turbine Fault Diagnosis Using a Specific Discrete Wavelet Transform [C]. IEEE International Conference on Electrical Machines, 2008: 1-6.
[15] Liu Xiaotong. Study on Data Normalization in BP Neural Network Input Layer [J]. Mechanical Engineering & Automation, 2010, 3: 122-123.
Long Peng was born in Xinyang, Henan Province, on October 10, 1988. He is a postgraduate student in electrical engineering at Shanghai Dianji University, Shanghai, China, and is expected to receive his master's degree in April 2016. He previously worked as an electrical engineer at Wiscom in Nanjing, Jiangsu Province.
Jiao Bin was born in Jiangsu Province in June 1958. He majored in electrical engineering and received his doctor's degree from East China University of Science and Technology, Shanghai, China. He served as Vice President of Shanghai College of Electrical Machinery Technology in 2003 and has served as Vice President of Shanghai Dianji University, Shanghai, China, since 2004. His publications include Genetic Algorithm for Improving the Signal Neuron PID Controller and Its Application (Shanghai, China: Jiao Bin, Lin Jiajun, 2005).
His main research directions include intelligent optimization, intelligent control, power electronics and its applications, and automatic control. Prof. Jiao is a member of the Shanghai Automation Association and of the Shanghai Chemical Engineering Computer Application Professional Committee. Over the years, he has undertaken a National Natural Science Foundation project as the main member of his team and has won a provincial and ministerial Science and Technology Award.
Hai Liu was born in Yancheng, Jiangsu Province, on January 28, 1990. He is a postgraduate student in electrical engineering at Shanghai Dianji University, Shanghai, China, and is expected to receive his master's degree in April 2016.
Ting Zhang was born in Yancheng, Jiangsu Province, on November 20, 1990. He is a postgraduate student in electrical engineering at Shanghai Dianji University, Shanghai, China, and is expected to receive his master's degree in April 2016.