Prediction of Methane Inclusion Types Using Support Vector Machine


Scientific Journal of Earth Science, June 2015, Volume 5, Issue 2, pp. 18-27

Guangren Shi

Department of Experts, Research Institute of Petroleum Exploration and Development, PetroChina, P. O. Box 910, Beijing 100083, China. Email: grs@petrochina.com.cn

Abstract: Three classification algorithms and one regression algorithm have been applied to forecast methane inclusion types. The three classification algorithms are the classification of support vector machine (C-SVM), the naïve Bayesian (NBAY), and the Bayesian successive discrimination (BAYSD), while the regression algorithm is the multiple regression analysis (MRA). Of the four algorithms, only MRA is a linear algorithm; the other three are nonlinear algorithms. In general, when all four algorithms are used to solve a real-world problem, they often produce different solution accuracies. Toward this issue, the solution accuracy is expressed by the total mean absolute relative residual for all samples, R(%), and two criteria are proposed: a) an algorithm is applicable if its R(%) < 10, otherwise it is inapplicable; and b) the R(%) of MRA is employed to measure the nonlinearity degree of a studied problem: weak if R(%) < 10, moderate if 10 ≤ R(%) ≤ 30, and strong if R(%) > 30. A case study at the Puguang Gasfield, a classification problem, has been used to validate the proposed approach. The calculation results indicate that a) this case study is a strongly nonlinear problem, because the R(%) value of MRA is 32; and b) C-SVM is applicable since its R(%) value is 0, whereas NBAY and BAYSD are both inapplicable because their R(%) values are 18 and 38, respectively. For this case study, it is concluded that the preferable algorithm is C-SVM, while BAYSD can serve as a promising dimension-reduction tool.

Keywords: Data Mining; Naïve Bayesian; Bayesian Successive Discrimination; Multiple Regression Analysis; Nonlinearity; Solution Accuracy; Dimensionality Reduction; Puguang Gasfield

1 INTRODUCTION

In recent years, regression and classification algorithms have seen preliminary success in petroleum geology [1, 2], but the application of these algorithms to methane inclusion type prediction has not started yet. Liu et al. (2013) presented the occurrence and genesis of multiple types of high density methane inclusions in the Puguang Gasfield [3]. Using all the samples with complete parameters given by [3], this paper presents a prediction of methane inclusion types using the four algorithms below. The benefit of the proposed approach is to reduce the experimentation needed to determine methane inclusion types.

Three classification algorithms and one regression algorithm have been applied to forecast methane inclusion types. The three classification algorithms are the classification of support vector machine (C-SVM), the naïve Bayesian (NBAY), and the Bayesian successive discrimination (BAYSD), while the regression algorithm is the multiple regression analysis (MRA). Of the four algorithms, only MRA is a linear algorithm; the other three are nonlinear algorithms. In general, when all four algorithms are used to solve a real-world problem, they often produce different solution accuracies. Toward this issue, the solution accuracy is expressed by the total mean absolute relative residual for all samples, R(%). It is proposed that a) the R(%) of MRA is employed to measure the nonlinearity degree of a studied problem: weak if R(%) < 10, moderate if 10 ≤ R(%) ≤ 30, and strong if R(%) > 30; and b) an algorithm is applicable if its R(%) < 10, otherwise it is inapplicable. A case study at the Puguang Gasfield, a classification problem, has been used to validate the proposed approach.


2 METHODOLOGY

The methodology consists of three major parts: definitions commonly used by regression and classification algorithms; the methods of the four algorithms; and dimensionality reduction.

2.1 Definitions Commonly Used by Regression and Classification Algorithms

The aforementioned classification and regression algorithms share the data of samples. The essential difference between the two types of algorithms is that the output of a regression algorithm is a real-type value that in general differs from the real number given in the corresponding learning sample, whereas the output of a classification algorithm is an integer-type value that must be one of the integers defined in the learning samples. In the view of dataology, the integer-type value is called a discrete attribute, while the real-type value is called a continuous attribute.

The four algorithms (C-SVM, NBAY, BAYSD, MRA) use the same known parameters, and also share the same unknown to be predicted. The only difference between them lies in the approach and the calculation results. Assume that there are n learning samples, each associated with m+1 numbers (x1, x2, …, xm, y*) and a set of observed values (xi1, xi2, …, xim, yi*), with i = 1, 2, …, n. In principle n > m, but in actual practice n >> m. The n samples associated with m+1 numbers are defined as n vectors:

    x_i = (x_{i1}, x_{i2}, \ldots, x_{im}, y_i^{*}) \quad (i = 1, 2, \ldots, n)        (1)

where n is the number of learning samples; m is the number of independent variables in the samples; xi is the ith learning sample vector; xij is the value of the jth independent variable in the ith learning sample, j = 1, 2, …, m; and yi* is the observed value of the ith learning sample. Equation 1 is the expression of learning samples.

Let x0 be the general form of a vector (xi1, xi2, …, xim). The principles of NBAY, BAYSD and MRA are the same, i.e., to construct an expression y = y(x0) such that Eq. 2 is minimized. Certainly, these three algorithms use different approaches and obtain calculation results of differing accuracies.

    \sum_{i=1}^{n} \left[ y(x_{0i}) - y_i^{*} \right]^{2}        (2)

where y(x0i) is the calculation result of the dependent variable in the ith learning sample; the other symbols have been defined in Eq. 1.

However, the principle of the C-SVM algorithm is to construct an expression y = y(x0) that maximizes the margin based on the support vector points, so as to obtain the optimal separating line. This y = y(x0) is called the fitting formula obtained in the learning process. The fitting formulas of different algorithms are different. In this paper, y is defined as a single variable.

The workflow is as follows: the 1st step is the learning process, using the n learning samples to obtain a fitting formula; the 2nd step is the learning validation, substituting the n learning samples (xi1, xi2, …, xim) into the fitting formula to get the prediction values (y1, y2, …, yn), so as to verify the fitness of an algorithm; and the 3rd step is the prediction process, substituting the k prediction samples expressed by Eq. 3 into the fitting formula to get the prediction values (yn+1, yn+2, …, yn+k).

    x_i = (x_{i1}, x_{i2}, \ldots, x_{im}) \quad (i = n+1, n+2, \ldots, n+k)        (3)

where k is the number of prediction samples; xi is the ith prediction sample vector; and the other symbols have been defined in Eq. 1. Equation 3 is the expression of prediction samples.

In the four algorithms, only MRA is a linear algorithm whereas the other three are nonlinear; this is because MRA constructs a linear function whereas the other three construct nonlinear functions.
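To make the three-step workflow concrete, the following is a minimal, hypothetical sketch in Python using scikit-learn; the tiny arrays and the choice of C-SVM as the stand-in algorithm are illustrative assumptions, not the paper's data.

```python
# A minimal sketch of the three-step flow described above, assuming the
# samples are held in NumPy arrays; all values here are placeholders.
import numpy as np
from sklearn.svm import SVC  # any of the four algorithms could stand in here

# Step 1 -- learning process: n learning samples (Eq. 1) yield a fitting formula.
X_learn = np.array([[1.0, 2910.79], [2.0, 2911.00]])  # (x_i1, ..., x_im), here m = 2
y_learn = np.array([1, 2])                            # observed values y_i*

model = SVC(kernel="rbf")          # the "fitting formula" y = y(x0)
model.fit(X_learn, y_learn)

# Step 2 -- learning validation: substitute the learning samples back in.
y_fit = model.predict(X_learn)     # (y_1, ..., y_n), used to verify fitness

# Step 3 -- prediction process: substitute k prediction samples (Eq. 3).
X_pred = np.array([[1.0, 2910.40]])
y_pred = model.predict(X_pred)     # (y_{n+1}, ..., y_{n+k})
```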


To express the calculation accuracies of the prediction variable y for the learning and prediction samples when the four algorithms are used, the following four types of residuals are defined.

The absolute relative residual for each sample, R(%)i (i = 1, 2, …, n, n+1, n+2, …, n+k), is defined as

    R(\%)_i = \left| (y_i - y_i^{*}) / y_i^{*} \right| \times 100        (4)

where yi is the calculation result of the dependent variable in the ith sample; the other symbols have been defined in Eqs. 1 and 3. R(%)i is the fitting residual expressing the fitness of a sample in the learning or prediction process. Note that zero must not be taken as a value of yi*, to avoid floating-point overflow. Therefore, for a regression algorithm, a sample is deleted if its yi* = 0; and for a classification algorithm, positive integers are taken as the values of yi*.

The mean absolute relative residual for all learning samples, R1(%), is defined as

    R_1(\%) = \sum_{i=1}^{n} R(\%)_i / n        (5)

where all symbols have been defined in Eqs. 1 and 4. R1(%) is the fitting residual expressing the fitness of the learning process.

The mean absolute relative residual for all prediction samples, R2(%), is defined as

    R_2(\%) = \sum_{i=n+1}^{n+k} R(\%)_i / k        (6)

where all symbols have been defined in Eqs. 3 and 4. R2(%) is the fitting residual expressing the fitness of the prediction process.

The total mean absolute relative residual for all samples, R(%), is defined as

    R(\%) = \sum_{i=1}^{n+k} R(\%)_i / (n+k)        (7)

where all symbols have been defined in Eqs. 1, 3 and 4. If there are no prediction samples (k = 0), then R(%) = R1(%). R(%) is the fitting residual expressing the fitness of the learning and prediction processes together.

When the four algorithms (C-SVM, NBAY, BAYSD, MRA) are used to solve a real-world problem, they often produce different solution accuracies. Toward this issue, the following two criteria are proposed.

1) Criterion 1: Solution Accuracy of a Given Algorithm Application. Whether for the linear algorithm (MRA) or the nonlinear algorithms (C-SVM, NBAY, BAYSD), the R(%) of a studied problem expresses the accuracy of the y = y(x) obtained by each algorithm, i.e. the solution accuracy of each algorithm for solving the studied problem. This solution accuracy is divided into three classes: high, moderate, and low (Table 1).

2) Criterion 2: Nonlinearity Degree of a Studied Problem. Since MRA is a linear algorithm, its R(%) for a studied problem expresses the nonlinearity degree of the y = y(x) to be solved, i.e. the nonlinearity degree of the studied problem. This nonlinearity degree is divided into three classes: weak, moderate, and strong (Table 1).

TABLE 1 CRITERION 1 (SOLUTION ACCURACY OF A GIVEN ALGORITHM APPLICATION) AND CRITERION 2 (NONLINEARITY DEGREE OF A STUDIED PROBLEM)

Range of R(%)      Criterion 1 (solution accuracy of a given algorithm      Criterion 2 (nonlinearity degree of a studied
                   application), based on R(%) of C-SVM, NBAY, BAYSD, MRA   problem), based on R(%) of MRA
R(%) < 10          High                                                     Weak
10 ≤ R(%) ≤ 30     Moderate                                                 Moderate
R(%) > 30          Low                                                      Strong
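The residual definitions of Eqs. 4-7 and the two criteria of Table 1 are straightforward to compute; the following is a minimal Python sketch, with function and variable names chosen here for illustration.

```python
import numpy as np

def residuals(y_calc, y_obs, n):
    """R(%)_i (Eq. 4), R1(%) (Eq. 5), R2(%) (Eq. 6) and R(%) (Eq. 7).
    y_calc/y_obs hold the n learning samples first, then the k prediction
    samples; y_obs must contain no zeros (see the note under Eq. 4)."""
    y_calc, y_obs = np.asarray(y_calc, float), np.asarray(y_obs, float)
    r_i = np.abs((y_calc - y_obs) / y_obs) * 100.0   # R(%)_i for every sample
    r1 = r_i[:n].mean()                              # learning residual R1(%)
    r2 = r_i[n:].mean() if len(r_i) > n else None    # prediction residual, if k > 0
    r = r_i.mean()                                   # total residual R(%)
    return r_i, r1, r2, r

def criterion(r):
    """The three classes of Table 1 for a given R(%): Criterion 1 / Criterion 2."""
    return "High/Weak" if r < 10 else ("Moderate/Moderate" if r <= 30 else "Low/Strong")
```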

Now a machine-learning scheme can be presented, based on the two proposed criteria. The two criteria proved useful in the results of the case study below. Based on the analyses of this case study, the presented scheme can therefore be constructed from two major rules: a) the R(%) of MRA is used to measure the nonlinearity degree of a given problem, and thus MRA should be run as a first choice; and b) for a classification problem, if its nonlinearity degree is strong, C-SVM can be used, but NBAY and BAYSD cannot. Indeed, if the nonlinearity degree is weak or moderate, BAYSD can be used, while NBAY can sometimes be used [1, 2].

2.2 Methods of the Four Algorithms

Through the learning process, each algorithm constructs its own function y = y(x). It is noted that the y = y(x) created by the four algorithms (C-SVM, NBAY, BAYSD, MRA) are explicit expressions, i.e. they are expressed as usual mathematical formulas. The following describes the methods of the four algorithms.

1) C-SVM. The C-SVM procedure, a binary classifier [1, 2, 4-8], has been gradually applied since the 1990s and widely applied in this century. C-SVM is a machine-learning approach based on statistical learning theory. It is essentially performed by converting a real-world problem (the original space) into a new, higher-dimensional feature space using a kernel function, and then constructing a linear discriminant function in the new space to replace the nonlinear discriminant function of the original space. Theoretically, C-SVM can obtain the global optimal solution and avoid converging to a local optimum. In the case study below, the binary classifier has been employed. Moreover, under strongly nonlinear conditions it is better to take the RBF (radial basis function) as the kernel function than the linear, polynomial or sigmoid functions [1, 2, 6]; thus the kernel function used here is the RBF, and the termination calculation accuracy TCA is fixed to 0.001.

The formula created by this technique is an expression with respect to a vector x, the so-called nonlinear function y = C-SVM(x1, x2, …, xm) [1, 2, 6]:

    y = \sum_{i=1}^{n} y_i \alpha_i \exp(-\gamma \| x - x_i \|^{2}) + b        (8)

where α is the vector of Lagrange multipliers, α = (α1, α2, …, αn), with 0 ≤ αi ≤ C, where C is the penalty factor, subject to the constraint \sum_{i=1}^{n} y_i \alpha_i = 0; \exp(-\gamma \| x - x_i \|^{2}) is the RBF kernel function; γ is the regularization parameter, γ > 0; and b is the offset of the separating hyperplane, which can be calculated using the free vectors xi. These free xi are the vectors corresponding to αi > 0, on which the final C-SVM model depends. The αi, C, and γ can be solved using the dual quadratic optimization:

    \max_{\alpha} \left\{ \sum_{i=1}^{n} \alpha_i - \frac{1}{2} \sum_{i,j=1}^{n} \alpha_i \alpha_j y_i y_j \exp(-\gamma \| x_i - x_j \|^{2}) \right\}        (9)
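As a practical note, the dual problem of Eq. 9 is solved internally by LIBSVM [6], which scikit-learn's SVC wraps. The following hedged sketch shows how the quantities of Eq. 8 surface in that interface; the data values are placeholders, not the paper's samples.

```python
# A sketch of C-SVM with the RBF kernel via scikit-learn (a LIBSVM wrapper);
# parameter names follow Eqs. 8-9, and the arrays below are placeholders.
import numpy as np
from sklearn.svm import SVC

X = np.array([[1, 2911.4], [1, 2911.5], [2, 2910.4], [2, 2910.0]])
y = np.array([1, 1, 2, 2])

clf = SVC(kernel="rbf",   # K(x, x_i) = exp(-gamma * ||x - x_i||^2)
          C=1.0,          # penalty factor C bounding the multipliers alpha_i
          gamma=1.0,      # regularization parameter gamma > 0
          tol=0.001)      # termination accuracy, cf. the paper's TCA = 0.001
clf.fit(X, y)             # solves the dual quadratic optimization of Eq. 9

print(clf.support_vectors_)        # the "free vectors" x_i with alpha_i > 0
print(clf.dual_coef_)              # the products y_i * alpha_i of Eq. 8
print(clf.intercept_)              # the offset b of Eq. 8
print(clf.predict([[1, 2911.0]]))  # evaluate the fitted y = C-SVM(x)
```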

It is noted that in the case study below the formulas corresponding to Eq. 8 are not written out concretely, owing to their large size.

2) NBAY. The NBAY procedure has been applied since the 1990s and widely applied in this century [1, 2, 9]. The following introduces the NBAY technique, i.e. the naïve Bayesian. The formula created by this technique is a set of nonlinear products with respect to the m parameters (x1, x2, …, xm) [1, 2, 10, 11]:



    N_l(x) = \prod_{j=1}^{m} \frac{1}{\sigma_{jl} \sqrt{2\pi}} \exp\!\left( \frac{-(x_j - \mu_{jl})^{2}}{2\sigma_{jl}^{2}} \right) \quad (l = 1, 2, \ldots, L)        (10)

where l is the class number, L is the number of classes, Nl(x) is the discrimination function of the lth class with respect to x, σjl is the mean square error of xj in Class l, and μjl is the mean of xj in Class l. Eq. 10 is the so-called naïve Bayesian discrimination function.


Once Eq. 10 is created, any sample shown by Eq. 1 or Eq. 3 can be substituted into Eq. 10 to obtain L values: N1, N2, …, NL. If

    N_{l_b} = \max_{1 \le l \le L} \{ N_l \}        (11)

then y = lb for this sample. Eq. 11 defines the so-called nonlinear function y = NBAY(x1, x2, …, xm).

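Eqs. 10-11 translate directly into code. The sketch below assumes the per-class means μjl and standard deviations σjl have already been estimated from the learning samples; all numeric values in the example call are made up for illustration.

```python
# A direct sketch of evaluating Eqs. 10-11 for one sample.
import numpy as np

def nbay_predict(x, mu, sigma):
    """x: (m,) sample; mu, sigma: (L, m) per-class parameters.
    Returns the class l_b maximizing the discrimination function N_l(x)."""
    x, mu, sigma = map(np.asarray, (x, mu, sigma))
    # N_l(x) = prod_j 1/(sigma_jl*sqrt(2*pi)) * exp(-(x_j-mu_jl)^2/(2*sigma_jl^2))
    dens = np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))
    N = dens.prod(axis=1)          # one value N_l per class (Eq. 10)
    return int(np.argmax(N)) + 1   # Eq. 11: y = l_b, with classes numbered from 1

# Illustrative two-class, two-variable call with hypothetical parameters:
print(nbay_predict([2911.0, 1281.3],
                   mu=[[2911.2, 1281.8], [2910.4, 1280.7]],
                   sigma=[[0.5, 0.3], [0.4, 0.4]]))
```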
3) BAYSD.

The BAYSD procedure has been applied since the 1990s and widely applied in this century [1, 2, 12, 13]. The following introduces the BAYSD technique. The formula created by this technique is a set of nonlinear combinations with respect to the m parameters (x1, x2, …, xm), plus two constant terms [1, 2]:

    B_l(x) = \ln(p_l) + c_{0l} + \sum_{j=1}^{m} c_{jl} x_j \quad (l = 1, 2, \ldots, L)        (12)

where l is the class number, L is the number of classes, Bl(x) is the discrimination function of the lth class with respect to x, cjl is the coefficient of xj in the lth discrimination function, and pl and c0l are two constant terms in the lth discrimination function. The constants pl, c0l, c1l, c2l, …, cml are deduced using Bayes' theorem and calculated by the successive Bayesian discrimination of BAYSD. Eq. 12 is the so-called Bayesian discrimination function. In rare cases an introduced xk can be deleted from the Bayesian discrimination function, and in much rarer cases a deleted xk can be introduced again. Therefore, Eq. 12 is usually solved via m iterations.

Once Eq. 12 is created, any sample shown by Eq. 1 or Eq. 3 can be substituted into Eq. 12 to obtain L values: B1, B2, …, BL. If

    B_{l_b} = \max_{1 \le l \le L} \{ B_l \}        (13)

then y = lb for this sample. Eq. 13 defines the so-called nonlinear function y = BAYSD(x1, x2, …, xm).
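The successive Bayesian discrimination that produces the constants is not reproduced here, but once they are available, evaluating Eqs. 12-13 is a one-liner. In the sketch below, all coefficient values are hypothetical placeholders (they are not the fitted Eq. 17 values of the case study).

```python
# A minimal sketch of evaluating Eqs. 12-13, assuming the constants p_l,
# c_0l and c_jl have already been produced by the successive discrimination.
import numpy as np

def baysd_predict(x, p, c0, c):
    """x: (m,); p: (L,) priors; c0: (L,); c: (L, m) coefficients.
    Returns the class l_b maximizing B_l(x) = ln(p_l) + c_0l + sum_j c_jl*x_j."""
    B = np.log(p) + np.asarray(c0) + np.asarray(c) @ np.asarray(x)  # Eq. 12
    return int(np.argmax(B)) + 1                                    # Eq. 13

# Illustrative call with hypothetical two-class, two-variable constants:
print(baysd_predict([1.0, 2910.8], p=[0.6, 0.4],
                    c0=[-3.0, -2.5], c=[[1.2, 0.8], [0.9, 1.1]]))
```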

4) MRA. The MRA procedure has been widely applied since the 1970s, and the successive regression analysis, the most popular MRA technique, is still a very useful tool [1, 2, 14, 15]. The formula created by this technique is a linear combination with respect to the m parameters (x1, x2, …, xm), plus a constant term, the so-called linear function y = MRA(x1, x2, …, xm) [1, 2]:

    y = b_0 + b_1 x_1 + b_2 x_2 + \cdots + b_m x_m        (14)

where the constants b0, b1, b2, …, bm are deduced using regression criteria and calculated by the successive regression analysis of MRA. Eq. 14 is the so-called regression equation. In rare cases an introduced xk can be deleted from the regression equation, and in much rarer cases a deleted xk can be introduced again. Therefore, Eq. 14 is usually solved via m iterations.
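The form of Eq. 14 can be fitted by ordinary least squares, as in the hedged sketch below; note that the paper's successive regression analysis additionally introduces and removes variables iteratively, which this minimal version does not attempt, and the sample arrays are placeholders.

```python
# Fit the linear form of Eq. 14 by ordinary least squares (a simplification
# of the successive regression analysis; no variable selection is done here).
import numpy as np

X = np.array([[1, 2911.4, 3065.3], [1, 2911.5, 3065.4],
              [2, 2910.4, 3063.5], [2, 2910.0, 3063.0]])  # placeholder samples
y = np.array([1.0, 2.0, 1.0, 2.0])

A = np.column_stack([np.ones(len(X)), X])     # prepend a column for b0
coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # (b0, b1, ..., bm) of Eq. 14
y_hat = A @ coef                              # regression estimates y_i
print(coef, y_hat)
```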

2.3 Dimensionality Reduction

Dimensionality reduction means reducing the number of dimensions of a data space as far as possible while leaving the results of the studied problem unchanged. Its benefits are threefold: reducing the amount of data enhances the calculation speed; reducing the number of independent variables extends the applicable ranges; and reducing the misclassification ratio of prediction samples enhances the processing quality.

Among the aforementioned four algorithms, MRA and BAYSD can each serve as a promising dimension-reduction tool, because both can give the dependence of the predicted value (y) on the independent variables (x1, x2, …, xm) in decreasing order. However, because MRA performs data analysis under linear correlation whereas BAYSD does so under nonlinear correlation, in applications the preferable tool is BAYSD; MRA is applicable only when the studied problem is linear. Whether such a "promising tool" succeeds or not needs a high-quality nonlinear tool (e.g., C-SVM) for validation, so as to determine how many independent variables can be removed. For instance, the classification problem in the case study below indicates that the 6-D problem (x1, x2, x3, x4, x5, y) cannot be reduced to the 5-D problem (x1, x3, x4, x5, y). A validation sketch follows.
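The validation step can be sketched as follows: drop the least important variable according to the BAYSD/MRA dependence order, re-run C-SVM, and compare the total residual R(%). The arrays below are placeholders, and the column index to drop is illustrative.

```python
# A sketch of the dimension-reduction validation described above.
import numpy as np
from sklearn.svm import SVC

def r_total(X, y):
    """R(%) of a C-SVM fitted and then re-applied to the same samples."""
    y_hat = SVC(kernel="rbf").fit(X, y).predict(X)
    return float(np.abs((y_hat - y) / y).mean() * 100)

X = np.array([[1, 2911.4, 3065.3], [1, 2911.5, 3065.4],
              [2, 2910.4, 3063.5], [2, 2910.0, 3063.0]])
y = np.array([1, 1, 2, 2])

r_full = r_total(X, y)                        # all m variables kept
r_drop = r_total(np.delete(X, 1, axis=1), y)  # e.g. drop column 1 (here "x2")
print(r_full, r_drop)  # the reduction fails if r_drop is clearly worse
```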

3 CASE STUDY: THE PREDICTION OF METHANE INCLUSION TYPES

The objective of this case study is to conduct methane inclusion classification (MIC) for high density methane inclusions, which has practical value when experimental data are limited. Data of 25 samples from the Puguang Gasfield in south-western China [3] are used; each sample contains 5 independent variables (x1 = major mineral, x2 = CH4, x3 = high wave number, x4 = low wave number, x5 = C6H6) and an experimental result (y* = MIC). Of these, 23 are taken as learning samples and 2 as prediction samples for the prediction of MIC (Table 2) by C-SVM, NBAY and BAYSD.

TABLE 2 INPUT DATA FOR METHANE INCLUSION CLASSIFICATION OF THE PUGUANG GASFIELD [modified from (Liu et al., 2013)]

Sample type         No.  Inclusion No.  x1  x2       x3       x4       x5       MIC b, y*
Learning samples     1   PG5-d          1   2910.79  1281.46  1384.86  3065.3   1
                     2   PG5-e          1   2911.44  1281.96  1384.64  3066.6   1
                     3   PG5-g          1   2910.79  1281.73  1384.33  3064.9   1
                     4   PG5-h          1   2911.69  1282.03  1384.89  3067.8   1
                     5   PG5-u          1   2910.97  1281.46  1384.86  3065.4   1
                     6   PG5-L          1   2911.44  1281.95  1384.85  3064.1   2
                     7   PG5-m          1   2911.51  1281.56  1385     3065.4   2
                     8   PG5-n          1   2911.51  1281.91  1384.95  3065.2   2
                     9   PG3-c          2   2911     1281     1384.5   3063     1
                    10   PG3-d          2   2910.3   1281.5   1384.1   3061.7   1
                    11   PG3-e          2   2911     1280.9   1384.4   3063     1
                    12   PG3-f          2   2911     1281.4   1383.6   3063.03  1
                    13   PG3-g          2   2911     1281.2   1384.1   3064.3   1
                    14   PG3-h          2   2910     1280.1   1383.3   3063     1
                    15   PG3-3          2   2910.4   1280.7   1384     3063.5   1
                    16   PG3-36         2   2910.8   1280.9   1383.3   3060.9   1
                    17   PG3-33         2   2910     1280.7   1383.8   3064.8   1
                    18   PG3-44         2   2909.8   1280.6   1383.7   3064.8   2
                    19   PG3-14         2   2910.4   1280.7   1383.7   3603.6   2
                    20   PG3-16         2   2910.4   1280.1   1383.3   3062.9   2
                    21   PG3-18         2   2910.4   1280.7   1383.2   3065.4   2
                    22   PG3-65         2   2910.4   1280.5   1383.3   3062.5   2
                    23   PG3-66         2   2910.4   1281.4   1384.5   3063.5   2
Prediction samples  24   PG5-a          1   2910.76  1281.23  1384.20  3062.1   (1)
                    25   PG3-17         2   2910.4   1280     1383.5   3065.4   (2)

a x1 = major mineral (1 = calcite, 2 = quartz); x2 = CH4; x3 = high wave number; x4 = low wave number; x5 = C6H6. x2–x5 are displacements of characteristic peaks measured by Raman spectra for high density methane inclusions (cm−1).
b y* = MIC = methane inclusion classification (1 = single phase, 2 = gas/solid) determined by experimentation; a number in parentheses is not input data, but is used for calculating R(%)i.

Table 2 shows that within each of x2, x3, x4, and x5, the values are very close to one another. This is a rare case in data mining, so algorithms with high discriminating ability are required.

3.1 Learning Process

Using the 23 learning samples (Table 2) and C-SVM, NBAY, BAYSD and MRA, the following four functions of MIC (y) with respect to the 5 independent variables (x1, x2, x3, x4, x5) have been constructed.

Using C-SVM, the result is an explicit nonlinear function corresponding to Eq. 8:

    y = \text{C-SVM}(x_1, x_2, x_3, x_4, x_5)        (15)

with C = 8192, γ = 8, 22 free vectors xi, and a cross-validation accuracy CVA = 73.913%.

Using NBAY, the result is an explicit nonlinear discrimination function corresponding to Eq. 10:

    N_l(x) = \prod_{j=1}^{5} \frac{1}{\sigma_{jl} \sqrt{2\pi}} \exp\!\left( \frac{-(x_j - \mu_{jl})^{2}}{2\sigma_{jl}^{2}} \right) \quad (l = 1, 2)        (16)

where for l = 1, σj1 = 0.479, 0.469, 0.517, 0.527, 1.8 and μj1 = 1.64, 2911, 1281, 1384, 3064; for l = 2, σj2 = 0.471, 0.589, 0.631, 0.719, 170 and μj2 = 1.67, 2911, 1281, 1384, 3124.

Using BAYSD, the result is an explicit nonlinear discrimination function corresponding to Eq. 12:

    B_1(x) = \ln(0.609) - 0.301 \times 10^{8} + 16399 x_1 + 20469 x_2 - 3297 x_3 + 3537 x_4 + 1.76 x_5
    B_2(x) = \ln(0.391) - 0.301 \times 10^{8} + 16398 x_1 + 20469 x_2 - 3298 x_3 + 3537 x_4 + 1.77 x_5        (17)

From the successive process, MIC (y) is shown to depend on the 5 independent variables in the decreasing order x5, x3, x1, x4, x2.

Though MRA is a regression algorithm rather than a classification algorithm, it can provide the nonlinearity degree of the studied problem, and thus it is required to run MRA. Using MRA, the result is an explicit linear function corresponding to Eq. 14:

    y = -359 - 0.274 x_1 - 0.000641 x_2 - 0.213 x_3 - 0.0618 x_4 + 0.00113 x_5        (18)

Equation 18 yields a residual variance of 0.894 and a multiple correlation coefficient of 0.325. From the regression process, MIC (y) is shown to depend on the 5 independent variables in the same decreasing order: x5, x3, x1, x4, x2.
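For readers who wish to retrace the C-SVM learning step, the following hedged sketch feeds the 23 learning samples of Table 2 into scikit-learn's LIBSVM wrapper with the reported settings (C = 8192, γ = 8, TCA = 0.001); exact agreement with the paper's LIBSVM run and its cross-validation accuracy is not guaranteed.

```python
import numpy as np
from sklearn.svm import SVC

# The 23 learning samples of Table 2: columns x1, x2, x3, x4, x5, y* (MIC).
learn = np.array([
    [1, 2910.79, 1281.46, 1384.86, 3065.3, 1], [1, 2911.44, 1281.96, 1384.64, 3066.6, 1],
    [1, 2910.79, 1281.73, 1384.33, 3064.9, 1], [1, 2911.69, 1282.03, 1384.89, 3067.8, 1],
    [1, 2910.97, 1281.46, 1384.86, 3065.4, 1], [1, 2911.44, 1281.95, 1384.85, 3064.1, 2],
    [1, 2911.51, 1281.56, 1385.00, 3065.4, 2], [1, 2911.51, 1281.91, 1384.95, 3065.2, 2],
    [2, 2911.00, 1281.00, 1384.50, 3063.0, 1], [2, 2910.30, 1281.50, 1384.10, 3061.7, 1],
    [2, 2911.00, 1280.90, 1384.40, 3063.0, 1], [2, 2911.00, 1281.40, 1383.60, 3063.03, 1],
    [2, 2911.00, 1281.20, 1384.10, 3064.3, 1], [2, 2910.00, 1280.10, 1383.30, 3063.0, 1],
    [2, 2910.40, 1280.70, 1384.00, 3063.5, 1], [2, 2910.80, 1280.90, 1383.30, 3060.9, 1],
    [2, 2910.00, 1280.70, 1383.80, 3064.8, 1], [2, 2909.80, 1280.60, 1383.70, 3064.8, 2],
    [2, 2910.40, 1280.70, 1383.70, 3603.6, 2], [2, 2910.40, 1280.10, 1383.30, 3062.9, 2],
    [2, 2910.40, 1280.70, 1383.20, 3065.4, 2], [2, 2910.40, 1280.50, 1383.30, 3062.5, 2],
    [2, 2910.40, 1281.40, 1384.50, 3063.5, 2]])
X_learn, y_learn = learn[:, :5], learn[:, 5].astype(int)

# The 2 prediction samples (Nos. 24-25 of Table 2).
X_pred = np.array([[1, 2910.76, 1281.23, 1384.20, 3062.1],
                   [2, 2910.40, 1280.00, 1383.50, 3065.4]])

clf = SVC(kernel="rbf", C=8192, gamma=8, tol=0.001)  # the reported settings
clf.fit(X_learn, y_learn)                  # Eq. 15: y = C-SVM(x1, ..., x5)
print(len(clf.support_vectors_))           # the paper reports 22 free vectors
print(clf.predict(X_learn))                # learning validation, cf. Table 3
print(clf.predict(X_pred))                 # prediction; the paper obtains 1 and 2
```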

3.2 Prediction Process

TABLE 3 PREDICTION RESULTS FROM METHANE INCLUSION CLASSIFICATION OF THE PUGUANG GASFIELD

                                      C-SVM        NBAY         BAYSD        MRA
Sample type   No.  MIC a, y*     y   R(%)i     y   R(%)i    y   R(%)i    y     R(%)i
Learning       1       1         1   0         1   0        1   0        1.43  43.5
samples        2       1         1   0         1   0        2   100      1.34  34.3
               3       1         1   0         1   0        1   0        1.41  40.9
               4       1         1   0         1   0        2   100      1.31  31.4
               5       1         1   0         1   0        1   0        1.43  43.5
               6       2         2   0         1   50       1   50       1.33  33.6
               7       2         2   0         1   50       1   50       1.40  29.8
               8       2         2   0         1   50       1   50       1.33  33.4
               9       1         1   0         1   0        1   0        1.28  27.8
              10       1         1   0         1   0        1   0        1.19  19.5
              11       1         1   0         1   0        2   100      1.31  30.5
              12       1         1   0         1   0        2   100      1.25  24.8
              13       1         1   0         1   0        2   100      1.26  26.1
              14       1         1   0         1   0        2   100      1.54  54.4
              15       1         1   0         1   0        2   100      1.37  37.4
              16       1         1   0         1   0        1   0        1.37  37.1
              17       1         1   0         1   0        2   100      1.39  38.8
              18       2         2   0         1   50       2   0        1.42  29.2
              19       2         2   0         2   0        2   0        2.00  0.0364
              20       2         2   0         1   50       2   0        1.54  22.8
              21       2         2   0         1   50       2   0        1.43  28.8
              22       2         2   0         1   50       2   0        1.46  27.1
              23       2         2   0         1   50       2   0        1.19  40.3
Prediction    24       1         1   0         1   0        1   0        1.52  52.1
samples       25       2         2   0         1   50       2   0        1.56  22.2

a MIC = methane inclusion classification (1 = single phase, 2 = gas/solid)


Substituting the values of the 5 independent variables (x1, x2, x3, x4, x5) given by the 23 learning samples and 2 prediction samples (Table 2) into Eqs. 15, 16 (then Eq. 11), 17 (then Eq. 13) and 18, respectively, the MIC (y) of each sample is obtained (Table 3).

Table 4 shows that a) the nonlinearity degree of this studied problem is strong, since the R(%) of MRA is 32; and b) the solution accuracies of C-SVM, NBAY and BAYSD are high, moderate and low, respectively. Therefore, only C-SVM is applicable; though the solution accuracy of NBAY is moderate, it is inapplicable. This might be due to the fact that within each of x2, x3, x4, and x5 the values are very close to one another, showing that C-SVM has high discriminating ability.

TABLE 4 COMPARISON AMONG THE APPLICATIONS OF THE CLASSIFICATION ALGORITHMS (C-SVM, NBAY AND BAYSD) TO METHANE INCLUSION CLASSIFICATION OF THE PUGUANG GASFIELD

           Fitting              Mean absolute relative residual   Dependence of y on (x1, x2, x3,     Time consumed on      Solution
Algorithm  formula              R1(%)   R2(%)   R(%)              x4, x5), in decreasing order        PC (Intel Core 2)     accuracy
C-SVM      Nonlinear, explicit  0       0       0                 N/A                                 5 s                   High
NBAY       Nonlinear, explicit  17.4    25      18                N/A                                 <1 s                  Moderate
BAYSD      Nonlinear, explicit  41.3    0       38                x5, x3, x1, x4, x2                  1 s                   Low
MRA        Linear, explicit     32      37      32                x5, x3, x1, x4, x2                  <1 s                  Strong nonlinearity

3.3 Dimension Reduction Failed

Both BAYSD and MRA give the dependence of the predicted value (y) on the 5 independent variables in the decreasing order x5, x3, x1, x4, x2 (Table 4). According to this dependence order, x2 was deleted first and C-SVM was rerun; it was found that the results of C-SVM changed, i.e., R(%) = 18, which is much greater than the previous R(%) = 0 (Table 4). Thus the 6-D problem (x1, x2, x3, x4, x5, y) cannot become the 5-D problem (x1, x3, x4, x5, y), showing that the expression of y needs all of x1, x2, x3, x4, x5. In general, dimension reduction succeeds for high-dimensional problems [1, 5, 8].

4 CONCLUSIONS

Through the aforementioned case study, five major conclusions can be drawn: 1) the two proposed criteria (solution accuracy of a given algorithm application, nonlinearity degree of a studied problem) are practical; 2) the total mean absolute relative residual R(%) of MRA can be used to measure the nonlinearity degree of a studied problem, and thus MRA should be run first; 3) none of NBAY, BAYSD and MRA can be applied to classification problems with strong nonlinearity, but C-SVM can; 4) if a classification problem has weak or moderate nonlinearity, C-SVM and BAYSD are in general applicable, whereas NBAY is sometimes applicable; and 5) besides being a strongly nonlinear problem, this case study is a rare case in data mining in that within each of x2, x3, x4, and x5 the values are very close to one another, showing that C-SVM has high discriminating ability.

ACKNOWLEDGMENT

This work was supported by the Research Institute of Petroleum Exploration and Development (RIPED) and PetroChina.

REFERENCES

[1] Shi G. “Data Mining and Knowledge Discovery for Geoscientists.” Elsevier Inc, USA, 2013
[2] Shi G. “Optimal prediction in petroleum geology by regression and classification methods.” Sci J Inf Eng 5(2): 14-32, 2015
[3] Liu D, Xiao X, Tian H, Dai J, Wang Y, Yang C, Hu A, Mi J. “Occurrence and genesis of multiple types of high density methane inclusions in Puguang Gas Field.” Sci J Earth Sci 3(1): 1-10, 2013
[4] Shi G. “The use of support vector machine for oil and gas identification in low-porosity and low-permeability reservoirs.” Int J Math Model Numer Optimisa 1(1/2): 75-87, 2009
[5] Shi G, Yang X. “Optimization and data mining for fracture prediction in geosciences.” Procedia Comput Sci 1(1): 1353-1360, 2010
[6] Chang C, Lin C. “LIBSVM: a library for support vector machines, Version 3.1.” Retrieved from www.csie.ntu.edu.tw/~cjlin/libsvm, 2011
[7] Zhu Y, Shi G. “Identification of lithologic characteristics of volcanic rocks by support vector machine.” Acta Petrolei Sinica 34(2): 312-322, 2013
[8] Shi G, Zhu Y, Mi S, Ma J, Wan J. “A big data mining in petroleum exploration and development.” Adv Petrol Expl Devel 7(2): 18, 2014
[9] Ramoni M, Sebastiani P. “Robust Bayes classifiers.” Artificial Intelligence 125(1-2): 207-224, 2001
[10] Tan P, Steinbach M, Kumar V. “Introduction to Data Mining.” Pearson Education, Boston, MA, USA, 2005
[11] Han J, Kamber M. “Data Mining: Concepts and Techniques, 2nd Ed.” Morgan Kaufmann, San Francisco, CA, USA, 2006
[12] Denison DGT, Holmes CC, Mallick BK, Smith AFM. “Bayesian Methods for Nonlinear Classification and Regression.” John Wiley & Sons Inc, Chichester, England, UK, 2002
[13] Shi G. “Four classifiers used in data mining and knowledge discovery for petroleum exploration and development.” Adv Petrol Expl Devel 2(2): 12-23, 2011
[14] Sharma MSR, O'Regan M, Baxter CDP, Moran K, Vaziri H, Narayanasamy R. “Empirical relationship between strength and geophysical properties for weakly cemented formations.” J Petro Sci Eng 72(1-2): 134-142, 2010
[15] Singh J, Shaik B, Singh S, Agrawal VK, Khadikar PV, Deeb O, Supuran CT. “Comparative QSAR study on para-substituted aromatic sulphonamides as CAII inhibitors: information versus topological (distance-based and connectivity) indices.” Chem Biol Drug Design 71: 244-259, 2008

AUTHOR

Guangren Shi was born in Shanghai, China, in February 1940. He is an Expert and Professor with the qualification of directing Ph.D. students. He graduated from Xi'an Jiaotong University, China, in 1963, majoring in applied mathematics (1958–1963). Since 1963, he has been engaged in computer applications for petroleum exploration and development. In the recent 30 years, his research has covered two fields: basin modeling (petroleum systems) and data mining for geosciences; in recent years, he has focused on the latter more than the former.

He has more than 50 years of professional experience, working for the computer center of Daqing Oilfield, Petroleum Ministry, China, as Associate Engineer and Head of the software group (1963–1967); the computer center of Shengli Oilfield, Petroleum Ministry, China, as Engineer and Director (1967–1978); the computer center of the Petroleum Ministry, China, as Engineer and Head of the software group (1978–1985); the Aldridge Laboratory of Applied Geophysics, Columbia University, New York City, U.S.A., as Visiting Scholar (1985–1987); the computer application technology research department, Research Institute of Petroleum Exploration and Development (RIPED), China National Petroleum Corporation (CNPC), China, as Professor and Director (1987–1997); RIPED, PetroChina Company Limited (PetroChina), China, as Professor with the qualification of directing Ph.D. students and Deputy Chief Engineer (1997–2001); and the department of experts in RIPED of PetroChina, China, as Expert and Professor with the qualification of directing Ph.D. students (2001–present).

He has published eight books, of which three are in English: 1) Shi G. R. 2013. Data Mining and Knowledge Discovery for Geoscientists. Elsevier Inc, USA. 367 pp; 2) Shi G. R. 2005. Numerical Methods of Petroliferous Basin Modeling, 3rd edition. Petroleum Industry Press, Beijing, China. 338 pp, which was book-reviewed by Mathematical Geosciences in 2009; and 3) Shi G. R. 2000. Numerical Methods of Petroliferous Basin Modeling, 2nd edition. Petroleum Industry Press, Beijing, China. 233 pp, which was book-reviewed by Mathematical Geology in 2006. He has also published 74 articles, of which 16 are in English, including four indexed by SCI: 1) Shi G. R., Zhang Q. C., Yang X. S., Mi S. Y. 2010. Oil and gas assessment of the Kuqa Depression of Tarim Basin in western China by simple fluid flow models of primary and secondary migrations of hydrocarbons. Journal of Petroleum Science and Engineering, 75(1-2): 77–90; 2) Shi G. R. 2009. A simplified dissolution-precipitation model of the smectite to illite transformation and its application. Journal of Geophysical Research-Solid Earth, 114, B10205, doi:10.1029/2009JB006406; 3) Shi G. R. 2008. Basin modeling in the Kuqa Depression of the Tarim Basin (Western China): A fully temperature-dependent model of overpressure history. Mathematical Geosciences, 40(1): 47–62; and 4) Shi G. R., Zhou X. X., Zhang G. Y., Shi X. F., Li H. H. 2004. The use of artificial neural network analysis and multiple regression for trap quality evaluation: a case study of the Northern Kuqa Depression of Tarim Basin in western China. Marine and Petroleum Geology, 21(3): 411–420.

Prof. Shi is a Member of the Society of Petroleum Engineers (International), a Member of the Chinese Association of Science and Technology, and a Member of the Petroleum Society of China. He is also Regional Editor (Asia) of the International Journal of Mathematical Modelling and Numerical Optimisation, and a Member of the Editorial Board of the Journal of Petroleum Science Research. He has received three honors: 1) A Person Studying Overseas and Returning with Excellent Contribution, appointed by the Ministry of Education of China (1991); 2) Special Government Allowance, awarded by the State Council of China (1994); and 3) Grand Award of Sun Yueqi Energy, awarded by the Ministry of Science-Technology of China (1997). He has also obtained four awards of Science-Technology Progress, of which one is a China National Award and three are from CNPC and PetroChina.


