GRD Journals- Global Research and Development Journal for Engineering | Volume 5 | Issue 5 | April 2020 ISSN- 2455-5703
Comparative Study of Classification of Cancerous Profile using Deep Learning and Classification Algorithms
Snehal Mastud Department of Computer Engineering Smt. Indira Gandhi College of Engineering
Shweta Pandit Department of Computer Engineering Smt. Indira Gandhi College of Engineering
Ankita Mungekar Department of Computer Engineering Smt. Indira Gandhi College of Engineering
Prof. Sachin Desai Department of Computer Engineering Smt. Indira Gandhi College of Engineering
Abstract
In this paper, we present a comparative study of the ML and DL approaches employed in the modelling of cancer progression. We give a comparative analysis of ML and DL techniques and their classification algorithms. Our system also includes a website with which a hospital lab technician can easily detect the type of cancer.
Keywords- SVM (Support Vector Machine), ML (Machine Learning), DL (Deep Learning), ANN (Artificial Neural Network), CNN (Convolutional Neural Network), Naïve Bayes, Decision Tree
I. INTRODUCTION
We aim to create a system that can detect whether a person has cancer. We have implemented ML and DL techniques and their classification algorithms. The implemented Machine Learning algorithms are SVM, Naïve Bayes, and Decision Tree; the implemented Deep Learning algorithms are ANN and CNN. These algorithms are applied to a data set taken from Kaggle, and each algorithm reports its accuracy as a percentage. Our system also contains a website, which a lab technician can use to predict cancer. Using the website, we can easily detect two types of cancer: Malignant cancer and Metastatic cancer.

A. Support Vector Machine Algorithm
Support Vector Machine (SVM) is a supervised machine learning algorithm which can be used for both classification and regression challenges. However, it is mostly used in classification problems. In this algorithm, we plot each data item as a point in n-dimensional space (where n is the number of features) with the value of each feature being the value of a particular coordinate. Support vectors are simply the coordinates of individual observations.
In this paper, the input to the Support Vector Machine is the training data, and the output for the testing data is a decision value. The method follows these steps: load the dataset; classify the features (attributes) based on class labels; then estimate the candidate Support Value. While instances remain (instances != null), the Support Value is set to the similarity between each instance in the attribute, and the Total Error Value is then found. If any instance < 0, the estimated decision value = Support Value\Total Error. This is repeated for all points until none remain. We have mainly calculated the entropy and Gini index.

B. Artificial Neural Network
An Artificial Neural Network (ANN) is an interconnected group of nodes that uses a computational model for information processing. It changes its structure based on external or internal information that flows through the network. An ANN can be used to model a complex relationship between inputs and outputs and to find patterns in data.

C. Naïve Bayes Algorithm
Naïve Bayes is a relatively simple machine learning technique based on a probability model, Bayes' theorem. It belongs to the family of probabilistic classifiers in machine learning based on Bayes' theorem, with strong statistical independence assumed between the features.

$$P(h_k \mid x_j) = \frac{P(x_j \mid h_k)\,P(h_k)}{\sum_{i=0}^{n} P(x_j \mid h_i)\,P(h_i)}; \quad 0 < k < n + 1; \quad i, j, k \in N \qquad (2)$$

This classification technique analyses the relationship between each feature and the class for each instance to derive a conditional probability for the relationships between the feature values and the class.
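As a worked illustration of the Naïve Bayes rule described above, the following minimal Python sketch computes class posteriors from priors and per-feature conditional probabilities. The class names, the probability values, and the helper names `naive_likelihood` and `posterior` are hypothetical and chosen only for illustration; they are not taken from the data set used in this paper.

```python
from fractions import Fraction
from math import prod


def naive_likelihood(feature_probs):
    """P(x | h) under the naive assumption: the product of the
    per-feature conditional probabilities P(x_j | h)."""
    return prod(feature_probs)


def posterior(priors, likelihoods):
    """Bayes' theorem: P(h_k | x) = P(x | h_k) P(h_k) / sum_i P(x | h_i) P(h_i)."""
    joint = {h: priors[h] * likelihoods[h] for h in priors}
    evidence = sum(joint.values())  # denominator: total probability of x
    return {h: joint[h] / evidence for h in joint}


# Hypothetical numbers (assumptions, not results from the paper).
priors = {"benign": Fraction(3, 5), "malignant": Fraction(2, 5)}
feature_probs = {
    "benign": [Fraction(1, 5), Fraction(1, 2)],       # P(x_1 | benign), P(x_2 | benign)
    "malignant": [Fraction(7, 10), Fraction(9, 10)],  # P(x_1 | malignant), P(x_2 | malignant)
}

likelihoods = {h: naive_likelihood(p) for h, p in feature_probs.items()}
post = posterior(priors, likelihoods)
predicted = max(post, key=post.get)  # class with the largest posterior

print(post)       # benign: 5/26, malignant: 21/26
print(predicted)  # malignant
```

Exact fractions are used here so that the arithmetic can be checked by hand; a practical classifier would normally work with floating-point log-probabilities instead.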
All rights reserved by www.grdjournals.com
Comparative Study of Classification of Cancerous Profile using Deep Learning and Classification Algorithms (GRDJE/ Volume 5 / Issue 5 / 004)
D. Decision Tree Algorithm
Decision tree learning uses a decision tree (as a predictive model) to go from observations about an item (represented in the branches) to conclusions about the item's target value (represented in the leaves).

E. Convolutional Neural Network
A Convolutional Neural Network (CNN) comprises one or more convolution layers (often with a subsampling step) followed by one or more fully connected layers, as in a standard multilayer neural network. The architecture of a CNN is designed to take advantage of the 2D structure of an input image (or other 2D input such as a speech signal).
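To make the convolution and subsampling steps concrete, here is a minimal pure-Python sketch of one convolution layer followed by max pooling. The 5x5 input, the 2x2 edge-detecting kernel, and the function names `conv2d_valid` and `max_pool2d` are assumptions made for illustration; this is not the network trained in this paper, and in a full CNN the pooled map would be flattened and passed to fully connected layers.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding and return the
    feature map (cross-correlation, the usual CNN 'convolution')."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j]
                for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool2d(feature_map, size=2):
    """Subsampling step: keep the maximum of each non-overlapping size x size window."""
    rows = range(0, len(feature_map) - size + 1, size)
    cols = range(0, len(feature_map[0]) - size + 1, size)
    return [
        [
            max(feature_map[r + i][c + j]
                for i in range(size) for j in range(size))
            for c in cols
        ]
        for r in rows
    ]


# Hypothetical 5x5 "image": dark left half, bright right half.
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical 2x2 kernel that responds to a dark-to-bright vertical edge.
kernel = [[-1, 1],
          [-1, 1]]

feature_map = conv2d_valid(image, kernel)  # 4x4 map, peaks at the edge
pooled = max_pool2d(feature_map)           # 2x2 map after subsampling

print(feature_map)  # [[0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0], [0, 2, 0, 0]]
print(pooled)       # [[2, 0], [2, 0]]
```

The feature map is largest exactly where the kernel pattern occurs in the image, and pooling halves each spatial dimension while keeping that response, which is how the layer exploits the 2D structure of the input.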
II. LITERATURE SURVEY
Classification of Cancerous Profiles using Machine Learning Algorithms [3]: - Many existing methods are available for lung cancer identification. The type of treatment recommended for an individual is influenced by various factors such as cancer type, the severity of the cancer (stage) and, most importantly, genetic heterogeneity. In such a complex environment, patients are likely to be unresponsive to targeted drug treatments or to respond differently. Hence, there is a need to analyse cancer data to predict optimal treatment options. Analysis of such profiles can help to predict and discover potential drug targets and drugs. The main aim of that paper is to provide a machine learning based classification technique for cancerous profiles.

A Comparative Study of Machine Learning Algorithms applied to Predictive Breast Cancer Data [4]: - Diagnostic errors are the most frequent non-operative medical errors. Diagnosis should be more data-driven than trial-and-error. Machine Learning provides techniques for classification and regression which can be used for solving diagnostic problems in different medical domains. Predictive analysis of fatal ailments like cancer using existing data can serve as a diagnosis tool for doctors. The paper aims at a comparative study of Machine Learning algorithms on a predictive breast cancer dataset. The algorithms used for comparison - Artificial Neural Networks (ANN), k-Nearest Neighbors (kNN) and Bayesian Network Classifiers - are supervised learning algorithms used widely for classification purposes and are chosen for their diversity. Based on analysis of this data, Artificial Neural Networks are better at classification, with 97.4% accuracy, than kNN and Bayesian Classifiers. Keywords: machine learning, medical diagnosis, breast cancer, neural networks, k nearest neighbors, Bayesian classifiers.
III. METHODOLOGY
A. Classification
Step 1: Load the data into Python for classification.
Step 2: Pre-process the data if required.
Step 3: Split the data into training and testing data sets.
Step 4: Implement the algorithms stated previously on the dataset.
Step 5: Record the accuracy of all algorithms.
Step 6: Compare all algorithms on the basis of their accuracy and select the algorithm with the highest accuracy.

B. User Interface
Step 1: Create a user interface.
Step 2: Enter user ID and password.
Step 3: Click on the submit button.

C. System Requirement
1) Hardware Requirements
Workstation with minimum 4 GB RAM, i5 core processor (or anything equivalent), 16 GB or more hard disk space
2) Software Requirements
Python 3.6.3
Anaconda (Jupyter Notebook)
MySQL
PyCharm IDE
Web browser (Google Chrome preferred)

D. Functional Requirements
Technologies: Python
Platform: Python
Application Frameworks/APIs:
The following frameworks/APIs will be required: Sklearn, Pandas, Numpy, Tkinter, Matplotlib, Python CV

E. Objectives of the Project
We aim to achieve the following through this project:
- Provide an intelligent and interactive system for detecting cancer accurately.
- Evaluate whether recent methods that improve algorithm performance and accuracy in a distributed environment can be used to identify cancer.
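The classification steps of Section III-A can be sketched with Sklearn roughly as follows. This is a minimal sketch under stated assumptions, not the exact code used in this paper: the Kaggle data set is replaced by synthetic data from `make_classification`, the pre-processing is assumed to be feature standardisation, only the three ML classifiers are included, and the hyperparameters are Sklearn defaults.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Step 1: load the data. Assumption: synthetic stand-in for the Kaggle data set.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

# Step 3: split into training and testing sets (done before scaling so the
# scaler never sees the test data).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Step 2: pre-process. Assumption: standardise features using training statistics.
scaler = StandardScaler().fit(X_train)
X_train_s = scaler.transform(X_train)
X_test_s = scaler.transform(X_test)

# Step 4: the ML classifiers named in this paper, with default settings.
models = {
    "SVM": SVC(),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
}

# Step 5: record the test accuracy of every algorithm.
accuracies = {}
for name, model in models.items():
    model.fit(X_train_s, y_train)
    accuracies[name] = accuracy_score(y_test, model.predict(X_test_s))

# Step 6: compare and select the algorithm with the highest accuracy.
best = max(accuracies, key=accuracies.get)
for name, acc in accuracies.items():
    print(f"{name}: {acc:.1%}")
print("Highest accuracy:", best)
```

Fixing `random_state` makes the split and the decision tree reproducible; the accuracies printed for synthetic data say nothing about the accuracies reported in this paper.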
IV. FLOWCHART
V. RESULT AND CONCLUSION
The proposed system, planned after an extensive literature survey, includes the following feature: implementation of ML and DL algorithms on the dataset.
Fig. 1: Result Accuracy
The bar chart in Fig. 1 compares the accuracy of the algorithms. SVM, ANN and CNN achieved the highest accuracy at 90%, followed by Decision Tree at 80% and Naïve Bayes at 60%. Both DL algorithms (ANN and CNN) reached the highest accuracy, while among the ML algorithms only SVM matched them.

A. Website Snapshots
After entering user ID and password.
REFERENCES
[1] Pirooznia, M., Yang, J.Y., Yang, M.Q. et al. A comparative study of different machine learning methods on microarray gene expression data. BMC Genomics 9, S13 (2008). https://doi.org/10.1186/1471-2164-9-S1-S13
[2] Danso, SO, Atwell, ES and Johnson, O (2013) A comparative study of machine learning methods for verbal autopsy text classification. IJCSI International Journal of Computer Science Issues, 10 (6). ISSN 1694-0784
[3] Yaramala Sushma, Vagolu S Prasad Babu, Vanitha Kakollu, "Classification of Cancerous Profiles using Machine Learning Algorithms" International Journal of Engineering Trends and Technology 67.3 (2019): 99-101.
[4] Potdar, Kedar & Kinnerkar, Rishab. (2016). A Comparative Study of Machine Learning Algorithms applied to Predictive Breast Cancer Data. International Journal of Science and Research (IJSR). 5. 1550.
[5] Er, Orhan & Yumuşak, Nejat & Temurtas, Feyzullah. (2010). Chest diseases diagnosis using artificial neural networks. Expert Systems with Applications. 37. 7648-7655. 10.1016/j.eswa.2010.04.078.