
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 07 | July 2024 www.irjet.net p-ISSN: 2395-0072
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 07 | July 2024 www.irjet.net p-ISSN: 2395-0072
Vidyasre N 1 , Manasa T P 2
1 Student, Bangalore Institute of Technology, Karnataka, India
2 Assistant Professor, Dept. of Computer Science & Engineering, Bangalore Institute of Technology, Karnataka, India
Abstract - The prevention and early detection of ovarian cancer are critical for improving women's health and wellbeing. Ovarian cancer, often termed the "silent killer," typically presents without apparent symptoms in its early stages, making early detection challenging and significantly reducing the chances of successful treatment and survival. Machine Learning (ML) techniques offer powerful tools for identifying complex patterns in biomarkers, surpassing conventional diagnostic methods in accuracy. However, these advanced algorithms often lack transparency, making it difficult to gain physical insights into the diagnostic process. To address this, integrating eXplainable Artificial Intelligence (XAI) can enhance our understanding of how ML models make decisions, providing valuable insights into the diagnostic process. This paper aims to showcase best practices for combining XAI and ML methods in biomarker validation applications, ultimately improving early detection and treatment outcomes for ovarian cancer.
Key Words: Ovarian Cancer, Machine Learning Algorithms, eXplainable Artificial Intelligence (XAI), BiomarkerValidation,EarlyDetection.
Ovariancanceroftenreferredtoasthe"silentkiller,"posesa significant threat to women's health due to its typically asymptomatic early stages, which frequently result in delayed diagnosis and decreased treatment effectiveness. The undetected nature of this disease underscores the urgent need for improved detection techniques. Early detection is paramount for enhancing treatment effectiveness and achieving better patient outcomes. To address this critical issue, our study proposes a novel approachforpredictingovariancancer.Thisstudyprovides acomprehensiveanalysisofthedevelopmentandevaluation processoftheproposedmodel.Byleveragingthestrengths of both boosting and bagging techniques, our approach overcomes the limitations of individual classifiers. Additionally,theinterpretabilityofthemodelisenhanced throughtheapplicationofeXplainableAI(XAI)techniques, suchasSHapleyAdditiveExplanations(SHAP),whichoffer valuableinsightsintotheriskfactorsforovariancancer.This studyaimstoadvancetheprimaryobjectiveofimproving patient outcomes and reducing ovarian cancer-related mortality rates. The subsequent sections delve into the methodology, results, and implications of our approach,
providing a foundation for future advancements in the detectionandmanagementofovariancancer.
Aflowchartvisuallyrepresentsaprocess,usingstandardized symbolstoillustratesteps,decisions,andflowofinformation. It serves to clarify processes, improve understanding, and facilitatecommunicationinvariousfields,fromengineering tobusinessandhealthcare.
The flow chart diagram outlines the process of predicting and classifying ovarian cancer using machine learning techniques,startingwiththeinitiationoftheworkflow.The first step involves uploading the ovarian cancer dataset, followed by data preprocessing, where the raw data is cleanedandpreparedforanalysis.Next,featureextraction
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 07 | July 2024 www.irjet.net
identifies significant variables that contribute to the predictionofovariancancer.Themachinelearningmodelis thentrainedusingtheextractedfeatures.Aftertraining,the modelisevaluatedusingaseparatetestdatasettoassessits performance.Thetrainedmodelthenmakespredictionsor classifications based on the test data, determining the likelihood of ovarian cancer. Finally, the results of the prediction or classification process are displayed, and the workflowconcludes
Researchin the field of ovarian cancer detection has investigatedacollectionofapproachesandstrategieswith theobjectiveofenhancingtheoutcomesofearlydiagnosis andtreatment.
Thisresearchfocusesonimprovingtumorclassificationas malignant or benign, minimizing false positives through various machine learning methods, including logistic regression,RandomForests,decisiontrees,SVM,KNN,and deeplearningtechniques[1].ThegoalistopredictCritical Care Unit (CCU) admission more accurately for ovarian cancer patients undergoing surgery, with a graphical user interfaceforreal-timeanalysis[2].
The study compares premenopausal and postmenopausal women using HE4, CA125, and ROMA markers to differentiate between ovarian cancer and benign masses, alsoexploringtheimpactofageandcutoffvaluesonmarker performance[3].Itproposesahybridapproachcombining CNNandReliefFalgorithmsformicroarraydataanalysisto enhancecancerdetectionaccuracy[4].Challengessuchas highdimensionalityandsmallsamplesizesareaddressed.
The research employs Raman spectroscopy and machine learning to differentiate ovarian cancer from cysts and normalcases,aimingtoimprovediagnosticaccuracy[5].It also develops a predictive model for mortality risk using gene expression data, combining genetic algorithms and extreme gradient boosting [6]. The study uses statistical analysis and ensemble models to identify important biomarkers for distinguishing cancer from benign tumors [7].
MachinelearningmodelsdevelopedontheSEERdatabase predictsurvivalrates,withtechniquesincludingSVM,KNN, DT, RF, AdaBoost, and XGBoost [8]. The study suggests optimizing feature weights andclassifier parameters with LASSO regularization and Asynchronous Differential Evolution(ADE)toenhancedetectionefficiency[9].Italso createsamodeltoforecastprogression-freesurvival(PFS) at12months,emphasizingtheneedforlargersamplesizes forvalidation[10].
The researchaims to build a predictivemodel for ovarian cancerdetection,evaluatingbiomarkersandalgorithmsto compare with conventional methods and provide clinical
p-ISSN: 2395-0072
insights [11]. It explores dimensionality reduction and classification approaches to improve diagnostic accuracy, usingtechniqueslikeAdaBoost,DecisionTrees,andt-SNE [12].Despitealimitedsamplesizeandpotentialbiases,the study recommends further research using heuristic algorithms[13].
SVM outperforms KNN for diagnosis based on medical histories, with future research recommended to integrate advancedtechniqueslikefuzzificationandtransferlearning [14]. To enhance diagnostic accuracy, the study develops Ocys-Net,adeeplearningnetworkforcategorizingovarian cystsfromultrasoundimages[15].Italsoassessesvarious machine learning methods and compares Multi-Layer Perceptron(MLP)andLongShort-TermMemory(LSTM)for diagnosingovariancancerfromcolposcopyimages[18].
Table 1: Summary of Research Work on Ovarian Cancer Detection
Sl.No. Methodology Advantages Limitations
1 Logistic Regression, Random Forests, DecisionTrees, SVM,KNN, DeepLearning
2 Machine Learningfor CCUadmission prediction
Reducesfalse positives, improved categorization
Possible misclassificatio ns,limited samplesize
Greateraccuracy forpredictingCCU admission
Focusedon specificcancer type(HGSOC)
3
4
Comparisonof HE4,CA125, andROMA markers
Insightsinto clinicalrelevance ofmarkers
Hybrid approachusing CNNand ReliefF algorithmsfor microarray data
5 Raman spectroscopy andmachine learning
6
GA-XGBoost modelsfor mortalityrisk
Decreases mistakes, increases precision
Increases diagnostic accuracy
Reliablemortality riskestimation basedongene
Potential publication bias,ageand cutoffvalue impact
High dimensionality, smallsample sizemayaffect validity
Challengesin data pretreatment andfeature extraction
Smallsample size,limited generalizability
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 07 | July 2024 www.irjet.net p-ISSN: 2395-0072
Sl.No. Methodology Advantages Limitations prediction expression
7 Statistical analysisand ensemble modelsfor biomarker identification
8
SEERdatabase, SVM,KNN,DT, RF,AdaBoost, XGBoostfor survival prediction
9 LASSO regularization, crossvalidation error,ADEfor ovariancancer detection
10 RFE,K-Nearest Neighbors, Random Forest,Logistic Regressionfor PFSprediction
11 ReliefF,MRMR forfeature selection, machine learningfor tumor differentiation
12 Evaluationof dimensionality reductionand classification algorithms
Identifies important biomarkers
Collaborative designwith clinician,data preparation methodslike SMOTE
Limitedsample size,nocontrol group,possible misclassificatio ns
Smallsample size,needfor largerdatasets
Sl.No. Methodology Advantages Limitations
15
16
Ocys-Netfor categorizing ovariancysts from ultrasound images
Comparisonof machine learning methodsfor ovariancancer identification
Betterdiagnosis accuracy,reduces doctors'burden
Limitedto ultrasound images,needs further validation
Assessesclassifier accuracy, improvesearly diagnosis
Focusoncrossvalidationand holdout techniques
Improvesmodel efficiency, generalization
LimitedtoSVM andKNN classifiers
Encouraging outcomesdespite smallsamplesize Needs confirmation onlarger patientgroups
17 SHAP algorithmfor gene identification, Kaplan-Meier plotsfor validation
18 MLPandLSTM forcolposcopy image diagnosis
Identifies potential biomarkers
Needsfurther investigation forpractical application
Evaluates algorithm effectivenessfor stageprediction
Limitedto colposcopy images,needs broader validation
Evaluates biomarkersand algorithmsfor detection
Smallsample size,potential biasesin feature selection
Identifieseffective combinationsfor diagnosis
Complexityin implementing multiple techniques
Thisstudypresentsacomprehensiveevaluationofovarian cancerresearch,focusingonarticlesfrom2021to2023.The surveycoversvariousmachinelearninganddeeplearning approachesforimprovingthedetectionandclassificationof ovariancancer.Bycomparingdifferentbiomarkers,feature selection strategies, and algorithmic techniques, this researchhighlightstheadvancementsandchallengesinthe field. The findings emphasize the potential of machine learningmodelstoenhancediagnosticaccuracyandpatient outcomes. This study serves as an initial step towards developing a more robust and explainable AI model for predicting ovarian cancer, ultimately aiming to improve earlydetectionandtreatmentstrategies.
13 Feature selectionand extractionfor microarray genedata Improves identification accuracy
14 SVMvsKNN forovarian cancer diagnosis
SVMshowntobe superior
Limitedby samplesize, dataquality issues
Futureresearch neededfor integrating advanced techniques
[1]Aditya,Ms,I.Amrita,AshwiniKodipalli,andRoshanJoy Martis. "Ovarian cancer detection and classification using machineleaning."In20215thinternationalconferenceon electrical, electronics, communication, computer technologiesandoptimizationtechniques(ICEECCOT),pp. 279282.Ieee,2021.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 07 | July 2024 www.irjet.net p-ISSN: 2395-0072
[2] Laios, Alexandros, Raissa Vanessa De Oliveira Silva, DanielLucasDantasDeFreitas,YongShengTan,Gwendolyn Saalmink, Albina Zubayraeva, Racheal Johnson et al. "Machinelearningbasedriskpredictionofcriticalcareunit admission for advanced stage high grade serous ovarian cancer patients undergoing cytoreductive surgery: The LeedsNatal score." Journal of Clinical Medicine 11, no. 1 (2021):87.
[3] Suri,Arpita,VanamailPerumal,PrajwalAmmalli,Varsha Suryan, and Sanjiv Kumar Bansal. "Diagnostic measures comparisonforovarianmalignancyriskinEpithelialovarian cancerpatients:Ametaanalysis."Scientificreports11,no.1 (2021):17308.
[4] Kilicarslan, Serhat, Kemal Adem, and Mete Celik. "Diagnosisandclassificationofcancerusinghybridmodel basedonReliefFandconvolutionalneuralnetwork."Medical hypotheses137(2020):109577.
[5] Chen, Fengye, Chen Sun, Zengqi Yue, Yuqing Zhang, WeijieXu,SaharShabbir,LongZouetal."Screeningovarian cancerswithRamanspectroscopyofbloodplasmacoupled withmachinelearningdataprocessing."SpectrochimicaActa Part A: Molecular and Biomolecular Spectroscopy 265 (2022):120355.
[6]Hsiao,YiWen,ChunLiangTao,EricY.Chuang,andTzuPin Lu."Arisk prediction model ofgenesignaturesinovarian cancer through bagging of GAXGBoost models." Journal of advancedresearch30(2021):113122.
[7] Ahamad, Md Martuza, Sakifa Aktar, Md Jamal Uddin, Tasnia Rahman, Salem A. Alyami, Samer AlAshhab, Hanan Fawaz Akhdar, A. K. M. Azad, and Mohammad Ali Moni. "Earlystage detection of ovarian cancer based on clinical data using machine learning approaches." Journal of personalizedmedicine12,no.8(2022):1211.
[8] SorayaieAzar,Amir,SaminBabaeiRikan,AminNaemi, JamshidBagherzadehMohasefi,HabibollahPirnejad,Matin BagherzadehMohasefi,andUffeKockWiil."Applicationof machine learning techniques for predicting survival in ovarian cancer." BMC medical informatics and decision making22,no.1(2022):345.
[9] Juwono, Filbert H., Wei Kitt Wong, Hui Ting Pek, SaaveethyaSivakumar,andDonataD.Acula."Ovariancancer detection using optimized machine learning models with adaptive differential evolution." Biomedical Signal ProcessingandControl77(2022):103785.
[10]Arezzo,Francesca,GennaroCormio,DanieleLaForgia, Carla Mariaflavia Santarsiero, Michele Mongelli, Claudio Lombardi,GerardoCazzato,EttoreCicinelli,andVeraLoizzi. "A machine learning approach applied to gynecological ultrasound to predict progressionfree survival in ovarian
cancerpatients."ArchivesofGynecologyandObstetrics306, no.6(2022):21432154.
[11]Lu,Mingyang,ZhenjiangFan,BinXu,LujunChen,Xiao Zheng, Jundong Li, Taieb Znati, Qi Mi, and Jingting Jiang. "Using machine learning to predict ovarian cancer." International journal of medical informatics 141 (2020): 104195.
[12] Yesilkaya, Bartu, Matjaž Perc, and Yalcin Isler. "Manifold learning methods for the diagnosis of ovarian cancer." Journal of Computational Science 63 (2022): 101775.
[13]Kalaiyarasi,M.,andHarikumarRajaguru."Performance analysis of ovarian cancer detection and classification for microarraygenedata."BioMedResearchInternational2022 (2022).
[14] Taleb, Nasser, Shahid Mehmood, Muhammad Zubair, IftikharNaseer,BeenuMago,andMuhammadUmarNasir. "Ovary cancer diagnosing empowered with machine learning." In 2022 International Conference on Business Analytics for Technology and Security (ICBATS), pp. 16. IEEE,2022.
[15] Fan, Junfang, Juanqin Liu, Qili Chen, Wei Wang, and Yanhui Wu. "Accurate Ovarian Cyst Classification with a LightweightDeepLearningModelforUltrasoundImages." IEEEAccess(2023).
[16]Singh, Samridhi, Malti Kumari Maurya, and Nagendra PratapSingh."STRAMPN:Histopathologicalimagedataset for ovarian cancer detection incorporating AI-based methods." Multimedia Tools and Applications 83, no. 9 (2024):28175-28196.
[17] Schilling,Vincent,PeterBeyerlein,andJeremyChien."A Bioinformatics Analysis of Ovarian Cancer Data Using MachineLearning."Algorithms16,no.7(2023):330.
[18] Ch,Chandana,andN.P.G.Bhavani."InnovativeLong ShortTermMemoryisamoreaccurateandefficientmethod of predicting the stage of ovarian cancer than MultiLayer Perceptron."In20239thInternationalConferenceonSmart StructuresandSystems(ICSSS),pp.15.IEEE,2023.