
New Theory of Discriminant Analysis After R. Fisher

Advanced Research by the Feature-Selection Method for Microarray Data

Shuichi Shinmura
Seikei University
Musashino-shi, Tokyo, Japan

ISBN 978-981-10-2163-3    ISBN 978-981-10-2164-0 (eBook)
DOI 10.1007/978-981-10-2164-0

Library of Congress Control Number: 2016947390

© Springer Science+Business Media Singapore 2016

This work is subject to copyright. All rights are reserved by the Publisher, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation, broadcasting, reproduction on microfilms or in any other physical way, and transmission or information storage and retrieval, electronic adaptation, computer software, or by similar or dissimilar methodology now known or hereafter developed.

The use of general descriptive names, registered names, trademarks, service marks, etc. in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant protective laws and regulations and therefore free for general use.

The publisher, the authors and the editors are safe to assume that the advice and information in this book are believed to be true and accurate at the date of publication. Neither the publisher nor the authors or the editors give a warranty, express or implied, with respect to the material contained herein or for any errors or omissions that may have been made.

Printed on acid-free paper

This Springer imprint is published by Springer Nature
The registered company is Springer Science+Business Media Singapore Pte Ltd.

Preface

This book introduces the new theory of discriminant analysis based on mathematical programming (MP)-based optimal linear discriminant functions (OLDFs) (hereafter, "the Theory") after R. Fisher. Five serious problems of discriminant analysis are described in Sect. 1.1.2, and I develop five OLDFs in Sect. 1.3. An OLDF based on a minimum number of misclassifications (minimum NM, MNM) criterion using integer programming (IP-OLDF) reveals four relevant facts in Sect. 1.3.3. IP-OLDF shows us the relation between NM and LDF clearly, in addition to the monotonic decrease of MNM. IP-OLDF and an OLDF using linear programming (LP-OLDF) are compared with Fisher's LDF and a quadratic discriminant function (QDF) using the Iris data in Chap. 2 and the cephalo-pelvic disproportion (CPD) data in Chap. 3. However, because IP-OLDF may not find the true MNM if the data do not satisfy the general position condition, as revealed by the Student data in Chap. 4 (Problem 1), I develop Revised IP-OLDF, Revised LP-OLDF, and Revised IPLP-OLDF, which is a mixture model of Revised LP-OLDF and Revised IP-OLDF. Only Revised IP-OLDF can find the true MNM, corresponding to an interior point of the optimal convex polyhedron (optimal CP, OCP) defined on the discriminant coefficient space in Sect. 1.3. Because all LDFs except Revised IP-OLDF cannot discriminate cases on the discriminant hyperplane exactly (Problem 1), the NMs of these LDFs may not be correct. IP-OLDF finds that the Swiss banknote data in Chap. 6, which have six variables, are linearly separable data (LSD), and that the two-variable model (X4, X6) is the minimum linearly separable model, by examination of all 63 models made from the six independent variables. Revised IP-OLDF later confirms this result. By the monotonic decrease of MNM, 16 models including (X4, X6) are linearly separable models. This fact is very important for understanding the gene analysis. Only Revised IP-OLDF and a hard-margin support vector machine (H-SVM) can discriminate LSD theoretically (Problem 2). Problem 3 is the defect of the generalized inverse of variance-covariance matrices, which causes trouble for QDF and regularized discriminant analysis (RDA). I solve Problem 3, which is explained by the pass/fail determinations using 18 examination scores in Chap. 5. Although these data are LSD, the error rates of Fisher's LDF and QDF are very high because these datasets do not satisfy Fisher's assumption. These facts tell us a serious problem: we should re-evaluate the discriminant results of Fisher's LDF and QDF. In particular, we should re-evaluate medical diagnoses and various ratings, because these data are of the same type as the test data, with many cases on the discriminant hyperplane. Because Fisher never formulated an equation for the standard error (SE) of the error rates and discriminant coefficients (Problem 4), I develop a 100-fold cross-validation for small sample method (hereafter, "the Method 1"). The Method 1 offers the 95% confidence interval (CI) of the discriminant coefficients and error rate. Moreover, I develop a powerful model selection procedure: the best model, with the minimum mean of error rates in the validation samples (M2). The best models of Revised IP-OLDF are better than those of the other seven LDFs for six datasets, including the Japanese-automobile data in addition to the above five datasets. Therefore, I mistakenly believed that I had established the Theory in 2015. However, when Revised IP-OLDF discriminated six microarray datasets (the datasets) in November 2015, it could naturally select features. Although Revised IP-OLDF can make feature selection naturally for the Swiss banknote data and Japanese-automobile data in Chap. 7, I did not consider this a very important fact because the best model already offers a useful model selection procedure for common data. For more than ten years, many researchers have struggled with the analysis of gene datasets because there are huge numbers of genes and it is difficult to analyze them by common statistical methods (Problem 5). I develop a Matroska feature-selection method (hereafter, "the Method 2") and a LINGO program. The Method 2 reveals that each dataset consists of several disjoint small linearly separable subspaces (small Matroskas, SMs) and another high-dimensional subspace that is not linearly separable. Therefore, we can analyze each SM by ordinary statistical methods. I found Problem 5 in November 2015 and solved it in December 2015.

The book represents my life's work and research, to which I have dedicated over 44 years. After graduating from Kyoto University in 1971, I was employed by SCSK Corp. in Japan as a system integrator. Naoji Tuda, the grandson of Teigo Iba, the second-generation general director of Sumitomo Zaibatsu, was my boss, and he believed that medical engineering (ME) was an important target for the information-processing industries. Through his decision, I became a member of the project for the automatic diagnostic system of electrocardiogram (ECG) data with the Osaka Center for Cancer and Cardiovascular Diseases and NEC. The project leader, Dr. Yutaka Nomura, asked me to develop the medical diagnostic logic for ECG data using Fisher's LDF and QDF. Although I had hoped to become a mathematical researcher when I was a senior student in high school, I failed the entrance examination of the graduate school at Kyoto University because I spent too much time on the activities of the university swimming club. Although I did not become a mathematical researcher, I started research in ME. The research I conducted from 1971 to 1974 using Fisher's LDF and QDF was inferior to the project leader's experimental decision tree logic. Initially, I believed that my statistical ability was poor. However, I soon realized that Fisher's assumption was too strict for medical diagnosis. I proposed the earth model (Shinmura, 1984)¹ for medical diagnosis instead of Fisher's assumption. This experience gave me the motivation to develop the Theory. Shinmura et al. (1973, 1974) proposed a spectrum diagnosis using Bayesian theory, which was the first trial toward the Theory. However, logistic regression was more suitable for the earth model.

¹See the references in Chap. 1.

Shimizu et al. (1975) asked me to analyze photochemical air pollution data by Hayashi quantification theory, and this became my first paper. Dr. Takaichirou Suzuki, leader of the Epidemiology Group, provided me with several themes on many types of cancers (Shinmura et al. 1983).

In 1975, I met Prof. Akihiko Miyake from the Nihon Medical School at the workshop organized by Dr. Shigekoto Kaihara, Professor Emeritus of the Medical School of Tokyo University. Miyake and Shinmura (1976) studied the relationship between the population and sample error rates of Fisher's LDF. Next, Miyake and Shinmura (1979) developed an OLDF based on the MNM criterion by a heuristic approach. Shinmura and Miyake (1979) discriminated the CPD data with collinearities. After we revised the paper two or three times, a statistical journal rejected it. However, Miyake and Shinmura (1980) was accepted by the Japanese Society for Medical and Biological Engineering (JSMBE). The former editors judged that the OLDF based on the MNM criterion overestimated the validation sample, whereas Fisher's LDF did not, because Fisher's LDF was derived from the normal distribution without examination of real data. I was deeply disappointed that many statisticians disliked reviewing real data and started their research from a normal distribution, because it was very comfortable for them to avoid the examination of real data (lotus eating). However, I could not develop a second trial of the Theory because of poor computer power and a defect in the heuristic approach.

Shinmura et al. (1987) analyzed the specific substance mycobacterium (SSM, commonly known as Maruyama vaccine). From 270,000 patients, we categorized 152,289 cancer patients into four postoperative groups. The patients who were administered SSM within one year after surgery were divided into four groups, by three-month intervals from the start of the SSM administration. We assumed that SSM was only water without side effects, and this was the null hypothesis. The survival time of the first group was longer than that of the fourth group from nine months to 12 months after surgery, and the null hypothesis was rejected.

In 1994, Prof. Kazunori Yamaguchi and Michiko Watanabe strongly recommended that I apply for a position at Seikei University. After organizing the 9th Symposium of JSCS in SCSK at Ryogoku, near Ryogoku Kokugikan, in March 1995, I became a professor at the Economics Department in April of the same year. Dr. Tokuhide Doi presented a long-term care insurance system that employed a decision tree method, as advised by me. (Dr. Kaihara planned this system as an advisor to the Ministry of Health and Welfare, and I advised Dr. Doi to use the decision tree.)

In 1997, Prof. Tomoyuki Tarumi advised me to obtain a doctorate degree in science at his graduate school. Without examining the previous research, I developed IP-OLDF and LP-OLDF, which discriminated the Iris data, the CPD data, and 115 random number datasets. IP-OLDF found two relevant facts about the Theory. Therefore, we confirmed that the MNM criterion was essential for discriminant analysis and completed the Theory in 2015. The Theory is as useful for the gene datasets as for ordinary datasets. Readers can download all my research from ResearchMap and the Theory from ResearchGate.
https://www.researchgate.net/profile/Shuichi_Shinmura
http://researchmap.jp/read0049917/?lang=english

Musashino-shi, Japan
Shuichi Shinmura

Acknowledgments

I wish to thank all researchers who contributed to this book: Linus Schrage, Kevin Cunningham, Hitoshi Ichikawa, John Sall, Noriki Inoue, Kyoko Takenaka, Masaichi Okada, Naohiro Masukawa, Aki Ishii, Ian B. Jeffery, Tomoyuki Tarumi, Yutaka Tanaka, Kazunori Yamaguchi, Michiko Watanabe, Yasuo Ohashi, Akihiko Miyake, Shigekoto Kaihara, Akira Ooshima, Takaichirou Suzuki, Tadahiko Shimizu, Tatuo Aonuma, Kunio Tanabe, Hiroshi Yanai, Toji Makino, Jirou Kondou, Hiroshi Takamori, Hidenori Morimura, Atsuhiro Hayashi, Iebun Yun, Hirotaka Nakayama, Mika Satou, Masahiro Mizuta, Souichirou Moridaira, Yutaka Nomura, and Naoji Tuda.

I am grateful to my family, in particular for the legacy of my late father, who supported the research: Otojirou Shinmura, Reiko Shinmura, Makiko Shinmura, Hideki Shinmura, and Kana Shinmura.

I would like to thank Editage (www.editage.jp) for English language editing.

1.3.2 Before and After SVM
1.3.3 IP-OLDF and Four New Facts of Discriminant Analysis
1.5.2 Pass/Fail Determination
2 Iris Data and Fisher's Assumption
2.3.1 Comparison of MNM and Eight NMs
2.3.3 LINGO Program 1: Six MP-Based LDFs
2.4.1 Four Trials to Obtain Validation Sample
2.4.1.1 Generate Training and Validation Samples
2.4.1.2 20,000 Normal Random Sampling
2.4.2 Best Model Comparison
3 Cephalo-Pelvic Disproportion Data with Collinearities
3.3 100-Fold Cross-Validation
3.3.1 Best Model
4.4.1 Comparison of MNM and Nine NMs
5.3.1 MNM and Nine NMs
5.3.2 Error Rate Means (M1 and M2)
5.4 Pass/Fail Determination by Examination Scores (90% Level in 2012)
5.4.1 MNM and Nine NMs
5.4.2 Error Rate Means (M1 and M2)
5.4.3 95% CI of Discriminant Coefficient
5.5 Pass/Fail Determination by Examination Scores (10% Level in 2012)
5.5.1 MNM and Nine NMs
5.5.2 Error Rate Means (M1 and M2)
5.5.3 95% CI of Discriminant Coefficient
6 Best Model for Swiss Banknote Data
6.1 Introduction
6.2 Swiss Banknote Data
6.2.1 Data Outlook
6.2.2 Comparison of Seven LDFs for Original Data
6.3 100-Fold Cross-Validation for Small Sample Method
6.3.1 Best Model Comparison
6.3.2 95% CI of Discriminant Coefficient
6.3.2.1 Consideration of 27 Models
6.3.2.2 Revised IP-OLDF
6.3.2.3 Hard-Margin SVM (H-SVM) and Other LDFs
6.4 Explanation 1 for Swiss Banknote Data
6.4.1 Matroska in Linearly Separable Data
6.4.2 Explanation 1 of Method 2 by Swiss Banknote Data
6.5 Summary
7.2.2 Comparison of Nine Discriminant Functions for Non-LSD
7.2.3 Consideration of Statistical Analysis
7.3 100-Fold Cross-Validation (Method 1)
7.3.1 Comparison of Best Model
7.3.2 95% CI of Coefficients by Six MP-Based LDFs
7.3.2.1 Revised IP-OLDF Versus H-SVM
7.3.2.2 Revised IPLP-OLDF, Revised LP-OLDF, and Other LDFs
7.3.3 95% CI of Coefficients by Fisher's LDF and Logistic Regression
7.4 Matroska Feature-Selection Method (Method 2)
7.4.1 Feature-Selection by Revised IP-OLDF
7.4.2 Coefficient of H-SVM and SVM4
8 Matroska Feature-Selection Method for Microarray Dataset (Method 2)
8.1 Introduction
8.2 Matroska Feature-Selection Method (Method 2)
8.2.1 Short Story to Establish Method 2
8.2.2 Explanation of Method 2 by Alon et al. Dataset
8.2.2.1 Feature-Selection by Eight LDFs
8.2.2.2 Results of Alon et al. Dataset Using the LINGO Program
8.2.3 Summary of Six Microarray Datasets in 2016
8.2.4 Summary of Six Datasets in 2015
8.3 Results of the Golub et al. Dataset
8.3.1 Outlook of Method 2 by the LINGO Program 3
8.3.2 First Trial to Find the Basic Gene Sets
8.3.3 Another BGS in the Fifth SM
8.4 How to Analyze the First BGS
8.5 Statistical Analysis of SM1
8.5.1 One-Way ANOVA
9 LINGO Program 2 of Method 1
9.1 Introduction
9.2 Natural (Mathematical) Notation by LINGO
9.3 Iris Data in Excel
9.4 Six LDFs by LINGO
9.5 Discrimination of Iris Data by LINGO
9.6 How to Generate Resampling Samples and Prepare Data in Excel File
9.7 Set Model by LINGO

Symbols

Statistical Discriminant Functions by JMP

JMP: Statistical software supported by the JMP division of SAS Institute, Japan
JMP script: A JMP script solves Fisher's LDF and logistic regression by Method 1
LDF: Linear discriminant functions such as Fisher's LDF, logistic regression, two OLDFs, three revised OLDFs, and three SVMs
Fisher's LDF: Fisher's linear discriminant function under Fisher's assumption
Logist: Logistic regression; in the tables, "Logist" is often used
QDF*: Quadratic discriminant function
RDA*: Regularized discriminant analysis
*QDF and RDA discriminate ordinary data in this book

Mathematical Programming (MP) by LINGO and What's Best!

What's Best!: Excel add-in solver
LINGO: MP solver that can solve LP, IP, QP, NLP, and stochastic programming
LINGO Program 1: LINGO program that solves the original data by six MP-based LDFs, explained in Sect. 2.3.3
LINGO Program 2: LINGO program that solves six MP-based LDFs by Method 1, explained in Chap. 9
LINGO Program 3: LINGO program that solves six MP-based LDFs by Method 2
LP: Linear programming; develops Revised LP-OLDF
IP: Integer programming; develops Revised IP-OLDF
QP: Quadratic programming; develops three SVMs
NLP: Nonlinear programming; defines Fisher's LDF

MP-based LDFs

SVM: Support vector machine
H-SVM: Hard-margin SVM
S-SVM: Soft-margin SVM
SVM4**: S-SVM for penalty c = 10,000
SVM1**: S-SVM for penalty c = 1
**Because there is no rule to decide a proper "c", we compare the results of SVM4 and SVM1
OLDF: Optimal LDF
LSD: Linearly separable data, the MNM of which is zero; LSD include several linearly separable models (or subspaces)
Matroska: In gene analysis, we call all linearly separable spaces and subspaces Matroskas
Big Matroska: The microarray dataset is LSD and includes smaller Matroskas in it by the monotonic decrease of MNM
SM: Small Matroska found by LINGO Program 3, not explained in this book
BGS: Basic gene set or subspace, the smallest Matroska in each SM
NM: Number of misclassifications
MNM: Minimum NM
CP: Convex polyhedron; the interior point of a CP has a unique NM and discriminates the same cases; defined by IP-OLDF, not Revised IP-OLDF
OCP: Optimal CP; the interior point of an OCP has a unique MNM
IP-OLDF: OLDF based on the MNM criterion using IP; if the data are not in general position, IP-OLDF may not find the true MNM
Revised IP-OLDF: Finds an interior point of the OCP and solves Problem 1
LP-OLDF: OLDF using LP; one of the L1-norm LDFs
Revised LP-OLDF: One of the L1-norm LDFs; although it is faster than other MP-based LDFs, it is weak against Problem 1
Revised IPLP-OLDF: A mixture model with Revised LP-OLDF in the first step and Revised IP-OLDF in the second step

DATA

Data: n cases by p independent variables
xi: i-th independent variable vector of dimension p (for i = 1, …, n)
yi: Object variable; yi = 1 for class 1 and yi = -1 for class 2
Hi(b): Hi(b) = yi × (txi × b + 1) is a linear hyperplane (for i = 1, …, n); the n hyperplanes divide the p-dimensional coefficient space into a finite number of CPs (two half-planes such as Hi(b) < 0 and Hi(b) > 0)
Hi(b) < 0: Minus half-plane of Hi(b). If Hi(bk) < 0, then Hi(bk) = yi × (txi × bk + 1) = yi × (tbk × xi + 1) < 0 and case xi is misclassified. If an interior point bk is located in h minus half-planes, NM = h; this LDF misclassifies the same h cases

Ordinary or Common Data

Iris data: Fisher evaluated Fisher's LDF with these data
CPD data: Cephalo-pelvic disproportion data with collinearities
Student data: Pass/fail determination using student attributes
LSD: Linearly separable data that include linearly separable models; in gene analysis, we call LSD and the linearly separable models Matroskas
Swiss banknote data: IP-OLDF finds these data are LSD; we explain Problems 2 and 5
Test data: Pass/fail determination using examination scores; these datasets are LSD, and we explain a trivial LDF
Japanese-automobile data: LSD; we explain Problems 3 and 5
The datasets: Six microarray datasets

Theory and Method

Theory: New theory of discriminant analysis after R. Fisher
Method 1: 100-fold cross-validation for small sample method
Method 2: Matroska feature-selection method for microarray datasets
M1: The mean of error rates in the training samples
M2: The mean of error rates in the validation samples
Best model: The model whose M2 is minimum among all possible models of each LDF
LOO procedure: A leave-one-out model selection procedure
The best model: The model with minimum M2, used by Method 1 instead of LOO
Diff1: The difference defined as (NM of nine discriminant functions - MNM)
Diff: The difference defined as (M2 - M1)
M1Diff: The difference defined as (M1 of nine discriminant functions - M1 of Revised IP-OLDF)
M2Diff: The difference defined as (M2 of nine discriminant functions - M2 of Revised IP-OLDF)

Five Problems of Discriminant Analysis

Problem 1: All LDFs, with the exception of Revised IP-OLDF, cannot discriminate the cases on the discriminant hyperplane; the NMs of these LDFs may not be correct
Problem 2: All LDFs, with the exception of H-SVM and Revised IP-OLDF, cannot recognize LSD theoretically. Although Revised LP-OLDF and Revised IPLP-OLDF can often discriminate LSD, we never discuss them in Chap. 8 for this reason
Problem 3: The defect of the generalized inverse matrix technique; QDF misclassifies all cases to the other class for a particular case. Adding small random noise to the constant values solves Problem 3
Problem 4: Fisher never formulated an equation for the standard error of the error rate and discriminant coefficients. Method 1 offers the 95% confidence interval (CI) of the error rate and coefficients
Problem 5: For more than ten years, many researchers have struggled to analyze the microarray dataset that is LSD. Only Revised IP-OLDF can make feature selection naturally. I develop the Matroska feature-selection method (Method 2), which finds a surprising structure of the microarray dataset: the disjoint union of several small linearly separable subspaces (small Matroskas, SMs). Now we can analyze each SM very quickly. The student linearly separable, Swiss banknote, and Japanese-automobile data show the natural feature selection of Revised IP-OLDF. Therefore, I recommend that researchers of feature-selection methods, such as LASSO, evaluate and compare their theories with these datasets in Chaps. 4 and 6–8. I omit the results of the pass/fail determination using examination scores, which consist of only four variables

Chapter 1

New Theory of Discriminant Analysis

1.1 Introduction

1.1.1 Theory Theme

This book introduces a new theory of discriminant analysis (hereafter, "the Theory") after R. Fisher. This chapter explains how to solve the five serious problems of discriminant analysis. To the best of my knowledge, this is the first book that compares eight linear discriminant functions (LDFs) using several different types of data. These eight LDFs are as follows: Fisher's LDF (Fisher 1936, 1956), logistic regression (Cox 1958), a hard-margin SVM (H-SVM) (Vapnik 1995), two soft-margin SVMs (S-SVMs), namely SVM4 (penalty c = 10,000) and SVM1 (penalty c = 1), and three optimal LDFs (OLDFs). At first, I develop an OLDF based on a minimum number of misclassifications (minimum NM (MNM)) criterion using integer programming (IP-OLDF) and an OLDF using linear programming (LP-OLDF) (Shinmura 2000b, 2003, 2004, 2005, 2007). However, because I find a defect in IP-OLDF, I develop three revised OLDFs: Revised IP-OLDF (Shinmura 2010a, 2011a), Revised LP-OLDF, and Revised IPLP-OLDF (Shinmura 2010b, 2014b). The Iris data in Chap. 2 are critical test data because Fisher evaluated Fisher's LDF with these data (Anderson 1945). The cephalo-pelvic disproportion (CPD) data (Miyake and Shinmura 1980) in Chap. 3 are medical data with three collinearities. Although the Student data in Chap. 4 are a small data sample (Shinmura 2010a), they let us understand Problem 1 because the data are not in general position. The 18 pass/fail determinations using examination scores in Chap. 5 are linearly separable data (LSD). None of the LDFs, with the exception of H-SVM and Revised IP-OLDF, can discriminate LSD theoretically. I demonstrate that the 18 error rates of Fisher's LDF and the quadratic discriminant function (QDF) are very high (Shinmura 2011b); nevertheless, these data are LSD. Moreover, seven LDFs, with the exception of Fisher's LDF, become trivial LDFs (Shinmura 2015b). The Swiss banknote data (Flury and Riedwyl 1988) in Chap. 6 and


the Japanese-automobile data (Shinmura 2016c) in Chap. 7 are also LSD. Although I develop a Matroska feature-selection method for microarray datasets (Method 2), it is difficult to understand the meaning of Method 2 if we do not know LSD discrimination very well. I call LSD a big Matroska. Just as a big Matroska includes several small Matroskas, the microarray dataset (the datasets) includes several linearly separable subspaces (small Matroskas (SMs)) within it (the largest Matroska). Therefore, I explain this idea using common data in Chaps. 6 and 7. When I discriminate the datasets, only Revised IP-OLDF can select features naturally, and it finds the surprising structure of the datasets (Shinmura 2015e–s, 2016b). Moreover, I develop a 100-fold cross-validation for small sample method (Method 1) (Shinmura 2010a, 2013, 2014c) instead of the leave-one-out (LOO) procedure (Lachenbruch and Mickey 1968). We can obtain two error rate means, M1 and M2, from the training and validation samples, respectively, and propose a simple model selection procedure that selects the best model with minimum M2. The best model of Revised IP-OLDF is better by M2 than those of the seven other LDFs for the above data, except for the Iris data.
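Because the MNM criterion drives everything that follows, a minimal sketch may help fix the idea. The integer program below minimizes the number of misclassifications with the standard big-M formulation, in the spirit of Revised IP-OLDF; it uses the open-source PuLP modeler in Python rather than the book's LINGO programs, and the big-M bound and toy data are assumptions made for illustration only.

```python
# A minimal big-M integer program in the spirit of Revised IP-OLDF:
# minimize the number of misclassifications (MNM) over coefficients (b, b0).
# Sketch only: PuLP stands in for LINGO; big_M and the data are assumed.
import pulp

X = [[2.0, 3.1], [1.5, 2.2], [3.0, 1.0], [2.2, 0.5]]  # n cases, p variables
y = [1, 1, -1, -1]                                     # class labels +1 / -1
n, p = len(X), len(X[0])
big_M = 1000.0                                         # assumed bound

prob = pulp.LpProblem("MNM", pulp.LpMinimize)
b = [pulp.LpVariable(f"b{j}", -big_M, big_M) for j in range(p)]
b0 = pulp.LpVariable("b0", -big_M, big_M)
e = [pulp.LpVariable(f"e{i}", cat="Binary") for i in range(n)]  # 1 = misclassified

prob += pulp.lpSum(e)  # objective: the number of misclassifications
for i in range(n):
    score = pulp.lpSum(X[i][j] * b[j] for j in range(p)) + b0
    # If e[i] = 0 the case must satisfy y_i * f(x_i) >= 1 (correctly classified);
    # if e[i] = 1 the constraint is relaxed by big_M.
    prob += y[i] * score >= 1 - big_M * e[i]

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("MNM =", int(pulp.value(prob.objective)))
```

An optimal objective value of zero signals LSD, which is how the linear-separability statements throughout this book should be read.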

We cannot discriminate cases on the discriminant hyperplane (Problem 1). Only Revised IP-OLDF can solve Problem 1. Moreover, only H-SVM and Revised IP-OLDF can discriminate LSD theoretically (Problem 2). Problem 3 is the defect of the generalized inverse matrix technique, by which QDF misclassifies all cases to another class for a particular case. I solve Problem 3. Fisher never formulated an equation for the standard errors (SEs) of the error rate and discriminant coefficients (Problem 4). The Method 1 offers the 95% confidence interval (CI) of the error rate and coefficients. For more than ten years, many researchers have struggled to analyze the dataset that is LSD (Problem 5). Only Revised IP-OLDF can make feature selection naturally. The Method 2 finds the surprising structure of the dataset: the disjoint union of several small gene subspaces (SMs) that are linearly separable models. If we can repair the specific genes found by Method 2, we might overcome cancer. Now, we can analyze each SM very quickly. We call the linearly separable model in gene analysis a "Matroska." If the datasets are LSD, the full model is the largest Matroska, which contains all smaller Matroskas in it. We already know that the smallest Matroska (the basic gene set or subspace (BGS)) can describe the Matroska structure completely by the monotonic decrease of MNM. On the other hand, LASSO (Buhlmann and Geer 2011; Simon et al. 2013) attempts to make a feature selection similar to Method 2. This book offers useful datasets and results for LASSO researchers from the following perspective:

1. Can an LDF obtained by LASSO discriminate three different types of LSD, such as the Swiss banknote data, the Japanese-automobile data, and the six microarray datasets, exactly?

2. Can an LDF obtained by LASSO find the Matroska structure correctly and list all BGSs?

If LASSO cannot find the SMs or BGSs in the dataset, it cannot explain the data structure.
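Readers who want to attempt question 1 can start from a check such as the following sketch: fit an L1-penalized ("LASSO-like") linear classifier with scikit-learn, keep the features with nonzero coefficients, and count the misclassifications. This is my illustration, not the book's procedure; the regularization strength C is an arbitrary assumption.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def lasso_check(X, y, C=1.0):
    """Fit an L1-penalized linear classifier, report the selected features
    and the number of training misclassifications (NM) it achieves."""
    model = LogisticRegression(penalty="l1", solver="liblinear", C=C)
    model.fit(X, y)
    selected = np.flatnonzero(model.coef_[0])   # candidate SM / BGS features
    nm = int(np.sum(model.predict(X) != y))     # NM = 0 suggests an LSD model
    return selected, nm
```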

1.1.2 Five Problems

The Theory discusses only binary or two-class (class 1 or class 2) discrimination by eight LDFs: Revised IP-OLDF, Revised LP-OLDF, Revised IPLP-OLDF, H-SVM, SVM4, SVM1, Fisher's LDF, and logistic regression. The values of class 1 and class 2 are 1 and -1, respectively. We consider these values the object variable of discriminant analysis and regression analysis. Let f(x) be an LDF and f(xi) be the discriminant score for xi. Although there are many difficult statistics in discriminant analysis, we should focus on the discriminant rule, which is quite direct: if yi × f(xi) > 0, xi is classified into class 1/class 2 correctly. If yi × f(xi) < 0, xi is misclassified. If yi × f(xi) = 0, we cannot discriminate xi correctly. This understanding is most important for discriminant analysis. There are five serious problems hidden in this simplistic scenario (Shinmura 2014a, 2015c, d).
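The rule is easy to state in code. The following sketch (an illustration, not taken from the book) applies the three-way rule and keeps the f(xi) = 0 cases separate instead of silently assigning them to a class; these cases are exactly the subject of Problem 1 below.

```python
import numpy as np

def discriminant_rule(f_scores, y):
    """Three-way discriminant rule: y_i * f(x_i) > 0 is correct, < 0 is
    misclassified, and = 0 is undecidable (the cases behind Problem 1)."""
    s = np.asarray(y) * np.asarray(f_scores)
    correct = int(np.sum(s > 0))
    nm = int(np.sum(s < 0))   # number of misclassifications (NM)
    h = int(np.sum(s == 0))   # cases on the discriminant hyperplane
    return correct, nm, h
```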

1.1.2.1 Problem 1

We cannot adequately discriminate cases where xi lies on the discriminant hyperplane (f(xi) = 0). The Student data in Chap. 4 show this fact clearly. Thus far, this has been an unresolved problem. However, most researchers classify these cases into class 1 without a logical reason. They misunderstand the discriminant rule as follows: if f(xi) ≥ 0, xi is classified into class 1 correctly; if f(xi) < 0, xi is classified into class 2 properly. There are two mistakes in their rule. The first mistake is to classify the cases on the discriminant hyperplane into class 1 without a logical explanation. The second mistake is that we cannot determine a priori that cases with a positive discriminant score are classified into class 1 and those with a negative value into class 2, because the data determine this, not the researchers. Other statisticians propose deciding Problem 1 randomly (i.e., akin to throwing dice) because statistics is the study of probabilities. If users knew of this claim, they might be surprised and disappointed in discriminant analysis. In particular, medical doctors might be upset, because they do not gamble with medical diagnoses, given that they seriously attempt to discriminate the cases near the discriminant hyperplane. Most statistical researchers are unaware of this fact of medical diagnosis. If we consider a pass/fail determination using the scores of four tests where the passing mark is 50 points, we obtain the trivial LDF f = T1 + T2 + T3 + T4 - 50. If f ≥ 0, a given student has passed the examination. On the other hand, if f < 0, the student has failed the examination. Because we can describe the discriminant rule clearly by the (independent) variables, we can correctly include a student on the discriminant hyperplane in the passing class. We have ignored this unresolved problem until now. The proposed Revised IP-OLDF based on MNM can treat Problem 1 appropriately (Shinmura 2010a). Indeed, with the exception of Revised IP-OLDF, no LDFs can correctly count the number of misclassifications (NMs). Therefore, we must count the number of cases where f(xi) = 0 and display this number "h" alongside the NM of all LDFs in the output. We must estimate a true NM that might increase to (NM + h). After I showed many examples of Problem 1, some statisticians claimed, without a theoretical reason, that the probability of cases lying on the discriminant hyperplane is zero. They erroneously believe that we discriminate data on a continuous space.
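The pass/fail example makes Problem 1 concrete. In the sketch below (the scores are invented), a student with exactly 50 total points sits on the discriminant hyperplane, and the trivial LDF justifies including that student in the passing class; without such a variable-based rule, a true NM would have to be reported as NM + h.

```python
# Trivial LDF for pass/fail with a 50-point passing mark: f = T1+T2+T3+T4 - 50.
# The scores below are invented for illustration.
students = [(20, 15, 10, 5),   # f = 0: exactly on the hyperplane -> pass
            (30, 10, 5, 2),    # f = -3: fail
            (25, 20, 10, 8)]   # f = 13: pass
for t in students:
    f = sum(t) - 50
    print(t, "pass" if f >= 0 else "fail", f"(f = {f})")
```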

1.1.2.2 Problem 2

Only H-SVM and Revised IP-OLDF can recognize LSD theoretically.¹ Other LDFs might not discriminate LSD exactly. When IP-OLDF discriminates the Swiss banknote data in Chap. 6, I find that these data are LSD. In addition, the Japanese-automobile data in Chap. 7 are LSD. Through both datasets, I explain the Matroska feature-selection method (Method 2) in Chap. 8. We can obtain examination scores easily, and these datasets are also LSDs. Moreover, there is a trivial LSD. However, several LDFs cannot determine pass/fail using examination scores correctly (Shinmura 2015b). In particular, the error rates of Fisher's LDF and QDF are very high. Table 1.4 lists all the 18 error rates of Fisher's LDF and QDF that are not zero in the pass/fail determinations from 2010 to 2012. This fact suggests that we should review the discriminant analyses of past important research, because the error rates may decrease. In medical diagnosis, researchers gave up research whose error rates were over ten percent. However, Revised IP-OLDF may tell them that the error rates are zero. Moreover, discriminant functions that cannot discriminate LSD correctly are not helpful for gene analysis.
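Whether data are LSD can be tested cheaply before choosing a discriminant function: linear separability holds exactly when some coefficient vector (b, b0) satisfies yi × (xi·b + b0) ≥ 1 for all cases. The sketch below encodes this feasibility check as a linear program with SciPy; it is an illustration, not one of the book's LINGO programs.

```python
import numpy as np
from scipy.optimize import linprog

def is_lsd(X, y):
    """Return True if the two classes are linearly separable (MNM = 0).
    Feasibility LP: find (b, b0) with y_i * (x_i . b + b0) >= 1 for all i."""
    X, y = np.asarray(X, float), np.asarray(y, float)
    n, p = X.shape
    # Variables z = (b_1..b_p, b0); constraints -y_i*(x_i.b + b0) <= -1.
    A_ub = -y[:, None] * np.hstack([X, np.ones((n, 1))])
    b_ub = -np.ones(n)
    res = linprog(c=np.zeros(p + 1), A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (p + 1), method="highs")
    return res.status == 0  # 0 = feasible (optimal found); 2 = infeasible
```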

1.1.2.3 Problem 3

If the variance–covariance matrix is singular, Fisher's LDF and QDF cannot be calculated because the inverse matrices do not exist. Because JMP (Sall et al. 2004) adopted the generalized inverse matrix technique, I had believed that Fisher's LDF and QDF could be calculated by the generalized inverse matrix without problems. When I discriminated the math examination scores among 56 examination datasets from the National Center for University Entrance Examinations (NCUEE), QDF and a regularized discriminant analysis (RDA) (Friedman 1989) misclassified all students in the passing class as the failing class. If we exchange class 1 and class 2, QDF and RDA misclassify all students in the failing class as the passing class, as decided by the JMP specification. When QDF caused serious problems with problematic data, JMP switched QDF to RDA automatically. After three years of surveys, I found that RDA and QDF do not work correctly for a particular case in which the values of a variable belonging to one class are constant, because all the students in the passing class answered the particular question correctly. If users can select appropriate options for a modified RDA developed for this particular case, RDA works better than the QDF listed in Table 1.5, which is explained by the results of the Japanese-automobile data. However, JMP does not currently offer a modified QDF. Therefore, I judged that this was the defect of the generalized inverse matrix. If we add slight random noise to the constant value, QDF can discriminate the data exactly. Because it is basic statistical knowledge that data always vary, and because I trusted the quality of JMP, I needed three years to find the reason. Problem 3 has provided a warning for our statistical understanding: data always change.

¹Empirically, Revised LP-OLDF can discriminate LSD correctly. However, it is very weak against Problem 1. Logistic regression and SVM4 discriminate LSD correctly in many examinations. Fisher's LDF, QDF, and SVM1 are severe for LSD discrimination. I recommend that researchers review their old research that used these three discriminant functions.
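The repair described above is easy to reproduce. In the following sketch (with an invented dataset), one variable is constant within class 1, which makes that class's variance–covariance matrix singular; adding slight random noise to such columns lets QDF proceed. scikit-learn's QuadraticDiscriminantAnalysis stands in for JMP's QDF, and the noise scale is an assumption.

```python
import numpy as np
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = np.repeat([1, -1], 20)
X[y == 1, 0] = 1.0   # one variable is constant within class 1 (the Problem 3 case)

def jitter_constant_columns(X, y, scale=1e-6):
    """Add slight random noise to any column that is constant within a class,
    so that the class variance-covariance matrix is no longer singular."""
    X = X.copy()
    for cls in np.unique(y):
        rows = (y == cls)
        for j in range(X.shape[1]):
            if X[rows, j].std() == 0.0:
                X[rows, j] = X[rows, j] + rng.normal(scale=scale, size=rows.sum())
    return X

Xj = jitter_constant_columns(X, y)
qdf = QuadraticDiscriminantAnalysis().fit(Xj, y)
print("training NM:", int(np.sum(qdf.predict(Xj) != y)))
```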

1.1.2.4 Problem 4

Some statisticians erroneously believe that discriminant analysis is an inferential statistical method similar to regression analysis. However, Fisher never formulated an equation of SEs for discriminant coefficients or error rates. Nonetheless, if we use the indicator yi of the mathematical programming-based linear discriminant functions (MP-based LDFs) in Eq. (1.7) as the object variable and analyze the data by regression analysis, the obtained regression coefficients are proportional to the coefficients of Fisher's LDF by the plug-in rule 1. Therefore, we can use model selection procedures, such as stepwise procedures and all possible combination models (Goodnight 1978), with statistics such as AIC, BIC, and Cp of regression analysis. In this book, I propose Method 1 and a new model selection procedure, the best model. I set k = 100 and select the model with minimum M2 as the best model; this is a very direct and powerful model selection procedure compared with LOO. First, we select the best model for each LDF. Next, we select the model with minimum M2 among the six MP-based LDFs as the final best model. We claim that the final best model has generalization ability. Moreover, we obtain the 95% CI of the discriminant coefficients. Although I could demonstrate in 2010 that the best model was useful (Shinmura 2010a), I could not explain the useful meaning of the 95% CI of the discriminant coefficients before 2014. However, if we divide all coefficients by the LDF intercept and set the intercept to one, the six MP-based LDFs and logistic regression become trivial LDFs, and only Fisher's LDF is far from trivial (Shinmura 2015b). Moreover, I can explain the useful meaning of the 95% CI for the Swiss banknote and Japanese-automobile data (Shinmura 2016a, c) more precisely.
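The plug-in relation between regression and Fisher's LDF can be verified numerically. In this sketch (random data, my illustration), regressing the +1/-1 indicator on X by least squares gives slope coefficients roughly proportional to the plug-in direction Σ⁻¹(m1 − m2) of Fisher's LDF.

```python
import numpy as np

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(1, 1, (50, 3))])
y = np.repeat([1.0, -1.0], 50)

# OLS of the +1/-1 indicator on X (with intercept).
A = np.hstack([X, np.ones((100, 1))])
beta = np.linalg.lstsq(A, y, rcond=None)[0][:3]

# Fisher's plug-in direction: pooled within-class covariance inverse
# times the difference of the two class means.
m1, m2 = X[y == 1].mean(axis=0), X[y == -1].mean(axis=0)
S = (np.cov(X[y == 1].T) + np.cov(X[y == -1].T)) / 2
fisher = np.linalg.solve(S, m1 - m2)

print(beta / fisher)  # roughly a constant vector: the two are proportional
```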

1.1.2.5 Problem 5

For more than ten years, many researchers have struggled to analyze the datasets (Problem 5). However, to the best of my knowledge, there has been no research on LSD discrimination thus far. I examine five different types of LSD: the Swiss banknote data, the pass/fail determinations of 18 examination datasets, the Japanese-automobile data, the student linearly separable data, and six microarray datasets. When I discriminate the datasets, most of the coefficients of Revised IP-OLDF become zero. Only Revised IP-OLDF can select features naturally, and it finds the surprising structure of the datasets. The datasets are Alon et al. (1999), Chiaretti et al. (2004), Golub et al. (1999), Shipp et al. (2002), Singh et al. (2002), and Tian et al. (2003). Jeffery et al. (2006) analyzed these datasets and uploaded them on their HP.² Ishii et al. (2014) analyzed these datasets by principal component analysis (PCA). I find the Matroska structure in the datasets, with an MNM of zero. The Method 2 can reduce the high-dimensional gene space to several small Matroskas (SMs) (Shinmura 2015e–s, 2016a). We can analyze these SMs by ordinary statistical methods such as the t-test, one-way ANOVA, cluster analysis, and PCA. Because there has been no research on LSD discrimination thus far (to the best of our knowledge), many researchers have struggled and have not obtained good results. I explain Method 2 with the results of the Swiss banknote data in Chap. 6 and the Japanese-automobile data in Chap. 7, because Revised IP-OLDF can select variables naturally for ordinary data.
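Method 2 itself is given as a LINGO program in Chap. 8, but its outer loop can be sketched abstractly: while the remaining gene space is still linearly separable, find one small linearly separable subspace (SM), record it, and delete its genes. The sketch below shows only this loop structure; the helpers find_separable_subset and is_lsd are hypothetical placeholders for the Revised IP-OLDF steps, not Shinmura's implementation.

```python
def matroska_decomposition(X, y, find_separable_subset, is_lsd):
    """Outer loop of a Matroska-style feature selection (structural sketch).
    X is an n x p numpy array; find_separable_subset and is_lsd are assumed
    helpers standing in for the Revised IP-OLDF steps of Method 2."""
    remaining = list(range(X.shape[1]))   # gene indices still unexplored
    sms = []                              # small Matroskas (SMs) found so far
    while remaining and is_lsd(X[:, remaining], y):
        sm = find_separable_subset(X[:, remaining], y)  # indices into `remaining`
        sms.append([remaining[j] for j in sm])
        remaining = [g for k, g in enumerate(remaining) if k not in set(sm)]
    return sms  # each SM can then be analyzed by ordinary statistical methods
```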

1.1.2.6 Summary

Revised IP-OLDF solves Problems 1, 2, and 5. Problem 3 is the defect of the generalized inverse matrix technique, and QDF now causes Problem 3. If we add slight random noise to the constant value, we can solve Problem 3 easily. I propose Method 1 and compare two statistical LDFs by a JMP script and six MP-based LDFs by the LINGO Program 2 (Schrage 2006), using six different types of data. Through many results, I can confirm that Method 1 solves Problem 4 using a computer-intensive approach. Problem 5 is the complex analysis of microarray datasets. Only Revised IP-OLDF can make feature selection of the datasets naturally and find that the datasets consist of several disjoint unions of SMs. We can analyze each SM in the dataset easily because each SM is a small gene subspace. It is quite strange that the three SVMs cannot select features naturally.

1.2 Motivation for Our Research

1.2.1 Contribution by Fisher

Fisher described Fisher's LDF using variance–covariance matrices and founded the statistical discriminant theory. He assumed that the two classes (or groups) have the same variance–covariance matrices and that the two means are different (Fisher's assumption). However, because Fisher's assumption is too strict for actual data, QDF was defined for two classes having different variance–covariance matrices. This fact indicates that statisticians are aware that there exist data that do not satisfy Fisher's assumption. Moreover, multiclass discrimination that uses the Mahalanobis distance has been proposed. In quality control, Taguchi and Jugulum (2002) considered that one class (the normal state) has a variance–covariance matrix and that the other class (the uncontrolled state) consists of only one case. They discriminated data through multiclass discrimination and claimed that a typical uncontrolled case is far from the normal state, with a large Mahalanobis distance. Their claim is similar to the "earth model" in medical diagnosis (Shinmura 1984). Because statistical software packages easily implement these discriminant functions based on variance–covariance matrices, we apply discriminant analysis to many applications in science, technology, and industry, such as medical diagnosis, pattern recognition, and various ratings. However, real data rarely satisfy Fisher's assumption. Therefore, it is well known that logistic regression is better than Fisher's LDF and QDF because it does not assume a particular theoretical distribution, such as the normal distribution. It is very strange and unfortunate that there is no discussion on this matter by researchers and users of logistic regression.

² http://www.bioinf.ucd.ie/people/ian/

1.2.2 Defect of Fisher's Assumption for Medical Diagnosis

After graduating from Kyoto University in 1971, I became a member of the project that developed the automatic diagnostic system for electrocardiogram (ECG) data, from 1971 to 1974. The project leader, who was a medical doctor, asked me to discriminate over ten³ abnormal symptoms from the normal symptom using Fisher's LDF and QDF. Our four years of research were inferior to the medical doctor's experimental decision tree logic. First, I believed that my results using Fisher's LDF and QDF were inferior to the decision tree logic results because my knowledge and experience were poor. Later, I realized that Fisher's assumption was not adequate for medical diagnosis. I summarize the two reasons for my failure below. On the other hand, there is no actual test of Fisher's assumption. I demonstrate that the NM of Fisher's LDF is close to MNM for the Iris data. We can use this trend instead of test statistics for Fisher's hypothesis.

First Reason: In medical diagnosis, typical cases of abnormal symptoms are far from the discriminant hyperplane. I explained medical diagnosis by the "earth model," where the normal symptom is the land, the abnormal symptoms are the mountains, and the discriminant hyperplanes are the horizon. The Mahalanobis–Taguchi strategy is similar to the earth model. This claim violates Fisher's assumption. In the statistical concept, we understand the typical cases of both classes to be the two means of two normal distributions. Therefore, I believed that the discriminant functions based on the variance–covariance matrices are not adequate for medical diagnosis, and I developed a spectrum diagnostic method (Shinmura et al. 1973, 1974). I knew that logistic regression is remarkably successful in medical diagnosis and understood that it is superior to the spectrum diagnostic method. Currently, Japanese medical researchers discriminate data by logistic regression instead of Fisher's LDF and QDF. I regret that, as researchers and users of logistic regression, they did not discuss my claim.

³I cannot recollect the exact number of abnormal symptoms.

Second Reason: There are many cases close to the discriminant hyperplane. I concluded that Fisher's LDF and QDF are fragile for the discrimination of particular data, such as pass/fail determinations using examination scores (Shinmura 2011b) and the rating of bonds, stocks, and estates, in addition to medical data. These data also have the characteristic feature of having many cases close to the discriminant hyperplane. None of the LDFs, with the exception of Revised IP-OLDF, can discriminate the cases on the discriminant hyperplane correctly (Problem 1). Recently, because I could not access medical data for our research, I used pass/fail determinations with examination scores instead of medical data.

1.2.3 Research Outlook

After 1975, I discriminated many datasets using Fisher's LDF, QDF, logistic regression, multiclass discrimination using the Mahalanobis distance, decision tree logic (or partitioning), and the quantification theory developed by Dr. Hayashi (Shimizu et al. 1975; Nomura and Shinmura 1978; Shinmura et al. 1983). Through these studies, I found Problems 1 and 4 (Shinmura 2014a, 2015c, d). In 1973, we developed the spectrum diagnostic method using Bayesian theory. However, logistic regression was more sophisticated than the spectrum diagnostic method. Next, we developed an OLDF based on the MNM criterion (Miyake and Shinmura 1979, 1980; Shinmura and Miyake 1979), which was a heuristic approach. Because Warmack and Gonzalez (1973) compared several discriminant functions, their research encouraged our research. We were not able to develop the research further because we had low computer power and because of the defect of the heuristic approach.

Starting in 1997, I developed IP-OLDF (Shinmura 1998, 2000a, b; Shinmura and Tarumi 2000). Because I defined IP-OLDF in the discriminant coefficient space, I found two important facts of discriminant analysis. The first is the OCP. The second is "the monotonic decrease of MNM." However, there was a serious defect in IP-OLDF for the Student data, which are not in general position. If the data are not in general position, IP-OLDF might not find the vertex of the true OCP. This defect means that the obtained MNM might not be the true MNM; Problem 1 caused this defect. In 2007, Revised IP-OLDF solved the defect because it can find an interior point of the true OCP and avoid Problem 1. Therefore, I could solve Problem 1 completely. Until 2007, I was not able to evaluate the eight LDFs using validation samples because our research data were small samples.

After 2007, I developed Method 1. Through this breakthrough, I was able to solve Problem 4 and ended the basic research. Revised IP-OLDF solves Problems 1 and 2. Although I could evaluate the eight LDFs by M2, I could not explain the useful meaning of the 95% CI of the discriminant coefficients. After 2010, I started applied research on LSD discrimination. I found that Problem 3 is the defect of the generalized inverse matrix technique, through the pass/fail determination that uses examination scores (Shinmura 2011b). With regard to IP-OLDF, I had set the intercept of IP-OLDF to one and was thereby able to obtain two important facts: the OCP and the monotonic decrease of MNM. Therefore, I divided all coefficients by the intercept and set the intercept to one. Through this second breakthrough, seven LDFs, with the exception of Fisher's LDF, became trivial LDFs in the pass/fail determination that uses examination scores, and I was able to explain the useful meaning of the coefficients of Revised IP-OLDF using the Swiss banknote data. Therefore, I had solved the four problems and could confirm the end of our research. However, when I discriminated the Shipp et al. dataset in October 2015, I found that Revised IP-OLDF can make feature selection naturally and can solve Problem 5 quickly.

1.2.4 Method 1 and Problem 4

If we set k = 100 in the Method 1, we obtain 100 LDFs and 100 error rates from the training and validation samples. From the 100 LDFs, we obtain the 95% CI of the discriminant coefficients. From the 100 error rates, we obtain the 95% CI of the error rates and the two means of error rates, M1 and M2, from the training and validation samples. We consider the model with minimum M2 among all possible combination models to be the best model. This standard is a direct and powerful model selection procedure compared with the LOO procedure.
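As a rough emulation for readers without LINGO or JMP, Method 1 can be imitated for any classifier as below: resample k = 100 training/validation pairs, fit on each training sample, and collect the error rates and coefficients. The 50/50 random split and scikit-learn's LinearDiscriminantAnalysis are stand-in assumptions; the book builds its resampling samples differently (see Chap. 9).

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def method1_sketch(X, y, k=100, train_frac=0.5, seed=0):
    """Rough emulation of Method 1: k resampled train/validation pairs,
    yielding M1, M2, and a 95% CI of the coefficients. The 50/50 random
    split is an assumption, not the book's resampling scheme."""
    rng = np.random.default_rng(seed)
    n = len(y)
    coefs, e_train, e_val = [], [], []
    for _ in range(k):
        idx = rng.permutation(n)
        tr, va = idx[: int(train_frac * n)], idx[int(train_frac * n):]
        lda = LinearDiscriminantAnalysis().fit(X[tr], y[tr])
        coefs.append(lda.coef_[0])
        e_train.append(np.mean(lda.predict(X[tr]) != y[tr]))
        e_val.append(np.mean(lda.predict(X[va]) != y[va]))
    m1, m2 = np.mean(e_train), np.mean(e_val)          # M1 and M2
    ci = np.percentile(np.array(coefs), [2.5, 97.5], axis=0)  # 95% CI
    return m1, m2, ci
```

Running this over all candidate models and keeping the one with minimum M2 reproduces the best-model selection in spirit.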

We should distinguish such computer-intensive approaches from traditional inferential statistics with the SE equation based on the normal distribution. Statisticians without computer power established inferential statistics manually. Today, we can utilize the power of a computer with statistical and MP solvers, such as JMP and LINGO. I developed the Method 1 (Program 2) for Fisher's LDF and logistic regression with the JMP script supported by the JMP division of SAS Institute Japan. In addition, I developed Method 1 for the six MP-based LDFs with LINGO. Those are Revised IP-OLDF, Revised IPLP-OLDF, Revised LP-OLDF, H-SVM, SVM4, and SVM1. I explain the LINGO Program 2 in Chap. 9. Researchers who want to analyze their research data can obtain the 95% CI of the error rates and discriminant coefficients. These statistics provide a precise and deterministic judgment for model selection compared with the LOO procedure. Until this point, I could not validate and evaluate Revised IP-OLDF against the seven other LDFs because I only had small original datasets and no validation samples. Researchers with small samples can validate and assess their research data with Method 1 and the best model.

Miyake and Shinmura (1976) discussed the "error rates of the linear discriminant function" by the traditional approach. On the other hand, Konishi and Honda (1992) discussed "error rate estimation using the bootstrap method." Their computer-intensive approaches are not traditional inferential statistics and do not offer the 95% CI of the error rates and coefficients for individual data. Although logistic regression outputs the 95% CI of the coefficients through the maximum likelihood method proposed by R. Fisher, this is also a computer-intensive approach. On the other hand, we can select the best model and the 95% CI of the error rates and coefficients for the six LDFs by the Method 1 and the best model. Many researchers who want to discriminate small samples thus have the Philosopher's Stone.

1.3 Discriminant Functions

I compare two statistical LDFs by JMP and six MP-based LDFs by LINGO. I omit a kernel SVM because it is a nonlinear discriminant function. However, I evaluate QDF and RDA with the eight LDFs only for the original six different datasets, with the exception of the microarray datasets. Next, I compare the two statistical LDFs and six MP-based LDFs on resampling samples if the data are LSD. If the data are not LSD, we cannot discriminate the data by H-SVM because it causes an error for non-LSD.

1.3.1 Statistical Discriminant Functions

Fisher defined Fisher's LDF by the maximization of the variance ratio (between/within classes) in Eq. (1.1). Nonlinear programming (NLP) can solve this equation.

If we accept Fisher's assumption, the same LDF is obtained in Eq. (1.2) by another plug-in rule 2. This equation defines Fisher's LDF explicitly, whereas Eq. (1.1) defines the LDF implicitly. Therefore, statistical software packages adopt this equation. Some statisticians erroneously believe that discriminant analysis is inferential statistics, similar to regression analysis. Discriminant analysis is not traditional inferential statistics based on the normal distribution because there are no SEs for the discriminant coefficients and error rates (Problem 4). Therefore, Lachenbruch and Mickey proposed the LOO procedure for selecting a good discriminant model, as indicated in Table 1.6.
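The remark that NLP can solve Eq. (1.1) can be illustrated directly: maximize the between/within variance ratio over b with a general-purpose nonlinear optimizer and compare the result with the plug-in direction of Eq. (1.2). The sketch below uses SciPy's minimize on random data as a stand-in for the book's LINGO NLP model.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(2)
X1, X2 = rng.normal(0, 1, (50, 2)), rng.normal(1.5, 1, (50, 2))
m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
S = (np.cov(X1.T) + np.cov(X2.T)) / 2          # pooled within-class covariance

def neg_variance_ratio(b):
    # Eq. (1.1): between-class over within-class variance, to be maximized.
    between = (b @ (m1 - m2)) ** 2
    within = b @ S @ b
    return -between / within

res = minimize(neg_variance_ratio, x0=np.ones(2))     # NLP solution of Eq. (1.1)
b_nlp = res.x / np.linalg.norm(res.x)
b_plugin = np.linalg.solve(S, m1 - m2)                # Eq. (1.2) plug-in direction
print(b_nlp, b_plugin / np.linalg.norm(b_plugin))     # same direction (up to sign)
```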

Most real data do not satisfy Fisher's assumption. When the variance–covariance matrices of the two classes are not the same (Σ1 ≠ Σ2), the QDF defined in Eq. (1.3) can be used. This fact is critical for us. Previous statisticians have known that most real data do not satisfy Fisher's assumption. We use the Mahalanobis distance in Eq. (1.4) for the discrimination of multiple classes. The Mahalanobis–Taguchi method of quality control is one of its applications.

[The display equations were lost in extraction. In standard notation they are: Eq. (1.1), the variance ratio max over b of {tb(m1 - m2)}²/(tb Σ b); Eq. (1.2), Fisher's LDF f(x) = t{x - (m1 + m2)/2} Σ⁻¹ (m1 - m2); Eq. (1.4), the Mahalanobis distance D² = t(x - mi) Σi⁻¹ (x - mi).]

We use Fisher's LDF and QDF in many areas, but they cannot be calculated when some variables remain constant. There are three cases. First, some variables that belong to both classes are the same constant. Second, some variables that belong to both classes are constant but different. Third, some variables that belong to one class are constant. Most statistical software packages exclude all variables in these three cases. On the other hand, JMP enhances QDF using the generalized inverse matrix technique. Therefore, QDF can treat the first and second cases correctly, but cannot manage the third case properly (Problem 3).

Recently, the logistic regression in Eq. (1.5) has been used instead of Fisher's LDF and QDF for two reasons. First, it is well known that the error rate of logistic regression is often less than that of Fisher's LDF and QDF because it is derived from real data instead of some normal distribution that is free from reality. Let "p" be the probability of belonging to a class of diseases. If the value of some variable increases/decreases, "p" increases from zero (normal class) to one (abnormal class). This representation is very useful in medical diagnosis, as well as for ratings of real estate and bonds. On the contrary, Fisher's LDF assumes that cases close to the average of the diseases are the representative cases of the disease class. Medical doctors never permit this claim. Although the maximum-likelihood procedure calculates the SE of the logistic coefficients, we should distinguish this computer-intensive approach from the traditional inferential statistics based on a theoretical distribution derived manually. Firth (1993) indicated that the SEs of the logistic coefficients become large and the convergence calculation becomes unstable for LSD. If I observe the following points: (1) I can find NM = 0 by changing the discriminant hyperplane on the ROC, (2) MNM = 0, (3) the SEs become large, and (4) the convergence calculation becomes unstable, I can determine that logistic regression can recognize LSD. I confirm that logistic regression can almost recognize LSD by this tedious work.
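Points (2) to (4) of this checklist can be partly automated. The sketch below flags probable LSD when a nearly unpenalized logistic regression drives the training NM to zero while the coefficient norm explodes and the solver struggles to converge; it is my approximation of the tedious manual check, and the thresholds are arbitrary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def logistic_lsd_signals(X, y, C=1e8, max_iter=200):
    """Heuristic separation check: near-unpenalized logistic regression on LSD
    shows NM = 0, exploding coefficients, and unstable convergence.
    C and the norm threshold are arbitrary illustration values."""
    model = LogisticRegression(C=C, max_iter=max_iter)
    model.fit(X, y)
    nm = int(np.sum(model.predict(X) != y))         # point (2): MNM = 0 proxy
    big_coef = np.linalg.norm(model.coef_) > 1e3    # point (3): SEs/coefs blow up
    unstable = model.n_iter_[0] >= max_iter         # point (4): unstable convergence
    return nm == 0 and big_coef, dict(nm=nm, big_coef=big_coef, unstable=unstable)
```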

1.3.2 Before and After SVM

There are many types of research on MP-based discriminant analysis. Glover (1990) defined many linear programming (LP) discriminant models. Rubin (1997) proposed MP-based discriminant functions using IP. Stam (1997) summarized Lp-norm discriminant methods in 1997 and answered the question, "Why have statisticians rarely used Lp-norm methods?" He provided four reasons: communication, promotion, and terminology; software availability; the relative accuracy of

Another random document with no related content on Scribd:

THE HAWAIIANS’ ACCOUNT OF THE

FORMATION OF THEIR

ISLANDS

AND ORIGIN OF THEIR RACE WITH THE TRADITIONS OF THEIR MIGRATIONS,

E., AS GATHERED FROM ORIGINAL SOURCES

Author of “An Account of the Polynesian Race”

WITH TRANSLATIONS EDITED AND ILLUSTRATED WITH NOTES BY THOMAS G. THRUM

Memoirs of the Bernice Pauahi Bishop Museum

Volume V—Part I

HONOLULU, H. I.

B M P

1918

[Contents]

P.

L K.

CHAPTER PAGE

I. His Birth and Early Life—Change to Oahu and Fame Attained There 2

II. Kalonaikahailaau—Kawelo Equips Himself to Fight Aikanaka—Arrival at Kauai 20

III. Commencement of Battle Between Kawelo and the People of Kauai 38

IV. Kaehuikiawakea—Kaihupepenuiamono and Muno—Walaheeikio and Moomooikio 42

V. Kahakaloa—His Death by Kawelo 48

VI. Kauahoa—Kawelo Fears to Attack Him— Seeks to Win Him by a Chant—Kauahoa Replies 52

VII. Size of Kauahoa—Is Killed by Kawelo— Kawelo Vanquishes Aikanaka 56

VIII. Division of Kauai Lands—Aikanaka Becomes a Tiller of Ground 60

IX. Kaeleha and Aikanaka Rebel Against Kawelo—Their Battle and Supposed Death of Kawelo 62

X. Temple of Aikanaka—How Kawelo Came to Life Again—He Slaughters His Opponents and Becomes Again Ruler of Kauai 66

S P.

His High Office Laamaomao, His Wind Gourd In Disfavor with the King He Moves to Molokai Has a Son Whom He Instructs Carefully Dreams of Keawenuiaumi Setting Out in Search for Him Prepares with His Son to Meet the King 72

L K

I. Prepares to Meet Keawenuiaumi in Search of Pakaa—Canoe Fleet of Six District Chiefs, Recognized, are Taunted as They Pass— Keawenuiaumi, Greeted with a Chant, Is Warned of Coming Storm and Invited to Land —On Advice of the Sailing-masters the King Sails on 78

II. Kuapakaa Chants the Winds of Hawaii—The King, Angered, Continues on—Winds of Kauai, Niihau and Kaula; Of Maui, Molokai, Halawa—Chants the Names of His Master, Uncle and Men—Pakaa Orders the Winds of Laamaomao Released 92

III. Swamping of the Canoes—They Return to Molokai and Land—The King is Given Dry Apparel, Awa and Food—Storm-bound, the Party is Provided with Food—After Four Months They Prepare to Embark 108

IV. Departure from Molokai—Names of the Six Districts of Hawaii—The King Desires Kuapakaa to Accompany Him—The Boy Consents Conditionally—Setting out they meet with Cold, Adverse Winds—The Sailing-masters Fall Overboard 118

V. At Death of Pakaa’s Enemies Calm Prevails —The Boy is Made Sailing-master—He Directs the Canoes to Hawaii—The Men Are Glad, but the King is Sad at His Failure— Kuapakaa Foretells His Neglect—Landing at Kawaihae, and Deserted, he Joins two Fishermen—Meeting a Six-manned Canoe He Wagers a Race, Single-handed, and Wins —He Hides His Fish in the King’s Canoe— They Plan Another Race to Take Place in Kau, Life to be the Forfeit 124

VI. The Canoe Race in Kau—Kuapakaa Offers to Land Four Times Before His Opponents’ First, and Wins—The King Sends for the Boy and Pleads for the Lives of His Men— Kuapakaa Reveals Himself and Pakaa—The Defeated Men Ordered Put to Death— Keawenuiaumi Orders Kuapakaa to Bring Him Pakaa—Pakaa Demands Full Restitution First—The King Agrees, and on Pakaa’s Arrival Gives Him the Whole of Hawaii 128

Legend of Palila 136

Legend of Puniakaia 154

Legend of Maniniholokuaua and Keliimalolo 164

Legend of Opelemoemoe 168

Legend of Kulepe 172

Legend of Kihapiilani 176

[Contents]

Legend of Hiku and Kawelu 182

Legend of Kahalaopuna 188

Legend of Uweuwelekehau 192

Legend of Kalaepuni and Kalaehina 198

Legend of Kapakohana 208

Legend of Kapunohu 214 [1]

PREFACE.

In this second series of the Fornander Collection of Hawaiian Folklore, with the exception of a few transpositions, as mentioned in the preceding volume, the order of the author has been observed in the main, by grouping together, first, the more important legends and traditions of the race, of universal acceptance throughout the whole group, followed by the briefer folk-tales of more local character.

A few of similar names occur in the collection, indicating, in some cases, different versions of the same story, a number of the more popular legends having several versions.

The closing part of this volume, to embrace the series of Lahainaluna School compositions of myth and traditional character, it is hoped will be found to possess educational value and interest.

No liberties have been taken with the original text, the plan, as outlined, being to present the various stories and papers as written, regardless of historic or other discrepancies, variance in such matters being treated in the notes thereto.

[Contents]

L K.

CHAPTER I.

B E L

K.—H C

O F

A T.

Maihuna was the father and Malaiakalani was the mother of Kawelo, who was born in Hanamaulu,1 Kauai. There were five children in the family. The first was Kawelomahamahaia;

H M

K.

I.

K H

W K

K N .—

K H O

L H M.

Omaihuna ka makuakane, o

Malaiakalani ka makuahine, o Hanamaulu i Kauai ka aina hanau o Kawelo. Elima ka nui o ko Kawelo mau hanauna; o ka mua, o Kawelomahamahaia; o

MOKUNA

the second was Kaweloleikoo. These two were males; after these two came Kaenakuokalani, a female; next to her was Kaweloleimakua and the last child was Kamalama. Kaweloleimakua, or Kawelo is the subject of this story.

The parents of Malaiakalani [the mother] were people who were well versed in the art of foretelling the future of a child, by feeling of its limbs, and by looking over the child, they could tell whether it would grow up to be brave and strong, or whether it would some day rule as king. At the birth of the two older brothers of Kawelo, these old people examined them, but found nothing wonderful about them. This examination was followed by the two on Kawelo, upon his birth. After the examination the old people called the parents of Kawelo and said to them: “Where are you two? This child of yours is going to be a soldier; he is going to be a very powerful man and shall some day rule as king.” Because of these wonderful traits, the old

kona muli, o Kaweloleikoo; he mau keiki kane laua, mahope hanau o Kaenakuokalani, he wahine ia. O kona muli mai o Kaweloleimakua, a o kona muli iho o Kamalama, o ka mea nona keia moolelo o Kaweloleimakua, oia o Kawelo.

O na makua o Malaiakalani, he mau mea akamai laua i ka haha a me ka nana i ka wa uuku o ke keiki, aole e nalo ia laua ke ano a me ka hana a ke keiki ke nui ae, ke koa a me ka ikaika, ke keiki ku i ka moku. Pela ka hana a ua mau makua nei, i na kaikuaana o Kawelo, a hiki ia Kawelo, haha no laua a hai aku i kona ano a me kana hana, i na makua o Kawelo: “E, auhea olua, o keia keiki a olua, he keiki koa, he keiki ikaika, he keiki e ku ana i ka moku.” Nolaila lawe ae la laua ia Kawelo a hanai iho la. Mahope o laila, hanau o Kamalama ko Kawelo kaikaina ponoi.

people took Kawelo and attended to his bringing up themselves. It was after this that Kamalama, the younger brother of Kawelo was born.

Shortly after the birth of Kamalama, the grandparents of Kawelo moved over to Wailua, where they took up their residence, taking their grandchild Kawelo along with them. At this time, while Kawelo was being brought up, Aikanaka, the son of the king of Kauai was born, and also Kauahoa of Hanalei. All these three were born and brought up together.2

Kawelo as a child was a very great eater; he could not satisfy his hunger on anything less than all the food of one umu to a meal. Kawelo ate so much that his grandparents began to get tired of keeping him in food, so at last they began to search for something to entice Kawelo away from the house and in that way get him to forget to eat. One day they went up to the woods and hewed out a canoe. After it was brought down to the sea

Mahope o laila, hoi ae la na kupuna o Kawelo i Wailua e noho ai, me ka laua moopuna o Kawelo. I keia wa e hanai ia nei o Kawelo, hanau o Aikanaka he keiki alii, a hanau no hoi o Kauahoa no Hanalei ia, akolu lakou ia wa hookahi i hanai ia ai.

He keiki ikaika loa o Kawelo ma ka ai ana, hookahi umu hookahi ai ana, pela aku, a pela aku, a ana na kupuna o Kawelo, i ke kahumu ai na Kawelo, nolaila, imi iho la laua i mea e walea ai o Kawelo. Pii aku la laua i ke kalai waa, a hoi mai la, kapili a paa, haawi aku la ia Kawelo, hoehoe iho la o Kawelo i uka i kai o Wailua, a lilo iho la ia i mea nanea ia ia i na la a pau loa.


When Kauahoa saw Kawelo enjoying himself with his canoe day after day, he got it into his mind to make something to enjoy himself with; so he made himself a kite, and after it was completed he flew it. When Kawelo saw the kite he took a liking to it, so he went home to his grandparents and asked them to make him a kite.3 The grandparents thereupon made Kawelo a kite, and after it was completed he took it out and flew it. When Kauahoa saw Kawelo with a kite he came with his and they flew them together. While they were flying their kites, Kawelo’s kite became entangled with Kauahoa’s, which caused Kauahoa’s to break away, and it was carried by the wind till it landed at Koloa, to the west. The place where the kite landed is known as Kahooleinapea to this day, because of the fall of Kauahoa’s kite there.

Ma keia hana a Kawelo, ike mai la o Kauahoa i ka Kawelo mea nanea, he waa, hana iho la ia i lupe hoolele nana, a hoolele ae la, a ike o Kawelo i keia mea, makemake iho la ia, hoi aku la olelo i na kupuna e hana i lupe nana. A hana iho la na kupuna o Kawelo i lupe nana, a paa, hoolele ae la o Kawelo i kana lupe, a ike o Kauahoa hoolele pu ae la i na lupe a laua. Ma keia lele like ana o na lupe a laua, hihia ae la ka Kawelo lupe me ka Kauahoa, a moku iho la ka Kauahoa lupe, a lilo aku la i ka makani, a haule i Koloa ma ke komohana; o kahi i haule ai, o Kahooleinapea, a hiki i keia la, no ka haule ana o ka pea a Kauahoa, kela inoa o ia wahi.


After Kauahoa’s kite broke away, Kawelo watched Kauahoa, fully expecting that he would come and attack him; but since Kauahoa did not come, Kawelo said within himself: “Kauahoa will never overcome me if we should ever meet in any future battle.”

Kauahoa was a much larger boy than Kawelo; still, he was afraid of Kawelo.4

After flying their kites, they went in swimming and riding down the rapids. Here again Kawelo showed himself more skilful than Kauahoa, which made Kawelo still surer in his belief that Kauahoa would never overcome him in the future. Kawelo and Kauahoa were not separated from one another in the matter of their relationship; they were connected by blood, and so was the young chief Aikanaka, a fact which made Aikanaka something like an older brother and lord to them. Everything Aikanaka wished was granted to him; whether in stringing wreaths or in other things, they never denied him anything.

Ma keia moku ana o ka lupe a Kauahoa ia Kawelo, nana aku la o Kawelo i ko Kauahoa kii mai e pepehi ia ia, a liuliu, noonoo iho la o Kawelo, aole no e pakele o Kauahoa ia ia, ina laua e kaua mahope, no ka mea, he nui o Kauahoa, he uuku o Kawelo, aka, ua makau nae o Kauahoa ia Kawelo.

A mahope o ka hoolele lupe, hookahekahe wai iho la laua, a oi aku la no ko Kawelo i mua o Kauahoa, nolaila, noonoo iho la no o Kawelo, aole no e pakele o Kauahoa ia ia mahope aku ke kaua. O Kawelo a me Kauahoa, aole laua i kaawale aku, ua pili no ma ka hanau ana, a pela no ke ’lii o Aikanaka, ua pili no ia laua, nolaila, lilo o Aikanaka i kaikuaana haku no laua. Ma na mea a pau a Aikanaka e olelo mai ai, malaila laua e hoolohe ai, ina he kui lei, a he mea e ae paha, aole a laua hoole, he ae wale no.


While Kawelo and his grandparents were living at Wailua with Aikanaka and the others, Kawelo’s older brothers, together with their grandparents, left Kauai and came to live in Waikiki, Oahu. Kakuhihewa was the king of Oahu at this time. Living with Kakuhihewa was a very strong man, a famous wrestler. This man used to meet the older brothers of Kawelo in wrestling bouts, but they could never throw him down. The brothers of Kawelo were great surf riders, and they often went to ride the surf at Kalehuawehe.5 After the surf ride they would go to the stream of Apuakehau and wash, and from there to the shed where the wrestling bouts were held, to test their skill against Kakuhihewa’s strong man; but in all their trials they never once were able to throw him.

Ia Kawelo ma e noho ana i Wailua me Aikanaka ma, holo mai la na kaikuaana o Kawelo me ko laua mau kupuna, mai Kauai mai a noho i Waikiki ma Oahu nei. O Kakuhihewa ke ’lii o Oahu nei e noho ana ia wa, a aia hoi me Kakuhihewa, he kanaka ikaika loa i ka mokomoko. A o ua kanaka la, oia ka hoa mokomoko o na kaikuaana o Kawelo, aole nae he hina i na kaikuaana o Kawelo. A he mea mau i na kaikuaana o Kawelo ka heenalu, i ka nalu o Kalehuawehe, a pau ka heenalu, hoi aku la a ka muliwai o Apuakehau auau, a pau, hoi aku la a ka hale mokomoko, aole nae he hina o ke kanaka o Kakuhihewa i na kaikuaana o Kawelo.


While the families were thus living apart, the older brothers of Kawelo being in Oahu, the grandparents who were with Kawelo in Wailua began after a while to long for a sight of their other grandchildren; so one day they sailed for Oahu, bringing Kawelo with them, and landed at Waikiki, where they were met by the older brothers of Kawelo. After deciding to make their home in Waikiki, Kawelo took up farming and also took unto himself a wife, Kanewahineikiaoha, the daughter of Kalonaikahailaau, and they lived together as husband and wife.

While Kawelo was one day working in his fields, he heard shouting down toward the beach, so he inquired of his grandparents: “What is that shouting down yonder?” The grandparents answered: “It is your brothers; they have been out surf riding and are now wrestling with Kakuhihewa’s strong man. One of them must have been thrown, hence the shouting you hear.” When Kawelo heard this he became very anxious to go down and see it, but his grandparents would not consent.6 On the next day, however, Kawelo went down on his own account and saw his older brothers surf riding with many others at Kalehuawehe. He asked for a board, which was given him, and he swam out with it to where his brothers were waiting for the surf, and they came in together. After the surf riding they went to the stream of Apuakehau and took a fresh-water bath; and from there they went to the shed where the wrestling bouts were to be held.

Ma keia noho kaawale ana o na kaikuaana o Kawelo i Oahu nei, hu ae la ke aloha i na kupuna o lakou e noho ana me Kawelo i Wailua, nolaila, holo mai la na kupuna me Kawelo i Oahu nei, a pae ma Waikiki, ike iho la i na kaikuaana, a noho iho la i laila. Ma keia noho ana i laila, mahiai o Kawelo, a moe iho la i laila i ka wahine, oia o Kanewahineikiaoha, kaikamahine a Kalonaikahailaau, a noho pu iho la laua he kane a he wahine.

Ia Kawelo e mahiai ana, lohe aku la ia i ka pihe uwa o kai, uwa ka pihe a haalele wale, alaila, ninau aku o Kawelo i na kupuna: “Heaha kela pihe o kai e uwa mai nei?” I mai la na kupuna: “Ou kaikuaana; hele aku la i ka heenalu, a hoi mai la mokomoko me ke kanaka ikaika o Kakuhihewa, a hina iho la kekahi, uwa ae la, a nolaila, kela pihe au e lohe la i ka uwa.” A lohe o Kawelo, olioli iho la ia e iho e ike, aka, aohe ae o na kupuna ona, nolaila, i kekahi la, iho aku la o Kawelo ma kona manao a hiki i kai, e heenalu ana na kaikuaana a me ka lehulehu i ka nalu o Kalehuawehe. Nonoi aku la o Kawelo i papa nona, a loaa mai la, au aku la ia i ka heenalu a loaa na kaikuaana, hee iho la lakou i ka nalu, a pau ka heenalu ana, hoi aku la lakou a ka muliwai o Apuakehau auau wai, a pau ka auau ana, hoi aku la lakou i ka hale mokomoko.


Upon their arrival at the shed, Kawelo stood up to wrestle with the strong man. At sight of this Kawelo’s older brothers said to him: “Are you strong enough to meet that man? If we, whose bones are older, cannot throw him, how much less are the chances of yourself, a mere youngster.” Kawelo, however, paid no heed to his brothers’ remarks, but stood there facing the strong man. At this show of bravery the strong man said to Kawelo: “If I should call out, ‘Kahewahewa, it is raining,’7 then we begin.” Kawelo then replied in a mocking way: “Kanepuaa, he is biting; wait awhile, wait awhile. Don’t cut the land of Kahewahewa, it is raining.”8 While Kawelo was having his say, the strong man of Kakuhihewa was awarded the first hold; and using his whole strength he attempted to throw Kawelo. Kawelo was almost thrown, but through his great strength and skill he kept his feet. Then Kawelo, after mocking the man, took his hold and threw the strong man, falling on top of him. This so delighted the people that they all shouted.

A hiki lakou i ka hale, ku ae la o Kawelo me ke kanaka ikaika i ka mokomoko. I mai na kaikuaana: “He ikaika no oe e ku nei, a hina ka hoi maua na mea i oo ka iwi, ole loa aku oe he opiopio?” Aole o Kawelo maliu aku i keia olelo a kona mau kaikuaana, ku iho la no o Kawelo, a pela no hoi ua kanaka la. Ia wa, olelo mai ua kanaka ikaika la ia Kawelo, penei: “Ina wau e kahea penei, ‘Kahewahewa, he ua!’ alaila, kulai kaua.” Hai aku la no hoi o Kawelo i kana olelo hooulu, penei: “Kanepuaa! Ke nahu nei! Alia! Alia i oki ka aina o Kahewahewa, he ua!” Ia Kawelo e olelo ana peia, lilo iho la ka olelo mua i ke kanaka ikaika o Kakuhihewa, a i ke kulai ana, aneane no e hina o Kawelo, a no ka ikaika no o Kawelo, aole i hina. Ia manawa hoomakaukau o Kawelo i kana olelo hooulu, a i ko Kawelo kulai ana hina iho la ia ia a kau iho la o Kawelo maluna, a uwa ae la na kanaka a pau loa.


When the older brothers of Kawelo saw how the strong man was thrown by their younger brother they were ashamed, and they returned home weeping, intending to deceive their grandparents. When they arrived at the house the grandparents asked them: “Why these tears?” They replied: “Kawelo threw stones at us. We are therefore going back to Kauai.” After the brothers of Kawelo had returned to Kauai, Kawelo, with his wife and his younger brother Kamalama, lived on at Waikiki.

A ike na kaikuaana o Kawelo, i ka hina ana o ke kanaka ikaika i ko laua kaikaina, hilahila iho la laua, a hoi aku la i ka hale me na olelo hoopunipuni i na kupuna, me ka uwe, a me ka waimaka. Ninau mai la na kupuna: “He waimaka aha keia?” I aku la laua: “I pehi ia mai nei maua e Kawelo i ka pohaku, nolaila, e hoi ana maua i Kauai.”



Not very long after this Kawelo began to learn dancing, but being unable to master it he dropped it and took up the art of war under the instruction of his father-in-law, Kalonaikahailaau. Kamalama took up this art as well, and so did Kanewahineikiaoha. After Kawelo had mastered the art of warfare, he took up fishing. Maakuakeke of Waialae was Kawelo’s fishing instructor.

Early in the morning Kawelo would get up and start out from Waikiki going by way of Kaluahole, Kaalawai, and so on to Waialae where he would chant out:

Say, Maakuakeke,
Fishing companion of Kawelo,
Wake up, it is daylight, the sun is shining,
The sun has risen, it is up.
Bring along our hooks,
Together with the fishing kit,
As well as our net.
Say, Maakuakeke,
The rattling paddles,
The rattling top covering,
The rattling bailing cup, wake up, it is daylight.

A hoi na kaikuaana o Kawelo i Kauai, noho iho la o Kawelo me kana wahine, a me kona pokii me Kamalama. Mahope o laila, ao o Kawelo i ka hula, a o ka loaa ole o ia, haalele o Kawelo ia mea, a ao iho la i ke kaua me kona makuahunowai me Kalonaikahailaau; ao iho la no hoi o Kamalama, a me Kanewahineikiaoha. A pau ke ao ana i ke kaua, ao iho la o Kawelo i ka lawaia. O Maakuakeke he kumu lawaia a Kawelo, no Waialae.

I ke kakahiaka nui, ala ae la o Kawelo a hele aku la mai Waikiki aku, a Kaluahole, Kaalawai, hiki i Waialae, paha aku la o Kawelo penei:

E Maakuakeke,
Hoa lawaia o Kawelo nei la,
E ala, ua ao, ua malamalama,
Ua hiki ka la aia i luna;
Lawe mai na kihele makau,
Me ka ipu holoholona pu mai,
Me ka upena mai a kaua;
E Maakuakeke,
Ka hoe nakeke,
Ke kuapoi nakeke,
Ke ka nakeke, e ala ua ao.


While Kawelo was chanting, Maakuakeke’s wife heard it, so she woke her husband, saying: “Wake up; I never heard your grandparents chant your name so pleasingly as Kawelo has this morning. No, not even your parents. This is the first time that I have heard so pleasing a chant.” Maakuakeke then woke up, made ready everything called for by Kawelo in the chant, went out, boarded the canoe, and they set out. As they were going along, Maakuakeke called out to Kawelo in a chant, as follows:

Say, Kawelo-lei-makua, stop.
Say, offspring of the cliffs of Puna,
The eyes of Haloa are above,
My lord, my chiefly fisherman of Kauai.


Ma keia paha a Kawelo, lohe ka wahine a Maakuakeke, hoala aku la i kana kane: “E, e ala, aole au i lohe i ka lealea o ko inoa i kou mau kupuna, aole hoi i na makua, a ia Kawelo akahi no au a lohe i ka lea o kou inoa.”

Ala ae la o Maakuakeke, hoomakaukau i na mea a pau a Kawelo i kahea mai ai, hele aku la a kau i luna o ka waa, a holo aku la laua. Ia laua e holo ana, kahea mai o Maakuakeke ia Kawelo, penei:

E Kawelo-lei-makua, e pae,
E kama hana a ka lapa o Puna,
Na maka o Haloa i luna,
Kuu haku, kuu lawaia alii o Kauai.
