
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
Thisaranie Kaluarachchi
University of Colombo School of Computing, Colombo 0700, Sri Lanka
Abstract - Web UI/UX developers typically devote significant effort to designing and developing a desired website, as the website design is constantly reviewed and modified by end users or stakeholders. This repetition is a time-consuming and labor-intensive step in website development In this paper, a novel automatic DOM hierarchy construction approach based on computer vision and machine learning is proposed to reduce the overhead of the iterative web UI prototyping process by creating the DOM for a website, which can eventually be translated into various front-end web frameworks. The proposed approach constructs the DOM hierarchy from a screen capture of a real-world website as a website mock-up design artifact. It detects and classifies web GUI elements in a website mock-up design artifact using computer vision techniques and a CNN-driven image classifier. To construct the DOM hierarchy for a website, heuristic algorithms are proposed that arrange classified web GUI elements in a tree structure. An in-depth analysis of the approach reveals that it is capable of (I) accurately detecting and categorizing web GUI elements in a screen capture of a real-world website and (II) generating DOM hierarchies similar to those designed by web developers.
Key Words: websitemock-updesignartifact,webGUIelementsdetection,webGUIelementsclassification,heuristicalgorithm, automaticDOMhierarchyconstruction
The DOM construction is an Application Programming Interface (API) for documents written in HTML and XML markuplanguages.Itisanessentialpartoffront-endwebdevelopment.Itprovidesawayforprogrammerstointeractwithand manipulate the structure of a website. With the help of DOM, web developers can access and modify different parts of a webpage.Thefinalmock-updesignsketchmaynotalwaysaccuratelyrepresentthelookandfeelofafinalwebpagethatthe clientexpects.IfthewebUI/UXdesignerdoesnotrefineamock-updesignsketchuntilitreachesthedevelopers,incorrect assumptionsmaybemadeduringtheiterativeprototypedevelopmentphase.Finally,thesketchedprototypeofthewebsite maynotsatisfytheclient’sexpectations.Themock-updesignsketchleadstoiterationsofdifferentcopiesofthedesignsketch tofine-tunetheprototypeuntilanagreementisreachedwiththeclient.FunctionalGUIcodeshouldbeiterativelymodified withtheDOMstructureaccordingly.
WebUI/UXdeveloperstypicallydevotesignificantefforttodesigninganddevelopingadesiredwebsite,asthewebsite design is constantly reviewed and modified by end users or stakeholders. This repetition is costly in terms of time and resources. As a result of these challenges, as well as the ever-changing web development trends with front-end code automation,anautomaticwebsitegenerationconcepthasemerged[1],partiallytoautomatewebdesignanddevelopment workflow.Automaticwebsitegenerationisaversatilesolutiontominimizetheburdenonthewebdesignanddevelopment processwhiledramaticallyminimizingwebdevelopmentcosts[2-4]andaccidentalmistakes[5].
According to a recent literature review on automatic website generation [1], researchers have invented several strategiestoautomatetheUIprototypingaspectinthewebsitedevelopmentprocess,suchasGUIelementclassificationfrom visualdesigns,DOMhierarchyconstruction,andHTMLcodegeneration.However,automatingtheUIprototypingaspectofthe websitedevelopmentprocessisstillabigchallenge.Thisleadstothemotivationofthisresearchwork,whichexplainsanovel automaticDOMhierarchyconstructionapproachfromscreencapturesofreal-worldwebsitestoreducetheburdenandtimewastageoftheiterativewebUIprototypingprocesstosomeextent.
Assuch,thisresearchpresentsandalgorithmtoconstructtheDOMofawebsitefromthemock-ups.Fortheresearch baseline,threestepsareconsideredtobeabletocreateafunctionalapproach:(I)identifyGUIelementstorealizeawebsite screen capture, (II) determine web GUI elements and their related attributes, and (III) generate the corresponding DOM hierarchyasthelaststep.Thesestepsareorganizedinapipeline,andeachstepcanbemodifiedindependently,facilitatingthe development and experimentation of different modules The DOM construction approach presented in this paper was comparedwiththestate-of-the-artwebpagegenerationtools Sketch2Code [14]and pix2code [2].Comparingthecodegenerated withtheDOMconstructionis,therefore,valid,andtheresultsindicatethattheapproachproducesDOMhierarchiescloserto thetargetDOMhierarchiesthan Sketch2Code and pix2code
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
Theremainderofthispaperisorganizedasfollows:Section2conductsaliteraturereviewofthecurrentstate-of-theartmethodsresearchershaveengagedintoautomatethewebUIprototypingprocess.Section3focusesonthedesignscience research implementation of the proposed DOM hierarchy construction approach. Section 4 evaluates the approach by structurallycomparingconstructedDOMhierarchiestoactualDOMhierarchiesconstructedbyprofessionalwebdevelopers. Section5expandsonthediscussionofresults,andfinally,Section6concludestheresearchbylookingatthelimitationsand futuredirectionsoftheproposedDOMhierarchyconstructionapproach.
Withwebsitedevelopmenttrendschangingyearly,theautomaticwebsitegenerationtrendassistsUI/UXdevelopersin automatingaspectsofthewebUIprototypingprocess,providingaversatilesolutiontosignificantlyreducetheburdenandtime wasted in web design and development processes [1]. Two strategies researchers have invented to automate the web UI prototypingprocessareconsideredtomotivatetheproposedautomaticDOMhierarchyconstructionapproach. Theyareweb GUIelementsextractionandclassificationfromvisualdesigns[2,3,6-15]andDOMhierarchicalconstruction[2,6,7,11,12,14,15].
The related works show recent methods of extracting and classifying web GUI elements via computer vision (CV) and machinelearningtechniques.ClassicalCVtechniquesdetectandclassifythepositions,sizes,andtypesofwebGUIelementsfrom visualdesigns.Thestudies[6,7]utilizevariousCVtechniques,suchascontourdetection,edgedetection,colourfiltering,andline detection, to robustly classify web GUI elements, such as images, texts, titles, buttons, menus, inputs, etc., in an image. CV techniquesfacetheproblemofbeingextendedwithnewGUIelements,whichischallengingandtime-consuming.
Withtheadvancementinmachinelearning,theemergenceofnewfamiliesofmachinelearningmethodswasrequired.Lately, the Convolutional Neural Network (CNN) has been immensely popular due to the introduction of deep learning. CNN is a supervisedmachinelearningapproachthatcanautomaticallylearnrobust,salientimagecategoryfeaturesfrommanylabelled trainingimages.Therefore,theneuralnetworkisusedtoautomateimageclassificationforwebsites.Thestudies[2,3,8-15] developedvariousCNN-poweredvisionmodelstodetectandclassifywebGUIelementsfromvisualdesigns.
DOMhelpsdevelopersaccessandchangedifferentpartsofthepage,suchasheadings,paragraphs,images,andforms. Therefore,theDOMcanbethoughtofasawaytoorganizeandcontrolalltheelementsthatmakeupawebpage.IntheDOM, everyelementonawebpage,likeaheadingoraparagraph,istreatedasanobject.Usingobjects,theDOMprovidesaconsistent andorganizedwayfordeveloperstointeractwithwebpageelements.Therefore,classifyingthewebGUIelementsofawebpage isanimportantstepinautomaticallyconstructingaDOMhierarchyfromawebsitemock-updesignartifact.
TherelatedworksshowthemethodsofconstructingasuitableDOMhierarchyforawebsitewithpropergroupingofweb GUIelementsinwebGUIcontainers.Beltramelli[2]usesaDomainSpecificLanguage(DSL)languagemodelbasedonaShortTerm Long Memory (LSTM) network to encode classified web GUI elements and their semantic relationships into an intermediate DSL representation that represents the DOM hierarchy of the website. Alexander Robinson [6] introduceda recursivealgorithmthatrecursivelygroupsnestedwebGUIelementstobuildpredefinedwebGUIcontainers.Finally,the hierarchicaltreestructureofthewebsiteisbuiltbyrecursivelygroupingtheclassifiedwebGUIcontainers.Nguyenetal.[7] introducedamethodtodetectandgroupmobileGUIelementsintocontainersthatgeneratehierarchicalsub-trees.Then,theUI hierarchyformobileapplicationsisgraduallyconstructedbymergingthosehierarchicalsub-trees.
Yunetal.[12]proposedamethodtoconstructthehierarchicalstructureofaUIsketchasanXMLfileutilizingtheYOLOdeep neuralnetwork,whichextractedthestructuralanddescriptiveinformationofUIelements.Microsoft Sketch2Code [14]usesa layout construction algorithm to construct the DOM hierarchy of a website. Ferreira et al. [11] introduced a hierarchy reconstructionalgorithmthatiterativelyintegratesclassifiedwebGUIelementsandcontainerstobuildtheDOMhierarchyfora website. Similarly, to build the GUI skeleton from mobile design mock-ups, Chen C. et al. [15] utilized a Recurrent Neural Network(RNN)encoderandanRNNdecoderwithvisualfeaturesofmobileGUIelementstoproducetheGUIskeletonfrom tokensequencerepresentation,andeventuallyconvertedtoaUIhierarchythroughDepthFirstTraversal(DFT)algorithm.
Although generating a hierarchical UI representation is an active research field as suggested by these breakthroughs, constructingtheDOMfromvisualinputsisstillanearlyunexploredresearcharea.However,someresearchleveragesmachine learningtoconstructtheDOMwhereassomemethodsrelyonheuristicalgorithms.Ourresearchpresentsanothermethodof constructingtheDOMfromascreencaptureofawebsiteincollaborationwithmachinelearningandheuristicalgorithms.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
The research study aims to develop a novel automatic DOM hierarchy construction approach that constructs DOM representationsofwebsitesfromscreencapturesofreal-worldwebsites.Therefore,designscienceresearchmethodologyis followedtodesignandinvestigateanovelautomaticDOMhierarchyconstructionapproachasanartifact. Firstly, aseriesof imageprocessingtechniques(Canny’sedgedetectionalgorithm,morphologicaloperationsetc.)isappliedtoinfertheboundaries ofwebGUIelementsandtoextractimagesofeachGUIelementfromwebsitescreencaptures.Theoutputofthisphaseisasetof X-axisandY-axiscoordinatesonawebsitescreencapture.Similarly,basedonthederivedX-axisandY-axiscoordinates,a collectionofwebGUIelementimagescroppedfromtheinputwebsitescreencaptureisgeneratedandfedtothenextphasetobe classifiedintowebdomain-specificelementcategories
AfteridentifyingtheboundingboxesofwebGUIelementsinawebsitemock-updesignartifact,thecroppedimagesofthose elementsareclassifiedintotheirrespectivewebdomain-specificcategories.Forthat,aCNNimageclassifieristrainedtoclassify the images of web GUI elements However, no dataset contains labelled atomic web GUI elements. Therefore, a datasetis generated,includingallstructuralinformationnecessarytoconstructanartifacttodetectandclassifywebGUIelements.Screen capturesofcurrentreal-worldwebsitesareconsideredandcategorizedintothreedatasetsusedbytheproposedCNNtotrain, validateandtesttheimageclassifier.Astheinitialprototypeofthisapproach,pureHTML5andCSS-supportedwebsitesare attempted,and620real-worldwebsiteblocksarediscoveredfromHTMLwebsitetemplateprovidersontheinternetasof December2022.
WebsiteblocksdevelopedinpureHTML5andCSSaremanuallyfilteredout,excludingHTML5andCSS-supportedwebsite blocksdevelopedinframeworkssuchasBootstraporPHP,websiteblockswithnon-standardwebGUIelements,suchasgaming websites,tendtousenon-standardinteractivewebGUIelementsthatcanbecomplicatedtocategorize,andduplicatewebsites thatbelongtomorethanonecategory.Similarly,whileonlineadvertisementsarearegularfeatureofmostprominentwebsites, theyaremanuallyblockedwhenscreencapturesarecapturedtopreventdetectingnon-nativeelementssuchasflash,videosand frames.ThewebsitescreencapturesarecapturedusingPhantomJsinMATLAB,athird-partyscreenshotcapturermodule,or with browser extensions suchas GoFullPage for theGoogleChrome browserand Lightshot screenshottool for the Firefox browser.Tomaintaintheconsistencyofheightandwidthofallscreencaptures,thewebsitesdisplayinginlandscapeorientation werereviewed,andtheirscreencaptureswereverifiedataresolutionof1280x1024.
TotraintheproposedCNNimageclassifier,alargesetofimagesofGUIelementscroppedfromscreencapturesofreal-world websiteblocksannotatedwiththeirappropriatewebdomain-specifictypesisrequired.Forthat,awebscrapingtoolisusedto manuallytraversetheGUIofthetargetwebsiteinamannersimilartotheDepthFirstSearch(DFS)algorithmtocapturethe boundingrectanglecoordinates,givingthepreciselocations,width,andheight,andmetainformationofeachGUIelement,which arethenstoredinXMLformat.ThisinformationisutilizedinMATLABtoextractindividualsub-imagesofeachwebGUIelement displayedonthewebsitemock-updesignartifactssavedinthedataset.Finally,theindividualsub-imagesofeachGUIelement aremanuallylabelledintotheirrespectivewebdomain-specifictypeswithpreciousaccuracy.Thisprocessproducedadataset containingscreencapturesof620distinctreal-worldwebsiteblockswith33,120sub-imagesofwebGUIelementsfrom12most prominentnativeclasses.
ThewebGUIelementimagesdatasetissegmentedintotraining,validation,andtestingdatasetsonascaleforusewiththe suggestedCNNimageclassifier.Thesegmentationmethodologydescribedin[16]researchstudyisfollowedtosegmentwebGUI elementimagesintotraining,validation,andtestingdatasetsinproportionsof75%,15%,and10%,respectively. TheCNN architecturesuggestedissimilartothearchitectureofAlexNetCNN[17],whichisoneofthemostcommonlyusedconvolutional neuralnetworksforimageclassification.Ithastwofewerconvolutionallayers(3insteadof5). ThesuggestedCNNisdeveloped inMATLAButilizingthreetoolkits:NeuralNetwork[18],ComputerVision[19],andParallelComputing[20]. Aftertrainingthe CNNimageclassifier,newunseenimagesofwebGUIelementscanbefedtotheclassifier,resultinginaseriesofclassification scorescorrespondingtoeachelementclass.Withtheproposedapproach,thewebGUIelementclasswiththehighestdegreeof confidenceisassignedtobetheappropriatelabelforagiventargetimage.Section4evaluatestheclassificationprecisionofthe proposedCNNimageclassifierusingthewebGUIelementsdatasetdescribedinthissection.
ThefinalstageintheapproachistobuildtheDOMhierarchyofawebsitewiththepropergroupingofwebGUIelementsin web GUIcontainers. HTML5 offerssixsemantic layoutcomponents(header, nav, section,article,aside,andfooter)asGUI containers to define different parts of a webpage. Therefore, an algorithm is designed to derive HTML5 semantic layout componentsbygroupingcorrespondingGUIelements.ThealgorithmrequireseachGUIelement'scoordinates,width,andheight in the input website image. The algorithm traverses the image along the X- and Y-axis directions to draw circumscribed rectangles,whicharerectanglesthattouchatleastonepointoftheGUIelementwitheachofitsslides,whileallpixelsoftheGUI elementcontainedwithinaninternalareaoftherectangleoritsborders.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
AcircumscribedrectangleRectiisdefinedasapairofcoordinatesthatsatisfythefollowingexpression:topleftcorner(x1r, y1r)andbottomrightcorner(x2r,y2r).Assumethatthecoordinatesofapixel p aredenotedbyxpandyp,andtheimageofaGUI elementisdenotedbyelei (topleftcorner(x1,y1)andbottomrightcorner(x2,y2)).
(x1r, y1r), (x2r, y2r): p elei (x1r ≤ xp
˄
(x
˄ p elei (x
)
)
EvenifawebGUIelementbelongstomorethanonecircumscribedrectangle,everywebGUIelementshouldbelongto exactlyonecircumscribedrectangle.Therefore,circumscribedrectanglescanbeassumedaswebGUIcontainerscontainingany sequentialwebGUIelements.Inconstructingcircumscribedrectangles,rectangleswithintersectioncanbeignoredbecauseitis assumedthatmostwebGUIcontainersinsimpleHTML5andCSS-supportedwebsiteblocksmaynotoverlap.Ifacircumscribed rectanglesatisfiesthecriteriaof(I)notintersectingothercircumscribedrectanglesand(II)beingstructurallyidenticaltooneof thepredefinedHTML5semanticlayoutcomponents,itisextractedasawebGUIcontainerwithitsassociatedelements.The algorithmisprovidedwithalistofstructuralinformationofpredefinedHTML5semanticlayoutcomponentstoensurethe derivedcontainersareunique.ThelistofHTML5semanticlayoutcomponentscomprisesstructuralinformationofsemantic layout components (x, y coordinates, width, and height) and their corresponding categories; the structural information is collectedusingawebscrapingtool,andthecorrespondingcategoriesaremanuallylabelled.Finally,itgeneratesacollectionof HTML5semanticlayoutcomponentsintheinputwebsiteimage,alongwitheachlayoutcomponent’sstructuralinformation, categorynames,andGUIelementcomposition.Thepseudo-codeofthealgorithmcanbeexploredinAlgorithm1asfollows.The resultofthealgorithmisdepictedinFigure1andFigure2
Input: elements -element1(x1,y1,width1,height1)…………elementn(xn,yn,widthn,heightn)and HTML_containers -alist ofpredefinedHTML5semanticlayoutcomponents
Output: containers (alistofcontainerswithassociatedelements)
1: function CONTAINERDERIVATION(elements)
2: containers []
3: temp.containers []
4: for all do
5: xa min( ,
6: ya min( ,
7: xb max( + ),( )
8: yb max( + ),( ) 9: R rectangle(xa,ya)(xb,yb)
10: if R then
11: rectangle(xa,ya)(xb,yb)
12: continue
13: end if 14: temp. containers [] R
15: end for 16: for all do
17: if =NULL
18: if belongsto HTML_containers then
19: containers
20: end if
21: end if
22: end for
23: return containers
24: end function
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
1: Examples of Subsets Identified as Containers on a Website Block
2: Examples of Subsets That Not Identified as Containers on a Website Block
Accordingly,allsubsetsofwebGUIelementswithcircumscribedrectanglesinblue,asshowninFigure1,arecontainersand correspondtopredefinedHTML5semanticlayoutcomponents,whereassubsetswithcircumscribedrectanglesinred,shownin Figure2,arenotcontainersbecauseofintersectionsbetweenrectanglesanddonotcorrespondtoanyofthepredefinedHTML5 semanticlayoutcomponents.
OncetheGUIcontainercomponentsarederived,analgorithmisdesignedtoconstructtheDOMhierarchyofawebsitewith theproperarrangementofthederivedwebGUIcontainersandelements.Forthat,aniterativeDOMhierarchyconstruction algorithmisdesignedtoderivearealisticDOMhierarchyforawebsitegraduallybyutilizingderivedwebGUIcontainersas HTML5 semantic layoutcomponents, in which successive web GUI elementsare grouped, and then constructing the DOM hierarchywiththeproperlayoutofwebGUIelementsandcontainers.
Accordingtotheconceptofchild,parent,leaf,andinternalnodesingraphtheory,forinstance,ifN1 andN2 arewebGUI elementsandthecoordinatesofN1aregreaterthanthecoordinatesofN2(N1>N2),thenN2isachildofparentN1.Accordingly, nodes can be classified into two types based on their orientation property: definite and indefinite. A node is definite if its orientationcanbeuniquelydefinedashorizontalorvertical,whereasanodeisindefiniteifithasbothhorizontalandvertical orientationsimultaneouslyorcannotbedetermined.Thestructureofawebsitescreencaptureisatreeinwhicheachelement correspondstoadefinitenodewithaparent-childrelationship,andeachelementmustbeconnectedtoexactlyoneparent, exceptfortherootnode,whichhasnone.TheDOMstructureofa websitefromascreencaptureisconstructedusingthat conceptbysuggestinganiterativeDOMhierarchyconstructionalgorithmwhereGUIcontainersareconstructedassub-trees, andthefinaltreeisconstructedbyhierarchicallyarrangingthosesub-trees. Thepseudo-codeofthealgorithmcanbeexplored in Algorithm 2 asfollows.TheresultofthealgorithmisdepictedinFigure3.
Input: containers -acollectionofderivedHTML5semanticlayoutcontainersforawebsitemock-updesignartifact withtheirstructuralinformation,categoryname,andwebGUIelementcomposition
Output: |containers|==1and containers1 -therootnodeofthetreecontainingallwebGUIcontainersandwebGUI elementsforawebsitemock-updesignartifact
1: function DOMCONSTRUCTION(containers)
2: for all do
3: for all do
4: orientation .orientation
5: if orientation isHorizontal then
6: selecttheelementtotherightof
7: else if orientation isVertical then
8: selecttheelementtothebottomof elementi+1
9: if isnotNULL then
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
10: if is definite then 11: make achildof 12: else
13: newElement new Element
14: make achildof newElement
15: make achildof newElement
16: newElement.orientation orientation
17: newElement
19: i i +1
20: end if 21: end if 22: end for
23: k k +1
24: end for
25: for all do
18: \
26: orientation orientation
27: if orientation isHorizontal then
28: selectthecontainertotherightof
29: else
30: selectthecontainertotheleftof
31: end if
32: i i +1
33: end for
34: return containers1
35: end function
ThealgorithmintendstoiterativelyprocesseachwebGUIcontainertoconstructsub-treeswith parent-child relationshipsof elementsbymergingeachwebGUIelementintoadjacent definite webGUIelementsateachstage.Thehierarchicalarrangement ofsuccessiveGUIelementswithineachGUIcontainerisperformedbyevaluatingtheirorientationaccordingtothegraphtheory. UsingMATLAB,hierarchyconstructionheuristicsaredevelopedtobuildsub-treesforwebGUIelementstobeproperlyplacedin GUIcontainers.Initially,thealgorithmproducesacollectionofsub-treeswithrelatedwebGUIelementsarrangedhierarchically. ThealgorithmisfurtherextendedtoconstructarealisticDOMhierarchyforawebsitebyiterativelyarrangingtheconstructed sub-treeslikeatree.Thealgorithmterminateswhenexactlyonecomponent(containers1)isconstructedastherootrepresenting atreestructureofwebGUIcontainersandwebGUIelements,eachwithacorrespondingname(seeFigure3).
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
Thesuggestedapproachisevaluatedontwovalidationstrategies:(I)theaccuracyoftheCNN-drivenclassifieronwebGUI elementimagesclassificationand(II)thesimilarityoftheDOMhierarchiesconstructedbytheapproachtorealDOMhierarchies constructedbywebdevelopers.Fortheexperimentoftheapproach,thestudyconsistsof(I)asetof33,120labelledimagesof webGUIelementsextractedfrom620distinctwebsitescreencapturestoassesstheaccuracyofCNN-drivenwebGUIelement classifiers,(II)64additionalwebsitescreencaptures(excludedfromthedatasetusedtotrainandtesttheCNNimageclassifier) toserveaswebsitemock-updesignartifacts,and(III)twolatestapproachesonautomaticwebsitegeneration: Sketch2Code [14] and pix2code [2].
Table1showsthedatasetsutilized,brokendownbywebGUIelementclass.ThesuggestedCNNwastrainedonthetraining datasetandverifiedonthevalidationdataset.AlltheunseenwebGUIelementimagesinthetestingandvalidationdatasetswere collectedfromthescreencapturesofrealwebsiteblocksandweredistinctfromthoseinthetrainingdataset.Theeffectiveness oftheproposedCNN-drivenimageclassifierisevaluatedbymeasuringtheaveragetop-1classificationprecisionacrossallweb GUIelementclassesinthetestdataset,asshownbelow.
Here, True Positives are instances where the CNN image classifier correctly predictspositive web GUI element classes, whereas False Positives areinstanceswheretheclassifierincorrectlypredictsnegativewebGUIelementclasses. Section5 describestheclassificationcapabilitiesoftheCNN-drivenimageclassifier,whicharerepresentedinconfusionmatrixwith precisionacrossclasses.
Table 1: Distribution of Labelled Web GUI Element Images on Training, Validation and Testing Datasets
DOM Hierarchy Construction
ThesimilarityofDOMhierarchiesconstructedbytheapproachwasmeasuredbycomparingtoagroundtruthsetofDOM hierarchiesandasetofDOMhierarchiesconstructedbytwobaselinewebsitegenerationapproaches: Sketch2Code [14]and pix2code [2].Forthispartofthestudy,64websitescreencaptureswereselectedfrom620distinctscreencapturesofreal websitesbyrandomlyselectingfourwebsiteblocksfromeachcategorytoutilizeasinputstoourapproach, Sketch2Code,and pix2code approaches,fromwhichDOMhierarchiesweregenerated.To comparetheDOMhierarchiesconstructedbyeach approach, the runtime DOM hierarchies extracted dynamically from the input websites for each approach using Selenium WebDriver are compared to the set of ground truth DOM hierarchies extracted from the original websites. The Selenium WebDriver representation of the website reflects the automatically generated GUI-related source code for each approach
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
displayedatruntimeonthedevicescreen.ItcanbeusedtoaccuratelycomparetheDOMhierarchyconstructionofwebGUI elementsandcontainersforeachapproach.
TheruntimeDOM-hierarchiesofallgeneratedprototypewebsitesforeachapproacharecomparedtotheoriginal,ground truthruntimeDOM-hierarchies(extractedfrom Selenium WebDriver)bydeconstructingtheDOMtreesusingpre-ordertraversal andusingtheWagner-Fisheralgorithm[21]implementationof Levenshtein editdistancetocalculatethesimilaritybetweenthe DOMhierarchicalrepresentationsoftheruntimeGUIs.TheDOMhierarchiesaredeconstructedsuchthattheelementtypeand nestedorderofGUIelementsareincludedinthehierarchydeconstruction.Thepre-ordertraversalisimplementedinthiswayto avoidsmalldeviationsinotherattributesincludedinthe Selenium WebDriver information,suchaspixelvalues,giventhatthe maingoalofthisevaluationistomeasurehierarchicalsimilarities.Threetraditionaleditoperationssupportedby Levenshtein distance:insertion,deletion,andsubstitutionareconsideredinthemeasurementofeditdistance.Tomorecompletelymeasure the similarity of the DOM hierarchies of prototype websites to the ground truth DOM hierarchies, a weighting schema is introducedthatrepresentsapenaltyvalueforeachtypeofeditoperationrequiredtotransformoneGUIelementofaDOMinto another,wherein,inthedefaultcase,eacheditoperationcarriesanidenticalweightof1/3.
5.1 Results:
To evaluate the results produced by the CNN-driven image classifier, a confusion matrix is presented to illustrate the classificationprecisionacrossthe12mostpopularwebGUIelements.Table2showstheconfusionmatricesgeneratedbythe CNN-drivenimageclassifier.TheX-axisofaconfusionmatrixrepresentsthewebGUIelementclasspredictedbytheclassifier, andtheY-axisrepresentsitsactualwebGUIelementclass.Similarly,thefirstcolumnoftheconfusionmatricesrepresentsthe totalnumberofeachwebGUIelementinthetestdataset,whilethenumbersinthesubsequentcolumnscorrespondtothe percentageofeachclassontheY-axisthatwereclassifiedaswebGUIelementsontheX-axis.Thus,thediagonalofthematrix, highlightedingreen(seeTable2),correspondstothetrueclassification(i.e., True Positives)ofeachwebGUIelementclass, whereas False Positives areshownintheothercells.
ColumnHeadingAbbreviationsforWebGUIElements:Heading(H),Paragraph(P),Image(I),Hyperlink(HL),DropdownMenu(DM),Button(B),GoogleMap (GM),TextField(TF),TextArea(TA),Icon(IC),SearchBar(SB)andLabel(L).
TheCNN-drivenimageclassifierachieves89.8%ofthetop-1precisiononthepredictionofwebGUIelements,whichisthe overallprobabilityoftrueclassifications Therefore,itisexperimentallystatedthattheCNN-drivenimageclassifieremployedin theproposedapproachperformsmuchbetter
5.2 Results: DOM Hierarchy Construction
AnimportantstepintheDOMhierarchyconstructionprocessistheautomaticgenerationofaDOMhierarchy,enablinga well-structuredlayoutandthepropervisualizationofwebGUIelementsintoHTML5webGUIcontainers.Theevaluationofthe approach’sDOMhierarchyconstructioncomparesagainstthetwobaselinewebsitegenerationapproaches, Sketch2Code and pix2code,bydecomposingtheruntimeDOM-hierarchiesintotreesandmeasuringtheLevenshteineditdistancebetweenthe generatedtreesandtargettreesasdescribedinSection4.2.Byvaryingthepenaltyvaluesprescribedtoeacheditoperation,a morecomprehensiveunderstandingofthesimilarityofthegeneratedDOMhierarchiescanbegainedbyobserving,forinstance,
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
whethercertainDOMhierarchiesweremoreorlessshallowthantherealwebsites,byexaminingtheperformanceofinsertion anddeletionedits.
Figure4comparestheDOMhierarchiesgeneratedbyourapproach, Sketch2Code,and pix2code approachesbasedontreeedit distance(I–III),whichmeasurestheminimumcostofeditingoperationswhentransformingonetreeintoanother.Eachgraph illustrates results for a different edit operation, and the proposed approach, Sketch2Code, and pix2code approaches are representedbylinesinthreedifferentcolourswiththeeditdistanceshownontheY-axisandthepenaltyprescribedtotheedit operationontheX-axis.
Thegraphsshowthatthe Sketch2Code and pix2code approachestrendtoproducerelativelyshallowDOMhierarchies.
Figure 4: Similarity of DOM Hierarchies Constructed by WebDraw, pix2code and Sketch2code based on Levenshtein Edit Distances
Whenthepenaltyforinsertionoperationisincreased,ourapproachoutperformsboth Sketch2Code and pix2code.Thisisbecause ourapproachrequiresfewerinsertionoperationsintotheDOMhierarchiestomatchthetargetDOMhierarchiesprecisely. However, Sketch2Code and pix2code requireaddingmoreinnernodestotheDOMtreesbecausetheyproduceveryshallowDOM hierarchies.TheevaluationshowsthattheshallowDOMhierarchiesproducedbythe pix2code and Sketch2Code approaches matchthetargetDOMhierarchiesandourapproach'sDOMhierarchies.Whenrenderingawebsiteinawebbrowser,ashallow DOMhierarchyispreferred,andourapproachbuildsDOMhierarchiesthataresignificantlysimilartotargetDOMhierarchies andtendtobeslowerthan Sketch2Code and pix2code
AlthoughtheproposedapproachisaneffectivetoolforconstructingDOMhierarchies,ithassomepracticallimitations, someofwhichpresentfutureresearchopportunitiesinautomaticDOMhierarchyconstruction.Whenwebsitesarescreencapturedfromdesktoporlaptopcomputers,theiterativeDOMhierarchyconstructionalgorithmiscurrentlytiedtothewebsite’s appearanceonaparticularscreensize.Althoughthisalgorithm canbegeneralizedtoworkregardlessofscreensizeusing displaydensity-independentpixel(dp)values,thisisataskforthefuture.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
The current CNN-driven web GUI element classifier can classify incoming images into one of the top 12 web GUI elements.Thus,itcurrentlydoesnotsupportunusualwebGUIelementtypes.FutureresearchstudiescouldexamineCNN architectureswithgreaterclassificationcapacity(forinstance,CNNarchitecturesindeeplearningtoclassifymorewebGUI elementswithwebGUIcontainers),orresearcherscouldevenexaminenewlyemergingarchitecturessuchashierarchicalCNNs [22].TheDOMhierarchyconstructionapproachcurrentlyoperatesintwostepstodetectandclassifywebGUIelementsfroma websitemock-updesignartifact;however,futureresearchstudieswillinvestigatethepotentialofobjectdetectionclassifiersin CNN[23,24]thatcanperformtheseobjectdetectionandclassificationstepssimultaneously.
Theever-growingfieldofmachinelearninghasledresearcherstoasignificantstepforward:improvingUIdesignsto provideabetterfunctionaluserexperienceatalowercost.Withthediscoveryoftheapproach,researcherscanexploreobject detection-orientedCNNclassifiersinthefuturetoprovidebettersupportinthewebGUIelementsdetectionphaseandobject classification.Similarly,theapproachcanbedeployedasanautomaticwebsiteprototypegenerationtoolthatconvertsthe constructedDOMhierarchiesintofunctionalHTML/CSScode.
ThenovelautomaticDOMhierarchyconstructionapproachpresentedinthispaperwillbenefitwebUI/UXdevelopers byallowingthemtospeeduptheiterativewebUIprototypingprocessintheirworkflowbyenablingthegenerationofautomatic DOMforawebsiteformatoftheirchoice.Therefore,theinventiveapproachmeetstheneedtointroduceanovelapproachtothe automaticDOMhierarchyconstructionstrategyintheautomaticwebUIprototypingconcepttoreducetheburdenandwasteof timeintheiterativewebUIprototypingprocess.
Thisresearchreceivednospecificgrantfromfundingagenciesinthepublic,commercial,ornot-for-profitsectors.
[1] Kaluarachchi,T.,&Wickramasinghe,M.(2023). A systematic literature review on automatic website generation.Journalof ComputerLanguages,Volume75,2023,101202,ISSN2590-1184,https://doi.org/10.1016/j.cola.2023.101202
[2] Beltramelli,T.(2018,June). pix2code: Generating code from a graphical user interface screenshot.InProceedingsoftheACM SIGCHISymposiumonEngineeringInteractiveComputingSystems(pp.1-6).
[3] Suleri, S., Sermuga Pandian, V. P., Shishkovets, S., & Jarke, M. (2019, May). Eve: A sketch-based software prototyping workbench.InExtendedAbstractsofthe2019CHIConferenceonHumanFactorsinComputingSystems(pp.1-6).
[4] Moran,K.,Bernal-Cárdenas,C.,Curcio,M.,Bonett,R., &Poshyvanyk,D.(2018). Machine learning-based prototyping of graphical user interfaces for mobile apps.IEEETransactionsonSoftwareEngineering,46(2),196-221.
[5] Ozkaya, Are DevOps and automation our next silver bullet? IEEESoftw.36(4)(2019)3–95.
[6] Robinson,A.(2019). Sketch2code: Generating a website from a paper mock-up.arXivpreprintarXiv:1905.13750.
[7] T.A.NguyenandC.Csallner, Reverse Engineering Mobile Application User Interfaces with REMAUI (T).201530thIEEE/ACM International Conference on Automated Software Engineering (ASE), Lincoln, NE, USA, 2015, pp. 248-259, DOI: 10.1109/ASE.2015.32.
[8] Hassan,S.,Arya,M.,Bhardwaj,U.,&Kole,S.(2018). Extraction and Classification of User Interface Components from an Image. InternationalJournalofPureandAppliedMathematics,118(24),1-16.
[9] Aşıroğlu,B.,Mete,B.R.,Yıldız,E.,Nalçakan,Y.,Sezen,A.,Dağtekin,M.,&Ensari,T.(2019,April). Automatic HTML code generation from mock-up images using machine learning techniques.In2019ScientificMeetingonElectrical-Electronics& BiomedicalEngineeringandComputerScience(EBBT)(pp.1-4).IEEE.
[10]Kim, B., Park, S., Won, T., Heo, J., & Kim, B. (2018, October). Deep-learning based web UI automatic programming. In Proceedingsofthe2018ConferenceonResearchinAdaptiveandConvergentSystems(pp.64-65).
[11]J.Ferreira,A.Restivo,H.S.Ferreira, Automatically Generating Websites from Hand-Drawn Mock-Ups,InProceedingsofthe 16thInternationalJointConferenceonComputerVision,ImagingandComputerGraphicsTheoryandApplications,2021.
[12]Yun,Y.S.,Jung,J.,Eun,S.,So,S.S.,&Heo,J.(2019). Detection of GUI elements on sketch images using object detector based on deep neural networks. InProceedingsoftheSixthInternationalConferenceonGreenandHumanInformationTechnology: ICGHIT2018(pp.86-90).SpringerSingapore.
International Research Journal of Engineering and Technology (IRJET) e-ISSN: 2395-0056
Volume: 11 Issue: 12 | Dec 2024 www.irjet.net p-ISSN: 2395-0072
[13]Kaluarachchi,T.T.,Kumara,H.M.M.S.D.,&Wickramasinghe,M.I.E.(2020). Turning GUI into HTML source code with Image processing and Deep learning.PGISInternationalResearchCongress(RESCON)2020,FacultyofScience,University ofPeradeniya,SriLanka.
[14]Sketch2Code-MicrosoftAIlabs,2019, https://sketch2code.azurewebsites.net/
[15]ChunyangChen,TingSu,GuozhuMeng,ZhenchangXing,YangLiu, From UI Design Image to GUI Skeleton: A Neural Machine Translator to Bootstrap Mobile GUI Implementation,in:Proceedingsofthe40thInternationalConferenceonSoftware Engineering,ICSE’18,ACM,NewYork,NY,USA,2018,pp.665–676.
[16]M.D.ZeilerandR.Fergus. Visualizing and Understanding Convolutional Networks.Cham:SpringerInternationalPublishing, 2014,pp.818–833.[Online].Available: https://doi.org/10.1007/978-3-319-10590-1_53
[17]A.Krizhevsky,I.Sutskever,andG.E.Hinton. Imagenet Classification with Deep Convolutional Neural Networks.InAdvances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates,Inc.,2012,pp.1097–1105.[Online].Available:http://papers.nips.cc/paper/4824-imagenet-classification-withdeep-convolutional-neural-networks.pdf
[18]MATLABNeuralNetworkToolbox:https://www.mathworks.com/products/neural-network.html
[19]MATLABTesseractOCRLibrary:https://www.mathworks.com/products/computer-vision.html
[20]MATLABParallelComputingToolbox:https://www.mathworks.com/products/parallel-computing.html
[21]Wagner-Fischeralgorithm:https://en.wikipedia.org/wiki/Wagner-Fischer_algorithm
[22]R.Girshick,J.Donahue,T.Darrell,andJ.Malik,2014. Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation.InProceedingsofthe2014IEEEConferenceonComputerVisionandPatternRecognition,ser.CVPR’14. Washington,DC,USA:IEEEComputerSociety,pp.580–587.
[23]S.Ren, K. He, R.Girshick,andJ. Sun,2015.FasterR-CNN:TowardsReal-Time ObjectDetection with Region Proposal Networks.InProceedingsofthe28thInternationalConferenceonNeuralInformationProcessingSystems,ser.NIPS’15. Cambridge,MA,USA:MITPress,pp.91–99.[Online].Available:http://dl.acm.org/citation.cfm?id=2969239.2969250
[24]Gu,Jiuxiang,ZhenhuaWang,JasonKuen,LianyangMa,AmirShahroudy,BingShuai,TingLiuetal.,2018. Recent Advances in Convolutional Neural Networks.Patternrecognition,77,354-377.
BIOGRAPHIES
ThisaranieKaluarachchireceivedaB.Sc.degreeinComputerSciencefromtheFacultyofScienceatthe UniversityofPeradeniyainKandy,SriLanka.SheispursuingherPhDinComputingattheUniversityof ColomboSchoolofComputinginColombo,SriLanka.HerresearchinterestsincludeArtificialIntelligence, MachineLearning,DeepLearning,andComputerVision.