CLEAR Journal September 2018 Edition



CLEAR Journal (Computational Linguistics in Engineering And Research)
M.Tech Computational Linguistics, Dept. of Computer Science and Engineering, Govt. Engineering College, Sreekrishnapuram, Palakkad 678633
www.simplegroups.in | simplequest.in@gmail.com

Chief Editor: Shine S, Assistant Professor, Dept. of Computer Science and Engineering, Govt. Engineering College, Sreekrishnapuram, Palakkad 678633
Editors: Aiswarya K Surendran, Bhavya K, Minimol M, Pavithra C P, Sreeja V
Cover page and Layout: Zeenath M T

Contents

Editorial
Why Did Artificial Intelligence Fail in Predicting the 2018 FIFA World Cup Result? (Resmi P)
Artificial Intelligence to Predict Alien Life on Other Planets (Amrutha C)
Text Based Multi-Emotion Extraction Using Linguistic Analysis (Divya Visakh)
Alias Links Identification from Narratives (Kavya T S)
Face Detect – Track System: For Criminal Detection (Manjusha K)
Invitation
Last word



Dear Readers,
Here is the latest edition of CLEAR Journal. It comes with new articles that discuss, among other things, why AI made the wrong prediction for the 2018 FIFA World Cup, how AI can help predict alien life on other planets, text-based multi-emotion extraction, identification of alias links from narratives, and face detection and recognition for criminal detection. We are very happy to be gaining new readers, and that gives us great motivation to keep improving. As always, we make improvements based on your valuable feedback, and we look forward to more of it. On this hopeful note, I proudly present this edition of CLEAR Journal to our faithful readers and await your opinions and criticisms.

Best Regards, Shine S (Chief Editor)



Placement
• Rahul M of M.Tech Computational Linguistics, 2016-18 batch, got placed as an Associate Software Engineer.
• Sandhini S of M.Tech Computational Linguistics, 2016-18 batch, got placed as a Fellow at ICFOSS, Trivandrum.
• Uma E S of M.Tech Computational Linguistics, 2016-18 batch, started working as Assistant Professor at Cochin College of Engineering and Technology.

Internship
Anisha T S, Bhavya K, Pradeep T and Sandeep Nithyanandan of M.Tech Computational Linguistics, 2017-19 batch, got internships at Lymbyc, Bangalore.

Workshop on Python
A one-day workshop on Python was organised at GEC Sreekrishnapuram on 14th September 2018 by the second-year M.Tech Computational Linguistics students. The workshop was conducted in the M.Tech Computational Linguistics Lab for the M.Tech 2018-2020 batch, and the basics of Python were introduced to the students using many example programs.

VFS Talk by Gautham Anil
A VFS talk on 'Introduction to Data Science in Practice' was given to M.Tech and B.Tech CSE students on 03/08/2018 by Dr. Gautham Anil, Principal Architect and Data Scientist at 4INFO, San Francisco.

Content Developers
Anisha T S, Bhavya K and Pradeep T of M.Tech Computational Linguistics, 2017-2019 batch, got selected as Content Developers for a Machine Learning course offered by ASAP (Additional Skill Acquisition Programme) in association with IIT Palakkad, IIT Madras and CET Trivandrum.

Simple Groups Congratulates All for their Achievements!!



Why Did Artificial Intelligence Fail in Predicting the 2018 FIFA World Cup Result?
Resmi P
M.Tech Computational Linguistics, Government Engineering College, Palakkad
reshma.resmi81@gmail.com

The FIFA World Cup 2018 (Russia) ended on Sunday, July 15th, with France as the champion, followed by Croatia and Belgium. Just as for the previous World Cup in 2014, many researchers tried to predict the outcome of the tournament in Russia in advance: researchers and scientists exploited Artificial Intelligence (AI) and statistics to predict the outcomes of all 64 matches of the World Cup. AI has made a lot of noise recently and is regarded as the technology of the future; nowadays it is becoming a part of every large and medium-sized business. But how reliable is it? In this article, I discuss the performance of AI in predicting the results of the World Cup 2018 as a sample use case. Whether you are an expert in AI or not, I try to keep this article as simple and understandable as possible. There are different approaches to predicting the results of the FIFA World Cup. One approach is to simulate every single match as a paired comparison in terms of the teams' capabilities and winning odds. Zeileis, Leitner, and Hornik (2018) used this technique and predicted that Brazil would win the FIFA World Cup 2018 with a probability of 16.6%, followed by Germany (15.8%) and Spain (12.5%) [1].


Swiss bank UBS also predicted the same three teams as the top 3, but in a different order: Germany (24.0%) as the champion, followed by Brazil (19.8%) and Spain (16.1%). Their model was based on four factors: 1) the Elo rating; 2) the teams' performances in the qualifications preceding the World Cup; 3) the teams' success in previous World Cup tournaments; and 4) home advantage. The model was calibrated on the results of the last five tournaments and run through 10,000 Monte Carlo simulations to determine the winning probabilities [2]. On June 8, 2018, four researchers (A. Groll et al.) from Technical University of Dortmund (Germany), Ghent University (Belgium), and Technical University of Munich (Germany) published a paper on arXiv predicting the results of the FIFA World Cup 2018 using a well-known Artificial Intelligence algorithm, Random Forest, together with a Poisson ranking approach [3]. The paper was published online days before the opening game of the World Cup between Russia and Saudi Arabia on June 14. They used a dataset covering all matches of the last four FIFA World Cups (2002–2014) and predicted Spain as the champion, with Germany and Brazil as runners-up.


These three research efforts came up with the same top 3 teams (Spain, Germany, and Brazil) in different orders. They used three different methods, datasets and data features, yet arrived at almost the same result. Now that the World Cup is over, we can see that all of those models failed to predict the results correctly; none of the predictions came true. This is despite the fact that they used good data sources, considered many features and parameters for training, and employed the Random Forest algorithm. In the rest of this article, I discuss the data features, the error, and the reasons for the failure in this area.

Data Features
A. Groll et al. considered various features related to each team: 1) economic factors (GDP per capita, population); 2) sportive factors (ODDSET probability, FIFA ranking); 3) home advantage (host, continent, confederation); 4) team structure factors (maximum number of teammates for each squad, average age, number of Champions League players, number of legionnaires); and 5) team coach factors (age, duration of tenure, nationality). In total, they had 16 features for each team and each World Cup.

Prediction
After running 100,000 tournament simulations, Spain was predicted to be the champion, reaching the final with a 28.9% chance, followed by Germany (26.3%) and Brazil (21.9%).

Error
As we observed in the FIFA World Cup 2018, neither of the predicted top 2 teams reached the quarter-finals, let alone the final (Brazil, predicted third, did reach the quarter-finals). Based on the actual results of the World Cup and the predictions, the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) of the model are calculated as: RMSE = 8.052, MAE = 6.468.
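As a quick illustration of how these two error measures are computed (with made-up numbers, not the study's data), a few lines of Python suffice:

```python
import numpy as np

# Hypothetical example: predicted vs. actual outcomes for a handful of teams
# (illustrative numbers only, not the paper's data).
predicted = np.array([1, 2, 3, 4, 5])   # e.g. predicted finishing positions
actual = np.array([10, 12, 5, 2, 1])    # where the teams actually finished

errors = predicted - actual
rmse = np.sqrt(np.mean(errors ** 2))    # Root Mean Square Error
mae = np.mean(np.abs(errors))           # Mean Absolute Error
print(f"RMSE = {rmse:.3f}, MAE = {mae:.3f}")
```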

Classification Model
As mentioned earlier, they used Random Forest, one of the well-known algorithms in Machine Learning. It builds on decision trees and has shown high performance in data classification in many cases. They also used Poisson models to rank the teams based on their current abilities.
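The sketch below shows, in broad strokes, how team-level features of this kind could be fed to a random forest. The feature matrix, target variable and parameters are hypothetical placeholders, not the authors' actual setup.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Hypothetical training data: one row per team per past World Cup, with
# columns loosely mirroring the 16 features listed above (GDP per capita,
# FIFA ranking, average age, ...). The target here is goals scored per match.
rng = np.random.default_rng(0)
X_train = rng.normal(size=(128, 16))
y_train = rng.poisson(lam=1.3, size=128).astype(float)

model = RandomForestRegressor(n_estimators=500, random_state=0)
model.fit(X_train, y_train)

# Predict the expected scoring strength of a hypothetical 2018 squad,
# which could then feed a match or tournament simulation.
X_2018 = rng.normal(size=(1, 16))
print(model.predict(X_2018))
```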


Fig: Predicted results of the FIFA World Cup 2018 by the Random Forest algorithm. Source: [3]


Why did AI fail?
In machine learning (supervised learning in this case), it is very important to have proper data for training and modelling. But in this case, despite having proper data (16 cleaned features), relatively large data (the past four World Cups), and good algorithms with the right parameters, the trained model failed terribly. The reason for this failure lies in the nature of what we are predicting. The FIFA World Cup, like any other human-centred event, depends on far more factors (not just 16) before and during the match (at least 90 minutes of it), which are known as confounding variables. To predict the results correctly, every single minute of each match would have to be simulated. The result of each state of the match (every minute or second) depends on the preceding states; this is known as a Markov chain process. An incorrectly simulated state can easily produce unreliable outcomes for the succeeding states of the game. Besides the internal factors, the result of a football match can also be significantly influenced by external factors, such as an unfair referee, the weather, the political situation, or even the personal problems of players. These important features are usually very difficult to measure and collect. In addition, there is always some element of chance and uncertainty, for instance a critical mistake or an own goal, which is not easily predictable. In a nutshell, stochastic and dynamic environments such as the FIFA World Cup, and human activities in general, are areas where today's AI technology cannot perform very well. This is a very good example reminding us to be very careful about the applicability of AI in similarly dynamic fields. Also, with a very complex data structure, it may be very difficult to audit the trained models for potential bias, and bias in AI can easily lead to discriminative decisions against a particular group. Deploying such systems as the sole decision maker may cause huge problems for both individuals and companies. Governments and companies are advised to use AI in stochastic and dynamic environments only as a supplementary decision-making platform.
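To see how small per-state errors compound in such a chain, here is a toy simulation; the per-minute scoring rates are invented purely for illustration and have nothing to do with the published models.

```python
import numpy as np

# Illustrative only: treat a match as 90 per-minute transitions, each with a
# small chance of a goal for either side. A modest error in one rate shifts
# the simulated win probability noticeably over the whole chain of states.
rng = np.random.default_rng(42)

def home_win_probability(p_home_goal, p_away_goal, minutes=90, n_sims=10000):
    """Estimate the home win probability under a simple per-minute model."""
    goals_home = rng.binomial(minutes, p_home_goal, size=n_sims)
    goals_away = rng.binomial(minutes, p_away_goal, size=n_sims)
    return np.mean(goals_home > goals_away)

true_p = home_win_probability(0.016, 0.014)           # "true" per-minute rates
biased_p = home_win_probability(0.016 * 1.1, 0.014)   # 10% error in one rate
print(true_p, biased_p)  # the small per-state error changes the outcome estimate
```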

References
[1] Zeileis, A., C. Leitner, and K. Hornik (2018): "Probabilistic forecasts for the 2018 FIFA World Cup based on the bookmaker consensus model," Working Paper 2018-09, Working Papers in Economics and Statistics, Research Platform Empirical and Experimental Economics, Universität Innsbruck.
[2] Audran, J., M. Bolliger, T. Kolb, J. Mariscal, and Q. Pilloud (2018): "Investing and football. Special edition: 2018 World Cup in Russia," Working paper, UBS.
[3] Groll, A., C. Ley, G. Schauberger, and H. Van Eetvelde (2018): "Prediction of the FIFA World Cup 2018: A random forest approach with an emphasis on estimated team ability parameters," Working Paper.


Artificial Intelligence to Help Predict Alien Life on Other Planets
Amrutha C
M.Tech Computational Linguistics, Government Engineering College, Palakkad
amruthac17@gmail.com

Artificial intelligence could soon help scientists determine whether other planets harbour alien life. Researchers from Plymouth University’s Centre for Robotics and Neural Systems used artificial neural networks (ANNs), which use similar learning techniques to the human brain, in order to estimate the probability of extraterrestrial life on other worlds. They hope that the technology will be used aboard robotic spacecraft on alien-hunting space missions. “We’re currently interested in these ANNs for prioritising exploration for a hypothetical, intelligent, interstellar spacecraft scanning an exoplanet system at range,” said Christopher Bishop, a PhD student at Plymouth University who led the study.

They are also looking at the use of large-area, deployable, planar Fresnel antennas to get data back to Earth from an interstellar probe at large distances; this would be needed if the technology is used in robotic spacecraft in the future. The AI system works by classifying planets into five different types, determined by whether they are most similar to modern-day Earth, early Earth, Mars, Venus or Saturn's moon Titan. Once a planet has been classified, the neural network uses a "probability of life" metric based on the profile of the five target types. Upcoming space missions that could make use of the technology include NASA's exoplanet-finding TESS spacecraft and the European Space Agency's Ariel Space Mission. Both missions will gather vast amounts of data, which Plymouth University's ANNs could analyse for signs of life. "Given the results so far, this method may prove to be extremely useful for categorising different types of exoplanets using results from ground-based and near Earth observatories," said Professor Angelo Cangelosi from the university's faculty of science and engineering, who supervised the study.
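As a rough sketch of the kind of classifier involved (not the Plymouth team's actual network), a small multi-layer perceptron can map hypothetical planetary feature vectors to the five target types and expose class probabilities that could feed a probability-of-life metric:

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

CLASSES = ["modern Earth", "early Earth", "Mars", "Venus", "Titan"]

# Hypothetical training set: each row is a vector of observed properties
# (e.g. atmospheric composition bands); labels are indices of the five types.
rng = np.random.default_rng(1)
X_train = rng.normal(size=(500, 20))
y_train = rng.integers(0, len(CLASSES), size=500)

net = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=500, random_state=1)
net.fit(X_train, y_train)

# Class probabilities for a new observation, which could be mapped to a
# life-likelihood score per target type.
probs = net.predict_proba(rng.normal(size=(1, 20)))[0]
for name, p in zip(CLASSES, probs):
    print(f"{name}: {p:.2f}")
```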

Inference at the Edge: Edge Computing takes compute closer to the applications. Each edge location mimics the public cloud by exposing a compatible set of services and endpoints that the applications can consume. It is all set to redefine enterprise infrastructure.

Hybrid Learning Models: An approach that combines different types of deep neural networks with probabilistic approaches to model uncertainty.

Deeper Personalization: Personalization can be great, but it can also be equally annoying. We have all experienced recommendations that seem to bear no actual relation to anything we may be interested in. In the future, users will likely receive more precise recommendations, and adverts will become both more effective and more accurate. The user experience will vastly improve for all.

References
[1] https://www.theweek.in/news/sci-tech/2018/04/04/artificial-intelligence-may-help-predict-alien-life-study.html
[2] https://www.sciencedaily.com/releases/2018/04/180404093914.html

Cognitive Services: This technology includes kits such as APIs and services through which developers can create more discoverable and intelligent applications. Machine learning APIs will allow developers to introduce intelligent features such as emotion detection; speech, facial, and vision recognition; and language and speech understanding into their applications.



Text Based Multi-Emotion Extraction Using Linguistic Analysis
Divya Visakh
M.Tech Computational Linguistics, Government Engineering College, Palakkad
dvisakh19@gmail.com

Emotions are the mental states through which human beings spontaneously express their feelings during communication. They are mainly conveyed through speech, facial expressions, gestures, grammar usage and written text. Nowadays people often choose text as the interface for expressing emotions, so emotion extraction from text has attracted a lot of attention. In this system, emotions are classified with the help of the Plutchik model, a categorical model of 8 basic bipolar emotions that extends the 6 basic emotions of the Ekman model. The eight emotions are JOY, TRUST, FEAR, SURPRISE, SADNESS, DISGUST, ANGER and ANTICIPATION. A text can contain more than one emotion, of which only one is the dominant emotion of the text. The method therefore models emotion extraction as a multi-label classification problem: it removes the fixed boundaries of the text, finds all the emotions present, and determines the predominant emotion. The proposed method aims to recognize all the existing emotions in the text and determine the sentence's predominant emotion. This is achieved by utilizing the structural and semantic information in the sentence, linguistic data and machine learning techniques. Multi-label classification of emotions follows these steps:
• Sentence dependency tree construction to analyze its semantics.
• Sentence segmentation into separate parts based on conjunction words.
• For each part:
  o Initial emotion determination using a machine learning method.
  o Ultimate emotion determination using a rule-based method.
• Emotion combination according to defined rules and determination of the sentence's final emotion.
In the first phase, the main idea is to extract all the emotions in the sentence. With the help of a natural language tool (OpenNLP), the structural and semantic information of the sentence is obtained: the grammatical role of each word is determined, and from that a dependency tree is built to show the relationships between the words of the sentence. The sentence segmentation phase then makes it easy to identify the subject of each part of the sentence and helps answer the question: "Do the different parts of the sentence have the same subject or not?"


From this question, it becomes possible to tell whether the different parts of the sentence carry the same emotion or not. In the second phase, rules are defined based on the conjunction words present in the sentence. In any language's grammar, the different parts of a sentence are mainly linked to each other by certain words, called conjunction words. Once rules are defined for them, the conjunction words help determine the type of emotion conveyed by the sentence. For example, "It was a memorable day but I was not happy" has two sections ("it was a memorable day" and "I was not happy"). The first section conveys a positive emotion and the second implies a negative emotion; because of the conjunction word 'but', the whole sentence ends in a negative emotion. Some rules defined for the conjunction words, based on the language structure, are listed below (a short code sketch of these rules follows the list):
• Rule 1: If the conjunction word is 'but', the two parts have conflicting emotions and the dominant emotion is in the part after 'but'.
• Rule 2: If the conjunction word is 'as', the two parts have conflicting emotions and the dominant emotion is in the part after 'as'.
• Rule 3: If the conjunction word is 'and', then if the two parts have the same subject they carry similar emotions; otherwise they have conflicting emotions and both parts have the same weight.
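A minimal sketch of these conjunction rules, assuming each segment already carries a provisional emotion label; the segment texts, labels and helper function are hypothetical, not the paper's implementation:

```python
# Hypothetical segment emotions produced by the earlier classification step.
# Each entry: (segment_text, emotion_label)
segments = [("it was a memorable day", "joy"), ("i was not happy", "sadness")]
conjunction = "but"  # conjunction word linking the two segments

def combine_emotions(segments, conjunction, same_subject=True):
    """Apply the simple conjunction rules to pick the sentence-level emotion."""
    first, second = segments[0][1], segments[1][1]
    if conjunction in ("but", "as"):
        # Rule 1 / Rule 2: conflicting emotions, dominant emotion after the conjunction.
        return second
    if conjunction == "and":
        # Rule 3: same subject -> similar emotion; otherwise both parts weigh equally.
        return first if same_subject else (first, second)
    return first

print(combine_emotions(segments, conjunction))  # -> "sadness"
```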


In the third stage, the initial and ultimate emotions are determined. The initial emotions are determined with a machine learning technique, and the ultimate emotion is obtained with a rule-based method. To recognize the existing emotions in the sentence, the sentence is divided into sections based on conjunction words (if any are present). Each part of the sentence is then processed separately and the emotion present in each section is extracted.

The Initial Emotion Determination
To determine the initial emotion, each segment is transferred to the vector space. To do this, the stop words are removed using an English stop-word list and the stems of the words are extracted with the Porter stemmer. In the step of creating the feature vector, a sparseness problem arises; to reduce it, WordNet synsets are used: for the main words present in the sentence, their synsets are searched, and words covered by an existing synset are replaced by the corresponding main word. Transition Point (TP) and Term Frequency-Inverse Document Frequency (TF-IDF) are the two weighting models used to build the feature vectors.


Each section is thus represented by two feature vectors, one per weighting scheme, and the emotional class of each part is then determined using the Support Vector Machine (SVM) technique.
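A minimal sketch of this initial-emotion step, using stop-word removal, Porter stemming, TF-IDF features and an SVM; the tiny training set and labels are invented stand-ins for the annotated data, and the Transition Point weighting is omitted:

```python
from nltk.stem import PorterStemmer
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.svm import SVC

stemmer = PorterStemmer()

def preprocess(segment):
    # Stem each token; TfidfVectorizer's built-in English stop list drops stop words.
    return " ".join(stemmer.stem(tok) for tok in segment.lower().split())

# Hypothetical labelled segments standing in for the annotated corpus.
train_segments = ["it was a memorable day", "i was not happy",
                  "what a frightening night", "i trust my friend completely"]
train_labels = ["joy", "sadness", "fear", "trust"]

vectorizer = TfidfVectorizer(stop_words="english")
X_train = vectorizer.fit_transform(preprocess(s) for s in train_segments)

clf = SVC(kernel="linear", probability=True)
clf.fit(X_train, train_labels)

X_new = vectorizer.transform([preprocess("it was such a frightening day")])
print(clf.predict(X_new))  # predicted initial emotion for the new segment
```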

Determine the Ultimate Emotion
To determine the ultimate emotion of a part, linguistic information is used. In every language, some words can change the intensity or the type of the emotion; in English, for example, words like 'very', 'quite', 'not' and 'no' can change the sentence's ultimate emotion. Two lists are therefore prepared: an intensity-changing list and a type-changing list. For each part, the presence or absence of such words is checked. If a word from the type-changing list is present, the final emotion changes to the conflicting emotion of the determined initial emotion; if a word from the intensity-changing list is present, the corresponding intensity is added to the determined initial emotion and the result is returned as the ultimate emotion; and if words from both lists are present, priority is given to the intensity-changing word. The fourth phase combines the emotions according to the defined rules and determines the sentence's ultimate emotion. The experiments were carried out on a multi-label dataset containing 629 sentences with eight emotional categories. Compared with existing multi-label learning methods (BR, RAkEL, ML-kNN), the proposed method shows better performance. For future work, a large, suitable, annotated database can be created and used for multi-labelling tasks; fuzzy rules could also be implemented to extract emotions, and the use of a semantic graph could provide more accuracy and better results. The overall methodology extracts the dominant emotion from multi-emotion text; the use of the language's semantic and structural information is very effective and shows better performance than the compared multi-label algorithms.
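The type/intensity rules can be sketched as below; the word lists, opposite-emotion table and intensity values are hypothetical examples, not the resources used in the paper:

```python
# Hypothetical word lists and emotion opposites, illustrating the rule-based
# "ultimate emotion" step described above.
INTENSITY_WORDS = {"very": 1, "quite": 1, "extremely": 2}
TYPE_CHANGING_WORDS = {"not", "no", "never"}
OPPOSITE = {"joy": "sadness", "sadness": "joy", "trust": "disgust", "fear": "anger"}

def ultimate_emotion(tokens, initial_emotion, initial_intensity=1):
    """Apply the type/intensity rules to a segment's initial emotion."""
    intensity_hits = [INTENSITY_WORDS[t] for t in tokens if t in INTENSITY_WORDS]
    type_hit = any(t in TYPE_CHANGING_WORDS for t in tokens)
    if intensity_hits:  # intensity-changing words take priority over type changers
        return initial_emotion, initial_intensity + sum(intensity_hits)
    if type_hit:        # flip to the conflicting emotion
        return OPPOSITE.get(initial_emotion, initial_emotion), initial_intensity
    return initial_emotion, initial_intensity

print(ultimate_emotion("i was not happy".split(), "joy"))  # -> ('sadness', 1)
```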

References
[1] "Recognizing Emotions and Sentiments in Text", SRJSCI, accessed online on 14 September 2018.
[2] "A Learning Based Emotion Classifier with Semantic Text Processing", Springer, accessed online on September 2018.
[3] "Emotion Extraction Using Rule-based and SVM-KNN Algorithm", Semantic Scholar.


Alias Links Identification from Narratives
Kavya T S
M.Tech Computational Linguistics, Government Engineering College, Sreekrishnapuram
kavya.thripura@gmail.com

A narrative includes multiple names, objects, places and so on; these are referred to as participants (entities of interest) in the narrative. Identification of distinct and independent participants in a narrative is an important task for many NLP applications like timeline creation, question answering, summarization, and information extraction. The task becomes challenging because these participants are often referred to using multiple aliases. A new approach based on a Markov Logic Network (MLN) has been adopted to encode the linguistic knowledge needed to identify aliases. The system works on the output of an off-the-shelf coreference resolution system rather than identifying aliases/coreferences from scratch. This helps exploit the strengths of existing systems (such as linking pronoun mentions to their antecedents) and overcome their weaknesses (such as resolving generic NP mentions) by incorporating additional linguistic knowledge. The technique has three broad phases:
• Identification of participants
• MLN-based formulation to identify aliases
• Composite mention creation


It uses a Unified Linguistic Denotation Graph (ULDG) representation of NLP-processed sentences in the input narrative. The ULDG unifies the output from various stages of the NLP pipeline, such as dependency parsing, NER and coreference resolution.

Figure 1: Input ULDG initialized with NER + coreference. In the figure, alias edges (Ea) are shown using dotted lines; participant edges (Ep) are shown using thick arrows; dependency edges (Ed) are shown using thin labeled arrows. New Ea edges <man, Bonaparte>, <man, him> and <man, His> are added. Newly added Ep edges are highlighted with thick, filled arrows. The participant types of man and school are changed to PER and ORG respectively; the type of France is changed to OTH.

Phase-I: In this phase, the participant type of the headword h of a generic NP is updated if its WordNet hypernyms contain PER/ORG/LOC-indicating synsets. New Ep edges are also added from h to the dependent nodes of h connected through the dependency relations compound, amod or det, to obtain the corresponding mention boundaries. The phase also ensures that the participant types of all nodes in a single clique in Ea are the same, giving higher priority to the NER-induced type over the WordNet-induced type.

Phase-II: In this phase, linguistic rules are encoded in an MLN to add new Ea edges. MLN gives the benefits of (i) the ability to employ soft constraints, (ii) compact representation, and (iii) ease of specification of domain knowledge. The predicates and key first-order logic rules are described in Table 1. Alias(x, y) is the only query predicate; the others are evidence predicates, whose observed groundings are specified using G.

Table 1: MLN predicates and rules
Hard rules:
• Alias(x, x); Alias(x, y) => Alias(y, x) : reflexivity and symmetry of aliases
• Alias(x, y) ^ Alias(y, z) => Alias(x, z) : transitivity of aliases
• Alias(x, y) ^ ¬Alias(y, z) => ¬Alias(x, z) : transitivity of aliases
• Alias(x, y) => (NEType(x, z) <=> NEType(y, z)) : if x and y are aliases, their entity types should be the same
Soft rule:
• Conj(x, y) => ¬Alias(x, y) : if x and y are conjuncts, they are less likely to be aliases

As the system uses a combination of hard rules (rules with infinite weight) and soft rules (rules with finite weights), probabilistic inference in the MLN is needed to find the most likely groundings of the query predicate Alias(x, y). Since the goal is to minimize supervision and avoid dependence on annotated data, the current version relies on domain knowledge to set the MLN rule weights.

Figure 2: Output ULDG after applying the algorithm on the input ULDG in Figure 1.
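As a rough illustration of what the hard symmetry and transitivity rules enforce (the actual system performs probabilistic MLN inference over weighted rules, which this does not reproduce), the deterministic closure can be pictured as union-find over mentions; the mention names below are taken loosely from Figure 1:

```python
# Illustrative union-find over mentions: applying the hard symmetry/transitivity
# rules deterministically merges mentions into cliques. The real system scores
# Alias(x, y) groundings with probabilistic MLN inference.
parent = {}

def find(x):
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path compression
        x = parent[x]
    return x

def add_alias(x, y):
    # Alias(x, y) together with symmetry and transitivity puts x and y
    # in the same clique.
    parent[find(x)] = find(y)

# Hypothetical alias evidence taken loosely from the narrative in Figure 1.
for a, b in [("man", "Bonaparte"), ("man", "him"), ("Bonaparte", "His")]:
    add_alias(a, b)

cliques = {}
for mention in parent:
    cliques.setdefault(find(mention), set()).add(mention)
print(list(cliques.values()))  # one clique of aliases per participant
```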


Phase-III: In this phase, an auxiliary subgraph G'(V', E') ⊆ G is extracted, where V' contains only those nodes which correspond to headwords of basic participant mentions, and E' contains only those edges which are incident on nodes in V' and labeled with appos or nmod. Each independent participant mention in G' is identified and its dependent mentions are merged using depth-first search (DFS) on G'. Finally, each clique in Ea represents the aliases of a unique participant, and the earliest non-pronoun mention in text order is used as the canonical mention for that clique.

References
[1] "Identification of Alias Links among Participants in Narratives", https://aclanthology.info/papers/P18-2011/p18-2011, Accessed online on 11 September 2018.
[2] "Universal Stanford Dependencies: A Cross-Linguistic Typology", https://aclanthology.coli.uni-saarland.de/papers/L14-1045/l14-1045, Accessed online on 11 September 2018.
[3] "Alias Detection in Link Data Sets", https://www.cs.cmu.edu/~neill/papers/hsiung-alias.pdf, Accessed online on 11 September 2018.
[4] "Markov Logic Networks for Text Mining: A Qualitative and Empirical Comparison with Integer Linear Programming", http://www.lrec-conf.org/proceedings/lrec2016/pdf/993_Paper.pdf, Accessed online on 11 September 2018.


Smarter Chatbots

Powered by smarter AI, chatbots are now being deployed by companies to handle customer queries and deliver more personalized interactions while eliminating the need for human personnel. Big data has a lot to do with delivering a more pleasant customer experience, as bots process large amounts of data to provide relevant answers based on the keywords customers enter in their queries. During interactions, they are also able to collect and analyze information about customers from conversations, which can help marketers develop a more streamlined strategy and achieve better conversions.

Rapidly Growing IoT Networks

It is becoming quite common for our smartphones to be used to control our home appliances, thanks to the Internet of Things (IoT). With smart assistants such as Google Assistant and Microsoft Cortana trending in homes to automate specific tasks, the growing IoT craze is drawing companies to invest in the technology's development.


Face Detect – Track System: For Criminal Detection
Manjusha K
M.Tech Computational Linguistics, Government Engineering College, Sreekrishnapuram
manjushasreepadmam@gmail.com

Face detection and recognition means that a system is able to identify that a human face is present in an image or video and to identify that person. This article describes a criminal detection framework that could help policemen recognize the face of a criminal. The framework is a client-server, video-based recognition system. The face detect-track system uses an Android platform to capture the image or video on the client side. The Viola-Jones algorithm is used for the face detection stage, and the face tracking stage is based on the optical flow algorithm; optical flow is implemented using two feature extraction methods, fast corner features and regular features. The workflow is:
▪ Policemen capture a video of a criminal or suspect using a smartphone.
▪ Real-time face detection and tracking is done at the client side.
▪ The video frames containing the detected and tracked face are sent to the server.
▪ Video-based face recognition is done at the server side.
▪ Personal information for the recognized person is sent back from the server to the policemen.

Face detection has attracted immense attention because it has many applications in computer vision, communication and automatic control systems. Face detection is the task of locating a face in an image that contains several other attributes. Research into face detection, expression recognition, face tracking and pose estimation is required. Given a single image, the challenge is to detect the face in it. Face detection is a challenging task because faces are not rigid and change in size, shape, colour, etc. It becomes even more challenging when the given image is unclear, the face is occluded by something else, the lighting is poor, or the face is not turned towards the camera.

A deep neural network (deep learning) is used to detect and track the face; for face detection, the Viola-Jones algorithm is used. The face detect-track system deals with frames generated in real time, where N is the number of captured frames. Detection runs on the first frame and on every m-th frame thereafter, to allow new faces to be detected and tracked. The detection stage determines face windows and removes any faces that are no longer found. In every tracking iteration the face is preserved through distinctive and efficient features: the feature extraction method produces the facial points used by the optical flow function, which takes the previous frame, the face points and the next frame as inputs to predict the face location in the next frame.
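A minimal sketch of this detect-then-track loop with OpenCV, using a Haar cascade for Viola-Jones detection and pyramidal Lucas-Kanade optical flow for tracking; the video file name and the re-detection interval m are hypothetical:

```python
import cv2
import numpy as np

cap = cv2.VideoCapture("suspect.mp4")          # hypothetical client-side video
detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
m = 30                                         # re-run detection every m frames

prev_gray, points = None, None
frame_idx = 0
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    if frame_idx % m == 0 or points is None or len(points) == 0:
        # Detection stage: Viola-Jones (Haar cascade) finds face windows.
        faces = detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
        if len(faces) > 0:
            x, y, w, h = faces[0]
            # Feature points inside the face window, used for tracking.
            points = cv2.goodFeaturesToTrack(gray[y:y+h, x:x+w], maxCorners=50,
                                             qualityLevel=0.01, minDistance=5)
            if points is not None:
                points = points + np.array([[x, y]], dtype=np.float32)
    elif prev_gray is not None:
        # Tracking stage: optical flow predicts where the face points moved.
        points, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, points, None)
        points = points[status.flatten() == 1].reshape(-1, 1, 2)
    prev_gray = gray
    frame_idx += 1
cap.release()
```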

1. Viola-Jones Algorithm: The Viola-Jones algorithm is a well-established algorithm for detection. It is a real-time algorithm; for practical applications, at least 2 frames per second must be processed. The algorithm has four parts: Haar feature selection, creation of an integral image, AdaBoost training, and cascading classifiers. All human faces share some similar features, and these regularities can be matched using Haar features, which can be viewed simply as patterns used to detect whether an edge or a corner is present. Given a picture, each Haar feature is applied to the pixels of the window and the output is calculated for each pixel.


The Viola-Jones algorithm uses AdaBoost to reduce the number of features. In the integral image, the value of a pixel at (x, y) is calculated as the sum of the pixels to the left of and above (x, y). AdaBoost, a machine learning algorithm, eliminates redundant features and helps find the best features among all of them. Once the features are found, a weighted linear combination of these features is used to evaluate and decide whether a given window contains a face or not.

2. Matching: The image of the person is matched bit by bit against the images stored in the database.

Fast Corner Method: The fast corner method is used to detect feature points in a face window. A pixel p is classified as a corner point by analyzing a circle of sixteen pixels around it and finding at least n adjacent pixels with intensities larger than the intensity of p. These n points are the key feature points in the detected face window and are used to track the face by optical flow.

Regular Feature Method: To extract regular features, the face window is represented as an n x m matrix, where n is the number of rows and m is the number of columns; the features are then selected as points within the height and width of the face window.
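A short sketch of FAST corner extraction inside a detected face window with OpenCV; the image path and threshold are hypothetical:

```python
import cv2

# Hypothetical face window cropped from a frame by the detection stage.
face_window = cv2.imread("face_window.png", cv2.IMREAD_GRAYSCALE)

# FAST corner detector: a pixel is a corner if enough contiguous pixels on the
# surrounding 16-pixel circle differ strongly from it in intensity.
fast = cv2.FastFeatureDetector_create(threshold=25, nonmaxSuppression=True)
keypoints = fast.detect(face_window, None)

# Convert keypoints to (x, y) coordinates usable by calcOpticalFlowPyrLK.
points = [kp.pt for kp in keypoints]
print(f"{len(points)} corner points found")
```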


The criminal detection and tracking framework is thus a client-server, video-based face recognition surveillance system that works in real time. Face detection and tracking have many uses: unlocking phones and specific applications, biometric surveillance (banks, retail stores, stadiums, airports and other facilities use facial recognition), identifying criminals to reduce crime, attendance systems, illegal immigrant detection, intelligent robotic systems and safety alerts.

In the face detect-track system, a machine learning (face learning) technique is used. For detection, the Viola-Jones algorithm in OpenCV is used; the image is then cropped and a deep neural network is applied. Similarity detection is then performed using the set of images stored in the database; if the image is not in the database, it is automatically added. The files are stored as XML files, which are a type of array: for the selected window, each pixel has two values for identification, represented using black or white, and these values are added to the array. The Viola-Jones algorithm is used to detect, and by using artificial intelligence a percentage (confidence) value is obtained. After the matching process, the colour and the basic features are checked. Optical flow is used for tracking. Finally, the person's details are obtained, and it can be checked whether the person is a criminal or not.

References
[1] "Robust Real-Time Face Detection," https://link.springer.com/article/10.1023/B:VISI.0000013087.49260.fb, Accessed online on 14 September 2018.
[2] "Face Tracking Using Optical Flow," https://ieeexplore.ieee.org/document/7314604/, Accessed online on 14 September 2018.
[3] "Low Complexity Head Tracking on Portable Android Devices for Real-time Message Composition," https://link.springer.com/article/10.1007%2Fs12193-015-0174-7, Accessed online on 14 September 2018.


[4] "Face Detection and Tracking at Different Angles in Video," http://www.arpnjournals.com/jeas/research_papers/rp_2015/jeas_0915_2620.pdf, Accessed online on 14 September 2018.
[5] "Real-time and Multi-view Face Tracking on Mobile Platform," https://ieeexplore.ieee.org/document/5946774/, Accessed online on 14 September 2018.


M.Tech Computational Linguistics
Dept. of Computer Science and Engg, Govt. Engg. College, Sreekrishnapuram, Palakkad
www.simplegroups.in
simplequest.in@gmail.com

SIMPLE Groups Students Innovations in Morphology Phonology and Language Engineering

Article Invitation for CLEAR December 2018

We invite thought-provoking articles, interesting dialogues and healthy debates on the multifaceted aspects of Computational Linguistics for the forthcoming issue of CLEAR (Computational Linguistics in Engineering And Research) Journal, to be published in December 2018. The suggested areas of discussion are:

The articles may be sent to the Editor on or before 10th December, 2018 through the email simplequest.in@gmail.com. For more details visit: www.simplegroups.in

Editor, CLEAR Journal
Representative, SIMPLE Groups



Hello world,
This latest edition of the CLEAR journal covers some of the latest advancements in Machine Learning and Natural Language Processing. In the previous edition, we saw how AI predicted the results of the 2018 FIFA World Cup and learned that the prediction turned out to be wrong; in this edition, we discuss what made the prediction go wrong. Researchers are working hard to learn about the existence of life on other planets, and now AI can help them in that work too. The latest work on emotion extraction from text shows that it can outperform existing approaches to an extent. The new facial recognition and tracking system writes a new chapter in the crime detection task, and the new approach based on Markov Logic Networks (MLN) for identifying alias links is also very interesting. These articles are based on recent work and research in the field of computational linguistics. CLEAR is thankful to all who have given their valuable time and effort to contribute their thoughts and ideas. Simple Groups invites more aspirants in this field. Wish you all success in your future endeavours!!!
Bhavya K
