RESUME INFORMATION EXTRACTION WITH A NOVEL TEXT BLOCK SEGMENTATION ALGORITHM Shicheng Zu and Xiulai Wang Post-doctoral Scientific Research Station in East War District General Hospital, Nanjing, Jiangsu 210000, China

ABSTRACT In recent years, we have witnessed the rapid development of deep neural networks and distributed representations in natural language processing. However, the application of neural networks to resume parsing lacks systematic investigation. In this study, we propose an end-to-end pipeline for resume parsing based on neural network-based classifiers and distributed embeddings. This pipeline leverages position-wise line information and the integrated meaning of each text block. Coordinated line classification by both a line type classifier and a line label classifier effectively segments a resume into predefined text blocks. Our proposed pipeline joins text block segmentation with the identification of resume facts, in which various sequence labelling classifiers perform named entity recognition within labelled text blocks. A comparative evaluation of four sequence labelling classifiers confirmed the superiority of BLSTM-CNNs-CRF in the named entity recognition task. A further comparison against three published resume parsers also confirmed the effectiveness of our text block classification method.

KEYWORDS Resume Parsing, Word Embeddings, Named Entity Recognition, Text Classifier, Neural Networks.

OPTIMIZE THE LEARNING RATE OF NEURAL ARCHITECTURE IN MYANMAR STEMMER Yadanar Oo and Khin Mar Soe Natural Language Processing Lab, University of Computer Studies, Yangon, Myanmar

ABSTRACT Morphological stemming is a critical step in natural language processing. Stemming reduces alternative word forms to a common morphological root. Word segmentation for the Myanmar language, as for most Asian languages, is an important task and an extensively studied sequence labelling problem. Named entity detection is another issue in Asian languages that has traditionally required a large amount of feature engineering to achieve high performance. Our new approach integrates these tasks, which would benefit all of these processes. In recent years, end-to-end sequence labelling models based on deep learning have been widely used. This paper introduces a deep BiGRU-CNN-CRF network that jointly learns the word segmentation, stemming and named entity recognition tasks. We trained the model using manually annotated corpora. State-of-the-art named entity recognition systems rely heavily on handcrafted features. In our new approach, we introduce a joint model that relies on two sources of information: character-level representations and syllable-level representations.

KEYWORDS Myanmar word stemmer, Sequence labelling, Conditional random fields, Neural architecture, word segmentation

SYLLABLE-BASED NEURAL NAMED ENTITY RECOGNITION FOR MYANMAR LANGUAGE Hsu Myat Mo and Khin Mar Soe Natural Language Processing Lab., University of Computer Studies, Yangon, Myanmar

ABSTRACT Named Entity Recognition (NER) for the Myanmar language is essential to Myanmar natural language processing research. In this work, NER for the Myanmar language is treated as a sequence tagging problem, and the effectiveness of deep neural networks on this task is investigated. Experiments are performed by applying deep neural network architectures to syllable-level Myanmar contexts. The first manually annotated NER corpus for the Myanmar language is also constructed and proposed. In developing our in-house NER corpus, we used sentences from online news websites as well as sentences from the ALT-ParallelCorpus. This ALT corpus is one part of the Asian Language Treebank (ALT) project under ASEAN IVO. This paper contributes the first evaluation of neural network models on the NER task for the Myanmar language. The experimental results show that these neural sequence models can produce promising results compared to the baseline CRF model. Among these neural architectures, a bidirectional LSTM network with a CRF layer on top gives the highest F-score. This work also aims to explore the effectiveness of neural network approaches to Myanmar text processing and to promote further research on this understudied language.

KEYWORDS Bidirectional LSTM-CRF, Myanmar Language, Named Entity Recognition, Neural architectures, Syllable-based

