Statistical Method for Named Entity Recognition in Telugu, an Indian Language
Suneetha Eluri1, Sumalatha Lingamgunta2
1Suneetha Eluri, Research Scholar, Assistant Professor, Department of Computer Science Engineering, Jawaharlal Nehru Technological University Kakinada, AP, India.
2Sumalatha Lingamgunta, Professor, Department of Computer Science Engineering, Jawaharlal Nehru Technological University Kakinada, AP, India.
Manuscript received on 21 March 2019 | Revised Manuscript received on 25 March 2019 | Manuscript published on 30 July 2019 | PP: 4211-4216 | Volume-8 Issue-2, July 2019 | Retrieval Number: B3500078219/19©BEIESP | DOI: 10.35940/ijrte.B3500.078219
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Named Entity Recognition (NER) is one of the important tasks of Natural Language Processing (NLP). The primary operation of NER is to identify proper nouns, i.e., to locate all the named entities in a text and tag them with named entity categories such as Entity, Time expression and Numeric expression. Previous work addressed NER for Telugu with Conditional Random Fields (CRF) and Maximum Entropy models; however, these approaches failed to handle cases where the same named entity can take more than one named entity tag. This paper presents a hybrid statistical system for NER in Telugu in which named entities are identified by both a dictionary-based approach and a statistical Hidden Markov Model (HMM). The proposed method uses a lexicon look-up dictionary and contexts based on semantic features to predict named entity tags. The HMM is then used to resolve ambiguities in the predicted named entity tags. The present work reports an average accuracy of 86.3% in identifying named entities.
Keywords: Named Entities, Statistical Approach, Hidden Markov Model, Lexicon Look-up Dictionary, Telugu Language, NLP, Ambiguous Named Entity Tags.
Scope of the Article: Pattern Recognition
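Illustrative sketch: the two-stage idea summarized in the abstract (lexicon look-up proposes candidate tags, and an HMM then picks one tag per token when several are possible) might look roughly like the Python below. This is not the authors' implementation. The tag set (PERSON, LOCATION, O), the transliterated toy lexicon, and all probabilities are hypothetical values chosen for illustration, and the semantic context features mentioned in the abstract are not modeled.

```python
import math

# Hypothetical toy lexicon (transliterated): token -> candidate tags.
# "krishna" is deliberately ambiguous (a person name or a river/district name).
LEXICON = {
    "krishna": ["PERSON", "LOCATION"],
    "nadi": ["LOCATION"],        # "river", treated here as part of a location name
    "ravi": ["PERSON"],
    "vijayawada": ["LOCATION"],
}

# Hypothetical HMM parameters; a real system would estimate them from a tagged corpus.
START = {"PERSON": 0.3, "LOCATION": 0.2, "O": 0.5}
TRANS = {
    "PERSON":   {"PERSON": 0.3, "LOCATION": 0.1, "O": 0.6},
    "LOCATION": {"PERSON": 0.1, "LOCATION": 0.4, "O": 0.5},
    "O":        {"PERSON": 0.3, "LOCATION": 0.2, "O": 0.5},
}
EMIT = {
    ("PERSON", "krishna"): 0.6,
    ("LOCATION", "krishna"): 0.4,
    ("LOCATION", "nadi"): 0.9,
    ("PERSON", "ravi"): 0.9,
    ("LOCATION", "vijayawada"): 0.9,
}
UNSEEN = 1e-3  # fallback emission probability for (tag, word) pairs not in EMIT


def candidate_tags(token):
    """Stage 1: lexicon look-up; tokens not in the lexicon default to 'O'."""
    return LEXICON.get(token, ["O"])


def viterbi_tags(tokens):
    """Stage 2: Viterbi decoding restricted to the lexicon's candidate tags."""
    if not tokens:
        return []
    cands = [candidate_tags(t) for t in tokens]
    # scores[i][tag]: best log-probability of a tag path ending in `tag` at position i
    scores = [{
        tag: math.log(START[tag]) + math.log(EMIT.get((tag, tokens[0]), UNSEEN))
        for tag in cands[0]
    }]
    back = [{}]
    for i in range(1, len(tokens)):
        scores.append({})
        back.append({})
        for tag in cands[i]:
            emit = math.log(EMIT.get((tag, tokens[i]), UNSEEN))
            # Pick the previous tag that maximizes path score plus transition score.
            prev = max(
                scores[i - 1],
                key=lambda p: scores[i - 1][p] + math.log(TRANS[p][tag]),
            )
            scores[i][tag] = scores[i - 1][prev] + math.log(TRANS[prev][tag]) + emit
            back[i][tag] = prev
    # Follow back-pointers from the best final tag to recover the full path.
    tag = max(scores[-1], key=scores[-1].get)
    path = [tag]
    for i in range(len(tokens) - 1, 0, -1):
        tag = back[i][tag]
        path.append(tag)
    return path[::-1]


print(viterbi_tags(["krishna", "nadi"]))     # ['LOCATION', 'LOCATION']
print(viterbi_tags(["krishna", "vachadu"]))  # ['PERSON', 'O']
```

With these toy numbers, the ambiguous token "krishna" is tagged LOCATION when followed by "nadi" and PERSON when followed by an out-of-lexicon word, which is the kind of tag ambiguity the abstract says the HMM stage is meant to resolve.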