Corpora Based Classification to Perform Sentiment Analysis in Kannada Language
Shankar R1, Suma Swamy2
1Shankar R*, Department of CSE, BMS Institute of Technology and Management, VTU, Bengaluru, India.
2Suma Swamy, Department of CSE, Sir M. Visvesvaraya Institute of Technology, VTU, Bengaluru, India. 

Manuscript received on January 05, 2020. | Revised Manuscript received on January 25, 2020. | Manuscript published on January 30, 2020. | PP: 5186-5191 | Volume-8 Issue-5, January 2020. | Retrieval Number: E6872018520/2020©BEIESP | DOI: 10.35940/ijrte.E6872.018520

© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license.

Abstract: User opinions play an important role in understanding how well a product satisfies customer requirements: producers can adapt the product to customers' demands, and the reviews help new consumers decide whether to purchase it. Analyzing the feelings expressed about a particular entity in terms of positive, negative or neutral polarity is known as 'Sentiment Analysis'. Sentiment Analysis is a sub-domain of opinion mining. Here the analysis focuses on mining the emotions and opinions of people towards a specific topic. These emotions and opinions are collected in the form of structured, semi-structured or unstructured data. As the world slowly moves towards regional languages, this article discusses extracting opinions about a product written in Kannada, analyzing these reviews and classifying them accordingly. Because the language is not English, the dataset, or corpus, is scarce. The limited corpus is collected from a website through an API. However, manually extracting the overall opinion from a large volume of unstructured data would be tedious. An automated 'Sentiment Analysis or Opinion Mining' system can solve this problem by analyzing the reviews and extracting the users' views. The classifier labels each review using a corpus, which is a large collection of pre-defined data. The API used is Python's Beautiful Soup, with UTF-8 text encoding to parse Kannada characters. Each review is converted to a text sentence, and each sentence is broken down into words. A data mining task then finds the sentiment of each word by comparing it with two stored files, good.txt and bad.txt. Finally, the result is output as text indicating Positive, Negative or Neutral sentiment based on the word weights.
Keywords: Classification, Corpora, Kannada Lexicon, Opinion Mining in Kannada.
Scope of the Article: Classification.
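
The word-level classification step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that good.txt and bad.txt each hold one word per line in UTF-8, that every matching word carries a weight of +1 or -1 (the abstract does not specify the actual weights), and that the sign of the summed weights gives the label. The function names `load_lexicon`, `tokenize` and `classify`, and the sample lexicon entries, are hypothetical and chosen only for illustration.

```python
import re
from pathlib import Path


def load_lexicon(path):
    """Read a UTF-8 word list, one word per line (assumed layout of good.txt / bad.txt)."""
    text = Path(path).read_text(encoding="utf-8")
    return {line.strip() for line in text.splitlines() if line.strip()}


def tokenize(review):
    """Split a review into words on whitespace, basic punctuation and the danda (U+0964)."""
    return [tok for tok in re.split(r"[\s.,!?;:()\u0964]+", review) if tok]


def classify(review, good_words, bad_words):
    """Sum +1 for each word found in good_words and -1 for each word in bad_words.

    The unit weights are an assumption; the sign of the total decides the label.
    """
    score = 0
    for token in tokenize(review):
        if token in good_words:
            score += 1
        if token in bad_words:
            score -= 1
    if score > 0:
        return "Positive"
    if score < 0:
        return "Negative"
    return "Neutral"


# Tiny in-memory lexicons standing in for good.txt and bad.txt (illustrative entries only).
GOOD = {"ಒಳ್ಳೆಯ", "ಚೆನ್ನಾಗಿದೆ"}
BAD = {"ಕೆಟ್ಟ", "ಕಳಪೆ"}

if __name__ == "__main__":
    for word in sorted(GOOD | BAD):
        print(word, classify(word, GOOD, BAD))
```

In practice the two sets would be loaded with `load_lexicon("good.txt")` and `load_lexicon("bad.txt")`, and each review would first be extracted from the page markup before being passed to `classify`. Exact string matching is a simplification: Kannada is morphologically rich, so inflected forms of a lexicon word will not match unless they are listed or stemmed.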