MONO-Spam: An Intelligent Spam Detector Based On Natural Language Processing
Eshwar. S1, Lavanya. K2
1Eshwar. S, Private university in Vellore, (Tamil Nadu), India.
2Lavanya. K, Private university in Vellore, (Tamil Nadu), India.
Manuscript received on 23 March 2019 | Revised Manuscript received on 30 March 2019 | Manuscript published on 30 March 2019 | PP: 449-457 | Volume-7 Issue-6, March 2019 | Retrieval Number: F2395037619/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: With the evolution of the “social” world, people produce a large amount of data, often without being aware of it. With the growing use of social media, e-commerce sites, and similar services, a user both produces and consumes a lot of data. The ‘data’ referred to here is not bandwidth but text: comments, reviews, emails, names, identities, birth dates, offers, claims, etc. The problems are the integrity of this data, where it ends up, and its sanity. Integrity is addressed by cryptographic algorithms, but sanity remains an open question. Checking whether data is clean is the most crucial part; otherwise, a lot of space and valuable resources are wasted. In this paper, we provide a novel way of using Natural Language Processing and the Multinomial Naive Bayes algorithm to filter spam before insertion. The model filters spam with an accuracy of about 96 percent.
Keywords: Classifiers, Natural Language Processing, Bag Of Words, TF-IDF, Corpus, Multinomial Naive Bayes classifier
Scope of the Article: Classifiers
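As a rough illustration of the technique named in the abstract, the sketch below trains a Multinomial Naive Bayes classifier on bag-of-words counts and uses it to label a message before it would be stored. It is a minimal sketch under stated assumptions, not the authors' implementation: the class name TinyMultinomialNB, the tokenizer, the Laplace smoothing value, and the six toy training messages are all hypothetical, and raw word counts are used in place of the TF-IDF weighting listed in the keywords.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


class TinyMultinomialNB:
    """Hypothetical minimal Multinomial Naive Bayes over bag-of-words counts."""

    def __init__(self, alpha=1.0):
        # alpha is the Laplace (add-one) smoothing constant; assumed value.
        self.alpha = alpha

    def fit(self, texts, labels):
        # Count documents per class and word occurrences per class.
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def log_scores(self, text):
        # log P(class) + sum of log P(word | class); unseen words are skipped.
        total_docs = sum(self.class_counts.values())
        tokens = [t for t in tokenize(text) if t in self.vocab]
        scores = {}
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + self.alpha * len(self.vocab)
            score = math.log(n_docs / total_docs)
            for t in tokens:
                score += math.log((counts[t] + self.alpha) / denom)
            scores[label] = score
        return scores

    def predict(self, text):
        scores = self.log_scores(text)
        return max(scores, key=scores.get)


# Toy training data, invented purely for illustration.
train_texts = [
    "win a free prize now",
    "claim your free offer today",
    "cheap loans click now",
    "meeting moved to monday",
    "please review the attached report",
    "lunch at noon tomorrow",
]
train_labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

model = TinyMultinomialNB().fit(train_texts, train_labels)
print(model.predict("free prize offer"))
print(model.predict("review the report on monday"))
```

In a real pipeline the same predict call would sit in front of the insertion step, so that messages labelled as spam are rejected before they consume storage. A corpus this small says nothing about the roughly 96 percent accuracy reported in the abstract.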