Twitter Spam Detection using Pre-trained Model
Ankur Gupta1, Yogendra P.S. Maravi2, Nishchol Mishra3
1Ankur Gupta, School of Information Technology, RGPV, Bhopal, India.
2Yogendra P.S. Maravi, School of Information Technology, RGPV Bhopal, India.
3Nishchol Mishra, School of Information Technology, RGPV Bhopal, India.
Manuscript received on November 11, 2019. | Revised manuscript received on November 20, 2019. | Manuscript published on November 30, 2019. | PP: 10520-10623 | Volume-8 Issue-4, November 2019. | Retrieval Number: D4228118419/2019©BEIESP | DOI: 10.35940/ijrte.D4228.118419
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: In the age of technology, social media platforms have become a great companion for expressing thoughts, information, and opinions. They have become a powerful tool for anyone who wants to expand their network of people beyond physical boundaries. We live in an age in which different categories of social media platform are available for different needs, such as Facebook, LinkedIn, WhatsApp, and Twitter. We focus our work on Twitter, a microblogging site that lets users express their opinions in a limited number of words. As the popularity of Twitter grows day by day, users are joining the platform very quickly; at the same time, many spammers are taking undue advantage of it. For any social media platform, it is very important to maintain a secure, safe, and trustworthy environment for its legitimate users. Twitter spam is more harmful than e-mail spam because of its higher click-through rate: in a social network, if someone trusts a spam post as genuine, there is a higher chance that other people in that person's network will also trust the post and click on it. Many methods are available for Twitter spam detection; we address the problem at the tweet level. Pre-trained models are a breakthrough in machine learning and natural language processing, and their advancement has been of great help. Here we use the Bidirectional Encoder Representations from Transformers (BERT) model, because our task involves a dataset that is both imbalanced and multilingual, and BERT is well suited to this type of task. The main advantage of this type of model is that we do not have to collect millions of data samples for the machine learning model to perform well.
Keywords: Twitter, BERT, Spam Detection, Pre-Trained.
Scope of the Article: Internet of Things.