Customers Churn Prediction with RFM Model and Building a Recommendation System using Semi-Supervised Learning in Retail Sector
Punya P Shetty1, Varsha C M2, Varsha D Vadone3, Shalini Sarode4, Pradeep Kumar D5
1Mr. Pradeep Kumar D, Assistant Professor, Department of Computer Science and Engineering, Ramaiah Institute of Technology.
2Miss Shalini Sarode, Department of Computer Science and Engineering, Ramaiah Institute of Technology.
3Miss Varsha D Vadone, Department of Computer Science and Engineering, Ramaiah Institute of Technology.
4Miss Varsha C M, Department of Computer Science and Engineering, Ramaiah Institute of Technology.
5Miss Punya Prakash Shetty, Department of Computer Science and Engineering, Ramaiah Institute of Technology.

Manuscript received on 08 April 2019 | Revised Manuscript received on 16 May 2019 | Manuscript published on 30 May 2019 | PP: 3353-3358 | Volume-8 Issue-1, May 2019 | Retrieval Number: A1425058119/19©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Customer churn, or customer attrition, occurs when customers are no longer loyal to a firm. In retail businesses, churn is said to occur if a customer's transactions terminate after a certain duration. High churn rates incur substantial losses for businesses, since acquiring new buyers is observed to be costlier than retaining the current customer base. Companies should therefore be able to monitor their churn rates. These churn rates give an organization various factors to consider when determining its customer retention success rate and identifying strategies for improvement. Customer churn is predicted using the Pareto/NBD model. Once the customers who are likely to churn are predicted, they need to be differentiated based on their previous purchasing history. Natural Language Processing is used to model product categorization. Customer segmentation is done with semi-supervised learning: a score is assigned by the RFM model and customers are segmented using k-means clustering. The clusters are then predicted using algorithms such as logistic regression, SVM and the SGD classifier. These methods are collectively used to build a suitable recommendation system, which is targeted at making churned customers who were valuable to the company loyal again, thereby improving business for retailers.
Index Terms: NLP (Natural Language Processing), Pareto/NBD (Pareto/Negative Binomial Distribution), RFM (Recency-Frequency-Monetary), SVM (Support Vector Machine), SGD (Stochastic Gradient Descent)
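
Illustrative sketch: the RFM scoring step named in the abstract can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the toy transactions, the snapshot date, the three-bin rank scoring and the function names (rfm_table, rank_scores, rfm_scores) are all hypothetical and chosen only for illustration. In the pipeline the abstract describes, scores like these would then be passed to k-means clustering.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy data: (customer_id, purchase_date, amount).
TRANSACTIONS = [
    ("c1", date(2019, 1, 5), 120.0),
    ("c1", date(2019, 3, 20), 80.0),
    ("c1", date(2019, 3, 28), 40.0),
    ("c2", date(2018, 11, 2), 15.0),
    ("c3", date(2019, 2, 14), 300.0),
    ("c3", date(2019, 3, 1), 50.0),
]


def rfm_table(transactions, snapshot):
    """Return {customer: (recency_days, frequency, monetary)}."""
    last = {}
    freq = defaultdict(int)
    money = defaultdict(float)
    for cust, day, amount in transactions:
        last[cust] = max(last.get(cust, day), day)  # most recent purchase
        freq[cust] += 1                             # number of purchases
        money[cust] += amount                       # total spend
    return {c: ((snapshot - last[c]).days, freq[c], money[c]) for c in last}


def rank_scores(values, bins=3, reverse=False):
    """Map {customer: value} to scores 1..bins by rank (higher score = better).

    Assumption: ties are broken by insertion order, which is acceptable
    for a sketch but not for production scoring.
    """
    ordered = sorted(values, key=values.get, reverse=reverse)
    n = len(ordered)
    return {c: 1 + (i * bins) // n for i, c in enumerate(ordered)}


def rfm_scores(transactions, snapshot, bins=3):
    """Return {customer: (R, F, M)} with each score in 1..bins."""
    table = rfm_table(transactions, snapshot)
    # A small recency is good, so the largest recency gets the lowest score.
    r = rank_scores({c: v[0] for c, v in table.items()}, bins, reverse=True)
    f = rank_scores({c: v[1] for c, v in table.items()}, bins)
    m = rank_scores({c: v[2] for c, v in table.items()}, bins)
    return {c: (r[c], f[c], m[c]) for c in table}


if __name__ == "__main__":
    for cust, score in sorted(rfm_scores(TRANSACTIONS, date(2019, 4, 1)).items()):
        print(cust, score)
```

Rank-based bins are used here only because they need no external library; quantile cut-offs on a real transaction log would serve the same purpose.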

Scope of the Article: E-Learning