Implementation of Correlation and Regression Models for Health Insurance Fraud in Covid-19 Environment using Actuarial and Data Science Techniques
Rohan Yashraj Gupta1, Satya Sai Mudigonda2, Pallav Kumar Baruah3, Phani Krishna Kandala4

1Rohan Yashraj Gupta*, Department of Mathematics and Computer Science, Sri Sathya Sai Institute of Higher Learning, Puttaparthi, India.
2Satya Sai Mudigonda, Department of Mathematics and Computer Science, Sri Sathya Sai Institute of Higher Learning, Puttaparthi, India.
3Pallav Kumar Baruah, Department of Mathematics and Computer Science, Sri Sathya Sai Institute of Higher Learning, Puttaparthi, India.
4Phani Krishna Kandala, Department of Mathematics and Computer Science, Sri Sathya Sai Institute of Higher Learning, Puttaparthi, India. 

Manuscript received on August 01, 2020. | Revised Manuscript received on August 05, 2020. | Manuscript published on September 30, 2020. | PP: 699-706 | Volume-9 Issue-3, September 2020. | Retrieval Number: 100.1/ijrte.C4686099320 | DOI: 10.35940/ijrte.C4686.099320
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Left uncontrolled, fraud is a major deterrent to a company’s growth. It challenges the fundamental value of “Trust” in the insurance business. COVID-19 has brought the additional challenge of increased potential for fraud in the health insurance business. This work describes the implementation of existing and enhanced fraud detection methods in the pre-COVID-19 and COVID-19 environments. For this purpose, we developed an innovative, enhanced fraud detection framework using actuarial and data science techniques. Triggers specific to COVID-19 are identified in addition to the existing triggers. We also explored the relationship between insurance fraud and COVID-19: we calculated the Pearson correlation coefficient and fitted a logarithmic regression model between fraud in health insurance and COVID-19 cases. This work uses two datasets, a health insurance dataset and a Kaggle dataset on COVID-19 cases, both for the same selected geographical location in India. Our experimental results show a Pearson correlation coefficient of 0.86, which indicates that the month-on-month rate of fraudulent cases is highly correlated with the month-on-month rate of COVID-19 cases. The logarithmic regression performed on the data gave an R-squared value of 0.91, which indicates that the model is a good fit. This work aims to provide much-needed tools and techniques for the health insurance business to counter fraud.
Keywords: Fraud detection framework, Pearson correlation, Logarithmic regression, COVID-19, actuarial techniques, data science techniques, fraud detection, fraud prevention, fraud triggers.
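
The two statistics named in the abstract can be illustrated with a minimal sketch. It assumes the logarithmic model has the form y = a + b ln(x), with x the month-on-month rate of COVID-19 cases and y the month-on-month rate of fraudulent cases; the abstract does not state the exact model form or which variable is the predictor, so both are assumptions. The function names and the sample values are hypothetical and are not taken from the paper's datasets.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


def fit_log_regression(x, y):
    """Least-squares fit of y = a + b * ln(x); returns (a, b, r_squared)."""
    if any(xi <= 0 for xi in x):
        raise ValueError("ln(x) requires strictly positive x values")
    lx = [math.log(xi) for xi in x]
    n = len(lx)
    mlx, my = sum(lx) / n, sum(y) / n
    # Ordinary least squares on the transformed predictor ln(x).
    b = (sum((li - mlx) * (yi - my) for li, yi in zip(lx, y))
         / sum((li - mlx) ** 2 for li in lx))
    a = my - b * mlx
    # R-squared: share of the variance in y explained by the fitted curve.
    ss_res = sum((yi - (a + b * li)) ** 2 for li, yi in zip(lx, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot


# Made-up illustrative values, NOT the paper's data.
covid_rate = [0.5, 1.0, 2.0, 4.0, 8.0, 16.0]
fraud_rate = [0.10, 0.32, 0.49, 0.71, 0.88, 1.12]

r = pearson_r(covid_rate, fraud_rate)
a, b, r2 = fit_log_regression(covid_rate, fraud_rate)
print(f"Pearson r = {r:.3f}")
print(f"fit: y = {a:.3f} + {b:.3f} * ln(x), R^2 = {r2:.3f}")
```

Fitting on ln(x) reduces the logarithmic model to an ordinary linear regression, so the R-squared computed here has the usual interpretation for a linear least-squares fit.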