Attribute Balanced Leveling with Ada Boost Regressor for Predicting Heart Disease using Machine Learning
Shermin Shamsudheen1, Rincy Merlin Mathew2, M. Shyamala Devi3
1Shermin Shamsudheen, Lecturer, Department of Computer Science, College of Computer Science & Information Systems, Jazan University, Saudi Arabia.
2Rincy Merlin Mathew, Lecturer, Department of Computer Science, College of Science and Arts, Khamis Mushayt, King Khalid university, Abha, Asir, Saudi Arabia.
3M. Shyamala Devi, Professor, Computer Science and Engineering, Vel Tech Rangarajan Dr. Sagunthala R&D Institute of Science and Technology, Avadi, Chennai, TamilNadu, India.
Manuscript received on January 02, 2020. | Revised Manuscript received on January 15, 2020. | Manuscript published on January 30, 2020. | PP: 2488-2493 | Volume-8 Issue-5, January 2020. | Retrieval Number: E5816018520/2020©BEIESP | DOI: 10.35940/ijrte.E5816.018520
Open Access | Ethics and Policies | Cite | Mendeley
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: The technological advancement can help the entire application field to predict the damage and to forecast the future target of the object. The wealth of the world is in the health of the people. So the technology must support the technologists in predicting the disease in advance. The machine learning is the emerging field which is used to forecast the existence of the heart disease through the values of the clinical parameters. With this view, we focus on predicting the customer churn for the banking application. This paper uses the customer churn bank modeling data set extracted from UCI Machine Learning Repository. The anaconda Navigator IDE along with Spyder is used for implementing the Python code. Our contribution is folded is folded in three ways. First, the data is processed to find the relationship between the elements of the dataset. Second, the data set is applied for Ada Boost regressors and the important elements are identified. Third, the dataset is applied to feature scaling and then fitted to kernel support vector machine, logistic regression classifier, Naive bayes classifier, random forest classifier, decision tree classifier and KNN classifier. Fourth, the dataset is dimensionality reduced with principal component analysis with five components and then applied to the previously mentioned classifiers. Fifth, the performance of the classifiers is analyzed with the indication metrics like precision, accuracy, recall and F score. The implementation is carried out with python code using Anaconda Navigator. Experimental results show that, the Naïve bayes classifier is more effective with the precision of 0.90 for dataset with random boost, feature scaled and PCA. Experimental results show that, the Naïve bayes classifier is more effective with the recall of 0.91 for dataset with random boost, feature scaled and PCA. Experimental results show that, the Naïve bayes classifier is more effective with the F score of 0.92 for dataset with random boost, feature scaled and PCA. Experimental results show, the Naïve bayes classifier is more effective with the accuracy of 91% without random boost, 93% with random boosting and 92% with principal component analysis.
Keywords: Machine Learning, Churn, Classification, accuracy, precision, recall and f-score.
Scope of the Article: Machine Learning.