Classification of Imbalanced Class Distribution using Random Forest with Multiple Weight Based Majority Voting for Credit Scoring
Ramila RajaLeximi Pannir Selvam1, Irfan Ahmed Mohammed Saleem2, Ahmed Alenezi3

1Ramila RajaLeximi Pannir Selvam, Research and Development Centre Bharathiar University, Coimbatore (Tamil Nadu), India.
2Irfan Ahmed Mohammed Saleem, Department of Computer and Information Sciences, College of Science and Arts, Taibah University, Al Ula, Madhina.
3Ahmed Alenezi, Department of Computer and Information Sciences, College of Science and Arts, Taibah University, Al Ula, Madhina.
Manuscript received on 04 May 2019 | Revised Manuscript received on 16 May 2019 | Manuscript Published on 23 May 2019 | PP: 517-526 | Volume-7 Issue-6S5 April 2019 | Retrieval Number: F10910476S519/2019©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Classification is an important and widely used technique for predicting the class labels of unlabelled instances. In supervised learning, most standard classification algorithms achieve their best accuracy when the class distribution is balanced. However, in sensitive real-world applications with imbalanced classes, especially credit scoring and medical diagnosis, standard algorithms tend to produce a higher misclassification rate for the minority class. The minority class is usually the class of interest to the user, and its samples are rarer and more costly to collect than majority class samples. This paper introduces an improvement to the random forest algorithm, multiple weight based majority voting, that is well suited to credit scoring datasets. The proposed algorithm was evaluated against other random forest variants, and the results show that it improves overall performance and accuracy in predicting both majority and minority class labels.
Keywords: Credit Scoring; Imbalanced Dataset; Classification; Random Forest; Multiple Weight; Majority Voting.
Scope of the Article: Classification
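
Illustrative sketch: The abstract names multiple weight based majority voting but does not give the weighting scheme, so the following is only a minimal sketch of the general idea and not the authors' algorithm. It assumes each tree carries one weight per class (for example, its per-class recall on held-out data); the function name `weighted_majority_vote`, the labels "good" and "bad", and all weight values are hypothetical.

```python
from collections import defaultdict


def weighted_majority_vote(tree_votes, tree_weights):
    """Combine per-tree class votes using per-tree, per-class weights.

    tree_votes:   list of predicted labels, one per tree.
    tree_weights: list of dicts {label: weight}, one per tree.
                  ASSUMPTION: the weights are supplied by the caller,
                  e.g. each tree's per-class recall on held-out data.
    """
    if not tree_votes:
        raise ValueError("at least one tree vote is required")

    scores = defaultdict(float)
    for label, weights in zip(tree_votes, tree_weights):
        # A tree contributes the weight it holds for the class it voted for.
        scores[label] += weights.get(label, 0.0)

    # Iterate labels in sorted order so that ties resolve deterministically.
    return max(sorted(scores), key=lambda lbl: scores[lbl])


# Two trees vote "good" with low confidence in that class; one tree
# votes "bad" (the minority class) with high confidence.
votes = ["good", "good", "bad"]
weights = [
    {"good": 0.3, "bad": 0.2},
    {"good": 0.4, "bad": 0.3},
    {"good": 0.95, "bad": 0.9},
]
decision = weighted_majority_vote(votes, weights)
```

In this example an unweighted majority vote would return "good", while the weighted vote can favour the minority label because the single tree voting "bad" holds a larger weight for that class than the other two trees hold for "good" combined.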