Discriminant Pearson Correlative Feature Selection based Gentle Adaboost Classification for Medical Document Mining

1 P. Poongothai, Ph.D. Research Scholar, Department of Computer Applications, Bharathiar University, Coimbatore, Tamil Nadu, India.
2 Dr. Prof. T. Devi, Professor and Head, Department of Computer Applications, Bharathiar University, Coimbatore, Tamil Nadu, India.

Manuscript received on 07 August 2019. | Revised Manuscript received on 14 August 2019. | Manuscript published on 30 September 2019. | PP: 3777-3783 | Volume-8 Issue-3 September 2019 | Retrieval Number: C5391098319/2019©BEIESP | DOI: 10.35940/ijrte.C5391.098319
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: This paper examines Discriminant Pearson Correlative Analysis based Multivariate Gentle Adaboost Classification (DPCA-MGAC), a technique intended to improve the performance of medical document mining with minimal time complexity. A large number of documents are collected from the PubMed database through semantic-based search. The documents are preprocessed by removing stop words and stemming, after which features, i.e., keywords relevant for document classification, are identified and selected. The significant features are selected using DPCA, and the documents are then categorized into different classes using MGAC on the selected features. The classification process combines the results of all weak learners into a strong classifier in order to improve the precision of medical data mining and minimize the false positive rate. Experimental evaluation was performed using the PubMed database.
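
The two stages named in the abstract can be illustrated with a minimal sketch. It is not the authors' DPCA-MGAC implementation, which the abstract does not specify. It assumes plain Pearson correlation between each keyword count and a binary class label as the selection criterion, and the standard binary Gentle AdaBoost with weighted least-squares regression stumps as the classifier. The toy term-count matrix, the 0.5 threshold, and all function names are hypothetical; stop-word removal and stemming are assumed to have been applied already.

```python
import math

def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0  # a constant column carries no signal
    return cov / math.sqrt(vx * vy)

def select_features(X, y, threshold=0.5):
    """Keep column indices whose |r| with the label meets the (assumed) threshold."""
    return [j for j in range(len(X[0]))
            if abs(pearson([row[j] for row in X], y)) >= threshold]

def fit_stump(X, y, w, features):
    """Weighted least-squares regression stump: f(x) = a if x[j] > t else b."""
    best = None
    for j in features:
        for t in sorted({row[j] for row in X}):
            hi = [(wi, yi) for row, yi, wi in zip(X, y, w) if row[j] > t]
            lo = [(wi, yi) for row, yi, wi in zip(X, y, w) if row[j] <= t]
            if not hi or not lo:
                continue  # threshold does not split the data
            a = sum(wi * yi for wi, yi in hi) / sum(wi for wi, _ in hi)
            b = sum(wi * yi for wi, yi in lo) / sum(wi for wi, _ in lo)
            err = (sum(wi * (yi - a) ** 2 for wi, yi in hi)
                   + sum(wi * (yi - b) ** 2 for wi, yi in lo))
            if best is None or err < best[0]:
                best = (err, j, t, a, b)
    return None if best is None else best[1:]

def gentle_adaboost(X, y, features, rounds=5):
    """Binary Gentle AdaBoost: add stumps, reweight by exp(-y * f(x))."""
    n = len(X)
    w = [1.0 / n] * n
    stumps = []
    for _ in range(rounds):
        stump = fit_stump(X, y, w, features)
        if stump is None:
            break
        stumps.append(stump)
        j, t, a, b = stump
        w = [wi * math.exp(-yi * (a if row[j] > t else b))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return stumps

def predict(stumps, row):
    """Sign of the summed weak-learner outputs."""
    score = sum(a if row[j] > t else b for j, t, a, b in stumps)
    return 1 if score >= 0 else -1

# Hypothetical term counts per document for three keywords; labels are +1 / -1.
X = [[3, 0, 1], [2, 0, 2], [4, 1, 1], [0, 3, 2], [0, 2, 1], [1, 4, 2]]
y = [1, 1, 1, -1, -1, -1]

selected = select_features(X, y)          # weakly correlated columns are dropped
model = gentle_adaboost(X, y, selected)   # weak learners see selected columns only
predictions = [predict(model, row) for row in X]
print(selected, predictions)
```

Restricting the stumps to the selected columns mirrors the order described in the abstract: feature selection first, boosting second. A multi-class setting, as implied by "different classes", would need an extension such as one-vs-rest, which this sketch does not cover.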
Keywords: Boosting, Document Classification, Document Collection, Text Mining.

Scope of the Article: Classification