F7567038620 - International Journal of Recent Technology and Engineering (IJRTE)

An Integrated Strategy for Data Mining Based on Identifying Important and Contradicting Variables for Breast Cancer Recurrence Research
Avijit Kumar Chaudhuri¹, Deepankar Sinha², Kousik Bhattacharya³, Anirban Das⁴
¹Avijit Kumar Chaudhuri, Assistant Professor, Academic In-Charge, Sikkim Manipal University Learning Centre, Kolkata.
²Deepankar Sinha, PhD, Indian Institute of Technology (IIT), Kharagpur.
³Kousik Bhattacharya, Assistant Registrar (Exam.), DDE, Rabindra Bharati University, Kolkata, West Bengal.
⁴Anirban Das, HOD, Computer Science & Engineering, Amity University, Kolkata.
Manuscript received on February 02, 2020. | Revised Manuscript received on February 10, 2020. | Manuscript published on March 30, 2020. | PP: 1096-1106 | Volume-8 Issue-6, March 2020. | Retrieval Number: F7567038620/2020©BEIESP | DOI: 10.35940/ijrte.F7567.038620
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Cancer is a leading cause of death worldwide, and breast cancer is a leading cause of death among women. The disease is distinctive in that, even after treatment, it can recur in some cases. Individuals are often unable to identify their condition before it becomes dangerous. Extracting significant predictive features of breast cancer is an important and challenging task for further study. Researchers have applied data mining techniques in medical science. Several authors suggest that a single method does not resolve diagnostic problems and that a hybrid model is desirable. In this paper, the authors propose an integrated approach to avoid Type 1 and Type 2 errors in predicting recurrence. They identify important and contradicting variables and consider them for inclusion and exclusion, respectively, to revise the dataset. Evaluating the findings of the key methods on both the original and the revised datasets widens the choice when identifying the technique with higher accuracy. The results show that accuracy improves when the selection of variables is restricted to those identified as relatively significant and the dataset is revised by eliminating the contradicting variables.
Keywords: Data Mining Techniques, Errors, Integrated Approach, Under-Estimation, Recurrence, Breast Cancer
Scope of the Article: Data Mining.
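
The abstract frames the evaluation in terms of Type 1 errors (recurrence predicted when there is none), Type 2 errors (recurrence missed), and accuracy, compared between the original and the revised dataset. The sketch below only illustrates how those three quantities can be computed for two sets of predictions. The function name `error_rates`, the label encoding (1 = recurrence, 0 = no recurrence), and the toy label vectors are assumptions made for illustration; they are not the authors' code, data, or results.

```python
from typing import Dict, Sequence


def error_rates(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, Type 1 rate, and Type 2 rate for binary labels.

    Assumed encoding: 1 = recurrence, 0 = no recurrence.
    Type 1 rate = false positives / actual negatives.
    Type 2 rate = false negatives / actual positives.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # Type 1 errors
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # Type 2 errors

    negatives = tn + fp
    positives = tp + fn
    total = len(pairs)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "type_1_rate": fp / negatives if negatives else 0.0,
        "type_2_rate": fn / positives if positives else 0.0,
    }


# Hypothetical toy labels, invented purely to exercise the function.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
pred_original = [1, 0, 0, 0, 1, 0, 0, 1]  # model fit on all variables
pred_revised = [1, 1, 0, 0, 0, 0, 0, 1]   # model fit on the revised variable set

original = error_rates(y_true, pred_original)
revised = error_rates(y_true, pred_revised)
print("original:", original)
print("revised: ", revised)
```

Reporting the two error rates separately, rather than accuracy alone, matters for recurrence prediction because a missed recurrence and a false alarm carry very different costs.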
