An Incremental Genetic Algorithm Hybrid with Rough set Theory for Efficient Feature Subset Selection
N. Nandhini1, K. Thangadurai2
1N. Nandhini, Assistant Professor, Department of MCA, SNS College of Technology Autonomous, Coimbatore (Tamil Nadu), India.
2Dr. K. Thangadurai, Professor and Head, Department of Computer Science, Government Arts College, Karur (Tamil Nadu), India.
Manuscript received on 25 May 2019 | Revised Manuscript received on 12 June 2019 | Manuscript Published on 26 June 2019 | PP: 217-231 | Volume-8 Issue-1S5 June 2019 | Retrieval Number: A00390681S519/2019©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Rough Set Theory (RST) is one of the most successful tools applied to relevant feature selection and feature reduction. Conventionally, feature subset selection methods use a hill-climbing strategy to find reducts. The major limitation of such methods is that they cannot guarantee an optimal feature subset and are therefore ineffective at finding the optimal reduct. Researchers have hence moved to heuristic-based feature subset selection methods. Two types of data are subjected to feature subset selection: static and dynamic. Static data has a finite number of samples with a finite set of attributes, whereas dynamic data keeps growing. This research work focuses on dynamic data, where new attributes may be added over time. For such data, the feature subset selection algorithm does not need to be re-executed from the beginning whenever an attribute is added. Incremental feature selection algorithms are efficient and have proven their value on such dynamic data. In this paper, an Incremental Genetic Algorithm (IGA) hybridized with RST is proposed for efficient feature selection on dynamic data, where the GA searches for relevant features heuristically while employing an RST-based fitness function. The proposed IGA-RST approach has three advantages: (i) it starts with an effective initial population construction method, which accelerates convergence; (ii) the fitness function incorporates feature weights estimated using a pseudo-inverse matrix, which reduces the computational cost of the IGA-RST algorithm; and (iii) a novel incremental approach first constructs the reduct for a group of top-weighted attributes to find a partial reduct, then continues with the next set of top-weighted attributes while treating the partial reduct as elite chromosomes, which allows it to handle dynamic and higher-dimensional data.
The performance of the proposed IGA-RST feature reduction is evaluated on benchmark datasets from the UCI Machine Learning Repository. The results indicate that IGA-RST significantly improves the efficiency of feature subset selection.
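The two ingredients named in the abstract, an RST-based measure of subset quality and pseudo-inverse feature weights, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`dependency_degree`, `pseudo_inverse_weights`, `top_weighted_groups`), the toy data, and the choice of a least-squares fit via `numpy.linalg.pinv` as the weighting scheme are all assumptions made for illustration. The abstract does not specify how the weights enter the fitness function, so that combination and the genetic search itself are not shown.

```python
from collections import defaultdict

import numpy as np


def dependency_degree(X, y, subset):
    """Rough-set dependency degree of the decision y on the attributes in `subset`.

    Samples are grouped into equivalence classes by their values on `subset`;
    the result is the fraction of samples lying in classes with a single label.
    """
    classes = defaultdict(list)
    for i, row in enumerate(X):
        classes[tuple(row[j] for j in subset)].append(i)
    positive = sum(
        len(members)
        for members in classes.values()
        if len({y[i] for i in members}) == 1
    )
    return positive / len(X)


def pseudo_inverse_weights(X, y):
    """Assumed weighting scheme: |w| where w = pinv(X) @ y (least-squares fit)."""
    w = np.linalg.pinv(np.asarray(X, dtype=float)) @ np.asarray(y, dtype=float)
    return np.abs(w)


def top_weighted_groups(weights, group_size):
    """Rank attributes by descending weight and split them into groups.

    Each group stands for one incremental step: the first group would yield a
    partial reduct, and later groups would be searched with it kept as elite.
    """
    ranked = [int(j) for j in np.argsort(-weights, kind="stable")]
    return [ranked[k:k + group_size] for k in range(0, len(ranked), group_size)]


# Hypothetical toy data: attribute 0 alone determines the decision y.
X = [[0, 0, 0], [0, 1, 1], [1, 0, 1], [1, 1, 0]]
y = [0, 0, 1, 1]

weights = pseudo_inverse_weights(X, y)
groups = top_weighted_groups(weights, group_size=2)

print("weights:", weights)
print("groups:", groups)
print("dependency on attribute 0:", dependency_degree(X, y, [0]))
print("dependency on attribute 1:", dependency_degree(X, y, [1]))
```

In this toy example the highest-weighted attribute is also the one with full dependency degree, which is the situation the incremental scheme aims to exploit: searching the top-weighted group first can find a partial reduct early.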
Keywords: Algorithm, Genetic, Hybrid, Data, Methods.
Scope of the Article: Algorithm Engineering