Moore Data Clustering Based Bloom Hash Storage for Dimensionality Reduction of Big Data Analytics
Chitra. K¹, Maheswari. D²
¹Chitra. K, Research Scholar, School of Computer Studies, Rathnavel Subramaniam College of Arts and Science, Sulur, Coimbatore, Tamil Nadu.
²Maheswari. D, Head, Research Coordinator, School of Computer Studies - PG, Rathnavel Subramaniam College of Arts and Science, Sulur, Coimbatore, Tamil Nadu.
Manuscript received on 12 August 2019. | Revised Manuscript received on 17 August 2019. | Manuscript published on 30 September 2019. | PP: 8178-8184 | Volume-8 Issue-3 September 2019 | Retrieval Number: C6652098319/19©BEIESP | DOI: 10.35940/ijrte.C6652.098319
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Big data contains massive amounts of information that are difficult to acquire, store, manage, and analyze. Clustering is a demanding problem in the field of big data analytics. Existing clustering techniques do not perform efficiently, and their time complexity is high. Furthermore, reducing the dimensionality of big data has not been addressed effectively. To overcome these limitations, a Moore Data Clustering based Bloom Hash Storage (MDC-BHS) Technique is proposed. The MDC-BHS Technique is designed to reduce the dimensionality of big data in less time through clustering. It uses the Moore Data Clustering (MDC) Model to group the data in a big dataset with minimum time consumption. After clustering, the MDC-BHS Technique employs the Bloom Hash Storage (BHS) Model to store the clustered data with minimum space complexity. The BHS Model is a space-efficient probabilistic data structure that uses hashing functions to create hash values for the clustered data. As a result, the proposed MDC-BHS Technique significantly reduces the dimensionality of large datasets. The MDC-BHS Technique is evaluated experimentally on weather data using clustering time, clustering accuracy, and space complexity as metrics, each measured with respect to the number of data items. The experimental results demonstrate that the MDC-BHS Technique improves clustering accuracy and reduces space complexity when compared to state-of-the-art works.
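To illustrate the kind of structure the BHS Model builds on, the following is a minimal sketch of a generic Bloom filter in Python. It is not the authors' implementation: the abstract does not specify the hash functions, the bit-array size, or the record format, so the use of SHA-256 with a seed byte, the default sizes, and the `cluster-id:date:temperature` record strings are all assumptions made for illustration. Note that a standard Bloom filter records set membership only. It can report false positives but never false negatives, and it does not store the original records.

```python
import hashlib


class BloomFilter:
    """Generic Bloom filter (illustrative sketch, not the paper's BHS Model)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # num_bits and num_hashes are assumed defaults, not values from the paper.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by hashing the item with different seeds.
        data = repr(item).encode("utf-8")
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(bytes([seed]) + data).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        # Set every bit position derived from the item.
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # True means "possibly present"; False means "definitely absent".
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(item)
        )


# Hypothetical clustered weather records, keyed by an assumed cluster id.
records = ["cluster-0:2019-08-12:31.5C", "cluster-1:2019-08-13:24.0C"]
bf = BloomFilter()
for record in records:
    bf.add(record)
```

The filter above occupies a fixed 128 bytes regardless of how many records are added, which is the source of the space saving. The trade-off is that the false positive rate rises as more items are inserted into a filter of fixed size.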
Keywords: Big Data, Bloom Hash Storage, Dimensionality Reduction, Hashing Function, Moore Clustering, Moore Curve
Scope of the Article: Big Data Networking