Enhancing Performance of MapReduce Workflow through H2Hadoop: CJBT
Gopichand G1, Vishal Lella2, Sai Manikanta Avula3

1Gopichand G, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore (Tamil Nadu), India.
2Vishal Lella, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore (Tamil Nadu), India.
3Sai Manikanta Avula, School of Computer Science and Engineering, Vellore Institute of Technology, Vellore (Tamil Nadu), India.
Manuscript received on 29 April 2019 | Revised Manuscript received on 11 May 2019 | Manuscript Published on 17 May 2019 | PP: 652-656 | Volume-7 Issue-6S4 April 2019 | Retrieval Number: F11340476S419/2019©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Distributed computing uses the Hadoop framework to process Big Data in parallel. Hadoop, however, has limitations that can be exploited to execute various tasks more efficiently. These limitations mostly stem from data locality in the cluster, the scheduling of tasks and jobs, and resource allocation in Hadoop. Efficient resource allocation remains a challenge in Cloud Computing MapReduce platforms. Hence, we propose H2Hadoop, an enhanced architecture that reduces the computation cost associated with Big Data analysis. The proposed framework also addresses the resource-allocation problem of native Hadoop. H2Hadoop provides a reliable, accurate, and far faster solution for text data, such as finding DNA sequences and the motif of a DNA sequence. In addition, H2Hadoop offers an efficient data-mining technique for the Cloud Computing environment. The H2Hadoop design leverages the NameNode's ability to assign jobs to the TaskTrackers (DataNodes) within the cluster. It builds a metadata table, the Common Job Blocks Table (CJBT), holding the locations of the DataNodes that contain the features a job requires; when a similar job is later submitted to the JobTracker, the job is compared against the CJBT and assigned to the previously identified DataNodes instead of storing and reading through the whole cluster again. Compared with native Hadoop, H2Hadoop reduces CPU time, the number of read operations, and other Hadoop performance factors.
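To make the CJBT mechanism concrete, the following is a minimal sketch, not the paper's implementation: the class and method names (CommonJobBlockTable, record, lookup) and the example block and node identifiers are illustrative assumptions. It shows the core idea of caching which DataNodes answered a common job block so that a later, similar job can be routed directly to them, falling back to a full cluster scan only on a miss.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/**
 * Illustrative sketch of a Common Job Blocks Table (CJBT).
 * Maps a job's common block (e.g., a DNA subsequence being searched)
 * to the DataNodes that held matching blocks when a similar job last
 * ran, so the NameNode/JobTracker can reuse those assignments.
 */
public class CommonJobBlockTable {

    // Key: common job block; value: DataNodes that produced hits for it.
    private final Map<String, List<String>> table = new HashMap<>();

    /** After a job finishes, record which DataNodes served its common block. */
    public void record(String commonBlock, List<String> dataNodes) {
        table.put(commonBlock, new ArrayList<>(dataNodes));
    }

    /**
     * Return the DataNodes previously used for this common block,
     * or null to signal that a full scan (native Hadoop path) is needed.
     */
    public List<String> lookup(String commonBlock) {
        return table.get(commonBlock);
    }

    public static void main(String[] args) {
        CommonJobBlockTable cjbt = new CommonJobBlockTable();
        String block = "GATTACA"; // hypothetical DNA subsequence

        // First job: table miss, so the whole cluster is scanned,
        // then the resulting node list is cached in the CJBT.
        if (cjbt.lookup(block) == null) {
            List<String> hits = List.of("datanode-3", "datanode-7");
            cjbt.record(block, hits);
        }

        // Second, similar job: the CJBT answers directly, skipping the scan.
        System.out.println("Assign to: " + cjbt.lookup(block));
    }
}

In this sketch the saving comes from the table hit replacing a pass over every block in the cluster, which mirrors the abstract's claim of fewer read operations and lower CPU time for repeated text-search jobs.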
Keywords: CJBT, DNA Sequences, H2Hadoop.
Scope of the Article: Process and Workflow Management