A Comparative Analysis of Data Replication Strategies and Consistency Maintenance in Distributed File Systems
Priya Deshpande1, Aniket Bhaise2, Prasanna Joeg3

1Priya Deshpande, Assistant Professor, MIT College of Engineering, Pune (M.H), India.
2Aniket Bhaise, ME. Student, MIT College of Engineering, Pune (M.H), India.
3Prasanna Joeg, Professor, MIT College of Engineering, Pune (M.H), India.

Manuscript received on 21 March 2013 | Revised Manuscript received on 28 March 2013 | Manuscript published on 30 March 2013 | PP: 109-114 | Volume-2 Issue-1, March 2013 | Retrieval Number: A0522032113/2013©BEIESP
© The Authors. Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC-BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)

Abstract: Data plays a vital role in data-intensive applications, that is, applications that rely on large data files. In the era of information technology the volume of data is growing drastically, and data at this scale is usually referred to as “Big Data”. Such data may be unstructured or structured and is held in data grids or clusters; retrieving, storing, and updating it is a tedious job for data-intensive applications. Data grids are a type of data cluster technique that deals with such big data, and they may be heterogeneous or homogeneous in nature depending on their properties. With fast-growing technology, however, heterogeneous data grids are now being replaced by cloud computing, which offers them as one of its services. In a cloud computing network, data replication and consistency maintenance play a key role in sharing data between nodes (data-intensive applications) to achieve high performance, data availability, consistency, and partition tolerance. In this paper we discuss various data replication strategies with the Hadoop Distributed File System, which supports the MapReduce framework, for data replication and consistency maintenance in cloud computing, with the aim of achieving high performance, consistency, availability, and partition tolerance. We also discuss the performance evaluation of these techniques and frameworks, such as Cloud MapReduce, integrated data replication and consistency maintenance, and MapReduce with Adaptive Load balancing for Heterogeneous and Load imbalanced clusters (MARLA).
Keywords: Distributed System, Data Intensive Applications, Data Grids, Data Replicas, Job Scheduling, Cloud Computing

Scope of the Article: Cloud Computing