IJBSTR REVIEW PAPER VOL 1 [ISSUE 7] JULY 2013
ISSN 2320 – 6020
A Survey on Dynamic Replication Strategies for Improving Response Time in Data Grids

Ashish Kumar Singh1, Shashank Srivastava2 and Udai Shanker3

ABSTRACT: Replication is the process of creating exact copies of data to improve data availability, and it is well suited to distributed database systems. Dynamic replication helps reduce bandwidth consumption and access latency in large-scale databases such as data grids, and thereby improves response time. Different replication strategies can be defined depending on when, where, and how replicas are created and destroyed. A data grid is a prime example of a distributed database system: a distributed collection of storage and computational resources that is not bound to a single geographical location. Working with a data grid therefore means dealing with geographical and temporal locality.

KEYWORDS: Data grid, Replication, DDBS, Scalability.

Introduction

In recent years, distributed databases have become an important area of information processing, and their importance will continue to grow rapidly. In a distributed database, sites are interconnected through a network, and managing data in this fully interconnected environment requires a method that keeps data availability problems from arising. This interconnected environment forms a data grid. A data grid holds a huge amount of data located at multiple sites, where each site can be its own administrative domain that decides who may access the data. Data replication addresses the problems of accessing data from a server by improving the performance and availability of data. There are two principal replication schemes: active replication and passive replication. In a replicated environment, copies of data are hosted by multiple sites.
Increasing the number of copies, or replicas, enhances system performance by improving the locality of data. A data grid exhibits both temporal and geographical locality: temporal locality means that files that were popular in the past will be accessed more in the future, and geographical locality means that files recently accessed by a client are likely to be accessed by nearby clients.

Ashish Kumar Singh1, Shashank Srivastava2 and Udai Shanker3
Department of Computer Science & Engineering, Madan Mohan Malaviya Engineering College, Gorakhpur-273 010
Email: ashi001.ipec@gmail.com1, shashank07oct@gmail.com2 and udaigkp@gmail.com3
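The notion of temporal locality described in the introduction can be illustrated with a minimal sketch: a site that keeps requesting the same remote file is likely to request it again, so a replica is created there once a request count is reached. The class name `AccessTracker`, its methods, and the `threshold` parameter are hypothetical and chosen only for illustration; this is not the algorithm of any paper surveyed here.

```python
from collections import defaultdict


class AccessTracker:
    """Counts file requests per site and creates a replica at a site
    once that site has requested a remote file often enough.

    Illustrative only: the threshold rule is an assumption, not a
    strategy taken from the surveyed literature.
    """

    def __init__(self, threshold=3):
        # Hypothetical tuning knob: requests needed before replicating.
        self.threshold = threshold
        self.counts = defaultdict(int)    # (site, file) -> request count
        self.replicas = defaultdict(set)  # file -> sites holding a copy

    def add_replica(self, file_name, site):
        """Register that `site` already holds a copy of `file_name`."""
        self.replicas[file_name].add(site)

    def record_access(self, site, file_name):
        """Record one request from `site`.

        Returns True if this request caused a new replica to be
        created at `site`, otherwise False.
        """
        self.counts[(site, file_name)] += 1
        already_local = site in self.replicas[file_name]
        if not already_local and self.counts[(site, file_name)] >= self.threshold:
            self.replicas[file_name].add(site)
            return True
        return False


if __name__ == "__main__":
    tracker = AccessTracker(threshold=3)
    tracker.add_replica("dataset.dat", "site_A")  # original copy
    for _ in range(3):
        created = tracker.record_access("site_B", "dataset.dat")
    print(created, sorted(tracker.replicas["dataset.dat"]))
```

A real strategy would also have to decide when to delete replicas and how to account for storage limits, which this sketch deliberately leaves out.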
Problems in file replication include availability, reliability, cost, scalability, throughput, network traffic, response time, and autonomous operation. The purpose of a distributed file system is to allow users of physically distributed computers to share data and storage resources by using a common file system. The main advantages of replication are [5]:

1. Improved availability: if a node fails, the system can replicate the data from another site, which keeps the data available.
2. Improved performance: since the data is replicated among several nodes, the user can obtain data from the nearest node or from the node that is best in terms of workload.

Data replication is very attractive for increasing system throughput and providing fault tolerance. However, it is a challenge to keep data copies consistent.

LITERATURE REVIEW OF REPLICATION ALGORITHMS
Replication involves using specialized software that looks for changes in the distributed database. Once the changes have been identified, the replication process makes all the databases look the same. Depending on the size and number of the distributed databases, this process can be complex and can require a lot of time and computer resources [14]. Several algorithms have been proposed by different authors to address the problems of the replication process; they are as follows.

Dynamic Group Protocol [1]

In this paper, the author developed the Dynamic Group Protocol in 1992. The protocol adapts itself to changes in site availability and network connectivity, which allows it to tolerate n - 2 successive replica failures. The dynamic group