
Q. What are the differences between centralized and distributed database systems? List the relative advantages of data distribution.

Bigg Boss
Check out this once.


A distributed database is a database that is under the control of a central database management system (DBMS) in which storage devices are not all attached to a common CPU. It may be stored in multiple computers located in the same physical location, or may be dispersed over a network of interconnected computers.


Collections of data (e.g. in a database) can be distributed across multiple physical locations. A distributed database can reside on network servers on the Internet, on corporate intranets or extranets, or on other company networks. The replication and distribution of databases can improve database performance at end-user worksites.


To keep distributed databases up to date, there are two processes: replication and duplication. Replication uses specialized software that looks for changes in the distributed databases. Once the changes have been identified, the replication process makes all the databases look the same. Depending on the size and number of the distributed databases, replication can be complex and can require a lot of time and computer resources. Duplication, on the other hand, is less complicated. It identifies one database as the master and then copies that database to the other sites. Duplication is normally done at a set time after hours, so that each distributed location has the same data. During duplication, changes are allowed only to the master database, so that local data is not overwritten. Both processes can keep the data current in all distributed locations.
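The difference between the two processes can be sketched in a few lines of Python. This is only an illustration under simple assumptions: each site is modelled as an in-memory dictionary, and the names `duplicate`, `replicate`, `change_log` and the account keys are made up for the example, not taken from any real DBMS.

```python
import copy

def duplicate(master, sites):
    """Duplication: overwrite every site with a full copy of the master."""
    for name in sites:
        sites[name] = copy.deepcopy(master)

def replicate(change_log, sites):
    """Replication: apply only the detected changes to every site."""
    for key, value in change_log:
        for data in sites.values():
            data[key] = value
    change_log.clear()  # these changes have now reached every site

# Hypothetical data: one master copy and two branch sites.
master = {"acct-1": 100, "acct-2": 250}
sites = {"branch_a": {}, "branch_b": {}}

duplicate(master, sites)                       # nightly full copy
changes = [("acct-1", 80), ("acct-3", 40)]     # changes found since then
replicate(changes, sites)                      # bring every site in line
```

Duplication copies everything regardless of what changed, which is simple but only safe when updates go to the master alone. Replication moves less data but has to track changes, which is where the extra complexity comes from.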

Besides replication and fragmentation, there are many other distributed database design technologies, for example local autonomy and synchronous and asynchronous distributed database technologies. How these technologies are implemented depends on the needs of the business and on the sensitivity or confidentiality of the data to be stored, and hence on the price the business is willing to pay to ensure data security, consistency and integrity.

Basic architecture

A database user accesses the distributed database through:

Local applications:

    Applications which do not require data from other sites.

Global applications:

    Applications which do require data from other sites.
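The distinction between the two kinds of application can be expressed as a small check. The sketch below assumes a hypothetical `placement` table that maps each data item to the single site storing it; the function name `classify` and the site names are invented for the example.

```python
def classify(home_site, required_items, placement):
    """Return 'local' if every required item is stored at home_site, else 'global'."""
    sites_needed = {placement[item] for item in required_items}
    return "local" if sites_needed <= {home_site} else "global"

# Hypothetical placement of data items on sites.
placement = {
    "accounts_delhi": "delhi",
    "accounts_mumbai": "mumbai",
}

kind_a = classify("delhi", ["accounts_delhi"], placement)
kind_b = classify("delhi", ["accounts_delhi", "accounts_mumbai"], placement)
print(kind_a, kind_b)  # prints: local global
```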


The sites of a distributed database do not share main memory or disks.


A centralized database keeps all of its data in one place, whereas a distributed database keeps its data in several places. Because all the data in a centralized database resides in one place, that site can become a bottleneck, and data is less available than in a distributed database. The advantages of data distribution listed below make the differences between centralized and distributed databases clearer.


Advantages of Data Distribution

The primary advantage of distributed database systems is the ability to share and access data in a reliable and efficient manner.

Data Sharing and Distributed Control:

If a number of different sites are connected to each other, then a user at one site may be able to access data that is available at another site. For example, in a distributed banking system, it is possible for a user in one branch to access data in another branch. Without this capability, a user wishing to transfer funds from one branch to another would have to resort to some external mechanism for such a transfer. This external mechanism would, in effect, be a single centralized database.


The primary advantage of accomplishing data sharing by means of data distribution is that each site is able to retain a degree of control over data stored locally. In a centralized system, the database administrator of the central site controls the database. In a distributed system, there is a global database administrator responsible for the entire system. A part of these responsibilities is delegated to the local database administrator for each site. Depending upon the design of the distributed database system, each local administrator may have a different degree of autonomy. This local autonomy is often a major advantage of distributed databases.


Reliability and Availability:


If one site fails in a distributed system, the remaining sites may be able to continue operating. In particular, if data are replicated at several sites, a transaction needing a particular data item may find it at any of those sites. Thus, the failure of a site does not necessarily imply the shutdown of the system.


The failure of one site must be detected by the system, and appropriate action may be needed to recover from the failure. The system must no longer use the service of the failed site. Finally, when the failed site recovers or is repaired, mechanisms must be available to integrate it smoothly back into the system.
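The three steps described here (detect the failure, stop using the failed site, bring it back in after repair) can be sketched as follows. This is a toy model, not a real recovery protocol: the `Site` class, its `up` flag, and the `read` and `reintegrate` helpers are assumptions made for the example, and reintegration simply copies the data of a live peer.

```python
class Site:
    """A hypothetical site holding a replica of the data."""
    def __init__(self, name, data):
        self.name = name
        self.data = dict(data)
        self.up = True  # set to False when a failure is detected

def read(item, replicas):
    """Read item from the first replica that is up, skipping failed sites."""
    for site in replicas:
        if site.up and item in site.data:
            return site.name, site.data[item]
    raise LookupError(f"no available replica holds {item!r}")

def reintegrate(site, live_peer):
    """Bring a repaired site back by copying the current data of a live peer."""
    site.data = dict(live_peer.data)
    site.up = True

a = Site("A", {"seat-12": "free"})
b = Site("B", {"seat-12": "free"})

a.up = False                         # failure of site A is detected
served_by, _ = read("seat-12", [a, b])
b.data["seat-12"] = "sold"           # work continues at site B
reintegrate(a, b)                    # site A rejoins with current data
```

While site A is down, the read is served by site B, so the system stays available. The cost is the extra step of bringing A up to date before it can be used again.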


Although recovery from failure is more complex in distributed systems than in a centralized system, the ability of most of the system to continue to operate despite the failure of one site results in increased availability. Availability is crucial for database systems used for real-time applications. Loss of access to data in an airline system, for example, may result in the loss of potential ticket buyers to competitors.


Speedup of Query Processing:


If a query involves data at several sites, it may be possible to split the query into subqueries that can be executed in parallel by several sites. Such parallel computation allows for faster processing of a user’s query. In those cases in which data is replicated, queries may be directed by the system to the least heavily loaded sites.
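Both ideas can be illustrated with a short Python sketch. It assumes a query that totals a column whose rows are split across sites, with threads standing in for the sites; the names `local_sum`, `distributed_sum`, `least_loaded` and the sample figures are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def local_sum(rows):
    """Subquery run at one site: total of that site's rows."""
    return sum(rows)

def distributed_sum(fragments):
    """Run one subquery per site in parallel, then combine the partial results.

    Assumes fragments is a non-empty mapping of site name to rows.
    """
    with ThreadPoolExecutor(max_workers=len(fragments)) as pool:
        partials = list(pool.map(local_sum, fragments.values()))
    return sum(partials)

def least_loaded(loads):
    """Choose the replica site with the smallest current load."""
    return min(loads, key=loads.get)

# Hypothetical balances stored at three sites.
fragments = {"site_1": [100, 250], "site_2": [40], "site_3": [10, 20, 30]}
total = distributed_sum(fragments)

# Hypothetical number of active queries at each replica.
target = least_loaded({"site_1": 7, "site_2": 2, "site_3": 5})
print(total, target)  # prints: 450 site_2
```

The split works here because a sum can be computed from partial sums. A query whose result cannot be combined from per-site results this simply would need more coordination between sites.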