
Advanced Information Systems: Parallel and Distributed Databases

Parallel and Distributed Databases: Why?

Parallel Database system: seeks to improve performance via parallel implementation of operations; any distribution of data is governed solely by performance considerations

Distributed Database system: data is physically stored across multiple sites, each typically managed by an independent DBMS; distribution of data is governed by factors such as local ownership and increased availability, in addition to performance issues

Architectures for Parallel Databases

basic idea: carry out evaluation steps in parallel whenever possible, in order to improve performance

For simplicity: centralized DBMS but processing of operations is parallelized

three main architecture models:

Evaluation of Architectures

Parallel Query Evaluation

a relational query execution plan is a graph of relational algebra operators, and the operators in this graph can be executed in parallel

Data Partitioning

partitioning a large dataset across several disks lets us exploit the combined I/O bandwidth of the disks by reading and writing in parallel

partitioning methods:
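
Schemes commonly described in the literature include round-robin, hash, and range partitioning. The following is a minimal sketch of these three, assuming rows are Python tuples and each "disk" is modelled as an in-memory list; the function names, the `key` parameter, and the `boundaries` list are illustrative assumptions, not part of the original notes.

```python
from bisect import bisect_right

def round_robin_partition(rows, n):
    """Assign the i-th row to partition i mod n (hypothetical helper)."""
    parts = [[] for _ in range(n)]
    for i, row in enumerate(rows):
        parts[i % n].append(row)
    return parts

def hash_partition(rows, n, key):
    """Assign each row to partition hash(key(row)) mod n (hypothetical helper)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(key(row)) % n].append(row)
    return parts

def range_partition(rows, boundaries, key):
    """Assign each row by key range; `boundaries` must be sorted.

    Partition 0 holds keys below boundaries[0]; partition i holds keys in
    [boundaries[i-1], boundaries[i]); the last partition holds the rest.
    """
    parts = [[] for _ in range(len(boundaries) + 1)]
    for row in rows:
        parts[bisect_right(boundaries, key(row))].append(row)
    return parts

# Example data: (name, age) tuples, partitioned on age.
rows = [("ann", 25), ("bob", 30), ("cy", 47), ("di", 62)]
by_age = lambda row: row[1]
rr = round_robin_partition(rows, 2)
hp = hash_partition(rows, 2, by_age)
rp = range_partition(rows, [30, 50], by_age)
```

Round-robin spreads rows evenly regardless of content, while hash and range partitioning place rows with equal keys in the same partition, which matters when a later operator (such as a join or sort) works on that key.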

Parallelizing Sequential Operator Evaluation Code

Parallelizing Relational Operations

Sorting
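
One common way to parallelize sorting is to range-partition the data on the sort key, sort each partition independently, and concatenate the results in partition order. The sketch below illustrates that idea under stated assumptions: `parallel_sort` and `boundaries` are hypothetical names, and a thread pool stands in for separate processors (in CPython, threads show the structure of the algorithm rather than a real CPU speed-up).

```python
from bisect import bisect_right
from concurrent.futures import ThreadPoolExecutor

def parallel_sort(values, boundaries):
    """Range-partition `values`, sort partitions concurrently, concatenate.

    `boundaries` is a sorted list of split points (an assumption here; a
    real system might choose them by sampling the data).
    """
    # Step 1: redistribute values so partition i only holds keys that are
    # smaller than every key in partition i + 1.
    parts = [[] for _ in range(len(boundaries) + 1)]
    for v in values:
        parts[bisect_right(boundaries, v)].append(v)

    # Step 2: each worker sorts its own partition independently.
    with ThreadPoolExecutor(max_workers=len(parts)) as pool:
        sorted_parts = list(pool.map(sorted, parts))

    # Step 3: partitions are already ordered relative to each other,
    # so concatenation yields a fully sorted result.
    return [v for part in sorted_parts for v in part]

ages = [47, 25, 62, 30, 18, 50]
sorted_ages = parallel_sort(ages, [30, 50])
```

No merge step is needed at the end because range partitioning already orders the partitions with respect to one another; the quality of the split points determines how evenly the work is balanced.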

Parallel Join Algorithms

consider the join of two relations A and B on the age attribute

Parallel Hash Join
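
A parallel hash join can be sketched as follows: hash-partition both relations on the join attribute so that matching tuples land in the same partition, then join each pair of partitions independently. This is a minimal illustration, assuming relations A and B are lists of `(name, age)` tuples; the function names are hypothetical, and a thread pool stands in for separate processors.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor

def hash_partition(rows, n, key):
    """Send each row to partition hash(key(row)) mod n (hypothetical helper)."""
    parts = [[] for _ in range(n)]
    for row in rows:
        parts[hash(key(row)) % n].append(row)
    return parts

def local_hash_join(a_part, b_part, key_a, key_b):
    """Join one pair of partitions: build a hash table on A, probe with B."""
    table = defaultdict(list)
    for a in a_part:
        table[key_a(a)].append(a)
    return [(a, b) for b in b_part for a in table.get(key_b(b), [])]

def parallel_hash_join(a_rows, b_rows, n, key_a, key_b):
    """Partition both inputs with the same hash function, then join per partition."""
    a_parts = hash_partition(a_rows, n, key_a)
    b_parts = hash_partition(b_rows, n, key_b)
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = pool.map(
            lambda pair: local_hash_join(pair[0], pair[1], key_a, key_b),
            zip(a_parts, b_parts),
        )
        return [match for part in results for match in part]

# Example: join A and B on age (second field of each tuple).
A = [("ann", 25), ("bob", 30), ("cy", 25)]
B = [("x", 25), ("y", 40), ("z", 30)]
age = lambda row: row[1]
joined = parallel_hash_join(A, B, 2, age, age)
```

Because both relations are partitioned with the same hash function on age, tuples with equal ages can only meet inside one partition, so the per-partition joins need no communication with each other.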



Bhagirath Narahari
Thu Sep 11 13:14:14 EDT 1997