Parallel database system solution

Today, in a competitive world, enterprises of all kinds depend on timely, up-to-date information. Information volumes are growing 25-35% per year, and the traditional transaction rate has been forecast to grow by a factor of 10 over the next five years, twice the current trend in mainframe growth. In addition, a growing number of transactions already arise from computer systems in business-to-business interworking and from intelligent terminals in the home, office or factory.

The profile of the transaction load is also changing as decision-support queries, typically complex, are added to the existing simpler, largely clerical workloads. Thus, complex queries, such as those macro-generated by decision support systems or system-generated as in production control, will grow in number and demand significant throughput with acceptable response times. In addition, very complex queries on very large databases, generated by skilled staff workers or expert systems, may hurt throughput while demanding good response times.

From a database point of view, the problem is to come up with database servers that support all these types of queries efficiently on possibly very large on-line databases. However, the impressive silicon technology improvements alone cannot keep pace with these increasing requirements. Microprocessor performance is now increasing 50% per year, and memory chips are increasing in capacity by a factor of 16 every six years. RISC processors today can deliver between 50 and 100 MIPS (the new 64-bit DEC Alpha processor is predicted to deliver 200 MIPS at cruise speed!) at a much lower price/MIPS than mainframe processors. This is in contrast to much slower progress in disk technology, which has improved by only a factor of 2 in response time and throughput over the last 10 years. With such uneven progress, the I/O bottleneck worsens with time.
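To make the gap concrete, the growth figures quoted above can be compounded over a common ten-year window. This is only a back-of-the-envelope sketch: it assumes the 50% annual processor gain holds steady for ten years, which the text does not claim, and the function names are illustrative.

```python
def compound_growth(annual_rate: float, years: int) -> float:
    """Total improvement factor after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


def annualized_rate(total_factor: float, years: int) -> float:
    """Equivalent yearly growth rate for a total factor reached over the given years."""
    return total_factor ** (1.0 / years) - 1.0


# Figures taken from the paragraph above.
cpu_factor_10y = compound_growth(0.50, 10)   # 50% per year, assumed constant
disk_factor_10y = 2.0                        # factor of 2 over the last 10 years
memory_annual = annualized_rate(16.0, 6)     # factor of 16 every six years

# How much faster the processor side has grown than the disk side.
gap = cpu_factor_10y / disk_factor_10y

print(f"CPU improvement over 10 years:  x{cpu_factor_10y:.1f}")
print(f"Disk improvement over 10 years: x{disk_factor_10y:.1f}")
print(f"Memory capacity growth per year: {memory_annual:.0%}")
print(f"CPU-to-disk gap after 10 years: x{gap:.1f}")
```

Under these assumptions the processor improves by a factor of roughly 58 while the disk improves by a factor of 2, so a single processor would need many more disks working in parallel just to stay fed with data.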

The solution is therefore to use large-scale parallelism to magnify the raw power of individual components by integrating them in a complete system along with the appropriate parallel database software. Using standard hardware components is essential to exploit the continuing technology improvements with minimal delay. The database software can then exploit the three forms of parallelism inherent in data-intensive application workloads. Interquery parallelism enables the parallel execution of multiple queries generated by concurrent transactions. Intraquery parallelism makes possible the parallel execution of multiple, independent operations (e.g., select operations) within the same query. Both interquery and intraquery parallelism can be obtained by using data partitioning. Finally, with intraoperation parallelism, the same operation can be executed as many sub-operations using function partitioning in addition to data partitioning. The set-oriented mode of database languages (e.g., SQL) provides many opportunities for intraoperation parallelism. For example, the performance of the join operation can be increased significantly by parallelism.
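As a rough illustration of intraoperation parallelism through data partitioning, the sketch below splits a single join into independent sub-joins, one per partition. It is not taken from any particular database system: the function names, the tuple layout of the rows, and the use of a thread pool are all assumptions made for the example. In CPython, threads will not give a real CPU speedup here; the point is only to show the structure of the technique.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def partition(rows, key_index, n):
    """Hash-partition rows into n buckets on the join key."""
    buckets = [[] for _ in range(n)]
    for row in rows:
        buckets[hash(row[key_index]) % n].append(row)
    return buckets


def local_hash_join(left_rows, right_rows, left_key, right_key):
    """Join one pair of partitions: build a hash table on the left, probe with the right."""
    table = defaultdict(list)
    for left_row in left_rows:
        table[left_row[left_key]].append(left_row)
    return [
        left_row + right_row
        for right_row in right_rows
        for left_row in table.get(right_row[right_key], [])
    ]


def parallel_join(left, right, left_key=0, right_key=0, n=4):
    """Run the same join operation as n sub-operations, one per partition."""
    left_parts = partition(left, left_key, n)
    right_parts = partition(right, right_key, n)
    # Rows with equal keys land in the same bucket number on both sides,
    # so each pair of buckets can be joined independently of the others.
    with ThreadPoolExecutor(max_workers=n) as pool:
        results = pool.map(
            local_hash_join, left_parts, right_parts, [left_key] * n, [right_key] * n
        )
        return [row for part in results for row in part]


# Hypothetical sample relations: (id, name) and (customer_id, item).
customers = [(1, "Ada"), (2, "Bob"), (3, "Cy")]
orders = [(1, "disk"), (3, "cpu"), (1, "ram")]

joined = sorted(parallel_join(customers, orders))
for row in joined:
    print(row)
```

Because both relations are partitioned with the same hash function on the join key, no sub-join ever needs rows held by another one, which is what lets the work be spread across processors and disks.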
