Data Skew-Handling in Parallel MDIM Data Warehouses

A.M.C. Monteiro and P.N. Furtado (Portugal)

Keywords

Parallel processing, skew effects, and partitioning.

Abstract

Parallel approaches can significantly improve performance in large data warehouses. However, the success of such strategies often depends on good partitioning of the data, and skew can undermine partitioning effectiveness. In this paper we propose and test a fragment directory with a simple data balancing strategy to handle skewed data efficiently in multidimensionally partitioned data sets. We built a simulator and tested the new strategy against skewed data sets. Our results show a significant improvement in query response time with the simple strategy.
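As an illustration only, the sketch below shows one way a fragment directory with a simple balancing rule could look. The abstract does not specify the authors' actual strategy, so everything here is an assumption: the greedy largest-first placement rule, the names `build_directory` and `node_loads`, and the row counts are all hypothetical. Each fragment is identified by its cell coordinates in a multidimensional partitioning grid, and the directory maps each fragment to the node that stores it.

```python
import heapq
from typing import Dict, List, Tuple

# A fragment is identified by its cell coordinates in the partitioning grid.
Fragment = Tuple[int, ...]


def build_directory(fragment_rows: Dict[Fragment, int], num_nodes: int) -> Dict[Fragment, int]:
    """Map each fragment to a node (assumed greedy rule, not the paper's method).

    Fragments are placed largest first, each on the currently least-loaded node.
    """
    # Min-heap of (rows assigned so far, node id).
    heap = [(0, node) for node in range(num_nodes)]
    heapq.heapify(heap)
    directory: Dict[Fragment, int] = {}
    # Sort by descending size; ties are broken by fragment coordinates.
    for frag, rows in sorted(fragment_rows.items(), key=lambda kv: (-kv[1], kv[0])):
        load, node = heapq.heappop(heap)
        directory[frag] = node
        heapq.heappush(heap, (load + rows, node))
    return directory


def node_loads(fragment_rows: Dict[Fragment, int],
               directory: Dict[Fragment, int],
               num_nodes: int) -> List[int]:
    """Total number of rows stored on each node under the given directory."""
    loads = [0] * num_nodes
    for frag, node in directory.items():
        loads[node] += fragment_rows[frag]
    return loads


# Hypothetical skewed 2x3 grid: one fragment holds almost half of all rows.
fragment_rows = {
    (0, 0): 900, (0, 1): 50, (0, 2): 50,
    (1, 0): 400, (1, 1): 300, (1, 2): 300,
}
directory = build_directory(fragment_rows, num_nodes=2)
loads = node_loads(fragment_rows, directory, num_nodes=2)
```

In this made-up example the oversized fragment is placed alone first, and the smaller fragments are then used to even out the two nodes. A query processor would consult `directory` to find which node holds each fragment it needs.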
