Workload-Based Ordering of Multi-Dimensional Data


Abstract: Transforming multi-dimensional data into a one-dimensional sequence using space-filling curves such as the Hilbert curve, the Gray curve, and the Z-curve has been studied extensively. These techniques are not sensitive to data or workload skewness; in practice, however, user-access patterns and data distributions are often highly skewed in high-dimensional space. It is desirable to produce a one-dimensional sequence that keeps the multi-dimensional grid cells that are queried together close to each other, since such a sequence has higher spatial locality. In this paper, we propose a workload-based approach to producing a one-dimensional ordering from multi-dimensional data. An extensive experimental evaluation suggests that our approach produces a high-quality ordering sequence. Measured by the number of subsequences used to answer a query, it outperforms the existing state-of-the-art Hilbert curve by a factor of 4.84, the Gray curve by a factor of 6.66, and the Z-curve by a factor of 7.26. Measured by IO time, it outperforms the Hilbert curve by a factor of 2.20, the Gray curve by a factor of 2.25, and the Z-curve by a factor of 2.38.
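The "number of subsequences used to answer a query" metric can be made concrete with a small sketch. The Python below is illustrative only and is not the workload-based method proposed in the paper: it maps the cells of a 2-D grid to a Z-curve position by bit interleaving, then counts how many maximal runs of consecutive positions a rectangular query touches. The function names (`z_index`, `count_subsequences`, `query_cells`), the 4x4 grid, and the row-major comparison are assumptions chosen for the example.

```python
from typing import Iterable, List, Tuple


def z_index(x: int, y: int, bits: int) -> int:
    """Z-curve position of cell (x, y): interleave the bits of x and y.

    Bits of x land in even positions, bits of y in odd positions.
    """
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key


def count_subsequences(positions: Iterable[int]) -> int:
    """Count maximal runs of consecutive integers among the positions.

    Each run corresponds to one contiguous subsequence of the
    one-dimensional ordering that must be read to answer the query.
    """
    ordered = sorted(set(positions))
    if not ordered:
        return 0
    runs = 1
    for prev, cur in zip(ordered, ordered[1:]):
        if cur != prev + 1:
            runs += 1
    return runs


def query_cells(x_lo: int, x_hi: int, y_lo: int, y_hi: int) -> List[Tuple[int, int]]:
    """All grid cells inside an inclusive rectangular query."""
    return [(x, y) for x in range(x_lo, x_hi + 1) for y in range(y_lo, y_hi + 1)]


# Hypothetical 4x4 grid (2 bits per dimension).
BITS = 2
WIDTH = 1 << BITS

# A 2x2 query that straddles the quadrant boundaries of the Z-curve.
centered = query_cells(1, 2, 1, 2)
z_runs_centered = count_subsequences(z_index(x, y, BITS) for x, y in centered)
row_runs_centered = count_subsequences(y * WIDTH + x for x, y in centered)

# A 2x2 query aligned with one quadrant of the Z-curve.
aligned = query_cells(0, 1, 0, 1)
z_runs_aligned = count_subsequences(z_index(x, y, BITS) for x, y in aligned)

print(z_runs_centered, row_runs_centered, z_runs_aligned)
```

On this toy grid the centered query needs 4 subsequences under the Z-curve and 2 under plain row-major order, while the quadrant-aligned query needs only 1 under the Z-curve. The same fixed ordering can therefore be good or poor depending on which cells the workload queries together, which is the kind of skew a workload-based ordering is meant to exploit.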

