Compute Pairwise Euclidean Distances of Data Points with GPUs

D. Chang, N.A. Jones, D. Li, M. Ouyang, and R.K. Ragade (USA)

Keywords

High performance biocomputing, parallel and distributed computation, microarray data analysis, hierarchical clustering

Abstract

Graphics processing units (GPUs) are powerful computational devices tailored toward the needs of the 3-D gaming industry for high-performance, real-time graphics engines. Nvidia released a new generation of GPUs designed for general-purpose computing in 2006, and a GPU programming language called CUDA in 2007. DNA microarray technology is a high-throughput tool for assaying gene expression of cell cultures or tissue samples. During the exploratory phase of data analysis, scientists often apply (agglomerative) hierarchical clustering to the genes. In hierarchical clustering, a fundamental operation is to calculate the pairwise distances among all genes. If there are n genes, this takes O(n^2) time. In the present study, we examine how to use GPUs and the CUDA language to speed up the calculation. The results show a speedup of 20 to 44 times on the GPU compared to the CPU implementation.
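To make the O(n^2) cost concrete, the pairwise distance calculation can be sketched as a simple sequential reference. This is a minimal illustration in Python, not the implementation evaluated in the study; the function name `pairwise_distances` and the toy `genes` data are assumptions introduced here for illustration only.

```python
import math


def pairwise_distances(rows):
    """Return the full n x n matrix of Euclidean distances between rows.

    Each row is one gene's expression profile (hypothetical layout:
    one value per sample). The double loop visits n*(n-1)/2 pairs,
    which is the O(n^2) cost mentioned in the abstract.
    """
    n = len(rows)
    dist = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            # Sum of squared differences over all samples, then square root.
            d = math.sqrt(sum((a - b) ** 2 for a, b in zip(rows[i], rows[j])))
            dist[i][j] = d
            dist[j][i] = d  # the distance matrix is symmetric
    return dist


# Hypothetical toy data: 3 genes measured across 2 samples.
genes = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]
D = pairwise_distances(genes)
```

Each entry of the matrix depends only on its own pair of rows, so the entries can in principle be computed independently of one another; that independence is what makes the calculation a natural candidate for a massively parallel device such as a GPU.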
