Improving the Performance of Document Classification by using GPU Parallelism

Il-Nam Park, Byunggul Bae, and Seung-Shik Kang


Keywords: GPU parallelism, document classification, CUDA framework, memory loading time


Document classification and clustering algorithms require a large number of vector similarity calculations. The complexity of these algorithms is O(n^2), so their speed drops sharply as the number of documents increases. GPU parallelism can greatly improve the performance of similarity calculation. In this paper, we propose a method of improving calculation speed with the CUDA framework, since GPU parallelism can accelerate vector operations. The most troublesome part of using the CUDA framework for GPU parallel processing is memory loading time, and we tried to find the best way to reduce it. The fastest method was to use texture memory, and the second fastest was to use global memory. With texture memory, operation on the GPU was three times faster than operation on the CPU.
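To make the O(n^2) workload concrete, the following is a minimal CPU-side sketch in Python of all-pairs similarity. It is not the authors' implementation. The abstract says only "vector similarity", so cosine similarity is an assumption here, and the function names and toy vectors are hypothetical. Each pair is computed independently of the others, which is the property that would let a GPU assign one pair to each thread.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length term-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen for this sketch: empty documents match nothing
    return dot / (norm_u * norm_v)


def pairwise_similarities(docs):
    """Return {(i, j): similarity} for every unordered pair with i < j.

    There are n * (n - 1) / 2 such pairs, hence the quadratic cost.
    """
    return {
        (i, j): cosine_similarity(docs[i], docs[j])
        for i, j in combinations(range(len(docs)), 2)
    }


# Hypothetical toy term-weight vectors (not data from the paper).
docs = [
    [1.0, 0.0, 2.0],
    [2.0, 0.0, 4.0],
    [0.0, 3.0, 0.0],
    [1.0, 1.0, 0.0],
]
sims = pairwise_similarities(docs)
print(len(sims), "pairs for", len(docs), "documents")
```

Doubling the collection from 4 to 8 documents raises the pair count from 6 to 28, which illustrates why the running time grows so quickly with the number of documents.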
