GPU Clustering of Very Large Data Sets (Billion-Point Scale)
2009-12-09 09:52
To start, here is a repost of an earlier article.
Clustering billions of data points using GPUs
"In this paper, we report our research on using GPUs to accelerate
clustering of very large data sets, which are common in today's real
world applications. While many published works have shown that GPUs can
be used to accelerate various general purpose applications with
respectable performance gains, few attempts have been made to tackle
very large problems. Our goal here is to investigate if GPUs can be
useful accelerators even with very large data sets that cannot fit into
GPU's onboard memory.
Using a popular clustering algorithm, K-Means, as an example, our
results have been very positive. On a data
set with a billion data points, our GPU-accelerated implementation
achieved an order of magnitude performance gain over a highly optimized
CPU-only version running on 8 cores, and more than two orders of
magnitude gain over a popular benchmark, MineBench, running on a single
core."
http://portal.acm.org/citation.cfm?id=1531668
rw
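The abstract does not say how the authors handle data that cannot fit into GPU memory, so the following is only a minimal sketch of one common approach, not the paper's implementation. It assumes the data can be read in fixed-size chunks and that each K-Means iteration only needs per-cluster sums and counts, which can be accumulated chunk by chunk. It runs on the CPU in plain Python; in a GPU version, the per-chunk assignment step is the part that would be offloaded. The names `assign_chunk`, `kmeans_chunked`, and `make_chunks` are hypothetical.

```python
def assign_chunk(chunk, centroids):
    """Assign each point in one chunk to its nearest centroid.

    Returns per-cluster coordinate sums and point counts for this chunk only.
    """
    k, dim = len(centroids), len(centroids[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for point in chunk:
        # Nearest centroid by squared Euclidean distance.
        best = min(
            range(k),
            key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])),
        )
        counts[best] += 1
        for d in range(dim):
            sums[best][d] += point[d]
    return sums, counts


def kmeans_chunked(chunks, centroids, iterations=10):
    """Run K-Means (Lloyd's algorithm) without holding all points at once.

    `chunks` is a callable that returns a fresh iterator over chunks
    (lists of points), so the data can be re-read on every iteration.
    """
    k, dim = len(centroids), len(centroids[0])
    for _ in range(iterations):
        total_sums = [[0.0] * dim for _ in range(k)]
        total_counts = [0] * k
        for chunk in chunks():
            sums, counts = assign_chunk(chunk, centroids)
            # Merge this chunk's partial results into the running totals.
            for j in range(k):
                total_counts[j] += counts[j]
                for d in range(dim):
                    total_sums[j][d] += sums[j][d]
        # Recompute centroids; an empty cluster keeps its previous centroid.
        centroids = [
            [s / total_counts[j] for s in total_sums[j]]
            if total_counts[j]
            else list(centroids[j])
            for j in range(k)
        ]
    return centroids


def make_chunks(points, chunk_size):
    """Stand-in for reading chunks from disk: slice an in-memory list."""
    def chunks():
        for i in range(0, len(points), chunk_size):
            yield points[i:i + chunk_size]
    return chunks
```

Because only the sums and counts cross chunk boundaries, the memory needed at any moment is one chunk plus k centroids, regardless of the total number of points.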