Two Hive Problems
2016-05-31 21:56
Two problems that come up with Hive:
Problem 1: Too Many Small Partitions
It can be tempting to split your data into many small partitions in an attempt to increase speed and concurrency.
However, Hive works best when data is divided into fewer, larger partitions.
For example, consider partitioning a 100 TB table into 10,000 partitions of 10 GB each.
In addition, do not use more than 10,000 partitions per table.
Too many small partitions put significant strain on the Hive Metastore and do not improve performance.
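The arithmetic behind the example above can be checked with a small sketch. The function names are hypothetical, and decimal units (1 TB = 1,000 GB) are assumed; the 10,000 ceiling is the per-table limit quoted in the text.

```python
# Sketch: sanity-check a partitioning plan against the guidance above.
# Assumption: decimal units, 1 TB = 1,000 GB.
# MAX_PARTITIONS mirrors the 10,000-partitions-per-table ceiling from the text.

MAX_PARTITIONS = 10_000
GB_PER_TB = 1_000


def avg_partition_size_gb(table_size_tb: float, num_partitions: int) -> float:
    """Average partition size in GB if the table is split evenly."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    return table_size_tb * GB_PER_TB / num_partitions


def within_partition_limit(num_partitions: int) -> bool:
    """True if the partition count stays at or below the recommended ceiling."""
    return num_partitions <= MAX_PARTITIONS


if __name__ == "__main__":
    # The 100 TB table from the example, split into 10,000 partitions.
    print(avg_partition_size_gb(100, 10_000))  # 10.0
    print(within_partition_limit(10_000))      # True
```

Real partitions are rarely the same size, so the average is only a first check; a skewed partition key can still produce many tiny partitions.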
Problem 2: Hive Queries Fail with "Too many counters" Error
Hive operations use various counters while executing MapReduce jobs.
These per-operator counters are enabled by the configuration setting hive.task.progress.
The setting is disabled by default; when it is enabled, Hive may create a large number of counters (4 counters per operator, plus another 20).
Note: If dynamic partitioning is enabled, Hive implicitly enables the counters during data load.
By default, CDH limits the number of MapReduce counters to 120.
Hive queries that require more counters fail with the "Too many counters" error.
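To get a feel for how quickly a query can reach the limit, here is a rough sketch that treats the "4 counters per operator, plus another 20" figure as an exact formula. That is an assumption, as is the rule that a job fails only once the count goes above the limit; the function names are hypothetical.

```python
# Sketch: estimate the number of counters a Hive query may create.
# Assumption: the "4 per operator, plus another 20" figure is treated as exact.

COUNTERS_PER_OPERATOR = 4
FIXED_COUNTERS = 20
DEFAULT_COUNTER_LIMIT = 120  # CDH default quoted in the text


def estimated_counters(num_operators: int) -> int:
    """Estimated counter count for a query plan with num_operators operators."""
    if num_operators < 0:
        raise ValueError("num_operators must be non-negative")
    return COUNTERS_PER_OPERATOR * num_operators + FIXED_COUNTERS


def exceeds_limit(num_operators: int, limit: int = DEFAULT_COUNTER_LIMIT) -> bool:
    """True if the estimate is above the limit (assumed failure condition)."""
    return estimated_counters(num_operators) > limit


if __name__ == "__main__":
    print(estimated_counters(25))  # 120
    print(exceeds_limit(25))       # False
    print(exceeds_limit(26))       # True
```

Under these assumptions, a plan with more than 25 operators already exceeds the default of 120, which is why larger queries run into the error.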
What To Do
If you run into this error, set mapreduce.job.counters.max in mapred-site.xml to a higher value.
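As an illustration of what the change looks like, the sketch below renders a property entry in the standard Hadoop name/value format. The value 500 is a hypothetical example, not a recommendation from the source, and the helper name is made up; the resulting element belongs inside the `<configuration>` element of mapred-site.xml.

```python
# Sketch: render a Hadoop-style <property> entry for mapreduce.job.counters.max.
# Assumption: the limit value (500 below) is only an illustrative choice.
import xml.etree.ElementTree as ET


def counters_max_property(limit: int) -> str:
    """Return a <property> element, as a string, that sets the counter limit."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = "mapreduce.job.counters.max"
    ET.SubElement(prop, "value").text = str(limit)
    return ET.tostring(prop, encoding="unicode")


if __name__ == "__main__":
    # Paste the printed element inside <configuration> in mapred-site.xml.
    print(counters_max_property(500))
```

Choose a value that comfortably covers the largest queries you expect to run, since each additional operator adds to the counter count.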