hive0.14-insert、update、delete操作测试
2016-03-24 18:54
651 查看
问题导读
1.测试insert报错,该如何解决?
2.hive delete和update报错,该如何解决?
3.什么情况下才允许delete和update?
首先用最普通的建表语句建一个表:
hive>create table test(id int,name string)row format delimited fields terminated by ',';
复制代码
测试insert:
insert into table test values (1,'row1'),(2,'row2');
复制代码
结果报错:
java.io.FileNotFoundException: File does not exist: hdfs://127.0.0.1:9000/home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/
apache-hive-0.14.0-SNAPSHOT-bin/lib/curator-client-2.6.0.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
......
复制代码
貌似往hdfs上找jar包了,小问题,直接把lib下的jar包上传到hdfs
hadoop fs -mkdir -p /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/
hadoop fs -put $HIVE_HOME/lib/* /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/
复制代码
接着运行insert,没有问题,接下来测试delete
hive>delete from test where id = 1;
复制代码
报错!:
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
说是在使用的转换管理器不支持update跟delete操作。
原来要支持update操作跟delete操作,必须额外再配置一些东西,见:
https://cwiki.apache.org/conflue ... tersforTransactions
根据提示配置hive-site.xml:
hive.support.concurrency – true
hive.enforce.bucketing – true
hive.exec.dynamic.partition.mode – nonstrict
hive.txn.manager – org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.compactor.initiator.on – true
hive.compactor.worker.threads – 1
复制代码
配置完以为能够顺利运行了,谁知开始报下面这个错误:
FAILED: LockException [Error 10280]: Error communicating with the metastore
复制代码
与元数据库出现了问题,修改log为DEBUG查看具体错误:
4-11-04 14:20:14,367 DEBUG [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(265)) - Going to execute query <select cq_id,
cq_database, cq_table, cq_partition, cq_type, cq_run_as from COMPACTION_QUEUE where cq_state = 'r'>
2014-11-04 14:20:14,367 ERROR [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(285)) - Unable to select next element for cleaning,
Table 'hive.COMPACTION_QUEUE' doesn't exist
2014-11-04 14:20:14,367 DEBUG [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(287)) - Going to rollback
2014-11-04 14:20:14,368 ERROR [Thread-8]: compactor.Cleaner (Cleaner.java:run(143)) - Caught an exception in the main loop of compactor cleaner, MetaException(message
:Unable to connect to transaction database com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 'hive.COMPACTION_QUEUE' doesn't exist
at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
复制代码
在元数据库中找不到COMPACTION_QUEUE这个表,赶紧去mysql中查看,确实没有这个表。怎么会没有这个表呢?找了很久都没找到什么原因,查源码吧。
在org.apache.hadoop.hive.metastore.txn下的TxnDbUtil类中找到了建表语句,顺藤摸瓜,找到了下面这个方法会调用建表语句:
private void checkQFileTestHack() {
boolean hackOn = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST) ||
HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEZ_TEST);
if (hackOn) {
LOG.info("Hacking in canned values for transaction manager");
// Set up the transaction/locking db in the derby metastore
TxnDbUtil.setConfValues(conf);
try {
TxnDbUtil.prepDb();
} catch (Exception e) {
// We may have already created the tables and thus don't need to redo it.
if (!e.getMessage().contains("already exists")) {
throw new RuntimeException("Unable to set up transaction database for" +
" testing: " + e.getMessage());
}
}
}
}
复制代码
什么意思呢,就是说要运行建表语句还有一个条件:HIVE_IN_TEST或者HIVE_IN_TEZ_TEST.只有在测试环境中才能用delete,update操作,也可以理解,毕竟还没有开发完全。
终于找到原因,解决方法也很简单:在hive-site.xml中添加下面的配置:
<property>
<name>hive.in.test</name>
<value>true</value>
</property>
复制代码
OK,再重新启动服务,再运行delete:
hive>delete from test where id = 1;
复制代码
又报错:
FAILED: SemanticException [Error 10297]: Attempt to do update or delete on table default.test that does not use an AcidOutputFormat or is not bucketed
复制代码
说是要进行delete操作的表test不是AcidOutputFormat或没有分桶。估计是要求输出是AcidOutputFormat然后必须分桶
网上查到确实如此,而且目前只有ORCFileformat支持AcidOutputFormat,不仅如此建表时必须指定参数('transactional' = true)。感觉太麻烦了。。。。
于是按照网上示例建表:
hive>create table test(id int ,name string )clustered by (id) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');
复制代码
insert
hive>insert into table test values (1,'row1'),(2,'row2'),(3,'row3');
复制代码
delete
hive>delete from test where id = 1;
复制代码
update
hive>update test set name = 'Raj' where id = 2;
复制代码
OK!全部顺利运行,不过貌似效率太低了,基本都要30s左右,估计应该可以优化,再研究研究
1.测试insert报错,该如何解决?
2.hive delete和update报错,该如何解决?
3.什么情况下才允许delete和update?
首先用最普通的建表语句建一个表:
hive>create table test(id int,name string)row format delimited fields terminated by ',';
复制代码
测试insert:
insert into table test values (1,'row1'),(2,'row2');
复制代码
结果报错:
java.io.FileNotFoundException: File does not exist: hdfs://127.0.0.1:9000/home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/
apache-hive-0.14.0-SNAPSHOT-bin/lib/curator-client-2.6.0.jar
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1128)
at org.apache.hadoop.hdfs.DistributedFileSystem$17.doCall(DistributedFileSystem.java:1120)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1120)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:288)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.getFileStatus(ClientDistributedCacheManager.java:224)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestamps(ClientDistributedCacheManager.java:99)
at org.apache.hadoop.mapreduce.filecache.ClientDistributedCacheManager.determineTimestampsAndCacheVisibilities(ClientDistributedCacheManager.java:57)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:265)
at org.apache.hadoop.mapreduce.JobSubmitter.copyAndConfigureFiles(JobSubmitter.java:301)
at org.apache.hadoop.mapreduce.JobSubmitter.submitJobInternal(JobSubmitter.java:389)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1285)
at org.apache.hadoop.mapreduce.Job$10.run(Job.java:1282)
at java.security.AccessController.doPrivileged(Native Method)
......
复制代码
貌似往hdfs上找jar包了,小问题,直接把lib下的jar包上传到hdfs
hadoop fs -mkdir -p /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/
hadoop fs -put $HIVE_HOME/lib/* /home/hadoop/git/hive/packaging/target/apache-hive-0.14.0-SNAPSHOT-bin/apache-hive-0.14.0-SNAPSHOT-bin/lib/
复制代码
接着运行insert,没有问题,接下来测试delete
hive>delete from test where id = 1;
复制代码
报错!:
FAILED: SemanticException [Error 10294]: Attempt to do update or delete using transaction manager that does not support these operations.
说是在使用的转换管理器不支持update跟delete操作。
原来要支持update操作跟delete操作,必须额外再配置一些东西,见:
https://cwiki.apache.org/conflue ... tersforTransactions
根据提示配置hive-site.xml:
hive.support.concurrency – true
hive.enforce.bucketing – true
hive.exec.dynamic.partition.mode – nonstrict
hive.txn.manager – org.apache.hadoop.hive.ql.lockmgr.DbTxnManager
hive.compactor.initiator.on – true
hive.compactor.worker.threads – 1
复制代码
配置完以为能够顺利运行了,谁知开始报下面这个错误:
FAILED: LockException [Error 10280]: Error communicating with the metastore
复制代码
与元数据库出现了问题,修改log为DEBUG查看具体错误:
4-11-04 14:20:14,367 DEBUG [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(265)) - Going to execute query <select cq_id,
cq_database, cq_table, cq_partition, cq_type, cq_run_as from COMPACTION_QUEUE where cq_state = 'r'>
2014-11-04 14:20:14,367 ERROR [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(285)) - Unable to select next element for cleaning,
Table 'hive.COMPACTION_QUEUE' doesn't exist
2014-11-04 14:20:14,367 DEBUG [Thread-8]: txn.CompactionTxnHandler (CompactionTxnHandler.java:findReadyToClean(287)) - Going to rollback
2014-11-04 14:20:14,368 ERROR [Thread-8]: compactor.Cleaner (Cleaner.java:run(143)) - Caught an exception in the main loop of compactor cleaner, MetaException(message
:Unable to connect to transaction database com.mysql.jdbc.exceptions.jdbc4.MySQLSyntaxErrorException: Table 'hive.COMPACTION_QUEUE' doesn't exist
at sun.reflect.GeneratedConstructorAccessor19.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:526)
at com.mysql.jdbc.Util.handleNewInstance(Util.java:409)
复制代码
在元数据库中找不到COMPACTION_QUEUE这个表,赶紧去mysql中查看,确实没有这个表。怎么会没有这个表呢?找了很久都没找到什么原因,查源码吧。
在org.apache.hadoop.hive.metastore.txn下的TxnDbUtil类中找到了建表语句,顺藤摸瓜,找到了下面这个方法会调用建表语句:
private void checkQFileTestHack() {
boolean hackOn = HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEST) ||
HiveConf.getBoolVar(conf, HiveConf.ConfVars.HIVE_IN_TEZ_TEST);
if (hackOn) {
LOG.info("Hacking in canned values for transaction manager");
// Set up the transaction/locking db in the derby metastore
TxnDbUtil.setConfValues(conf);
try {
TxnDbUtil.prepDb();
} catch (Exception e) {
// We may have already created the tables and thus don't need to redo it.
if (!e.getMessage().contains("already exists")) {
throw new RuntimeException("Unable to set up transaction database for" +
" testing: " + e.getMessage());
}
}
}
}
复制代码
什么意思呢,就是说要运行建表语句还有一个条件:HIVE_IN_TEST或者HIVE_IN_TEZ_TEST.只有在测试环境中才能用delete,update操作,也可以理解,毕竟还没有开发完全。
终于找到原因,解决方法也很简单:在hive-site.xml中添加下面的配置:
<property>
<name>hive.in.test</name>
<value>true</value>
</property>
复制代码
OK,再重新启动服务,再运行delete:
hive>delete from test where id = 1;
复制代码
又报错:
FAILED: SemanticException [Error 10297]: Attempt to do update or delete on table default.test that does not use an AcidOutputFormat or is not bucketed
复制代码
说是要进行delete操作的表test不是AcidOutputFormat或没有分桶。估计是要求输出是AcidOutputFormat然后必须分桶
网上查到确实如此,而且目前只有ORCFileformat支持AcidOutputFormat,不仅如此建表时必须指定参数('transactional' = true)。感觉太麻烦了。。。。
于是按照网上示例建表:
hive>create table test(id int ,name string )clustered by (id) into 2 buckets stored as orc TBLPROPERTIES('transactional'='true');
复制代码
insert
hive>insert into table test values (1,'row1'),(2,'row2'),(3,'row3');
复制代码
delete
hive>delete from test where id = 1;
复制代码
update
hive>update test set name = 'Raj' where id = 2;
复制代码
OK!全部顺利运行,不过貌似效率太低了,基本都要30s左右,估计应该可以优化,再研究研究
相关文章推荐
- 项目遇到的问题汇总<一>之下拉菜单所有问题
- 【已解决】xen下xl无法使用问题
- USACO 之 Section 2.1 (已解决)
- CF 629D Babaei and Birthday Cake(线段树单点更新)
- 如何在一个函数内修改一个全局变量
- js中单引号和双引号
- Web应用层基础
- 网络流小结 - hdu 1532
- 【matlab应用】:生成老电影海报
- 免费的论文查重网站
- 485总线单点对多点问题
- 密钥 证书关系
- mysql多实例
- mysql的备份与恢复
- Java经典设计模式之十一种行为型模式
- 微信中打开网页,链接无法跳转处理
- 给定一个数组,判断从开始能否走到结束,最多需要几步?
- swift学习入门笔记1
- mybatis打印日志的log4j.xml配置
- 学习asp.net