Sqoop Study Notes --- Incremental Import of Data into HBase
2016-11-16 10:02
English Version:
Sqoop provides an incremental import mode which can be used to retrieve only rows newer than some previously-imported set of rows.
Argument | Description |
---|---|
--check-column (col) | Specifies the column to be examined when determining which rows to import. (The column should not be of type CHAR/NCHAR/VARCHAR/VARNCHAR/LONGVARCHAR/LONGNVARCHAR.) |
--incremental (mode) | Specifies how Sqoop determines which rows are new. Legal values for mode include append and lastmodified. |
--last-value (value) | Specifies the maximum value of the check column from the previous import. |
Sqoop supports two types of incremental imports: append and lastmodified. You can use the --incremental argument to specify the type of incremental import to perform.
You should specify append mode when importing a table where new rows are continually being added with increasing row id values. You specify the column containing the row's id with --check-column. Sqoop imports rows where the check column has a value greater than the one specified with --last-value.
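
The append rule can be tried out locally before touching a real database. The sketch below is not Sqoop code: it only mimics the comparison on CSV text with awk. The helper names select_new_rows and next_last_value, and the assumption that the id is the first field, are made up for illustration.

```shell
# Mimics the append-mode filter on CSV lines whose first field is a numeric id.
# Hypothetical helpers for illustration only; Sqoop itself does this in SQL.

# Print the rows whose id is strictly greater than the given last value.
select_new_rows() {
  awk -F, -v last="$1" '$1 + 0 > last + 0'
}

# Print the largest id in the input, a candidate --last-value for the next run.
next_last_value() {
  awk -F, 'NR == 1 || $1 + 0 > max { max = $1 + 0 } END { print max }'
}
```

For example, printf '1,a\n2,b\n3,c\n' | select_new_rows 1 keeps the rows with ids 2 and 3, and piping the same input through next_last_value prints 3.
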
An alternate table update strategy supported by Sqoop is called lastmodified mode. You should use this when rows of the source table may be updated, and each such update will set the value of a last-modified column to the current timestamp. Rows where the check column holds a timestamp more recent than the timestamp specified with --last-value are imported.
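
A lastmodified import into HBase might look like the sketch below. It reuses the placeholder connection settings from this post. The column name last_update, the function name, and the SQOOP override variable are assumptions made for illustration. Setting SQOOP=echo previews the arguments without contacting a cluster.

```shell
# Sketch of a lastmodified incremental import into HBase.
# All connection values, table names, and the last_update column are placeholders.
# SQOOP is a hypothetical override: SQOOP=echo prints the arguments instead of running Sqoop.

import_modified_since() {
  last_value="$1"   # timestamp reported at the end of the previous import
  "${SQOOP:-sqoop}" import \
    --connect "jdbc:mysql://ip:port/db" \
    --username "username" \
    --password "password" \
    --table tablename \
    --hbase-table namespace:tablename \
    --column-family columnfamily \
    --incremental lastmodified \
    --check-column last_update \
    --last-value "$last_value"
}

# Usage (the timestamp is an example value):
#   import_modified_since "2016-11-16 10:02:00"
```

Quoting "$last_value" matters here because a timestamp contains a space and must reach Sqoop as a single argument.
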
At the end of an incremental import, the value which should be specified as --last-value for a subsequent import is printed to the screen. When running a subsequent import, you should specify --last-value in this way to ensure you import only the new or updated data. This is handled automatically by creating an incremental import as a saved job, which is the preferred mechanism for performing a recurring incremental import. See the section on saved jobs later in this document for more information.
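
A saved job could be set up roughly as follows. The job name, the function names, and the SQOOP override variable are hypothetical. sqoop job --create and sqoop job --exec are the saved-job subcommands of Sqoop 1, and the connection settings are the same placeholders used elsewhere in this post.

```shell
# Sketch of wrapping an append-mode incremental import in a Sqoop saved job.
# Connection values and table names are placeholders; password handling is left out.
# SQOOP is a hypothetical override: SQOOP=echo prints the arguments instead of running Sqoop.

# Define the saved job. Note the space between "--" and "import".
create_incremental_job() {
  job_name="$1"
  "${SQOOP:-sqoop}" job --create "$job_name" -- import \
    --connect "jdbc:mysql://ip:port/db" \
    --username "username" \
    --table tablename \
    --hbase-table namespace:tablename \
    --column-family columnfamily \
    --incremental append \
    --check-column id \
    --last-value 0
}

# Execute the saved job; the stored last value is what makes reruns incremental.
run_incremental_job() {
  "${SQOOP:-sqoop}" job --exec "$1"
}

# Usage (hbase_incr is an example job name):
#   create_incremental_job hbase_incr
#   run_incremental_job hbase_incr
```

With a saved job, the --last-value given at creation time is only the starting point, so later runs do not need it passed by hand.
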
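Summary:==================================
The passage above is straightforward: an incremental import is controlled by three arguments.
First argument:
--check-column (col): the column Sqoop examines to decide which rows are new, typically a numeric id or a timestamp. It should not be a string type.
Second argument:
--incremental (mode): the incremental mode, either append or lastmodified.
Third argument:
--last-value (value): the check-column value reached by the previous import. Only rows above it are imported; for example, --last-value 0 imports rows whose check column is greater than 0.
Combined with the remaining options, the full command is: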
sqoop import --connect jdbc:mysql://ip:port/db --table tablename --hbase-table namespace:tablename --column-family columnfamily --hbase-create-table --username 'username' --password 'password' --incremental append --check-column 'id' --last-value 0