
How to access HBase from spark-shell using YARN as the master on CDH 5.3 and Spark 1.2

2016-06-20 10:51


http://somelittlebits.blogspot.com/2015/01/how-to-access-hbase-from-spark-shell.html


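From a terminal, put the HBase jars and configuration on the Spark classpath, then start the shell in yarn-client mode: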

export SPARK_CLASSPATH=/opt/cloudera/parcels/CDH/lib/hbase/hbase-protocol.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-common.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-client.jar:/opt/cloudera/parcels/CDH/lib/hbase/lib/htrace-core.jar:/opt/cloudera/parcels/CDH/lib/hbase/hbase-server.jar:/etc/hbase/conf

spark-shell --master yarn-client

Now you can access HBase from the Spark shell prompt:

import org.apache.hadoop.hbase.client.HBaseAdmin
import org.apache.hadoop.hbase.{HBaseConfiguration, HTableDescriptor}
import org.apache.hadoop.hbase.mapreduce.TableInputFormat

val tableName = "My_HBase_Table_Name"

val hconf = HBaseConfiguration.create()

hconf.set(TableInputFormat.INPUT_TABLE, tableName)

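// Create the table if it does not exist yet.
val admin = new HBaseAdmin(hconf)
if (!admin.tableExists(tableName)) {
  // Note: no column families are defined here, so the new table cannot hold
  // data until one is added (see HTableDescriptor.addFamily).
  val tableDesc = new HTableDescriptor(tableName)
  admin.createTable(tableDesc)
}
admin.close()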

// Each record is a (row key, row contents) pair read through TableInputFormat.
val hBaseRDD = sc.newAPIHadoopRDD(
  hconf,
  classOf[TableInputFormat],
  classOf[org.apache.hadoop.hbase.io.ImmutableBytesWritable],
  classOf[org.apache.hadoop.hbase.client.Result])

val result = hBaseRDD.count()
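
Once the count works, the rows themselves can be pulled out of the RDD. The following is a minimal sketch rather than part of the original post. It assumes the same spark-shell session is still open (so hBaseRDD and result exist) and that the table's row keys are UTF-8 strings, which the post does not state:

```scala
import org.apache.hadoop.hbase.util.Bytes

// Assumption: row keys are UTF-8 strings. Adjust the decoding if yours are not.
// Convert each Result to a plain String on the executors before bringing
// anything back to the driver.
val rowKeys = hBaseRDD.map { case (_, row) => Bytes.toString(row.getRow) }

// Print the first few row keys.
rowKeys.take(5).foreach(println)
```

The map step comes before take because, as far as I know, Result and ImmutableBytesWritable do not implement java.io.Serializable in this HBase version, so collecting the raw pairs to the driver would likely fail with a NotSerializableException.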

Thanks to these refs for pointers:
http://comments.gmane.org/gmane.comp.java.hadoop.hbase.user/44744
http://apache-spark-user-list.1001560.n3.nabble.com/HBase-and-non-existent-TableInputFormat-td14370.html