
Configuring Spark SQL to read a Hive data source

2017-08-24 21:23

1. Add a hive-site.xml file to Spark's conf directory. It only needs to contain the metastore connection information:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
    <property>
        <name>hive.metastore.uris</name>
        <value>thrift://master-centos:9083</value>
        <description>Thrift URI for the remote metastore. Used by metastore client to connect to remote metastore.</description>
    </property>
</configuration>

Then distribute the file to every node.
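If the metastore address differs between environments, generating the file can be less error-prone than editing it by hand. The following is a minimal sketch using only the Python standard library; the helper names `build_hive_site` and `read_metastore_uri` are made up for this illustration, and the URI is the one used in this post.

```python
import xml.etree.ElementTree as ET


def build_hive_site(metastore_uri):
    """Return a minimal hive-site.xml (as text) holding only the metastore URI."""
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    ET.SubElement(prop, "name").text = "hive.metastore.uris"
    ET.SubElement(prop, "value").text = metastore_uri
    body = ET.tostring(configuration, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body + "\n"


def read_metastore_uri(xml_text):
    """Read hive.metastore.uris back out of a hive-site.xml document, or None."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    for prop in root.findall("property"):
        if prop.findtext("name") == "hive.metastore.uris":
            return prop.findtext("value")
    return None
```

Calling `build_hive_site("thrift://master-centos:9083")` and writing the result to the conf directory gives a file equivalent to the one above, without the optional description element.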

2. If the Hive metastore is backed by MySQL, place mysql-connector-java-5.1.25-bin.jar in spark/lib.

3. Edit the spark-defaults.conf configuration file:

spark-defaults.conf

spark.master              spark://192.168.130.140:7077
spark.driver.memory       512m
spark.executor.memory     512m
spark.eventLog.enabled    true
# This directory must be created on HDFS beforehand.
spark.eventLog.dir        hdfs://192.168.130.140:8020/user/spark/logs

Then distribute the file to every node.
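Before distributing the file, it can help to sanity-check it. The sketch below is a simplified reader for whitespace-separated `key value` lines, not Spark's actual parser; `parse_spark_defaults` and `check_event_log` are hypothetical helper names. It flags one easy mistake: as far as I know, any trailing annotation on a property line ends up as part of the value, so notes belong on their own `#` comment line.

```python
SAMPLE_CONF = """\
spark.master              spark://192.168.130.140:7077
spark.driver.memory       512m
spark.executor.memory     512m
spark.eventLog.enabled    true
# This directory must be created on HDFS beforehand.
spark.eventLog.dir        hdfs://192.168.130.140:8020/user/spark/logs
"""


def parse_spark_defaults(text):
    """Parse 'key value' lines into a dict, skipping blanks and # comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)  # split on the first run of whitespace
        settings[parts[0]] = parts[1].strip() if len(parts) > 1 else ""
    return settings


def check_event_log(settings):
    """Return a list of problems found in the event log settings."""
    problems = []
    if settings.get("spark.eventLog.enabled", "false").lower() == "true":
        log_dir = settings.get("spark.eventLog.dir", "")
        if not log_dir:
            problems.append("spark.eventLog.dir is not set")
        elif any(ch.isspace() for ch in log_dir):
            problems.append("spark.eventLog.dir contains whitespace: " + log_dir)
    return problems
```

For the sample above, `check_event_log(parse_spark_defaults(SAMPLE_CONF))` returns an empty list.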

4. Start the Hive metastore service.
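The metastore is commonly started with `hive --service metastore` on the metastore host. Once it is running, a quick way to confirm that the Spark nodes can reach it is to open a TCP connection to the address from hive-site.xml. This is only a rough sketch: the function names are invented for this example, and a successful connection shows that the port is open, not that the service is healthy.

```python
import socket
from urllib.parse import urlparse


def metastore_endpoint(uri):
    """Split a thrift:// metastore URI into (host, port)."""
    parsed = urlparse(uri)
    return parsed.hostname, parsed.port


def is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For the configuration in this post, `is_reachable(*metastore_endpoint("thrift://master-centos:9083"))` should return True from every node once the metastore is up.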

5. To connect to Spark over JDBC, start the Spark Thrift Server:

start-thriftserver.sh --master spark://192.168.130.140:7077 --executor-memory 1g --total-executor-cores 2 --executor-cores 1 --hiveconf hive.server2.thrift.port=10050 --conf spark.dynamicAllocation.enabled=false
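The Spark Thrift Server speaks the HiveServer2 protocol, so clients connect with a `jdbc:hive2://` URL, for example through `beeline`. The sketch below only assembles the URL and the command line for the port configured above; the helper names, the `default` database, and the assumption that the server runs on 192.168.130.140 are mine, not part of the original setup.

```python
import shlex


def jdbc_url(host, port, database="default"):
    """Build a HiveServer2-style JDBC URL for the Spark Thrift Server."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_command(host, port, user=None):
    """Build the argument list for connecting with beeline."""
    cmd = ["beeline", "-u", jdbc_url(host, port)]
    if user:
        cmd += ["-n", user]  # -n passes the user name
    return cmd


def as_shell(cmd):
    """Render an argument list as a copy-pasteable shell command."""
    return " ".join(shlex.quote(arg) for arg in cmd)
```

`as_shell(beeline_command("192.168.130.140", 10050))` gives the command to run from a machine with beeline installed; a query such as `show databases;` should then list the Hive databases.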


