
Compiling the Spark 2.x source: build options explained

2018-01-23 16:36 · 246 views


Compiling the Spark 2.x source

Here we build with the make-distribution.sh script bundled in the source package. You can, of course, modify some of the source code first and then build.

Run one of the following from the top of the Spark source tree (note that the profile list after -P is comma-separated, while -D properties and the Maven resume flag -rf must be separate arguments, not folded into the quoted profile string):
./dev/make-distribution.sh --name "hadoop2-without-hive" --tgz -Pyarn,hadoop-provided,hadoop-2.7,parquet-provided -Dscala-2.11 -rf :spark-repl_2.11


./dev/make-distribution.sh --name "hadoop2-hive" --tgz -Pyarn,hive,hadoop-provided,hadoop-2.7,parquet-provided -Dscala-2.11 -rf :spark-repl_2.11
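Before kicking off either build, it helps to give Maven enough memory; the heap and code-cache sizes below are the ones Spark's own build documentation suggests. A minimal sketch:

```shell
# Give Maven more heap and code cache; large modules such as Spark SQL
# can otherwise fail with OutOfMemoryError during compilation.
export MAVEN_OPTS="-Xmx2g -XX:ReservedCodeCacheSize=512m"

# make-distribution.sh also expects JAVA_HOME to point at a JDK;
# warn early rather than failing halfway through the build.
if [ -z "$JAVA_HOME" ]; then
  echo "warning: JAVA_HOME is not set" >&2
fi
echo "MAVEN_OPTS=$MAVEN_OPTS"
```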


What the options mean:

-DskipTests: skip running the test suites; the test classes are still compiled into target/test-classes.

-Dhadoop.version and -Phadoop-x.x: the Hadoop version to build against; if omitted, the build falls back to the default pinned in the POM (the 1.0.4 default often quoted dates from Spark 1.x).

-Pyarn: build with Hadoop YARN support; without it, YARN is not supported.

-Phive and -Phive-thriftserver: enable Hive support in Spark SQL; without them, Hive is not supported.

--with-tachyon: build with the in-memory file system Tachyon; without it, Tachyon is not supported (this was a Spark 1.x option and was dropped from make-distribution.sh in 2.x).

--tgz: produce spark-$VERSION-bin.tgz in the root directory; without this flag no tarball is created, only the /dist directory.

--name: combined with --tgz, produces a distribution package named spark-$VERSION-bin-$NAME.tgz; if omitted, NAME defaults to the Hadoop version number.
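To see how --name and --tgz fit together, here is a small sketch that assembles the artifact name the wrapper will produce; VERSION and NAME are illustrative values hard-coded for the example, not read from a real build:

```shell
# Sketch: how --name and --tgz determine the output artifact name.
# VERSION is normally taken from the project POM at build time;
# hard-coded here purely for illustration.
VERSION="2.0.2"
NAME="hadoop2-hive"

# The comma-separated profile list that -P above would activate.
PROFILES="yarn,hive,hadoop-provided,hadoop-2.7,parquet-provided"

# --tgz packages ./dist into spark-$VERSION-bin-$NAME.tgz.
TARBALL="spark-${VERSION}-bin-${NAME}.tgz"
echo "profiles: $PROFILES"
echo "artifact: $TARBALL"
```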

Expect the build to take anywhere from twenty minutes to over an hour, mostly depending on your network, since a number of dependencies have to be downloaded. At the end you get a compiled Spark package, which you can unpack and deploy onto your machines.

Running the following commands generates spark-2.0.2-bin-hadoop2-with-hive.tgz under spark-2.0.2:

[root@master spark-2.0.2]# ./dev/change-scala-version.sh 2.11
[root@master spark-2.0.2]# ./dev/make-distribution.sh --name "hadoop2-with-hive" --tgz "-Pyarn,-Phive,hadoop-provided,hadoop-2.7,parquet-provided"
main:
[INFO] Executed tasks
[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO]
[INFO] Spark Project Parent POM ........................... SUCCESS [ 19.119 s]
[INFO] Spark Project Tags ................................. SUCCESS [  7.630 s]
[INFO] Spark Project Sketch ............................... SUCCESS [  6.463 s]
[INFO] Spark Project Networking ........................... SUCCESS [ 19.845 s]
[INFO] Spark Project Shuffle Streaming Service ............ SUCCESS [ 13.890 s]
[INFO] Spark Project Unsafe ............................... SUCCESS [ 13.337 s]
[INFO] Spark Project Launcher ............................. SUCCESS [ 23.115 s]
[INFO] Spark Project Core ................................. SUCCESS [03:42 min]
[INFO] Spark Project GraphX ............................... SUCCESS [ 26.100 s]
[INFO] Spark Project Streaming ............................ SUCCESS [01:07 min]
[INFO] Spark Project Catalyst ............................. SUCCESS [02:36 min]
[INFO] Spark Project SQL .................................. SUCCESS [03:26 min]
[INFO] Spark Project ML Local Library ..................... SUCCESS [ 14.402 s]
[INFO] Spark Project ML Library ........................... SUCCESS [02:54 min]
[INFO] Spark Project Tools ................................ SUCCESS [  3.691 s]
[INFO] Spark Project Hive ................................. SUCCESS [01:32 min]
[INFO] Spark Project REPL ................................. SUCCESS [ 11.372 s]
[INFO] Spark Project YARN Shuffle Service ................. SUCCESS [ 16.772 s]
[INFO] Spark Project YARN ................................. SUCCESS [ 27.160 s]
[INFO] Spark Project Assembly ............................. SUCCESS [  5.484 s]
[INFO] Spark Project External Flume Sink .................. SUCCESS [ 22.666 s]
[INFO] Spark Project External Flume ....................... SUCCESS [ 22.288 s]
[INFO] Spark Project External Flume Assembly .............. SUCCESS [  5.101 s]
[INFO] Spark Integration for Kafka 0.8 .................... SUCCESS [ 21.637 s]
[INFO] Spark Project Examples ............................. SUCCESS [ 42.329 s]
[INFO] Spark Project External Kafka Assembly .............. SUCCESS [  8.713 s]
[INFO] Spark Integration for Kafka 0.10 ................... SUCCESS [ 22.547 s]
[INFO] Spark Integration for Kafka 0.10 Assembly .......... SUCCESS [  7.028 s]
[INFO] Kafka 0.10 Source for Structured Streaming ......... SUCCESS [ 18.807 s]
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 21:43 min
[INFO] Finished at: 2018-01-25T11:13:06+08:00
[INFO] Final Memory: 76M/327M
[INFO] ------------------------------------------------------------------------
+ rm -rf /opt/spark-2.0.2/dist
+ mkdir -p /opt/spark-2.0.2/dist/jars
+ echo 'Spark 2.0.2 built for Hadoop 2.7.3'
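Once the log ends like this, the tarball can be sanity-checked and unpacked; a sketch, assuming the file name and paths from the commands above (adjust DEST to your deploy location):

```shell
# Sanity-check and unpack the freshly built distribution.
# The tarball name follows the spark-$VERSION-bin-$NAME.tgz pattern.
TARBALL="spark-2.0.2-bin-hadoop2-with-hive.tgz"
DEST="/opt"

if [ -f "$TARBALL" ]; then
  tar -tzf "$TARBALL" | head -n 5   # peek at the layout before unpacking
  tar -xzf "$TARBALL" -C "$DEST"    # unpack to the deploy location
else
  echo "$TARBALL not found - check the build log for failures" >&2
fi
```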