
Spark Basics (3): Building a Scala-based Spark Application with Maven

2015-06-14 20:11
This post covers how to build our Spark application with Maven.

First, install Maven. On CentOS 7 it can be installed directly with yum install maven.

Then create the following directory layout, following Maven's conventions:

spark-hello/
spark-hello/src
spark-hello/src/main
spark-hello/src/main/scala
spark-hello/src/main/scala/com
spark-hello/src/main/scala/com/spark
spark-hello/src/main/scala/com/spark/demo1
spark-hello/src/main/scala/com/spark/demo1/App.scala
spark-hello/pom.xml
spark-hello/target
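The tree above can be created in one shot with mkdir -p (a minimal sketch; target/ is omitted because Maven generates it during the build):

```shell
# create the Maven-conventional source tree plus empty placeholder files
mkdir -p spark-hello/src/main/scala/com/spark/demo1
touch spark-hello/src/main/scala/com/spark/demo1/App.scala
touch spark-hello/pom.xml
```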

Edit pom.xml:

<project xmlns="http://maven.apache.org/POM/4.0.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/maven-v4_0_0.xsd">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.spark.demo1</groupId>
  <artifactId>spark-hello</artifactId>
  <version>1.0-SNAPSHOT</version>
  <name>${project.artifactId}</name>
  <description>My wonderful scala app</description>
  <inceptionYear>2010</inceptionYear>
  <licenses>
    <license>
      <name>My License</name>
      <url>http://....</url>
      <distribution>repo</distribution>
    </license>
  </licenses>

  <properties>
    <maven.compiler.source>1.5</maven.compiler.source>
    <maven.compiler.target>1.5</maven.compiler.target>
    <encoding>UTF-8</encoding>
    <scala.version>2.11.6</scala.version>
  </properties>

  <!--
  <repositories>
    <repository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </repository>
  </repositories>
  <pluginRepositories>
    <pluginRepository>
      <id>scala-tools.org</id>
      <name>Scala-Tools Maven2 Repository</name>
      <url>http://scala-tools.org/repo-releases</url>
    </pluginRepository>
  </pluginRepositories>
  -->

  <dependencies>
    <dependency>
      <groupId>org.scala-lang</groupId>
      <artifactId>scala-library</artifactId>
      <!-- keep this in sync with scala.version above -->
      <version>${scala.version}</version>
    </dependency>
    <dependency>
      <groupId>org.apache.spark</groupId>
      <!-- the _2.11 suffix must match the Scala binary version in use -->
      <artifactId>spark-core_2.11</artifactId>
      <version>1.3.1</version>
    </dependency>

    <!-- Test -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.8.1</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.scala-tools.testing</groupId>
      <artifactId>specs_2.9.3</artifactId>
      <version>1.6.9</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.scalatest</groupId>
      <artifactId>scalatest</artifactId>
      <version>1.2</version>
      <scope>test</scope>
    </dependency>
  </dependencies>

  <build>
    <sourceDirectory>src/main/scala</sourceDirectory>
    <testSourceDirectory>src/test/scala</testSourceDirectory>
    <plugins>
      <plugin>
        <groupId>org.scala-tools</groupId>
        <artifactId>maven-scala-plugin</artifactId>
        <version>2.15.2</version>
        <executions>
          <execution>
            <goals>
              <goal>compile</goal>
              <goal>testCompile</goal>
            </goals>
            <configuration>
              <args>
                <arg>-dependencyfile</arg>
                <arg>${project.build.directory}/.scala_dependencies</arg>
              </args>
            </configuration>
          </execution>
        </executions>
      </plugin>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-surefire-plugin</artifactId>
        <version>2.10</version>
        <configuration>
          <useFile>false</useFile>
          <disableXmlReport>true</disableXmlReport>
          <!-- If you have classpath issue like NoDefClassError,... -->
          <!-- useManifestOnlyJar>false</useManifestOnlyJar -->
          <includes>
            <include>**/*Test.*</include>
            <include>**/*Suite.*</include>
          </includes>
        </configuration>
      </plugin>
    </plugins>
  </build>
</project>

The important parts are the Maven coordinates our application will be packaged under, and the Scala and Spark versions it depends on. Note that the Scala suffix on the Spark artifact (spark-core_2.11 vs. spark-core_2.10) must match the Scala library version you compile against, otherwise you can hit binary-incompatibility errors at runtime.
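One common way to keep the Scala library version and the Spark artifact suffix in step is to derive both from properties. This is a sketch, not part of the pom above; scala.binary.version is a property name introduced here purely for illustration:

```xml
<properties>
  <scala.version>2.11.6</scala.version>
  <!-- the artifact-suffix form of the Scala version -->
  <scala.binary.version>2.11</scala.binary.version>
</properties>

<!-- then reference both consistently in the dependencies -->
<dependency>
  <groupId>org.scala-lang</groupId>
  <artifactId>scala-library</artifactId>
  <version>${scala.version}</version>
</dependency>
<dependency>
  <groupId>org.apache.spark</groupId>
  <artifactId>spark-core_${scala.binary.version}</artifactId>
  <version>1.3.1</version>
</dependency>
```

With this setup, moving to a different Scala series means changing the two properties in one place.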

Next, write our App.scala:

package com.spark.demo1

import org.apache.spark.SparkContext
import org.apache.spark.SparkContext._
import org.apache.spark.SparkConf

/**
 * @author ${user.name}
 */
object App {
  def main(args: Array[String]) {
    // point this at the README.md under your Spark installation directory
    val logFile = "/usr/local/spark/spark-1.3.1-bin-hadoop2.6/README.md"
    val conf = new SparkConf().setAppName("App")
    val sc = new SparkContext(conf)
    val logData = sc.textFile(logFile, 2).cache()
    val numAs = logData.filter(line => line.contains("a")).count()
    val numBs = logData.filter(line => line.contains("b")).count()
    println("Lines with a: %s, Lines with b: %s".format(numAs, numBs))
  }
}
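What the job computes is just two line counts over README.md: the number of lines containing the letter "a" and the number containing "b". The same logic can be sanity-checked without Spark using grep -c, shown here on a small sample file rather than the README path assumed above:

```shell
# build a tiny sample input
printf 'apache spark\nbuilt for big data\nhello\n' > sample.txt
# grep -c counts matching lines, mirroring filter(...).count() in App.scala
numAs=$(grep -c 'a' sample.txt)
numBs=$(grep -c 'b' sample.txt)
echo "Lines with a: $numAs, Lines with b: $numBs"
```

Here the first two lines contain "a" and only the second contains "b", so this prints "Lines with a: 2, Lines with b: 1".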

Then run mvn clean package to build the project. Maven will download the Scala and Spark dependencies it needs, then compile and package the application.

[root@localhost spark-hello]# mvn clean package
[INFO] Scanning for projects...
[INFO]
[INFO] ------------------------------------------------------------------------
[INFO] Building spark-hello 1.0-SNAPSHOT
[INFO] ------------------------------------------------------------------------
[INFO]
[INFO]
[INFO] --- maven-scala-plugin:2.15.2:testCompile (default) @ spark-hello ---
[WARNING] No source files found.
[INFO]
[INFO] --- maven-surefire-plugin:2.10:test (default-test) @ spark-hello ---
[INFO] Surefire report directory: /usr/local/maven-study/spark-hello/target/surefire-reports
.........................................................
Results :
Tests run: 0, Failures: 0, Errors: 0, Skipped: 0
[INFO]
[INFO] --- maven-jar-plugin:2.3.2:jar (default-jar) @ spark-hello ---
[INFO] Building jar: /usr/local/maven-study/spark-hello/target/spark-hello-1.0-SNAPSHOT.jar
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 14.238s
[INFO] Finished at: Sun Jun 14 20:08:32 CST 2015
[INFO] Final Memory: 13M/32M
[INFO] ------------------------------------------------------------------------
[root@localhost spark-hello]#

Finally, submit and run the jar with spark-submit. Here --class names the entry point, and --master local[2] runs Spark locally with two worker threads:


[root@localhost spark-hello]# spark-submit --class "com.spark.demo1.App" --master local[2] target/spark-hello-1.0-SNAPSHOT.jar
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
15/06/14 20:09:55 INFO SparkContext: Running Spark version 1.3.1
....................................................
15/06/14 20:09:59 INFO DAGScheduler: Stage 1 (count at App.scala:18) finished in 0.033 s
15/06/14 20:09:59 INFO DAGScheduler: Job 1 finished: count at App.scala:18, took 0.068178 s
Lines with a: 60,Lines with b: 29
[root@localhost spark-hello]#