How-to: resolve Spark "/usr/bin/python: No module named pyspark" issue
2015-08-05 13:39
Error:
Error from python worker:
/usr/bin/python: No module named pyspark
PYTHONPATH was:
/home/hadoop/tmp/nm-local-dir/usercache/chenfangfang/filecache/43/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar
java.io.EOFException
at java.io.DataInputStream.readInt(DataInputStream.java:392)
Root cause:
I am using JDK 1.7.0_45. PySpark on YARN does not work with a Spark assembly jar built with JDK 7 (see https://issues.apache.org/jira/browse/SPARK-1520): Python's zipimport cannot read such jars, so the workers cannot find the pyspark module inside the assembly.
This issue was not supposed to affect CDH 5.4.1 Spark: CDH 5.4.1 announced that it runs on JDK 1.7.0_45, while its Spark was said to be built with JDK 6. In practice, however, the shipped assembly jar still triggers the problem.
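A quick way to confirm this is your problem is to check whether Python's zipimport, the mechanism PySpark workers use to load pyspark from the assembly jar, can open the jar at all. The sketch below is self-contained: it builds a tiny stand-in zip so it runs anywhere; in practice, point JAR at the real assembly jar from the PYTHONPATH in the error message and run it with the same interpreter your workers use (PYSPARK_PYTHON).

```shell
# Minimal check: can zipimport open this archive? JAR defaults to a tiny
# stand-in zip (created below); substitute your real assembly jar path.
JAR="${JAR:-demo-assembly.zip}"
"${PYSPARK_PYTHON:-python3}" - "$JAR" <<'EOF'
import os, sys, zipfile, zipimport

jar = sys.argv[1]
if not os.path.exists(jar):
    # Build a stand-in archive containing a pyspark package marker
    with zipfile.ZipFile(jar, 'w') as z:
        z.writestr('pyspark/__init__.py', '')

try:
    zipimport.zipimporter(jar)       # same loader the Python workers use
    print('zipimport can read ' + jar)
except zipimport.ZipImportError as exc:
    # A JDK7-built (Zip64) assembly fails here on affected Python versions
    print('zipimport FAILED: ' + str(exc))
EOF
```

If the real assembly jar fails this check while a JDK6-built jar passes, the repackaging below should fix it.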
Solution:
Rebuilding Spark with JDK 6 ourselves is not practical, as the build itself runs into problems. One workable solution is to regenerate the assembly jar with JDK 6's jar tool:
unzip -d foo spark/lib/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar
cd foo
$JAVA6_HOME/bin/jar cvmf META-INF/MANIFEST.MF ../spark/lib/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar .
Don't neglect the dot at the end of that command: it tells jar to pack everything in the current directory.
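Put together, the repackaging steps above can be sketched as one script. The assembly path and JAVA6_HOME are assumptions from this CDH 5.4.1 setup; adjust them to your installation.

```shell
# Sketch of the repackaging sequence; ASSEMBLY and JAVA6_HOME are
# assumptions for this CDH 5.4.1 layout -- adjust to your installation.
ASSEMBLY="spark/lib/spark-assembly-1.3.0-cdh5.4.1-hadoop2.6.0-cdh5.4.1.jar"
if [ -f "$ASSEMBLY" ] && [ -x "$JAVA6_HOME/bin/jar" ]; then
    rm -rf foo && mkdir foo
    unzip -q -d foo "$ASSEMBLY"      # explode the JDK7-built assembly
    (
        cd foo
        # Repack with JDK 6's jar; the trailing "." packs this directory
        "$JAVA6_HOME/bin/jar" cvmf META-INF/MANIFEST.MF "../$ASSEMBLY" .
    )
else
    echo "assembly jar or JDK 6 not found; set ASSEMBLY and JAVA6_HOME first"
fi
```

After repackaging, rerun the PySpark job; once the regenerated jar is the one distributed to the YARN node-manager cache (as in the PYTHONPATH shown in the error), the workers should find pyspark again.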