
Spark Tip: Writing LZO-Compressed Output Files

admin · 2019-11-20

Prerequisites

The operating-system LZO libraries and Hadoop-LZO are already installed and configured.

Set the driver and executor classpath

Open /etc/spark/conf/spark-defaults.conf and add the Hadoop-LZO dependency jar:

spark.driver.extraClassPath=/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/hadoop-lzo.jar
spark.executor.extraClassPath=/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/hadoop-lzo.jar

Or:
spark.driver.extraClassPath=/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*
spark.executor.extraClassPath=/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*

Alternatively, edit spark-env.sh:

SPARK_LIBRARY_PATH=$SPARK_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
SPARK_CLASSPATH=$SPARK_CLASSPATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/*
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/native
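Before launching any jobs, it can be worth confirming that Hadoop actually sees both the jar and the native LZO library. Assuming the same CDH-style parcel layout used above (the exact paths depend on your installation), something like the following works:

```
# Verify that the native LZO codec is loadable; the "lzo:" line
# should report "true" when the library is installed correctly
hadoop checknative -a | grep lzo

# Confirm the jar referenced by extraClassPath actually exists
ls /opt/cloudera/parcels/HADOOP_LZO/lib/hadoop/lib/hadoop-lzo.jar
```

If `checknative` reports `lzo: false`, the `LD_LIBRARY_PATH` / native-library settings above are the first thing to re-check.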

Test in the spark-shell


scala> import com.hadoop.compression.lzo.LzopCodec
import com.hadoop.compression.lzo.LzopCodec

scala> val lzoTest = sc.parallelize(1 to 10)
lzoTest: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:25

scala> lzoTest.saveAsTextFile("/input/test_lzo", classOf[LzopCodec])

Check that the files under /input/test_lzo are in LZO format (the part files should carry the .lzo extension).
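As a further sanity check in the same spark-shell session, the compressed output can be read straight back; Spark decompresses .lzo files transparently once the codec jar is on the classpath. A sketch, using the same path written above:

```scala
// Read the LZO-compressed part files back; decompression is automatic
// because LzopCodec is available via the Hadoop-LZO jar on the classpath
val roundTrip = sc.textFile("/input/test_lzo")

// Expect 10 records, matching the 1 to 10 written earlier
roundTrip.count()
```

If the count comes back as 10, both compression on write and decompression on read are working end to end.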

If you found this useful, give it a like and a follow — let's keep learning together!
