Spark
Zeppelin
Installation
$ wget https://downloads.apache.org/zeppelin/zeppelin-0.9.0/zeppelin-0.9.0-bin-all.tgz
$ tar -zxvf zeppelin-0.9.0-bin-all.tgz
$ mv zeppelin-0.9.0-bin-all zeppelin
$ ./zeppelin/bin/zeppelin-daemon.sh start
Please specify HADOOP_CONF_DIR if USE_HADOOP is true
Zeppelin is not running
Zeppelin start [ OK ]
$ curl http://localhost:8080
Practice
Word Count (Object Save)
scala> val f = sc.textFile("README.md")
scala> val wc = f.flatMap(l => l.split(" ")).map(word => (word, 1)).reduceByKey(_ + _)
scala> wc.saveAsObjectFile("hdfs://master-01:9000/wc_out.obj")
scala> val obj = sc.objectFile[(String, Int)]("hdfs://master-01:9000/wc_out.obj")
scala> obj.collect.foreach(println)
(package,1)
(this,1)
...
scala> val obj2 = obj.sortBy(x => x._2, false)
scala> val arr = obj2.take(20)
scala> arr.foreach(x => println(x))
(,73)
(the,23)
(to,16)
(Spark,14)
(for,12)
(##,9)
(a,9)
(and,9)
(is,7)
(run,7)
(on,7)
(can,6)
(also,5)
(in,5)
(of,5)
(Please,4)
(*,4)
(if,4)
(including,4)
(an,4)
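The top entry in the result above, (,73), is the empty string. Splitting on a single space produces empty tokens for blank lines, indented lines, and consecutive spaces. The following is a minimal sketch of one way to drop them, written against plain Scala collections instead of an RDD so that it runs without Spark. The object name WordCountSketch, the helper wordCount, and the sample lines are hypothetical and not part of the original session.

```scala
// Plain Scala collections stand in for the RDD here; no Spark is required.
object WordCountSketch {
  // Hypothetical helper: counts words and drops empty tokens.
  def wordCount(lines: Seq[String]): Map[String, Int] =
    lines
      .flatMap(_.split("\\s+"))  // split on runs of whitespace
      .filter(_.nonEmpty)        // drop "" left by blank or indented lines
      .groupBy(identity)         // group identical words together
      .map { case (word, occurrences) => (word, occurrences.size) }

  def main(args: Array[String]): Unit = {
    // Made-up sample input, not the real README.md.
    val lines = Seq("# Apache Spark", "", "  Spark is fast", "Spark  runs on YARN")

    // Splitting on a single space, as in the session above, yields empty tokens.
    val emptyTokens = lines.flatMap(_.split(" ")).count(_.isEmpty)
    println("empty tokens from split on a single space: " + emptyTokens)

    // Sort by descending count, then by word, and print the top 3.
    wordCount(lines).toSeq
      .sortBy { case (word, n) => (-n, word) }
      .take(3)
      .foreach(println)
  }
}
```

In the spark-shell session, the same idea would presumably amount to adding .filter(_.nonEmpty) after the flatMap step, before the words are mapped to (word, 1) pairs.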