Pacific-Design.com

    
Spark Dataset

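// Brings the String encoder that .as[String] needs into scope.
import sqlContext.implicits._

// Read each line of the input as a String (Spark 1.6 Dataset API).
val lines = sqlContext.read.text("/wikipedia").as[String]

// Split lines into words and drop the empty strings left by repeated spaces.
val words = lines
    .flatMap(_.split(" "))
    .filter(_ != "")

// Count occurrences case-insensitively; the result is a Dataset[(String, Long)].
// In Spark 2.x, the function-based groupBy was renamed groupByKey.
val counts = words
    .groupBy(_.toLowerCase)
    .count()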
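
To see what the pipeline above computes without a cluster, the same three steps can be sketched on plain Scala collections. This is only an illustration of the logic, not the Dataset API itself: the object name `WordCountSketch`, the helper `countWords`, and the `sampleLines` data are hypothetical and stand in for the contents of /wikipedia.

```scala
// Plain Scala collections stand-in for the Dataset pipeline (no Spark needed).
object WordCountSketch {

  // Mirrors the flatMap / filter / groupBy / count steps on an in-memory Seq.
  def countWords(lines: Seq[String]): Map[String, Long] =
    lines
      .flatMap(_.split(" "))      // one element per space-separated token
      .filter(_ != "")            // drop empty tokens left by repeated spaces
      .groupBy(_.toLowerCase)     // key each word by its lower-cased form
      .map { case (word, occurrences) => word -> occurrences.size.toLong }

  // Hypothetical sample input; note the double space in the first line.
  val sampleLines: Seq[String] = Seq("Spark  Dataset API", "spark dataset")

  def main(args: Array[String]): Unit =
    countWords(sampleLines).toSeq.sortBy(_._1).foreach(println)
}
```

The double space in the sample shows why the filter step is there: splitting on a single space produces an empty string between consecutive spaces, which would otherwise be counted as a word.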

References

https://databricks.com/blog/2016/01/04/introducing-apache-spark-datasets.html