Apache Spark / Export Data

1. Set Default Keyspace

2. Spark Query to DataFrame

3. Convert to RDD

4. Save to CFS

dse spark shell

scala> csc.setKeyspace("engine")

scala> val df1 = csc.sql("""SELECT domain, count(*) AS total FROM links
                            WHERE url LIKE '%health%'
                            GROUP BY domain
                            ORDER BY total DESC""").toDF()

scala> val df2 = df1.rdd                // DataFrame -> RDD[Row]

scala> df2.saveAsTextFile("result123")  // relative path lands in /user/root/result123 on CFS
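Note that `saveAsTextFile` on an `RDD[Row]` writes each row's `toString`, which looks like `[domain,total]`. If plain delimited output is wanted, mapping each row to a string first gives cleaner files. A minimal sketch of that mapping, using plain Scala tuples as stand-ins for the `Row` objects (the sample domains and counts below are invented for illustration):

```scala
object CsvSketch {
  // In the spark shell the equivalent would be:
  //   df2.map(r => s"${r(0)},${r(1)}").saveAsTextFile("result123")
  // Here we format (domain, total) pairs directly.
  def toCsvLine(domain: String, total: Long): String = s"$domain,$total"

  def main(args: Array[String]): Unit = {
    val rows = Seq(("example.org", 42L), ("example.com", 7L)) // fake sample data
    rows.map { case (d, t) => toCsvLine(d, t) }.foreach(println)
  }
}
```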

5. List CFS directory

6. Copy from cfs://

7. Delete cfs://

bash shell

$ dse hadoop fs -ls /user/root/result123                       # list the output part files

$ dse hadoop fs -getmerge /user/root/result123 result123.dat   # merge part files into one local file

$ dse hadoop fs -rmr /user/root/result123                      # recursively delete the directory from CFS

# using the cassandra user
$ dse -u cassandra hadoop fs -ls /
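The `-getmerge` in step 6 concatenates every `part-NNNNN` file in the output directory into a single local file. A local simulation of that behaviour with plain `cat` (the directory, file names, and contents here are invented for illustration):

```shell
# Fake a Spark output directory with two part files.
mkdir -p /tmp/result_demo
printf 'example.org,42\n' > /tmp/result_demo/part-00000
printf 'example.com,7\n'  > /tmp/result_demo/part-00001

# Same idea as: dse hadoop fs -getmerge <dir> <local file>
cat /tmp/result_demo/part-* > /tmp/result_demo.dat
cat /tmp/result_demo.dat
```

Part files sort lexicographically, so the merged file preserves the partition order of the original RDD.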