Spark / Export Data

1. Set Default Keyspace

2. Spark Query to DataFrame

3. Convert to RDD

4. Save to CFS

dse spark


// 1. Set the default keyspace for the Cassandra SQL context
scala> csc.setKeyspace("engine")

// 2. Run the Spark SQL query; the result comes back as a DataFrame
scala> val df1 = csc.sql("""SELECT domain, count(*) total FROM links
                            WHERE url LIKE '%health%'
                            GROUP BY domain
                            ORDER BY total DESC""").toDF()

// 3. Convert the DataFrame to an RDD of Rows
scala> val df2 = df1.rdd

// 4. Save to CFS (the relative path ends up under /user/root/result123)
scala> df2.saveAsTextFile("result123")
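
Note that saveAsTextFile() simply calls toString on each Row, so the lines written
to CFS come out in Row's bracketed form ([domain,total]) rather than as delimited
fields. If a tab-separated file is wanted instead, the RDD can be mapped to plain
strings before saving. A minimal sketch, assuming domain is text, the count comes
back as a bigint, and using a hypothetical output directory result123_tsv:

// Format each Row as "domain<TAB>total" before writing to CFS
scala> val tsv = df1.rdd.map(row => s"${row.getString(0)}\t${row.getLong(1)}")

scala> tsv.saveAsTextFile("result123_tsv")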

5. List the CFS output directory

6. Copy from cfs://192.168.1.159/user/root/result123 to a local file

7. Delete cfs://192.168.1.159/user/root/result123

bash shell


# 5. List the output directory on CFS
$ dse hadoop fs -ls /user/root/result123

# 6. Merge the part files into a single local file
$ dse hadoop fs -getmerge /user/root/result123 result123.dat

# 7. Remove the output directory from CFS
$ dse hadoop fs -rmr /user/root/result123

# Using the cassandra user (when authentication requires it)
$ dse -u cassandra hadoop fs -ls /
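
If the -getmerge step is a nuisance, the RDD can also be coalesced to a single
partition before saving, so CFS ends up with one part file that can be copied
with a plain -get. A minimal sketch, assuming the aggregated result is small
enough for one task and using a hypothetical directory name result123_single:

// Write a single part-00000 file instead of one file per partition
scala> df1.rdd.coalesce(1).saveAsTextFile("result123_single")

The trade-off is that the whole result is written by one task, which is fine for
a small aggregate like this one but not for large exports.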