Pacific-Design.com

    

Export DataFrame to different storage systems


// save a DataFrame in JSON format
// (the default save mode fails if the output path already exists)
customerDF.write
  .format("json")
  .save("path/to/output-directory")

// save a DataFrame in Parquet format
homeDF.write
  .format("parquet")
  .partitionBy("city")
  .save("path/to/output-directory")

// save a DataFrame in ORC file format
homeDF.write
  .format("orc")
  .partitionBy("city")
  .save("path/to/output-directory")

// save a DataFrame as a Postgres database table
// (the PostgreSQL JDBC driver must be on the classpath)
df.write
  .format("jdbc")
  .options(Map(
    "url" -> "jdbc:postgresql://host:port/database?user=&password=",
    "dbtable" -> "schema-name.table-name"))
  .save()

// save a DataFrame to a Hive table
// (requires a session or context created with Hive support)
df.write.saveAsTable("hive-table-name")
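
The snippets above assume that the DataFrames and the output locations already exist. As a rough end-to-end sketch, the program below builds a small DataFrame, writes it as JSON and as Parquet partitioned by city, and reads the Parquet output back. Several things here are assumptions for illustration and not part of the original page: the object name `ExportDataFrameSketch`, the sample rows, the temporary output directory, the `local[*]` master, and the use of `SparkSession` (Spark 2.0 or later). It also sets an explicit save mode and uses the `.json(path)` and `.parquet(path)` shortcuts, which should behave like `.format(...).save(path)`.

```scala
import java.nio.file.Files
import org.apache.spark.sql.{SaveMode, SparkSession}

// Hypothetical example object; not part of the original page.
object ExportDataFrameSketch {

  // Writes a small DataFrame as JSON and partitioned Parquet, reads the
  // Parquet copy back, and returns its row count and column names.
  def roundTrip(): (Long, Set[String]) = {
    // Assumption: a local session is enough for a demonstration.
    val spark = SparkSession.builder()
      .appName("export-dataframe-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Made-up sample data standing in for homeDF.
    val homeDF = Seq(
      ("Seattle", 3, 650000.0),
      ("Portland", 2, 480000.0),
      ("Seattle", 4, 820000.0)
    ).toDF("city", "bedrooms", "price")

    // Temporary directory so the sketch does not touch real paths.
    val outputDir = Files.createTempDirectory("spark-export-").toString

    // Overwrite replaces existing output instead of failing on it.
    homeDF.write
      .mode(SaveMode.Overwrite)
      .json(s"$outputDir/homes-json")

    // partitionBy creates one sub-directory per distinct city value.
    homeDF.write
      .mode(SaveMode.Overwrite)
      .partitionBy("city")
      .parquet(s"$outputDir/homes-parquet")

    // Reading the root directory recovers "city" from the directory names.
    val restored = spark.read.parquet(s"$outputDir/homes-parquet")
    val result = (restored.count(), restored.columns.toSet)

    spark.stop()
    result
  }

  def main(args: Array[String]): Unit = {
    val (rows, columns) = roundTrip()
    println(s"rows=$rows columns=${columns.toSeq.sorted.mkString(",")}")
  }
}
```

The other save modes are `SaveMode.Append`, `SaveMode.Ignore`, and the default `SaveMode.ErrorIfExists`. Which one fits depends on whether reruns should replace, extend, or protect the existing output.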