Download a file from DBFS

1. Write the Spark DataFrame to CSV. repartition(1) ensures the output folder contains only a single 'part' file.

/FileStore is the default directory shown when you click 'Browse DBFS' in Databricks.

(
  df.repartition(1)                      # force a single output partition, so only one part file is written
    .write
    .format("com.databricks.spark.csv")  # the built-in "csv" format also works on Spark 2.0+
    .option("header", "true")
    .save("dbfs:/FileStore/temp/output")
)

2. Browse DBFS

Navigate to the output folder, find the part-xxx file, and copy its full path (or look it up from a notebook, as in the sketch below).
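If you prefer to get the part file path programmatically instead of through the UI, a minimal sketch using dbutils.fs.ls in a Databricks notebook (the folder path is assumed from step 1):

# List the output folder and pick the single part file written in step 1
files = dbutils.fs.ls('dbfs:/FileStore/temp/output')
part_file = [f.path for f in files if f.name.startswith('part-')][0]
print(part_file)  # e.g. dbfs:/FileStore/temp/output/part-00000-....csv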

3. Configure the Databricks CLI. This needs the workspace URL and a personal access token.

databricks configure --token
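
The command prompts for both values interactively; the prompts look roughly like this (exact wording depends on the CLI version, and the values below are placeholders):

Databricks Host (should begin with https://): https://<your-workspace-url>
Token: <your-personal-access-token>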


4. Copy the file from DBFS to a local directory

databricks fs cp dbfs:/FileStore/temp/output/part-xxx.csv c:/temp/output.csv


5. Delete the output folder from DBFS to clean up

Run this dbutils command in a Databricks notebook; the second argument (True) deletes the folder recursively.

dbutils.fs.rm('dbfs:/FileStore/temp/output', True)
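
If you want to confirm the cleanup from the same notebook, a quick check like the one below works (the path is assumed from the earlier steps; dbutils.fs.ls raises an error once the folder is gone):

# Verify the output folder was removed
try:
    dbutils.fs.ls('dbfs:/FileStore/temp/output')
    print('output folder still exists')
except Exception:
    print('output folder removed')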