To learn how to use our code, generate test cases, and replicate the results of our study, we suggest watching the following demonstration video (feel free to increase the playback speed):
Make sure you have checked out our GitHub repository on your local device.
Go to the Google Cloud Platform website and sign in using your UCM credentials. Make sure you are inside the MarketQuake project.
From the Navigation Menu, go to "Dataproc" and then select "Clusters."
Locate the marketquake-cluster. Make sure it is running; if it is not, start it by clicking the "Start" button on the same page. Navigate to Cluster Details by clicking on the cluster name.
Navigate to "VM instances" and open an SSH session to the master node (marketquake-cluster-m) in your browser. You might have to wait a few seconds before the SSH option becomes available.
Wait for the terminal to open and authorize. Then, click "Upload File" in the top-right corner to upload all files from the scripts/ (but not preprocessing/) directory. Wait for the files to load.
(optional) After the upload, verify that the files are present by typing ls in the command line.
Generate the needed PySpark and/or plotting command by running: python generate_commands.py. Answer the prompts based on your specific needs.
Once the PySpark command is displayed as output in the terminal, copy and paste it into the command line and run it. Wait for the computation to complete and display the output, in particular the paths where the resulting CSV files are now stored.
(optional) Check that the CSV files for plotting are present: go to the Navigation Menu, select "Cloud Storage", and then "Buckets." Then open the marketquake_results bucket, navigate to the CSVs/general or CSVs/extremes directory, and choose the CSV folder corresponding to the path displayed in the terminal output. Click on the part file and then on "Authenticated URL" to download it to your local device.
Return to the SSH browser session, copy the plotting command generated earlier (if applicable), then paste and run it in the terminal. Wait for the plot generation to complete and display the GCS path where the plot has been saved.
To view the plot, go back to Cloud Storage, but this time navigate to marketquake_results/Plots/general.
Choose the plot you're interested in, click on its link, and then on "Authenticated URL." The plot should now be displayed in your browser window, and you can save it to your local device for further interpretation.
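When locating results in the Cloud Storage browser, it can help to split the path printed in the terminal into the bucket name and the folder to open. The following is a minimal sketch, assuming the printed paths have the usual gs://bucket/folder form; the helper name split_gcs_path and the folder name example_run are hypothetical and not part of the repository.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_gcs_path(path: str) -> Tuple[str, str]:
    """Split a gs:// path into (bucket name, folder path inside the bucket)."""
    parsed = urlparse(path)
    if parsed.scheme != "gs" or not parsed.netloc:
        raise ValueError(f"not a gs:// path: {path!r}")
    # netloc is the bucket; the remainder is the folder to browse to.
    return parsed.netloc, parsed.path.strip("/")


if __name__ == "__main__":
    # Hypothetical path: "example_run" is a made-up folder name.
    bucket, folder = split_gcs_path(
        "gs://marketquake_results/CSVs/general/example_run"
    )
    print("Bucket:", bucket)
    print("Folder:", folder)
```

The first value is the bucket to select under "Buckets", and the second is the directory to navigate to inside it.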
spark-submit main.py general Close all_markets daily_covid_deaths world World --py-files merge_by_group.py merge_all.py
spark-submit main.py general AdjustedClose sp500 daily_covid_deaths country USA --py-files merge_by_group.py merge_all.py
spark-submit main.py general Close all_sectors daily_covid_deaths regions Europe --py-files merge_by_group.py merge_all.py
spark-submit main.py extremes Close all_markets daily_covid_deaths world World --py-files merge_by_group.py merge_all.py
etc...
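The example commands above all share the same positional layout, so further test cases can be assembled mechanically. The sketch below is not the repository's generate_commands.py; it only reproduces the command strings shown above. The function name build_command and the parameter names (mode, price_column, market, covid_metric, scope, area) are assumptions made for illustration, inferred from the examples.

```python
import shlex

# Helper modules listed after --py-files in the example commands.
HELPER_FILES = ["merge_by_group.py", "merge_all.py"]

# The two analysis modes that appear in the example commands.
KNOWN_MODES = {"general", "extremes"}


def build_command(mode, price_column, market, covid_metric, scope, area):
    """Assemble a spark-submit command string in the layout of the examples."""
    if mode not in KNOWN_MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    args = [
        "spark-submit", "main.py",
        mode, price_column, market, covid_metric, scope, area,
        "--py-files", *HELPER_FILES,
    ]
    # Quote each argument so the string can be pasted into a shell safely.
    return " ".join(shlex.quote(a) for a in args)


if __name__ == "__main__":
    print(build_command("general", "Close", "all_markets",
                        "daily_covid_deaths", "world", "World"))
    print(build_command("extremes", "Close", "all_markets",
                        "daily_covid_deaths", "world", "World"))
```

For actual runs, prefer the command printed by generate_commands.py, since it reflects the options the scripts really accept.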