Amy Ngo - Separate .list files in Python to CSV file, Clean up data in R, Format data in R.
Desiree Murray - Clean up data in R, Format data in R.
Screen capture of app
https://youtu.be/mX-VnxD-Bqc
https://github.com/amyngo3/cs424_project3
http://shiny.evl.uic.edu:3838/p3g5/
https://docs.google.com/document/d/1gJMWqOxxzxTdzs8VoewGEd_Dj_MQXXXmjDW_VnAcrmc/edit?usp=sharing
We placed quick info boxes at the top so users can see the average number of films per year, the average number of films per month, and the average running time. The user can then pick a year or a decade from the drop-down boxes, and the selection changes the data shown on multiple graphs.
The plots on the dashboard let the user explore various aspects of the database. They show the top ten keywords associated with films, as well as the distribution of genres, certificates, and running times. Additionally, there are plots that display the number of films released per month and per year across the entire database.
Finally, an About page, which can be selected from the dashboard sidebar menu, displays information about the creation of the dashboard.
We converted the list files to text files and removed the lines before and after the list of films. Then we used Python to extract the fields we needed and write them to a CSV file. During this filtering step, Python's regular expressions made it convenient to separate the data we needed from the data we did not. Some data in a few files could not be sorted this way, so we used R to sort it.
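The filtering step described above can be sketched roughly as follows. This is only an illustration, not the project's actual script: the line layout (a title, a four-digit year in parentheses, one or more tabs, then a value), the pattern LINE_RE, and the helper names list_lines_to_rows and rows_to_csv are all assumptions made for the example.

```python
import csv
import io
import re

# Assumed (hypothetical) line layout: "<title> (<year>)<tab(s)><value>".
# The real .list files may differ; adjust the pattern to the actual data.
LINE_RE = re.compile(r"^(?P<title>.+?) \((?P<year>\d{4})\)\t+(?P<value>.+)$")


def list_lines_to_rows(lines):
    """Keep only lines that match the pattern; return (title, year, value) tuples."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line.rstrip("\n"))
        if match:  # lines that do not fit the pattern are skipped
            rows.append(
                (match.group("title"), int(match.group("year")), match.group("value"))
            )
    return rows


def rows_to_csv(rows):
    """Serialize the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["title", "year", "value"])
    writer.writerows(rows)
    return buffer.getvalue()


# Made-up sample lines, not taken from the real data set.
sample = [
    "Example Film (1999)\t\t\t120\n",
    "-----------------\n",
    "Another Film (2005)\tDrama\n",
]
rows = list_lines_to_rows(sample)
csv_text = rows_to_csv(rows)
print(csv_text)
```

Lines that do not fit the pattern, such as the separator line in the sample, are simply dropped, which mirrors the idea of using a regular expression to keep the needed data and discard the rest.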
The data came from the Internet Movie Database (IMDb); the list files are available at ftp://ftp.fu-berlin.de/pub/misc/movies/database/frozendata/
Another site we used is https://regex101.com/. This site is very useful for writing regular expressions and testing strings against them. It provides an explanation box with details of the typed regular expression, a match information panel that shows how the test string matches in groups, and a quick reference guide.
The files used were: