Our project is a deep analytical dive into the world of movies. Movies have been around for more than a century and are an important form of media in society. Going to the movie theater is a popular social activity, and every year millions of people go with friends and family to watch a film. Today we can also stream films from the comfort of our homes. Our website offers a look into the history of films stored in the IMDb database. We took this data and grouped the movies by genre, runtime, release date, rating, and associated keywords.
When the user looks at our project, they will see two main sets of graphs at the center of the website. The set on the left, which shows the overall distribution of films, contains six graphs that can be accessed by clicking on their respective headers: number of films by year, month, runtime, certificate, genre, and top keywords. Here the user can see all the available movie data presented as multiple graphs. The user can choose how many top keywords to display; the only constraint is that the number has to be greater than 10. The user can also view these plots in table form by clicking on the "Tabular Format" tab. Below are screenshots of the Overall Distribution of Films by Year graph and of the table of overall films by month.
These graphs are also interactive and allow the user to filter them by a chosen year or decade. On the left-hand side of the website, the user will see two sliders labeled "Select a Year" and "Select a Decade".
The user can move a slider to a certain year or decade and the graphs will reflect that change, using only films that were released in the selected year or decade. While the tables keep the same layout, the graphs change: when we select a year or decade, the graph shows our selection in comparison to the rest of the data, so the user can see how a certain time period stacks up. For example, if a user wanted to know what films were released in 1990, the graph of films released per month and the table of the distribution of genres for films in the 90's would look like this:
The left-hand side also offers more options for changing the graphs. Below the year and decade sliders, the user can select the number of top keywords, one or more genres, an individual keyword, and a certificate. There is also a slider for the range of runtimes of the movies shown. It is initially set to 60 to 120 minutes, but the user is free to change it.
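To make the filtering behavior concrete, here is a minimal sketch of how a year or decade selection could be combined with the runtime range. It is written in Python purely for illustration and is not the code behind our website, which was built in R. The names `Film`, `in_decade`, and `filter_films` are hypothetical, the sample films are made up, and we assume the runtime bounds are inclusive.

```python
from dataclasses import dataclass


@dataclass
class Film:
    # Hypothetical record; the real data set has more fields.
    title: str
    year: int
    runtime: int  # minutes


def in_decade(year, decade_start):
    """True if `year` falls in the ten years starting at `decade_start`."""
    return decade_start <= year < decade_start + 10


def filter_films(films, year=None, decade=None, runtime_range=(60, 120)):
    """Keep films matching the selected year or decade and runtime range.

    Assumption: runtime bounds are inclusive, and a filter left as None
    is simply not applied.
    """
    low, high = runtime_range
    kept = []
    for film in films:
        if year is not None and film.year != year:
            continue
        if decade is not None and not in_decade(film.year, decade):
            continue
        if not (low <= film.runtime <= high):
            continue
        kept.append(film)
    return kept


# Made-up sample data for illustration only.
films = [Film("A", 1990, 95), Film("B", 1994, 130), Film("C", 2001, 88)]

nineties_default = filter_films(films, decade=1990)
nineties_wide = filter_films(films, decade=1990, runtime_range=(60, 150))
only_1990 = filter_films(films, year=1990)
```

With the default 60 to 120 minute range, film "B" (130 minutes) is left out of the 1990s selection; widening the range to 150 minutes brings it back.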
The user can select more than one certificate, and the graph will show only the data for the selected certificates. For example, if the user wants to see how many movies were rated G, they can click on the "certificates" button, select G, and the graph will reflect that change. Below is a graph showing how many G-rated movies in the data set were released per month, along with the controls where the user can change the different settings of the data set.
Next, we move to the set of graphs on the right. These show the distribution of films by a genre that the user can specify. When the website first loads, the plots on the right show the data for all the films, but the user has a variety of filtering options. This set includes ten graphs and their respective tables, including the percentage of films released per year, month, and decade. The user can select multiple genres and the graphs will reflect the selection. The same goes for runtime: if the user selects a range of runtimes (as shown in the screenshot above), the graphs will show only the films whose runtime falls within that range. So if the user wanted to see the distribution of Action and Comedy films released per decade along with the percentage of films released per month, the graphs can display that scenario.
In addition to choosing multiple genres, the user can also choose a combination of genre, keyword, and certificate, and the graphs will reflect that.
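As a rough illustration of combining these filters, the sketch below keeps the films that match a genre, keyword, and certificate selection and then computes the percentage of the matching films released in each decade. It is written in Python for illustration only and is not our actual implementation. The function names and sample films are hypothetical, and two points are assumptions: a film matches if it has any one of the selected genres, and percentages are taken relative to the filtered films rather than the full data set.

```python
from collections import Counter

# Made-up sample data for illustration only.
films = [
    {"title": "A", "year": 1991, "genres": {"Action"}, "certificate": "PG", "keywords": {"heist"}},
    {"title": "B", "year": 1995, "genres": {"Comedy"}, "certificate": "G", "keywords": {"family"}},
    {"title": "C", "year": 2003, "genres": {"Action", "Comedy"}, "certificate": "PG", "keywords": {"heist"}},
    {"title": "D", "year": 2008, "genres": {"Drama"}, "certificate": "G", "keywords": {"family"}},
    {"title": "E", "year": 2005, "genres": {"Comedy"}, "certificate": "PG", "keywords": {"school"}},
]


def matches(film, genres=None, keyword=None, certificates=None):
    """True if the film passes every filter that was actually set.

    Assumption: a film matches the genre filter if it has ANY selected genre.
    """
    if genres and not (film["genres"] & set(genres)):
        return False
    if keyword and keyword not in film["keywords"]:
        return False
    if certificates and film["certificate"] not in certificates:
        return False
    return True


def percent_by_decade(selected):
    """Share of the selected films released in each decade, in percent."""
    counts = Counter((f["year"] // 10) * 10 for f in selected)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {decade: 100 * n / total for decade, n in sorted(counts.items())}


action_comedy = [f for f in films if matches(f, genres=["Action", "Comedy"])]
shares = percent_by_decade(action_comedy)

# Combining genre, keyword, and certificate narrows the selection further.
combined = [
    f for f in films
    if matches(f, genres=["Action"], keyword="heist", certificates=["PG"])
]
```

In this made-up sample, four films are Action or Comedy, split evenly between the 1990s and the 2000s, so each decade accounts for half of the selection.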
At the bottom you will see four information boxes. The first three show the average number of films per year, the average number of films per month, and the average runtime of all the movies. The fourth box, set a bit farther from the rest, is called "Current Total" and shows the total number of movies that fit the criteria the user has specified. It is the only box that updates; the numbers in the other three boxes do not change with user input.
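The difference between the fixed boxes and "Current Total" can be sketched as follows. Again, this is an illustrative Python sketch under our own assumptions, not the code behind the website. The function `summary_boxes` and the sample films are hypothetical, and we assume the averages are computed as the total number of films divided by the number of distinct release years or release months in the data.

```python
from statistics import mean

# Made-up sample data for illustration only.
all_films = [
    {"year": 1990, "month": 1, "runtime": 90},
    {"year": 1990, "month": 6, "runtime": 100},
    {"year": 1991, "month": 1, "runtime": 110},
    {"year": 1991, "month": 6, "runtime": 120},
]


def summary_boxes(all_films, filtered_films):
    """Values for the four information boxes.

    The three averages use the full data set, so they stay fixed;
    only current_total depends on the user's filtered selection.
    Assumption: averages divide by the number of distinct years/months.
    """
    years = {f["year"] for f in all_films}
    months = {f["month"] for f in all_films}
    return {
        "avg_films_per_year": len(all_films) / len(years),
        "avg_films_per_month": len(all_films) / len(months),
        "avg_runtime": mean(f["runtime"] for f in all_films),
        "current_total": len(filtered_films),
    }


# A hypothetical user selection: films of at most 100 minutes.
short_films = [f for f in all_films if f["runtime"] <= 100]

boxes_unfiltered = summary_boxes(all_films, all_films)
boxes_filtered = summary_boxes(all_films, short_films)
```

Comparing `boxes_unfiltered` with `boxes_filtered` shows that only the `current_total` entry differs between the two.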
At the top left of the website is our About page, which displays our names, the R packages we used, and the source of our data, along with a download link to those files. Also included is the IMDb logo, since we got the data from there and the point of our visualizations was to show the different ways the movie data can be used.
Finally, here is a screenshot of how our webpage/project looks as a whole: