COMPUTER PROGRAMMING 2

Course Description


The course covers advanced topics in computer programming, using a general-purpose programming language to solve computing problems. Its emphasis is on training students to design, implement, test, and debug programs that solve computing problems with advanced programming constructs.


Specifically, this course uses the Python programming language to introduce students to advanced topics in computer programming. These include object-oriented programming (OOP), advanced string manipulation using regular expressions (regex), accessing files and databases, data analysis and data visualization, and graphical user interfaces (GUIs).

JOHN LEO D. ECHEVARIA

Bachelor of Science in Computer Science

Specialization Track in Software Engineering I

Growing up in Manila, I was always fascinated by the power of technology and its ability to change the world. Back in high school, I had the opportunity to join a contest and create a website for our school publishing team, and to my surprise, we reached the regionals. That was my first introduction to programming, and I have loved it ever since. So, when it came time to choose a college, I knew that I wanted to pursue a degree in computer science. But I also knew that to truly excel in this field, I needed to immerse myself in a new environment where I could focus on my studies and gain hands-on experience.


That's why I made the decision to move to Quezon Province to attend college. It's been a big transition, but I'm excited to be here and to be taking classes that will help me achieve my goal of becoming a competent computer science graduate. I'm determined to work hard and make the most of this opportunity.


When I'm not studying, I love to spend my time painting and creating mini programming projects. I believe that art and technology go hand in hand, and I find that these hobbies help me stay creative and inspired. They also let me pursue my interest in both fields and strike a balance between the technical and creative sides of my life. I also believe that nothing is impossible with hard work and determination, and I'm excited to see what the future holds for me.

Course Learning Outcomes and Performance Indicators 

MIDTERM COURSE OUTCOMES

COURSE OUTCOME PERFORMANCE INDICATOR

EXPECTATIONS

In this period, I expected to learn about data analysis and visualization. I expected to learn how to create data visualizations that would be of great help when we conduct research in the future. I also expected to learn, or at least get an idea of, how intricate and complex map-based data visualizations are created.

APPLICATION

Midterm Course Outcome Indicator Output

Behind The Spotlight:

Data Analysis on 25k IMDb Movies Data Set

Movies have been an integral part of our lives for over a century. They have the power to transport us to different worlds, evoke a range of emotions, and inspire us in countless ways. From the silent films of the early 1900s to the modern-day blockbusters, movies have undergone a tremendous transformation. With the advancements in technology, the way we consume and create films has changed drastically. As a data enthusiast and a lover of films, I have always been curious about the trends and patterns in the film industry. Therefore, I decided to utilize the power of Pandas and hvplot to dive into the world of movie data, wrangle, analyze, and visualize it to gain insights that can help us better understand the industry. In this portfolio, I will showcase my skills in data wrangling, analysis, and visualization by exploring a dataset of over 25,000 movies to answer various questions about the film industry.

25k IMDb Movies Dataset

About the Dataset:

The data used for the analysis and visualization was sourced from Kaggle and is called the '25k IMDb Movie Dataset'. It includes columns such as Movie Title, Run Time, Rating, User Rating, Genres, Overview, Plot Keyword, Director, Top 5 Casts, Writer, Year, and Path, which hold different data types, including strings, integers, floats, and lists. However, the raw dataset contained some impurities, such as typographical errors and mixed values in columns, which required data wrangling to address.

Jupyter Notebook of the Data Analysis:

NOTE: The visualization outputs of the Jupyter Notebook could not be displayed in the embed, so a separate section of the portfolio discusses the outputs and their interpretation.

Data Wrangling Performed on the Dataset:

To clean and prepare the dataset for analysis and visualization, I performed the following data wrangling operations:

These operations involved renaming columns, dropping and converting columns, cleaning data, and exploding lists to make the data easier to work with and analyze. The resulting dataset is now ready for analysis and visualization using pandas and hvplot.
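The kinds of operations named above can be sketched in pandas as follows. This is only a minimal illustration, not the notebook's actual code: the sample rows are made up, the lowercase column names are hypothetical, and the assumption that genres arrive as list-like strings is mine.

```python
import ast

import pandas as pd

# Made-up rows standing in for the raw file; these are not real dataset values.
raw = pd.DataFrame({
    "Movie Title": ["Film A", "Film B", "Film C"],
    "Rating": ["8.1", "no-rating", "6.4"],  # mixed values in one column
    "Genres": ["['Drama', 'Crime']", "['Comedy']", "['Drama']"],
    "Path": ["/a/", "/b/", "/c/"],
})

clean = (
    raw.rename(columns={"Movie Title": "title", "Rating": "rating", "Genres": "genres"})
       .drop(columns=["Path"])  # drop a column not needed for the analysis
)

# Convert ratings to numbers; values that cannot be parsed become NaN.
clean["rating"] = pd.to_numeric(clean["rating"], errors="coerce")

# Remove rows without a usable rating, then turn list-like strings into real lists.
clean = clean.dropna(subset=["rating"]).assign(
    genres=lambda d: d["genres"].apply(ast.literal_eval)
)

# Explode the lists so that each (film, genre) pair gets its own row.
exploded = clean.explode("genres").reset_index(drop=True)
print(exploded)
```

Exploding the list column is what makes per-genre counts and averages possible later, since a film with several genres then contributes one row to each of them.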

Overview on the Data Analysis Process:

Once the data had been cleaned through data wrangling, I analyzed and visualized it using pandas and hvplot. To manipulate the data and extract insights, I used several pandas functions, including groupby, count, mean, and sum. For visualization, I used several plot types offered by the hvplot library, such as line, bar, and horizontal bar charts. Additionally, I used HoloViews' line and text elements to create an overlay effect in the visualizations.
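As a rough sketch of the groupby-based aggregation described above, the snippet below counts films and averages ratings per year. The data and the column names (`title`, `year`, `rating`) are hypothetical placeholders, not values from the dataset.

```python
import pandas as pd

# Made-up sample; column names are assumptions for illustration.
movies = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "year": [1994, 1994, 1999, 2007],
    "rating": [9.0, 7.0, 8.5, 7.1],
})

# One row per year: number of films and their average rating.
per_year = (
    movies.groupby("year")
          .agg(films=("title", "count"), mean_rating=("rating", "mean"))
          .reset_index()
)
print(per_year)

# With hvplot installed, `import hvplot.pandas` adds a .hvplot accessor,
# so a line chart could be drawn with:
#   per_year.hvplot.line(x="year", y="films")
```

The plotting call is left as a comment so the snippet runs with pandas alone.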

Results and Interpretation of the Analysis

How Did the Advancement in Film Technology Affect the Number of Films Produced Over the Years?

There has been a significant increase in the number of movies released each year from 1906 to 2022. The findings of the data analysis and visualization show a clear upward trend in the number of films released per year, which indicates that the film industry has been rapidly growing over the past century.

The significant increase in the number of films produced may be attributed to advancements in film technology. As new technologies are introduced, filmmaking becomes more efficient and innovative. Here are some of the technological advancements plotted in the graph:

What Are the Trends for Each Genre Over Time?

The development of film technology has affected not only the quantity of films produced but also the types of films that are made. As shown in these graphs, the number of drama films has risen consistently since the early days of the film industry. In contrast, science fiction (sci-fi) films saw a significant surge when special effects technology was introduced, because filmmakers could then make 'impossible' scenes possible in their movies. This suggests that technological advancements can influence trends in film genres and audience preferences.

Data on all 22 genres are available in the Jupyter Notebook in the GitHub Repository.

Which Genres Were Most Commonly Used in Films or Movies?

According to the analysis, these are the top 10 most common genres of movies or films. Drama is the most common, which may be because it is one of the oldest film genres.

Top Genres of Movie Based on Ratings

Although drama is the most used genre, it is not the highest-rated one. According to the analysis, the 'Western' genre received the highest average rating, computed from the ratings of the films classified under that genre.
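The difference between the most common genre and the highest-rated genre can be illustrated with a small sketch. The rows below are invented purely to show the mechanics, and the column names are hypothetical; they do not reproduce the dataset's figures.

```python
import pandas as pd

# Invented sample: each film carries a list of genres.
movies = pd.DataFrame({
    "title": ["A", "B", "C"],
    "rating": [8.0, 6.0, 7.0],
    "genres": [["Drama", "Western"], ["Drama"], ["Comedy", "Drama"]],
})

# One row per (film, genre), then count and average the ratings per genre.
by_genre = (
    movies.explode("genres")
          .groupby("genres")["rating"]
          .agg(["count", "mean"])
          .sort_values("mean", ascending=False)
)
print(by_genre)

most_common = by_genre["count"].idxmax()  # genre with the most films
top_rated = by_genre["mean"].idxmax()     # genre with the highest average rating
```

A genre with only a few films can reach a high average easily, so it can help to read the `count` column alongside the `mean` when interpreting a ranking like this.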

Which Director Has Directed the Most Movies?

Woody Allen is the director with the most films in the dataset. According to the data analysis, he has directed a total of 45 movies.

Which Director Has the Highest Ratings on the Films They Have Directed?

But quantity doesn't guarantee quality: according to the data analysis, Frank Darabont has directed the largest number of highly rated films.
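One way to separate quantity from quality is to count each director's films twice: once overall, and once restricted to highly rated films. The sketch below assumes a cutoff of 8.0 for "highly rated"; that threshold, the director names, and the ratings are all hypothetical and are not taken from the notebook.

```python
import pandas as pd

HIGH_RATING = 8.0  # assumed cutoff for a "highly rated" film

# Invented sample data with placeholder director names.
movies = pd.DataFrame({
    "director": ["Dir X", "Dir X", "Dir X", "Dir Y", "Dir Y"],
    "rating": [6.0, 6.5, 8.2, 8.5, 9.0],
})

# Total films per director.
total = movies["director"].value_counts()

# Films per director, counting only those at or above the cutoff.
high = movies.loc[movies["rating"] >= HIGH_RATING, "director"].value_counts()

print(total.idxmax())  # director with the most films overall
print(high.idxmax())   # director with the most highly rated films
```

The same pattern applies to writers and cast members once those list columns have been exploded into one row per person.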

Frank Darabont is an American film director, screenwriter, and producer known for his work in the horror, science fiction, and drama genres. He is best known for directing the critically acclaimed films "The Shawshank Redemption" (1994), "The Green Mile" (1999), and "The Mist" (2007), all of which are adaptations of Stephen King stories. Darabont's style of storytelling is often characterized by his ability to blend compelling human drama with suspenseful and eerie elements. His films have garnered multiple Academy Award nominations and critical acclaim, cementing his place as one of the most prominent directors in Hollywood.

Which Writer Has the Most Writing Credits?

Woody Allen is again at the top of the list, with 45 writing credits according to the analysis of the 25k IMDb Movies Dataset.

Which Writer Has the Highest Ratings on the Films They Have Written?

As before, quantity doesn't guarantee quality: according to the data analysis, Stephen King has written the largest number of highly rated films.

Stephen King is an American author of horror, supernatural fiction, suspense, and fantasy novels. King has published over 60 novels, six non-fiction books, and countless short stories. His works have sold more than 350 million copies worldwide and have been adapted into numerous films, TV shows, and comic books. Some of his most famous works include "Carrie," "The Shining," "The Stand," "IT," "Misery," and "The Dark Tower" series. King is widely regarded as one of the greatest horror writers of all time and has received numerous awards for his contributions to the literary world.

Which Actors/Actresses Appeared in the Most Films? (Top 10)

At the top of the list, once again, is Woody Allen. Woody Allen is an American film director, writer, and actor who has been making films since the 1960s. He is known for his prolific output, having directed over 50 films throughout his career.

Actors/Actresses Who Starred in Highly Rated Films

According to the analysis, the actor who appeared in the largest number of highly rated films is Bob Gunton.

Bob Gunton is an American actor, born on November 15, 1945, in Santa Monica, California. He has appeared in numerous films, TV shows, and stage productions throughout his career, but is perhaps best known for his portrayal of Warden Norton in the 1994 classic film "The Shawshank Redemption."

Gunton's other film roles include Cyrus Vance in "Argo" and Judge Gardner in "Fracture." He has also made appearances on popular TV shows such as "24," "Desperate Housewives," "The Blacklist," and "Daredevil."

Midterm Output Reflection

During the midterm period, I gained a valuable understanding of the pandas and hvplot libraries and how to use them for data wrangling, analysis, and visualization. Applying these skills, I was able to conduct data analysis on the 25k IMDb Movies Dataset and answer various questions about the film industry, a topic that really interests me as someone who likes to watch films and movies. These new skills and knowledge contribute significantly to my career prospects as an IT professional. Data analysis is an in-demand, well-paid job, and I can apply these skills in fields such as research and business, which makes me more versatile as an IT professional.

One of the main challenges I encountered while creating my midterm course outcome output was finding a dataset that both interested me and had clean, accurate data, because most of the datasets I saw on Kaggle had many impurities, such as typographical errors and mixed or inconsistent data types in columns. I overcame this challenge by exploring different datasets on Kaggle and eventually found one that interested me. However, the data still had many impurities, so I had to apply my knowledge of data wrangling to clean it before conducting the analysis.

Overall, the midterm period has been an enriching experience for me. Not only did I learn new skills, but I also gained practical experience in using them to conduct data analysis. I am excited to continue building on these skills and applying them in various professional settings in the future. I may also be able to use them next year, when I plan to apply for a part-time or sideline job.