Unit 9 Projects: Working with Files and Real-World Data

Project 1: Student Test Scores
Description:
Write a program that reads a file of students, where each line holds a name, two test scores, and a boolean indicating whether the student passed. For each student, display the name, scores, and pass/fail status. Also compute the average of all scores across all students.

Starter File (students.txt):
Alice 90 95 true
Bob 80 85 false
Charlie 100 100 true

Example Output:
Alice: 90, 95, Passed? true
Bob: 80, 85, Passed? false
Charlie: 100, 100, Passed? true
Average Score: 91.67
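
One possible approach to this project is sketched below in Python. The language is an assumption, since the project does not prescribe one, and the helper names parse_student and report are hypothetical. The sketch assumes names contain no spaces and that the last field on each line is the boolean.

```python
import io

def parse_student(line):
    """Split one line into (name, scores, passed). Assumes the name has no spaces."""
    parts = line.split()
    name = parts[0]
    scores = [int(p) for p in parts[1:-1]]   # everything between name and boolean
    passed = parts[-1].lower() == "true"
    return name, scores, passed

def report(lines):
    """Return the per-student lines plus the overall average as a list of strings."""
    out = []
    all_scores = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        name, scores, passed = parse_student(line)
        all_scores.extend(scores)
        score_text = ", ".join(str(s) for s in scores)
        out.append(f"{name}: {score_text}, Passed? {str(passed).lower()}")
    if all_scores:  # avoid dividing by zero on an empty file
        out.append(f"Average Score: {sum(all_scores) / len(all_scores):.2f}")
    return out

# Sample data copied from the starter file; a real file object works the same way.
sample = io.StringIO("Alice 90 95 true\nBob 80 85 false\nCharlie 100 100 true\n")
for row in report(sample):
    print(row)
```

Because report accepts any iterable of lines, passing open("students.txt") in place of the io.StringIO sample should read the real file.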

Optional Dataset Questions:

What if a student's scores are missing? How will your program handle it?
Could this dataset have bias if it only contains high-performing students?

Project 2: Movie Ratings
Description:
Read a CSV file of movies (title, rating, year). Display only movies with a rating higher than 8.0.

Starter File (movies.csv):
The Godfather,9.2,1972
Inception,8.8,2010
Titanic,7.8,1997

Example Output:
The Godfather (1972) - Rating: 9.2
Inception (2010) - Rating: 8.8
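
A minimal Python sketch of the filter, assuming the file has no header row and exactly three fields per line. Python and the function name high_rated are assumptions, not part of the project. Rows with a missing or non-numeric rating are skipped here, which is only one of several reasonable choices.

```python
import csv
import io

def high_rated(lines, threshold=8.0):
    """Return formatted lines for movies rated strictly above the threshold."""
    rows = []
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        title, rating, year = (field.strip() for field in row)
        try:
            rating_value = float(rating)
        except ValueError:
            continue  # skip rows whose rating is missing or not a number
        if rating_value > threshold:
            rows.append(f"{title} ({year}) - Rating: {rating_value}")
    return rows

# Sample data copied from the starter file.
sample = io.StringIO("The Godfather,9.2,1972\nInception,8.8,2010\nTitanic,7.8,1997\n")
for row in high_rated(sample):
    print(row)
```

Using csv.reader rather than splitting on commas by hand keeps the sketch working if a title is quoted because it contains a comma.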

Optional Dataset Questions:

How might missing ratings affect your results?
Could there be bias in which movies were included?

Project 3: Word Count in a Text
Description:
Read a file containing a paragraph. Count the number of words, find the average word length, and identify the longest word.

Starter File (paragraph.txt):
The quick brown fox jumped over the lazy dog.

Example Output:
Number of Words: 9
Average Word Length: 4.0
Longest Word: jumped
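
The three statistics can be sketched in Python as follows. Python and the function name word_stats are assumptions. The sketch strips leading and trailing punctuation from each word before measuring it, so "dog." counts as three letters; counting punctuation as part of the word would give a slightly higher average.

```python
import string

def word_stats(text):
    """Return (word count, average word length, longest word) for the text."""
    # Strip punctuation from the ends of each word, then drop empty leftovers.
    words = [w.strip(string.punctuation) for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0, 0.0, ""
    average = sum(len(w) for w in words) / len(words)
    longest = max(words, key=len)  # the first word wins when lengths tie
    return len(words), average, longest

# Sentence copied from the starter file.
count, average, longest = word_stats("The quick brown fox jumped over the lazy dog.")
print(f"Number of Words: {count}")
print(f"Average Word Length: {average:.1f}")
print(f"Longest Word: {longest}")
```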

Optional Dataset Questions:

How would punctuation affect word counting?
Could this dataset be biased if it only contains short sentences?

Project 4: Letters in Words
Description:
Read a file containing a sentence. Ask the user for a letter and display all words containing that letter.

Starter File (sentence.txt):
Where are they now?

Example Input:
Letter: e

Example Output:
Where
are
they
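
A short Python sketch of the search, with the letter passed in as an argument so the logic can be tried without typing input. Python and the function name words_with_letter are assumptions. The comparison ignores case, and punctuation is stripped from the ends of each word before it is displayed.

```python
import string

def words_with_letter(text, letter):
    """Return the words in text that contain the given letter, ignoring case."""
    if len(letter) != 1:
        return []  # expect exactly one character
    letter = letter.lower()
    matches = []
    for word in text.split():
        cleaned = word.strip(string.punctuation)
        if cleaned and letter in cleaned.lower():
            matches.append(cleaned)
    return matches

# Sentence copied from the starter file. In the full program the letter
# would come from the user, for example via input("Letter: ").
for word in words_with_letter("Where are they now?", "e"):
    print(word)
```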

Optional Dataset Questions:

What if the sentence has mixed case letters? How will you handle that?
Could the dataset contain typos that affect results?

Project 5: Real-World Data Exploration
Description:
Pick a dataset from a real-world source (CSV or TXT). Write a program to analyze it for a specific question. For example, you could analyze:

Average temperatures from a weather dataset
Top-rated books or movies
Student grades from a school dataset

Starter File (data.csv):
Students choose a small real dataset.

Example Output:
Varies based on dataset and question.

Optional Dataset Questions:

Is the dataset complete and accurate?
Is there any bias in the dataset?
Could the dataset answer your question appropriately?

Project 6: Advanced Scores Analysis
Description:
Read students.txt, where each line contains a name, three test scores, and a boolean for pass/fail. Compute:

Each student's average
Class average
Students whose average is below the class average

Starter File (students.txt):
Alice 90 95 88 true
Bob 80 85 78 false
Charlie 100 100 100 true

Example Output:
Alice Average: 91
Bob Average: 81
Charlie Average: 100
Class Average: 90.67
Students Below Average: Bob
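
One way to organize the three computations in Python is sketched below. Python and the function name analyze are assumptions. The class average is taken here as the mean of the student averages, which equals the mean of all scores only when every student has the same number of scores.

```python
import io

def analyze(lines):
    """Return (per-student averages, class average, names below the class average)."""
    averages = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # need a name, at least one score, and the boolean
        scores = [int(p) for p in parts[1:-1]]
        averages[parts[0]] = sum(scores) / len(scores)
    if not averages:
        return {}, 0.0, []
    class_average = sum(averages.values()) / len(averages)
    below = [name for name, avg in averages.items() if avg < class_average]
    return averages, class_average, below

# Sample data copied from the starter file.
sample = io.StringIO("Alice 90 95 88 true\nBob 80 85 78 false\nCharlie 100 100 100 true\n")
averages, class_average, below = analyze(sample)
for name, avg in averages.items():
    print(f"{name} Average: {avg:.0f}")
print(f"Class Average: {class_average:.2f}")
print("Students Below Average: " + ", ".join(below))
```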

Optional Dataset Questions:

How does missing or incorrect data affect averages?
Could this dataset contain bias if some groups are underrepresented?

Project 7: Sports Stats Analyzer
Description:
Read a file of players with points scored in multiple games. Compute each player's total points and average per game, and display the players whose per-game average is above the average across all players.

Starter File (players.txt):
LeBron 25 30 28
Durant 20 22 27
Curry 30 32 29

Example Output:
LeBron Total: 83, Average: 27.7
Durant Total: 69, Average: 23.0
Curry Total: 91, Average: 30.3
Above Average Players: LeBron, Curry
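
A possible Python sketch follows. Python and the function names player_stats and above_average are assumptions. "Above average" is interpreted here as a per-game average higher than the mean of all players' per-game averages; other interpretations are possible.

```python
import io

def player_stats(lines):
    """Map each player name to a (total points, average per game) pair."""
    stats = {}
    for line in lines:
        parts = line.split()
        if len(parts) < 2:
            continue  # need a name and at least one game
        points = [int(p) for p in parts[1:]]
        stats[parts[0]] = (sum(points), sum(points) / len(points))
    return stats

def above_average(stats):
    """Return players whose per-game average exceeds the mean of all averages."""
    if not stats:
        return []
    overall = sum(avg for _, avg in stats.values()) / len(stats)
    return [name for name, (_, avg) in stats.items() if avg > overall]

# Sample data copied from the starter file.
sample = io.StringIO("LeBron 25 30 28\nDurant 20 22 27\nCurry 30 32 29\n")
stats = player_stats(sample)
for name, (total, avg) in stats.items():
    print(f"{name} Total: {total}, Average: {avg:.1f}")
print("Above Average Players: " + ", ".join(above_average(stats)))
```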

Optional Dataset Questions:

How would you handle missing scores?
Could the dataset be biased toward certain types of games or players?

Project 8: Temperature Trends
Description:
Read a file of daily temperatures (date, high, low). Find the highest and lowest temperatures, and compute the average high and the average low for the month.

Starter File (temps.txt):
2025-09-01 85 70
2025-09-02 88 72
2025-09-03 90 75

Example Output:
Highest Temperature: 90
Lowest Temperature: 70
Average High: 87.7
Average Low: 72.3
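
The four values can be computed in a single pass, as in this Python sketch. Python, the function name summarize, and the dictionary keys are assumptions. Temperatures are assumed to be whole numbers, and lines that do not have exactly three fields are skipped.

```python
import io

def summarize(lines):
    """Return the extremes and averages as a dict, or None if no data was read."""
    highs, lows = [], []
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or incomplete lines
        _date, high, low = parts
        highs.append(int(high))
        lows.append(int(low))
    if not highs:
        return None
    return {
        "highest": max(highs),
        "lowest": min(lows),
        "average_high": sum(highs) / len(highs),
        "average_low": sum(lows) / len(lows),
    }

# Sample data copied from the starter file.
sample = io.StringIO("2025-09-01 85 70\n2025-09-02 88 72\n2025-09-03 90 75\n")
summary = summarize(sample)
print(f"Highest Temperature: {summary['highest']}")
print(f"Lowest Temperature: {summary['lowest']}")
print(f"Average High: {summary['average_high']:.1f}")
print(f"Average Low: {summary['average_low']:.1f}")
```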

Optional Dataset Questions:

How might missing days affect averages?
Could the dataset contain bias if it only includes one city or region?

Project 9: Book Ratings Filter
Description:
Read a file of books (title, author, rating, genre). Ask the user for a minimum rating and display all books rated at or above it.

Starter File (books.txt):
The Hobbit, Tolkien, 9.0, Fantasy
1984, Orwell, 8.5, Dystopia
Twilight, Meyer, 5.5, Romance

Example Input:
Minimum Rating: 8.0

Example Output:
The Hobbit by Tolkien - Rating: 9.0
1984 by Orwell - Rating: 8.5
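
A Python sketch of the filter is shown below, with the minimum rating passed as an argument. Python and the function name books_at_least are assumptions. Because the starter file puts a space after each comma, the sketch uses the csv module's skipinitialspace option; rows with a missing or non-numeric rating are skipped.

```python
import csv
import io

def books_at_least(lines, minimum):
    """Return formatted lines for books rated at or above the minimum."""
    matches = []
    for row in csv.reader(lines, skipinitialspace=True):
        if len(row) != 4:
            continue  # skip blank or malformed rows
        title, author, rating, _genre = (field.strip() for field in row)
        try:
            value = float(rating)
        except ValueError:
            continue  # skip rows whose rating is missing or not a number
        if value >= minimum:
            matches.append(f"{title} by {author} - Rating: {value:.1f}")
    return matches

# Sample data copied from the starter file. In the full program the minimum
# would come from the user, for example float(input("Minimum Rating: ")).
sample = io.StringIO(
    "The Hobbit, Tolkien, 9.0, Fantasy\n1984, Orwell, 8.5, Dystopia\nTwilight, Meyer, 5.5, Romance\n"
)
for row in books_at_least(sample, 8.0):
    print(row)
```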

Optional Dataset Questions:

How would missing ratings or genres affect filtering?
Could the dataset be biased if it only contains popular books?

Project 10: Sales Data Analysis
Description:
Read a file of sales (employee, amount sold). Compute total sales per employee, overall total, and highest-selling employee.

Starter File (sales.txt):
Alice 200
Bob 150
Charlie 300

Example Output:
Alice Total: 200
Bob Total: 150
Charlie Total: 300
Overall Total: 650
Top Seller: Charlie
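
A dictionary keyed by employee name is one natural way to accumulate totals, as in this Python sketch. Python and the function name sales_totals are assumptions. The sketch adds amounts together when an employee appears on more than one line, which the starter file does not show but the phrase "total sales per employee" suggests.

```python
import io

def sales_totals(lines):
    """Map each employee name to the sum of their sale amounts."""
    totals = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 2:
            continue  # skip blank or incomplete lines
        name, amount = parts
        totals[name] = totals.get(name, 0) + int(amount)
    return totals

# Sample data copied from the starter file.
sample = io.StringIO("Alice 200\nBob 150\nCharlie 300\n")
totals = sales_totals(sample)
for name, total in totals.items():
    print(f"{name} Total: {total}")
print(f"Overall Total: {sum(totals.values())}")
if totals:
    # max over the keys, ranked by each employee's total
    print(f"Top Seller: {max(totals, key=totals.get)}")
```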

Optional Dataset Questions:

How would missing sales entries affect totals?
Could the dataset be biased toward certain employees or products?

Project 11: Movie Genre Counts
Description:
Read a file of movies (title, genre). Count the number of movies in each genre and display the results.

Starter File (movies_genre.txt):
Titanic, Romance
Inception, Sci-Fi
The Godfather, Crime
La La Land, Romance

Example Output:
Romance: 2
Sci-Fi: 1
Crime: 1
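
Counting by category is a good fit for collections.Counter in Python, as this sketch shows. Python and the function name genre_counts are assumptions. Each line is split at its last comma so that a title containing a comma would still parse; lines without a comma are skipped.

```python
import io
from collections import Counter

def genre_counts(lines):
    """Count how many movies fall under each genre."""
    counts = Counter()
    for line in lines:
        if "," not in line:
            continue  # skip blank lines and lines with no genre field
        _title, genre = line.rsplit(",", 1)  # split at the last comma only
        genre = genre.strip()
        if genre:
            counts[genre] += 1
    return counts

# Sample data copied from the starter file.
sample = io.StringIO(
    "Titanic, Romance\nInception, Sci-Fi\nThe Godfather, Crime\nLa La Land, Romance\n"
)
# most_common() lists genres from most to least frequent.
for genre, count in genre_counts(sample).most_common():
    print(f"{genre}: {count}")
```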

Optional Dataset Questions:

Could missing or miscategorized genres affect counts?
Is the dataset appropriate for analyzing trends across genres?

Project 12: Word Frequency Counter
Description:
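Read a paragraph from a file. Count how many times each word appears and display the three most frequent words with their counts. When several words are tied, as in the example below, list them in order of first appearance.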

Starter File (text.txt):
The quick brown fox jumps over the lazy dog. The dog was not amused.

Example Output:
the: 3
dog: 2
quick: 1
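
A compact Python sketch using collections.Counter is given below. Python and the function name top_words are assumptions. Words are lowercased and stripped of leading and trailing punctuation before counting, so "The" and "the" are treated as the same word. Counter.most_common lists words with equal counts in the order they were first encountered.

```python
import string
from collections import Counter

def top_words(text, n=3):
    """Return the n most frequent words as (word, count) pairs."""
    # Normalize each word: trim punctuation from the ends and lowercase it.
    words = (w.strip(string.punctuation).lower() for w in text.split())
    counts = Counter(w for w in words if w)
    return counts.most_common(n)

# Paragraph copied from the starter file.
text = "The quick brown fox jumps over the lazy dog. The dog was not amused."
for word, count in top_words(text):
    print(f"{word}: {count}")
```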

Optional Dataset Questions:

How would punctuation or capitalization affect counting?
Could this dataset contain bias if it is too short or unrepresentative?
