Unit 9 Notes: Working with Files, Data Sets, and Real-World Data
1. Introduction
In this unit, you will learn how to read, process, and analyze data from files using Java.
Real-world data is everywhere: from apps and games to business, medicine, and civil planning.
Understanding how to process data allows you to answer questions, calculate statistics, and make decisions using programs.
Key Concepts:
- Files store data persistently, even when the program is not running.
- Data sets are collections of information used to solve problems.
- Ethical issues include privacy, bias, and data quality.
2. File Basics
Step 1: Import the required classes
import java.io.File;
import java.io.IOException;
import java.util.Scanner;
Step 2: Open a file
File file = new File("students.txt"); // relative path: looked up in the program's working directory (usually the project folder)
Scanner input = new Scanner(file);
Step 3: Handle exceptions
public static void main(String[] args) throws IOException {
    // file reading code here
}
Step 4: Close the file
input.close(); // always close the scanner
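The four steps above can be combined into one small program. The following is a minimal sketch, not the only way to do it. The class name FileBasicsSketch and the helper countTokens are hypothetical names chosen for illustration. Instead of adding "throws IOException" to main, this version catches FileNotFoundException, which is the specific kind of IOException that new Scanner(file) can throw.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class FileBasicsSketch {
    // Hypothetical helper: counts the whitespace-separated tokens in a file.
    // Returns -1 if the file cannot be opened.
    public static int countTokens(String fileName) {
        File file = new File(fileName);            // Step 2: describe the file
        try {
            Scanner input = new Scanner(file);     // may throw FileNotFoundException
            int count = 0;
            while (input.hasNext()) {
                input.next();                      // read and discard one token
                count++;
            }
            input.close();                         // Step 4: release the file
            return count;
        } catch (FileNotFoundException e) {        // Step 3: handle the exception
            return -1;
        }
    }

    public static void main(String[] args) {
        int tokens = countTokens("students.txt");
        if (tokens == -1) {
            System.out.println("Could not open students.txt");
        } else {
            System.out.println("Tokens read: " + tokens);
        }
    }
}
```

Returning -1 is only one possible way to signal a missing file; printing a message or rethrowing the exception would work as well.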
3. Scanner Methods in APCSA
Method                | What it does                               | Notes / Exceptions
--------------------- | ------------------------------------------ | ---------------------------------------------
Scanner(File f)       | Creates a Scanner that reads from a file   | Throws FileNotFoundException (a kind of IOException) if the file cannot be opened
int nextInt()         | Returns the next token as an int           | Throws InputMismatchException if the token is not an int
double nextDouble()   | Returns the next token as a double         | Throws InputMismatchException if the token is not a double
boolean nextBoolean() | Returns the next token as a boolean        | Throws InputMismatchException if the token is not a boolean
String next()         | Returns the next token (word)              | Skips leading whitespace
String nextLine()     | Returns the rest of the current line       | Returns an empty string if called right after nextInt() or next() at the end of a line
boolean hasNext()     | Returns true if another token is available | Often used as a while-loop condition
void close()          | Closes the Scanner and the underlying file | Call when finished reading to release resources
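To see several of these methods working together, here is a small sketch. It reads from a String instead of a File so that it runs without any data file; the same method calls behave the same way on a file-backed Scanner. The class name ScannerMethodsSketch, the method describe, and the sample text are all made up for illustration.

```java
import java.util.Scanner;

public class ScannerMethodsSketch {
    // Reads one record "name int int boolean", then one free-text line.
    public static String describe(String text) {
        Scanner input = new Scanner(text);
        String name = input.next();            // first token
        int score1 = input.nextInt();          // next token as an int
        int score2 = input.nextInt();
        boolean passed = input.nextBoolean();  // next token as a boolean
        input.nextLine();                      // consume the (empty) rest of line 1
        String note = "";
        if (input.hasNextLine()) {
            note = input.nextLine();           // whole second line, spaces included
        }
        input.close();
        return name + "|" + score1 + "|" + score2 + "|" + passed + "|" + note;
    }

    public static void main(String[] args) {
        // Hypothetical sample text, not one of the unit's data files
        System.out.println(describe("Alice 90 95 true\nGreat work this term"));
    }
}
```

Note the extra nextLine() call after nextBoolean(): without it, the note would come back as an empty string, because the token methods stop before the end-of-line.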
4. Example Files and Programs
4.1 students.txt
Alice 90 95 true
Bob 80 85 false
Charlie 100 100 true
Program: Using next(), nextInt(), nextBoolean()
import java.io.File;
import java.io.IOException;
import java.util.Scanner;
public class StudentReader {
    public static void main(String[] args) throws IOException {
        File file = new File("students.txt");
        Scanner input = new Scanner(file);
        while (input.hasNext()) {
            String name = input.next();
            int score1 = input.nextInt();
            int score2 = input.nextInt();
            boolean passed = input.nextBoolean();
            System.out.println(name + ": " + score1 + ", " + score2 + ", Passed? " + passed);
        }
        input.close();
    }
}
Output:
Alice: 90, 95, Passed? true
Bob: 80, 85, Passed? false
Charlie: 100, 100, Passed? true
4.2 lines.txt
Alice 90 95 true
Bob 80 85 false
Charlie 100 100 true
Program: Using nextLine()
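// Goes inside a main method declared with "throws IOException"
Scanner input = new Scanner(new File("lines.txt"));
while (input.hasNextLine()) {
    String line = input.nextLine();   // the whole line, spaces included
    System.out.println("Line: " + line);
}
input.close();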
Output:
Line: Alice 90 95 true
Line: Bob 80 85 false
Line: Charlie 100 100 true
4.3 split_example.txt
Alice,90,95,true
Bob,80,85,false
Charlie,100,100,true
Program: Using split()
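// Goes inside a main method declared with "throws IOException"
Scanner input = new Scanner(new File("split_example.txt"));
while (input.hasNextLine()) {
    String line = input.nextLine();
    String[] parts = line.split(",");   // one array element per comma-separated field
    String name = parts[0];
    int score1 = Integer.parseInt(parts[1]);
    int score2 = Integer.parseInt(parts[2]);
    boolean passed = Boolean.parseBoolean(parts[3]);
    System.out.println(name + ": " + score1 + ", " + score2 + ", Passed? " + passed);
}
input.close();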
Output:
Alice: 90, 95, Passed? true
Bob: 80, 85, Passed? false
Charlie: 100, 100, Passed? true
4.4 movies.csv
The Godfather,9.2,1972
Inception,8.8,2010
Titanic,7.8,1997
Program: Real-World Data Example
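// Goes inside a main method declared with "throws IOException"
Scanner input = new Scanner(new File("movies.csv"));
while (input.hasNextLine()) {
    String[] fields = input.nextLine().split(",");
    String title = fields[0];
    double rating = Double.parseDouble(fields[1]);
    int year = Integer.parseInt(fields[2]);
    if (rating > 8.0) {   // print only movies rated above 8.0
        System.out.println(title + " (" + year + ") - Rating: " + rating);
    }
}
input.close();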
Output:
The Godfather (1972) - Rating: 9.2
Inception (2010) - Rating: 8.8
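The same loop can also calculate statistics instead of only filtering. The sketch below computes the average rating and finds the highest-rated title. To stay runnable without a data file, it reads the movies.csv contents from a String constant; with a real file you would pass new File("movies.csv") to the Scanner instead. The class name MovieStatsSketch and its method names are hypothetical.

```java
import java.util.Scanner;

public class MovieStatsSketch {
    // Same three lines as movies.csv, embedded so the example is self-contained
    public static final String SAMPLE =
        "The Godfather,9.2,1972\nInception,8.8,2010\nTitanic,7.8,1997";

    // Average of the rating column; returns 0.0 if there are no lines.
    public static double averageRating(String csvText) {
        Scanner input = new Scanner(csvText);
        double total = 0.0;
        int count = 0;
        while (input.hasNextLine()) {
            String[] fields = input.nextLine().split(",");
            total += Double.parseDouble(fields[1]);
            count++;
        }
        input.close();
        if (count == 0) {
            return 0.0;
        }
        return total / count;
    }

    // Title with the highest rating; returns "" if there are no lines.
    public static String topRated(String csvText) {
        Scanner input = new Scanner(csvText);
        String bestTitle = "";
        double bestRating = -1.0;   // lower than any rating in the data
        while (input.hasNextLine()) {
            String[] fields = input.nextLine().split(",");
            double rating = Double.parseDouble(fields[1]);
            if (rating > bestRating) {
                bestRating = rating;
                bestTitle = fields[0];
            }
        }
        input.close();
        return bestTitle;
    }

    public static void main(String[] args) {
        System.out.println("Average rating: " + averageRating(SAMPLE));
        System.out.println("Top rated: " + topRated(SAMPLE));
    }
}
```

This sketch assumes that no title contains a comma. A title such as "Monsters, Inc." would shift the fields, and split(",") alone would not be enough.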
5. Data Table Examples
Students:
Name    | Score1 | Score2 | Passed
------- | ------ | ------ | ------
Alice   | 90     | 95     | true
Bob     | 80     | 85     | false
Charlie | 100    | 100    | true
Movies:
Movie         | Rating | Year
------------- | ------ | ----
The Godfather | 9.2    | 1972
Inception     | 8.8    | 2010
Titanic       | 7.8    | 1997
6. Common Pitfalls
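1. FileNotFoundException – the file does not exist or the path is wrong.
2. InputMismatchException – the next token does not match the requested type (for example, calling nextInt() when the next token is a word).
3. Empty lines – nextInt() and nextBoolean() skip them as whitespace, but nextLine() returns them as empty strings, which breaks split()-based parsing. A loop guarded by hasNextLine() that then calls nextInt() throws NoSuchElementException on a trailing blank line.
4. Mixing nextLine() with next() or nextInt() – the token methods stop before the end-of-line, so the next nextLine() returns an empty string instead of the following line.
5. Incorrect parsing with split() – split() returns Strings, so convert each field with Integer.parseInt, Double.parseDouble, or Boolean.parseBoolean as needed.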
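Pitfalls 2 and 4 are easier to understand once you see them happen. The sketch below reproduces pitfall 4 on purpose and shows one possible guard against pitfall 2 using hasNextInt(), a standard Scanner method that is not listed in the table in Section 3. It reads from Strings so that it runs without a data file. The class and method names are hypothetical.

```java
import java.util.Scanner;

public class PitfallSketch {
    // Pitfall 4: nextInt() stops before the end-of-line,
    // so the nextLine() right after it returns what is left of that line.
    public static String lineAfterInt(String text) {
        Scanner input = new Scanner(text);
        input.nextInt();
        String rest = input.nextLine();
        input.close();
        return rest;
    }

    // Pitfall 2: check the type of the next token before reading it.
    // Adds up every int token and skips everything else.
    public static int sumInts(String text) {
        Scanner input = new Scanner(text);
        int sum = 0;
        while (input.hasNext()) {
            if (input.hasNextInt()) {
                sum += input.nextInt();
            } else {
                input.next();   // not an int: read it as a word and ignore it
            }
        }
        input.close();
        return sum;
    }

    public static void main(String[] args) {
        // Hypothetical sample inputs
        System.out.println("[" + lineAfterInt("42\nhello") + "]");
        System.out.println(sumInts("10 apples 20 true 5"));
    }
}
```

A common fix for pitfall 4 is an extra nextLine() call right after the token method, which discards the leftover end-of-line before the next real line is read.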
7. Planning Data Algorithms
- Use tables or charts to plan before coding.
- Decide:
  - What information will you read?
  - What statistics or results do you need?
  - How will you handle errors or missing data?
Example table for student test data:
Name    | Score1 | Score2 | Passed
------- | ------ | ------ | ------
Alice   | 90     | 95     | true
Bob     | 80     | 85     | false
Charlie | 100    | 100    | true
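As one possible answer to the planning question about errors and missing data, the sketch below averages Score1 and skips any line it cannot use. It assumes the comma-separated layout from split_example.txt. The damaged lines in main (a blank line, a missing score for Bob, and a short line for "Dana") are invented for illustration and are not part of the unit's data files. The class name CleanAverageSketch is hypothetical.

```java
import java.util.Scanner;

public class CleanAverageSketch {
    // Averages the Score1 column, ignoring blank, incomplete, or non-numeric lines.
    // Returns 0.0 if no line is usable.
    public static double averageScore1(String csvText) {
        Scanner input = new Scanner(csvText);
        int total = 0;
        int valid = 0;
        while (input.hasNextLine()) {
            String[] parts = input.nextLine().trim().split(",");
            if (parts.length != 4) {
                continue;                       // blank or incomplete line
            }
            try {
                total += Integer.parseInt(parts[1].trim());
                valid++;
            } catch (NumberFormatException e) {
                // Score1 is missing or not a number: skip this line
            }
        }
        input.close();
        if (valid == 0) {
            return 0.0;
        }
        return (double) total / valid;
    }

    public static void main(String[] args) {
        // Invented test data with three unusable lines
        String data = "Alice,90,95,true\n\nBob,,85,false\nCharlie,100,100,true\nDana,88";
        System.out.println(averageScore1(data));
    }
}
```

Skipping bad lines silently is a design choice. Depending on the question being asked, it may be better to count and report the skipped lines so that readers know how much data was left out.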
8. Ethical & Social Considerations
- Privacy: Don’t display personal info without permission.
- Bias: Ensure data represents all groups fairly.
- Data Quality: Watch for missing or inaccurate entries.
- Appropriate Data Sets: Data for one question may not work for another.