Unit 9 Notes: Working with Files, Data Sets, and Real-World Data


1. Introduction

In this unit, you will learn how to read, process, and analyze data from files using Java.

Real-world data is everywhere: from apps and games to business, medicine, and civil planning. 

Understanding how to process data allows you to answer questions, calculate statistics, and make decisions using programs.


Key Concepts:

- Files store data persistently, even when the program is not running.

- Data sets are collections of information used to solve problems.

- Ethical issues include privacy, bias, and data quality.


2. File Basics

Step 1: Import the required classes

import java.io.File;

import java.io.IOException;

import java.util.Scanner;


Step 2: Open a file

File file = new File("students.txt"); // relative path: the file must be in the folder the program runs from (usually the project folder)

Scanner input = new Scanner(file);


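Step 3: Handle exceptions

new Scanner(file) throws FileNotFoundException (a kind of IOException) if the file cannot be opened, so the method that opens the file must either declare the exception, as shown here, or catch it.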

public static void main(String[] args) throws IOException {

    // file reading code here

}
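
Declaring throws IOException passes the problem on to whoever called the method. As a minimal sketch of the other option, the code below catches the exception and prints a message instead of crashing. The class name SafeOpen and the helper countTokens are hypothetical names used only for illustration.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class SafeOpen {

    // Hypothetical helper: returns the number of tokens in the file,
    // or -1 if the file cannot be opened.
    public static int countTokens(String fileName) {
        try {
            Scanner input = new Scanner(new File(fileName));
            int count = 0;
            while (input.hasNext()) {
                input.next();   // read and discard one token
                count++;
            }
            input.close();
            return count;
        } catch (FileNotFoundException e) {
            // Runs only when the file could not be opened
            System.out.println("Could not open " + fileName);
            return -1;
        }
    }

    public static void main(String[] args) {
        int tokens = countTokens("students.txt");
        if (tokens >= 0) {
            System.out.println("Tokens read: " + tokens);
        }
    }
}
```

Note that main no longer needs a throws clause, because the exception is handled inside countTokens.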


Step 4: Close the file

input.close(); // always close the scanner
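
If an exception is thrown before input.close() runs, the Scanner is never closed. One way to avoid this, shown here as a sketch that goes beyond what the AP exam requires, is Java's try-with-resources statement, which closes the Scanner automatically. The class name AutoCloseDemo and the helper countLines are hypothetical.

```java
import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class AutoCloseDemo {

    // Hypothetical helper: counts the lines in a file.
    public static int countLines(File file) throws FileNotFoundException {
        int lines = 0;
        // The Scanner declared in the parentheses is closed automatically
        // when the try block ends, even if an exception is thrown inside it.
        try (Scanner input = new Scanner(file)) {
            while (input.hasNextLine()) {
                input.nextLine();
                lines++;
            }
        }
        return lines;
    }

    public static void main(String[] args) throws FileNotFoundException {
        File file = new File("students.txt");
        if (file.exists()) {
            System.out.println("Lines: " + countLines(file));
        } else {
            System.out.println("students.txt not found");
        }
    }
}
```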


3. Scanner Methods in APCSA


Method                | What it does                           | Notes / Exceptions

--------------------- | -------------------------------------- | ------------------------------------------------------------

Scanner(File f)       | Creates a Scanner that reads a file    | Throws FileNotFoundException (an IOException) if the file cannot be opened

int nextInt()         | Returns the next int from the file     | Throws InputMismatchException if the next token is not an int

double nextDouble()   | Returns the next double from the file  | Throws InputMismatchException if the next token is not a double

boolean nextBoolean() | Returns the next boolean from the file | Throws InputMismatchException if the next token is not a boolean

String next()         | Returns the next word (token)          | Skips leading whitespace; stops at the next whitespace

String nextLine()     | Returns the rest of the current line   | Returns an empty string if called right after nextInt() or next() at the end of a line

boolean hasNext()     | Checks whether more input is available | Often used as a while-loop condition

void close()          | Closes the Scanner and the file        | Call when finished reading to release resources
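
To see several of these methods working together, here is a small sketch that reads from a String instead of a file, so it runs without any data file. The record layout (name, count, price, in-stock flag), the class name ScannerMethodsDemo, and the helper describe are all made up for illustration. The call to useLocale(Locale.US) is an added precaution so that 0.5 is read with a decimal point on any computer.

```java
import java.util.Locale;
import java.util.Scanner;

public class ScannerMethodsDemo {

    // Reads one record in the order: name, int, double, boolean
    public static String describe(Scanner input) {
        String name = input.next();
        int count = input.nextInt();
        double price = input.nextDouble();
        boolean inStock = input.nextBoolean();
        return name + " x" + count + " @ " + price + " inStock=" + inStock;
    }

    public static void main(String[] args) {
        // Hypothetical data; a Scanner can read from a String as well as a File
        Scanner input = new Scanner("pencil 12 0.5 true\neraser 3 1.25 false\n");
        input.useLocale(Locale.US);
        while (input.hasNext()) {
            System.out.println(describe(input));
        }
        input.close();
    }
}
```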


4. Example Files and Programs


4.1 students.txt

Alice 90 95 true

Bob 80 85 false

Charlie 100 100 true


Program: Using next(), nextInt(), nextBoolean()

import java.io.File;

import java.io.IOException;

import java.util.Scanner;


public class StudentReader {

    public static void main(String[] args) throws IOException {

        File file = new File("students.txt");

        Scanner input = new Scanner(file);

        

        while (input.hasNext()) {

            String name = input.next();

            int score1 = input.nextInt();

            int score2 = input.nextInt();

            boolean passed = input.nextBoolean();

            

            System.out.println(name + ": " + score1 + ", " + score2 + ", Passed? " + passed);

        }

        

        input.close();

    }

}


Output:

Alice: 90, 95, Passed? true

Bob: 80, 85, Passed? false

Charlie: 100, 100, Passed? true
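
The same loop can calculate a statistic instead of printing each record. The sketch below averages every score in the data. To stay self-contained it reads the students.txt contents from a String; the class name StudentAverages and the helper averageScore are hypothetical.

```java
import java.util.Scanner;

public class StudentAverages {

    // Returns the average of all scores (two per student) in the input,
    // or 0.0 if there are no records.
    public static double averageScore(Scanner input) {
        int total = 0;
        int count = 0;
        while (input.hasNext()) {
            input.next();              // name (not needed for the average)
            total += input.nextInt();  // score1
            total += input.nextInt();  // score2
            input.nextBoolean();       // passed flag (read to move past it)
            count += 2;
        }
        if (count == 0) {
            return 0.0;                // avoid dividing by zero
        }
        return (double) total / count; // cast so the division is not truncated
    }

    public static void main(String[] args) {
        String data = "Alice 90 95 true\nBob 80 85 false\nCharlie 100 100 true\n";
        Scanner input = new Scanner(data);
        System.out.println("Average score: " + averageScore(input));
        input.close();
    }
}
```

For the three students above the total is 550 over 6 scores, so the average is about 91.67.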


4.2 lines.txt

Alice 90 95 true

Bob 80 85 false

Charlie 100 100 true


Program: Using nextLine()

while (input.hasNextLine()) {

    String line = input.nextLine();

    System.out.println("Line: " + line);

}


Output:

Line: Alice 90 95 true

Line: Bob 80 85 false

Line: Charlie 100 100 true


4.3 split_example.txt

Alice,90,95,true

Bob,80,85,false

Charlie,100,100,true


Program: Using split()

while (input.hasNextLine()) {

    String line = input.nextLine();

    String[] parts = line.split(",");

    String name = parts[0];

    int score1 = Integer.parseInt(parts[1]);

    int score2 = Integer.parseInt(parts[2]);

    boolean passed = Boolean.parseBoolean(parts[3]);

    

    System.out.println(name + ": " + score1 + ", " + score2 + ", Passed? " + passed);

}


Output:

Alice: 90, 95, Passed? true

Bob: 80, 85, Passed? false

Charlie: 100, 100, Passed? true
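
Real files often have spaces after the commas. Integer.parseInt() rejects a String such as " 90", so a common precaution is to call trim() on each part first. A minimal sketch, with a hypothetical class SplitWithTrim and helper totalScore:

```java
public class SplitWithTrim {

    // Parses one line such as "Alice, 90, 95, true" and returns score1 + score2.
    public static int totalScore(String line) {
        String[] parts = line.split(",");
        // trim() removes leading and trailing spaces before parsing
        int score1 = Integer.parseInt(parts[1].trim());
        int score2 = Integer.parseInt(parts[2].trim());
        return score1 + score2;
    }

    public static void main(String[] args) {
        System.out.println(totalScore("Alice,90,95,true"));     // prints 185
        System.out.println(totalScore("Bob, 80 , 85, false"));  // prints 165
    }
}
```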


4.4 movies.csv

The Godfather,9.2,1972

Inception,8.8,2010

Titanic,7.8,1997


Program: Real-World Data Example

while (input.hasNextLine()) {

    String[] fields = input.nextLine().split(",");

    String title = fields[0];

    double rating = Double.parseDouble(fields[1]);

    int year = Integer.parseInt(fields[2]);


    if (rating > 8.0) {

        System.out.println(title + " (" + year + ") - Rating: " + rating);

    }

}


Output:

The Godfather (1972) - Rating: 9.2

Inception (2010) - Rating: 8.8
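
Filtering is one kind of question you can ask of a data set; finding a maximum is another. The sketch below returns the title of the highest-rated movie. It reads the movies.csv contents from a String so it runs without the file; the class name MovieStats and the helper topRated are hypothetical.

```java
import java.util.Scanner;

public class MovieStats {

    // Returns the title of the highest-rated movie, or "" if there are no rows.
    public static String topRated(Scanner input) {
        String bestTitle = "";
        double bestRating = -1.0;  // lower than any real rating
        while (input.hasNextLine()) {
            String[] fields = input.nextLine().split(",");
            double rating = Double.parseDouble(fields[1]);
            if (rating > bestRating) {
                bestRating = rating;
                bestTitle = fields[0];
            }
        }
        return bestTitle;
    }

    public static void main(String[] args) {
        String data = "The Godfather,9.2,1972\nInception,8.8,2010\nTitanic,7.8,1997\n";
        Scanner input = new Scanner(data);
        System.out.println("Top rated: " + topRated(input)); // prints Top rated: The Godfather
        input.close();
    }
}
```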


5. Data Table Examples 


Students:

Name     | Score1 | Score2 | Passed

------------------------------------

Alice    | 90     | 95     | true

Bob      | 80     | 85     | false

Charlie  | 100    | 100    | true


Movies:

Movie             | Rating | Year

---------------------------------

The Godfather     | 9.2    | 1972

Inception         | 8.8    | 2010

Titanic           | 7.8    | 1997


6. Common Pitfalls

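1. FileNotFoundException – the file does not exist or the path is wrong.

2. InputMismatchException – the next token does not match the method called (for example, calling nextInt() when the next token is "Alice").

3. Empty lines – next(), nextInt(), and nextBoolean() skip over blank lines, but nextLine() returns an empty string for them; calling split() on that string and then reading parts[1] throws ArrayIndexOutOfBoundsException.

4. Mixing nextLine() with next() or nextInt() – next() and nextInt() leave the end-of-line character unread, so the following nextLine() returns the (often empty) rest of the current line instead of the next line.

5. Incorrect parsing with split() – split() returns Strings; convert them with Integer.parseInt(), Double.parseDouble(), or Boolean.parseBoolean() as needed. parseInt() and parseDouble() throw NumberFormatException on non-numeric text.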
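
The nextLine() pitfall is easier to understand when you see it run. This sketch uses a made-up two-line input (a number, then a title) read from a String. The class name NextLinePitfall and the helper readTitle are hypothetical.

```java
import java.util.Scanner;

public class NextLinePitfall {

    // Reads a number on the first line, then the full text of the second line.
    public static String readTitle(Scanner input) {
        input.nextInt();          // stops right after the digits
        input.nextLine();         // consume the rest of the first line
        return input.nextLine();  // now returns the real second line
    }

    public static void main(String[] args) {
        // Pitfall: nextLine() right after nextInt() returns an empty string
        Scanner broken = new Scanner("3\nThe Godfather\n");
        broken.nextInt();
        System.out.println("[" + broken.nextLine() + "]"); // prints []
        broken.close();

        // Fix: one extra nextLine() call to finish the first line
        Scanner fixed = new Scanner("3\nThe Godfather\n");
        System.out.println("[" + readTitle(fixed) + "]");  // prints [The Godfather]
        fixed.close();
    }
}
```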


7. Planning Data Algorithms

- Use tables or charts to plan before coding.  

- Decide:

  - What information will you read?  

  - What statistics or results do you need?  

  - How will you handle errors or missing data?  


Example table for student test data:

Name     Score1  Score2  Passed

Alice    90      95      true

Bob      80      85      false

Charlie  100     100     true
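
One possible answer to "How will you handle errors or missing data?" is to skip rows that cannot be used and report them. The sketch below assumes comma-separated rows with four fields, as in split_example.txt; the sample rows with problems, the class name SkipBadRows, and the helper countValidRows are invented for illustration. It uses continue and try/catch, which go slightly beyond the AP subset.

```java
import java.util.Scanner;

public class SkipBadRows {

    // Counts rows that have exactly 4 comma-separated fields and numeric scores.
    public static int countValidRows(Scanner input) {
        int valid = 0;
        while (input.hasNextLine()) {
            String line = input.nextLine().trim();
            if (line.isEmpty()) {
                continue;  // skip blank lines
            }
            String[] parts = line.split(",");
            if (parts.length != 4) {
                System.out.println("Skipping (wrong field count): " + line);
                continue;
            }
            try {
                Integer.parseInt(parts[1].trim());
                Integer.parseInt(parts[2].trim());
                valid++;
            } catch (NumberFormatException e) {
                System.out.println("Skipping (bad number): " + line);
            }
        }
        return valid;
    }

    public static void main(String[] args) {
        // Hypothetical data: one blank line, one bad number, one missing field
        String data = "Alice,90,95,true\n\nBob,eighty,85,false\nCharlie,100,100\nDana,70,75,true\n";
        Scanner input = new Scanner(data);
        System.out.println("Valid rows: " + countValidRows(input)); // prints Valid rows: 2
        input.close();
    }
}
```

Whether to skip, stop, or substitute a default value depends on the question being asked; skipping silently can hide data-quality problems, which is why this sketch prints each skipped row.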


8. Ethical & Social Considerations

- Privacy: Don’t display personal info without permission.  

- Bias: Ensure data represents all groups fairly.  

- Data Quality: Watch for missing or inaccurate entries.  

- Appropriate Data Sets: A data set collected to answer one question may not be suitable for answering another.