Reference: Read & Debug Part 1 - PIF
Introduction
I encourage you to get through at least the first two lessons/days first. At that point you will have a working environment and can start to benefit from reading some example code and using the debugger.
The goals here are:
- Get better at reading and understanding MIPS assembly
- Get more experience with using the MAME debugger
- Understand the N64 boot process
- A lot of this!
Please provide feedback. This is a fairly technical topic, so there will be areas that need to be clearer and areas that I misunderstand or just plain get wrong. Let me know.
Background
The actual PIF chip has many different functions:
- Store MIPS BIOS Instructions (1,984 bytes IPL1 & IPL2)
- These instructions are addressed at the VR4300 boot vector 1FC0 0000
- This memory becomes unreadable before the boot is complete.
- Handle the CIC copy protection
- Complex continuously executed algorithm
- With control over the Console Reset functions
- Handles the Joybus protocol for all supported hardware
- Game controllers
- EEPROM
- Manages the user accessible reset button
- Notifies the processor it has been pressed
- Then waits 0.5 seconds
- Then resets all system components
The PIF is actually a processor itself. It's a slightly customized Sharp SM5 4-bit Microcontroller, so lots of work happens in this little device. Our focus today is only on the first piece.
The first thing to understand is that the PIF is mapped into the console memory like the other hardware devices. Its read-only memory address is 0x1FC0 0000, which is the standard MIPS location of the first instruction to execute. But wait, we did a hard reset and the debugger goes to 0xBFC0 0000? Correct. This is part of how MIPS processors segment memory: there is a very subtle reference in the last lines of the memory map document that mentions that the ranges starting at 0x8000 0000 and 0xA000 0000 are mirrors of 0x0000 0000 to 0x1FFF FFFF.
This means that 0x1FC0 0000 is the "same" memory as both 0x9FC0 0000 and 0xBFC0 0000. There is a behavior difference: the 9FC0 range (kseg0) goes through the CPU cache, while the BFC0 range (kseg1) is uncached. We are going to ignore that for now.
- 0x9FC0 0000 points to 0x1FC0 0000
- 0xBFC0 0000 points to 0x1FC0 0000
This will come up again in the first couple of instructions, so we'll cover more then.
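The mirroring can be sketched in a few lines of Python. This is only an illustration of the address arithmetic: the helper name to_physical is made up for this example, and it assumes the standard MIPS rule that kseg0 and kseg1 addresses map to physical memory by dropping the top three bits.

```python
KSEG_MASK = 0x1FFF_FFFF  # keeps the low 29 bits of a kseg0/kseg1 address


def to_physical(virtual_address: int) -> int:
    """Map a kseg0 (0x8000 0000) or kseg1 (0xA000 0000) address to physical."""
    if not 0x8000_0000 <= virtual_address <= 0xBFFF_FFFF:
        raise ValueError("only kseg0/kseg1 addresses are handled in this sketch")
    return virtual_address & KSEG_MASK


# Both mirrors of the boot vector land on the same physical address.
for va in (0x9FC0_0000, 0xBFC0_0000):
    print(f"{va:08X} -> {to_physical(va):08X}")
```

Both lines print 1FC00000 on the right-hand side, which is why the debugger showing 0xBFC0 0000 after a hard reset is not a contradiction.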
Electrically the PIF is connected to the following parts of the N64.
- MIPS CPU - Pin INT2
- Controller Ports 1, 2, 3, 4. One pin each.
- RCP - Clock, Address, Data (Memory Map)
- CIC Input & CIC Output pins
- Cartridge EEPROM - Two pins from the PIF
- One pin Cartridge EEPROM Data
- One pin Cartridge EEPROM & CIC Clock
- NMI on MIPS CPU
- Essentially a warm reset.
- Memory / Registers are not cleared
- Currently executing instruction allowed to complete
- Cold Reset
- Sends Reset Signal to MIPS, RCP, RAMBUS clock generator and Cartridge Slot
- Output Reset - Not clear what it does.
- Input from Console User accessible Reset Switch
Some of this is taken from Marshallh's comments which are quoted from here and a very informative comment on the cen64 github repository.
- Console Receives Power
- PIF holds the MIPS CPU, RCP, RAM (DD64) in Reset via NMI
- PIF checks that the Cartridge is valid (via CIC)
- PIF writes CIC seed value and other bits to word in PIF RAM at offset 0x24
- When the PIF is satisfied it releases the NMI pin.
- The MIPS CPU then starts executing the first instruction taken from PIF ROM
MAME Debugging
Use the official built-in documentation by typing help, which displays the available topics, then type help <topic> to see the actual commands. The same documentation is available on their website. Below are the commands I needed during this exercise.
bpset <address> - Sets a breakpoint
bpclear <bp id> - Removes a breakpoint
bplist - Lists the breakpoints, to find the ID for bpclear
The following function key shortcuts are helpful
And this is where our code execution begins....
First Steps
The PIF ROM is just under 500 instructions, so I'll focus on the high points. Since reproducing the code may be considered legally questionable, I'll refer to everything by its address in the disassembly viewer.
With a successfully compiled ROM in the folder use the 'run debug' command (It's included in the bass compiler download in Lesson 1). If you already have MAME open with the debugger running you can use the menu option Debug | Hard Reset to start at the beginning.
I would encourage setting a few break points for experience and in case an F5-Run gets away from you.
bpset A4001000 // second stage of PIF BIOS code
bpset A4001118 // Copy Cart to RAM
bpset A4001134 // End of Start Block
bpset A400113C // Start of Subroutine 1
bpset A4001184 // Start of Middle Block
bpset A4001420 // After Algorithm
bpset A4001550 // Start of Subroutine 2
bpset A400156C // Start of Final Block
bpset A400163C // Start of Cart Code
bpset A4000040 // Cart Startup Code
After a hard reset, the value 0x0000 3F3F is located at PIF RAM memory location 0xBFC0 07E4. This value is based on the CIC seed for the 6102.
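That address can be reproduced from numbers given earlier in this article. The sketch below assumes that PIF RAM starts immediately after the 1,984 bytes of PIF ROM; that is an assumption on my part, but it does land on the address above. The constant names are made up for the example.

```python
PIF_ROM_BASE = 0xBFC0_0000  # uncached view of the boot vector
PIF_ROM_SIZE = 1984         # bytes of IPL1 & IPL2, from the Background list
CIC_WORD_OFFSET = 0x24      # offset of the CIC seed word within PIF RAM

# Assumption: PIF RAM begins right where PIF ROM ends.
pif_ram_base = PIF_ROM_BASE + PIF_ROM_SIZE
cic_word_address = pif_ram_base + CIC_WORD_OFFSET

print(f"PIF RAM base: {pif_ram_base:08X}")      # BFC007C0
print(f"CIC word:     {cic_word_address:08X}")  # BFC007E4
```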
BFC0 0000 - BFC0 0004
Sets some configuration bits in the MIPS Coprocessor 0
Enable COP0 and COP1 (MIPS Control and Floating point Coprocessors)
Enable all 32 Floating Point Registers
BFC0 0008 - BFC0 0010
Set some more configuration bits in the MIPS Coprocessor 0
Enable kseg0 caching
Big Endian Mode
BFC0 0014 - BFC0 0024
Loop until the SP Status Register Halt bit is set
BFC0 0028 - BFC0 0030
The halt bit is written which clears it and the clear interrupt bit is written which clears the interrupt.
The 'broke' bit remains on.
BFC0 0034 - BFC0 0044
Loop until the SP Registers DMA Status bit is 0
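The polling loops in this part of the code (wait for a bit to be set, wait for a bit to be clear) all share the same shape. Here is a minimal Python sketch of that pattern against a fake register. The helper names, the bit masks, and the sequence of register values are all assumptions made for illustration, not values read from the hardware.

```python
HALT_MASK = 0x1      # assumed position of the halt bit for this sketch
DMA_BUSY_MASK = 0x4  # assumed position of the DMA status bit for this sketch


def make_fake_register(values):
    """Return a read function that yields one canned value per read."""
    reads = iter(values)
    return lambda: next(reads)


def wait_until(read_register, mask, expected, max_polls=1000):
    """Poll until (register & mask) == expected; return the number of reads."""
    for polls in range(1, max_polls + 1):
        if (read_register() & mask) == expected:
            return polls
    raise TimeoutError("register never reached the expected state")


# Wait for the halt bit to be set: the third read satisfies the loop.
halt_polls = wait_until(make_fake_register([0x0, 0x0, 0x1]), HALT_MASK, HALT_MASK)

# Wait for the DMA bit to be clear: the second read satisfies the loop.
dma_polls = wait_until(make_fake_register([0x5, 0x1]), DMA_BUSY_MASK, 0)

print(halt_polls, dma_polls)  # 3 2
```

In the real code there is no poll limit; the max_polls guard is only there so the sketch cannot hang.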
BFC0 0048 - BFC0 0050
Reset PI Controller and Clear Interrupt
BFC0 0054 - BFC0 005C
Set VI Vertical Interrupt to 1024
BFC0 0060 - BFC0 0064
Set VI Horizontal Video Init to 0
BFC0 0068 - BFC0 006C
Set VI Current Vertical Line to 0
BFC0 0070 - BFC0 0074
Set AI DRAM Address to 0
BFC0 0078 - BFC0 007C
Set AI Length Register to 0
BFC0 0080 - BFC0 0090
Loop until SP DMA busy is 0
BFC0 0094 - BFC0 00BC
Copy the rest of the PIF Instructions to IMEM: 402 Instructions, 1,608 bytes (0x71C - 0xD4 = 0x648)
a400 1000 DESTINATION
bfc0 00d4 SOURCE
bfc0 071c SOURCE_END
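The size of this copy can be checked from the three addresses. This is just arithmetic on the values listed; it assumes SOURCE_END is the first address that is not copied.

```python
DESTINATION = 0xA400_1000
SOURCE = 0xBFC0_00D4
SOURCE_END = 0xBFC0_071C

size_bytes = SOURCE_END - SOURCE  # assumes SOURCE_END is exclusive
size_words = size_bytes // 4      # one MIPS instruction per 32-bit word
destination_end = DESTINATION + size_bytes

print(f"bytes:        {size_bytes} (0x{size_bytes:X})")  # 1608 (0x648)
print(f"instructions: {size_words}")                     # 402
print(f"IMEM end:     {destination_end:08X}")            # A4001648
```

The computed end of the copied code, A400 1648, sits just past the last breakpoint address in the A4001xxx range from the list earlier, which is a useful sanity check.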
BFC0 00C0 - BFC0 00D0
Jump to A400 1000
Since the code changes Memory Addresses at this point it's probably a good time to reflect on some of the patterns we are seeing.
- Setup
- Execute
There are commonly 1 or more instructions setting up an instruction that either branches, or reads or writes to memory. In some of the loops it's interesting how the developer/compiler chose to reuse the same register for the setup and the result.
We are only 52 instructions in, but lots of hardware has been initialized. These instructions are executing from very slow memory, but our fast memory has been loaded and it's time to execute from it.
Now Executing from the IMEM
A400 1000 - A400 1014
Loop until PIF DMA is 0
A400 1018 - A400 1028 & A400 1030
Get CIC value from PIF RAM offset 0x24
Two PIF error bits are extracted (18, 19)
A400 102C & A400 1034
If first bit is an error change t3 register address to 0xA600 0000
A400 1038
Extract 2nd Byte from PIF RAM value
A400 103C
Extract 1st Byte from PIF RAM value
A400 1040 & A400 1050
Extract error bit 17
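The bit and byte extraction in these steps can be mimicked in Python using the value seen after a hard reset. The helper name is made up, and which byte the code treats as "1st" and "2nd" is my assumption (counting from the low end).

```python
pif_word = 0x0000_3F3F  # value seen at PIF RAM offset 0x24 after a hard reset


def bit(value: int, n: int) -> int:
    """Return bit n of value (bit 0 is the least significant)."""
    return (value >> n) & 1


first_byte = pif_word & 0xFF           # assumed "1st byte": bits 0-7
second_byte = (pif_word >> 8) & 0xFF   # assumed "2nd byte": bits 8-15
flags = {n: bit(pif_word, n) for n in (17, 18, 19)}

print(f"1st byte: 0x{first_byte:02X}")   # 0x3F
print(f"2nd byte: 0x{second_byte:02X}")  # 0x3F
print(f"bits 17-19: {flags}")            # {17: 0, 18: 0, 19: 0}
```

With the 6102 value, none of the three flag bits are set, so the code takes the normal path.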
A400 1044
Read PIF Status Register (Data 0x0080)
A400 1058 - A400 106C
Read SI (PIF) Status Register until IO Read busy
A400 1070
Write to PIF Control Register 0x09 aka 1001 binary
A400 1074 - A400 1090
Set Default values for the PI Registers. These values are overwritten in just a few more instructions?
PI dom1 latency 0xFF
PI dom1 pulse width 0xFF
PI dom1 page size 0x0F
PI dom1 release 0x03
A400 1094
Read in the ROM Endian Bytes
We are another 25 instructions further in and the code is jumping around a bit. I'm not going to claim that it's intentionally obfuscated, since it's still pretty clear what is going on. My guess is that these are performance optimizations. Just keep a lookout for the hex addresses bouncing around.
A400 1098 & A400 10C0 Continues below A400 10D4
Read DP Status Register
A400 109C
Doesn't seem to do anything?
A400 10A0 - A400 10BC
Initialize some PI Registers based on the ROM Endian bytes
PI dom1 latency = The 4th byte 0x40
PI dom1 pulse width = the 1st, 2nd & 3rd bytes 0x803712
PI dom1 page size = the 1st and 2nd bytes (reused) 0x8037
PI dom1 release = 0x0803 (The top 3 nibbles of the ROM Header Value)
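The four values above are consistent with simple shifts of a single header word. The sketch below assumes that word is 0x8037 1240, which is inferred from the byte values listed rather than read from the cartridge. The last step, masking each value with the default written earlier (0xFF, 0xFF, 0x0F, 0x03), is purely my guess at which bits the hardware keeps.

```python
rom_header = 0x8037_1240  # inferred from the byte values listed above

# Values as written to the registers.
latency = rom_header & 0xFF     # 4th byte
pulse_width = rom_header >> 8   # 1st, 2nd and 3rd bytes
page_size = rom_header >> 16    # 1st and 2nd bytes
release = rom_header >> 20      # top 3 nibbles

print(f"latency     0x{latency:X}")      # 0x40
print(f"pulse width 0x{pulse_width:X}")  # 0x803712
print(f"page size   0x{page_size:X}")    # 0x8037
print(f"release     0x{release:X}")      # 0x803

# Assumption: each register only keeps as many bits as its default value had.
kept = {
    "latency": latency & 0xFF,
    "pulse width": pulse_width & 0xFF,
    "page size": page_size & 0x0F,
    "release": release & 0x03,
}
print(kept)
```

If the masking guess is right, the oversized values written by the code do no harm, because the upper bits are simply dropped.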
A400 10C0
Read DP Command Status Byte
A400 10C4
Load the lower portion of the PIF Command Register
A400 10C8
Finish building the memory address of the cartridge at byte 0x40, just beyond the header.
A400 10CC - A400 10D0
Extract the xbus dma flag. If it's zero, skip down to A400 10F0.
A400 10D8 - A400 10EC
Loop checking the cmd busy flag in the DP Status Register.
Take 2 (Second time reviewing this article, trying to expand and clarify)
A400 10F0 - A400 10F4 & A400 10FC
Set up a temp register with a Destination Memory Address
A400 1100 - A400 1114
Loop and copy the first 4,032 bytes of executable code from Cart to DMEM
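The odd-looking size makes sense once the 0x40 header is added back in. A quick check in Python, using the cart offset 0x40 and the DMEM address A400 0040 that appear elsewhere in this article (the constant names are made up):

```python
HEADER_SIZE = 0x40         # cartridge header that is skipped
COPY_SIZE = 4032           # bytes copied from Cart to DMEM
DMEM_START = 0xA400_0040   # where the cart code lands

cart_end_offset = HEADER_SIZE + COPY_SIZE
dmem_end = DMEM_START + COPY_SIZE
words_copied = COPY_SIZE // 4

print(f"copy size:       0x{COPY_SIZE:X}")        # 0xFC0
print(f"cart end offset: 0x{cart_end_offset:X}")  # 0x1000
print(f"DMEM end:        {dmem_end:08X}")         # A4001000
print(f"words copied:    {words_copied}")         # 1008
```

So the copy covers the rest of the first 4 KB of the cartridge after the header, and it stops exactly at A400 1000, where the code we are currently executing begins.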
A400 1118 - A400 1120
Multiply hard coded values (This is to check that the header and CIC complement each other, 6102 in our example).
LSB of Cart Endian Indicator 0x3F
6C078965 * 3F = 1A 95DA CFDB
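The multiplication is easy to verify in Python. The split into high and low 32-bit halves mirrors how a MIPS multiply stores its result. The final "+ 1" is an assumption on my part, inferred from the a0 value listed for the subroutine call.

```python
MULTIPLIER = 0x6C07_8965
seed = 0x3F  # LSB of the value read from PIF RAM

product = MULTIPLIER * seed
high_word = product >> 32          # upper 32 bits of the 64-bit result
low_word = product & 0xFFFF_FFFF   # lower 32 bits of the 64-bit result

print(f"product: 0x{product:X}")   # 0x1A95DACFDB
print(f"high:    0x{high_word:X}") # 0x1A
print(f"low:     0x{low_word:X}")  # 0x95DACFDB

# Assumption: the code adds 1 to the low word before using it as a0.
a0 = (low_word + 1) & 0xFFFF_FFFF
print(f"a0:      0x{a0:X}")        # 0x95DACFDC
```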
A400 1124 - A400 1128
Setup a0 Register for subroutine call
A400 112C - A400 1130
Setup a1 Register for subroutine call
A400 1134 - A400 1138
Subroutine call
This is an interesting point in the code.
- Unconditional subroutine call
- a0 = 95DA CFDC
- a1 = A400 0040
- a2 = 0000 0FC0
- a3 = 0
Up to this point the code has been "linear": start at the top and execute instructions as it goes. If I could describe the code flow in the A4 section it would be:
- Start Block -> Middle Block
- Subroutine 1
- Calls Subroutine 2
- Middle Block -> Final Block
- Call Subroutine 1
- Subroutine 2
- Final Block
Back to our analysis:
A400 113C - A400 1180 - Subroutine 1
Calls A400 1550
Main Code Flow
A400 1184 - A400 11A8
Save the required Registers to the stack A400 1F10 - A400 1FF0
A40011AC
Load first Executable Instruction from Cart to temporary register
A40011B0 - A40011D4
Create a value of 0xD55A A7DC and repeatedly copy it to the IMEM range 0xA400 1F88 - 0xA400 1FC4 (inside the stack area above), 16 occurrences of the value in this range.
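The count of 16 follows from the address range when both ends are included. A small Python sketch of the fill, using a dictionary as stand-in memory (the names are made up for illustration):

```python
FILL_VALUE = 0xD55A_A7DC
FIRST_ADDRESS = 0xA400_1F88
LAST_ADDRESS = 0xA400_1FC4  # inclusive

# One 32-bit word every 4 bytes, including the last address.
addresses = list(range(FIRST_ADDRESS, LAST_ADDRESS + 4, 4))
memory = {address: FILL_VALUE for address in addresses}

print(len(memory))                                   # 16
print(f"{addresses[0]:08X} .. {addresses[-1]:08X}")  # A4001F88 .. A4001FC4
```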
A40011D8 - A400141C
This is an algorithm of 145 Instructions (+ subroutines) that runs for approx 435,170 Instruction cycles (including subroutines).
The End result is the following values located in memory at 0xA4001F84