February 15, 2019
For today:
1) Finish Homework 2.5
2) Read Head First C Chapter 3
3) Prepare for a quiz
4) Think about project ideas
Today:
1) Quiz
2) Project planning
3) Homework 2.5 solutions
4) HFC 3
For next time:
1) Project proposal
2) Read Think OS Chapter 4 and do the reading quiz
3) Start Homework 3
4) Read the HFC 3 notes below
Bonus question:
Normally page tables are stored in kernel memory, but suppose some lunatic suggests storing each process' page table in its virtual address space, maybe in the unused space above the stack. Why might that be a bad idea? Specifically, in what way would it undermine one of the goals of the process abstraction?
Our model of computer architecture, revisited:
ALU = arithmetic/logic unit
MMU = memory management unit
HBA = host bus adapter
MCU = memory control unit
DC = disk controller
NIC = network interface controller
Think about
1) Your learning goals
2) Topics you are interested in
3) Your level of facility with C
4) Your level of commitment to the project
5) People you know you work well with (but also consider working with someone new)
Project proposal due Tuesday
1) find_track solution and reflection
2) BigInt solution and reflection. Were the tests adequate?
C and UNIX co-evolved, so they have many concepts in common.
In C/UNIX a "file" is really an abstraction that represents a stream of bytes. Those bytes might be coming from or going to a file or almost any other device. Or they can be piped from one process to another.
These streams are referenced by "file descriptors", which are integer indices into the file descriptor table.
Each process has its own FDT, initialized with three entries: stdin (fd 0), stdout (fd 1), and stderr (fd 2).
Read more here: https://en.wikipedia.org/wiki/File_descriptor
As a programmer, you should write as if you don't know where these streams really come from or go. They get wired up when the process starts.
If you want explicit control of where the data comes from and goes, you can open files with fopen().
The result is a "file pointer" (type FILE *), which is a structure that contains info about the open file, including the file descriptor.
If you open the same file twice, you get two different file pointers, containing two different file descriptors, which can be at different positions in the file.
So "file pointer" is to "file" as "process" is to "program".
How to think like a tool
Small tools like the ones in this chapter are a big part of what makes programming in a Unix command line environment so powerful. Consider a simple challenge, like building a histogram of common words in a file:
$ cat song.txt | tr ' ' '\n' | tr '[a-z]' '[A-Z]' | sort | uniq -c | sort -nr | head -n 10
27 YOU
26 GONNA
24 NEVER
12 AND
7 TELL
6 TO
6 SAY
6 MAKE
5 KNOW
5 A
Is this the most elegant solution possible, or the one I would deploy to handle 10,000 queries/sec? Nope.
Does it prove the concept, work well enough for most cases, and take 45 seconds to write? You betcha. Sometimes the fastest programs to write are those you don't write at all.
As you progress in the software world, keep learning these little tools. They come in handy for your scripts, and they're good models as you write your own building block programs. A few tips:
Text in, text out. Operate on one line at a time if you can.
Don't clutter stdout with status messages; send them to stderr so tools can be chained together. Add a verbose option if you need it.
Do one thing well. Decomposing your problem into simple tools is a lot like breaking your program into smart functions.
This isn't just for C. You can write small tools that behave this way in any language (Python's sys.argv, anyone?)