In this assignment you will solve a synchronization problem using the synchronization primitives in OS161, two of which you have implemented in the previous assignment. You will also learn about having multiple remotes configured for your git clone, and about branching and merging with git.
Step 1. Prepare
Make sure you have a working implementations of locks and condition variables
Before you begin working on this assignment, you must have a working implementation of the synchronization primitives that you completed in Assignment 2. If your implementation is not working, you are welcome to use the solution code that we distributed to you separately. (If you are not sure how to obtain it, ask your TA.) It is entirely up to you how you want to integrate the solution code into your tree. One option is to simply copy over the affected files. There is probably only two of them at this point: synch.h and sync.c. Another option is to use the git reset command to discard your commits and apply the patch to your tree, then commit and push the changes to git. Regardless of what you decide to do, make sure that your working tree compiles, runs and passes all the synchronization tests in the kernel menu (sy2, sy3 and sy4). If you need help, talk to your TAs during the lab session.
Pull some new code required for this assignment
We have created stubs and driver code for the synchronization problem you will be solving in this assignment. Follow these steps to integrate the code into your repository.
First, you need to add another remote to your git clone. The new remote is for the git repository set up by your instructor to share with you certain bits of the OS161 codebase. It is the same repository that you used to create your original master repository a couple of weeks ago. Go into the directory where you have the clone of your master repo (probably ~/os161/src) and type the following:
git remote add instructor http://dev.ece.ubc.ca/git/OS161
Next, examine the remotes your repository knows about by typing:
git remote -v
You will see the output that looks something like this:
instructor http://dev.ece.ubc.ca/git/OS161 (fetch)
instructor http://dev.ece.ubc.ca/git/OS161 (push)
origin https://username@bitbucket.org/username/os161.git (fetch)
origin https://username@bitbucket.org/username/os161.git (push)
As you can see, the new remote was added under the name "instructor". There is nothing magic about the name "instructor". It is simply the name we chose to use; you could give it any other name you wish. Now, if you want to fetch from that repository, you will refer to it by its name: 'instructor', in this case. (As a side note, if you use git commands without providing the name of the remote, git automatically defaults to the name "origin", which is the name assigned to your own master repository).
Now, fetch data from the new remote:
git fetch instructor
You will see that this command resulted in fetching of the two branches: master and synchprobs. Next, you need to integrate the code from synchprobs into your current tree. You will do this by merging the branch instructor/synchprobs into your master branch. (More information on git merging can be found here.)
git merge instructor/synchprobs
At this point, git will probably complain about merge conflicts. Follow this tutorial to resolve them before proceeding.
If you now examine your source tree, you will see that there is a couple of new files that have been added, such as kern/conf/SYNCHPROBS and kern/synchprobs/airballoon.c. The first one is the configuration file that you will use for your assignment. The second one is the stub code for the synchronization problem you will solve. Make sure these files were successfully added to your source tree.
Now, push that new code into your master repository and tag your tree to indicate that you are beginning to work on Assignment 3:
git push
git tag asst3-start
git push --tags
Step 2. Understand the problem
Here is the synchronization problem you will have to solve. This is a well-known and a very serious known as AirBalloon.
After a war erupts in their kingdom, Princess Marigold must help Prince Dandelion (her younger brother) escape from danger. Marigold places Dandelion in a hot air balloon, which is connected to the ground by NROPES ropes -- each rope is connected to a hook on the balloon as well as a stake in the ground. Marigold and Dandelion must work together to sever all of these ropes so that Dandelion can escape. Marigold unties the ropes from the ground stakes while Dandelion unhooks the ropes from the balloon. To free a rope it must be either severed from the ground or unhooked from the balloon: not both.
Unfortunately, one of Princess Marigold and Prince Dandelion's enemy, Lord FlowerKiller, is also at work. FlowerKiller is rearranging the ropes to thwart Princess Marigold and Prince Dandelion. He will randomly switch ropes attached to two diffrent stakes. This leads to chaos!
Without Lord FlowerKiller's dastardly behavior, there would be a simple 1:1 correspondence between balloon hooks and ground stakes (each hook corresponds to exactly one stake, and each stake corresponds to exactly one hoop). However, while Lord FlowerKiller may break this symmetry. Say, Lord FlowerKiller decides to switch ropes connected to Stake 1 and Stake 5. Before the switch, the rope connected to Stake 1 on the ground is also connected to Hook 1 on the balloon, and the rope connected to Stake 5 on the ground is also connected to Hook 5 on the balloon. After the switch, however, one rope is connected to Stake 1 on the ground and to Hook 5 on the balloon, and vice versa.
As Marigold and Dandelion cut ropes, they must delete mappings, so that they remove all the ropes as efficiently as possible. Each character is represented by a thread. Dandelion selects ropes to sever by generating a random balloon hook index, and Marigold selects ropes by generating a random ground stake index.
Marigold and Dandelion must work as efficiently as possible, so one should not try to disconnect a rope that has already been disconnected. For example, if Dandelion already unhooked a rope, Marigold should see that the rope is hanging loose and not try to unook it from the stake. Similarly, Dandelion should not try to unook a rope that Marigold disconnected on the ground.
Lord FlowerKiller is on the ground, so like Marigold, he selects ropes by their ground stake index.
Due to recent advances in genetic technology, there is another unfortunate circumstance: Lord FlowerKiller figured out how to clone himself, so now there are N_LORD_FLOWERKILLER copies of of him at work (each is a separate thread)!
To begin thinking about problems to avoid in your implementation, consider these examples:
Marigold randomly selects Stake 7, sees that the rope is still attached. She marks the rope as severed. Suppose that the hook correpsonding to the severed rope is Hook 11. If Dandelion now randomly selects Hook 11, he must see that the rope is severed and not try to sever it again. What data structures might you use to avoid a race condition in this situation? What would you need to lock to make sure that Marigold and Dandelion don't sever the same rope twice?
Worse yet, Lord FlowerKiller might be wreaking havoc with the same ropes. For example, imagine that he decides to swap the rope attached to Stake 7 with the rope attached to Stake 4, and vice versa. How do you make sure to avoid race conditions that this situation creates?
Now consider how Lord FlowerKillers might step on each other toes. Suppose Lord FlowerKiller1 grabs Stake1 and Stake4 to swap the corresponding ropes, and Lord FlowerKiller2 graps Stake4 and Stake1. How might this cause deadlock? What would you do to prevent it?
In addition to our character threads, there is also a Balloon thread that does not do much. It gets started in the beginning of the program, at the same time as the other threads, and just sits there and waits until all the ropes have been severed. Then it announces that Prince Dandelion escaped and exits.
Step 3. Understand the requirements
Your solution must satisfy these conditions:
Avoid deadlocks and race conditions
Stick to the API provided in the driver code. You may write any other code you want, but do not change or go around the code already provided to you. If unsure, as your TA!
You may use semaphores, locks and condition variables available in OS161, but do not use spinlocks, wchans, or atomic primitives directly, and don't turn on/off the interrupts.
Permit Marigold, Dandelion and Lord FlowerKiller threads to operate concurrently (no "big lock" solutions)
The Balloon thread exits after all ropes have been severed
The Main thread (the one that starts all other threads) exits after all threads are done.
You must deallocate all the memory allocated by your code before all threads exit. If you don't, there will be a memory leak! We will run your program multiple times to make sure your kernel does not leak.
Marigold must access the ropes via stakes. She does not have access to hooks. Programmatically, this means that you need to have some kind of data structure representing stakes, which Marigold indexes. Marigold can update stakes and ropes (that are connected to stakes), but she may not update hooks. Similarly, LordFlowerkillers may access ropes via stakes only. Dandelion, on the other hand, may access ropes only via hooks. He cannot see or access stakes.
To grade your code, we will read it to check that it meets the specification and use an automated script to check the correctness of the output. Please read this section very carefully to understand what your code should print. Be sure to follow these rules. Otherwise our script may report your solution as incorrect. You don't want to lose points for small typos!
Every time Marigold severs the rope, the Marigold thread prints: "Marigold severed rope N from stake K", where N is the index of the severed rope, which can range from 0 to NROPES - 1, and K is the index of the stake to which the rope was attached. K also ranges from 0 to NROPES-1. Be sure to print the statement exactly as shown, with no extra characters. The print statement must appear on its own line.
Every time Dandelion severs the rope, the Dandelion thread prints: "Dandelion severed rope N", where N is the index of the severed rope, which can range from 0 to NROPES - 1. Be sure to print the statement exactly as shown, with no extra characters. The print statement must appear on its own line.
Lord FlowerKiller prints the following statement every time he switches a rope: "Lord FlowerKiller switched rope N from stake K to stake P". N, K and P are corresponding rope and stake indices. If he switches two ropes, which is exactly what he will do every time, he must print two statements: one for each rope.
There must be a print statement for every rope indicating that it was severed by either Dandelion or Marigold.
Each rope must be severed exactly one: only one "severed" print statement for each rope.
To randomly select a stake to unhook (or two stakes to swap) or a hook, use random() function in OS161.
Every time a character succeeded unhooking one rope or switching one pair of ropes, the corresponding thread must yield (call thread_yield()).
Marigold's print statements must indicate that she is severing the ropes from correct stakes. For example, if there is a print statement from Lord FlowerKiller saying "Lord FlowerKiller switched rope 4 from stake 4 to stake 2", then Marigold's statement about severing rope 2 must say: "Marigold severed rope 4 from stake 2". (Dandelion's print statements do print the stake index corresponding to the severed rope, because Dandelion cannot access stakes.)
When all ropes have been severed (after the "severed" print statements for all NROPES ropes were displayed), the Balloon thread must print "Balloon freed and Prince Dandelion escapes!" and exit.
Each thread, except the main one, must print the "thread starting" message just as it begins running. I.e.:
"Dandelion thread starting"
"Marigold thread starting"
"Lord FlowerKiller thread starting"
"Balloon thread starting"
Each thread, including the main one, must print the "thread done" message just before it exits. I.e.:
"Dandelion thread done"
"Marigold thread done"
"Lord FlowerKiller thread done"
"Balloon thread done"
"Main thread done"
The main thread must print only after all other threads have printed their departing statements. That is, the "Main thread done" statement must be the last one to appear before your kernel returns to the main menu.
The print statements must be atomic: characters from one print statement must not interleave with another.
The print statements from different actors must interleave: for instance you can't have all Dandelion print statements appear together first, and then all Marigold statements appear together. You will accomplish it by using fine-grained locks (no one big lock) and by calling thread_yield() every time after Dandelion or Marigold sever the rope or after Lord FlowerKiller switches the ropes.
Step 5. Plan your time
You will be given partial credit if you implement the solution to the problem assuming that there is no Lord FlowerKiller and no Balloon threads (just don't start the corresponding threads). So if you are struggling, complete the solution involving Marigold and Dandelion only first. You will be given more points if you add the Balloon thread and even more points if you add the Lord FlowerKiller threads.
Step 6. Write, test and debug the code
Put all your code into kern/synchprobs/airballoon.c. To make sure that the new files get compiled, configure the kernel as shown below, and don't forget to bmake depend:
cd kern/conf
./config SYNCHPROBS
cd ../compile/SYNCHPROBS
bmake depend
bmake
bmake install
Test your code by invoking sp1 from the kernel menu. And don't forget what you learned about debugging with gdb in Assignment 1! Also, check that your sys161.conf file uses at least four CPUs. We will test your code with four CPUs!
Debugging tips:
Get one type of thread working first, e.g., Marigold or Dandelion. Make sure it does not deadlock with itself and otherwise works as you expect. Then add other thread types.
Deadlocks are easier to debug than race conditions. If you are struggling, put one big lock around the critical sections that are causing you trouble. Make sure your code is working. Then gradually reduce the size of the critical section, making your big lock cover a smaller and smaller area of the code, until your code breaks. This way you will narrow down the few lines of code where the error is occurring.
If you encounter assertions in the VM system (e.g., in the kmalloc.c file), chances are you are freeing a pointer that you did not allocate or are doing something else funky with the memory. If your kernel panics on a TLB fault, you are probably accessing a pointer that you did not allocate or a null pointer.
If you need to figure our where a thread is stuck, print the address of the thread struct in your code: kprintf("%p\n", curthread);. We will call the pointer that is printed threadaddr. Then you can print the contents of the thread struct in GDB: print *(struct thread) <threadaddr>. You can also figure out which thread holds the lock by printing the lock struct in gdb, for example: print stake->lk_stake;. If you are keeping track of the lock holder inside the lock structure, that field will show the threadaddr or the holder thread.
If your code is stuck in a deadlock, you should be able to attach to it with the debugger using the os161-gdb command and following the instructions linked in A1. This should work even if you did not launch your sys161 with -w flag.
Step 7. Submit the assignment
Submit the assignment as usual:
git commit -a -m "Solution to Assignment 3"
git push
git tag asst3-submit
git push --tags
Congratulations! You are done with Assignment 3.
Passing tests:
If all functionality is implemented:
4 — all three tests pass, no major or minor errors
3 — all three tests pass, some minor errors
2 — 2 out of 3 tests pass with no major errors (minor errors ok)
1 — 1 out of 3 tests pass with no major errors (minor errors ok)
0 — 0 out of 3 tests pass with no major errors.
We reduce the mark by an additional 10% if all is fine, but the main thread is not synchronized properly.
Code quality:
0 - non-existent or incomprehensible
1 - kind-of sort-of
2 - basically ok
3 - dholland kind code
TAKE THE FINAL SCORE AND MULTIPLY IT BY 18. THIS WILL GIVE THE TOTAL POSSIBLE MARK OF 126.
Subtractions that affect functionality and correctness.
Take the following percentage points off the aggregate mark if:
40%: If you do not implement FlowerKiller.
10%: If you do not implement the baloon thread.
7%: Spinlocks, atomic primitives or interrupts are used instead of higher-level synchronization primitives.
10%: Actors select ropes directly, not stakes or hooks
50%: A big lock solution is used. Threads are not operating concurrently.
10%: Memory is not freed.
4%: If there are any mistakes in print statements. If you don't use the print statements exactly as specified in the assignment, our scripts report errors and we have to resolve these manually, so please use the print statements precisely!