Problem Set 4
You can find the text of problem set 4 here.
Everyone will use the same graph: amazon.
You can find the .mat file and the edge list in
the directory of files for problem set 4.
The categories are numbered 1 through 40.
I have assigned everyone who submitted problem set 3 a category.
This page contains a list of netids. The number in front is your category.
The list of 10 vertices in category i is in the file in_i.txt, and the list of 100
vertices not in category i is in the file out_i.txt.
These are also in the directory of files for problem set 4.
I have created a few test graphs that you can use to test your code.
They are labeled ps4test1 - ps4test4, and are in this directory.
In each graph, the first half of the vertices form one cluster.
So, if you select 10 vertices at random from the first half and 10 vertices at random from the second half, solving the linear system should enable you to find something close to the split.
These graphs go from small to sort of big, so you can use them to test if you code is fast enough, and hopefully figure out why if it is not.
I have also created a list of the vertices in category 41 from this graph.
It has 91 vertices. If I choose 10 inside at random and 100 outside at random to feed to the algorithm, it returns a set of around 200 vertices that contains around 70 from the set.