The main issue in the software realm is speed of computation. Two algorithms in particular were slow: connecting different groups of pixels, and determining how to draw the picture.
To connect different groups of pixels, we first have to be able to identify them.
If we have the list of all “on” pixels, we pick one of them and call that the starting location of the first group. We look at all of its neighbors, and if any of those pixels are also “on”, add them to the group. Next, we look at all the neighbors of the group, with its new members, and repeat the process. We do this until there are no more neighbors that are “on”. This means that we have reached the end of this group of pixels. If there are more pixels in the global list of “on” pixels, that means there is another group not connected to the one we just found. So, we pick another pixel in that list and repeat the same process.
This algorithm is fairly straightforward, and there is not much room for optimization. The main issue is how to deal with the list of pixels. For example, once an “on” pixel is found, it is best to remove it from the list of global pixels because it is no longer needed. This will make using that list faster, as it has become smaller.
After we have the list of all groups, we have to connect them all.
If there is more than one group of pixels, we find the two closest ones (as to minimally alter the picture) and connect them. To connect them, we find the two pixels in each that are the closest, and draw a straight line between them. We repeat the process until all groups have been connected.
This algorithm has much room for improvement. Firstly, finding the two closest groups requires finding the distance between all groups. Rather than having to do this every time we iterate, we store the distances we calculate, so we never have to calculate the same distance twice (since most groups of pixels will not be affected in each iteration). Also, we can reuse some of the information we find. Find the distance between two groups of pixels requires finding the distance between any two pixels in each group, and then finding the closest two. These two pixels are also needed when connecting the groups. We can take advantage of this by saving these two pixels when we calculate the distance.
Secondly, we don’t have to only connect two groups in each iteration. If there are multiple groups separate by the same distance (as occurs frequently with groups separated by only one pixel), we can go ahead and connect all of them in the same iteration.
With these modifications, the connecting algorithm becomes much faster. The limiting factor is finding the distance between two groups of pixels, which is slow because it requires n2 computations, where n is the size of the group. Depending on the image, the whole process takes on the order of 10 seconds for a 150x150 pixel image.
Next, determining how to draw the picture is particularly slow.
The algorithm we use for this is fairly simple. We keep track of the pixels we have and haven’t drawn. At each step, we find the closest pixel that still has to be drawn, and move to that pixel. To find the closest undrawn pixel, we implement a breath-first search. We take the pixel we’re at and call that the starting location. We then look at all the neighbors of that location that are in the picture and see if any of those are not already drawn. If not, we do the same with the neighbors of the neighbors of the start, and so forth. We repeat this process until we have found a pixel that we still need to draw. To determine the path to that pixel, we use an implementation of Djikstra’s algorithm. We repeat this process until the whole picture has been drawn.
The limitations of this algorithm are harder to address. Upon profiling the Python code to see what was taking the most time, we found that the list operations (such as seeing if an element is in a list) were taking the longest. This is because the breadth-first search becomes particularly ineffective when finding pixels that are far away, as you need to look at many levels of neighbors. One potential solution to this problem that we haven’t implemented is to change from a breadth-first to a depth-first search when the number of pixels remaining is small. If there are few pixels remaining, then it is easier to calculate the path to each one and then chose the smallest path. However, if this change is done too soon, finding the path of each of many remaining pixels will be a tedious and long process.
Our algorithm for determining how to draw the picture, as currently implemented, takes on the order of 30 seconds for a 150x150 pixel image, again depending on the image and the starting point.