2012 KTH Challenge - solutions

Post date: Aug 16, 2014 3:16:47 AM

This is another solutions post. It's been a year since I last wrote one (here), so hopefully the next one won't take quite as long...

This problem set is from the 2012 KTH Challenge. Everything is on the UBC ACM practice website here, including all my (extremely messy) code. You probably want to at least look through the problemset, and ideally have a try at solving the problems so my solutions actually mean something. Also, of course, my solutions should not be taken as fact, and the best way to improve your programming skills from reading this post would be to code up solutions yourself, using this post only when absolutely necessary.1

Problem A - Spam Filter

The key observation that makes this problem much easier to solve can be seen if we consider a certain test case. Let's say we had a 100000-character length string and k = 100, and you had a substring of 100 characters that contained 99 ones. How could we improve on this? If we check the next character and it's a 1, that's an improvement. If the next character after that is also a 1, that's also an improvement, and so on. But here's the thing: if we lengthen our original 100-length substring by another 100 characters, there's only one case in which the original 99% success rate is improved, which is when every character in the added 100 is a 1, leading to a 99.5% success rate over 200 characters. However, we could just take the latter 100 characters and get a 100% success rate! Working along these lines, we can show2 that you only ever need to check substrings of length k to 2k inclusive, so we can just use brute force and we'd be checking a maximum of 2k*100000 = 2*10^7 characters, which will safely run in time.

Problem B - Birds on a Wire

There isn't much to this problem - the large size limit of l means you can't go through every position linearly 3, but the relatively small size limit of n means you can sort the positions of the birds and check each adjacent pair. For each pair, the number of birds that can be placed in between can be easily determined as one less than the difference between the birds' positions divided by d. The problem's more annoying than anything, since you also have to account for the ends of the wire.

Problem C - Lifting Walls

Note that if a solution exists, the number of cranes required is at most four - one crane for each wall. Thus, if we first calculate which walls every crane could reach (which runs very quickly, since there are only 4 walls for 30 cranes), we can check if one crane covers all the walls, in which case the answer is 1. If this isn't possible, we select two cranes and see if it's possible. Then we check three, and four. If four cranes can't cover the walls, then there is no solution.

What's the time complexity of this? For the last step, checking if the four walls are covered by four cranes4, there are 30 choose 4 = 27405 ways of selecting four cranes to check, so we are well within time constraints.

Also, how do we actually check if a subset of cranes covers all four walls? An easy way is to represent the walls that can be covered by each crane as a 4-bit number, then OR numbers together and check if it equals 15. Make sure you use parentheses, as bitwise operators have lower precedence than the equivalence operator!

Problem D - Toilet Seat

This is a (relatively) simple implementation problem and all you need to do is follow the instructions5. It might be possible to do something fancy instead of going through the string thrice, but it's not necessary since the string length is 1000 characters at most.

Problem E - Pub-lic Good

Some thought leads to a couple observations:

  1. DFS/BFS6 can be run on the graph, with an arbitrary root. If pubs and houses are assigned to alternating depths (i.e. the root is a pub, all nodes with depth 1 are houses, all nodes with depth 2 are pubs, etc.), then we are guaranteed that if a node is a pub, its parents/children are houses, and vice versa. This satisfies the pub/house condition.
  2. If there is an isolated node that isn't connected to anything, we can't satisfy the condition, since if it's a pub it's not adjacent to a house and vice versa. This is the only case in which no answer exists.

There isn't much more to say for this question, except that DFS/BFS work fine for this problem since there are at most 10000 nodes and 100000 edges.

Problem F - Xor Maximization (Thanks to Paul Liu)

This is one of those problems where someone else (in this case, Paul) thinks of something, tells you about it, successfully defends the idea against all attempts you make to prove it wrong, and you code it and it seems to work and you submit and it's correct. So from my perspective, I'm probably not going to give a very illuminating solution here, but hopefully you can figure it out yourself if you are so inclined.

The solution that doesn't involve creating a 60 by 100000 matrix (ask Paul about that) involves the following steps:

  1. Find the numbers whose most significant bit (MSB) are set and at index 59 (where index 0 corresponds to 2^0 = 1, index 1 to 2^1 = 2, etc.). If none exist, skip to step 4.
  2. Find the largest of these numbers.
  3. For the other numbers, XOR the largest number with each of them (and overwrite the original number).
  4. Repeat step 1, except decrement the bit being checked, unless we just checked for numbers with MSB set at index 0, in which case end.
  5. At this point, we can sort the numbers from highest to lowest, create a sum starting at 0 and iterate through the numbers, XORing the sum by a number if it increases the sum. Then the final sum is the answer.

I wouldn't normally tell the reader to refer to my code, but I think it's actually written somewhat not terribly for once so you can look at it if the above wasn't clear enough.

Why does this work? Well, I don't have a very clear answer. The greedy part in step 5 works because at that point, every number's MSB is unique (i.e. there aren't any other numbers that have the same MSB set at the same index). So if we had something like

index: | v sum: 100101011111 num: 000000111111 XOR: 100101100000 (becomes new sum)

we can see that even if the bits to the right of the index change from 1 to 0, if the index bit changes from 0 to 1 we have an improvement. In other words, we only care about the bit at the MSB index of the number we are looking at.

As for the rest of the algorithm...I think it has something to do with the fact that A XOR B XOR B = A XOR (B XOR B) = A XOR 0 = A. So when we XOR a number against everything else, we are just simplifying the number set while keeping the set of all possible XOR results unchanged. (Try it yourself!)

Basically though, this was just a really convoluted way of saying "Paul please explain how to do this question in the comments". But it's okay since ACM competitions are team events! Right?

Problem G - Restaurant Orders

The problem is very similar to the knapsack problem, and the solution can be coded as a slight variant of the well-known pseudo-polynomial algorithm to solve it. There are some slight differences of note. For one, there isn't any optimization of cost; we can simply keep track of whether a selection of items exists that sums to a total. And if there are at least two ways of reaching a total, we can immediately mark it as ambiguous.

Also, it's necessary (unless you have a clever optimization) to keep track of frequencies of items, rather than dumping all the items into each value. Consider the test case with one item of weight 1; if you store <1> at value 1, <1 1> at value 2, and so on, you'll run out of memory7. Instead, if you store frequencies, you'd store <{1 1}> at 1, <{2 1}> at 2, etc. Clearly, this scales linearly with size, instead of with the square of size.

Problem H - Three Digits

At first, the problem looks very easy - we'd figure that multiplying each number in the factorial, dividing by 10 until the last digit is non-zero, and modding by 1000 every time would work. But this starts to fail when the multiplication results in many powers of 10, and when you mod you don't get three nonzero numbers and every subsequent calculation is incorrect. You could try modding by 10000, or 100000, or an even higher power of 10, but it's likely to fail.

Instead, we could do the same thing, but before performing the multiplication we could check multiplicities of 2 and 5 (the prime factors of 10), and when both are non-zero we can divide powers of 2 and/or 5 out of the numbers being multiplied. That is, we'd divide powers of 10 out before they actually appear in the multiplication, so they don't non-zero numbers back to where we might unwittingly mod them out.

1This is totally why my code is borderline unreadable (but at least formatted nicely, with opening braces on the correct line, i.e. on a line by itself).

2Well, proving it outright isn't mandatory, but in an actual contest where you're pressed for time, it's a tradeoff between whether you think your hunch is correct so you don't need to spend more time proving it and accruing a large time penalty from incorrect submissions.

3Maybe it's possible - you'd represent the birds as 1s in a block of 1 billion bits, generate answers for every possible block of 16 bits and then iterate through the bits, 16 at a time...but it's pretty clear that for this problem, there is a solution that is obviously the intended solution, and anything else is "okay that's neat but why would you actually go through that when there is a way easier solution".

4I did this part with a quadruple-nested for loop. I regret nothing.

5Which is apparently too hard for some people, who know who they are.

6Depth-first search, or breadth-first search.

7Although this actually works on our judge, and Paul's solution that generates 30000*30001/2 is accepted instead of getting a memory limit exceeded result. But in an actual competition you'd have a) an actual judge that sandboxes solution programs (please don't try system calls on our judgea) and b) posted memory limits. And in any case you probably want the better-scaling solution.

aReally, don't.