Southeast Regionals 2010 - Solutions

Post date: Aug 2, 2013 6:00:18 AM

Hi all. Apparently it's my turn to write up the solutions. Today's problems are brought to you by SER2010™. None of the problems in this set were particularly trivial, though none were out of reach either. The winning team solved 7 out of the 9 problems. Today we're going to solve all 9.

The paragraphs below only gives a high level overview of the insights needed to solve the problem. 'Pologies in advance if there isn't enough detail. My implementations to the problem are given though.

Problem A: Balloons

For this who know the min-cost max flow pattern, this problem can be solved with a standard application of it. However, the intended solution is greedy.

The intuition is that if a team is closer to A than B, then that team should get its balloons from A if possible. Additionally, the savings in distance by delivering from A is dB-dA. However, if two teams are both closer to A than B, then we should deliver the balloon to the team that has a greater savings in distance. The same logic applies if a team is closer to B than A.

Now we have enough for a greedy algorithm:

Sort the teams by |dA-dB|.
For each team, if dA < dB, deliver all the balloons you can from A, using balloons from B if you run out.
If dA > dB, deliver all the balloons you can from B, using balloons from A if you run out.
If dA == dB, we can deliver balloons arbitrarily from either A or B.

[code]

Problem B: Bit Counting

This problem has a recursive feel to it, and recursion usually means DP (dynamic programming). Suppose we had some sort of solution f(n, x) that tells us the # of integers in [1, n] that has a K-value of x. Then given the inputs in the problem, the solution would simply be f(hi, x) - f(lo, x). We will try to come up with a way of computing f(n, x) below.

The main observation is the following: after just one iteration of bit counting, any number in the interval [1, 10^18] drops down to the interval [1, 64]. For each integer in [1, 64], we can calculate its K-value with brute force. Let's assume for now that the K-values for [1,64] are stored in an array called period[k]. Hence, if we wanted to compute f(n, x), we can solve the following subproblem instead:

For each integer j in [1, 64] where period[j] = x-1, how many integers in the much larger interval [1, n] goes to j after one iteration?

This is the same subproblem as:

How many integers in [1,n] have exactly j bits set to 1 and the rest set to 0?

This subproblem can be solved with DP.

Let's first denote the solution to the subproblem g(n, j). For intuition, let's examine the special case of when n = 2^r -1. In this case, the answer is simply rCk (r choose k), since any combination of the r bits may be set to 1 without overflowing the interval [1,n].

In the general case, not all combinations of rCk bits may be used, since it may generate a number outside of [1,n]. Let's write n = 2^r + m, where 2^r is the largest power of 2 <= n. We can split the counting of g(n,j) into two cases: (1) the case where the r-th bit is not set to 1, and (2) the case where the r-th bit is set to 1. If we do not use the r-th bit, then we are free to use any of the first r-1 bits without restriction. Hence case (1) contributes (r-1)Ck. Case (2) can be solved recursively, it is simply g(n-2^r, j-1).

Finally, the binominal coefficients rCk must be calculated by another DP.

[code]

Problem C: Data Recovery

This problem is an application of the max-flow pattern. Suppose we just wanted a single candidate solution for a given matrix. First we can take all the elements are not -1, subtract them from their respective row sum and column sum, and set them to 0. This gets us a subproblem that is a bit easier to manage:

Given row sums and column sums of a matrix consisting of 0's and -1's, fill in the -1's so that the row sums and column sums are satisfied.

From this point on we assume the given problem has been reduced to this subproblem. Then we may set up the following graph:

Let the source be s, and the sink be t.
Let each row be represented by a node r_i and each column be represented by a node c_j.
Connect s to each r_i, using the row sum of row i as the capacity.
Connect each c_j to t, using the column sum of col j as the capacity.
If there is a -1 in the (i,j)th position of our reduced matrix, then connect r_i and c_j by an edge of capacity 100 (since this is the largest an element in the matrix can be).

If we run max-flow on this graph, the flow through an edge (r_i, c_j) gives us exactly the value of the element in the (i,j)th position of the matrix.

This gives us a candidate solution, but how do we tell if an element (i,j) is unique?

For intuition, consider the case where we have four -1's arranged in a rectangle,

-1 -1

with row sums being 50 for both rows, and column sums being 50 for both columns.

We can immediately see that many solutions are possible:

Soln 1:

25 25

Soln 2:

24 26

26 24

Soln 3:

26 24

24 26

... and so on.

Looking at the differences between these solutions (and I mean differences in a 'subtract the matrices' way), we can see that going from Soln 1 -> Soln 2 can be described by the matrix:

-1 +1

+1 -1

Going from Soln 2 -> Soln 3 can be described by the matrix:

+2 -2

-2 +2

Just looking at the signs of the matrices above, we can see that transitioning from solution to solution is either described by a

+ -

- +

matrix or a

- +

+ -

matrix.

This is because if we increase the upper-left corner of the rectangle of -1's, we will have to decrease the upper-right corner of the rectangle to compensate. Decreasing the upper-right corner means we'll have to increase the lower-right corner, and increasing the lower-right corner means we will have to decrease the lower-left corner. The observation to make is that this is simply a cycle of alternating +'s and -'s (placed on some -1's of the given matrix), where each + is connected to two -'s (one on the same row, and one on the same column), and each - is connected to two +'s.

Finally, note that if our candidate solution has a 0 in the (i,j)th element, then we cannot decrease this element (i.e. we cannot assign it a -). Likewise, if our candidate solution has a 100 in the (i,j)th element, then we cannot increase this element (i.e. we cannot assign it a +). So there are a couple of restrictions to check when you are looking for a cycle of alternating +'s and -'s.

Hence to determine if the (i,j)th element of the matrix is unique, we simply have to dfs through the matrix, and see if we can find a cycle as described above that contains the (i,j)th element.

Note: It turns out that finding such a cycle containing (i,j) is equivalent to finding a cycle of alternating forward capacity edges and backward residual edges on the residual graph after the max flow. The cycle must start at r_i and end at c_j without using either the capacity edge (r_i, c_j) or the residual edge (c_j, r_i). This is what is implemented in the code below.

[code]

Problem D: Equal Angles

This problem can be solved by ternary search. For an angle t, we define the function f(t) to be the area made by triangle abc when we draw 3 lines with angle t from the side of the triangle, as shown in the diagram below.

One of the points that we are looking for (P to be exact) is located at the angle t* for which f(t*) = 0. Intuitively, we can see that f(t) only has one minimum. Additionally we can see that for all angles t < t*, f(t) is decreasing and for all t > t*, f(t) is increasing. The left bound of t is clearly 0, and the right bound of t is the minimum angle between the 3 corners of the given triangle. From this point on it is a standard application of ternary search.

Note: Daniel has raised up the point that if we used the signed area of abc, we may use binary search instead.

[code]

Problem E: Maximum Square

There are many solutions to this problem (even one that reduces this problem to problem G). I solved it by building the profile matrix Angus described in his previous blog post for problem H (PACNW 2009). From this profile matrix, we can reduce each column to an instance of the classical largest area under a histogram (see solution to problem H). I solved it this way mostly as a coding exercise. It is definitely not the easiest solution to code (for example, Daniel's solution was six times shorter than mine after he compressed it).

[code]

Problem F: Palindrometer

The observation to make in this problem is that all palindromes of length n can be generated by enumerating the first n/2 digits, and then mirroring them for the last n/2 digits (with an extra digit in the middle if n is odd).

Hence we can solve the problem by:

First generating all the palindromes (which take at most 10^5 operations since we only have 9 digits)
Storing all palindromes into a list
Sorting the list
Finding the next largest palindrome to a given number by binary search.

[code]

Problem G: Profits

This is the classic maximum sum subarray problem, which can be solved by Kadane's algorithm (as covered in Angus's previous blog post).

[code]

Problem H: Roller Coaster

Let's say that the bounds are small for a second. Then this problem would be a textbook application of the knapsack problem. Using the maximum dizziness as the knapsack, and the fun as the weights, we can solve this problem in O(N*L) space and O(N*L) run time. As a reminder, our DP state is:

dp[n][l] := the maximum fun we can have, with dizziness <= l.

dp[n][l] = max(dp[n-1][l-(dizziness at the nth segment)] + (fun at the nth segment), dp[n-1][l+k])

However, reality is cruel, and using the solution above would use over a gigabyte of memory.

Fortunately for us, the maximum fun we can have on a section of the track is bounded by only 20! So we can flip this DP around, and consider the state

dp_better[n][f] := the minimum dizziness we can have, with fun = f.

dp_better[n][f] = max(dp_better[n-1][f-(fun at the nth segment)] + (dizziness at the nth segment), dp_better[n-1][f]-k)

This reformulated DP can be solved in O(N^2 * maxF) space and O(N^2 * maxF) time. There are also some checks you need to do (such as not taking solutions that exceed the maximum dizziness), but this is mostly an implementation issue.

We can recover the solution by iterating over all values of dp_better[N][j] and picking the highest j with dp_better[N][j] <= l.

Note: Since the recursion we formulated only involves the nth row and (n-1)th row, we can actually save more space by only storing the last two rows. This takes the memory down to O(N*maxF), and is implemented in the solution below.

[code]

Problem I: Skyline

By doing a few small test cases, one will see that the number of good skylines is simply the Catalan numbers. Computing the Catalan numbers themselves is a simple application of dynamic programming. I haven't been able to find an easy proof that maps this problem to the Catalan numbers, but this solution does pass all the test cases.

[code]

Problem J: Underground Cables

Ignoring the restriction on crossing cables, this problem is solved by finding a minimum spanning tree of the graph where cities are nodes, and every city is connected to every other city by an edge weighted by their euclidean distance (representing a possible cable connection).

Now suppose we had a minimum spanning tree of this graph that had crossing edges a<->b and c<->d. After a little thought (and a little afterthought), we can see that we'd get a shorter minimum spanning tree by un-crossing the edges into a<->c and b<->d. Hence by contradiction, the minimum spanning tree has no crossing edges.

So it suffices to just construct the graph mentioned above and run any minimum spanning tree algorithm.

[code]