Practice 2016-09-03: Solutions for Universidade de São Paulo Team Tryouts 2016

Post date: Sep 6, 2016 5:51:57 AM

The contest wasn't too bad, although there were some harder problems! I encourage everyone to read the solutions and then attempt a problem they didn't or couldn't solve!

If you only solve problems you could solve, it's hard to improve. The key to improvement is to solve problems you couldn't solve before.

As usual, my code is here:

https://www.dropbox.com/sh/prf8vqxp6uwcb26/AAAdlS8q09UlA8l5tbwdm0VAa?dl=0

Solutions by Paul and me are described below.

Problem A - Renzo and the Lost Artifact

Problem Summary

You have two rectangles of equal proportions, one smaller and placed inside the other. You want to find a point in the bigger rectangle such that if you map the bigger rectangle onto the smaller rectangle (by scaling, rotation, and translation), the point remains in the same place (inside the smaller rectangle)! A solution is guaranteed to exist.

Solution

Let the bottom left corner of the small rectangle be z. Let the bottom side of the small rectangle be vector a, the left side of the small rectangle be vector b, both vectors pointing out from the bottom left corner. Let c,d be the corresponding vectors for the larger rectangle. The input is given such that the larger rectangle has bottom left corner at (0, 0).

Let our solution be v. The idea is that we compute some sort of "invariant coordinate" of v in both rectangles and equate them. Notice that if we project v on to the bottom side and onto the left side of a rectangle and divide the projected length by the length of the corresponding side, we get a number that is invariant to translation, rotation, and scaling.

In other words, we formulate this system of vector equations:

a dot (v - z) / |a|^2 = c dot v / |c|^2

b dot (v - z) / |b|^2 = d dot v / |d|^2

If you expand the above two equations, you get a system of 2 linear equations in 2 unknowns (i.e. of the form ax+by=e, cx+dy=f), which is easily solvable by substitution or by inverting a 2x2 matrix. You will have to take care of three cases: a unique solution, infinitely many solutions filling the plane, and infinitely many solutions along a line.
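
For the unique-solution case, the expanded system can be solved directly with Cramer's rule. A minimal sketch (the function name and the example coefficients are mine, not from the problem); the degenerate determinant cases still need the separate handling described above:

```python
def solve2x2(a, b, e, c, d, f, eps=1e-12):
    """Solve ax + by = e, cx + dy = f by Cramer's rule.
    Returns (x, y), or None when the determinant vanishes
    (no unique solution; handle those cases separately)."""
    det = a * d - b * c
    if abs(det) < eps:
        return None
    return ((e * d - b * f) / det, (a * f - c * e) / det)
```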

If the solution lies on a line, you can simply obtain a point solution by trying to intersect it with each of the 4 sides of the smaller rectangle.

Complexity

Time complexity: O(1)

Alternate Solution

Initialize v arbitrarily and compute f(f(f(....f(v)))), where f is the mapping function from the large rectangle to the small one. Since the small rectangle is strictly smaller, f shrinks distances by a constant factor, i.e. it is a contraction, so the iteration is guaranteed to converge to the unique fixed point, which is our solution.
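
The contraction argument can be seen on a toy mapping (the specific map below is made up for illustration: scale by 0.5 about the origin, then translate by (1, 1); its fixed point is (2, 2)):

```python
def iterate_map(f, v, iters=60):
    # repeatedly apply f; for a contraction this converges
    # to the unique fixed point regardless of the start
    for _ in range(iters):
        v = f(v)
    return v

# toy contraction with fixed point (2, 2)
f = lambda p: (0.5 * p[0] + 1.0, 0.5 * p[1] + 1.0)
x, y = iterate_map(f, (123.0, -7.0))
```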

Problem B - Buffaloes

Problem Summary

Given n <= 60, find the number of permutations of [1..n] whose longest decreasing subsequence (LDS) is not of length 3 or 4.

One idea that comes up right away is to try to count permutations with LDS of length 3 or 4, and then do n! minus that.

Solution

Let's try to get the solution for n by choosing the first number and then reducing to the solution for n-1. The key question to ask is: what do we need to know about the decreasing subsequences of a permutation of [1..n] to find the LDS after extending it? If we are adding X to the start of the sequence, we only need to care about decreasing subsequences that start with a value <= X. To reason about decreasing subsequences of length Z, we of course also need to know about decreasing subsequences of length Z-1. In other words, the sufficient information is the *minimum start* of the decreasing subsequences of each length we care about.

With the above observation, we perform the following dp:

dp[n][i][j][k] = the number of permutations of [1..n] such that the minimum start of a decreasing subsequence (DS) of length 2 is i, the min start of a DS of length 3 is j, the min start of a DS of length 4 is k, and there is no DS of length 5 or more.

To transition from n to n+1, we try to insert each x in [1, n+1] at the start of the sequence, adding 1 to every existing number >= x. Notice that if x == 1 then i, j, k do not change, as we cannot make any new DS of length 2, 3, or 4. If 1 < x <= i, then i becomes x, as [x, 1] is a new DS of length 2. If i < x <= j, then we can take a 2-DS starting with i and prepend x to it, making a 3-DS starting with x <= j, so j becomes x. Similarly, if j < x <= k then k becomes x. If k < x then we would make a DS of length 5, so we skip this case.
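
The transition above can be written down directly with a dictionary over (i, j, k) states (a sketch; SENT is my sentinel for "no such subsequence", and the answer combines LDS <= 2 with LDS >= 5):

```python
from math import factorial

SENT = 10**9  # "no DS of this length exists" (never bumped)

def solve(n):
    """Permutations of [1..n] whose LDS length is not 3 or 4."""
    bump = lambda v, x: v if v == SENT else (v + 1 if v >= x else v)
    dp = {(SENT, SENT, SENT): 1}          # the permutation [1]
    for m in range(1, n):                 # grow length m -> m + 1
        ndp = {}
        for (i, j, k), cnt in dp.items():
            for x in range(1, m + 2):     # value inserted at the front
                bi, bj, bk = bump(i, x), bump(j, x), bump(k, x)
                if bk < x:                # x + a 4-DS would be a 5-DS
                    continue
                ni = min(bi, x) if x >= 2 else bi   # [x, smaller] is a 2-DS
                nj = min(bj, x) if bi < x else bj   # x + a 2-DS -> 3-DS
                nk = min(bk, x) if bj < x else bk   # x + a 3-DS -> 4-DS
                key = (ni, nj, nk)
                ndp[key] = ndp.get(key, 0) + cnt
        dp = ndp
    lds_le_2 = sum(c for (i, j, k), c in dp.items() if j == SENT)
    lds_le_4 = sum(dp.values())           # the dp forbids any 5-DS
    return lds_le_2 + (factorial(n) - lds_le_4)
```

For small n this agrees with brute force, e.g. solve(3) = 5 (only 321, with LDS 3, is excluded).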

Complexity

Time complexity: O(n^5), but runs very fast in practice.

Problem C - Cahokia Ruins

Problem Summary

Each row has two walls spaced x units apart. The walls move toward each other at 1 unit/second. How many seconds pass before the pair of walls in some row collides?

Solution

Output the minimum of x/2 over all rows.
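
The whole solution is one line (here `gaps` holds the per-row wall distances; each pair of walls closes at a combined 2 units/second):

```python
def collision_time(gaps):
    # each pair of walls closes the gap at 1 + 1 = 2 units/second
    return min(gaps) / 2
```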

Complexity

Time complexity: O(H)

Problem D - Black Hills Golden Jewels

Problem Summary

Given a list of numbers A[1]...A[N], N <= 100k, find the k-th smallest sum over all pairs of these numbers.

Solution

We first sort A. Then, we binary search on the answer value s. To check a candidate s, we count the pairs with sum <= s by iterating over A: for each A[i], binary search for how many numbers are <= s - A[i], excluding A[i] itself. Remember to divide by 2, as this counts each pair twice. The answer is the smallest s whose count is at least k.
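
A sketch of the double binary search (function and variable names are mine):

```python
from bisect import bisect_right

def kth_pair_sum(a, k):
    """k-th smallest of a[i] + a[j] over i < j (k is 1-indexed)."""
    a = sorted(a)

    def pairs_leq(s):
        # count pairs with sum <= s; each pair is seen twice
        total = 0
        for x in a:
            c = bisect_right(a, s - x)
            if x <= s - x:          # don't pair an element with itself
                c -= 1
            total += c
        return total // 2

    lo, hi = 2 * a[0], 2 * a[-1]
    while lo < hi:                  # smallest s with pairs_leq(s) >= k
        mid = (lo + hi) // 2
        if pairs_leq(mid) >= k:
            hi = mid
        else:
            lo = mid + 1
    return lo
```

For a = [1, 2, 3, 4] the pair sums in sorted order are 3, 4, 5, 5, 6, 7.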

Complexity

Time complexity: O(N log N log 10^9)

Problem E - A Word to Trump All

Problem Summary

Given N bad words and M good words made with letters a to j, N+M <= 15, total length of all strings <= 300, find the shortest string that contains all N bad words as substrings but none of the M good words.

Solution

Solution idea is by Paul but he didn't end up implementing it.

This problem sounds like it is related to string matching. In particular, when we are building the answer string, we want to, in theory, keep track of how much it matches each of the N+M strings, so that we know when we complete a bad word and can avoid adding letters that complete a good word.

When we are matching against a dictionary of strings, the Aho Corasick algorithm is a standard choice. Briefly, the idea is that we build a trie of the dictionary words, and then as we walk along a string, we move to the trie node corresponding to the longest prefix of any dictionary word that matches a suffix of the part of the string we've walked on so far. Please read up on the details of the algorithm.

We notice that the Aho-Corasick trie node encompasses everything we need to know about the next string that could possibly match. In other words, we can perform a BFS to build the answer, trying every possible letter to add at each stage, with the state being {trie node, set of bad words included so far}. For each BFS node we keep its parent and the letter added when we visited it, and when we reach a state where we've included all bad words, we back-trace the BFS nodes to output the string.

Optimizations: for each trie node, you need to pre-compute whether any of its suffixes is a good word, a bitmask of all the bad words contained in the string represented by the trie node, and, for each possible letter to add, the next trie node to jump to.
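
A compact sketch of the whole pipeline (trie, fail links, precomputed goto table, then BFS over (node, mask) states; all names are mine, and I expose the alphabet as a parameter to keep the tests small):

```python
from collections import deque

def shortest_covering(bad, good, alphabet="abcdefghij"):
    # --- trie over all dictionary words ---
    children, fail, bmask, is_good = [{}], [0], [0], [False]

    def insert(w):
        v = 0
        for ch in w:
            if ch not in children[v]:
                children.append({}); fail.append(0)
                bmask.append(0); is_good.append(False)
                children[v][ch] = len(children) - 1
            v = children[v][ch]
        return v

    for i, w in enumerate(bad):
        bmask[insert(w)] |= 1 << i
    for w in good:
        is_good[insert(w)] = True

    # --- fail links + goto table, built by BFS over the trie ---
    goto = [dict() for _ in children]
    q = deque()
    for ch in alphabet:
        goto[0][ch] = children[0].get(ch, 0)
        if ch in children[0]:
            q.append(children[0][ch])
    while q:
        v = q.popleft()
        bmask[v] |= bmask[fail[v]]           # inherit matches from suffixes
        is_good[v] = is_good[v] or is_good[fail[v]]
        for ch in alphabet:
            if ch in children[v]:
                u = children[v][ch]
                fail[u] = goto[fail[v]][ch]
                goto[v][ch] = u
                q.append(u)
            else:
                goto[v][ch] = goto[fail[v]][ch]

    # --- BFS over states (trie node, bitmask of bad words seen) ---
    full = (1 << len(bad)) - 1
    if full == 0:
        return ""
    start = (0, 0)
    parent = {start: None}
    bfs = deque([start])
    while bfs:
        v, mask = bfs.popleft()
        for ch in alphabet:
            u = goto[v][ch]
            if is_good[u]:                   # would complete a good word
                continue
            state = (u, mask | bmask[u])
            if state in parent:
                continue
            parent[state] = (v, mask, ch)
            if state[1] == full:             # done: back-trace the letters
                out = []
                while parent[state] is not None:
                    pv, pm, pch = parent[state]
                    out.append(pch)
                    state = (pv, pm)
                return "".join(reversed(out))
            bfs.append(state)
    return None                              # no valid string exists
```

For example, with bad words {"ab", "ba"} and good word {"aba"} over the alphabet {a, b}, the answer is "bab".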

Complexity

Time complexity: O(2^N * L * 10), where L <= 300 is the total length of all words (an upper bound on the number of trie nodes)

Problem F - Metal Detector

Problem Summary

Simulate a queue where every other person moves to the back of the queue (in order) and the remaining ones are removed. Given initial position i, find the number of people removed before the i-th person is removed.

Solution

Each "cycle" of the simulation involves roughly halving the queue length and roughly halving the position of our person, or terminating. The devil is in the details, as many solutions had off-by-one errors that caused Wrong Answer.

Basically, the key to getting this problem correct is that we have to keep track of 3 variables:

1. the size of the queue

2. the position of our person

3. whether or not the first person in the queue is removed vs. going to the back

and we have to make sure our code handles the following 4 cases correctly:

1. size of queue is even, first person is removed

2. size of queue is even, first person goes to the back

3. size of queue is odd, first person is removed

4. size of queue is odd, first person goes to the back
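
Before fighting the O(log N) case analysis, a brute-force simulation is handy as a reference (I assume the person at the front of the initial queue is removed first; flip the initial flag for the opposite convention):

```python
from collections import deque

def removal_order(n):
    """People 1..n; alternately remove the front person and send
    the front person to the back, starting with a removal."""
    q = deque(range(1, n + 1))
    order, remove = [], True
    while q:
        p = q.popleft()
        if remove:
            order.append(p)
        else:
            q.append(p)
        remove = not remove
    return order
```

Then order.index(i) is the number of people removed before person i; for n = 5 the removal order is [1, 3, 5, 4, 2].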

Complexity

Time complexity: O(log N)

Problem G - The Declaration of Independence

Problem Summary

Simulate a persistent queue. On the i-th query, we make version i of the queue from some version v < i by either adding to the end or removing from the beginning; in the latter case, we output the removed element.

Solution

One can, of course, use an actual persistent queue. A persistent balanced binary search tree can implement a persistent queue in O(log n) time per operation. See my code for an example of how to use a persistent AVL tree to solve this problem.

Persistent data structures are often complex and time-consuming to implement, so the intended solution is to do the problem offline. We process the queries and build a tree of all elements ever inserted. As we process the queries, we keep track, for each version, of the node corresponding to the end of the queue, and of how far from the root the start of the queue is. For an addition, we simply attach a new node to the last node of the previous version. For a deletion, we increment the "distance from root" counter. Then we perform a DFS of the entire tree, keeping track of the list of ancestors of the current node. If some delete query's version ends at the current node, we simply pick out the correct ancestor as the answer.
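
A sketch of the offline approach (the query format and all names are mine): the i-th query builds version i+1 from an earlier version, and all pop answers are resolved by a single DFS at the end.

```python
from collections import defaultdict

def persistent_queue(queries):
    """queries: ('add', v, x) or ('pop', v); the i-th query builds
    version i + 1 from version v (version 0 is the empty queue).
    Returns the popped values, in pop-query order."""
    vals, kids = [None], [[]]        # node 0 is a dummy root
    tail, head = [0], [0]            # per version: end node, #pops so far
    pending = defaultdict(list)      # tree node -> [(query idx, depth)]
    ans = {}
    for qi, query in enumerate(queries):
        if query[0] == 'add':
            _, v, x = query
            vals.append(x); kids.append([])
            kids[tail[v]].append(len(vals) - 1)
            tail.append(len(vals) - 1); head.append(head[v])
        else:
            _, v = query
            pending[tail[v]].append((qi, head[v]))
            tail.append(tail[v]); head.append(head[v] + 1)
    # DFS, answering each pending query from the current ancestor path
    path, stack = [], [(0, False)]
    while stack:
        node, leaving = stack.pop()
        if leaving:
            path.pop()
            continue
        if node != 0:
            path.append(vals[node])
            stack.append((node, True))
        for qi, d in pending[node]:
            ans[qi] = path[d]        # d-th element (0-indexed) from the front
        for c in kids[node]:
            stack.append((c, False))
    return [ans[qi] for qi in sorted(ans)]
```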

Complexity

Time complexity: O(N) or O(N log N)

Problem H - Pop Divas

Problem Summary

Given 2 lists of <= 16 numbers each, call them A and B, find the minimum number that is both a product of a subset of A and a product of a subset of B.

Solution

Since |A|, |B| <= 16, we can simply brute force over all subsets. With big integers, we can simply compute the product for all subsets of A, put them in a set, and then for each subset of B, compute its product and check if it's in the set of the products for A.

What if you don't have big integers? You can use a hash table and compute each product mod some large prime. If hash collisions occur you will need to check for actual equality, which you can do by comparing prime factorizations.
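
In Python, big integers are built in, so the set approach is direct (a sketch; I assume the subsets must be non-empty):

```python
def min_common_product(A, B):
    def nonempty_products(lst):
        all_p = {1}          # subset products, including the empty subset
        nonempty = set()
        for v in lst:
            new = {p * v for p in all_p}
            nonempty |= new   # products of subsets that contain v
            all_p |= new
        return nonempty
    common = nonempty_products(A) & nonempty_products(B)
    return min(common) if common else None
```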

Complexity

Time complexity: O(2^N + 2^M * N*log 10^9)

Problem I - Protecting the Central Park

Problem Summary

Decompose a connected graph with an even number of edges into pairs of adjacent edges (edges sharing a vertex).

Solution (Courtesy of Paul)

The problem tells us that this is always possible if the graph is connected and has an even number of edges. This enables us to prove that the following DFS strategy works:

We perform a DFS. At each node, we recurse on each child and have it report back whether some edge (other than the edge to its parent) incident to that child is unmatched. This leaves us with a star centered at the current node with paths of length 1 or 2 coming out of it. For each path of length 2, we pair up its 2 edges. The paths of length 1, we pair up with each other, and we report back the one remaining edge if it exists.

This strategy can be proven by induction: the base case is a single vertex with no edges, and the induction case is the subgraph formed by the subtree of the current node of DFS tree.
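A sketch of this strategy (names are mine; recursion is fine for a sketch, but an explicit stack is safer for large graphs):

```python
import sys

def pair_up_edges(n, edges):
    """Pair up the edges of a connected graph (even edge count) so that
    the two edges of each pair share a vertex. Returns pairs of edge ids."""
    sys.setrecursionlimit(10000)     # enough for a sketch
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    claimed = [False] * len(edges)   # each edge is handled exactly once
    visited = [False] * n
    pairs = []

    def dfs(u):
        visited[u] = True
        loose = []                   # length-1 paths hanging off u
        for w, e in adj[u]:
            if claimed[e]:
                continue
            claimed[e] = True
            if not visited[w]:
                dangling = dfs(w)    # unmatched edge at w, if any
                if dangling is not None:
                    pairs.append((e, dangling))   # length-2 path u-w-x
                else:
                    loose.append(e)
            else:
                loose.append(e)      # back edge: a length-1 path at u
        while len(loose) >= 2:       # pair up length-1 paths at u
            pairs.append((loose.pop(), loose.pop()))
        return loose[0] if loose else None

    dfs(0)   # reports nothing back when the edge count is even
    return pairs
```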

Complexity

Time complexity: O(N+M)

Problem J - King of Tokyo

Problem Summary

You have a fair die with faces 123AVE. You roll 6 dice. If you have k >= 3 dice showing the same face x, you score x + (k-3). You can choose to re-roll some of the dice, up to t <= 4 times. Given the current roll, determine the optimal set of dice to re-roll.

Solution

The obvious solution is dp[dice roll][t] = optimal score and optimal re-roll choice given the current dice roll and t re-rolls left. For the transition, you try each of the 2^6 = 64 possible re-roll choices, look at every possible re-roll result, and take the best of the 64 choices.

Naively, you might think there are 6^6 = 46656 different dice rolls, and given the number of re-rolls you might have a dp run time in the billions! However, that's not true. Since the ordering of the dice doesn't matter for scoring, you only have to consider the count of each face, which reduces the state space to 462 configurations. Thus, you actually have something like 462 * 64 * 46656 operations, which is actually somewhat manageable.
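
The 462 figure is just the number of multisets of 6 faces drawn from 6 symbols, which stars and bars gives as C(6+6-1, 6):

```python
from math import comb
from itertools import combinations_with_replacement

faces = "123AVE"
# unordered outcomes of rolling six 6-sided dice
configs = sum(1 for _ in combinations_with_replacement(faces, 6))
# stars and bars: C(6 + 6 - 1, 6) = C(11, 6)
```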

I implemented the above dp but it TLE'd so I precomputed the answer (a dp table of size 462 * 4) and submitted that.

Complexity

Time complexity: O(1)

Problem K - Mount Rushmore and Birthdays

Problem Summary

Find the minimum number of picks out of N equally likely choices such that the probability that two of them are the same is > 50%.

Solution

Simply compute 1 - (n/n * (n-1)/n * (n-2)/n * ...), increasing the number of terms until the total exceeds 0.5. Make sure to handle edge cases (e.g. n = 1 or 2).
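
A sketch of the incremental product (the function name and loop structure are mine; note the pigeonhole cut-off, which is what handles the small-n edge cases):

```python
def min_picks(n):
    """Smallest k such that k picks out of n equally likely choices
    collide with probability > 1/2."""
    p_distinct, k = 1.0, 0
    while True:
        k += 1
        if k > n:                    # pigeonhole: a collision is certain
            return k
        p_distinct *= (n - (k - 1)) / n
        if 1 - p_distinct > 0.5:
            return k
```

For n = 365 this recovers the classic birthday-paradox answer of 23.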

Complexity

Time complexity: O(1)

Problem L - The Knapsack Problem

Problem Summary

Solve the knapsack problem where the total weight is S <= 1 billion, but the weight of each item is <= 1000 and there are <= 1000 items.

Solution

Since the weight of each item is small, you'd think you can be greedy past some point by picking the most value-efficient item repeatedly. This leads to the following heuristic strategy: compute the knapsack dp for as many capacities as you can afford, and then, for each capacity you computed, fill in the rest with the most efficient item. This alone does not quite work and fails test case 93. No worries: add another hacky greedy (courtesy of Paul) of "dividing up the total into equal blocks of <= 1000 and using the knapsack answer for that" and you will get AC.
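
The first heuristic can be sketched as follows (I assume the unbounded variant, which "picking the most value-efficient item a lot" suggests; dp_cap and the names are mine, and as noted above this alone is not sufficient for AC):

```python
def knapsack_heuristic(S, items, dp_cap=200_000):
    """items: (weight, value) pairs. Exact unbounded-knapsack dp on
    capacities up to dp_cap, then greedily fill the leftover capacity
    with the most value-dense item. A heuristic, not an exact solver."""
    C = min(S, dp_cap)
    dp = [0] * (C + 1)
    for c in range(1, C + 1):
        dp[c] = max((dp[c - w] + v for w, v in items if w <= c), default=0)
    bw, bv = max(items, key=lambda it: it[1] / it[0])
    # for every computed capacity c, pack best-density copies into S - c
    return max(dp[c] + (S - c) // bw * bv for c in range(C + 1))
```

When S <= dp_cap the dp covers the whole capacity, so the result is exact in that regime.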

Complexity

Time complexity: O(time limit / N) :D