Alex Raichev

Store Credit

Here's the store credit problem from Google Code Jam Qualification Round Africa 2010, which I discovered via Programming Praxis:

You receive a credit C at a local store and would like to buy two items. You first walk through the store and create a list L of all available items. From this list you would like to buy two items that add up to the entire value of the credit. The solution you provide will consist of the two integers indicating the positions of the items in your list (smaller number first). For instance, with C=100 and L={5,75,25} the solution is 2,3; with C=200 and L={150,24,79,50,88,345,3} the solution is 1,4; and with C=8 and L={2,1,9,4,4,56,90,3} the solution is 4,5.

Algorithm 1

Brute force. Test all pairs in the list. O(n^2) time and O(n) space.
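The brute-force check takes only a few lines. Here's a sketch (the function name and the convention of returning None when no pair exists are mine):

```python
from itertools import combinations

def find_pair_brute(credit, prices):
    # Try every pair of positions (1-based, smaller index first).
    # combinations() yields pairs in index order, so i < j always.
    for (i, pi), (j, pj) in combinations(enumerate(prices, 1), 2):
        if pi + pj == credit:
            return i, j
    return None
```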

Algorithm 2

Sort the list and then iterate over the list, at each step looking for the complementary price of the current price via binary search. O(n lg n) time and O(n) space.
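A sketch of this approach, using Python's bisect module for the binary search (the function name and the None-on-failure convention are mine). Note that the original 1-based indices have to be carried through the sort:

```python
import bisect

def find_pair_sorted(credit, prices):
    # Sort (price, original 1-based index) pairs by price.
    items = sorted((p, i + 1) for i, p in enumerate(prices))
    keys = [p for p, _ in items]
    for pos, (p, i) in enumerate(items):
        # Binary-search for the complementary price strictly to the
        # right of pos, so each pair is considered once and an item
        # never pairs with itself.
        j = bisect.bisect_left(keys, credit - p, pos + 1)
        if j < len(items) and keys[j] == credit - p:
            a, b = i, items[j][1]
            return min(a, b), max(a, b)
    return None
```

Searching only to the right of the current position also handles duplicate prices, such as the two items priced 4 in the third example.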

Algorithm 3

Store the prices and indices in a dictionary and then iterate over the dictionary, at each step looking for the complementary price. Since dictionary look-up is constant time on average, this is an O(n) time (in the average case) and O(n) space algorithm. Hooray!

Here's a Python 2.7 implementation:

def find_pair(credit, prices):
    # Make a dictionary with items of the form
    # (price, set of indices of items with that price).
    # Note that the price list could contain duplicate prices.
    d = dict()
    for (i, p) in enumerate(prices):
        d.setdefault(p, set()).add(i + 1)  # + 1 to index from 1 instead of 0.
    for price in list(d):
        if price <= credit:
            # Remove a price index from d.
            i = d[price].pop()
            # Check whether a complementary price remains in d.
            # The emptiness test matters when price == credit - price
            # and only one item has that price.
            if credit - price in d and d[credit - price]:
                j = d[credit - price].pop()
                return min(i, j), max(i, j)
            # Reinsert price index into d.
            d[price].add(i)
    return "Sorry, no pair of item costs sums to %s" % credit
Author: Alex Raichev
Date: 2013-04-11
Tags: algorithm, puzzle
Permalink, Comment

Cat Vs. Dog

Updated for clarity 23 Sep 2013.

A friend of mine recently pointed me to Spotify's Cat vs. Dog puzzle. The question is long, and so I'll summarize it without its motivating context.

You are given n votes, each of which is a pair of strings of the form (Ci, Dj) (a cat-lover vote) or (Dj, Ci) (a dog-lover) for some integers 1 ≤ i, j ≤ n. This collection of votes can contain repeats. Two votes (a, b) and (c, d) are said to conflict if a = d or b = c. So a conflict can only occur between a cat-lover vote and a dog-lover vote. Your job is to find the size of a maximal subcollection of the votes in which no two votes conflict.

Algorithm 1

Brute-force. Run through all possible subcollections of votes and test each for conflicts. Choose the biggest non-conflicting subcollection.

This procedure would take O(2^n) time in the worst case, because there are 2^n subcollections to check. That's very slow. Can we be more clever?

Algorithm 2

Yes, model the problem with a graph.

  1. Distinguish the votes by tagging each with a unique identifier, such as an integer between 1 and n. This takes O(n) time.
  2. Build the following undirected graph G. Make each vote a vertex, and add the edge {u, v} if u and v are conflicting votes. This takes O(n^2) time. Notice that G is bipartite, because every edge connects a cat-lover vote to a dog-lover vote and no vertex is both a cat-lover vote and a dog-lover vote.
  3. Notice that every maximum subcollection of nonconflicting votes is the complement of a minimum vertex cover for G. So find the size k of a minimum vertex cover for G and return n − k. Now, finding the size of a minimum vertex cover in an arbitrary undirected graph is an NP-complete problem, and there's no known sub-exponential time way to do that. However, our graph is bipartite, and so by Koenig's theorem, the size of a minimum vertex cover of G equals the size of a maximum matching for G, the latter of which can be found by the Hopcroft-Karp algorithm in O(|E| sqrt(n)) time, where |E| is the number of edges in G. In the worst case, |E| = O(n^2) and running Hopcroft-Karp takes O(n^2.5) time.

So all up, Algorithm 2 takes O(n^2.5) time in the worst case.
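The modeling above can be sketched in Python. As a simplification I use Kuhn's augmenting-path algorithm for maximum bipartite matching instead of Hopcroft-Karp; it finds the same matching size, just in O(n |E|) time rather than O(|E| sqrt(n)). The function name and the vote encoding as pairs of strings like ("C1", "D2") are my own assumptions:

```python
def count_nonconflicting(votes):
    # Split the votes; a vote is a pair like ("C1", "D2") or ("D2", "C1").
    cats = [v for v in votes if v[0][0] == 'C']
    dogs = [v for v in votes if v[0][0] == 'D']
    # Conflict edges of the bipartite graph G: votes (a, b) and (c, d)
    # conflict when a == d or b == c.
    adj = [[j for j, d in enumerate(dogs) if c[0] == d[1] or c[1] == d[0]]
           for c in cats]
    match = [-1] * len(dogs)  # dog-vote index -> matched cat-vote index

    def augment(u, seen):
        # Try to match cat vote u, rematching earlier cats along the way.
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                if match[v] == -1 or augment(match[v], seen):
                    match[v] = u
                    return True
        return False

    matching = sum(augment(u, set()) for u in range(len(cats)))
    # By Koenig's theorem, min vertex cover size == matching size.
    return len(votes) - matching
```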

Author: Alex Raichev
Date: 2013-04-02
Tags: algorithm, puzzle
Permalink, Comment

Jumping Jack

I'd like to present some solutions to the following puzzle that appeared as March 22, 2013 Programming Praxis puzzle.

Jack starts at 0 on the number line and jumps one unit in either direction. Each successive jump he makes is longer than the previous by 1 unit, and can be made in either direction. Design an algorithm that takes an integer n and returns a shortest sequence of jumps that Jack can perform to reach n along with the length of that sequence.

Algorithm 1

Brute-force. Perform a breadth-first search through the binary tree of all valid jump/integer summand sequences (0, -1, 1, -1 - 2, -1 + 2, 1 - 2, 1 + 2, ...) and stop when the target sum n is reached.
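A sketch of this search (the function name is mine). The queue grows exponentially, so this is only practical for small n:

```python
from collections import deque

def jack_bfs(n):
    # Breadth-first search over jump sequences; the j-th jump has length j.
    queue = deque([(0, 0, [])])  # (position, jumps made, jumps so far)
    while queue:
        pos, depth, path = queue.popleft()
        if pos == n:
            return path
        step = depth + 1
        queue.append((pos - step, depth + 1, path + [-step]))
        queue.append((pos + step, depth + 1, path + [step]))
```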

Algorithm 2

Brute-force with pruning. Label the nodes of the binary tree with their partial sums and order the nodes so that the left child is less than the right child (0, -1, 1, -3, -1, -1, 3, ...). Notice that every pair of nodes at the same height with equal partial sums have equal subtrees. Notice also that every pair of nodes at the same height with partial sums of opposite sign have mirror image subtrees. Notice also that flipping the signs/turns of a path with partial sum s yields a path of the same length to -s. So in our breadth-first search, we don't need to search down a node at a given height if we've already discovered a node at the same height with the same magnitude partial sum.
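The pruned search might look like this (the function name is mine). Each node is keyed by (height, magnitude of partial sum), and by the mirror-image observation it suffices to search for |n| and flip signs at the end if needed:

```python
from collections import deque

def jack_bfs_pruned(n):
    target = abs(n)
    seen = {(0, 0)}          # (height, |partial sum|) pairs already visited
    queue = deque([(0, 0, [])])  # (position, jumps made, jumps so far)
    while queue:
        pos, depth, path = queue.popleft()
        if abs(pos) == target:
            # Flip all signs if we reached -n instead of n.
            return path if pos == n else [-j for j in path]
        step = depth + 1
        for nxt in (pos - step, pos + step):
            key = (depth + 1, abs(nxt))
            if key not in seen:
                seen.add(key)
                queue.append((nxt, depth + 1, path + [nxt - pos]))
```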

Analyzing the worst-case time complexity of Algorithm 1 or Algorithm 2 requires knowing at what tree height the given number n is first discovered. If you can figure that out, then you probably discovered...

Algorithm 3

Smart-force. Find the least integer k such that t_k := 1 + 2 + 3 + ... + k >= |n| and t_k has the same parity (evenness/oddness) as |n|. Then flip the signs of the summands of t_k in a systematic way to total n.

Here are the details. Notice that t_k = k(k + 1)/2, a triangular number. Setting t_k = |n| and solving for k gives (-1 + sqrt(1 + 8|n|))/2. So k equals the ceiling of that expression, or 1 or 2 more than the ceiling, whichever of those three numbers is least with t_k of the same parity as |n|. Here we're using the fact that the set {t, t + s, t + s + (s + 1)} of three consecutive triangular numbers contains both even and odd numbers for any positive integers t, s. (Simply check all four parity possibilities for t, s.)

To find a minimal length sum for n, first compute d := (t_k - n)/2. Because t_k >= |n| and t_k and n have the same parity, d lies in the set {0, 1, 2, ..., t_k}. Now repeatedly subtract: d = a_k (k) + r_k, r_k = a_{k-1} (k-1) + r_{k-1}, ..., r_2 = a_1 (1) + r_1, choosing each a_i maximally in {0, 1}. By the lemma below, r_1 = 0. So d = sum_{i=1}^k a_i i. Thus n = t_k - 2d = sum_{i=1}^k i - sum_{i=1}^k 2a_i i = sum_{i=1}^k (1 - 2a_i) i and 1 - 2a_i in {-1, 1}. So the sequence b_i := 1 - 2a_i is a path, and by the minimality of k, b_i is a minimal path.
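Putting the pieces together, Algorithm 3 might be sketched in modern Python as follows (math.isqrt needs Python 3.8+; the function name is mine):

```python
import math

def shortest_jumps(n):
    # t_k = k(k + 1)/2 is the k-th triangular number.
    m = abs(n)
    # Least k with t_k >= m, starting from floor((-1 + sqrt(1 + 8m))/2).
    k = (math.isqrt(8 * m + 1) - 1) // 2
    while k * (k + 1) // 2 < m:
        k += 1
    # Bump k (at most twice) until t_k has the same parity as m.
    while (k * (k + 1) // 2 - m) % 2:
        k += 1
    d = (k * (k + 1) // 2 - m) // 2
    # Greedily write d as a sum of distinct summands from {1, ..., k},
    # taking each summand maximally, and flip the signs of those summands.
    jumps = []
    for i in range(k, 0, -1):
        if d >= i:
            d -= i
            jumps.append(-i)
        else:
            jumps.append(i)
    jumps.reverse()
    # A path to |n| flips to a path to n.
    return [-j for j in jumps] if n < 0 else jumps
```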

Algorithm 3 example

Consider the target number n=12. According to Algorithm 3, the possibilities for k are 5, 6, 7. The corresponding values of t_k are 15, 21, and 28. Since 28 is the least of these with the same parity as n, we see that k=7. So d = (t_k - n)/2 = 8, which we write as 1 + 7 according to the algorithm. Thus a shortest path to 12 is -1 + 2 + 3 + 4 + 5 + 6 - 7.

I say a shortest path, because shortest paths aren't unique in general. For example, 1 + 2 -3 + 4 - 5 + 6 + 7 also works.

Algorithm 3 correctness

Lemma: Let A_k = {0, 1, 2, ..., t_k}. Then a number lies in A_k if and only if it can be expressed as a sum sum_{i=1}^k a_i i for some a_i in {0, 1}.

Proof: By induction on k. For k = 0, 0 = sum_{i=1}^0 a_i i, the empty sum. Now suppose k >= 1, the result holds for all smaller values, and d in A_k. Repeatedly subtract: d = a_k (k) + r_k, r_k = a_{k-1} (k-1) + r_{k-1}, ..., choosing each a_i maximally in {0, 1} and stopping at the first r_j in A_j for some j < k. Then by the induction hypothesis, r_j = sum_{i=1}^j b_i i for some b_i in {0, 1}. So d = r_j + sum_{i=j+1}^k a_i i, as desired. Conversely, a sum s := sum_{i=1}^k a_i i for a_i in {0, 1} satisfies 0 <= s <= sum_{i=1}^k i = t_k, and so s in A_k.

Algorithm 3 time complexity

Assuming that arithmetic operations are constant time, it takes O(1) time to compute k from n, and hence the length of a minimal path to n. Then it takes O(k) time to find a minimal length sum for d and O(k) time to use that sum to produce a minimal length sum for n. So O(k) = O(sqrt(n)) time all up.

Author: Alex Raichev
Date: 2013-03-28
Tags: algorithm, puzzle
Permalink, Comment


Why no comments? I used to do public comments but found that moderating and maintaining them took too much time in front of the computer, time better spent playing outdoors. So these days I only do private comments, that is, you can email me comments regarding a post by clicking the 'Comment' link at the bottom of the post.