# Cat Vs. Dog

Updated for clarity 23 Sep 2013.

A friend of mine recently pointed me to Spotify's Cat vs. Dog puzzle. The question is long, and so i'll summarize it without its motivating context.

You are given n votes, each of which is a pair of strings of the form (*Ci*, *Dj*) (a cat-lover vote) or (*Dj*, *Ci*) (a dog-lover) for some integers 1 ≤ *i*, *j* ≤ *n*.
This collection of votes can contain repeats.
Two votes (*a*, *b*) and (*c*, *d*) are said to conflict if *a* = *d* or *b* = *c*.
So a conflict can only occur between a cat-lover vote and a dog-lover vote.
Your job is to find the size of a maximal subcollection of the votes in which no two votes conflict.

## Algorithm 1

Brute-force. Run through all possible subcollections of votes and test each for conflicts. Choose the biggest non-conflicting subcollection.

This procedure would take *O*(2^{n}) time in the worst case, because there are 2^{n} subcollections to check.
That's very slow.
Can we be more clever?

## Algorithm 2

Yes, model the problem with a graph.

- Distinguish the votes by tagging each with a unique identifier, such as an integer between 1 and
*n*. This takes*O*(*n*) time. - Build the following undirected graph
*G*. Make each vote a vertex, and add the edge {*u*,*v*} if*u*and*v*are conflicting votes. This takes*O*(*n*^{2}) time. Notice that*G*is bipartite, because every edge connects a cat-lover vote to a dog-lover vote and no vertex is both a cat-lover vote and a dog-lover vote. - Notice that every maximal subset of nonconflicting votes is the complement of a minimum vertex cover for
*G*. So find the size*k*of a minimum vertex cover for*G*and return*n*−*k*. Now, finding the size of a minimum vertex cover in an arbitrary undirected graph is an NP-complete problem, and there's no known sub-exponential time way to do that. However, our graph is bipartite, and so by Koenig's theorem, the size of a minimum vertex cover of*G*equals the size of a maximum matching for*G*, the latter of which can be found by the Hopcroft-Karp algorithm in*O*(|*E*|√(*n*)) time, where |*E*| is the number of edges in*G*. In the worst case, |*E*| =*n*^{2}and running Hopcroft-Karp takes*O*(*n*^{2.5}) time.

So all up, Algorithm 2 takes *O*(*n*^{2.5}) time in the worst case.

# Contours: A Mathematical Coloring Book

Here is a mathematical coloring book I recently made. It contains black and white pictures generated by mathematical formulas which you color in for fun. Best printed on size A4 paper or bigger. Enjoy and share!

# Jumping Jack

I'd like to present some solutions to the following puzzle that appeared as March 22, 2013 Programming Praxis puzzle.

Jack starts at 0 on the number line and jumps one unit in either direction. Each successive jump he makes is longer than the previous by 1 unit, and can be made in either direction. Design an algorithm that takes an integer n and returns a shortest sequence of jumps that Jack can perform to reach n along with the length of that sequence.

## Algorithm 1

Brute-force. Perform a breadth-first search through the binary tree of all valid jump/integer summand sequences (0, -1, 1, -1 - 2, -1 + 2, 1 - 2, 1 + 2, ...) and stop when the target sum n is reached.

## Algorithm 2

Brute-force with pruning. Label the nodes of the binary tree with their partial sums and order the nodes of so that the left child is less than the right child (0, -1, 1, -3, -1, -1, 3, ...). Notice that every pair of nodes at the same height with equal partial sums have equal subtrees. Notice also that every pair of nodes at the same height with partial sums of opposite sign have mirror image subtrees. Notice also that flipping the signs/turns of a path with partial sum s yields a path of the same length to -s. So in our breadth-first search, we don't need to search down a node at a given height if we've already discovered a node at the same height with the same magnitude partial sum.

Analyzing the worst-case time complexity of Algorithm 1 or Algorithm 2 requires knowing at what tree height the given number n is first discovered. If you can figure that out, then you probably discovered...

## Algorithm 3

Smart-force. Find the least integer k such that t_k := 1 + 2 + 3 + ... + k >= |n| and t_k has the same parity (evenness/oddness) as |n|. Then flip the signs of the summands of t_k in a systematic way to total n.

Here are the details. Notice that t_k = k(k + 1)/2, a triangular number. Setting t_k = |n| and solving for k gives the ceiling of (-1 + sqrt(1 + 8 |n|))/2. So k equals the ceiling or 1 or 2 plus it, whichever of those three numbers has the same parity as n and is least. Here we're using the fact that the set {t, t + s, t + s + (s + 1)} of three consecutive triangular numbers contains both even and odd numbers for any positive integers t, s. (Simply check all four parity possibilities for t, s.)

To find a minimal length sum for n, first compute d := (t_k - n)/2. Because t_k >= |n| and t_k and n have the same parity, d lies in the set {0, 1, 2, ..., t_k}. Now repeatedly subtract: d = a_k (k) + r_k, r_k = a_{k-1} (k-1) + r_{k-1}, ..., r_2 = a_1 (1) + r_1, choosing each a_i maximally in {0, 1}. By the lemma below, r_1 = 0. So d = sum_{i=1}^k a_i i. Thus n = t_k - 2d = sum_{i=1}^k i - sum_{i=1}^k 2a_i i = sum_{i=1}^k (1 - 2a_i) i and 1 - 2a_i in {-1, 1}. So the sequence b_i := 1 - 2a_i is a path, and by the minimality of k, b_i is a minimal path.

## Algorithm 3 example

Consider the target number n=12. According to Algorithm 3, the possibilities for k are 5, 6, 7. The corresponding values of t_k are 15, 21, and 28. Since 28 is the least of these with the same parity as n, we see that k=7. So d = (t_k - n)/2 = 8, which we write as 1 + 7 according to the algorithm. Thus a shortest path to 12 is -1 + 2 + 3 + 4 + 5 + 6 - 7.

I say *a* shortest path, because shortest paths aren't unique in general.
For example, 1 + 2 -3 + 4 - 5 + 6 + 7 also works.

## Algorithm 3 correctness

*Lemma*: Let A_k = {0, 1, 2, ..., t_k}.
Then a number lies in A_k if and only if it can be expressed as a sum sum_{i=1}^k a_i i for some a_i in {0, 1}.

*Proof*: By induction on k.
First, 0 = sum_{i=1}^0 1, the empty sum.
Now suppose the result holds for all k - 1 >= 0 and let d in A_k.
Repeatedly subtract: d = a_k (k) + r_k, r_k = a_{k-1} (k-1) + r_{k-1}, ..., choosing each a_i maximally in {0, 1} and stopping at the first r_j in A_j for some j < k.
Then by the induction hypothesis, r_j = sum_{i=0}^j b_i i for some b_i in {0, 1}.
Then d = r_j + sum_{i=j+1}^k a_k i, as desired.
Conversely, a sum s := sum_{i=1}^k a_i i for a_i in {0, 1} satisfies 0 <= s <= sum_{i=1}^k i = t_k, and so s in A_k.

## Algorithm 3 time complexity

Assuming that arithmetic operations are constant time, it takes O(1) time to compute k from n, and hence the length of a minimal path to n. Then it takes O(k) time to find a minimal length sum for d and O(k) time to use that sum to produce a minimal length sum for n. So O(k) = O(sqrt(n)) time all up.

# Hourly Wage

Suppose you work in New Zealand, would like to earn a given net weekly income (for 52 weeks per year) after income tax, and would like to work a given number of hours per week (for 47 weeks per year; NZ workers get 4 weeks annual leave plus roughly 1 week of public holidays). What gross hourly income must you earn?

To answer that question, i used the 1 April 2012 to 31 March 2013 NZ income tax rates for individuals, wrote this Python program, and used it with Sage to make this table:

The gross hourly incomes needed are displayed in the main body of the table in New Zealand dollars, rounded to the nearest dollar and excluding GST. For example, to earn $1200/week after tax, working 25 hours/week, you need to earn $68/hour before tax.

# Science and Python: retrospective of a (mostly) successful decade

Video sourced from YouTube here.

**Why no comments?**
I used to do public comments but found that moderating and maintaining them took too much time in front of the computer, time better spent playing outdoors.
So these days I only do private comments, that is, you can email me comments regarding a post by clicking the 'Comment' link at the bottom of the post.