Colouring a Sequence of Boxes

One of my friends was applying to Google for a job and got this problem in one of the final written rounds. This was a fun little problem that helped me procrastinate…

Problem. We are given a sequence of finitely many boxes, say \(b_1, b_2 \dots b_n\), where each box is coloured black or white. We can perform two operations on these boxes:

Reverse the colour of the box.
Place the first \(k\) many boxes (where \(k < n\)) after the last box in the same order.

The second operation can be performed at most once. We are required to find the least number of times that we need to perform the first operation such that no two neighbouring boxes have the same colour.

Solution. Let the two sequences of length \(n\) defined on \(\{0,1\}\) which have no two consecutive elements which are equal be \(v_1 = (0,1,0 \dots)\) and \(v_2 = (1,0,1 \dots)\). We define \(H(u,v)\) (the Hamming distance between two vectors) as the number of bits where \(u\) and \(v\) differ for any two \(n\)-length bit vectors \(u\) and \(v\). Label a box with \(1\) if it black and with \(0\) otherwise. Let \(\alpha_i\) denote the \(n\)-length bit vector corresponding to \(b_i, b_{i+1}, \dots b_n, b_1, b_2, \dots b_{i-1}\). Thus, our problem reduces to finding the \(min_{i,j} H(\alpha_i,v_j)\) where \(1 \leq i \leq n\) and \(j \in \{1,2\}\). Since the time taken to compute the hamming distance between two \(n\)-length bit vectors is \(\mathcal{O}(n)\) and there are \(n\) many \(\alpha_i\)s and constant many \(v_j\)s to check, a straightforward algorithm which checks all these possibilities solves this in \(\mathcal{O}(n^2)\) time.

We can do better and produce a linear time algorithm by using a recurrence relation to derive \(H(\alpha_{i+2},v_j)\) from \(H(\alpha_i,v_j)\) for all \(i \leq n-2\) and \(j \in \{1,2\}\). We take two cases: when \(n\) is even and when \(n\) is odd.

Case 1: \(n\) is even.

In this case, \(v_1 = (0,1,0 \dots 0, 1)\) and \(v_2 = (1,0,1 \dots, 1, 0)\). Note that \(H(\alpha_i,v_j)\) is equal to either \(H(\alpha_1,v_1)\) or \(H(\alpha_1,v_2)\) for any \(i \leq n\) and \(j \in \{1,2\}\). We compute these two quantities and report the smaller one in \(\mathcal{O}(n)\) time.

Case 2: \(n\) is odd.

In this case, \(v_1 = (0,1,0 \dots 1, 0)\) and \(v_2 = (1,0,1 \dots, 0, 1)\). Unfortunately, \(H(\alpha_i,v_j)\) may not be equal to either \(H(\alpha_1,v_1)\) or \(H(\alpha_1,v_2)\) for an arbitrary \(i \leq n\) and \(j \in \{1,2\}\). Thus, we have to compute all \(H(\alpha_i,v_j)\) for each \((i,j)\) pair. We take \(j=1\) and repeat the same procedure for \(j=2\). Note that the possible difference in the hamming distance between \(H(\alpha_{i+2},v_1)\) from \(H(\alpha_i,v_1)\) will arise because of the \(b_i\) and \(b_{i+1}\) boxes. Thus, to compute \(H(\alpha_{i+2},v_1)\) we just need to compute the difference between \(H(\alpha’_{i+2},(b_i,b_{i+1}))\) and \(H(\alpha’_i,(b_i,b_{i+1}))\) and add it to \(H(\alpha_i,v_1)\) where \(\alpha’_i\) is the vector corresponding to the \((b_i,b_{i+1})\) boxes in \(\alpha_i\). \(\alpha’_{i+2}\) is defined similarly. For each \(i\), this takes constant time. Thus, given the values of \(H(\alpha_1,v_1)\) and \(H(\alpha_2,v_1)\) (which can be computed in linear time), we can find the values of \(H(\alpha_i,v_1)\) for any \(i \leq n\) in linear time. Thus, \(H(\alpha_i,v_j)\) can be computed in linear time for any \((i,j)\) pair.

A sketch of the algorithm is given below.

Take \(n\) and \(\alpha_1\) as the input
Compute \(\alpha_2\)
Define \(prev_{\gamma_1}\) and \(prev_{\gamma_2}\) as the first two indices of \(\alpha_1\) and \(\alpha_2\)
Define \(v’_1\) and \(v’_2\) as the first two indices of \(v_1\) and \(v_2\)
Compute \(H(\alpha_1,v_1)\), \(H(\alpha_2,v_1)\), \(H(\alpha_1,v_2)\) and \(H(\alpha_2,v_2)\)
\(i \leftarrow 1\)
\(prev_{H1} \leftarrow H(\alpha_1,v_1)\), \(prev_{H2} \leftarrow H(\alpha_1,v_2)\)
\(curr_{H1} \leftarrow H(\alpha_1,v_1)\), \(curr_{H2} \leftarrow H(\alpha_1,v_2)\)
\(min_1 \leftarrow H(\alpha_1,v_1)\), \(min_2 \leftarrow H(\alpha_1,v_2)\)
if \(n\) is even, compute the smaller of the above two values and return that
while \(i \leq n\) do
- Define \(curr_{\gamma_1}\) and \(curr_{\gamma_2}\) to be the vector corresponding to the \(b_i\) and \(b_{i+1}\) boxes of \(\alpha_1\) and \(\alpha_2\) respectively
- \(curr_{H1} \leftarrow prev_{H1} - H(prev_{\gamma_1},v’_1) + H(curr_{\gamma_1},v’_1)\)
- \(curr_{H2} \leftarrow prev_{H2} - H(prev_{\gamma_2},v’_2) + H(curr_{\gamma_2},v’_2)\)
- if \(curr_{H1} < min_1\) then \(min_1 \leftarrow curr_{H1}\) and if \(curr_{H2} < min_2\) then \(min_2 \leftarrow curr_{H2}\)
- \(i \leftarrow i+2\)
Repeat the previous loop starting from \(i=2\) this time
Compute the smaller of \(min_1\) and \(min_2\) and return that