# Homework 3 Optional Problems

Prove that the worst-case expected running time of every randomized comparison-based sorting algorithm is $\Omega(n\log n)$. (Here the worst-case is over inputs, and the expectation is over the random coin flips made by the algorithm.)?

Suppose we modify the deterministic linear-time selection algorithm by grouping the elements into groups of 7, rather than groups of 5. (Use the “median-of-medians” as the pivot, as before.) Does the algorithm still run in $O(n)$ time? What if we use groups of 3?

ANSWER: The median for odd number of elements is the middle element, for even number it’s the first of 2 middle elements. For example, the median of 5 elements is the 3rd element, and it’s also the 3rd for 6 elements. Therefore, the median is the $\lceil \frac{k}{2} \rceil$ element for all $k \geq$ 1. Consider the following array:

x x x x x x
x x x x x x x
x x x X x x x
x x x x x x x
x x x x x x


$% X = 8, \text{number of elements} < X = 11 %]]>$

At least half of the $\lceil \frac{n}{5} \rceil$ groups contribute at least $3$ elements that are greater than $X$, except for the one group that has fewer than $5$ elements if $5$ does not divide $n$ exactly, and the one group containing $X$ itself. Discounting these two groups, it follows that the number of elements greater than $X$ is at least $3(\lceil \frac{1}{2}\lceil \frac{n}{5} \rceil \rceil - 2)$. It follows from the definition of ceiling that this expression $\geq \frac{3n}{10} - 6$. Similarly, at least $\frac{3n}{10} - 6$ elements are smaller than $X$. Thus, in the worst case, the size of the subarray in the recursive SELECT call is $(n - (\frac{3n}{10} - 6)) = \frac{7n}{10} + 6$.

Therefore, the recurrence relation is: %

This recurrence is one of the following generic form: $T(n) \leq \sum_{i=1}^k T(a_{i}n) + O(n), where \sum_{i=1}^k a_i \leq 1$

The solution to the above is given by: $% $

Since $% $

Similarly, for $m = 7$, the number of elements greater than $X$ is at least: $4(\lceil \frac{1}{2}\lceil \frac{n}{7} \rceil \rceil - 3) \geq \frac{2n}{7} - 8$

Therefore, the recurrence relation is $T(n) \leq T(\frac{n}{7}) + T(\frac{5n}{7} + 8) + O(n)$

Since $% $

For $m = 3$, the number of elements greater than $X$ is at least: $2(\lceil \frac{1}{2}\lceil \frac{n}{3} \rceil \rceil - 2) \geq \frac{n}{3} - 4$

Therefore, the recurrence relation is $T(n) \leq T(\frac{n}{3}) + T(\frac{2n}{3} + 4) + O(n)$

Since $\frac{1}{3} + \frac{2}{3} = 1, T(n) = O(n\log n)$

Given an array of n distinct (but unsorted) elements $x_1, x_2,...,x_n$ with $w_1, w_2,...,w_n$ such that $\sum_{i=1}^k w_i = W$, a weighted median is an element $x_k$ for which the total weight of all elements with value less than $x_k$ is at most $\frac{W}{2}$, and also the total weight of elements with value larger than $x_k$ is at most $\frac{W}{2}$. Observe that there are at most two weighted medians. Show how to compute all weighted medians in $O(n)$ worst-case time.

ANSWER: The weighted-median algorithm works as follows. If $n \leq 2$, we just return the brute-force solution. Otherwise, we proceed as follows (the pseudocode handles the special case of two medians, this description only applies to finding the lower/only weighted median). We find the actual median $x_k$ of the n elements and then partition around it. Then compute the total weights of the two halves. If the weights of the two halves are each $\leq \frac{W}{2}$, then the weighted median is $x_k$. Otherwise, the weighted (lower) median should be in the half with total weight exceeding $\frac{W}{2}$. The total weight of the “lighter” half is added to the weight of $x_k$, and the search continues with the heavier half.

WEIGHTED-MEDIAN(x, w, n)
if n == 1
return x
else if n == 2
// lower and upper weighted medians
if w == w
return {x, x}
else if w < w
return {x}
else
return {x}
else
// using randomized linear-time selection
find the median x[k] of X = {x, x, ..., x[n]}
// using the O(n) partition algorithm from QuickSort
partition the set X around x[k]
// using linear scan
compute W[L] = sum(x[i] < x[k]) w[i]
and W[R] = sum(x[i] > x[k]) w[i]
// x[k] = lower weighted median
if W[L] < W/2 and W[R] == W/2      // (1)
return {x[k], x[k + 1]}
// x[k] = upper weighted median
else if W[L] == W/2 and W[R] < W/2 // (2)
return {x[k - 1], x[k]}
// x[k] = weighted median
else if W[L] ≤ W/2 and W[R] ≤ W/2  // (3)
return {x[k]}
// by definition of weighted median,
// it must be on the heavier side
else if W[L] > W/2                 // (4)
w[k] = w[k] + W[R]
X' = {x[i] ∈ X: x[i] ≤ x[k]}
return WEIGHTED-MEDIAN(X')
else                               // (5)
w[k] = w[k] + W[L]
X' = {x[i] ∈ X: x[i] ≥ x[k]}
return WEIGHTED-MEDIAN(X')


Example 1:

x = {1, 2, 3, 4, 5}, w = {.15, .1, .2, .3, .25}, n = 5
W = 1, W/2 = .5
Iteration 1:
k = 3, x = 3, w = .2, W[L] = .25, W[R] = .55
Case 5
recurse with x = {3, 4, 5}, w = {.45, .3, .25}, n = 3
Iteration 2:
k = 2, x = 4, w = .3, W[L] = .45, W[R] = .25
Case 3
return {4}


Example 2:

x = {1, 2, 3, 4}, w = {.49, .01, .25, .25}, n = 4
W = 1, W/2 = .5
Iteration 1:
k = 2, x = 2, w = .01, W[L] = .49, W[R] = .50
Case 1
return {2, 3}


We showed in an optional video lecture that every undirected graph has only polynomially (in the number n of vertices) different minimum cuts. Is this also true for directed graphs? Prove it or give a counterexample.

For a parameter $\alpha \geq 1$, an $\alpha$-minimum cut is one for which the number of crossing edges is at most $\alpha$ times that of a minimum cut. How many $\alpha$-minimum cuts can an undirected graph have, as a function of $\alpha$ and the number $n$ of vertices? Prove the best upper bound that you can.
ANSWER: $n^{2\alpha}$. Proof TBD.