# Probability Questions

Let be some constant (independent of the input array length ). Recall the Partition subroutine employed by the QuickSort algorithm, as explained in lecture. What is the probability that, with a randomly chosen pivot element, the Partition subroutine produces a split in which the size of the smaller of the two subarrays is times the size of the original array?

- \(1 - 2 * \alpha\)
- \(\alpha\)
- \(1 - \alpha\)
- \(2 - 2 * \alpha\)

**ANSWER:** If the picked pivot is one of the smallest elements in the array, then the left partition’s size will be less than . Similarly, if the picked pivot is one of the largest elements in the array, the right partition’s size will be less than .
Hence, there are elements that we can pick such that both partitions have size greater than or equal to . Since the algorithm picks a pivot randomly, all elements have an equal probability of getting picked as the pivot, i.e. .
Therefore, . Option 1 is correct.

Now assume that you achieve the approximately balanced splits above in every recursive call — that is, assume that whenever a recursive call is given an array of length , then each of its two recursive calls is passed a subarray with length between and (where is a fixed constant strictly between and ). How many recursive calls can occur before you hit the base case? Equivalently, which levels of the recursion tree can contain leaves? Express your answer as a range of possible numbers , from the minimum to the maximum number of recursive calls that might be needed.

- \(-\frac{\log n}{\log \alpha} \leq d \leq -\frac{\log n}{\log(1 - \alpha)}\)
- \(0 \leq d \leq -\frac{\log n}{\log \alpha}\)
- \(-\frac{\log n}{\log(1 - \alpha)} \leq d \leq -\frac{\log n}{\log \alpha}\)
- \(-\frac{\log n}{\log(1 - 2\alpha)} \leq d \leq -\frac{\log n}{\log(1 - \alpha)}\)

**ANSWER:** If , then , thus the branch with subarray length will have greater recursion depth.

If is the height of the recursion tree,

Taking on both sides

By log rule

The best case is . Since both and are less than , so, their logarithms are negative. Adding one more negative symbol on that results in an overall positive value for minimum and maximum depth. Therefore, option 1 is correct.

Define the recursion depth of QuickSort to be the maximum number of successive recursive calls before it hits the base case — equivalently, the number of the last level of the corresponding recursion tree. Note that the recursion depth is a random variable, which depends on which pivots get chosen. What is the minimum-possible and maximum-possible recursion depth of QuickSort, respectively?

- \(Minimum: \Theta(\log n); Maximum: \Theta(n)\)
- \(Minimum: \Theta(\log n); Maximum: \Theta(n\log n)\)
- \(Minimum: \Theta(1); Maximum: \Theta(n)\)
- \(Minimum: \Theta(\sqrt{n}); Maximum: \Theta(n)\)

**ANSWER:** Option 1 is correct.

Consider a group of k people. Assume that each person’s birthday is drawn uniformly at random from the possibilities. (And ignore leap years.) What is the smallest value of k such that the expected number of pairs of distinct people with the same birthday is at least one?

[Hint: define an indicator random variable for each ordered pair of people. Use linearity of expectation.]

- \(27\)
- \(28\)
- \(366\)
- \(20\)
- \(23\)

**ANSWER:** For each pair of people in the room, let be an indicator random variable such that:

( may have been born on any of the days, and on any of the remaining days).

If

Rewriting as a general quadratic equation , for which is given by

Since the number of people cannot be negative, . Option 2 is correct.

Let denote the outcomes of three rolls of a six-sided die. (i.e., each is uniformly distributed among , and by assumption they are independent.) Let denote the product of and the product of . Which of the following statements is correct?

- Y and Z are independent, but
- Y and Z are not independent, but
- Y and Z are not independent, but
- Y and Z are independent, but

**ANSWER:** For and to be independent, by definition, the two events and must be independent . If we choose , then is the probability that the outcome of all three rolls was one. Since each roll is independent, this is equal to \({\frac{1}{6}}^3\).

Clearly, , therefore, and are not independent.

Covariance reference: Expectations on the product of two dependent random variables

Clearly, is the same.

and clearly . Similarly, . Therefore, , and from expression . Option 3 is correct.