The Median Problem is a good example of the divide and conquer technique. It is defined as follows:
Given an unsorted list $A=[a_1, a_2, ..., a_n]$ of n positive integers, find the median of A.
More generally, we look for the k-th element of A in sorted order, where k is a number between 1 and n. The median is the special case $k = \lceil n/2 \rceil$.
The first solution could be to sort the array and then directly read the k-th element, which costs $O(n \log n)$ because of the sort.
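The sort-based baseline can be sketched in a few lines. This is only an illustration: the function names `kth_by_sorting` and `median_by_sorting` are hypothetical, k is assumed to be 1-based and to count from the smallest element, and for an even-length list the lower of the two middle elements is taken as the median.

```python
def kth_by_sorting(a, k):
    """Return the k-th smallest element of a (k is 1-based) by sorting first."""
    if not 1 <= k <= len(a):
        raise ValueError("k must be between 1 and len(a)")
    # sorted() costs O(n log n); the lookup itself is O(1).
    return sorted(a)[k - 1]


def median_by_sorting(a):
    """Return the lower median, i.e. the element of rank ceil(n / 2)."""
    return kth_by_sorting(a, (len(a) + 1) // 2)
```

For example, `median_by_sorting([7, 2, 9, 4, 1])` sorts the list to `[1, 2, 4, 7, 9]` and returns the element of rank 3.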
But there is a better solution, known as Median of medians, due to Blum, Floyd, Pratt, Rivest and Tarjan (1973).
The structure of the solution resembles the Quicksort algorithm, which works as follows:
- We choose a pivot p.
- We move all the array elements into three buckets, $A<p$, $A=p$ and $A>p$.
- The algorithm is recursively applied to $A<p$ and $A>p$.
- The final output is the sorted $A<p$, then the $A=p$ elements, then the sorted $A>p$.
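The Quicksort steps above can be sketched as follows. This is a minimal, non in-place version written for readability; taking the middle element as the pivot is an assumption made here, since the text does not fix a pivot rule.

```python
def quicksort(a):
    """Return a sorted copy of a using three-bucket Quicksort."""
    if len(a) <= 1:
        return list(a)
    p = a[len(a) // 2]  # assumption: middle element as pivot
    less = [x for x in a if x < p]      # bucket A<p
    equal = [x for x in a if x == p]    # bucket A=p
    greater = [x for x in a if x > p]   # bucket A>p
    # Only the two outer buckets need further sorting.
    return quicksort(less) + equal + quicksort(greater)
```

Because the `equal` bucket is never recursed on, every recursive call works on a strictly smaller list, so the recursion terminates even when the input contains many duplicates.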
If we choose a bad pivot every time, such as the maximum or minimum element, the time complexity of Quicksort degrades to $O(n^2)$.
The perfect pivot would be the median of A, but since that is exactly what we are searching for, we settle for a good pivot: one that falls in the region from 1/4 to 3/4 of the sorted array A.
Python3 implementation: kth_largest_element_dac.py
A nice solution to the problem can be:
- Break A into $n/5$ groups of 5 elements each (one pass over the array, $O(n)$).
- Sort each group and take its median (sorting 5 elements is $O(1)$ per group, $O(n)$ in total).
- Create a list S of the medians of all groups.
- Recursively run the same algorithm on the array S of length $n/5$ to find its median p, which is used as the pivot ($T(n/5)$, writing $T(m)$ for the running time on m elements).
- Partition the array A into three buckets, $A<p$, $A=p$ and $A>p$ (one pass over the array, $O(n)$).
- Recurse on $A<p$ or $A>p$, or output p, depending on the value of k ($T(3n/4)$, since p is a good pivot and each of the two buckets has at most $3n/4$ elements).
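The steps listed above can be sketched in Python as follows. This is an illustrative version, not necessarily the code in `kth_largest_element_dac.py`: the names `select` and `median` are hypothetical, k is assumed to be 1-based and to count from the smallest element, and the buckets are built as new lists rather than in place.

```python
def select(a, k):
    """Return the k-th smallest element of a (k is 1-based)."""
    if not 1 <= k <= len(a):
        raise ValueError("k must be between 1 and len(a)")
    # Base case: a tiny list is sorted directly in constant time.
    if len(a) <= 5:
        return sorted(a)[k - 1]
    # Steps 1-3: split into groups of 5 and collect each group's median.
    groups = [a[i:i + 5] for i in range(0, len(a), 5)]
    medians = [sorted(g)[(len(g) - 1) // 2] for g in groups]
    # Step 4: the median of the medians becomes the pivot.
    p = select(medians, (len(medians) + 1) // 2)
    # Step 5: partition into the three buckets.
    less = [x for x in a if x < p]
    equal = [x for x in a if x == p]
    greater = [x for x in a if x > p]
    # Step 6: recurse into the bucket that contains rank k, or output p.
    if k <= len(less):
        return select(less, k)
    if k <= len(less) + len(equal):
        return p
    return select(greater, k - len(less) - len(equal))


def median(a):
    """Return the lower median of a, i.e. the element of rank ceil(n / 2)."""
    return select(a, (len(a) + 1) // 2)
```

To look for the k-th largest element instead, one option is to call `select(a, len(a) - k + 1)`.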
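So, with this algorithm, writing $T(n)$ for the running time on n elements, we have a time complexity of $T(n) = T(n/5) + T(3n/4) + O(n) = O(n)$, since the two recursive calls together work on $n/5 + 3n/4 = 19n/20 < n$ elements.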
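The linear bound can be checked by substitution. As a sketch, assume the non-recursive work (grouping, sorting the groups and partitioning) costs at most $cn$ for some constant $c$, a symbol introduced here for illustration, and guess $T(m) \le 20cm$ for every $m < n$. Using the step costs listed above:

$$
\begin{aligned}
T(n) &\le T(n/5) + T(3n/4) + cn \\
     &\le 20c \cdot \frac{n}{5} + 20c \cdot \frac{3n}{4} + cn \\
     &= 4cn + 15cn + cn = 20cn .
\end{aligned}
$$

The guess is therefore consistent, which gives $T(n) = O(n)$. The argument works because the fractions $1/5$ and $3/4$ sum to less than 1.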