Saturday, September 9, 2017

[Algorithms] Find the best pivot for quicksort or quickselect algorithm

Problem with quicksort is that in worst case an input array is not divided in equal parts,
T(n) = T(n-1) + O(n), that leads to the order of O(n^2). Following is the algorithm that would find the best pivot in linear time and guarantees the best split. Then recurrence becomes,
T(n) = 2T(n/2) + O(n) which leads to order of O(n logn).

Idea is divide an input array A in 5 rows. Then each column will have atmost 5 elements. To sort one column with constant elements takes constant time. Sorting all the columns would take O(n) time. Then take median of all the columns and form an array. Recursively find the median of that array. Choose the median element as pivot.

Algorithm:
  def select_pivot(A):
    if len(A) <=5:
      sort(A)
      return A[len(A)/2]
   
    bucket = [[]] * 5
    #divide the elements into 5 buckets
    for i in range(0, A.length()):
      buckets[i % 5].append(A[i])

    next_A = []
    columns = A.length() / 5
    for col in range(0, columns):
      a = []
      for row in range(0, 5):    # extract elements of a column
        a.append(bucket[row][col])
      sort(a)
      next_A.append(a[2]) #take median
      return select_pivot(next_A)


Analysis:
Dividing the elements in 5 buckets takes O(n) time.
Sorting a single column of 5 elements takes O(1) time. Sorting n/5 columns takes O(n) time.
Next recursion takes n/5 number of elements. So order, T(n/5)

T(n) = T(n/5) + O(n) = O(n)

Following the above algorithm we can guarantee, even in the worst case, we will have a pivot that will split input array in 2 equal parts. Once we know the pivot element, we can find the index of the pivot element from quicksort procedure in O(n) time.

No comments:

Post a Comment