**Four kinds of methods to select the *k*th largest or smallest element**

**==> To make it O(n)**

A simple formulation of a *worst-case* O(*n*) algorithm is as follows: use the linear general selection algorithm (the median-of-medians algorithm described in the sections above) to find the *k*th element in worst-case O(*n*) time, then use the partition operation from quicksort (which is also O(*n*)) to split the list into the elements less than and greater than the *k*th element.
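The two steps above can be sketched in Python. This is a value-based (not in-place) variant of median-of-medians, and the helper names `median_of_medians_select` and `first_k` are my own; treat it as an illustration of the idea rather than a tuned implementation.

```python
def median_of_medians_select(a, k):
    """Return the k-th smallest element (0-based) of a in worst-case O(n)."""
    a = list(a)
    if len(a) <= 5:
        return sorted(a)[k]
    # Median of each group of 5, then the true median of those medians.
    medians = [sorted(a[i:i + 5])[len(a[i:i + 5]) // 2]
               for i in range(0, len(a), 5)]
    pivot = median_of_medians_select(medians, len(medians) // 2)
    lows = [x for x in a if x < pivot]
    pivots = [x for x in a if x == pivot]
    highs = [x for x in a if x > pivot]
    if k < len(lows):
        return median_of_medians_select(lows, k)
    if k < len(lows) + len(pivots):
        return pivot
    return median_of_medians_select(highs, k - len(lows) - len(pivots))

def first_k(a, k):
    """Return the k smallest elements (not necessarily sorted) in O(n):
    select the k-th element, then partition around it."""
    kth = median_of_medians_select(a, k - 1)
    smaller = [x for x in a if x < kth]
    equal = [x for x in a if x == kth]
    return smaller + equal[:k - len(smaller)]
```

Both steps are linear, so the whole procedure runs in worst-case O(*n*).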

### Specialized partial sorting algorithms based on mergesort and quicksort

### ONE: The simplest is the quicksort variation: there is no need to recursively sort partitions that contain only elements falling after the *k*th place in the end. Thus, if the pivot falls in position *k* or later, we recur only on the left partition:

```
function quicksortFirstK(list, left, right, k)
    if right > left
        select pivotIndex between left and right
        pivotNewIndex := partition(list, left, right, pivotIndex)
        quicksortFirstK(list, left, pivotNewIndex-1, k)
        if pivotNewIndex < k
            quicksortFirstK(list, pivotNewIndex+1, right, k)
```
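A minimal Python translation of this pseudocode might look as follows, assuming a random pivot choice and a Lomuto-style partition (the pseudocode leaves both unspecified):

```python
import random

def partition(a, left, right, pivot_index):
    """Lomuto partition of a[left..right]; returns the pivot's final index."""
    pivot = a[pivot_index]
    a[pivot_index], a[right] = a[right], a[pivot_index]
    store = left
    for i in range(left, right):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[right] = a[right], a[store]
    return store

def quicksort_first_k(a, left, right, k):
    """Partial quicksort: after the call, a[0:k] holds the k smallest
    elements in sorted order; the rest of a is left only partially sorted."""
    if right > left:
        pivot_index = random.randint(left, right)
        p = partition(a, left, right, pivot_index)
        quicksort_first_k(a, left, p - 1, k)
        if p < k:  # only recurse right if the kth place lies beyond the pivot
            quicksort_first_k(a, p + 1, right, k)
```

Calling `quicksort_first_k(a, 0, len(a) - 1, k)` leaves the *k* smallest elements sorted at the front of `a`.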

The resulting algorithm requires an *expected* time of only O(*n* + *k* log *k*), and is quite efficient in practice, especially if we substitute selection sort when *k* becomes small relative to *n*. However, the worst-case time complexity is still very bad in the case of a bad pivot selection. Pivot selection along the lines of the worst-case linear-time selection algorithm could be used to get better worst-case performance.

### TWO: Even better is if we don't require those *k* items to be themselves sorted. Losing that requirement means we can ignore all partitions that fall entirely before **or** after the *k*th place. We recur only into the partition that actually contains the *k*th element itself.

```
function quickfindFirstK(list, left, right, k)
    if right > left
        select pivotIndex between left and right
        pivotNewIndex := partition(list, left, right, pivotIndex)
        if pivotNewIndex > k  // new condition
            quickfindFirstK(list, left, pivotNewIndex-1, k)
        if pivotNewIndex < k
            quickfindFirstK(list, pivotNewIndex+1, right, k)
```
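In Python, the only change from the partial quicksort above is the guard on the left recursion. As before, the random pivot and Lomuto partition are my assumptions, since the pseudocode leaves them abstract:

```python
import random

def partition(a, left, right, pivot_index):
    """Lomuto partition of a[left..right]; returns the pivot's final index."""
    pivot = a[pivot_index]
    a[pivot_index], a[right] = a[right], a[pivot_index]
    store = left
    for i in range(left, right):
        if a[i] < pivot:
            a[i], a[store] = a[store], a[i]
            store += 1
    a[store], a[right] = a[right], a[store]
    return store

def quickfind_first_k(a, left, right, k):
    """After the call, a[0:k] holds the k smallest elements, unordered."""
    if right > left:
        pivot_index = random.randint(left, right)
        p = partition(a, left, right, pivot_index)
        if p > k:  # new condition: skip partitions entirely past the kth place
            quickfind_first_k(a, left, p - 1, k)
        if p < k:
            quickfind_first_k(a, p + 1, right, k)
```

Because each call recurses into at most one partition, the expected work forms a geometrically shrinking series, giving the O(*n*) bound discussed below.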

The resulting algorithm requires an expected time of only O(*n*), which is the best such an algorithm can hope for.

### Others:

**A** ==> Another simple method is to add each element of the list into an ordered set data structure, such as a heap or self-balancing binary search tree, holding at most *k* elements. Whenever the data structure has more than *k* elements, we remove the largest element, which can be done in O(log *k*) time. Each insertion operation also takes O(log *k*) time, resulting in O(*n* log *k*) time overall.
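This bounded-heap idea is straightforward with Python's `heapq`. Since `heapq` provides a min-heap, values are negated so the root is the largest of the *k* kept so far (the function name is mine):

```python
import heapq

def smallest_k_heap(iterable, k):
    """Keep a size-k heap of the smallest elements seen so far.
    Each push/replace is O(log k), so the total is O(n log k)."""
    heap = []  # negated values: heap[0] is minus the largest of the kept k
    for x in iterable:
        if len(heap) < k:
            heapq.heappush(heap, -x)
        elif -heap[0] > x:
            # x beats the current largest kept element; swap it in.
            heapq.heapreplace(heap, -x)
    return sorted(-v for v in heap)
```

The standard library's `heapq.nsmallest(k, iterable)` implements essentially this strategy.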

**B** ==> It is possible to transform the list into a heap in Θ(*n*) time, and then traverse the heap using a modified breadth-first search that places the elements in a priority queue (instead of the ordinary queue normally used in a BFS), terminating the scan after traversing exactly *k* elements. Since the queue size remains O(*k*) throughout the traversal, it requires O(*k* log *k*) time to complete, leading to a time bound of O(*n* + *k* log *k*) for this algorithm.
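A sketch of this traversal, using `heapq` both for the Θ(*n*) heapify and as the priority queue over the implicit array-based binary heap (children of index *i* live at 2*i*+1 and 2*i*+2); the function name is mine:

```python
import heapq

def smallest_k_via_heap_traversal(a, k):
    """Heapify in O(n), then pop the k smallest by exploring the heap
    top-down with a priority-queue frontier: O(n + k log k) total."""
    h = list(a)
    heapq.heapify(h)                      # Theta(n)
    out = []
    frontier = [(h[0], 0)]                # (value, index) pairs; size O(k)
    while frontier and len(out) < k:
        val, i = heapq.heappop(frontier)  # next smallest is always here
        out.append(val)
        for child in (2 * i + 1, 2 * i + 2):
            if child < len(h):
                heapq.heappush(frontier, (h[child], child))
    return out
```

Each pop adds at most two children, so the frontier never exceeds *k* + 1 entries, which is what keeps each queue operation at O(log *k*).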

**C** ==> We can achieve an O(log *n*) time solution using skip lists. Skip lists are sorted data structures that allow insertion, deletion, and indexed retrieval in O(log *n*) time. Thus, for any given percentile, we can insert a new element into (and possibly delete an old element from) the list in O(log *n*), calculate the corresponding index(es), and finally access the percentile value in O(log *n*) time. See, for example, a Python-based implementation for calculating a running median.
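The running-median shape of this idea can be sketched without a full skip list. Below, `bisect.insort` into a plain Python list stands in for the indexed skip list: indexing is O(1) and lookup is O(log *n*), but the insertion itself is O(*n*) because of element shifting, so a real indexed skip list is needed to reach the O(log *n*) bound the text describes. The class is hypothetical and mine:

```python
import bisect

class RunningMedian:
    """Running median via a sorted container with positional access.
    A sorted Python list is a simplified stand-in for an indexed skip list:
    same interface, but insertion here costs O(n) rather than O(log n)."""

    def __init__(self):
        self._sorted = []

    def add(self, x):
        bisect.insort(self._sorted, x)  # keeps the list sorted

    def median(self):
        n = len(self._sorted)
        mid = n // 2
        if n % 2:
            return self._sorted[mid]
        return (self._sorted[mid - 1] + self._sorted[mid]) / 2
```

Any other percentile is obtained the same way, by computing the corresponding index into the sorted structure instead of the midpoint.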
