Sorting Algorithms - David-Chae/Algorithms_Notes_Solutions GitHub Wiki
A sorting algorithm rearranges the elements of a given array or list according to a comparison operator on the elements. The comparison operator decides the new order of the elements in the data structure.
Sorting Terminology
What is in-place sorting?
An in-place sorting algorithm uses only constant extra space to produce its output: it sorts the list purely by reordering the elements within the list itself. For example, Insertion Sort and Selection Sort are in-place sorting algorithms, because they need no additional space that grows with the input. A typical implementation of Merge Sort is not in-place, and neither is the usual implementation of Counting Sort. Such non-in-place algorithms need at least O(N) auxiliary space, where N is the number of elements to sort, while in-place algorithms need only O(1). For more detail, see https://en.wikipedia.org/wiki/In-place_algorithm.
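The difference can be illustrated with a minimal sketch. The function names `selection_sort_in_place` and `merge_sort` are chosen here for illustration only; the first reorders the list it is given, while the second builds its result in a separate buffer.

```python
def selection_sort_in_place(a):
    """Sort list `a` in place; only a few index variables are used (O(1) auxiliary space)."""
    n = len(a)
    for i in range(n - 1):
        smallest = i
        for j in range(i + 1, n):
            if a[j] < a[smallest]:
                smallest = j
        a[i], a[smallest] = a[smallest], a[i]  # swap inside the same list


def merge_sort(a):
    """Return a new sorted list; the merge step allocates O(N) auxiliary space."""
    if len(a) <= 1:
        return list(a)
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    merged = []  # auxiliary buffer that grows to len(a)
    i = j = 0
    while i < len(left) and j < len(right):
        if right[j] < left[i]:
            merged.append(right[j])
            j += 1
        else:
            merged.append(left[i])
            i += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged


data = [5, 2, 9, 1, 5]
selection_sort_in_place(data)           # `data` itself is now sorted
sorted_copy = merge_sort([5, 2, 9, 1, 5])  # the input is untouched; a new list is returned
```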
Types of Sorting
Internal Sorting
- When all the data fits in main (internal) memory, the sorting is called internal sorting.
- In internal sorting, the input cannot be larger than the available main memory.
- Example: heap sort, bubble sort, selection sort, quick sort, shell sort, insertion sort.
External Sorting
- When all data that needs to be sorted cannot be placed in memory at a time, the sorting is called external sorting.
- External sorting is used for massive amounts of data.
- Merge Sort and its variations are typically used for external sorting.
- External storage such as hard disks and CDs holds the data during external sorting.
- Example: Merge sort, Tag sort, Polyphase sort, Four tape sort, External radix sort, Internal merge sort, etc.
External sorting is a term for a class of sorting algorithms that can handle massive amounts of data. External sorting is required when the data being sorted do not fit into the main memory of a computing device (usually RAM) and instead, they must reside in the slower external memory (usually a hard drive). External sorting typically uses a hybrid sort-merge strategy. In the sorting phase, chunks of data small enough to fit in main memory are read, sorted, and written out to a temporary file. In the merge phase, the sorted sub-files are combined into a single larger file.
One example of external sorting is the external merge sort algorithm, which sorts chunks that each fit in RAM, then merges the sorted chunks together. We first divide the file into runs small enough to fit into main memory. We then sort each run in main memory using the merge sort algorithm. Finally, we merge the resulting runs into successively bigger runs until the file is sorted.
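The sort phase and merge phase can be sketched as follows. This is a simplified illustration under several assumptions: the input is a text file with one integer per line and no blank lines, the function name `external_sort` and the parameter `max_lines_in_memory` are hypothetical, each run is sorted with Python's built-in `list.sort` rather than a hand-written merge sort, and all runs are merged in a single k-way pass with `heapq.merge` instead of in successively bigger runs.

```python
import heapq
import itertools
import os
import tempfile


def external_sort(input_path, output_path, max_lines_in_memory=1000):
    """Sort a file of integers (one per line) using sorted runs stored on disk."""
    run_paths = []

    # Sort phase: read a chunk that fits in memory, sort it, write it out as a run.
    with open(input_path) as src:
        while True:
            chunk = [int(line) for line in itertools.islice(src, max_lines_in_memory)]
            if not chunk:
                break
            chunk.sort()
            fd, path = tempfile.mkstemp(suffix=".run", text=True)
            with os.fdopen(fd, "w") as run:
                run.writelines(f"{value}\n" for value in chunk)
            run_paths.append(path)

    # Merge phase: stream all runs through a k-way merge; only one value
    # per run needs to be held in memory at a time.
    run_files = [open(path) for path in run_paths]
    try:
        streams = [(int(line) for line in f) for f in run_files]
        with open(output_path, "w") as out:
            out.writelines(f"{value}\n" for value in heapq.merge(*streams))
    finally:
        for f in run_files:
            f.close()
        for path in run_paths:
            os.remove(path)
```

With a very large number of runs, a real implementation would likely merge in several passes to limit the number of files open at once.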
Examples: Merge sort, Tape sort, Polyphase sort, External radix sort, External merge sort.
When Do We Use External Sorting?
- When the unsorted data is too large to sort in the computer's internal memory.
- External sorting uses a secondary storage device, such as tape or a disk array.
- When the data is large, as with large inputs to merge sort and quick sort.
- Quick Sort: best average runtime.
- Merge Sort: best worst-case time.
- To perform a sort-merge join operation on data.
- To execute an ORDER BY query.
- To select duplicate elements.
- When we need to take large input from the user.
What is stable sorting?
- A sort is stable when elements with equal keys appear in the sorted output in the same relative order as in the input.
- Example: merge sort, insertion sort, bubble sort.
What is Unstable sorting?
- A sort is unstable when elements with equal keys may appear in the sorted output in a different relative order than in the input.
- Example: quick sort, heap sort, shell sort.
Stable and Unstable Sorting Algorithms
Stability matters mainly when we have key-value pairs in which duplicate keys are possible (such as people's names as keys and their details as values) and we wish to sort these objects by key. A sorting algorithm is said to be stable if two objects with equal keys appear in the same order in the sorted output as they appear in the input array. Informally, stability means that equivalent elements retain their relative positions after sorting.
Formally, stability may be defined as follows. Let $A$ be an array, and let $<$ be a strict weak ordering on the elements of $A$. A sorting algorithm is stable if
$$i < j \ \text{and}\ A[i] \equiv A[j] \implies \pi(i) < \pi(j),$$
where $\pi$ is the sorting permutation (sorting moves $A[i]$ to position $\pi(i)$).
Do we care for simple arrays like array of integers?
When equal elements are indistinguishable, such as with integers, or more generally, any data where the entire element is the key, stability is not an issue. Stability is also not an issue if all keys are different.
An example where it is useful
Consider the following dataset of Student Names and their respective class sections.
(Dave, A), (Alice, B), (Ken, A), (Eric, B), (Carol, A)
If we sort this data according to name only, then it is highly unlikely that the resulting dataset will be grouped according to sections as well.
(Alice, B), (Carol, A), (Dave, A), (Eric, B), (Ken, A)
So we might have to sort again to obtain a list of students grouped by section too. But if the sorting algorithm used for this second pass is not stable, we might get a result like this:
(Carol, A), (Dave, A), (Ken, A), (Eric, B), (Alice, B)
The dataset is now sorted according to sections, but not according to names. In the name-sorted dataset, the tuple (Alice, B) came before (Eric, B), but since the sorting algorithm is not stable, that relative order is lost. If, on the other hand, we used a stable sorting algorithm, the result would be:
(Carol, A), (Dave, A), (Ken, A), (Alice, B), (Eric, B)
Here the relative order of tuples within the same section is maintained. An unstable sort may happen to preserve the relative order as well, but that is not guaranteed.
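The student example can be reproduced with Python's built-in `sorted`, which is documented to be stable. The variable names below are chosen for illustration.

```python
students = [("Dave", "A"), ("Alice", "B"), ("Ken", "A"), ("Eric", "B"), ("Carol", "A")]

# First pass: sort by name.
by_name = sorted(students, key=lambda s: s[0])
# → [('Alice', 'B'), ('Carol', 'A'), ('Dave', 'A'), ('Eric', 'B'), ('Ken', 'A')]

# Second pass: sort by section. Because the sort is stable, students within
# the same section stay in the name order produced by the first pass.
by_section = sorted(by_name, key=lambda s: s[1])
# → [('Carol', 'A'), ('Dave', 'A'), ('Ken', 'A'), ('Alice', 'B'), ('Eric', 'B')]
```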
Which sorting algorithms are stable?
Some sorting algorithms are stable by nature, such as Bubble Sort, Insertion Sort, Merge Sort, and Counting Sort. Comparison-based stable sorts such as Merge Sort and Insertion Sort maintain stability by ensuring that element $A[j]$ comes before $A[i]$ if and only if $A[j] < A[i]$, where $i$ and $j$ are indices and $i < j$.
Since $i < j$, the relative order is preserved when $A[i] \equiv A[j]$, i.e. $A[i]$ still comes before $A[j]$.
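As a minimal sketch of this rule, the insertion sort below moves a later element ahead of an earlier one only when its key is strictly smaller, so elements with equal keys are never reordered. The `key` parameter and the sample records are assumptions made for illustration.

```python
def insertion_sort(items, key=lambda x: x):
    """Stable in-place insertion sort."""
    for i in range(1, len(items)):
        current = items[i]
        j = i - 1
        # Shift only while the earlier element is strictly greater;
        # an element with an equal key is never jumped over.
        while j >= 0 and key(current) < key(items[j]):
            items[j + 1] = items[j]
            j -= 1
        items[j + 1] = current
    return items


records = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]
insertion_sort(records, key=lambda r: r[0])
# records is now [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```

Changing the comparison to `<=` would still sort correctly but would reverse the order of equal keys, which breaks stability.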
Non-comparison-based sorts such as Counting Sort maintain stability by filling the sorted output array while traversing the input in reverse order, so that elements with equivalent keys keep their relative positions.
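A sketch of a stable counting sort, assuming integer keys in the range 0 to `max_key` (the function signature is hypothetical, chosen for illustration):

```python
def counting_sort(items, key, max_key):
    """Stable counting sort for integer keys in the range 0..max_key."""
    counts = [0] * (max_key + 1)
    for item in items:
        counts[key(item)] += 1

    # Prefix sums: counts[k] becomes the number of items with key <= k,
    # i.e. one past the last output slot reserved for key k.
    for k in range(1, max_key + 1):
        counts[k] += counts[k - 1]

    output = [None] * len(items)
    # Walk the input in reverse so that, among equal keys, the last input
    # element lands in the last reserved slot, preserving relative order.
    for item in reversed(items):
        k = key(item)
        counts[k] -= 1
        output[counts[k]] = item
    return output


records = [(2, "a"), (0, "b"), (2, "c"), (1, "d"), (0, "e")]
result = counting_sort(records, key=lambda r: r[0], max_key=2)
# result is [(0, 'b'), (0, 'e'), (1, 'd'), (2, 'a'), (2, 'c')]
```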
Some sorts, such as Radix Sort, depend on another sort, with the only requirement being that the other sort is stable.
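To illustrate why the inner sort must be stable, here is a sketch of a least-significant-digit radix sort for non-negative integers. As an assumption for brevity, the inner stable sort is a simple bucket distribution per digit rather than a full counting sort. Each pass must keep the order established by earlier passes for numbers that share the current digit.

```python
def radix_sort(numbers, base=10):
    """LSD radix sort for non-negative integers."""
    result = list(numbers)
    if not result:
        return result
    largest = max(result)
    place = 1
    while place <= largest:
        # Stable pass on the current digit: appending to buckets keeps
        # the order produced by the passes on less significant digits.
        buckets = [[] for _ in range(base)]
        for n in result:
            buckets[(n // place) % base].append(n)
        result = [n for bucket in buckets for n in bucket]
        place *= base
    return result


sorted_numbers = radix_sort([170, 45, 75, 90, 802, 24, 2, 66])
# sorted_numbers is [2, 24, 45, 66, 75, 90, 170, 802]
```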
Which sorting algorithms are unstable?
Quick Sort, Heap Sort, etc. are unstable in their typical implementations, but they can be made stable by also taking the original position of the elements into consideration. This change can be made without compromising much on performance, at the cost of some extra space, possibly $\Theta(n)$.
Can we make any sorting algorithm stable?
Any sorting algorithm that is not stable can be modified to be stable. There can be algorithm-specific ways to do this, but in general, any comparison-based sorting algorithm that is not stable by nature can be made stable by changing the key comparison so that, for objects with equal keys, the comparison also considers their original position.
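One way to sketch this idea is to decorate each element with its original index before running a heap-based sort, which is unstable on its own. The function name `stable_heap_sort` is hypothetical, and the index tags cost Θ(n) extra space.

```python
import heapq


def stable_heap_sort(items, key=lambda x: x):
    """Heap-based sort made stable by using the original index as a tie-breaker."""
    # Tuples compare by key first, then by original index, so two items
    # with equal keys are ordered by where they appeared in the input.
    # Indices are unique, so the items themselves are never compared.
    heap = [(key(item), index, item) for index, item in enumerate(items)]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]


records = [(2, "a"), (1, "b"), (2, "c"), (1, "d")]
result = stable_heap_sort(records, key=lambda r: r[0])
# result is [(1, 'b'), (1, 'd'), (2, 'a'), (2, 'c')]
```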
Time Complexities of all Sorting Algorithms
References:
- https://www.geeksforgeeks.org/sorting-algorithms/?ref=lbp
- https://www.geeksforgeeks.org/sorting-terminology/
- https://www.geeksforgeeks.org/stable-and-unstable-sorting-algorithms/
- https://www.geeksforgeeks.org/external-sorting/