Sorting Algorithms - Tomekske/algorithms GitHub Wiki
In this section, we will explore various sorting algorithms.
Bubble Sort
The bubble sort algorithm is a simple comparison-based sorting algorithm that repeatedly steps through the list to be sorted, compares adjacent elements, and swaps them if they are in the wrong order. The process is repeated until the list is sorted.
Here's a step-by-step explanation of the bubble sort algorithm:
1. Start with an unsorted list of elements.
2. Compare the first element with the second element. If the first element is greater than the second, swap them; otherwise, leave them as they are.
3. Move to the next pair of adjacent elements and compare them in the same way, swapping if necessary. Continue this process until you reach the end of the list.
4. At this point, the largest element will have "bubbled" to the end of the list (i.e., it is in its correct sorted position).
5. Repeat steps 2-4 for the remaining elements, excluding those already placed in their final positions at the end of the list.
6. Continue until the entire list is sorted. This requires at most n-1 passes, where n is the number of elements in the list.
7. Once all passes are complete, the list is sorted in ascending order.
The algorithm gets its name from the way larger elements gradually "bubble" to the end of the list during each pass. Bubble sort is known for its simplicity and ease of implementation, but it is not considered efficient for large lists as it has a time complexity of $O(n^2)$, where n is the number of elements in the list.
Pseudo Code
procedure bubbleSort(A: list)
    n = length(A)
    for i from 0 to n-2 do
        for j from 0 to n-i-2 do
            if A[j] > A[j+1] then
                swap A[j] and A[j+1]
            end if
        end for
    end for
end procedure
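The pseudo code above can be sketched in Python as follows. This version adds the common early-exit optimization (stop when a pass performs no swaps), which is what gives bubble sort its $O(n)$ best case; the function name is illustrative:

```python
def bubble_sort(items):
    """Sort a list in place using bubble sort with an early-exit flag."""
    n = len(items)
    for i in range(n - 1):
        swapped = False
        # After pass i, the last i elements are already in their final positions.
        for j in range(n - 1 - i):
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swapped = True
        if not swapped:
            break  # no swaps in this pass: the list is already sorted
    return items
```

For example, `bubble_sort([5, 1, 4, 2, 8])` returns `[1, 2, 4, 5, 8]`, and an already sorted input terminates after a single pass.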
Complexity
Time Complexity
- Worst case: $O(n^2)$
- Average case: $O(n^2)$
- Best case: $O(n)$
Space Complexity
- Worst: $O(1)$
Pros and Cons
Pros
- Simplicity: Bubble sort is one of the simplest sorting algorithms to understand and implement
- Space Efficiency: Bubble sort operates in-place, saving memory resources as it doesn't require additional space apart from a few variables
- Adaptive: With an early-exit check (stop as soon as a pass performs no swaps), bubble sort detects already sorted or nearly sorted input and terminates early, improving performance in those cases
Cons
- Inefficiency: Time complexity of $O(n^2)$. This makes it highly inefficient for large lists
- Lack of Efficiency in Average Case: Bubble sort's average-case performance is also $O(n^2)$, even on typical randomly ordered input
- Limited Adaptivity: Even with the early-exit optimization, a small element near the end of the list moves only one position per pass, so it can still force many full passes
- Swap Overhead: Bubble sort moves elements one position at a time through repeated adjacent swaps, so an element far from its final position requires many swaps, adding overhead compared to algorithms that move elements over longer distances
Usage
- Small Data Sets: If the data set is small and the simplicity of implementation is more important than speed, bubble sort can be a viable choice.
- Nearly Sorted Data: Bubble sort performs reasonably well on data sets that are already partially or nearly sorted. It has a characteristic of "bubbling" larger elements to the end of the list with each pass. Therefore, if the input is already close to its sorted state, bubble sort can quickly detect and swap out-of-order adjacent elements, making it efficient in such cases.
- Educational Purposes: Bubble sort is often used in educational settings to introduce the concept of sorting algorithms. Its simplicity and easy visualization make it an excellent tool for teaching basic sorting techniques and the concept of algorithmic complexity.
- Performance Analysis: Although bubble sort is not typically used in practical applications for large data sets, it can be useful for comparing the performance of other sorting algorithms. By comparing the number of comparisons and swaps required by bubble sort to other algorithms, one can gain insights into the efficiency of different sorting techniques.
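To support the performance-analysis use case, a bubble sort variant instrumented with comparison and swap counters can be sketched in Python; the function name and the returned tuple shape are illustrative choices, not part of any standard API:

```python
def bubble_sort_counting(items):
    """Bubble sort that also reports how many comparisons and swaps it made."""
    items = list(items)  # work on a copy so the input is left untouched
    comparisons = swaps = 0
    n = len(items)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            comparisons += 1
            if items[j] > items[j + 1]:
                items[j], items[j + 1] = items[j + 1], items[j]
                swaps += 1
                swapped = True
        if not swapped:
            break  # early exit: previous pass made no swaps
    return items, comparisons, swaps
```

On an already sorted list of n elements this reports n-1 comparisons and 0 swaps, while a reverse-sorted list shows the quadratic growth in both counters.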
Insertion Sort
The insertion sort algorithm is a simple sorting algorithm that builds the final sorted list one item at a time. It works by iterating through the input list and repeatedly inserting each element into its correct position in the already sorted portion of the list.
Here's how the insertion sort algorithm works:
1. Start with the second element (index 1) of the list and treat it as the "key".
2. Compare the key with the element(s) before it in the sorted portion of the list, moving the greater elements one position up to make space for inserting the key.
3. Repeat step 2 until the key is in its correct position or there are no elements left in the sorted portion.
4. Move to the next element (increment the key position by 1) and repeat steps 2 and 3 until all elements have been processed.
5. The result is a list sorted in ascending order.
The insertion sort algorithm has a time complexity of $O(n^2)$ in the worst and average cases, where n is the number of elements in the list. It performs well on small lists and is efficient for partially sorted or nearly sorted lists. It is also an in-place sorting algorithm, requiring only a constant amount of additional space.
Pseudo Code
procedure insertionSort(A: list)
    n = length(A)
    for i from 1 to n-1 do
        key = A[i]
        j = i - 1
        while j >= 0 and A[j] > key do
            A[j + 1] = A[j]
            j = j - 1
        end while
        A[j + 1] = key
    end for
end procedure
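The pseudo code above translates directly into Python; the function name is illustrative:

```python
def insertion_sort(items):
    """Sort a list in place using insertion sort."""
    for i in range(1, len(items)):
        key = items[i]
        j = i - 1
        # Shift elements greater than key one position to the right.
        while j >= 0 and items[j] > key:
            items[j + 1] = items[j]
            j -= 1
        # Drop the key into the gap opened by the shifts.
        items[j + 1] = key
    return items
```

For example, `insertion_sort([5, 2, 4, 6, 1, 3])` returns `[1, 2, 3, 4, 5, 6]`; on an already sorted list the inner while loop never runs, giving the $O(n)$ best case.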
Complexity
Time Complexity
- Worst case: $O(n^2)$
- Average case: $O(n^2)$
- Best case: $O(n)$
Space Complexity
- Worst: $O(1)$
Pros and Cons
Pros
- Simplicity: Insertion sort is relatively simple to understand and implement. It has a straightforward logic compared to more complex sorting algorithms
- Efficient for Small Lists or Partially Sorted Lists: Insertion sort performs well for small lists or partially sorted lists. It has a best-case time complexity of $O(n)$ when the input is already sorted, making it efficient in such scenarios
- In-Place Sorting: Insertion sort operates in-place, meaning it does not require additional memory beyond the original list. It is memory efficient and suitable for scenarios with limited memory resources
- Adaptive: Insertion sort is adaptive and efficient in scenarios where the input list is already partially sorted. It requires fewer comparisons and swaps to sort the list in such cases, leading to improved performance
Cons
- Inefficiency for Large Lists: Insertion sort has a time complexity of $O(n^2)$ in the average and worst cases. This makes it inefficient for sorting large lists, as the number of comparisons and shifts grows quadratically with the input size
- Lack of Efficiency for Randomly Sorted Lists: When the input list is randomly ordered or in reverse order, insertion sort's performance degrades. It requires more comparisons and shifts to sort the list, resulting in decreased efficiency
- Not Suitable for Large-Scale Applications: Due to its relatively high time complexity, insertion sort is not commonly used in large-scale applications or scenarios where performance is a critical factor. More efficient sorting algorithms like merge sort or quicksort are preferred in such cases
- Shifting Overhead: Each insertion may shift a long run of elements one position to the right, which is costly on arrays when elements start far from their final positions
Usage
- Small Data Sets: Insertion sort performs well on small data sets.
- Partially Sorted Data: Insertion sort is beneficial for partially sorted data or data sets with a few out-of-order elements. It efficiently inserts elements into their correct positions within the sorted portion of the array, making it effective for somewhat ordered input data.
- Online Sorting: Insertion sort is well-suited for continuously changing or streaming data, as it can efficiently maintain a sorted array by incorporating new elements incrementally.
- Adaptive Sorting: Insertion sort is efficient for partially sorted data, minimizing the comparisons and shifts needed.
- Stable Sorting: Insertion sort's stability makes it advantageous for preserving the original order of equivalent elements during sorting.
Merge Sort
The Merge Sort algorithm is a divide-and-conquer algorithm that sorts an array or a list of elements. It proceeds in the following steps:
- Divide: The algorithm recursively divides the input array into two halves until each half contains only one element or is empty. It continues dividing the array until it reaches the base case of having single-element subarrays.
- Conquer: After the array is divided into the smallest possible subarrays, the algorithm starts merging them back together in a sorted manner. During the merging process, adjacent subarrays are compared, and the elements are rearranged in sorted order.
- Merge: The merging process compares the elements of the two subarrays and combines them into a single sorted subarray. It compares the elements from the subarrays one by one and places them in order into a temporary array or buffer. The comparison continues until all elements from both subarrays have been merged.
- Copy: Once the merging is complete, the sorted elements are copied back into the original array or a separate output array. The elements are copied back in the correct order, resulting in a fully sorted array.
Pseudo Code
procedure mergeSort(A: list)
    if length(A) <= 1 then
        return A
    else
        middle = floor(length(A) / 2)
        left = mergeSort(subarray(A, 0, middle))
        right = mergeSort(subarray(A, middle, length(A)))
        return merge(left, right)
    end if
end procedure
procedure merge(left: list, right: list)
    merged = []
    i = 0
    j = 0
    while i < length(left) and j < length(right) do
        if left[i] <= right[j] then
            append merged, left[i]
            i = i + 1
        else
            append merged, right[j]
            j = j + 1
        end if
    end while
    while i < length(left) do
        append merged, left[i]
        i = i + 1
    end while
    while j < length(right) do
        append merged, right[j]
        j = j + 1
    end while
    return merged
end procedure
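The two procedures above can be sketched in Python; slicing stands in for `subarray`, and the function names are illustrative:

```python
def merge_sort(items):
    """Return a new sorted list using top-down merge sort."""
    if len(items) <= 1:
        return list(items)
    middle = len(items) // 2          # floor division, as in the pseudo code
    left = merge_sort(items[:middle])
    right = merge_sort(items[middle:])
    return merge(left, right)

def merge(left, right):
    """Merge two already sorted lists into one sorted list."""
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        # Using <= keeps equal elements in their original order (stability).
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])   # copy any remaining elements from the left half
    merged.extend(right[j:])  # copy any remaining elements from the right half
    return merged
```

For example, `merge_sort([38, 27, 43, 3, 9, 82, 10])` returns `[3, 9, 10, 27, 38, 43, 82]`. Note that this version returns a new list rather than sorting in place, which reflects the $O(n)$ auxiliary space discussed below.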
Complexity
Time Complexity
- Worst case: $O(n \log n)$
- Average case: $O(n \log n)$
- Best case: $O(n \log n)$
Space Complexity
- Worst: $O(n)$
Pros and Cons
Pros
- Stable Sorting: Merge sort is stable; because the merge step takes from the left subarray when elements are equal, it preserves the relative order of equal elements.
- Guaranteed Worst-case Time Complexity: Merge Sort has a consistent worst-case time complexity of O(n log n), regardless of the input. This makes it a reliable choice for sorting large data sets.
- Efficient for Linked Lists: Merge Sort performs well with linked lists since it requires only a constant amount of additional memory to perform the merging process. It can efficiently sort linked lists without requiring additional space proportional to the size of the list.
- Divide-and-Conquer Paradigm: Merge Sort follows the divide-and-conquer paradigm, which breaks down the problem into smaller, more manageable subproblems. This makes the algorithm easier to implement and understand.
Cons
- Additional Space Complexity: Merge Sort requires additional memory to create temporary arrays during the merging process. This can be a drawback when working with large arrays, as the space complexity is O(n).
- Not In-place: Merge Sort does not sort the array in-place. It needs additional memory proportional to the size of the input array to perform the merging process. This can be a limitation when memory usage is a concern.
- Recursive Nature: Merge Sort is implemented recursively, which incurs function call overhead and requires a recursive stack. This can lead to performance issues when sorting extremely large arrays due to the recursion depth.
- Non-adaptive: Merge Sort does not take advantage of any existing order or partially sorted nature of the input array. It performs the same number of comparisons and merges regardless of the initial order of elements, making it less efficient in certain scenarios.
Usage
- Large Data Sets: Merge sort is an efficient choice for large data sets with a predictable time complexity of $O(n \log n)$. It divides the array, sorts the smaller parts recursively, and merges them, ensuring reliable performance for substantial amounts of data.
- Stability: Merge sort's stability makes it advantageous for preserving the original order of equivalent elements during sorting.
- External Sorting: Merge sort is well-suited for external sorting, where data resides on slower external storage devices like hard drives. It minimizes data movement during merging operations, making it efficient for handling large data sets with slower external storage access.
- Parallelization: Merge sort can be easily parallelized due to its divide-and-conquer nature.
- Predictable Performance: Merge sort offers consistent performance, although it may not have the fastest average-case time complexity compared to algorithms like quicksort. It ensures a worst-case time complexity of $O(n \log n)$, making it a reliable choice when predictable sorting performance is desired.