Bubble Sort

Bubble Sort #

Contents #

Bubble sort is a basic comparison sort which requires \(\Theta(n^2)\) time and \(\Theta(1)\) space to process an unsorted array of length \(n\).

 INPUT    an array A[0:n - 1] of n values where n >= 0
OUTPUT    the same array sorted in ascending order

BUBBLESORT(A, n)                        // move large elements to the right, and iterate over left subarrays
1    for i = 0 to (n - 1) - 1           // 0->n-2
2        for j = 0 to (n - 1 - i) - 1   // 0->n-2, 0->n-3, ..., 0->0 (first to second to last)
3            if A[j] > A[j + 1]
4                swap(A[j], A[j + 1])

A: 6 5 4 3 2 1     i = 0
   |           <-- j = 0

A: 5 6 4 3 2 1     i = 0
     |         <-- j = 1

A: 5 4 6 3 2 1     i = 0
       |       <-- j = 2

A: 5 4 3 6 2 1     i = 0
         |     <-- j = 3

A: 5 4 3 2 6 1     i = 0
           |   <-- j = 4 (5 iterations)

A: 5 4 3 2 1 6     i = 1
   |           <-- j = 0

A: 4 5 3 2 1 6     i = 1
     |         <-- j = 1

A: 3 4 5 2 1 6     i = 1
       |       <-- j = 2

A: 4 3 2 5 1 6     i = 1
         |     <-- j = 3 (4 iterations)

A: 4 3 2 1 5 6     i = 2
   |           <-- j = 0

A: 3 4 2 1 5 6     i = 2
     |         <-- j = 1

A: 3 2 4 1 5 6     i = 2
       |       <-- j = 2 (3 iterations)

A: 3 2 1 4 5 6     i = 3
   |           <-- j = 0

A: 2 3 1 4 5 6     i = 3
     |         <-- j = 1 (2 iterations)

A: 2 1 3 4 5 6     i = 4
   |           <-- j = 0 (1 iteration)

A: 1 2 3 4 5 6

BUBBLESORT(A, n)                  // move small elements to the left, and iterate over right subarrays
  for i = 0 to (n - 1) - 1     // 0->n-2
      for j = n - 1 to i + 1   // n-1->1, n-1->2, ..., n-1->n-1 (last to second)
          if A[j] < A[j - 1]
              swap(A[j], A[j - 1])

A: 6 5 4 3 2 1     i = 0
             | <-- j = 5

A: 6 5 4 3 1 2     i = 0
           |   <-- j = 4

A: 6 5 4 1 3 2     i = 0
         |     <-- j = 3

A: 6 5 1 4 3 2     i = 0
       |       <-- j = 2

A: 6 1 5 4 3 2     i = 0
     |         <-- j = 1 (5 iterations)

A: 1 6 5 4 3 2     i = 1
             | <-- j = 5

A: 1 6 5 4 2 3     i = 1
           |   <-- j = 4

A: 1 6 5 2 4 3     i = 1
         |     <-- j = 3

A: 1 6 2 5 4 3     i = 1
       |       <-- j = 2 (4 itertations)

A: 1 2 6 5 4 3     i = 2
             | <-- j = 5

A: 1 2 6 5 3 4     i = 2
           |   <-- j = 4

A: 1 2 6 3 5 4     i = 2
         |     <-- j = 3 (3 iterations)

A: 1 2 3 6 5 4     i = 3
             | <-- j = 5

A: 1 2 3 6 4 5     i = 3
           |   <-- j = 4 (2 iterations)

A: 1 2 3 4 6 5     i = 4
             | <-- j = 5 (1 iteration)

A: 1 2 3 4 5 6

Complexity #

Time #

The outer loop iterates over \(n - 1\) elements. The inner loop iterates over at most \(n - 1\) elements.

\(T(n) = (n - 1)(n - 1) = O(n^2)\)

On each iteration BUBBLESORT places an element into sorted order and then recursively sorts the remaining elements. There are \(n\) expansions of the recursive function. Each term in the expaned form is no greater than \(n\) and so we can bound the expression by \(n\) terms of \(n\).

\( T(n) = \begin{cases} T(1) = 0 \\ T(2) = 1 \\ T(n) = T(n - 1) + n - 1 && n \gt 2 \\ \end {cases} \)

\( T(n) = \begin{cases} T(0) &= 1 \\ T(n) &= n + T(n - 1) \\ \end {cases} \)

\( \begin{aligned} T(n) &= n + T(n - 1) \\ & \left.\begin{aligned} &= && n && + && (n - 1) && + && (n - 2) && + && \dotsb && + && 1 \\ &\le && n && + && n && + && n && + && \dotsb && + && n \\ \end{aligned}\right. \\ &= n^2 \end {aligned} \)

Can we say that the time complexity of BUBBLESORT is not only \(O(n^2)\) but \(o(n^2)\)?

The complexity function has the quadaratic form \(an^2 + bn + c\) where \(a \ne 0\).

\( \begin{aligned} \lim_{n \to \infty} \frac{an^2 + bn + c}{n^2} = a \lim_{n \to \infty} 1 + b \lim_{n \to \infty} \frac{1}{n} + c \lim_{n \to \infty} \frac{1}{n^2} = a \end {aligned} \)

Since the limit evaluates to a nonzero constant we can say that BUBBLESORT is \(\Theta(n^2)\) but not \(o(n^2)\).

For an array of size \(1\) there are \(0\) comparisons since there is only one element.

\(T(1) = 0\)

For an array of size \(2\) there is \(1\) comparison among the two elements.

\(T(2) = 1\)

For an array of size \(3\) there are \(2\) initial comparisons, one comparison between the first and second elements and another comparison between the second and third elements. At this point the largest element in the array is now located at the end of the array. There are then \(T(2) = 1\) comparisons between the first and second elements.

For an input of size \(n\), there are a total of \(T(n) = T(n - 1) + n - 1\) comparisons.

\( T(n) = \begin{cases} T(0) & = 0 \\ T(1) & = 0 \\ T(n) & = T(n - 1) + n - 1 && n \ge 2 \\ \end {cases} \)

To solve this linear nonhomogeneous recurrence relation with constant coefficients we must

1. solve its associated linear homogeneous recurrence

\(T(n)^{(h)} = T(n - 1)\)

Its solutions are

\(c \cdot 1^n = c\) for some constant \(c\)

2. find a particular solution for the given linear nonhomogeneous recurrence

\( \begin{aligned} T(n) & = T(n - 1) + n - 1 \\ T(n) & = T(n - 1) + F(n) && F(n) = n - 1 \\ \end {aligned} \)

Since \(F(n) = n - 1 = (b_1n + b_0)s^n\) where \(b_1 = 1, b_0 = -1, s = 1\) there is a particular solution of the form \(n(p_1n + p_0)s^n = n(p_1n + p_0) = p_1n^2 + p_0n\).

\( \begin{aligned} T(n) & = T(n - 1) + n - 1 \\ p_1n^2 + p_0n & = p_1(n - 1)^2 + p_0(n - 1) + n - 1 \\ p_1n^2 + p_0n & = p_1(n^2 - 2n + 1) + p_0(n - 1) + n - 1 \\ p_1n^2 + p_0n & = p_1n^2 - 2p_1n + p_1 + p_0n - p_0 + n - 1 \\ 0 & = -2p_1n + p_1 - p_0 + n - 1 \\ 2p_1n - n - p_1 + p_0 + 1 & = 0 \\ (2p_1 - 1)n + (p_0 - p_1 + 1) & = 0 \\ \end {aligned} \)

\(2p_1 - 1 = 0 \implies p_1 = 1/2\)

\(p_0 - p_1 + 1 = 0 \implies p_0 = p_1 - 1 = (1/2) - 1 = -1/2\)

\( \begin{aligned} T(n)^{(p)} = p_1n^2 + p_0n = (1/2)n^2 + (-1/2)n = (1/2)n(n - 1) \\ \end {aligned} \)

\( T(n) = T(n)^{(h)} + T(n)^{(p)} = c + (1/2)n(n - 1) \implies T(1) = 0 = c + (1/2)(1)((1) - 1) \implies c = 0 \)

\( \boxed{ \begin{aligned} T(n) = \frac{n(n - 1)}{2} \end {aligned} } \)

Correctness #

\(\textbf{Precondition}\)

1. An array \(A[0:n - 1]\) of \(n\) values where \(n \ge 0\)
2. \(i = 0\)

\(\textbf{Condition}\)

\(i \lt n - 1\)

\(\textbf{Invariant}\)

1. \(0 \le i \le n - 1\)
2. After \(i\) iterations the subarray \(A[n - i:n - 1]\) contains the largest \(i\) values of \(A\) sorted in ascending order

\(\textbf{Variant}\)

\(n - i\)

\(\textbf{Postcondition}\)

After \(n\) iterations the subarray \(A[0:n - 1]\) contains all \(n\) values of \(A\) sorted in ascending order

\(\textbf{Initialization}\)

The first iteration (iteration \(0\)) performs a pass over subarray \(A[0:n - 1 - 1]\).

At some index \(\ell\) for \(0 \le \ell \le n - 2\) it is the case that \(A[\ell] \gt A[j]\)

At the end of iteration \(0\), the subarray \(A[n - 1:n - 1]\) contains the single largest value of \(A\) which is trivially in sorted order.

\(\textbf{Maintenance}\)

At the end of iteration \(k\), the subarray \(A[n - 1 - k:n - 1]\) contains the \(k + 1\) largest values of \(A\) in sorted order.

Iteration \(k + 1\) performs a pass over subarray \(A[0:n - 1 - k - 1]\) and at some index \(\ell'\) where \(0 \le \ell' \le j\) is the case that \(A[\ell'] \gt A[j]\).

At the end of iteration \(k + 1\), the subarray \(A[n - 1 - k - 1:n - 1]\) contains the \(k + 2\) largest values of \(A\) in sorted order.

\(\textbf{Termination}\)

At the end of the last iteration \(n - 1\), the subarray \(A[0:n - 1]\) contains the \(n\) largest values of \(A\) in sorted order.

Implementations #

C #

// bubble_sort.c

#include <stdio.h>

void print_array (int A[], int n) {
  for (int i = 0; i < n; i++) {
    printf("%d ", A[i]);
  }
  printf("\n");
}

void swap (int *a, int *b) {
  int temp = *a;
      *a   = *b;
      *b   = temp;
}

void bubble_sort (int A[], int n) {
  for (int i = 0; i < n - 1; i++) {
    for (int j = 0; j < n - 1 - i; j++) {
      if (A[j] > A[j + 1]) {
        swap(&A[j], &A[j + 1]);
      }
    }
  }
}

int main () {
  int A[] = {5, 4, 3, 2, 1};
  int n   = sizeof(A) / sizeof(A[0]);

  print_array(A, n);
  bubble_sort(A, n);
  print_array(A, n);

  return 0;
}

Python #

def bubble_sort (A, n):
  for i in range(n - 2 + 1):       # +1 bc upper bound is exclusive
    for j in range(n - 2 - i + 1): # +1 bc upper bound is exclusive
      if A[j] > A[j + 1]:
        A[j], A[j + 1] = A[j + 1], A[j]

A = [5, 4, 3, 2, 1]
n = len(A)

print(A)
bubble_sort(A, n)
print(A)

[5, 4, 3, 2, 1]
[1, 2, 3, 4, 5]

Acknowledgments #

https://www.geeksforgeeks.org/bubble-sort/