Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Please note that this is programming I purely did for the learning experience. The pure R bubble sort implemented in this post is veeeeery slow for two reasons:
- Interpreted code with lots of iteration is very slow.
- Bubble sort is one of the slowest sorting algorithms (
O(N^2)
)
The bubble sort sorting algorithm works by iterating over the unsorted vector and comparing pairs of numbers. Let’s say the first point pair is c(61, 3)
, here the numbers need to be swapped as the 3 should be earlier in the sorted vector. The following function returns TRUE
if the numbers should be swapped, and returns FALSE
otherwise:
larger = function(pair) { if(pair[1] > pair[2]) return(TRUE) else return(FALSE) }
This function is used by the following function:
swap_if_larger = function(pair) { if(larger(pair)) { return(rev(pair)) } else { return(pair) } }
which returns the swapped version of the pair if appropriate, or the original pair if the order is ok. For each point pair (element1-element2, element2-element3, etc) swap_if_larger
is called:
swap_pass = function(vec) { for(i in seq(1, length(vec)-1)) { vec[i:(i+1)] = swap_if_larger(vec[i:(i+1)]) } return(vec) }
One pass of this function performs a comparison on all pairs, swapping if necessary. To fully sort the vector, we need to perform multiple passes until no swaps are needed anymore. I chose to implement this using recursion:
bubble_sort = function(vec) { new_vec = swap_pass(vec) if(isTRUE(all.equal(vec, new_vec))) { return(new_vec) } else { return(bubble_sort(new_vec)) } }
The function starts by perform a swapping pass over the vector. If the new vector is equal to the old vector, no swaps where needed, i.e. the vector is already sorted. The function than returns the vector. Alternatively, if the vectors are different, the vector is not yet fully sorted, and we need to perform more passes. This is accomplished by recursively calling bubble_sort
again on the vector. An example of the function in action:
> test_vec = round(runif(100, 0, 100)) > bubble_sort(test_vec) [1] 1 1 6 6 9 10 10 10 13 14 14 15 19 19 20 21 23 24 24 24 26 26 26 26 27 [26] 28 28 30 31 32 34 35 35 36 36 37 39 39 40 40 40 41 41 41 41 43 43 43 45 46 [51] 47 51 56 56 57 57 57 58 58 59 61 61 62 63 64 65 68 68 69 70 71 71 72 73 74 [76] 75 75 75 78 79 82 82 84 85 88 88 89 90 91 91 91 92 92 92 92 93 93 96 96 99 >
The full sorting process is nicely illustrated by the following animated gif (linked from wikipedia):
This implementation is horribly slow:
> system.time(bubble_sort(test_vec)) user system elapsed 0.076 0.000 0.077 > system.time(sort(test_vec)) user system elapsed 0.001 0.000 0.001
Probably implementing the relatively slow bubble sort in a compiled language pose a dramatic increase in speed. Maybe a nice first testcase for Rcpp
…
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.