Understanding Radix Sort
Disclaimer: Radix sort will only work for sorting numbers. I’ll explain why later on!
Throughout my studies of data structures and algorithms over the last couple of months, I have learned several sorting algorithms. Most of them are fairly similar in the way that they sort data. They compare two pieces of data at once and put them into the right place based on that. Bubble sort, merge sort, selection sort, insertion sort, and quick sort all fall into this category, which are called “comparison” algorithms. While these are all useful algorithms to know, and each have specific situations that they shine in, they are not remarkably fast. Bubble, insertion, and selection sort all have average time complexity of O(n²). This week, I learned about Radix sort. It is the first sorting algorithm I have learned that does not directly compare two items to sort. So, if the algorithm is not comparing any two items to each other, how can it properly sort data?
The answer is, we use “buckets.” We create a bucket for each number 0–9, and then put each number into the correct bucket. For example, if we were sorting a list of 21, 43, and 15, they would be put into the 1, 3, and 5 buckets respectively, then merged back into the original data set. But just doing this alone does not sort the data. This process needs to be repeated as many times as there are digits in the highest number of your data set. I know that is a bit of a wordy explanation, so I’ll give an example. If the highest number in your dataset is 100, you need to sort 3 times. If the highest is 1000, you need to run it 4 times. By doing this, we do not need to compare each number individually, just put it into the correct bucket. Once that is done, we push them back into the original list in ascending bucket order (0–9).
In order to write a radix sort function, helper functions can clean up the code significantly. Aside from actually sorting the data, we also need to do a few other things first. First and foremost, we need to know how many rounds of sorting are necessary. In order to do this, I wrote two functions. The first is called getDigits. It takes a number as an argument and returns how many digits are in that number. The second helper function I wrote is called mostDigits. It takes the dataset that we are sorting as an argument and calls getDigits on all of them, before finally returning the amount of digits in the largest number.
Next, I wrote a function called getNum. The arguments it takes are a digit and the place or power of 10 we are looking for that digit in. for example, in the number 1752, i would be 3 if you were looking for the 1, 2 if you were looking for the 7, 1 if you were looking for the 5, and 0 if you were looking for the 2.
Now that we have all of these helper functions out of the way, we are ready to write Radix sort!
I found writing radix sort to be an extremely challenging and rewarding experience. The challenge was not in the logic itself, but in approaching a sorting algorithm that was not a comparison algorithm. Getting out of the frame of mind of writing a comparison algorithm was tough, but I’m glad that I learned how radix sort works, it was definitely an eye-opener.
I think seeing this represented visually makes it much easier to understand, and I would highly recommend going to visualgo and checking out their customizable radix sort visualization.