I knew we couldn’t avoid it forever, the complicated stuff that is. Our days were always numbered, that’s just life. Our doom is upon us, alas, great mourning and suffering are about to stretch their cruel hand upon us.
Ok, I might have been a little dramatic. QuickSort is probably slightly more complicated than our last two DSA topics, Linked Lists, and Binary Search. But, if you remember our Binary Search probably, which required looking for a particular item in a sorted list/array, this topic of QuickSort is a natural next step.
Remember in our Binary Search discussion, it required a sorted list/array to work with? Someone might have asked, but where did this pre-sorted list/array come from? Well, that is the point of a QuickSort algorithm.
If we have a jumbled mess of a list or array, how can we efficiently sort it? Again, I’m going to keep this at a high level, so some things I’m going to skim over.
Thanks to Delta for sponsoring this newsletter! I personally use Delta Lake on a daily basis, and I believe this technology represents the future of Data Engineering. Check out their website below.
The Big Idea
One of the concepts behind Quick Sort is the “divide and conquer” approach. I encourage you to read more about this topic, I think it will be illuminating for you in how to understand complicated problems, even in your everyday Data Engineering life.
Break down a problem into it’s the smallest possible unit, then repeat, recursively.
The idea of a function calling itself, being recursive is a powerful one when it comes to writing algorithms, and QuickSort is probably one of the simplest DSA problems that use this idea of “divide and conquer” combined with recursive function calls to solve a problem.
{smallest unit of work} + (recursive) = answer
QuickSort Algorithm
With all that mumbo-jumbo out of the way, let’s dive into QuickSort. This is a great introduction to sorting algorithms, the complexity of that topic, divide and conquer, and finally recursive functions.
sorting complexity
divide and conquer
recursion
The basis of Quick sort, the problem, and the solution can be seen in the below diagram. The idea is you have a list/array of unsorted items, how do you efficiently sort them into the right order?
Here is my 10,000-foot view of how to do that.
First, pick a “pivot” point out the array.
typically the first, or last item.
Iterate through the rest of the items.
if the item is greater > than your pivot push to another array
if the item is less > than your pivot, push it to another array.
Of course, there are slight variations in the implementations, depending on the language. In Python, for example, many examples will simply shuffle the items in the array, but I felt using the concept of the brand new array to capture what happening was helpful.
Let’s walk through a QuickSort example in Rust.
First things first, let’s make an unordered list of ints in a Vector.
The next step would be actually writing our QuickSort function, it will of course take a Vec and return a Vec. If you’re not familiar with Rust, just think of a Vector like it’s a List or Array.
You will also not in the above function that we check for an empty Vec, and probably should actually change that code to be < 2, so if there are 1 or 0 elements, we simply return as there is no sorting to do.
The next few steps in my QuickSort implementation are fairly simple, I will pick a pivot, the first time, and create two new Vectors, to hold values above and below my pivot.
The Meat and Taters.
Ok, so now that the easy stuff is behind us we are getting to the meat n’ taters portion of the QuickSort, where all the magic happens. Let’s review quickly what we need to happen before we dive in.
We picked a pivot, or number to hold, in our case the first item in the Vector.
We want to iter through the remaining items in the Vector checking if they are greater than or less than our pivot.
We want to push items into one of our two Vectors, the left, less than or equal to side, or right, higher than side.
Then we must assemble these pieces back together into a single Vector, [left] + pivot + [right']
We must then call the function on itself and apply this same logic over and over again in a recursive manner.
Let’s get cracking.
The next most obvious thing to do is iterate the input Vector, pushing the items after our pivot into the correct new Vector.
Again, this part isn’t all that complicated. You can see we start to iterate the Vector starting from position 1, notice it’s not 0, that’s our pivot. We then simply check it against our pivot and based on if it’s less than or greater or equal to the pivot, it will go into either the left or right Vector.
Now for the finishing touches.
Let’s see if you notice the special sauce in the mix. See it? We take our left, right, and pivot Vectors and combine them into a single result. With one catch.
We insert a recursive call to the function we are writing, we call itself on the left and right vectors while appending them to the result. This might look a little strange at first if you’re not used to recursive calls, a function calling itself, but it is indeed a powerful idea.
What’s the result of running this code?
On the top, you can see the unsorted Vector printed out, and on the bottom the sorted one. It’s like magic. You can find the entire code here.
Closing Thoughts.
I hope that was a decent introduction to QuickSort, and a great building block upon the Linked Lists and Binary Search we have already covered in this introduction to DSA for the rest of us.
I think that QuickSort definitely ups the ante a notch, but mostly because of two important concepts.
Divide and conquer.
Recursion.
I think both these topics are well worth your time to spend a little energy researching and reading up on the two, if not to become an expert, just to ensure you understand them, and can at least talk with some form of confidence on the topic.