DSA For The Rest Of Us - Part 2

Introduction to Binary Search.

Mar 27, 2023

In Part 1 of this most illustrious and illuminating series on DSA (Data Structure and Algorithms), we converted the basics of Linked Lists. In Part 2 of this series, we going to cover the much-maligned Binary Search.

I don’t know if it’s from LeetCode hatred, which is justified, or what, but it seems like Binary Search has always taken the beating in some online rant about how out of touch the interview process is with real life. While this might be true, we are taking a different approach in this series on DSA.

Sure, some things might be pointless on the face value that you won’t use them explicitly per se in your day job, but to think that becoming smarter, and understanding fundamentally how many popular DSA topics work, well, that’s just short-sighted to your future self.

Pushing ourselves as Data Engineers and code writers is the only way to get better. Like it or not, DSA comes up in the interview life. But, it’s also true that a fundamental understanding of what even a Linked List is, or how basic Binary Search would make you worse off, well that just isn’t true. In fact, the opposite. You will be more well-versed, accomplished, and confident.

Thanks to Delta for sponsoring this newsletter! I personally use Delta Lake on a daily basis, and I believe this technology represents the future of Data Engineering. Check out their website below.

Introduction to Binary Search. What is it?

Code available on GitHub.

Binary search, I’m sure you’ve heard about it at least once or twice in passing. Sounds scary right? Well, it’s not the evil monster you think it is. We are going to take the 10,000-foot view of binary search, let’s make it easy.

What is binary search?

“An algorithm that can quickly (logarithmic time) search through a sorted list/array of elements to find what it’s looking for.”

Let’s think about the problem at a high level.

You have a “list”, “array” or whatever of sorted items.
You want a function that can perform the best Big O notation search to find the location of a random element.

Ok, so you really only have two options to solve this problem.

Starting going through the items one by one until you find what you’re looking for.
Jump around in some sort of “smart” manner to find the item more quickly.

The answer that makes the binary search a great option and teaching tool for algorithms is that the solution is straightforward, and makes you say “well duh.” The idea is to cut the search area in half every time, you can do this by saying is the x I’m looking for greater or less than where I am currently at. Then jumping to the middle of the next slice and repeating the process.

Example

I’m sure it will make more sense with an example, first, we will walk through a simple example using pictures then we will switch to code.

Let’s say we have an array of numbers, 1-12, like in our picture above. For argument’s sake let’s say we want to write a function that can take any number, x, and find its location.

The first most obvious thing to do, in binary search, is to go directly to the middle of the array and ask if the x is either equal to, greater than, or less than that number. If it’s greater, then we can throw away the bottom half of the array, if it’s less than that, we can throw away the top portion.

Let’s say x came in as 4.

If x was equal to 4, and we started in the middle, at 6, and 4 is less than 6, we can throw away half the array. Obviously, we could throw away 6 as well, but you get the point. And guess what, if we do that again, we can do the same thing, cut the search area in half!

And so it goes. Each time we can throw away half the list/array area we are searching. This is the infamous binary search. Look at that, all your fears about running into binary search in some interview or now swiftly fading into the distant background! Well, maybe.

Let’s try to write some code first before we declare victory.

Binary Search with Rust.

Ok, since we are gluttons for punishment, and of course, we could use Python like everyone else, we will do no such thing! Rust it is. So we have the theory, right? We need a function that can take any number x, search through an array of integers and return the location of that x if it exists.

First, the array.

Well, that was the easy part. Next our function.

Hmm… so the first thing we would want to do is slice into the very middle of numbers, and check if that position is equal to, greater than, or less than our target number.

That should do the trick, the length of the Vector divided by 2. So here is the shell of our Rust function so far.

Ok, but now following our logic, we need to check if our target is above or below the index, and throw away what we don’t need. Let’s add some more if-else logic.

Ok, what would happen if we put this together and run it so far?

So we can see we are headed in the right direction! We passed in 42 as our target and the code picked the correct half of the array to work on!

But we have a problem, this function requires some loop or recursion. That should be obvious to smart engineers like yourself. Also, we either have to modify the Vector based on throwing out what we dont need, or look into a slice of the array we are interested in.

Here is the working code, while loop, and all. Although I think it would be a fun exercise to use recursion (function calling itself) rather than a while loop. Also, not modifying the Vector like I’m doing would be a better implementation as well.

Let me preface this code by saying I don’t know Rust very well, only a month or two in. Also, my solution is just that, my first solution that is not production quality, I pray you can write a better one! But, does it find what it’s looking for? Yes!

And here is the results of the print statements. The important thing to note is how to search area is sliced in half each time, which is the point of binary search, finding our needle in the haystack as quickly as possible.

A very interesting and fairly simple algorithm!

Not the best binary search, far from it, but I think it’s a simple enough example to show us the ideas behind binary search and possible implementations. It’s a great place to start, a fun project to work on (without Googling or ChatGTP’ing answers), which can be tempting to do, but detrimental to learning.

After I finished this little fiasco, I went and asked ChatGTP to do it. It’s smarter than me and gave better code, here it is.

Using ChatGTP or CoPilot to write code is dangerous, sure it can write it, but if you’re learning a new topic, you should struggle through it yourself so you actually learn something.

I learned a lot about Rust when writing my version, I was especially excited to learn the drain() method on Vectors. If I had asked AI for the answer, I would have learned nothing, but now I have binary search embedded in my brain because I struggled through it.

I hope you’ve enjoyed this simple series on DSA as much as I have. It’s good to do the basics, and learn them from scratch. It does a lot to make it stick in your head for the long term.