Steve Daskam's Blog

Blog about the daily practices and methods of an agile software developer.

The Binary Search Challenge

with 5 comments

Binary SearchI came across this post recently that claimed that only 10% of professional programmers can write a correct implementation of a binary search. The post quotes the book “Programming Pearls” by Jon Bentley, and says that there’s a passage which claims that the assignment of writing a proper binary search routine was given to over a hundred programmers at companies like Bell Labs and IBM. These developers were given several hours to come up with a solution and afterwords were allowed to test their code. Only 10% had gotten it right.

Now this is not surprising to me as Knuth points out that while the first binary search was published in 1946, the first published binary search without bugs did not appear until 1962. And come on, how often do programmers get anything right the first time? That’s why they invented TDD and refactoring.

So, since I have never actually written this algorithm myself, I thought it would be fun to take the challenge. Using only the requirements below, write a correctly functioning implementation of the binary search algorithm in the language or your choice.

Binary search solves the problem [of searching within a pre-sorted array] by keeping track of a range within the array in which T [i.e. the sought value] must be if it is anywhere in the array. Initially, the range is the entire array. The range is shrunk by comparing its middle element to T and discarding half the range. The process continues until T is discovered in the array, or until the range in which it must lie is known to be empty. In an N-element table, the search uses roughly log(2) N comparisons.

I spent around 1 1/2 to 2 hours max to come up with my initial implementation in java below…pardon my unclean rough draft. Unfortunately, I didn’t get it right the first time, but was able to easily write the code and tests for it within the above timeframe. The algorithm has some small subtleties that are tricky.

I have tested it with the following data and it works fine. Note that the algorithm requires the array to be pre-sorted.
Test 1: private Integer[] myArray = {1};
Test 2: private Integer[] myArray = {1, 5};
Test 3: private Integer[] myArray = {1, 3, 5};
Test 4: private Integer[] myArray = {1, 3, 5, 8, 9, 10, 12, 15};

My Binary Search Implementation

public int binarySearch(int t) {		
   return search(t, 0, myArray.length - 1);
}

private int search(int t, int start, int end) {
   // Not enough items to divide in half
   if (end - start < 2) {
      for (int i = start; i <= end; i++) {
         if (t == myArray[i]) {
            return i;
         }
      }
      return -1;
   }
		
   int pos = ((end - start) / 2) + start;
   if (t == myArray[pos]) {
      return pos;
   } else if (t > myArray[pos]) {
      return search(t, pos + 1, end);
   } else {
      return search(t, start, pos - 1);
   }
}

I originally vascillated between using a loop and using recursion, but I ended up choosing recursion as it seemed a little easier to get my head around. Also, I chose to return -1 if the search value was not found (opposed to the java way of returning -(insertion point + 1)) as there was no specifics about this in the requirements.

After I had finished the code, I came across this post by Josh Bloch that talks about his implementation (which is much cleaner, see below). Bloch talks about how a bug had been reported on his implementation of the binary search algorithm in java, and I was happy to find that I had been able to avoid the bug in my own code above without even knowing that the problem had existed in java :)

Java Binary Search Algorithm – written by Josh Bloch

1:     public static int binarySearch(int[] a, int key) {
2:         int low = 0;
3:         int high = a.length - 1;
4:
5:         while (low <= high) {
6:             int mid = low + ((high - low) / 2);
7:             int midVal = a[mid];
8:
9:             if (midVal < key)
10:                 low = mid + 1
11:             else if (midVal > key)
12:                 high = mid - 1;
13:             else
14:                 return mid; // key found
15:         }
16:         return -(low + 1);  // key not found.
17:     }

Advertisement

Written by stevedaskam

October 1, 2010 at 3:16 pm

Posted in Algorithms

5 Responses

Subscribe to comments with RSS.

  1. You did fall for the oldest bug in binary searches. ;)

    int mid = (low + high) / 2;

    can overflow if (low + high) > Integer.MAX_VALUE resulting in a negative value for mid.

    The solution is to use

    int mid = (low + high) >>> 1;

    It does illustrate your point that its not easy to get right.

    Peter Lawrey

    October 1, 2010 at 4:16 pm

    • Actually, I think this is the point you were trying to make, however you could have provided the fixed code rather than the one with the known bug.

      BTW:

      int pos = (end + start) >>> 1;

      is shorter, possibly more efficient.

      Peter Lawrey

      October 1, 2010 at 4:19 pm

  2. The correct way is:

    mid = lo + (hi – lo) / 2;

    Hugh Brown

    October 3, 2010 at 9:31 pm

  3. Only the purist would argue over an edge-case that occurs when the array being searched is > Integer.MAX_VALUE / 2 in length. If your array is regularly reaching that length, it might be worth investigating storing it on disk to avoid the limitation of array bounds.

    (low + high) >>> 1 will still overflow in the same way (low + high) / 2 does, no?

    We teach a similar puzzle at computer camp to our ~11 year old campers. After solving the ‘have the user guess the (random) number the computer is thinking’ problem, simply flipping the problem into ‘have the computer guess the number the user is thinking of’ makes the problem a lot more interesting. It’s pretty sad professional programmers get this wrong.

    Charlie Hayes

    October 4, 2010 at 1:57 pm

  4. Thanks guys!! I fixed the faulty line in Josh Bloch’s implementation.

    stevedaskam

    October 5, 2010 at 10:56 pm


Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Connecting to %s

Follow

Get every new post delivered to your Inbox.