I noticed that one of the topics of interest is interviews and tasks for developers from large tech corporations. In this sense, Google is perhaps one of the most desirable places for most programmers, so many taste their luck with their hiring process.

They usually do three tasks at the interview, but in this text, I will present only one.

Task:

A sorted array of N integer elements is given. It is necessary to count how many times a given number appears.

A trivial solution is to count all occurrences of this element in one pass through the array in linear complexity. But, with such knowledge, you probably won't pass interviews, because this solution is ineffective. Since the array is sorted, the idea is to use a binary search. If we were to check whether an element exists in the sequence at all, this would be easily done in logarithmic complexity, but our task is somewhat more difficult because we are looking for the length of the part of the sequence where this element appears.

The first way with binary search is to find the required element, and then iterate to the left and right of it and count the occurrences. However, in the worst case, this approach has linear complexity, because only that element may be found in the entire sequence.

We will solve this by using a slightly different binary search that looks for the last element in the sequence that is strictly smaller than X. With this algorithm, we will be able to find the ends of the interval that contains the required number X:
to find the beginning of the interval, we will call binary search with parameter X, and as its return value we will get the first element to the left of the substring we are interested in, so we simply increase its index by 1 to find the end of the interval, we will call the binary search with the parameter X+1, and the return value will represent exactly the end of the interval
The idea is very simple, the problem could be the realization that they would surely ask you at the interview because there are a lot of special cases that you need to pay attention to:

  • What if that element doesn't exist?
  • What if the substring is at the very beginning of the string?
  • What if the substring is at the very end of the string?

The following is a C++ code that successfully solves the given problem.

int countOccurancesInSortedArray(int *array, int len, int x) {

    // Special cases

     

    if (len == 0) {

        // Empty array

         

        return 0;

    }

     

    if (x < array[0] || x > array[len-1]) {

        // Number definitely not present in array

         

        return 0;

    }

     

    // Find the end of the interval

     

    int right = findLastElementSmallerThan(x+1, array, 0, len-1);

     

    if (array[right] != x) {

        // Element does not exist, since the end of the interval isn't X

         

        return 0;

    }

     

    // No need to go over the entire array, just left of the end of the interval

    // Also, if the array starts with X, the return index is -1

    // We need to pass the appropriate parameters, i.e. "-1" and "right"

     

    int left = findLastElementSmallerThan(x, array, -1, right) + 1;

     

    // Result

     

    return right - left + 1;

}

int findLastElementSmallerThan(int x, int *array, int left, int right) {

    while (left < right) {

        int mid = (right == 0) ? 0 : (left + right) / 2 + 1;

         

        if (array[mid] >= x) {

            right = mid - 1;

        } else  {

            left = mid;

        }

    }

     

    return right;

}