Previous Table of Contents Next


We could put a zero byte at the end of our buffer to allow strstr() to work, but why bother? The strstr() function must spend time either checking for the end of the string being searched or determining the length of that string—wasted effort given that we already know exactly how long our search buffer is. Even if a given strstr() implementation is well-written, its performance will suffer, at least for our application, from unnecessary overhead.

This illustrates why you shouldn’t think of C/C++ library functions as black boxes; understand what they do and try to figure out how they do it, and relate that to their performance in the context you’re interested in.

Brute-Force Techniques

Given that no C/C++ library function meets our needs precisely, an obvious alternative approach is the brute-force technique that uses memcmp() to compare every potential matching location in the buffer to the string we’re searching for, as illustrated in Figure 5.1.

By the way, we could, of course, use our own code, working with pointers in a loop, to perform the comparison in place of memcmp(). But memcmp() will almost certainly use the very fast REPZ CMPS instruction. However, never assume! It wouldn’t hurt to use a debugger to check out the actual machine-code implementation of memcmp() from your compiler. If necessary, you could always write your own assembly language implementation of memcmp().


Figure 5.1
  The brute-force searching technique.

Invoking memcmp() for each potential match location works, but entails considerable overhead. Each comparison requires that parameters be pushed and that a call to and return from memcmp() be performed, along with a pass through the comparison loop. Surely there’s a better way!

Indeed there is. We can eliminate most calls to memcmp() by performing a simple test on each potential match location that will reject most such locations right off the bat. We’ll just check whether the first character of the potentially matching buffer location matches the first character of the string we’re searching for. We could make this check by using a pointer in a loop to scan the buffer for the next match for the first character, stopping to check for a match with the rest of the string only when the first character matches, as shown in Figure 5.2.

Using memchr()

There’s yet a better way to implement this approach, however. Use the memchr() function, which does nothing more or less than find the next occurrence of a specified character in a fixed-length buffer (presumably by using the extremely efficient REPNZ SCASB instruction, although again it wouldn’t hurt to check). By using memchr() to scan for potential matches that can then be fully tested with memcmp(), we can build a highly efficient search engine that takes good advantage of the information we have about the buffer being searched and the string we’re searching for. Our engine also relies heavily on repeated string instructions, assuming that the memchr() and memcmp() library functions are properly coded.


Figure 5.2
  The faster string-searching technique.

We’re going to go with the this approach in our file-searching program; the only trick lies in deciding how to integrate this approach with restartable blocks in order to search through files larger than our buffer. This certainly isn’t the fastest-possible searching algorithm; as one example, the Boyer-Moore algorithm, which cleverly eliminates many buffer locations as potential matches in the process of checking preceding locations, can be considerably faster. However, the Boyer-Moore algorithm is quite complex to understand and implement, and would distract us from our main focus, restartable blocks, so we’ll save it for a later chapter (Chapter 14, to be precise). Besides, I suspect you’ll find the approach we’ll use to be fast enough for most purposes.

Now that we’ve selected a searching approach, let’s integrate it with file handling and searching through multiple blocks. In other words, let’s make it restartable.

Making a Search Restartable

As it happens, there’s no great trick to putting the pieces of this search program together. Basically, we’ll read in a buffer of data (we’ll work with 16K at a time to avoid signed overflow problems with integers), search it for a match with the memchr()/memcmp() engine described, and exit with a “string found” response if the desired string is found.

Otherwise, we’ll load in another buffer full of data from the file, search it, and so on. The only trick lies in handling potentially matching sequences in the file that start in one buffer and end in the next—that is, sequences that span buffers. We’ll handle this by copying the unchecked bytes at the end of one buffer to the start of the next and reading that many fewer bytes the next time we fill the buffer.

The exact number of bytes to be copied from the end of one buffer to the start of the next is the length of the searched-for string minus 1, since that’s how many bytes at the end of the buffer can’t be checked as possible matches (because the check would run off the end of the buffer).

That’s really all there is to it. Listing 5.1 shows the file-searching program. As you can see, it’s not particularly complex, although a few fairly opaque lines of code are required to handle merging the end of one block with the start of the next. The code that searches a single block—the function SearchForString()—is simple and compact (as it should be, given that it’s by far the most heavily-executed code in the listing).

Listing 5.1 nicely illustrates the core concept of restartable blocks: Organize your program so that you can do your processing within each block as fast as you could if there were only one block—which is to say at top speed—and make your blocks as large as possible in order to minimize the overhead associated with going from one block to the next.


Previous Table of Contents Next

Graphics Programming Black Book © 2001 Michael Abrash