Previous Table of Contents Next


The third reason is often fallacious. C library functions are not always written in assembly, nor are they always particularly well-optimized. (In fact, they’re often written for portability, which has nothing to do with optimization.) What’s more, they’re general-purpose functions, and often can be outperformed by well-but-not- brilliantly-written code that is well-matched to a specific task. As an example, consider Listing 1.5, which uses internal buffering to handle blocks of bytes at a time. Table 1.1 shows that Listing 1.5 is 2.5 to 4 times faster than Listing 1.4 (and as much as 49 times faster than Listing 1.1!), even though it uses no assembly at all.

Clearly, you can do well by using special-purpose C code in place of a C library function—if you have a thorough understanding of how the C library function operates and exactly what your application needs done. Otherwise, you’ll end up rewriting C library functions in C, which makes no sense at all.

LISTING 1.5 L1-5.C

/*
* Program to calculate the 16-bit checksum of the stream of bytes
* from the specified file. Buffers the bytes internally, rather
* than letting C or DOS do the work.
*/
#include <stdio.h>
#include <fcntl.h>
#include <alloc.h>   /* alloc.h for Borland,
                                malloc.h for Microsoft  */

#define BUFFER_SIZE  0x8000   /* 32Kb data buffer */

main(int argc, char *argv[]) {
      int Handle;
      unsigned int Checksum;
      unsigned char *WorkingBuffer, *WorkingPtr;
      int WorkingLength, LengthCount;

      if ( argc != 2 ) {
            printf(“usage: checksum filename\n”);
            exit(1);
      }
      if ( (Handle = open(argv[1], O_RDONLY | O_BINARY)) == -1 ) {
            printf(“Can’t open file: %s\n”, argv[1]);
            exit(1);
      }

      /* Get memory in which to buffer the data */
      if ( (WorkingBuffer = malloc(BUFFER_SIZE)) == NULL ) {
            printf(“Can’t get enough memory\n”);
            exit(1);
      }

      /* Initialize the checksum accumulator */
      Checksum = 0;

      /* Process the file in BUFFER_SIZE chunks */
      do {
            if ( (WorkingLength = read(Handle, WorkingBuffer,
                  BUFFER_SIZE)) == -1 ) {
                  printf(“Error reading file %s\n”, argv[1]);
                  exit(1);
            }
            /* Checksum this chunk */
            WorkingPtr = WorkingBuffer;
            LengthCount = WorkingLength;
            while ( LengthCount–– ) {
            /* Add each byte in turn into the checksum accumulator */
                  Checksum += (unsigned int) *WorkingPtr++;
      }
      } while ( WorkingLength );

      /* Report the result */
      printf(“The checksum is: %u\n”, Checksum);
      exit(0);
}

That brings us to the fourth reason: avoiding an internal-buffered implementation like Listing 1.5 because of the difficulty of coding such an approach. True, it is easier to let a C library function do the work, but it’s not all that hard to do the buffering internally. The key is the concept of handling data in restartable blocks; that is, reading a chunk of data, operating on the data until it runs out, suspending the operation while more data is read in, and then continuing as though nothing had happened.

In Listing 1.5 the restartable block implementation is pretty simple because checksumming works with one byte at a time, forgetting about each byte immediately after adding it into the total. Listing 1.5 reads in a block of bytes from the file, checksums the bytes in the block, and gets another block, repeating the process until the entire file has been processed. In Chapter 5, we’ll see a more complex restartable block implementation, involving searching for text strings.

At any rate, Listing 1.5 isn’t much more complicated than Listing 1.4—and it’s a lot faster. Always consider the alternatives; a bit of clever thinking and program redesign can go a long way.

Know How to Turn On the Juice

I have said time and again that optimization is pointless until the design is settled. When that time comes, however, optimization can indeed make a significant difference. Table 1.1 indicates that the optimized version of Listing 1.5 produced by Microsoft C outperforms an unoptimized version of the same code by more than 60 percent. What’s more, a mostly-assembly version of Listing 1.5, shown in Listings 1.6 and 1.7, outperforms even the best-optimized C version of List1.5 by 26 percent. These are considerable improvements, well worth pursuing—once the design has been maxed out.

LISTING 1.6 L1-6.C

/*
* Program to calculate the 16-bit checksum of the stream of bytes
* from the specified file. Buffers the bytes internally, rather
* than letting C or DOS do the work, with the time-critical
* portion of the code written in optimized assembler.
*/
#include <stdio.h>
#include <fcntl.h>
#include <alloc.h>   /* alloc.h for Borland,
                         malloc.h for Microsoft  */

#define BUFFER_SIZE  0x8000   /* 32K data buffer */

main(int argc, char *argv[]) {
      int Handle;
      unsigned int Checksum;
      unsigned char *WorkingBuffer;
      int WorkingLength;

      if ( argc != 2 ) {
            printf(“usage: checksum filename\n”);
            exit(1);
      }
      if ( (Handle = open(argv[1], O_RDONLY | O_BINARY)) == -1 ) {
            printf(“Can’t open file: %s\n”, argv[1]);
            exit(1);
      }

      /* Get memory in which to buffer the data */
      if ( (WorkingBuffer = malloc(BUFFER_SIZE)) == NULL ) {
            printf(“Can’t get enough memory\n”);
            exit(1);
      }

      /* Initialize the checksum accumulator */
      Checksum = 0;

      /* Process the file in 32K chunks */
      do {
            if ( (WorkingLength = read(Handle, WorkingBuffer,
            BUFFER_SIZE)) == -1 ) {
                  printf(“Error reading file %s\n”, argv[1]);
                  exit(1);
            }
            /* Checksum this chunk if there’s anything in it */
            if ( WorkingLength )
                  ChecksumChunk(WorkingBuffer, WorkingLength, &Checksum);
            } while ( WorkingLength );

            /* Report the result */
            printf(“The checksum is: %u\n”, Checksum);
            exit(0);
}


Previous Table of Contents Next

Graphics Programming Black Book © 2001 Michael Abrash