
The Optimization Myth

Disk optimization is an attempt to speed up file access by forcing certain files to occupy fixed positions on the disk. Optimization supposedly accelerates file access even when all the files are contiguous and all the free space is grouped together. The theory goes that if you put the most frequently accessed files in the middle of the disk, the disk heads will generally have to travel a shorter distance than if those files were scattered randomly around the disk.

There are some major holes in this theory.

Hole number one: Extensive analysis of real-world computer sites shows that it is not commonplace for entire files to be accessed all at once. It is far more common for only a few blocks of a file to be accessed at a time. Consider a database application, for example. User applications rarely, if ever, search or update the entire database; they access only the particular records desired. Thus, locating the entire database in the middle of a disk is wasteful at best and possibly destructive as far as performance is concerned. Further, on an NTFS volume, the smallest files are stored entirely within the MFT; instead of pointers to the file's clusters, the file's data itself sits in the MFT record. There is no way such files can be optimized or even moved.
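
To make the point concrete, here is a minimal Python sketch of how an application pulls a single record out of a large file. The file name, record size, and record number are invented for illustration; the point is that only a few blocks are read, not the whole file.

    # Read one fixed-size record out of a large data file.
    # File name, record size, and record number are hypothetical.
    RECORD_SIZE = 512            # bytes per record (assumed layout)
    record_number = 40_000       # the one record the application wants

    with open("customers.dat", "rb") as f:
        f.seek(record_number * RECORD_SIZE)   # jump straight to the record
        record = f.read(RECORD_SIZE)          # read only those few blocks

    # The operating system fetches only the clusters holding this record,
    # so where the rest of the file sits on the disk hardly matters.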

Hole number two: Consider the typical server environment. Dozens or even hundreds of interactive users might be accessing the served disks at any moment, running who knows what applications, accessing innumerable files willy-nilly in every conceivable part of a disk. How can one even hope to guess where the disk's read-write head might be at any given time? With this extremely random mode of operation, how can a disk optimizer state flatly that positioning such-and-such a file at such-and-such an exact location will reduce disk access times? It seems to me that file positioning is just as likely to worsen system performance as to improve it, and even if the two effects balance out at zero, the overhead involved gives you a net loss.
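
To illustrate just how unpredictable the head position is, here is a rough Python simulation of a multi-user workload. The disk size and request count are arbitrary assumptions; the exercise only shows that with requests landing all over the disk, the head's position at any moment is anybody's guess.

    import random

    # Model the disk as a line of clusters and a busy multi-user workload
    # as a stream of requests scattered across it. Numbers are arbitrary.
    DISK_CLUSTERS = 1_000_000
    REQUESTS = 100_000

    random.seed(1)
    head = 0
    total_travel = 0
    for _ in range(REQUESTS):
        target = random.randrange(DISK_CLUSTERS)   # some user, some file, somewhere
        total_travel += abs(target - head)
        head = target

    print("average head travel per request:", total_travel / REQUESTS, "clusters")
    # The head ends up wandering over the whole disk, which is exactly the
    # situation a fixed "optimal" file placement cannot anticipate.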

Hole number three: When you force a file to a specific position on the disk by specifying exact logical cluster numbers, how do you know where it really is? You have to take into account the difference between logical cluster numbers (LCNs) and physical cluster numbers (PCNs). These two are not the same thing. LCNs are assigned to PCNs by the disk's controller. Disks often have more physical clusters than logical clusters. The LCNs are assigned to most of the physical clusters and the remainder are used as spares and for maintenance purposes. You see, magnetic disks are far from perfect and disk clusters sometimes "go bad." In fact, it is a rarity for a magnetic disk to leave the manufacturer without some bad clusters. When the disk is formatted, the bad clusters are detected and "revectored" to spares. Revectored means that the LCN assigned to that physical cluster is reassigned to some other physical cluster. Windows NT will also do this revectoring on the fly while your disk is in use. The new cluster after revectoring might be on the same track and physically close to the original, but then again it might not. Thus, not all LCNs correspond to the physical cluster of the same number, and two consecutive LCNs may actually be widely separated on the disk.
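
A toy Python model of what revectoring does to the LCN-to-PCN mapping (all cluster numbers are invented): two logically consecutive clusters can end up physically far apart.

    # Toy model of revectoring; every number here is invented.
    DISK_LCNS = 100_000          # logical clusters visible to the file system
    FIRST_SPARE_PCN = 100_000    # spare physical clusters held back by the controller

    # The usual case: LCN n lives on PCN n.
    lcn_to_pcn = {lcn: lcn for lcn in range(DISK_LCNS)}

    # Physical cluster 501 goes bad, so LCN 501 is revectored to a spare.
    lcn_to_pcn[501] = FIRST_SPARE_PCN

    # A file "optimized" onto consecutive LCNs 500 and 501 now sits on
    # widely separated physical clusters, despite looking contiguous.
    print(lcn_to_pcn[500], lcn_to_pcn[501])   # 500 100000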

Hole number four: What if you have more than one partition on a disk? If users are accessing both partitions, "optimizing" each one will almost guarantee more head motion, since the head must constantly travel between the two optimized regions.

Hole number five: With regular defragmentation, a defragmenter such as Diskeeper needs to relocate only a tiny percentage of the files on a disk; perhaps even less than one percent. "Optimization" requires moving virtually all the files on the disk, every time you optimize. Moving 100 times as many files gives you 100 times the opportunity for error and 100 times the overhead. Is the result worth the risk and the cost?

Hole number six: What exactly is the cost of optimizing a disk and what do you get for it? The costs of fragmentation are enormous. A file fragmented into two pieces can take twice as long to access as a contiguous file. A three-piece file can take three times as long, and so on. Some files fragment into hundreds of pieces in a few days' use. Imagine the performance cost of 100 disk accesses where only one would do! Defragmentation can return a very substantial portion of your system to productive use.
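
A back-of-the-envelope calculation with the figure used below (roughly one disk access, about 10 milliseconds, per fragment; these are illustrative numbers, not measurements):

    ACCESS_TIME_MS = 10   # illustrative cost of one disk access

    for fragments in (1, 2, 3, 100):
        print(f"{fragments:>3} fragment(s): about {fragments * ACCESS_TIME_MS} ms to read the file")
    # A contiguous file costs one access (~10 ms); a 100-piece file costs
    # about 1000 ms -- 100 disk accesses where only one would do.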

Now consider optimization. Suppose, for the sake of argument, that disk data block sequencing really did correspond to physical block locations and you really could determine which files are accessed most frequently and you really knew the exact sequence of head movement from file to file. By carefully analyzing the entire disk and rearranging all the files on the disk, you could theoretically reduce the head travel time. The theoretical maximum reduction in average travel time is one-quarter the average head movement time, after subtracting the time it takes to start and stop the head. If the average access time is 10 milliseconds and 8 milliseconds of this is head travel time, the best you can hope for is a 2 millisecond reduction for each file that is optimized. On a faster disk, the potential for reduction is proportionately less. And taking rotational latency into account, your savings may be even less.

Each defragmented file, on the other hand, saves potentially one disk access (10 milliseconds) per fragment. That's five times the optimization savings, even with the bare minimum level of fragmentation. With badly fragmented files, the difference is astounding.
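
Putting those illustrative figures side by side (the same assumed 10 millisecond access with 8 milliseconds of head travel), the best case for optimization is about 2 milliseconds saved per optimized file, while defragmentation saves roughly a full access per fragment eliminated:

    # Comparison using the illustrative figures quoted above.
    AVG_ACCESS_MS = 10      # one disk access
    HEAD_TRAVEL_MS = 8      # portion of that access spent moving the head

    # Best case claimed for optimization: about a one-quarter reduction
    # in head travel, per optimized file.
    optimization_saving_per_file = HEAD_TRAVEL_MS / 4    # 2 ms

    # Defragmentation: each fragment eliminated saves about one whole access.
    defrag_saving_per_fragment = AVG_ACCESS_MS           # 10 ms

    print(defrag_saving_per_fragment / optimization_saving_per_file)   # 5.0
    # Even at the minimum level of fragmentation (two pieces), making the
    # file contiguous is worth five times the best "optimization" could do.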

On top of all that, what do you suppose it costs your system to analyze and reposition every file on your disk? When you subtract that from the theoretical optimization savings, it is probably costing you performance to "optimize" the files.

The fact is that it takes only a tiny amount of fragmentation, perhaps only one day's normal use of your system, to undo the theoretical benefits of optimizing file locations. While "optimization" is an elegant concept to the uninitiated, it is no substitute for defragmentation; it is unlikely to improve the performance of your system at all, and in many cases it is likely to actually worsen performance.

In summary, file placement for purposes of optimizing disk performance is a red herring. It is not technologically difficult to do. It is just a waste of time.

This article was excerpted from Chapter 6 of the book "Fragmentation - the Condition, the Cause, the Cure" by Craig Jensen, CEO of Executive Software. It has been modified for application to Windows NT. The complete text of the book is available at .

-----------

Lance Jensen
Technical Support
Executive Software International

If you have any comments about this article or any requests for new technical articles, e-mail

 

 
