Fragmentation

Fragmentation is the most significant factor in system performance.

An average fragments per file value of 1.2 means that there are 20% more pieces of files on the disk than there are files, indicating perhaps 20% extra computer work needed. It should be pointed out that these numbers are merely indicators. Some files are so small that they reside entirely within the MFT. Some files are zero-length. If only a few files are badly fragmented while the rest are contiguous, and those few fragmented files are seldom accessed, then fragmentation may have no performance impact at all. On the other hand, if your applications are accessing the fragmented files heavily, the performance impact could be much greater than 20%. You have to look further to be sure.
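
As a rough illustration of the arithmetic (the file and fragment counts below are invented for the example, not taken from any real disk), the average is simply total pieces divided by total files:

    # Back-of-the-envelope arithmetic for the "average fragments per file" figure.
    # The numbers are hypothetical, chosen only to illustrate the calculation.

    total_files = 1_000          # files on the volume
    total_fragments = 1_200      # pieces those files occupy on the disk

    avg_fragments_per_file = total_fragments / total_files   # 1.2
    excess_pieces = total_fragments - total_files             # 200 extra pieces
    excess_ratio = excess_pieces / total_files                 # 0.20 -> "20% more pieces"

    print(f"average fragments per file: {avg_fragments_per_file:.1f}")
    print(f"extra pieces relative to file count: {excess_ratio:.0%}")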

For example, if there were 1,000 files and only one of those files is ever used, but that one is fragmented into 200 pieces (20% of the total fragments on the disk), you would have a serious problem, much worse than the 20% figure would indicate. In other words, it is not the fact that a file is fragmented that causes performance problems, it is the computer's attempts to access the file that degrade performance.
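
One way to see why access patterns matter more than the raw average is to weight each file's fragment count by how often it is read. A minimal sketch, using made-up file names and access counts:

    # Hypothetical illustration: the disk-wide average looks mild, but the one
    # file that is actually being used is badly fragmented, so almost every
    # real I/O pays the fragmentation penalty.

    files = [
        {"name": "report.dat", "fragments": 200, "accesses_per_day": 5_000},
    ] + [
        {"name": f"archive_{i}.dat", "fragments": 1, "accesses_per_day": 0}
        for i in range(999)
    ]

    avg_fragments = sum(f["fragments"] for f in files) / len(files)

    total_accesses = sum(f["accesses_per_day"] for f in files)
    weighted_fragments = sum(
        f["fragments"] * f["accesses_per_day"] for f in files
    ) / total_accesses

    print(f"disk-wide average fragments per file: {avg_fragments:.2f}")      # ~1.2
    print(f"fragments per file actually accessed:  {weighted_fragments:.0f}")  # 200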

To explain this properly, it is first necessary to examine how files are accessed and what is going on inside the computer when files are fragmented.

What's Happening to Your Disks?

Tracks on a disk are concentric circles, divided into sectors. Files are written to groups of sectors called "clusters". Often, files are larger than one cluster, so when the first cluster is filled, writing continues into the next cluster, and the next, and so on. If there are enough contiguous clusters, the file is written in one contiguous piece. It is not fragmented. The contents of the file can be scanned from the disk in one continuous sweep merely by positioning the head over the right track and then detecting the file data as the platter spins the track past the head.
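
A file's on-disk layout can be thought of as a list of cluster runs (starting cluster, length). Counting fragments is then just counting the runs that do not pick up where the previous one left off. A small sketch of that idea (the run lists are invented for the example):

    # A file's allocation modeled as (start_cluster, cluster_count) runs.
    # Runs that continue exactly where the previous one ended form a single
    # contiguous piece; every break between runs is an additional fragment.

    def count_fragments(runs):
        fragments = 0
        prev_end = None
        for start, length in runs:
            if prev_end is None or start != prev_end:
                fragments += 1      # this run does not continue the previous one
            prev_end = start + length
        return fragments

    contiguous_file = [(1000, 64)]                          # one unbroken run
    fragmented_file = [(1000, 16), (5000, 16), (1032, 32)]  # three separate pieces

    print(count_fragments(contiguous_file))   # 1
    print(count_fragments(fragmented_file))   # 3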

Now, suppose the file is fragmented into two parts on the same track. To access this file, the read/write head has to move into position as described above, scan the first part of the file, then suspend scanning briefly while waiting for the second part of the file to move under the head. Then the head is reactivated and the remainder of the file is scanned.

As you can see, the time needed to read the fragmented file is longer than the time needed to read the unfragmented (contiguous) file. The exact time needed is the time to rotate the entire file under the head, plus the time needed to rotate the gap under the head. A gap such as this might add a few milliseconds to the time needed to access a file. Multiple gaps would, of course, multiply the time added. The gap portion of the rotation is wasted time due solely to fragmentation. Then, on top of that, you have to add all the extra operating system overhead required to process the extra I/Os.
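
To put a rough number on "a few milliseconds": at 7,200 RPM (a spindle speed assumed here purely for illustration) one full platter rotation takes about 8.3 ms, so even a gap covering a fraction of the track costs a measurable slice of that on every access:

    # Rough rotational-delay arithmetic for a same-track gap.
    # The spindle speed and gap size are assumptions for the example.

    rpm = 7_200
    full_rotation_ms = 60_000 / rpm        # ~8.33 ms per revolution

    gap_fraction_of_track = 0.25           # suppose the gap spans a quarter of the track
    gap_delay_ms = full_rotation_ms * gap_fraction_of_track

    print(f"one rotation: {full_rotation_ms:.2f} ms")
    print(f"delay from this gap: {gap_delay_ms:.2f} ms per access")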

Now, what if these two fragments are on two different tracks? We have to add time for movement of the head from one track to another. This track-to-track motion is usually much more time-consuming than rotational delay, since you have to physically move the head. To make matters worse, the relatively long time it takes to move the head from the track containing the first fragment to the track containing the second fragment can cause the head to miss the beginning of the second fragment, necessitating a delay of nearly one complete rotation of the disk, waiting for the second fragment to come around again to be read. Further, this form of fragmentation is much more common than the gap form.
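
Extending the same back-of-the-envelope model, each cross-track fragment adds a seek and, in the unlucky case described above, most of a missed rotation as well (the seek time below is an assumed typical figure, not a measurement):

    # Cost of one cross-track fragment: a seek plus, in the worst case, waiting
    # almost a full rotation for the fragment to come back around. Seek time and
    # spindle speed are assumptions for illustration only.

    rpm = 7_200
    full_rotation_ms = 60_000 / rpm              # ~8.33 ms
    seek_ms = 9.0                                # assumed average seek time

    best_case_ms = seek_ms                       # head arrives just as the fragment does
    worst_case_ms = seek_ms + full_rotation_ms   # head arrives just after it passed

    extra_fragments = 99                         # a file in 100 pieces has 99 breaks
    print(f"per fragment: {best_case_ms:.1f} to {worst_case_ms:.1f} ms extra")
    print(f"100-piece file, worst case: {extra_fragments * worst_case_ms:.0f} ms extra")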

But the really grim news is this: files don't always fragment into just two pieces. You might have three or four, or ten or a hundred fragments in a single file. Imagine the gymnastic maneuvers your disk heads are going through trying to collect up all the pieces of a file fragmented into 100 pieces!

On really badly fragmented files, there is another factor: the Master File Table record can only hold a limited number of pointers to file fragments. When the file gets too fragmented, you have to have a second MFT record, maybe a third, or even more. For every such file accessed, add to each I/O the overhead of reading a second (or third, or fourth, etc.) file record segment from the MFT.
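
The effect can be sketched as simple arithmetic: if one MFT file record can describe only so many extents, every block of extents beyond that costs an additional file record segment to read. The extents-per-record figure below is an assumed round number, since the real limit depends on record and attribute sizes:

    # Estimating extra MFT file record segments for a heavily fragmented file.
    # "extents_per_record" is an assumed round number for illustration; the real
    # capacity of an NTFS file record depends on record size and attribute layout.

    import math

    def mft_records_needed(fragment_count, extents_per_record=80):
        return max(1, math.ceil(fragment_count / extents_per_record))

    for fragments in (1, 50, 200, 1_000):
        records = mft_records_needed(fragments)
        print(f"{fragments:>5} fragments -> {records} MFT file record(s), "
              f"{records - 1} extra record read(s) per open")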

On top of all that, extra I/O requests, due to fragmentation, are added to the I/O request queue along with ordinary and needful I/O requests. The more I/O requests there are in the I/O request queue, the longer user applications have to wait for I/O to be processed. This means that fragmentation causes everyone on the system to wait longer for I/O, not just the user accessing the fragmented file.
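
The whole-system effect can be illustrated with a textbook single-server queueing approximation (the model and its numbers are an assumption added here for illustration, not the author's measurement): once the disk is busy, a 20% increase in I/O requests inflates waiting time by far more than 20%.

    # Illustrative M/M/1-style arithmetic: extra I/Os from fragmentation raise
    # disk utilization, and average response time climbs sharply as the disk
    # nears saturation. Service time and request rates are assumed figures.

    service_time_ms = 10.0                 # assumed average time per disk I/O

    def response_time_ms(requests_per_sec):
        utilization = requests_per_sec * service_time_ms / 1_000
        if utilization >= 1:
            return float("inf")            # the queue grows without bound
        return service_time_ms / (1 - utilization)

    base_rate = 80                         # I/Os per second without fragmentation
    inflated_rate = base_rate * 1.2        # 20% extra I/Os due to fragmentation

    print(f"no fragmentation: {response_time_ms(base_rate):.0f} ms per I/O")
    print(f"20% extra I/Os:   {response_time_ms(inflated_rate):.0f} ms per I/O")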

Fragmentation overhead certainly mounts up. Imagine what it is like when there are 100 users on a network, all accessing the same server, all incurring similar amounts of excess overhead.

What's Happening to Your Computer?

Now, let's take a look at what these excess motions and file access delays are doing to the computer.

Windows NT is a complicated operating system. This is a good thing: the complexity comes from the large amount of functionality built into the system, saving you and your programmers the trouble of building that functionality into your application programs. That is what makes Windows NT a truly great operating system. One of those functions is providing an application with file data without the application having to locate every bit and byte of data physically on the disk. Windows NT will do that for you.

When a file is fragmented, Windows NT does not trouble your program with the fact; it just rounds up all the data requested and passes it along. This sounds fine, and it is a helpful feature, but there is a cost. Windows NT, in directing the disk heads to all the right tracks and clusters within each track, consumes system time. That's system time that would otherwise be available to your applications. Such time, not directly used for running your program, is called overhead.

What's happening to your applications while all this overhead is going on? Simple: Nothing. They wait.

The users wait, too, but they do not often wait without complaining, as computers do. They get upset, as you may have noticed.

The users wait for their applications to load, then wait for them to complete, while excess fragments of files are chased up around the disk. They wait for keyboard response while the computer is busy chasing up fragments for other programs that run between the user's keyboard commands. They wait for new files to be created, while the operating system searches for enough free space on the disk and, since the free space is also fragmented, allocates a fragment here, a fragment there, and so on. They even wait to log in, as the operating system wades through fragmented procedures and data needed by startup programs. Even backup takes longer - a lot longer - and the users suffer while backup is hogging the machine for more and more of "their" time.

Fragmentation vs. CPU Speed

A system that does a lot of number crunching but little disk I/O will not be affected much by fragmentation. But on a system that does mainly disk I/O (say a mail server), severe fragmentation can easily slow a system by 90% or more. That's much more than the difference between a 486/66 CPU and a 250 MHz Pentium II!
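
The dependence on workload mix can be shown with Amdahl's-law-style arithmetic (the workload fractions and the per-I/O slowdown below are assumptions chosen to illustrate the point, not measurements):

    # How much fragmentation slows the whole job depends on what fraction of
    # the job is disk I/O. The fractions and per-I/O slowdown are assumed figures.

    def overall_slowdown(io_fraction, io_slowdown):
        # Time relative to the unfragmented case: CPU part unchanged, I/O part inflated.
        return (1 - io_fraction) + io_fraction * io_slowdown

    io_slowdown = 5.0                      # each I/O takes 5x as long when fragmented

    for io_fraction in (0.05, 0.50, 0.95):
        factor = overall_slowdown(io_fraction, io_slowdown)
        print(f"{io_fraction:.0%} of the work is I/O -> job runs {factor:.1f}x slower")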

Of course, for the vast majority of computers, the impact of fragmentation will fall somewhere in the middle of this range. In our experience, many Windows NT systems that have run for more than two months without defragmenting have, after defragmentation, at least doubled their throughput. It takes quite a large CPU upgrade to double performance.

Fragmentation vs. Memory

The amount of memory in a computer is also important to system performance; just how important depends on where you are starting from. If you have 16 megabytes of RAM, it's almost a certainty that adding more will tremendously boost performance, but if you have 256 megabytes, most systems would get no benefit from more. Raising the RAM from 32 to 96 megabytes on the author's machine, which does much memory-intensive work, almost tripled performance. We see this as the high end of possible benefit from adding memory. The typical site, in our experience, will see about a 25% boost from doubling the RAM. Again, we generally see more performance improvement from eliminating fragmentation.

(This article was primarily excerpted from Chapter 4 of the book Fragmentation - the Condition, the Cause, the Cure, by Craig Jensen, CEO of Executive Software. It has been modified for application to Windows NT. The complete text of the book is available at this web site.)

 

If you have any comments about this article or any requests for new technical articles, e-mail Executive Software Europe.