August 7, 2000

By Karen Kenworthy


In this day of electronic marvels, it's easy to forget about machines. Gizmos with moving parts, made of steel and plastic. They seem like something from another age. But they're just as vital today as they were before electrons were first coaxed down a wire.

Look no further than your computer. It probably depends on a mechanical fan to keep it cool. Without that fan, the computer's electronic components would overheat and fail. And your favorite electronic companion almost certainly has two or three other very important mechanical devices called "drives." Whether diskette, CD-ROM, DVD or hard disk, all of these have at least one thing in common: Each drive has at least two motors. In other words, each drive is a machine.

One of the drive's motors spins the flat, round "platter" found inside the drive. The surface of this platter holds the drive's data. The other motor moves the drive's sensitive heads, allowing them to be positioned anywhere over the platter's surface. These heads contain tiny sensors, allowing them to read the platter's data. And some heads also have lasers or magnetic coils that can write new data onto the platter.

Without these increasingly sophisticated machines, our computers couldn't do much. In one form or another, drives provide most of our computers' memory, allowing them to remember what we've done, and how we did it. But have you ever wondered what goes on inside these machines? How do drives perform their magic?

Sectors

Let's start by taking a close look at those platters. The data stored on the platters of hard disk drives, and on diskettes, is arranged in concentric circles called "tracks." A diskette typically has 80 of these tracks on each side of its single platter. Modern hard drives often have two or more platters, each holding a few thousand tracks on each side.

The data stored on CD-ROM and DVD platters is laid out in tracks too. But only one side of each platter, or disc, contains data. And that one useful side contains just one data track. Like the tracks of "old-fashioned" phonograph records, the single track on a CD-ROM and DVD disc forms a tight spiral. But unlike their analog, vinyl ancestors, these digital tracks begin near the center of the disc, and spiral outwards, toward the edge.

Regardless of its shape, each track is divided into small segments called "sectors." The sectors found on hard drives and diskettes hold 512 bytes of data. The sectors that make up the single track on each CD-ROM and DVD disc are larger, holding 2048 bytes of data each.

Sectors contain more than just our data. In addition, all sectors contain extra bits of information that allow the drive to detect, and often correct, errors that occur while reading data. Sectors also contain their own "address," a number that represents the sector's location on the platter. These help keep the drive's heads from getting lost.

Sectors are important to you and me because they are the smallest unit of data a drive can read or write. To examine or change a single byte of data, the entire sector containing that byte must be read or written. Of course, most of the time much larger blocks of data are accessed, requiring the drive to read or write several sectors to satisfy our requests.
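For those who like to see the arithmetic, here is a minimal Python sketch of the idea. It assumes the 512-byte sectors of hard drives and diskettes, and the function names are made up for this illustration; real drives do this work in their own firmware.

```python
SECTOR_SIZE = 512  # bytes per sector on hard drives and diskettes

def locate_byte(byte_offset, sector_size=SECTOR_SIZE):
    """Return (sector number, offset within that sector) for a byte offset."""
    return divmod(byte_offset, sector_size)

def sectors_touched(byte_offset, length, sector_size=SECTOR_SIZE):
    """Count the whole sectors that must be read to fetch `length` bytes."""
    if length <= 0:
        return 0
    first = byte_offset // sector_size
    last = (byte_offset + length - 1) // sector_size
    return last - first + 1

print(locate_byte(1000))          # (1, 488): byte 1000 lives in sector 1
print(sectors_touched(1000, 1))   # 1: a single byte still costs a full sector
print(sectors_touched(500, 100))  # 2: these 100 bytes straddle a sector boundary
```

Notice that a request for just 100 bytes can cost two full sectors, or 1,024 bytes of reading, if it happens to cross a sector boundary.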

To speed up disk access, most modern drives keep copies of the last 1,000 or so sectors they've read or written in a special high-speed memory circuit, called a "buffer." Whenever possible, the drive retrieves requested data from the temporary buffer, rather than the platter where the data is permanently stored. Windows also uses your computer's RAM to store recently accessed sectors. Depending on the amount of RAM available, Windows' buffer may hold copies of several tens of thousands of sectors.
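The buffer idea can be sketched in a few lines of Python. This is only an illustration, assuming a simple "least recently used" policy: when the buffer is full, the sector that has gone longest without being touched is discarded. The class name, the `read_from_platter` callback, and the tiny capacity are all invented for the example; actual drive firmware and Windows use their own, more elaborate strategies.

```python
from collections import OrderedDict

class SectorBuffer:
    """Keeps copies of the most recently used sectors (illustrative only)."""

    def __init__(self, read_from_platter, capacity=1000):
        # read_from_platter is the slow path: a callable(sector_no) -> bytes
        self.read_from_platter = read_from_platter
        self.capacity = capacity
        self.sectors = OrderedDict()
        self.hits = 0
        self.misses = 0

    def read(self, sector_no):
        if sector_no in self.sectors:
            self.hits += 1
            self.sectors.move_to_end(sector_no)  # mark as most recently used
            return self.sectors[sector_no]
        self.misses += 1
        data = self.read_from_platter(sector_no)
        self.sectors[sector_no] = data
        if len(self.sectors) > self.capacity:
            self.sectors.popitem(last=False)  # drop the least recently used sector
        return data

# A stand-in for the platter: every sector is 512 zero bytes.
def fake_platter(sector_no):
    return bytes(512)

buf = SectorBuffer(fake_platter, capacity=2)
for n in (7, 7, 8, 9, 7):
    buf.read(n)
print(buf.hits, buf.misses)  # 1 4
```

With room for only two sectors, the second request for sector 7 is served from the buffer, but by the time sector 7 is requested a third time it has already been pushed out by sectors 8 and 9.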

In years gone by, you and I could tell Windows how much RAM to use for this purpose, by configuring a special program called SMARTDRIVE. But today, Windows automatically converts all the RAM we're not using at a given moment into a temporary buffer. Throughout the day, as our RAM requirements rise and fall, Windows shrinks and grows this buffer, to always make the best use of all our RAM.

Clusters

But how does Windows keep track of all these sectors? Remember our recent talks about 32-bit computer processors, and the current crop of 32-bit software that runs on them? Intel's Pentium processors, similar processors from AMD and Cyrix, Windows 98, Windows 2000, even the new Windows Me (Millennium Edition) are all designed to manage data 32 bits at a time. Ask them to do more, and they'll break the job down into 32-bit-sized chunks first.

This reliance on 32-bit numbers affects how RAM is allocated and used. But its influence doesn't stop there. For example, the "9x" versions of Windows (and sometimes Windows NT and Windows 2000) also use 32-bit numbers to keep track of information stored on a drive.

[WARNING: The following text contains several large numbers. If large numbers make you drowsy, do not read this text while driving, or operating heavy equipment. Do not combine this text with other medications.]

Now you might think, given the importance of sectors, that Windows would use a 32-bit number to uniquely identify each sector on a drive. But as you'll recall from our earlier discussions, the largest 32-bit binary number is approximately four billion in our more familiar decimal number system (4,294,967,295 to be exact). So Windows could only use such a scheme with drives containing four billion sectors or fewer.

Hard disk sectors hold 512 bytes each. So this would mean the largest hard drive Windows could use would be limited to "just" 2,199,023,255,040 (4,294,967,295 x 512) bytes. In other words, hard disks would be limited to 2 terabytes or 2TB.
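If you'd like to check the multiplication yourself, this short Python sketch reproduces the figures above. The variable names are just for illustration.

```python
SECTOR_SIZE = 512  # bytes per hard disk sector

largest_32_bit = 2**32 - 1                  # the largest 32-bit binary number
limit_bytes = largest_32_bit * SECTOR_SIZE  # one sector per number

print(f"{largest_32_bit:,}")                # 4,294,967,295
print(f"{limit_bytes:,}")                   # 2,199,023,255,040
print(f"{limit_bytes / 2**40:.2f} TB")      # 2.00 TB
```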

Now 2TB sounds like a lot of disk space. And it is. But it's only 10 times the size of the largest PC hard drives already available. What will happen in a year or two, when much larger drives become available? Will we be faced with a 2TB barrier? Fortunately, the answer is no, thanks to a trick Windows learned many years ago, in the days of 16-bit Windows, and the 16-bit processors that ran it.

Back then, Windows used 16-bit numbers to keep track of disk space. If individual sectors were counted, the largest usable disk drive would have been limited to just 65,536 sectors (the number of different values a 16-bit binary number can hold), or just 32 megabytes (32MB).

Before the youngsters among us stop laughing, I want to point out that there *was* a time when PC hard drives were limited to just that size. It was in the days of DOS, before Windows. To overcome this limitation, later versions of DOS, and all versions of Windows, kept track of disk space by counting units larger than a single sector. These larger units contained two or more sectors, and are called "clusters."

Drives were still limited to 65,536 clusters. But clusters could contain as many as 32 sectors, or 16KB. Doing the math (65,536 clusters x 16,384 bytes), during the glory days of 16-bit Windows, a disk drive could contain as many as 1,073,741,824 bytes, a whopping 1 gigabyte!

Windows is no longer limited to 16-bit numbers. But Windows still keeps track of disk space using clusters rather than individual sectors. As a result, the largest possible drive these days can contain as many as 64TB (using 16KB clusters). Drives can be even larger, if bigger clusters are used.
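The capacity limits in this section all come from the same multiplication: the number of values the counter can hold, times the size of the unit being counted. Here is a rough Python sketch of that calculation. The `max_capacity` function is hypothetical, and it assumes every possible number names one usable cluster; real file systems set some values aside for their own use, so actual limits differ a little.

```python
SECTOR_SIZE = 512  # bytes per hard disk sector

def max_capacity(address_bits, sectors_per_cluster=1):
    """Bytes addressable when each of 2**address_bits numbers names one cluster."""
    return 2**address_bits * sectors_per_cluster * SECTOR_SIZE

MB, GB, TB = 2**20, 2**30, 2**40

print(max_capacity(16) // MB, "MB")      # 32 MB: 16-bit numbers, single sectors
print(max_capacity(16, 32) // GB, "GB")  # 1 GB: 16-bit numbers, 16KB clusters
print(max_capacity(32, 32) // TB, "TB")  # 64 TB: 32-bit numbers, 16KB clusters
```

Doubling the cluster size doubles the largest possible drive, which is why bigger clusters allow even larger drives.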

New Slack Checker

Unfortunately, all this high capacity comes at a price. Clusters are sometimes called "Allocation Units," because they are the smallest unit of disk space Windows can allocate to a file. This means that files always occupy a whole number of clusters, and their sizes grow one whole cluster at a time.

Consider what this means when Windows creates a small file. Although the file might only contain a few bytes of data, it is allocated a full cluster of disk space, perhaps as much as 16KB. If the file grows, no more space is assigned until the file's size exceeds its original allocation. But once its size gets even one byte larger than the original cluster given to it, another whole cluster is added.
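The rounding described above is easy to express in code. This minimal Python sketch assumes 16KB clusters, and the function name is invented for the example. It simply rounds a file's size up to the next whole cluster.

```python
CLUSTER_SIZE = 16 * 1024  # 16KB clusters, as in the example above

def allocated_bytes(file_size, cluster_size=CLUSTER_SIZE):
    """Disk space given to a file: always a whole number of clusters."""
    clusters = -(-file_size // cluster_size)  # integer division, rounded up
    return clusters * cluster_size

print(allocated_bytes(10))      # 16384: a few bytes still take a full cluster
print(allocated_bytes(16384))   # 16384: exactly one cluster, nothing wasted
print(allocated_bytes(16385))   # 32768: one byte more costs a second cluster
```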

One result of this method of tracking and allocating disk space is that most files never use all the disk space they've been given. In fact, on average, each file on your hard disk drive wastes approximately 1/2 cluster of space. That's because, on average, the last cluster assigned to a file will be only half full, as the file grows or shrinks over time.

With drives sometimes containing tens of thousands of files, and with cluster sizes as large as 16KB, many MB of your disk space can be tied up in allocated, but currently unused, portions of disk clusters. To find out exactly how much disk space is wasted this way, I wrote a program called the Disk Slack Checker.

This Power Tool examines your drives, determines the cluster size Windows is using for each, counts your files, and computes the amount of space being wasted. It also displays other useful information, such as the drive's total capacity and remaining free space, plus the type of file system used by Windows to manage the drive.
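To give a feel for the kind of calculation involved, here is a rough Python sketch. It is not the Disk Slack Checker's actual code (that program is written in Visual Basic), and it makes a big simplifying assumption: the cluster size is supplied by the caller rather than queried from Windows. The `slack_report` function and its parameters are hypothetical.

```python
import os

def slack_report(folder, cluster_size=16 * 1024):
    """Return (file count, bytes of data, bytes of slack) for a folder tree."""
    files = data = slack = 0
    for root, _dirs, names in os.walk(folder):
        for name in names:
            path = os.path.join(root, name)
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # skip files that vanish or can't be examined
            # Round the file's size up to a whole number of clusters.
            allocated = -(-size // cluster_size) * cluster_size
            files += 1
            data += size
            slack += allocated - size
    return files, data, slack

if __name__ == "__main__":
    count, used, wasted = slack_report(".")
    print(f"{count:,} files, {used:,} bytes of data, {wasted:,} bytes of slack")
```

Run against a folder full of small files, the slack figure can easily rival the size of the data itself. Try a smaller `cluster_size` to see how quickly the waste shrinks.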

If you'd like to check your own drives, download the latest version of the Disk Slack Checker from my Web site at https://www.karenware.com/powertools/ptslack. It has just been updated to handle the largest available drives, and to display its findings more clearly. As always, the program is free, and so is its Visual Basic source code, for those of you who like to see exactly how the bits are fiddled. :)

In the meantime, look for me this week on the side of the road. My old van is in the shop again, this time after a minor collision with a much sturdier pickup truck. But wherever you find me, be sure to wave and say Hi! I'll be looking for you. :)