Power Tools Gear
July 10, 2000
By Karen Kenworthy
You wouldn't know by looking at me, but I'm old enough to remember a lot of the early history of computing. I read my first programming manual 37 years ago, when I was only 11 years of age. A little while later I saw my first real, live computer -- a minicomputer that communicated with the world using a Teletype typewriter console. Even my college computer had only 16KB of RAM, and no console. Messages were "displayed" on the computer's printer, and our responses were punched into punch cards and fed back one by one.
There have been a lot of changes since then. But some things stay the same.
The Good Old Days
Most of the first PCs were equipped with "just" 64KB of RAM. Today, this seems like a tiny amount of memory. And to be honest, it wasn't a lot of memory even back then. Minicomputers of the day often had 128KB of RAM, while mainframe computers typically had 500KB of RAM, or more, at their disposal. There were rumors of supercomputers with millions of bytes of RAM, but at more than $1 per byte of mainframe memory, few people had ever seen such a beast.
Still, 64KB was more than enough to run early versions of DOS, plus the few PC application programs available at the time. For a computer small enough to fit on a desktop, this was no small feat. Besides, PCs with as much as 256KB of RAM were just around the corner. Who would ever need more RAM than that?
Of course, RAM wasn't the only evolving story in those days. The original PC used floppy disks for permanent storage. But shortly after the original PC appeared, IBM and others released models that contained something called a Hard Disk. This device cost a couple of thousand dollars. But it was worth it.
One early hard disk could store 5 million bytes of data, and could find any one of those bytes in as little as one-tenth of a second! With floppy disks of the day holding a mere 360KB, a hard disk with almost 14 times that capacity surely would last forever. Who would ever need more storage than that?
Of course, it didn't take long before someone, somewhere spent several thousand dollars to buy the RAM chips and extra circuit boards needed to fully load their PC. Other enterprising companies and individuals hungered for more disk space. They fed that hunger with drives designed for larger minicomputers and other advanced machines of the day. Sure, those drives cost thousands of dollars too. But when you must have speed and space, what can you do?
And then those power users hit the wall.
The processor chip IBM chose for the first PC, the Intel 8088, used a strange variation of 16-bit addressing. Normally, computers that use just 16 bits to specify memory locations are limited to 64KB of RAM. That's because a 16-bit binary number can take only 65,536 (64K) different values, from 0 through 65,535. But by combining two 16-bit addresses, overlapped in such a way that their combined length was 20 bits, the 8088 chip could access a whopping 1MB of RAM. But no more. What's worse, the design of the original PC set aside 384KB of this address space for communicating with expansion cards, not RAM. As a result, the PC's memory was limited to 640KB.
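For the programmers among us, here is a minimal Python sketch of that overlapping trick. It assumes the usual description of the 8088's scheme: the first 16-bit value (the "segment") is shifted left by four bits and added to the second (the "offset"). The function name physical_address is my own invention for illustration, not anything from Intel or IBM.

```python
def physical_address(segment: int, offset: int) -> int:
    """Combine a 16-bit segment and a 16-bit offset into a 20-bit address."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left 4 bits (multiply by 16), add the offset,
    # and keep only the low 20 bits of the result.
    return ((segment << 4) + offset) & 0xFFFFF


print(2 ** 16)                                 # 65536 distinct 16-bit values (64KB)
print(2 ** 20)                                 # 1048576 distinct 20-bit values (1MB)
print(hex(physical_address(0xA000, 0x0000)))   # 0xa0000, the 640KB mark
print((2 ** 20 - 0xA0000) // 1024)             # 384 (KB above the 640KB mark)
```

Notice that many different segment and offset pairs land on the same byte. That's the "overlap" at work.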
Disk sizes became another early PC bottleneck. If you were willing to spend enough money, hard disks holding as many as 250 million bytes could be had. But the program the early PCs used to read and write their drives was limited to drives no larger than 32MB. To make matters worse, this program was stored in Read-Only Memory (ROM), so it was difficult to change. And before long, so many programs depended on the exact behavior of this program (the BIOS, or Basic Input/Output System), that changing it became practically impossible.
Eventually these limitations were overcome. But they were just the first of many barriers encountered during the evolution of the PC. Over the years, various hardware and software design compromises have produced other barriers, including limits of 16MB of RAM, and of 540MB and then 8GB of hard disk space.
We all know that Windows 95, and later versions, is a "32-bit" operating system. And it's designed to run powerful "32-bit" applications. And the CPUs found in our PCs are "32-bit" microprocessors. Most PCs even have "32-bit" PCI, AGP and PC Card buses, to communicate with their adapter and video cards.
As we discovered a few months ago (December 27th, 1999 to be exact, or http://www.karenware.com/newsletters/1999/1999-12-27.asp), this means that our software and hardware use 32-bit binary numbers in calculations. They also use 32-bit binary numbers when keeping track of memory locations, and even sectors on a hard disk.
The largest 32-bit binary number is 4,294,967,295, or a little over 4 billion (4 thousand million, to my British friends). But if one of the 32 bits is set aside to indicate the number's "sign" (whether it's positive, or negative), the largest number that can be stored in the remaining 31 bits is 2,147,483,647, or just over 2 billion. So, a 32-bit barrier is really a 4, or 2, billion barrier, depending on whether signed or unsigned binary numbers are used.
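If you'd like to check those two figures yourself, a few lines of Python will do it. This is just a sketch. The helper names max_unsigned and max_signed are hypothetical, chosen for this example.

```python
def max_unsigned(bits: int) -> int:
    """Largest value when every bit holds part of the number."""
    return 2 ** bits - 1


def max_signed(bits: int) -> int:
    """Largest value when one bit is set aside for the sign."""
    return 2 ** (bits - 1) - 1


print(f"{max_unsigned(32):,}")   # 4,294,967,295
print(f"{max_signed(32):,}")     # 2,147,483,647
```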
What do those two mathematical facts mean to us today? For starters, they mean that most of today's PCs can access no more than 4 GB of RAM. That's because today's CPUs, and other computer circuits, use 32-bit unsigned numbers to address each byte. Few PCs have sockets for enough memory modules to reach that capacity, but already some do. And more will, as memory modules reach 1 GB each, and more.
Hard disk capacity is measured in "sectors" -- blocks of disk space containing 512 bytes each. Our computers keep track of individual sectors using unsigned 32-bit numbers, allowing drives to contain as many as 4 billion sectors each. Converting this to bytes (512 x 4 billion), we discover that drives can hold as many as 2TB (2 Terabytes, or about 2,000GB). Now PC drives that large don't exist yet, and won't for a while to come. But with drives already reaching 50GB, and approaching 100GB in the near future, it's not hard to imagine the day when hard disk sizes will reach their 2TB design limit.
But enough about the future. There's one place where some people encounter a 32-bit barrier today. In fact, I ran into this one myself, just last week. It's a limit on the size of disk files.
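Here's the same sector arithmetic worked out exactly, as a short Python sketch. The constant names are mine, and I'm assuming the full range of an unsigned 32-bit sector number (2 to the 32nd power sectors) is usable.

```python
SECTOR_BYTES = 512        # bytes per sector
MAX_SECTORS = 2 ** 32     # sectors addressable with an unsigned 32-bit number

capacity = SECTOR_BYTES * MAX_SECTORS

print(f"{capacity:,}")        # 2,199,023,255,552 bytes
print(capacity // 2 ** 30)    # 2048 (GB, counting 1GB as 2**30 bytes)
print(capacity // 2 ** 40)    # 2 (TB, counting 1TB as 2**40 bytes)
```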
While the sizes of hard drives are measured in sectors, the sizes of the files they contain are measured in bytes. Still, with 32-bit unsigned numbers, we ought to be allowed to create files as large as 4GB. It's possible to imagine files this big, but few of us have attempted to create one.
But Windows doesn't use 32-bit unsigned numbers to specify locations within files. When reading files, programs often jump about from place to place. It's common for a program to read a portion of a file near its beginning, then suddenly start reading the file near its end. Another moment later, the program may want to see the middle of the file.
It's easy to see how this can happen. Consider a file, for example, containing your collection of names and e-mail addresses. As you write new messages, a program must access this file to obtain each recipient's address. But because we seldom write to our friends in alphabetical order, it's unlikely we'll ask our e-mail program to read our address file from beginning to end. Instead, it will read various portions of the file, seemingly at random.
Because programs often behave this way, Windows gives programs two ways to specify the portion of a file they want to read. One method, called "absolute addressing," requires the program to specify the exact location of the data it wants to see. To read or write the 100th byte of a file, for example, the program specifies byte 100 (pretty clever, eh?).
The other method, called "relative addressing," requires the program to specify the distance the desired data is from the last location read or written. This relative address, or offset, can be either positive or negative, allowing the program to move forwards or backwards through a file. Since 32-bit signed numbers are used, every byte of a file must be within about 2GB (the largest possible signed 32-bit number) of every other byte. As a result, the maximum size of a file (the longest "distance" from the first byte to the last) must be no more than 2GB.
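To make the two addressing methods concrete, here is a small Python sketch that uses an in-memory "file" of 256 bytes. It is not the Windows API itself. Python's seek call simply happens to offer the same two styles, and the check_offset helper is a hypothetical stand-in for the signed 32-bit restriction described above.

```python
import io
import os

INT32_MAX = 2 ** 31 - 1    # largest signed 32-bit number
INT32_MIN = -(2 ** 31)     # smallest signed 32-bit number


def check_offset(offset: int) -> int:
    """Reject any offset that would not fit in a signed 32-bit number."""
    if not INT32_MIN <= offset <= INT32_MAX:
        raise OverflowError("offset does not fit in 32 signed bits")
    return offset


# A pretend file in which the byte at each position holds that position's number.
data = io.BytesIO(bytes(range(256)))

# Absolute addressing: go straight to position 100.
data.seek(check_offset(100), os.SEEK_SET)
first = data.read(1)       # reads position 100; we are now at position 101

# Relative addressing: move 51 bytes backwards from where we are now.
data.seek(check_offset(-51), os.SEEK_CUR)
second = data.read(1)      # reads position 50

print(first[0], second[0])   # 100 50
```

An offset of 2 ** 31 or more would be rejected by check_offset, which is the 2GB barrier in miniature.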
Must be, that is, unless you're using Windows NT or Windows 2000. These versions of Windows use 64-bit signed numbers to keep track of their files, allowing files to grow as large as 9,223,372,036,854,775,807 bytes (a 64-bit signed binary number). That's more than 9 quintillion bytes, or 9EB (Exabytes), enough to keep us happy for a while (though not forever!).
Unfortunately, today's CPUs can't easily perform arithmetic on numbers that size. They must break them into smaller chunks, and perform arithmetic on each chunk. These intermediate results are then combined, in a manner like old-fashioned "long addition", to produce the final result.
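Here's a rough Python sketch of that "long addition", using two 32-bit chunks per number. Python can add huge numbers directly, so this is purely an illustration of what a 32-bit CPU has to do. The function name add64 and its argument order are my own assumptions.

```python
MASK32 = 0xFFFFFFFF   # keeps only the low 32 bits of a result


def add64(a_high: int, a_low: int, b_high: int, b_low: int) -> tuple:
    """Add two 64-bit numbers, each supplied as (high, low) 32-bit chunks."""
    low_sum = a_low + b_low
    carry = low_sum >> 32                      # 1 if the low chunks overflowed
    low = low_sum & MASK32
    high = (a_high + b_high + carry) & MASK32  # carry the 1, just like in school
    return high, low


# 4,294,967,295 + 1 overflows the low chunk and carries into the high chunk.
print(add64(0, 0xFFFFFFFF, 0, 1))   # (1, 0)
```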
This matters to me because the latest version of our Directory Printer (and future versions of some other Power Tools) must deal with very large file sizes. The Visual Basic language, in which my programs are written, does not provide any built-in support for 64-bit binary numbers. As a result, I've created my own routine to convert such numbers to a format Visual Basic does understand.
But if you're not a programmer, you probably don't care how 64-bit numbers are handled. You just want the job done. And that's exactly what Directory Printer v2.10 does. It now correctly reports and totals files and directories larger than 2 GB. If you'd like to give this new ability a try, or just want the newest version with a few new bug fixes and enhancements, visit my web site at http://www.karenware.com/powertools/ptdirprn.aspm and download your free copy. The programmers among us might want to download the program's Visual Basic source code too. Look for its Large2Dec function, which converts Windows' 64-bit LARGE_INTEGER structures into Visual Basic's Decimal data subtype.
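For the curious, the general idea behind such a conversion can be sketched in Python. This is not the actual Large2Dec routine (that one is written in Visual Basic, and you'll find it in the source code download). It's only an illustration, assuming the 64-bit number arrives as two signed 32-bit halves, the way a Visual Basic Long would hold them. The name large_to_decimal is hypothetical.

```python
from decimal import Decimal

TWO_32 = Decimal(2) ** 32   # 4,294,967,296


def large_to_decimal(low_part: int, high_part: int) -> Decimal:
    """Combine two signed 32-bit halves into one Decimal value."""
    low = Decimal(low_part)
    # A signed 32-bit variable shows low halves of 2**31 or more as negative
    # numbers, so add 2**32 to recover the unsigned value.
    if low < 0:
        low += TWO_32
    return Decimal(high_part) * TWO_32 + low


print(large_to_decimal(0, 1))             # 4294967296
print(large_to_decimal(-1, 0))            # 4294967295
print(large_to_decimal(-1, 0x7FFFFFFF))   # 9223372036854775807
```

The last line is the largest file size mentioned earlier: every bit set except the sign bit.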
And before you lose any sleep over the other 32-bit barriers we've seen, rest assured your favorite programmers and hardware engineers never sleep. Already, they're well on their way to creating the next generation of fully 64-bit hardware, operating systems, and applications.
Of course, that won't stop the rest of us from complaining in the years ahead about their shortsightedness. How dare they impose such burdensome restrictions? But that's fun for another day. :)
In the meantime, look for me on the 'net this week. Or perhaps you'll see me sunning myself on the Ohio shores of Lake Erie, near the lovely town of Port Clinton (no relation to the man in the White House). Either way, if you see me, be sure to wave and say Hi!