January 25, 2005
By Karen Kenworthy
IN THIS ISSUE
Happy New Year!
Can you believe it? It's already the year 2005! Experts deny it, but I'm certain the earth is spinning faster and faster. Days were definitely longer, when I was growing up in the beautiful Texas Panhandle. And every year seems shorter than the one before.
I hope experts are looking into this. If nothing's done, before long the earth may spin so fast, we'll all be flung into outer space. :)
New Directory Printer!
While there's still time left in the day, I decided to upgrade one of the most popular of my Power Tools programs. The project started modestly, but before long turned into a big job. Along the way I wore out one keyboard, and ran my poor computer mouse ragged. I refurbished thousands of old bits, and still needed buckets of new 1s and 0s.
But I think it was worth it. Finally, the new Directory Printer v5.0 is ready!
Long-time readers will remember the Directory Printer. It began as a small program that could print lists of files found on our hard disks, diskettes, and CDs. In addition to each file's name, the program printed the file's size, and the dates it was created, last modified, and last accessed.
True to its name, the Directory Printer also printed details about the folders where our files are stored. Naturally, the number of files and sub-folders each folder contains were reported. The total size of the folder's contents could be printed too. Want to know when a folder was created, last modified, or last accessed? The program could report those details too.
Contrary to its name, the Directory Printer could also write the information it uncovers to a text file. These files can be stored in a safe place, providing an historical record of a drive's contents. They can also be imported into spreadsheets and databases, letting you analyze, sort, and compare the information the Directory Printer collects.
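For the curious, here is a rough Python sketch of that idea: collect each file's name, size, and three dates, then write them to a comma-separated text file a spreadsheet can open. It is only an illustration, not the Directory Printer's own code (which is written in Visual Basic). The function names `list_directory` and `write_listing` are made up for this example.

```python
import csv
import os
from datetime import datetime


def _stamp(seconds):
    """Format a timestamp (seconds since the epoch) as readable local time."""
    return datetime.fromtimestamp(seconds).isoformat(timespec="seconds")


def list_directory(folder):
    """Return one dict per file in `folder` (not its sub-folders), sorted by name."""
    rows = []
    with os.scandir(folder) as entries:
        for entry in entries:
            if not entry.is_file():
                continue  # skip sub-folders and other non-file entries
            info = entry.stat()
            rows.append({
                "name": entry.name,
                "size": info.st_size,
                # st_ctime is the creation time on Windows; on other
                # systems it is the time the file's metadata last changed.
                "created": _stamp(info.st_ctime),
                "modified": _stamp(info.st_mtime),
                "accessed": _stamp(info.st_atime),
            })
    return sorted(rows, key=lambda row: row["name"])


def write_listing(rows, out_path):
    """Write the rows to a CSV text file, with a header line."""
    fields = ["name", "size", "created", "modified", "accessed"]
    with open(out_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fields)
        writer.writeheader()
        writer.writerows(rows)
```

Calling `write_listing(list_directory("."), "listing.csv")` would record the current folder's files in a file named listing.csv.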
Whew! The early Directory Printer did a lot. But like every successful program, it has never been finished. Over the years the little tool learned to reveal several "attributes" of files and folders, including:
Read-Only. Files with this attribute can be read, but not changed or deleted. Files stored in Read-Only folders can be read, but
Hidden. These files and folders are not normally displayed by Windows. In theory, this provides a certain amount of protection to important files and folders.
System. This attribute indicates a file or folder is used by Windows. Should one of them disappear, or be improperly altered, Windows may behave erratically or not run at all.
Compressed. Files and folders with this attribute have been compressed by Windows, to save disk space.
Encrypted. This attribute identifies files that have been encrypted by Windows, at our request. A password is required before any program can read or modify their contents.
Executable. Files with this attribute are programs, or files containing program fragments. The data stored within them are computer instructions -- numbers that order computers to add, subtract, read and write disks, display cute images, make loud beeping sounds then crash, and do all the other endearing things that make us love our binary buddies.
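Windows stores most of these attributes as individual flag bits packed into a single number. The little Python sketch below shows how such a number can be decoded into names. It is an illustration only, and the function name `describe_attributes` is invented for it. "Executable" is left out, because it is not one of these flag bits, and how the Directory Printer detects it isn't covered here.

```python
import stat

# Windows attribute flag bits, paired with the names used in the list above.
ATTRIBUTE_NAMES = {
    stat.FILE_ATTRIBUTE_READONLY: "Read-Only",
    stat.FILE_ATTRIBUTE_HIDDEN: "Hidden",
    stat.FILE_ATTRIBUTE_SYSTEM: "System",
    stat.FILE_ATTRIBUTE_COMPRESSED: "Compressed",
    stat.FILE_ATTRIBUTE_ENCRYPTED: "Encrypted",
}


def describe_attributes(bits):
    """Return the names of every attribute whose flag bit is set in `bits`."""
    return [name for flag, name in ATTRIBUTE_NAMES.items() if bits & flag]
```

On Windows, the number to decode can be read with `os.stat(path).st_file_attributes`. That field doesn't exist on other operating systems.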
After all that, a lot of programs might have rested on their laurels. But not the hard-charging Directory Printer! One of its newest tricks is making hash.
No, the program hasn't taken up cooking with corned beef. These hashes are what mathematicians call "cryptographic hashes", or sometimes "message digests".
We've talked about these hashes before. They are the result of complex mathematical formulas that transform data in a very special way. Feed a series of 0s and 1s to a "hash algorithm", and it produces a short sequence of its own. This new string of 0s and 1s is called the "hash value" of the original data.
Now you may be wondering: what's so special about these formulas? After all, any child can write rules to change data from one form to another. But hash algorithms have two important features that make them very unusual, and very valuable.
First, each formula always produces the same number of bits, no matter how many bits it's asked to process. For example, one popular hash algorithm is called SHA-1 (Secure Hash Algorithm version 1). Feed it one bit, or a trillion bits. It always provides 160 bits in return.
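You can check this fixed-length property yourself. The short Python sketch below uses the standard `hashlib` module; the helper name `sha1_bit_length` is my own. One caveat: `hashlib` accepts whole bytes rather than single bits, so the smallest inputs tried here are zero bytes and one byte.

```python
import hashlib


def sha1_bit_length(data: bytes) -> int:
    """Return the number of bits in the SHA-1 hash value of `data`."""
    return len(hashlib.sha1(data).digest()) * 8


# Try an empty input, a single byte, and a million bytes.
for sample in (b"", b"a", b"a" * 1_000_000):
    print(len(sample), "bytes in ->", sha1_bit_length(sample), "bits out")
```

Every line printed ends with "160 bits out", no matter how large the input.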
Second, the bits produced by hash algorithms are exquisitely sensitive to the data they process. The tiniest change in the original data produces huge changes in the end result.
To see this in action, let's use a formula called MD5 (Message Digest version 5). The hash values it calculates always contain exactly 128 bits. As an experiment, I fed it two nearly identical streams of data.
The first stream contained exactly eight billion binary 1s. Here's the hash value it computed:
1000100011111110 0100100011110010 0110100110111011 0000010011011011 0100100000010111 0010101001001100 1111101001001100 1110001010001110
Hmm... That's a little hard to read. Let's try again, this time showing the result in the (slightly) more human-friendly hexadecimal numbering system:
88FE 48F2 69BB 04DB 4817 2A4C FA4C E28E
Not great, but it will have to do.
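If you'd like to check my conversion, here is a small Python sketch that turns space-separated groups of 16 binary digits into four hexadecimal digits each. The function name `bits_to_hex` is just something I made up for the example.

```python
def bits_to_hex(bits: str) -> str:
    """Convert space-separated 16-bit binary groups to 4-digit hex groups."""
    groups = bits.split()
    # int(group, 2) reads the group as a base-2 number;
    # "04X" formats it as four upper-case hexadecimal digits.
    return " ".join(format(int(group, 2), "04X") for group in groups)


print(bits_to_hex("1000100011111110 0100100011110010"))  # prints "88FE 48F2"
```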
Next, I asked the formula to process a slightly different block of data. This time, the input contained "just" 7,999,999,999 1s, followed by a single 0. In other words, this second test differed from the first by just one bit - one out of eight billion.
This time the MD5 algorithm calculated this hash value:
See any similarities between the two hashes? I don't either. And that's the point. Although the two blocks of data fed into the equation differed by only one bit in eight billion, the results have almost nothing in common.
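Here is a scaled-down version of the same experiment you can run at home, sketched in Python. Instead of eight billion bits, it assumes an input of 8,000 bits (1,000 bytes), so it finishes instantly. The helper names `md5_digest` and `differing_bits` are invented for this example, and the hash values it prints are not the ones from my larger test.

```python
import hashlib


def md5_digest(data: bytes) -> bytes:
    """Return the 16-byte (128-bit) MD5 hash value of `data`."""
    return hashlib.md5(data).digest()


def differing_bits(a: bytes, b: bytes) -> int:
    """Count the bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# 1,000 bytes of 0xFF is 8,000 binary 1s.
all_ones = b"\xff" * 1000
# Replace the final byte with 0xFE (11111110): 7,999 1s, then a single 0.
one_flipped = all_ones[:-1] + b"\xfe"

first = md5_digest(all_ones)
second = md5_digest(one_flipped)

print(first.hex().upper())
print(second.hex().upper())
print(differing_bits(first, second), "of 128 bits differ")
```

For a well-behaved hash, you'd expect roughly half of the 128 output bits to differ, even though only one input bit changed.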
Together, these two features make hash values convenient "fingerprints" of files. If two files contain the same data, their hash values will be identical too. Likewise, if a file doesn't change over time, its hash value won't change either.
But make any change in a file's contents - no matter how slight - and the new data's hash value will change. And even if two files have exactly the same name, size, attributes, dates, and other characteristics -- if their hash values differ, their contents do too.
Computer crime investigators use hash values to test and prove the integrity of evidence. When data is recovered from a suspect's computer, a hash value of each file is computed. Later, a witness can demonstrate a file hasn't changed while in police custody, by showing the hash value computed at the time of the trial exactly matches the one computed when the data was seized.
Some careful folks compute the hash values of backup files containing sensitive data. Later, they can verify the archived data's integrity by re-computing the data's hash value. If the new and old values are the same, the backup data is intact. If not, the data has suffered bit-rot, with at least one 0 becoming a 1, or vice versa. :(
Now you and I can use hash values too. Just ask the Directory Printer to compute and print hash values for all your important files. Later, if you suspect a file has been altered, ask the Directory Printer to perform the task again. If the hash reported today matches the original value, the file is intact. But if the hash values differ, the file's contents have changed.
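The same record-now, compare-later routine can be sketched in a few lines of Python. This is only an illustration of the idea, not how the Directory Printer itself is written. The function names `file_hash` and `is_unchanged` are hypothetical, and SHA-1 is simply the default I picked for the example.

```python
import hashlib


def file_hash(path, algorithm="sha1", chunk_size=65536):
    """Return the hash value of a file's contents, as hexadecimal text."""
    hasher = hashlib.new(algorithm)
    with open(path, "rb") as f:
        # Read in chunks, so even huge files never need to fit in memory.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def is_unchanged(path, recorded_hash, algorithm="sha1"):
    """True if the file's current hash matches one recorded earlier."""
    return file_hash(path, algorithm) == recorded_hash.lower()
```

Save the value `file_hash` returns today. Later, pass it to `is_unchanged` along with the file's path. A result of False means the contents have changed.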
Incidentally, there are six hash algorithms in common use today. I've already mentioned two of them: MD5 which produces hash values containing 128 bits, and SHA-1 which produces hashes that are 160 bits long.
The others are: SHA-224, SHA-256, SHA-384 and SHA-512. In each case, the number following the letters "SHA" indicates the length of the hash values produced by the algorithm, in bits. For example, the SHA-512 formula yields hash values containing 512 bits (64 bytes).
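Python's standard `hashlib` module happens to offer all six, so a short sketch can confirm the lengths. The list name `ALGORITHMS` and the function `digest_bits` are my own inventions for this example.

```python
import hashlib

# The six algorithms named above, spelled the way hashlib expects.
ALGORITHMS = ["md5", "sha1", "sha224", "sha256", "sha384", "sha512"]


def digest_bits(name):
    """Return the length, in bits, of the hash values an algorithm produces."""
    return hashlib.new(name).digest_size * 8


for name in ALGORITHMS:
    print(name, "->", digest_bits(name), "bits")
```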
For most purposes, the shorter hash values serve us just as well as their longer colleagues. [Nerdy Note: Exceptions include hash values used in some types of cryptography. There, longer hash values provide greater security.] Shorter hashes are easier to read and compare. But the new Directory Printer lets you choose which hash values you'd like to see. Now you can select any, or all, of the six SHA and MD5 variations!
Design It Yourself
The Directory Printer has long allowed us to select the file information we'd like to see. For example, we can ask the program to report just a file's name, and size. Or show each file's attributes, and date of creation.
But when it came to folders, the old Directory Printer had a mind of its own. It insisted on selecting what it would print, choosing such statistical staples as a folder's name, the number of files it contains, and the total size of those files.
The new Directory Printer breaks that monopoly, placing the selection of folder information where it always belonged. Now we can customize all of the printed reports, and disk files, the program creates. We can add and delete file and folder details as we see fit.
As the number of choices has grown, it's become more and more important to keep the Directory Printer's windows from becoming cluttered. That's why one new tab on the program's main window, labeled "Other Settings", lets you hide choices you never use, or reveal less common choices that appeal to you. I've also rearranged some parts of the program's windows, hopefully making it easier to tell the program exactly what you want it to do.
There's more to say about the new Directory Printer. But wouldn't you know? We're out of time. :)
Until our next get-together, take the new Directory Printer for a spin by visiting its home page at:
As always, the program is free for personal/home use. If you're a programmer, you can download the updated Visual Basic source code too!
Better yet, get the latest version of every Power Tool on a brand-new, shiny CD. You'll even get three bonus Power Tools, not available anywhere else. The source code of every Power Tool, the text of every issue of my newsletter, and some of my articles written for Windows Magazine, are also included. And owning the CD grants you a special license to use all my Power Tools at work.
Best of all, buying a CD is the easiest way to support the KarenWare.com web site, Karen's Power Tools, and this newsletter! To find out more, visit:
Look at the time! It will be daylight in just a little while. Where did the night go? Guess it's time for a nap. But I'll be awake again soon, hard at work on another program. Until we meet again, if you see me on the 'net, be sure to wave and say "Hi!"