Comm Corner
Data Compression: 
An Overview (Part 2)
by John Woody


This article completes the theory overview of data compression started in January 1997. Digital information can use up all of our available hard disk space in a short time once we begin downloading data from BBS's or the Internet. Data compression is a fact of life in downloads from the Internet, and utilities to make those compressed data files usable are a must. Additionally, data compression can make storage of all that data easier by reducing it into compressed archives.

 Last month, we covered the basic theory, elements of the syntax found in most programs, archives, and how data compression can enhance our own computer techniques. In this installment, we will touch on the general theory of decompression and compression and then cover a summary of the basic syntax used in the DOS and Windows versions of PKWare's .ZIP programs.

 PKWare PKZIP204G is by far the best-known data compression program. It has become a commercial program as well as a shareware program, and it is available in DOS and Windows versions. Other programs mentioned in this article include LHA (LHARC), ARJ, ICE, and WinZip.

Each of these programs started as shareware or freeware. Some have now made the transition to full commercial status.

 All of the programs seem to be overly complicated. They do require that one get familiar with the details of how they work. This overview will attempt to cover the basics of PKWare's DOS and Windows versions. The article will touch on how to set the PKWare utilities up so that they work from any directory as a DOS or as a Windows program.

 The Nico Mak Computing program, WinZip version 6.2, is the third program in this series which should be addressed. I will touch on it briefly.

 Remember that the syntax of each program is a little bit different and requires a review of its Help section or user's manual to gain knowledge of the basics. Now let us continue the theory overview.
 
 

Decompressing Data

Compressing and decompressing data is accomplished by different data compression programs in different ways. Some are integrated and contain both functions, with only a single command difference to compress or decompress. LHARC is an example of an integrated program in that the same packer command starts both functions, with only a command letter to differentiate them, i.e., LHA a FILENAME compresses the file and LHA e FILENAME decompresses it.

 The a and e commands are the only difference between the two functions. Other programs such as PKZIP/PKUNZIP use separate program utilities to complete each function. Both types of compression programs contain optional commands to manage the archived files. Programs with separate program utilities usually only need to include action commands in the compression function. The decompress utility is needed only to decompress the archive.

 In an archive with multiple files, the decompression program or utility decompresses every file in the archive unless control commands are included in the decompression function. These control commands are used to list, delete, repair, or convert data files, among other functions.

 The decompression programs or utilities also provide controls for placement of the decompressed files in sub-directories other than the current sub-directory if required. Target sub-directories are indicated by the path commands. Path information is usually placed right after the packer command and archive name, e.g., ARJ e ARC C:\EXAMPLE\ *.TXT. In this case, the target directory is C:\EXAMPLE\, where all of the .TXT files will be placed. Additionally, new sub-directories may be created from the decompression program or utility. Remember that each program has different commands for all of these functions.

 Existing files are protected automatically as the decompression program or utility unpacks the archive. Files in the target directory are not overwritten until some action is taken. The decompression utility responds either by skipping the file or by indicating that a file of the same name exists and giving you the chance to overwrite it. Some programs compare the ages of files and overwrite only the older ones. Some programs also let you specify another file name if necessary. All of the programs have safety prompts built in.
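The behavior described above (unpack only the matching files into a target sub-directory and never overwrite an existing file without being told to) can be sketched in Python. This is only an illustration using Python's standard zipfile module, not how ARJ or PKUNZIP are implemented; the function name extract_matching and the archive and directory names are assumptions made up for the example.

```python
import fnmatch
import os
import zipfile


def extract_matching(archive_path, target_dir, pattern="*.TXT"):
    """Extract members matching `pattern` into `target_dir`.

    Files that already exist in the target directory are skipped,
    mimicking the "existing files are protected" behavior.
    Returns two lists of member names: (extracted, skipped).
    """
    extracted, skipped = [], []
    os.makedirs(target_dir, exist_ok=True)  # create the target sub-directory if needed
    with zipfile.ZipFile(archive_path) as archive:
        for name in archive.namelist():
            if name.endswith("/"):
                continue  # directory entry, nothing to unpack
            # DOS file names are not case sensitive, so compare in upper case
            if not fnmatch.fnmatchcase(name.upper(), pattern.upper()):
                continue
            destination = os.path.join(target_dir, name)
            if os.path.exists(destination):
                skipped.append(name)  # protect the existing file
                continue
            archive.extract(name, target_dir)
            extracted.append(name)
    return extracted, skipped
```

A call such as extract_matching("ARC.ZIP", "EXAMPLE") would roughly correspond to the ARJ example in the text; a real utility would prompt before skipping rather than skip silently.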

 Some of the data compression programs decompress entire directories. Individual files can be decompressed with or without directory or path information. ARJ automatically incorporates the directory and path information into the archive. LHA, ICE, and PKZIP each require special commands to include the path names during decompression.

Directory structures stored in archives require that we understand relative and absolute path information. Relative path positions are the same as giving directions from where you stand at the time, and absolute path positions are referenced from a fixed point such as the water tower. The root directory is the equivalent of the water tower. One can always get back to the root directory.
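As a small illustration of the difference, the sketch below uses Python's standard ntpath module, which applies DOS and Windows path rules on any computer. The helper name resolve and the directory names are assumptions chosen for the example. A relative path lands in a different place depending on where you stand, while an absolute path always leads to the same spot.

```python
import ntpath  # DOS/Windows path rules, usable on any platform


def resolve(current_dir, stored_path):
    """Return the full path a stored path refers to when unpacked from current_dir."""
    # ntpath.join ignores current_dir when stored_path is already absolute
    return ntpath.normpath(ntpath.join(current_dir, stored_path))


relative = "DOCS\\DATA.WPD"     # directions from where you stand
absolute = "C:\\TEMP\\DATA.WPD"  # directions from the water tower (the root)

print(ntpath.isabs(relative), ntpath.isabs(absolute))
print(resolve("C:\\EXAMPLE", relative))
print(resolve("D:\\WORK", relative))
print(resolve("D:\\WORK", absolute))
```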

Compressing Data

Compressing individual files requires only that the data compression packer program is executed, e.g., PKZIP FILENAME. All of the data compression programs work the same way. Each has a list of commands and switches which make it function properly. These commands and switches may be listed by typing the packer program name and pressing <enter>, e.g., PKZIP <enter>. Files may be added, moved, updated, or freshened using the command switches in all of the programs. Some of the data compression programs can compress whole directories; PKZIP, ARJ, and LHA all have this capability. This is extremely handy when hard drive space is tight and you want to add new program or data files. Old programs or data files can be archived (compressed), which frees usable disk space on that "small" hard drive.
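Archiving a whole directory can be sketched with Python's standard zipfile module. This is an illustration only and not the method used inside PKZIP, ARJ, or LHA; the function name zip_directory is an assumption for the example. The sketch stores each file under its path relative to the directory being archived, and it assumes the archive itself is written somewhere outside that directory.

```python
import os
import zipfile


def zip_directory(source_dir, archive_path):
    """Compress every file under source_dir into archive_path.

    Paths are stored relative to source_dir so the directory structure
    can be recreated on extraction. Returns the sorted stored names.
    """
    stored = []
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for folder, _subdirs, files in os.walk(source_dir):
            for filename in files:
                full_path = os.path.join(folder, filename)
                relative = os.path.relpath(full_path, source_dir)
                archive.write(full_path, arcname=relative)
                # ZIP archives always use forward slashes internally
                stored.append(relative.replace(os.sep, "/"))
    return sorted(stored)
```

Once the archive has been checked, the original directory could be deleted to recover the disk space, which is the "small hard drive" trick mentioned above.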

When storing data files on floppy disks, data compression is almost a must to make sure that all the data gets onto one disk. This is the case even with a 100 MB ZIP drive. And remember, most tape backup programs include data compression as part of the tape backup process. Data compression can be accomplished from an operating system environment completely different from the one in which the original data was composed. Windows word processor files can be compressed from the DOS command line using the DOS version of the data compression program, e.g., a letter written in WordPerfect 6.1 can be compressed using the DOS PKWare PK204G PKZIP compression utility. The compressed file can be decompressed in the same manner. This may take extra steps to complete the task, but it assures that we at least have a program available for the occasional data compression or decompression.

PKWare PKZIP/PKUNZIP Syntax for DOS

PKWare PKZIP is one of the programs in which the compression and decompression utilities are separate programs. PK204G.ZIP is the name of the self-extracting file which can be downloaded from the Alamo BBS or from the Internet. PK204G is complete with the PKZIP and PKUNZIP utilities as well as the Command switches necessary to complete its operation. There is a Tutorial in the PK204G setup program.

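Remember the four elements of each data compression command; namely, the program name, the archive name, the command switches, and the names of the files to be processed.

The program and archive name must be part of the command execution and the other elements are optional enhancements.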

 The data compression program sub-directory should be included in the AUTOEXEC.BAT PATH line. Name the sub-directory ZIP or PKWARE, then make sure that sub-directory name is in the PATH line. Next add a line to the AUTOEXEC.BAT like:

 SET PKZIP.CFG=C:\SUB-DIRECTORY NAME (ZIP OR PKWARE)

 Now we are ready to conduct PKZIP data compression functions. To make a ZIP file named TEST all we have to do is type: PKZIP TEST. The program does all of the functions. This is the simplest command which can be executed. Two things happen. First, the archive created is always a .ZIP file, and second, unless file names are specified, all the files in the current sub-directory will be compressed. Now let us describe the function of compressing one data file. This procedure assumes that the PKZIP utility is available in the directory or in the system PATH, that the data to be compressed is the file DATA.WPD in the sub-directory C:\TEMP, and that we will name the compressed file TEST. The syntax for this command is:

 <PKZIP TEST DATA.WPD>

 We have now compressed the file DATA.WPD into a file called TEST.ZIP. We do not have to add the .ZIP after TEST as it will be added by the PKZIP utility. Now, let us use the first of the PKWare command options, i.e., the view (-v) command. The syntax for this command is:

 <PKZIP TEST -v>

 The (-v) command is typical of these commands. It allows us to look at the compressed file and see what is in it. In this case we would have seen that one file was compressed and that its name was DATA.WPD.
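For readers curious about what a view command has to look up, here is a minimal Python sketch that reads the same kind of information from a .ZIP archive using the standard zipfile module. It is not PKZIP's own code, and the function name list_archive is an assumption for the example.

```python
import zipfile


def list_archive(archive):
    """Return (name, original size, compressed size) for each file in a .ZIP archive.

    `archive` may be a file name or an open binary file object.
    """
    rows = []
    with zipfile.ZipFile(archive) as zipped:
        for info in zipped.infolist():
            rows.append((info.filename, info.file_size, info.compress_size))
    return rows
```

Calling list_archive("TEST.ZIP") on the archive made above would be expected to return a single row for DATA.WPD, with its original and compressed sizes.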

There are thirty-four commands available inside the PKZIP utility. There are also nineteen commands available inside the PKUNZIP utility. These may be viewed in detail by printing the PK204G manual or by registering the program. It is the use of these commands and options which makes PKZIP and the other compression programs seem hard to use. They are NOT hard to use, but they must be studied in some detail to get the full benefit from them.

Now we can extend the PKZIP utility to compressing more than one file at a time. Let us assume that we have four data files, DATA1.WPD, DATA2.WPD, DATA3.WPD, and DATA4.WPD, to be compressed into an archive. We have placed these files in the sub-directory C:\TEMP. This compressed file will be named TEST2. The syntax for this operation is:

 <PKZIP TEST2 *.WPD>

 Notice that PKZIP lets us use the standard DOS wildcard character <*> to find the files ending with .WPD in this directory. The zipped file name is TEST2.ZIP and it contains all four .WPD data files. We can view inside this compressed file again using the (-v) command. It will allow us to view all four files, or we can use the command to view one of the files. The syntax for viewing one of the compressed files is:

 <PKZIP TEST2 -v DATA1.WPD>

 or the syntax for viewing all of them is:

 <PKZIP TEST2 -v>
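The wildcard compression step above can be sketched in Python with the standard glob and zipfile modules. This is only an illustration of the idea; the function name zip_matching is an assumption, and the automatic .ZIP extension simply imitates the behavior the text describes for PKZIP.

```python
import glob
import os
import zipfile


def zip_matching(directory, pattern, archive_name):
    """Compress every file in `directory` matching `pattern` into one archive.

    Like PKZIP, add the .ZIP extension when the archive name has none.
    Returns the path of the archive that was written.
    """
    if not os.path.splitext(archive_name)[1]:
        archive_name += ".ZIP"
    # Collect the matches first so the new archive can never match itself
    matches = sorted(glob.glob(os.path.join(directory, pattern)))
    with zipfile.ZipFile(archive_name, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in matches:
            # Store only the file name, without directory information
            archive.write(path, arcname=os.path.basename(path))
    return archive_name
```

With the four files from the example in a directory, zip_matching(directory, "*.WPD", "TEST2") would write TEST2.ZIP containing DATA1.WPD through DATA4.WPD.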

 Decompressing archived files with PKWare's program is accomplished the same way, but with the PKUNZIP utility. Using the above examples we can decompress (unzip) TEST.ZIP by using the following syntax:

 <PKUNZIP TEST>

 This restores the original file DATA.WPD from TEST.ZIP.

 Continuing, we can decompress the TEST2.ZIP archive containing four data files by typing the syntax:

 <PKUNZIP TEST2>

 This command will restore all four data files.
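The key point is that unzipping gives back exactly what was zipped. The short Python sketch below compresses a set of files into an in-memory archive, checks the archive, and unpacks it again. It uses Python's standard zipfile module rather than PKZIP and PKUNZIP, and the function name round_trip is an assumption for the example.

```python
import io
import zipfile


def round_trip(files):
    """Zip a dict of {name: bytes} in memory, then unzip it again.

    Returns the unpacked files as a new dict of {name: bytes}.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, data in files.items():
            archive.writestr(name, data)
    with zipfile.ZipFile(buffer) as archive:
        bad_member = archive.testzip()  # checks the stored CRC of every member
        if bad_member is not None:
            raise ValueError("damaged archive member: " + bad_member)
        return {name: archive.read(name) for name in archive.namelist()}
```

Comparing the returned dictionary with the original one shows whether every byte survived the trip.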

The PKWare Windows utilities work the same way within the Windows environment. They work from Windows 3.xx File Manager. There is another compression utility which works even better in the Windows 3.xx or Windows 95 environment. This program is WINZIP from Nico Mak Computing.

WINZIP 6.2

WINZIP is a program which utilizes compression programs in the Windows graphical environment. It is almost a must have program for those of us who use the Internet on a daily basis. Nico Mak Computing has packaged the shareware program so that it loads directly into the Windows 3.xx File Manager or into Windows 95. It has both 16 bit and 32 bit versions.

 It can be set up to be the browser program helper in Netscape. It contains toolbar buttons for Opening, Adding, Extracting, Viewing, Checking Out, and starting New files. There are two options, Wizard or Classic, for performing the functions. Wizard is the fast method and Classic allows you to see the full details of the compressed file.

The latest version, 6.2, has many features which directly relate to Internet functions. Version 6.2 now lets one open and extract UUencoded, XXencoded, BinHex, and MIME files. These are all methods of encoding data files as text for transmission, and they are common on Unix and other operating systems besides DOS and Windows. This capability is needed by anyone who downloads documents from the Internet with a browser or by FTP.
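It is worth noting that formats such as UUencode and MIME base64 turn binary data into plain text so that it can travel through mail and news systems, and that they generally make the data larger rather than smaller. The Python sketch below compares the two ideas using only the standard library; the function name encoded_sizes and the sample data are assumptions for the example.

```python
import base64
import binascii
import zlib


def encoded_sizes(data):
    """Compare the size of data after text encoding and after compression."""
    return {
        "original": len(data),
        "base64 (MIME)": len(base64.b64encode(data)),  # 4 output bytes per 3 input bytes
        "deflate": len(zlib.compress(data)),           # the compression method used in .ZIP files
    }


sample = b"ABC" * 1000
print(encoded_sizes(sample))

# UUencoding works one short line at a time (at most 45 bytes of input per line)
line = binascii.b2a_uu(b"Cat")
print(line, binascii.a2b_uu(line))
```

This is why a file is often compressed first and then encoded for mailing, and why a utility that handles both steps is convenient.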

WINZIP 6.2 also has a Favorite Toolbar button which holds the File/Favorite Zip Folders in a menu entry list so that you can easily find where the download was directed to. It allows for easy access to Archives or recently downloaded files.

 It also has a self-extraction utility included so that you can compress files and send them to other individuals who do not have compression utilities. It is called WinZip Self-Extractor Personal Edition and works in both 16 and 32 bit versions.
 
 

Conclusion

This article concludes a summary of the theory of data compression. Additionally, we have attempted to give an overview of the syntax used in one of the popular data compression programs. NOTE that I have touched very little on direct use in telecommunications or computer data transmission via BBS or the Internet. Data compression is important in its own right. It goes without saying that anyone who gets on the Internet will before long have to use data compression utilities. Downloads from the Internet are nearly always compressed and require decompressing to use.
 
  I recommend highly that each program be obtained and registered and maintained in its latest form.

It is just good computer practice to obtain and learn at least one data compression program. The PKWARE PK204G.ZIP utilities are highly used. This utility can be downloaded as Shareware from most BBS's, as can most of the other programs. There are Windows 3.11 and Windows 95 versions of most of them. We all need to learn at least one of them well.
 
 

John Woody is a telecommunications consultant specializing in small business communications networks and Internet business training.