Book Review of:
XML: Principles, Tools, and Techniques

From the November, 1999 issue of PC Alamode Magazine
by Russell Albach
When you access a web site, and a page begins to display text, graphics and maybe sound, what you are actually viewing is what the web site is telling your browser to show you. The actual page looks like it belongs in a display on ancient writings. Open the Edit menu, select show source and you will see the page as it appears to your browser. All the information you see is instructions for your browser. These instructions are written in what is referred to as HTML (HyperText Markup Language). I still don't see this as a language, but rather as a format. You can see something similar in your word processor. Open the edit menu and select show codes. This displays the formatting codes such as hard return, paragraph, etc. in addition to the text. 

The "magic" in a web page is in the browser, hence the "browser wars" we are now suffering through. Because browsers use different code and formats, a page that displays as designed in one browser may not in another. HTML was, and is, supposed to level the playing field by providing a common structure to work with. But so many variations on HTML are in use that the rules keep changing.

If you look in the cache subdirectory of your browser, you will see many files with extensions like .cla (Java applet), .css (cascading style sheet), .ram (Real Audio), .swf (ShockWave file), etc. These files are part of applications sometimes referred to as plug-ins, and are used to perform certain tasks through the browser. Due to variations in formatting, and the increasing incompatibility of the browsers, there is no consistency from one browser to another. This makes it very difficult for programmers and web designers to produce products for the web, and keeps the Internet in a constant state of flux. What is needed is a format that is more flexible than HTML. Fortunately, the W3C is a group that is working to do just that. W3C stands for World Wide Web Consortium, a body made up of individuals and corporations whose mission is the development of the web to its fullest potential. One of the areas it is working on is a modification of the language used on the web. XML (eXtensible Markup Language) is the new 'standard' being developed for structured documents on the web.

To illustrate the problem, let's use a bookstore to represent the web. Pick up a book (novel, biography, fiction, etc.) with text only, no illustrations or photos. Every book of text has the same characteristics such as dark print on light paper, reading from top left to bottom right. Let this represent the FORMAT. The ideas and words differ from one book to another, but the FORMAT is the same. Now pick out a photo essay on a pictorial tour of China. This has photos with little or no text, and this type of book differs from one to another in layout. This represents CONTENT, which has no relation to FORMAT. Combine the two and you get a magazine, or a periodical, or a newspaper. In other words, combining FORMAT and CONTENT can give something as organized as National Geographic, or as disorganized as a local newspaper. Right now, combining both is making a mess of the internet, and defining and refining XML might fix this. Confused? You should be. The people working on this seem to have problems agreeing on how to develop and implement a 'standard'. 
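As a rough illustration of keeping CONTENT apart from FORMAT, here is a minimal Python sketch. It is not taken from the book: the catalog, its tag names (bookstore, book, title), and the helper functions list_titles and as_html are all invented for this example. The XML holds only the content, and a separate function decides how to present it.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalog: the tag names are made up for this illustration.
# XML lets the author define tags that describe the content itself.
CATALOG = """<bookstore>
  <book kind="text">
    <title>A Novel</title>
  </book>
  <book kind="photo-essay">
    <title>A Pictorial Tour of China</title>
  </book>
</bookstore>"""


def list_titles(xml_text):
    """Return (kind, title) pairs: the CONTENT, with no presentation attached."""
    root = ET.fromstring(xml_text)
    return [(book.get("kind"), book.findtext("title"))
            for book in root.findall("book")]


def as_html(xml_text):
    """One possible FORMAT: render the same content as an HTML bullet list."""
    items = "".join(f"<li>{title}</li>" for _, title in list_titles(xml_text))
    return f"<ul>{items}</ul>"


if __name__ == "__main__":
    print(list_titles(CATALOG))
    print(as_html(CATALOG))
```

The same catalog could be handed to a different function to produce a table, a plain-text list, or a printed page without touching the XML itself.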

XML: Principles, Tools, and Techniques is part of the World Wide Web Journal, and goes into detail about the problems mentioned above. This journal is a compilation of a series of articles about the web. Because it is written by many different contributors, you get varied viewpoints, demonstrating the problems involved in creating these 'standards'. While this journal is not light reading, or something to put you to sleep at night, it does have some entertaining parts. An example is a subtitle by David Siegel for his contribution "The Web Is Ruined And I Ruined It". In some ways this journal seems like a complaint box, each contributor explaining why that won't work, but this will. On the other hand, there is a lot of useful information. Technical papers discuss XML, Chemical Markup Language, codifying medical records in XML, and JUMBO (Java Universal Markup Language), an object-based XML browser. The explanations of all the acronyms alone are worth the price, with the examples of code merely a bonus. Yes, there is a lot of sample code included, but not for use on your site. Rather, it is used to illustrate and explain each of the techniques mentioned. The W3C Journal is like a medical, legal, or engineering journal: it is designed to keep you updated on current happenings in a specific field. The Journal is published quarterly, and you can subscribe from the publisher, or purchase individual copies from larger local bookstores.

If you are interested in where the Internet is, and where it is going, this is a good road map. It is the thoughts and practices of the people directing it. 

Subscriptions to the Journal cost $99.95 per year for four issues, one per quarter, postage included. This individual issue carries a retail price of $29.95, and copies are generally available in the larger bookstores. You can obtain examples of books from the publisher via ftp: ftp:oreilly.com (login: anonymous; password: your e-mail address). For e-mail mailing list updates, write to listproc@online.oreilly.com and put the following information in the first line of your message (not in the Subject field):
subscribe oreilly-news "your name" of "organization"
O'Reilly & Associates, Inc.
 


I am a Petroleum Landman and find computers useful in that area. However, I prefer working with people. Have you ever tried to outsmart an inanimate object? You can reach me at arben2@swbell.net