XML: The Parent/Child Relationship

As you'll recall from previous chapters, XML documents are highly structured and must follow strict rules to be well-formed or valid. This structure imposes a specific hierarchical order for XML elements. You will see that all XML documents are organized into a "family tree" of parent/child elements.

Back to Basics

Remember that an XML document can have only one root element. Remember also that every element must nest properly: a child element's closing tag must appear before its parent element's closing tag. The following code demonstrates a proper tree structure and a correct parent/child relationship among tags and elements:

<ROOT>
  <C1-ROOT.Child>
    <Ca-C1.Child></Ca-C1.Child>
    <Cb-C1.Child></Cb-C1.Child>
    <Cc-C1.Child></Cc-C1.Child>
  </C1-ROOT.Child>

  <C2-ROOT.Child>
    <Ca-C2.Child></Ca-C2.Child>
    <Cb-C2.Child></Cb-C2.Child>
    <Cc-C2.Child></Cc-C2.Child>
  </C2-ROOT.Child>
</ROOT>

This hierarchical tree shows how each subelement is a child of a higher element, with the root element on top. Now if some lines are added to connect the "branches," you can clearly see the tree structure of the document.

<ROOT>
|-<C1-ROOT.Child>
| |-<Ca-C1.Child></Ca-C1.Child>
| |-<Cb-C1.Child></Cb-C1.Child>
| |-<Cc-C1.Child></Cc-C1.Child>
|-</C1-ROOT.Child>
|
|-<C2-ROOT.Child>
| |-<Ca-C2.Child></Ca-C2.Child>
| |-<Cb-C2.Child></Cb-C2.Child>
| |-<Cc-C2.Child></Cc-C2.Child>
|-</C2-ROOT.Child>
</ROOT>
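
If you want to confirm these relationships programmatically, the short Python sketch below may help. It is only an illustration: it uses the xml.etree.ElementTree module from Python's standard library rather than the processor used elsewhere in this book, and the names DOC and parent_child_pairs are made up for this example.

```python
import xml.etree.ElementTree as ET

# The well-formed sample document from this section (hypothetical variable name).
DOC = """<ROOT>
  <C1-ROOT.Child>
    <Ca-C1.Child></Ca-C1.Child>
    <Cb-C1.Child></Cb-C1.Child>
    <Cc-C1.Child></Cc-C1.Child>
  </C1-ROOT.Child>
  <C2-ROOT.Child>
    <Ca-C2.Child></Ca-C2.Child>
    <Cb-C2.Child></Cb-C2.Child>
    <Cc-C2.Child></Cc-C2.Child>
  </C2-ROOT.Child>
</ROOT>"""


def parent_child_pairs(xml_text):
    """Return (parent tag, child tag) pairs, grouped by parent."""
    root = ET.fromstring(xml_text)
    pairs = []
    for parent in root.iter():   # iter() visits every element, root first
        for child in parent:     # iterating an element yields its direct children
            pairs.append((parent.tag, child.tag))
    return pairs


pairs = parent_child_pairs(DOC)
for parent, child in pairs:
    print(f"{child} is a child of {parent}")
```

Every element except ROOT appears exactly once as a child, which is another way of saying that the document forms a single tree.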

All well-formed XML documents are structured in similar ways, although the actual size and complexity of the documents can vary enormously. To further illustrate this concept, let's look at an example of improperly nested elements:

<ROOT>
  <C1-ROOT.Child>
    <Ca-C1.Child>
    <Cb-C1.Child>
    </Ca-C1.Child>
    </Cb-C1.Child>
    <Cc-C1.Child></Cc-C1.Child>
</ROOT>
  </C1-ROOT.Child>

In the fifth line, the Ca-C1.Child element is closed while the Cb-C1.Child element, which was opened inside it, is still open. Also, the ROOT element closes before the C1-ROOT.Child element closes (as shown in the eighth line). An XML processor would report an error for this structure because the document is not well-formed.
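
To see a processor reject this document, you could try a sketch like the following. It assumes Python's standard xml.etree.ElementTree module as a stand-in for the processor used in this book, and the helper name check_well_formed is invented for the example. The exact wording of the error message depends on the underlying parser.

```python
import xml.etree.ElementTree as ET

# The improperly nested sample from this section (hypothetical variable name).
BAD_DOC = """<ROOT>
  <C1-ROOT.Child>
    <Ca-C1.Child>
    <Cb-C1.Child>
    </Ca-C1.Child>
    </Cb-C1.Child>
    <Cc-C1.Child></Cc-C1.Child>
</ROOT>
  </C1-ROOT.Child>"""


def check_well_formed(xml_text):
    """Return None if the text parses, otherwise the parser's error message."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


error = check_well_formed(BAD_DOC)
print(error)
```

Note that the parser stops at the first problem it meets, so it reports only the first nesting error and never reaches the misplaced closing tag for the root element.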

Although the preceding examples are good for illustrative purposes, an actual XML document might help you to see how the hierarchical structure works. Code Listing 5-3, which you can also find in the Chap05\Lst5_3.xml file on the companion CD, shows a simple XML document:

Code Listing 5-3.

<?xml version="1.0"?>
<EMAIL>
  <TO>Jodie@msn.com</TO>
  <FROM>Bill@msn.com</FROM>
  <CC>Philip@msn.com</CC>
  <SUBJECT>My document is a tree</SUBJECT>
  <BODY>This is an example of a tree structure</BODY>
</EMAIL>

Let's send this code through the command-line processor and output the results in "tree mode." The output is shown in Code Listing 5-4, which you can also find in the Chap05\Lst5_4.txt file on the companion CD.

Code Listing 5-4.

DOCUMENT
|---XMLDECL
|   |---ATTRIBUTE version "1.0"
+---ELEMENT EMAIL
    |---ELEMENT TO
    |   +---PCDATA "Jodie@msn.com"
    |---ELEMENT FROM
    |   +---PCDATA "Bill@msn.com"
    |---ELEMENT CC
    |   +---PCDATA "Philip@msn.com"
    |---ELEMENT SUBJECT
    |   +---PCDATA "My document is a tree"
    +---ELEMENT BODY
        +---PCDATA "This is an example of a tree structure"

NOTE
The use of the command-line processor is covered in the Introduction of this book.

As you can see in this code listing, this well-formed document follows the expected structure, and the processor's output is the corresponding tree. To get data out of an XML document, you must understand the parent/child relationships among its elements.
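
If you don't have the command-line processor at hand, a rough equivalent of the tree output can be sketched in a few lines of Python. This is an approximation under stated assumptions, not a reimplementation of that tool: it uses the standard xml.etree.ElementTree module, it prints a plain indented outline instead of the branch characters shown in Code Listing 5-4, and it omits the XML declaration. The names EMAIL_DOC and tree_lines are hypothetical.

```python
import xml.etree.ElementTree as ET

# The document from Code Listing 5-3 (hypothetical variable name).
EMAIL_DOC = """<?xml version="1.0"?>
<EMAIL>
  <TO>Jodie@msn.com</TO>
  <FROM>Bill@msn.com</FROM>
  <CC>Philip@msn.com</CC>
  <SUBJECT>My document is a tree</SUBJECT>
  <BODY>This is an example of a tree structure</BODY>
</EMAIL>"""


def tree_lines(element, depth=0):
    """Return an indented outline of an element and all of its descendants."""
    indent = "    " * depth
    lines = [f"{indent}ELEMENT {element.tag}"]
    text = (element.text or "").strip()   # ignore whitespace-only text
    if text:
        lines.append(f'{indent}    PCDATA "{text}"')
    for child in element:                 # recurse into each direct child
        lines.extend(tree_lines(child, depth + 1))
    return lines


lines = tree_lines(ET.fromstring(EMAIL_DOC))
print("\n".join(lines))
```

Each level of indentation corresponds to one step down the parent/child hierarchy, so the five child elements of EMAIL all appear at the same depth.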

The XML Object Model

You just saw that when an XML processor parses a document, it creates a treelike structure of all the elements included in the document. Keep in mind that this structure exists only in the computer's memory until something is done with it. The data can be output for display as is, but to be really useful it must be accessible in a consistent way. That is possible only if the author understands where the data fits in the document structure.

The XML object model meets this need. It provides an interface that allows an author to access the XML data. The object model exposes properties, methods, and the actual content (data) contained in an object. Since the structure of an XML document is in the form of a tree, you might expect that the object model would let you access the branches, called nodes, of the tree. And you would be correct. The object model lets authors view all parts of the tree, from the root level through its branches. (This chapter will provide an introduction to some parts of the XML object model; later chapters will present more details.)
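
To give a feel for what working with nodes looks like, here is a minimal sketch. It is not the object model covered later in this book; it assumes Python's standard xml.dom.minidom module, which is one implementation of the W3C Document Object Model, as a stand-in. The variable names are chosen for this example only.

```python
from xml.dom import minidom

# The document from Code Listing 5-3 (hypothetical variable name).
EMAIL_DOC = """<?xml version="1.0"?>
<EMAIL>
  <TO>Jodie@msn.com</TO>
  <FROM>Bill@msn.com</FROM>
  <CC>Philip@msn.com</CC>
  <SUBJECT>My document is a tree</SUBJECT>
  <BODY>This is an example of a tree structure</BODY>
</EMAIL>"""

doc = minidom.parseString(EMAIL_DOC)
root = doc.documentElement            # the root element node, EMAIL

# A property: the child nodes of the root. Whitespace between tags is
# stored as text nodes, so keep only the element nodes here.
children = [n for n in root.childNodes if n.nodeType == n.ELEMENT_NODE]
names = [n.nodeName for n in children]

# A method: look up elements by tag name, then read the content
# held in the text node beneath the element.
subject = root.getElementsByTagName("SUBJECT")[0]
subject_text = subject.firstChild.data

# Every node also knows its parent, so you can walk back up the tree.
parent_name = subject.parentNode.nodeName

print(root.nodeName, names)
print(subject_text, "(child of", parent_name + ")")
```

The same three ideas appear in the text above: properties such as nodeName and childNodes, methods such as getElementsByTagName, and the content itself, reached through the text node.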