Using the XML DOM with Visual C++ and COM

  • September 1, 2000
  • By Tom Archer

The XML Document Object Model, or DOM, is a powerful and robust programmatic interface that enables you not only to load and parse an XML file, or document, but also to traverse its data. Using certain objects in the DOM, you can even manipulate that data and then save your changes back to the XML document. A comprehensive look at all of the DOM's functionality would be impossible in the space provided here, so in this article I'll highlight some of the power of the DOM by using it to load an XML document and then iterate through the document's elements.

The key to understanding how to use the DOM is realizing that the DOM exposes XML documents as a hierarchical tree of nodes. As an example, take a look at the following sample XML document.

<?xml version="1.0"?>
<autos>
  <manufacturer name="Chevrolet">
    <make name="Corvette">
      <model>2000 Convertible</model>
      <price currency="usd">60,000</price>
      <horsePower>420</horsePower>
      <fuelCapacity units="gallons">18.5</fuelCapacity>
    </make>
  </manufacturer>
  <manufacturer name="Mazda">
    <make name="RX-7">
      <model>test model</model>
      <price currency="usd">30,000</price>
      <horsePower>350</horsePower>
      <fuelCapacity units="gallons">15.5</fuelCapacity>
    </make>
  </manufacturer>
</autos>

The DOM would interpret this document as follows:

  • <autos> - This is a NODE_ELEMENT (more on this later) and is referred to as the documentElement.
  • <manufacturer>, <make>, <model>, <price>, <horsePower> and <fuelCapacity> - Each one of these is also a NODE_ELEMENT. However, please note that only the top-level NODE_ELEMENT, or root node, is referred to as the documentElement.
  • name="Chevrolet", currency="usd", units="gallons" - When a NODE_ELEMENT contains an attribute/value pair like this, the attribute itself is a NODE_ATTRIBUTE and its value is held as a NODE_TEXT. The text between an element's tags, such as 2000 Convertible, is also a NODE_TEXT.
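
To make the node-type distinction concrete, here is a minimal sketch in standard C++ that models the Corvette fragment of the sample document. The Node struct, its fields, and the helpers Text, Attr, Element, CountInTree and BuildCorvette are all hypothetical stand-ins invented for illustration; they are not MSXML interfaces. Only the three NODE_* values are taken from the MSXML enumeration. The point to notice is that attributes are kept apart from the child list, while the text between tags is an ordinary child node.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a DOM node; not an MSXML type.
// The numeric values match the MSXML node type enumeration.
enum NodeType { NODE_ELEMENT = 1, NODE_ATTRIBUTE = 2, NODE_TEXT = 3 };

struct Node {
    NodeType type;
    std::string name;              // tag or attribute name, "#text" for text
    std::string value;             // attribute value or text content
    std::vector<Node> attributes;  // kept separate from the child list
    std::vector<Node> children;    // child elements and text nodes
};

Node Text(const std::string& value) {
    return Node{NODE_TEXT, "#text", value, {}, {}};
}

Node Attr(const std::string& name, const std::string& value) {
    return Node{NODE_ATTRIBUTE, name, value, {}, {}};
}

Node Element(const std::string& name, std::vector<Node> attrs,
             std::vector<Node> kids) {
    return Node{NODE_ELEMENT, name, "", std::move(attrs), std::move(kids)};
}

// Counts the nodes of one type that are reachable through child lists only.
int CountInTree(const Node& node, NodeType wanted) {
    int count = (node.type == wanted) ? 1 : 0;
    for (const Node& child : node.children)
        count += CountInTree(child, wanted);
    return count;
}

// Builds the <make name="Corvette"> subtree from the sample document.
Node BuildCorvette() {
    return Element("make", {Attr("name", "Corvette")}, {
        Element("model", {}, {Text("2000 Convertible")}),
        Element("price", {Attr("currency", "usd")}, {Text("60,000")}),
        Element("horsePower", {}, {Text("420")}),
        Element("fuelCapacity", {Attr("units", "gallons")}, {Text("18.5")})
    });
}
```

Walking only the child lists of this model finds five elements and four text nodes but no attributes, which is why a child-by-child traversal surfaces text nodes and never attribute nodes.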

As you will see shortly, there are a number of COM components that are part of the XML DOM. Here's a list of some of the more interesting components and their purposes.

  • XMLDOMDocument - The top node of the XML document tree
  • XMLDOMNode - This represents any single node in the XML document tree.
  • XMLDOMNodeList - A collection of XMLDOMNode objects, such as the children of a node
  • XMLDOMNamedNodeMap - A collection of attribute nodes that can be looked up by name

Accessing IE 5's XML Support with Visual C++

I'm a firm believer in a tutorial-style, "let's walk through the code" approach, so let's see just what the DOM can do for us by cranking up the Visual C++ development environment and writing some code to load an XML document and navigate through its elements.

While we could do this with either MFC or ATL, we'll keep things simple (for me at least :) and use MFC. Perform the following steps to create the test project and incorporate IE5 XML support into your application.

  1. Create a new Visual C++ project called XMLDOMFromVC. (This project is included with this article's source code.)
  2. In the MFC AppWizard, define the project as being a dialog-based application.
  3. Once the AppWizard has completed its work, initialize OLE support by inserting a call to ::AfxOleInit in the application class's InitInstance function. Assuming you named your project the same as mine, your code should now look like this (note the AfxOleInit call):

    BOOL CXMLDOMFromVCApp::InitInstance()
    {
     AfxEnableControlContainer();

     // Initialize the OLE libraries before the dialog is created;
     // give up if that fails
     if (!::AfxOleInit())
     {
      return FALSE;
     }

     // .. other code

     // Since the dialog has been closed, return FALSE
     // so that we exit the app, rather than start the
     // application's message pump.
     return FALSE;
    }

  4. At this point, you'll need to import the Microsoft XML Parser typelib (OLE type library). The simplest way to do this is to use the C++ #import directive. Open your project's stdafx.h file and add the following lines before the file's closing #endif directive.

    #import <msxml.dll> named_guids
    using namespace MSXML;

  5. Now we can start declaring some variables to use with the DOM. Open your dialog class's header file (XMLDOMFromVCDlg.h) and add the following smart pointer member variables, where the IXMLDOMDocumentPtr is the pointer to the XML document itself and the IXMLDOMElementPtr is a pointer to the XML document root (as explained above).

    IXMLDOMDocumentPtr m_plDomDocument;
    IXMLDOMElementPtr m_pDocRoot;

  6. Once you've declared the XML smart pointers, insert the following code in your dialog class's OnInitDialog member function (just before the return statement). This code initializes the COM runtime and sets up your XML document smart pointer (m_plDomDocument).

    // Initialize COM. AfxOleInit has already done this for the
    // main thread, so strictly speaking this call is redundant.
    ::CoInitialize(NULL);

    HRESULT hr = m_plDomDocument.CreateInstance(CLSID_DOMDocument);
    if (FAILED(hr))
    {
     _com_error er(hr);
     AfxMessageBox(er.ErrorMessage());
     EndDialog(1);
     return TRUE; // skip the rest of OnInitDialog
    }
    

Now that you've done the preliminary work to include XML support in your Visual C++ application, let's do something useful like actually loading an XML document. To do that, add the following code to your dialog's OnInitDialog member function (just after the initialization code entered above). I've sprinkled comments through the code to explain what I'm doing each step of the way.

// specify xml file name
CString strFileName(_T("XMLDOMFromVC.xml"));

// convert the file name to something COM can handle (BSTR);
// _bstr_t allocates and frees the BSTR for us
_bstr_t bstrFileName((LPCTSTR)strFileName);

// call the IXMLDOMDocumentPtr's load function to load the
// XML document; it returns VARIANT_TRUE on success
if (m_plDomDocument->load(bstrFileName) == VARIANT_TRUE)
{
 // now that the document is loaded, we need
 // to initialize the root pointer
 m_pDocRoot = m_plDomDocument->documentElement;
 AfxMessageBox(_T("Document loaded successfully!"));
}
else
{
 AfxMessageBox(_T("Document FAILED to load!"));
}

Don't believe it's that easy? Add the following call to have the contents of your entire XML document displayed in a message box.

AfxMessageBox(m_plDomDocument->xml);

Now, build and run the application and you should see results similar to Figure 1.

Figure 1. Loading and displaying an XML document can take just a few lines of code with the DOM.

Ok. Ok. This doesn't really count as reading through an XML document, but I wanted to show you that you had successfully loaded a document and that you can easily get the entire document's contents with a single line of code. In the next section, we'll see how to manually iterate through XML elements.

Iterating Through an XML Document

In this section, we'll learn about a couple of properties that you'll use quite often when iterating through a document's elements: IXMLDOMNodePtr::firstChild and IXMLDOMNodePtr::nextSibling.

The following recursive function shows how you can do this quite easily. In fact, if you insert this code into the dialog's OK button handler, it will display each element in your document:

void CXMLDOMFromVCDlg::OnOK() 
{
 // send the root to the DisplayChildren function
 // (the root is NULL if the document failed to load)
 if (m_pDocRoot != NULL)
 {
  DisplayChildren(m_pDocRoot);
 }
}

void CXMLDOMFromVCDlg::DisplayChildren(IXMLDOMNodePtr pParent)
{
 // display the current node's name
 DisplayChild(pParent);

 // simple for loop to get all children
 for (IXMLDOMNodePtr pChild = pParent->firstChild;
      NULL != pChild;
      pChild = pChild->nextSibling)
 {
  // for each child, call this function so that we get 
  // its children as well
  DisplayChildren(pChild);
 }
}

void CXMLDOMFromVCDlg::DisplayChild(IXMLDOMNodePtr pChild)
{
 AfxMessageBox(pChild->nodeName);
}

If you were to build and run the project at this point, you would definitely notice something peculiar. The first few message boxes will appear as you might expect: the first one displays the value "autos", followed by "manufacturer", then "make" and finally "model". However, at that point (after the message box displaying the value "model") things will get a little strange. Instead of a message box displaying the value "price", the value "#text" will be displayed! The reason for this is simple.

Let's go back to our XML document. In particular, this line:

      <model>2000 Convertible</model>

As you can see in the line above, a value follows the opening model tag. These "values" are still treated as nodes when you walk the tree with the IXMLDOMNodePtr::firstChild and IXMLDOMNodePtr::nextSibling properties. So how do you know what type of node you have? By using the IXMLDOMNodePtr::nodeType property. Simply modify your dialog's CXMLDOMFromVCDlg::DisplayChild member function to look like the code below. When you've done that and run the code, you will see the text values themselves instead of the literal "#text".

void CXMLDOMFromVCDlg::DisplayChild(IXMLDOMNodePtr pChild)
{
 if (NODE_TEXT == pChild->nodeType)
 {
  AfxMessageBox(pChild->text);
 }
 else
 {
  AfxMessageBox(pChild->nodeName);
 }
}
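
The traversal and the type check can also be tried without COM. The sketch below, in standard C++, uses a hypothetical SimpleNode struct whose firstChild and nextSibling pointers imitate the MSXML properties of the same name. SimpleTree, Label, CollectLabels and BuildSample are likewise illustrative helpers and not part of the DOM. Instead of showing message boxes, the sketch collects the strings that the message boxes would display for a shortened version of the sample document.

```cpp
#include <deque>
#include <string>
#include <vector>

// Same numeric values as the MSXML node types used in this article.
enum { NODE_ELEMENT = 1, NODE_TEXT = 3 };

// Hypothetical stand-in for IXMLDOMNode: only the two links and the
// fields needed for display are modelled.
struct SimpleNode {
    int nodeType;
    std::string nodeName;     // tag name, or "#text" for text nodes
    std::string text;         // content of a text node
    SimpleNode* firstChild;
    SimpleNode* nextSibling;
};

// Owns the nodes; a deque keeps element addresses stable on push_back.
struct SimpleTree {
    std::deque<SimpleNode> storage;

    // Creates a node and appends it as the last child of parent (if any).
    SimpleNode* Add(SimpleNode* parent, int type, const std::string& name,
                    const std::string& text = "") {
        storage.push_back(SimpleNode{type, name, text, nullptr, nullptr});
        SimpleNode* node = &storage.back();
        if (parent != nullptr) {
            if (parent->firstChild == nullptr) {
                parent->firstChild = node;
            } else {
                SimpleNode* last = parent->firstChild;
                while (last->nextSibling != nullptr) last = last->nextSibling;
                last->nextSibling = node;
            }
        }
        return node;
    }
};

// Mirrors DisplayChild: text nodes show their content, others their name.
std::string Label(const SimpleNode& node) {
    return node.nodeType == NODE_TEXT ? node.text : node.nodeName;
}

// Mirrors DisplayChildren: visit the node, then recurse into each child.
void CollectLabels(const SimpleNode* parent, std::vector<std::string>& out) {
    out.push_back(Label(*parent));
    for (const SimpleNode* child = parent->firstChild; child != nullptr;
         child = child->nextSibling) {
        CollectLabels(child, out);
    }
}

// Builds autos > manufacturer > make > (model, price) with their text.
SimpleNode* BuildSample(SimpleTree& tree) {
    SimpleNode* autos = tree.Add(nullptr, NODE_ELEMENT, "autos");
    SimpleNode* manufacturer = tree.Add(autos, NODE_ELEMENT, "manufacturer");
    SimpleNode* make = tree.Add(manufacturer, NODE_ELEMENT, "make");
    SimpleNode* model = tree.Add(make, NODE_ELEMENT, "model");
    tree.Add(model, NODE_TEXT, "#text", "2000 Convertible");
    SimpleNode* price = tree.Add(make, NODE_ELEMENT, "price");
    tree.Add(price, NODE_TEXT, "#text", "60,000");
    return autos;
}
```

Calling CollectLabels on the root returned by BuildSample yields the element names in document order, with each text node's content appearing right after its parent element, just as the revised DisplayChild does.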

You no doubt also noted the "magic" constant used above (NODE_TEXT). All the node types are defined in an enumeration in the msxml.tlh file that the #import directive you used earlier generated. This enumeration is listed below:

enum tagDOMNodeType
{
    NODE_INVALID = 0,
    NODE_ELEMENT = 1,
    NODE_ATTRIBUTE = 2,
    NODE_TEXT = 3,
    NODE_CDATA_SECTION = 4,
    NODE_ENTITY_REFERENCE = 5,
    NODE_ENTITY = 6,
    NODE_PROCESSING_INSTRUCTION = 7,
    NODE_COMMENT = 8,
    NODE_DOCUMENT = 9,
    NODE_DOCUMENT_TYPE = 10,
    NODE_DOCUMENT_FRAGMENT = 11,
    NODE_NOTATION = 12
};
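
When debugging a traversal, it can help to show a node's type by name rather than by number. The helper below is a hypothetical convenience and not something msxml.tlh provides. The enumeration is repeated here, under the assumed name DomNodeType inside a sketch namespace, only so that the example compiles on its own without clashing with the imported names. In the real project you would pass the value of the nodeType property directly.

```cpp
#include <string>

namespace sketch {

// Copy of the values listed above, so the example is self-contained.
enum DomNodeType {
    NODE_INVALID = 0,
    NODE_ELEMENT = 1,
    NODE_ATTRIBUTE = 2,
    NODE_TEXT = 3,
    NODE_CDATA_SECTION = 4,
    NODE_ENTITY_REFERENCE = 5,
    NODE_ENTITY = 6,
    NODE_PROCESSING_INSTRUCTION = 7,
    NODE_COMMENT = 8,
    NODE_DOCUMENT = 9,
    NODE_DOCUMENT_TYPE = 10,
    NODE_DOCUMENT_FRAGMENT = 11,
    NODE_NOTATION = 12
};

// Returns the symbolic name for a node type value, or "UNKNOWN" when
// the value is outside the range of the enumeration.
inline std::string NodeTypeName(int type) {
    static const char* const kNames[] = {
        "NODE_INVALID", "NODE_ELEMENT", "NODE_ATTRIBUTE", "NODE_TEXT",
        "NODE_CDATA_SECTION", "NODE_ENTITY_REFERENCE", "NODE_ENTITY",
        "NODE_PROCESSING_INSTRUCTION", "NODE_COMMENT", "NODE_DOCUMENT",
        "NODE_DOCUMENT_TYPE", "NODE_DOCUMENT_FRAGMENT", "NODE_NOTATION"
    };
    const int count = static_cast<int>(sizeof(kNames) / sizeof(kNames[0]));
    return (type >= 0 && type < count) ? kNames[type] : "UNKNOWN";
}

}  // namespace sketch
```

A lookup table indexed by the enumeration value works here because the values are contiguous and start at zero; a switch statement would be the safer choice if that ever stopped being true.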

Summary

In this article, you discovered the XML DOM and learned how to access its features from Visual C++ / COM. The demo we built illustrated the following basic DOM functions:

  • Loading an XML document
  • Iterating through a document's nodes
  • Determining a node's type
  • Displaying NODE_TEXT node values

There is obviously much more to DOM than what you've seen here, but hopefully what you've learned will whet your appetite to dig into the documentation and to see all the great things you can do with XML documents using the DOM.

Download the source and project files (15k).

About the author: Tom Archer runs the CodeGuru Web site as well as a brand new site dedicated to Windows DNA and .NET programming called sourceDNA. In addition, he also writes books and magazine articles and occasionally speaks at major Visual C++ conferences. When Tom isn't doing all that, he somehow tries to find time for his two favorite hobbies: traveling and meeting new people.





