A Flexible, Compile Time, Configurable XML Parser

  • October 1, 2004
  • By Radu Braniste

Implementation

NativeXMLParser takes two template parameters:

template <class MODEL_TYPE, class SPECIAL_CONFIGURATION_TYPE>
class XMLParser;

MODEL_TYPE is the model (SAX, DOM, or something else). The implementation simply uses the template form of the Strategy pattern to vary the nature of the parser:

void parseXMLEvt(const std::string& s)
{
   typedef XMLParser<XMLEvtHandler, SpecialDataHandler> XI;
   //use XI
}
void parseXMLDOM(const std::string& s)
{
   typedef XMLParser<XMLDOM, SpecialDataHandler> XI;
   //use XI
}
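
To make the Strategy idea concrete, here is a minimal sketch, not the article's parser. ToyParser, RecordingModel, and CountingModel are hypothetical names invented for this illustration; only setName mirrors the interface used in this article. The scanning logic is fixed, and the model type passed as a template argument decides what happens with each element name.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model: remembers every element name it is given.
struct RecordingModel
{
   std::vector<std::string> names;
   void setName(const std::string& name) { names.push_back(name); }
};

// Hypothetical model: only counts the elements.
struct CountingModel
{
   int count;
   CountingModel() : count(0) {}
   void setName(const std::string&) { ++count; }
};

// Toy stand-in for XMLParser: finds opening tags of the form <name>
// and forwards each name to whichever model was chosen at compile time.
template <class MODEL_TYPE>
struct ToyParser
{
   static MODEL_TYPE parse(const std::string& s)
   {
      MODEL_TYPE model;
      std::string::size_type pos = 0;
      while ((pos = s.find('<', pos)) != std::string::npos)
      {
         std::string::size_type end = s.find('>', pos);
         if (end == std::string::npos)
            break;                       // unterminated tag: stop scanning
         assert(end > pos);
         if (s[pos + 1] != '/')          // ignore closing tags
            model.setName(s.substr(pos + 1, end - pos - 1));
         pos = end + 1;
      }
      return model;
   }
};
```

Calling ToyParser<RecordingModel>::parse(s) and ToyParser<CountingModel>::parse(s) runs the same scan with two different behaviors, with no virtual calls involved.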

XMLDOM and XMLEvtHandler are just implementations of an interface with the following signature:

struct I
{
   void setName(const std::string& name)
   {}
   void setContent(const std::string& content)
   {}
   XMLBit* getRoot()
   { return 0; }   // a non-void function must return a value
   void addNewXMLBit()
   {}
   void setParent()
   {}
   void setParentToBit()
   {}
};

SPECIAL_CONFIGURATION_TYPE is by far the more interesting parameter. As mentioned before, the functionality of this parser can be adjusted at compile time, and SPECIAL_CONFIGURATION_TYPE is what makes that possible.

The idea is simple: having defined a set of properties to be adjusted at compile time, it is natural to look for a compile-time, type-safe mechanism to load them. The mechanism has been known for a long time [3]. This implementation uses a loop that traverses the properties and loads their values according to each property's specialization.

This is the set of properties:

namespace XMLPARSER_SPECIAL
{
   enum COMMANDS{
      CDATA=XML_DICTIONARY::CDATA_START/*'['*/,
      COMMENT=XML_DICTIONARY::COMMENT_BIT/* '-'*/, CONTENT
   };
}

A parallel index enumeration is used to loop through them quickly:

namespace XMLPARSER_SPECIAL_INDEX
{
   enum COMMANDS{
      CDATA, COMMENT, CONTENT, LAST_COMMAND
   };
}

Let's see how this works:

template < template <int Y> class E,  int T  >
struct XMLParserHandler
{
   template<class V>
   static int idx(V& v)
   {
      typedef  E<T> B;
      SpecialData sd;
      SpecialHandler<B::value>::getName(sd.name);
      SpecialHandler<B::value>::getEnd(sd.end);
      sd.offset = SpecialHandler<B::value>::getStartOffset();
      v[T] = sd;
      return XMLParserHandler< E, T+1>::idx(v);
   }
};
template <  template <int Y> class E>
struct XMLParserHandler<E, XMLPARSER_SPECIAL_INDEX::LAST_COMMAND  >
{
   template<class V>
   static int idx(V& v)
   {
      return 0;
   }
};

XMLParserHandler is a loop in disguise that's executed recursively until the exit condition is met (the XMLParserHandler specialization for LAST_COMMAND, which returns zero). Interesting things happen during the looping process:

  • T is incremented.
  • There is an E<T> that changes at every iteration.
  • A SpecialHandler depending on E<T> collects information in a very simple structure of type SpecialData.
  • Finally, a container of type V stores a copy of the SpecialData instance.
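
The steps above can be tried in isolation with a stripped-down sketch. Step, Label, Loop, and fill are hypothetical names that are not part of the article's code, and unlike XMLParserHandler::idx this loop returns the number of iterations performed, purely to make the recursion observable.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical index; LAST plays the role of LAST_COMMAND.
enum Step { FIRST, SECOND, THIRD, LAST };

// Primary template: fallback for any index without a specialization.
template <int I>
struct Label
{
   static const char* get() { return ""; }
};
template <> struct Label<FIRST>  { static const char* get() { return "first"; } };
template <> struct Label<SECOND> { static const char* get() { return "second"; } };
template <> struct Label<THIRD>  { static const char* get() { return "third"; } };

// One "iteration": store the label for T, then recurse on T + 1.
template <int T>
struct Loop
{
   template <class V>
   static int run(V& v)
   {
      v[T] = Label<T>::get();
      return 1 + Loop<T + 1>::run(v);
   }
};

// Exit condition: the specialization for LAST ends the recursion.
template <>
struct Loop<LAST>
{
   template <class V>
   static int run(V&)
   {
      return 0;
   }
};

// Sizes the container, runs the loop, and returns the iteration count.
inline int fill(std::vector<std::string>& v)
{
   v.assign(LAST, std::string());
   assert(v.size() == LAST);
   return Loop<FIRST>::run(v);
}
```

The compiler generates Loop<0>, Loop<1>, and Loop<2> and stops at the Loop<LAST> specialization, so the "loop" is fully unrolled at compile time.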

The last step is to understand what E and SpecialHandler are expected to be:

template < int I>
struct SpecialHandler
{
   static void getName(std::string& name)
   { }
   static void getEnd(std::string& end)
   { }
   static int getStartOffset()
   {
      return 0;
   }
};
template < int Y>
struct Int2Type
{
   enum {
      value= Y
   };
};
template <int Y>
struct EXPER_ALL : public Int2Type<Y>
{};
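
The primary SpecialHandler template only supplies empty defaults. As an illustration of what a per-command specialization could look like, here is a sketch for comments. The returned name and start offset are assumptions made for this example, not values taken from the article's download; only the "-->" terminator comes from the XML syntax itself. endMarker and checkHandlers are hypothetical helpers.

```cpp
#include <cassert>
#include <string>

namespace XMLPARSER_SPECIAL_INDEX
{
   enum COMMANDS { CDATA, COMMENT, CONTENT, LAST_COMMAND };
}

// Primary template, as in the article: empty defaults.
template <int I>
struct SpecialHandler
{
   static void getName(std::string&) {}
   static void getEnd(std::string&) {}
   static int getStartOffset() { return 0; }
};

// Assumed specialization for comments (illustrative values only).
template <>
struct SpecialHandler<XMLPARSER_SPECIAL_INDEX::COMMENT>
{
   static void getName(std::string& name) { name = "comment"; }  // assumed label
   static void getEnd(std::string& end) { end = "-->"; }         // XML comment terminator
   static int getStartOffset() { return 4; }                     // assumed: length of "<!--"
};

// Hypothetical helper: fetch the end marker for a given index.
template <int I>
std::string endMarker()
{
   std::string end;
   SpecialHandler<I>::getEnd(end);
   return end;
}

inline void checkHandlers()
{
   assert(endMarker<XMLPARSER_SPECIAL_INDEX::COMMENT>() == "-->");
   // CDATA has no specialization in this sketch, so the primary template applies.
   assert(endMarker<XMLPARSER_SPECIAL_INDEX::CDATA>().empty());
}
```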

SpecialHandler is specialized for every entry in COMMANDS (actually XMLPARSER_SPECIAL_INDEX::COMMANDS). The EXPER templates are a simple way to express whether or not a property should be included.

As expected, EXPER_ALL takes all properties (COMMANDS) into account. But:

template <int Y>
struct EXPER_NO_CDATA: public EXPER_ALL<Y>
{};
template <>
struct EXPER_NO_CDATA<XMLPARSER_SPECIAL_INDEX::CDATA> :
       public EXPER_ALL<XMLPARSER_SPECIAL_INDEX::LAST_COMMAND>
{};

EXPER_NO_CDATA, as the name says, doesn't consider CDATA. That's why the CDATA index is mapped to LAST_COMMAND (a command that's never actually executed). Based on this extensible mechanism, we can now write code like this:

void parseXMLEvt(const std::string& s)
{
   typedef XMLParser<XMLEvtHandler, SpecialDataHandler> XI;
   XI::parse<XMLParserHandlerInit<EXPER_NO_CDATA> >(s);
}
void parseXMLDOM(const std::string& s)
{
   typedef XMLParser<XMLDOM, SpecialDataHandler> XI;
   XI::parse<XMLParserHandlerInit<EXPER_NO_CDATA> >(s);
}
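
To see what the selector does on its own, the following sketch repeats the Int2Type, EXPER_ALL, and EXPER_NO_CDATA definitions from this article and adds a small compile-time query. IsEnabled and the three helper functions are hypothetical additions for this illustration; they are not part of the original code.

```cpp
#include <cassert>

namespace XMLPARSER_SPECIAL_INDEX
{
   enum COMMANDS { CDATA, COMMENT, CONTENT, LAST_COMMAND };
}

template <int Y>
struct Int2Type
{
   enum { value = Y };
};
template <int Y>
struct EXPER_ALL : public Int2Type<Y>
{};
template <int Y>
struct EXPER_NO_CDATA : public EXPER_ALL<Y>
{};
template <>
struct EXPER_NO_CDATA<XMLPARSER_SPECIAL_INDEX::CDATA> :
       public EXPER_ALL<XMLPARSER_SPECIAL_INDEX::LAST_COMMAND>
{};

// Hypothetical query: a property is enabled when the selector
// does not redirect its index to LAST_COMMAND.
template <template <int> class E, int I>
struct IsEnabled
{
   enum { value = (int(E<I>::value) != int(XMLPARSER_SPECIAL_INDEX::LAST_COMMAND)) };
};

inline bool noCdataKeepsCdata()
{
   return IsEnabled<EXPER_NO_CDATA, XMLPARSER_SPECIAL_INDEX::CDATA>::value != 0;
}
inline bool noCdataKeepsComment()
{
   return IsEnabled<EXPER_NO_CDATA, XMLPARSER_SPECIAL_INDEX::COMMENT>::value != 0;
}
inline bool allKeepsCdata()
{
   return IsEnabled<EXPER_ALL, XMLPARSER_SPECIAL_INDEX::CDATA>::value != 0;
}

inline void checkSelectors()
{
   assert(!noCdataKeepsCdata());    // CDATA is redirected to LAST_COMMAND
   assert(noCdataKeepsComment());   // other indices pass through unchanged
   assert(allKeepsCdata());
}
```

Because the redirection is expressed through inheritance and one explicit specialization, adding a new selector does not require touching the loop.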

The EXPER family contains the following variations:

EXPER_ALL
EXPER_NO_CDATA
EXPER_CONTENT_ONLY
EXPER_BASIC (no CDATA, no comments, no content)
EXPER_NO_CONTENT
EXPER_NO_COMMENT

The code was tested with both VC++ 7.1 and g++ 3.3 compilers.

Download the Code

Download the code that accompanies this article here.

References

[1] Erich Gamma, Richard Helm, Ralph Johnson, and John M. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley 1994.

[2] Dov Bulka and David Mayhew. Efficient C++: Performance Programming Techniques, Addison-Wesley 1999.

[3] T. Veldhuizen. "Using C++ template metaprograms," C++ Report Vol. 7 No. 4 (May 1995).




