A Flexible, Compile Time, Configurable XML Parser, Page 2
Implementation
NativeXMLParser takes two template parameters:
template <class MODEL_TYPE, class SPECIAL_CONFIGURATION_TYPE> class XMLParser;
MODEL_TYPE is the model (SAX, DOM, or something else) and the implementation simply takes advantage of the template Strategy pattern to vary the nature of the parser:
void parseXMLEvt(const std::string& s)
{
typedef XMLParser<XMLEvtHandler, SpecialDataHandler> XI;
//use XI
}
void parseXMLDOM(const std::string& s)
{
typedef XMLParser<XMLDOM, SpecialDataHandler> XI;
//use XI
}
struct I
{
void setName(const std::string& name)
{}
void setContent(const std::string& content)
{}
XMLBit* getRoot()
{}
void addNewXMLBit()
{}
void setParent()
{}
void setParentToBit()
{}
};
SPECIAL_CONFIGURATION_TYPE is by far the most interesting thing. As mentioned before, the functionality of this parser can be easily adjusted and SPECIAL_CONFIGURATION_TYPE does the trick.
The idea is simple: Having defined a set of properties to be adjusted at compile time, it is natural to look for a compile-time, type-safe mechanism to download them. Actually, the mechanism has been known for a long time [3] and what has been used in this implementation is a loop traversing the properties and downloading values, accordingly to the property specialization.
This is the set of properties:
namespace XMLPARSER_SPECIAL
{
enum COMMANDS{
CDATA=XML_DICTIONARY::CDATA_START/*'['*/,
COMMENT=XML_DICTIONARY::COMMENT_BIT/* '-'*/, CONTENT
};
};
and an index is used to quickly loop through them:
namespace XMLPARSER_SPECIAL_INDEX
{
enum COMMANDS{
CDATA, COMMENT, CONTENT, LAST_COMMAND
};
};
Let's see how this works:
template < template <int Y> class E, int T >
struct XMLParserHandler
{
template<class V>
static int idx(V& v)
{
typedef E<T> B;
SpecialData sd;
SpecialHandler<B::value>::getName(sd.name);
SpecialHandler<B::value>::getEnd(sd.end);
sd.offset = SpecialHandler<B::value>::getStartOffset();
v[T] = sd;
return XMLParserHandler< E, T+1>::idx(v);
}
};
template < template <int Y> class E>
struct XMLParserHandler<E, XMLPARSER_SPECIAL_INDEX::LAST_COMMAND >
{
template<class V>
static int idx(V& v)
{
return 0;
}
};
XMLParserHandler is a loop in disguise that's executed recursively until the exit condition is met (XMLParserHandler specialization returning zero). Interesting things happen during the looping process:
- T is incremented.
- There is a E<T> that changes at every iteration.
- A SpecialHandler depending on E<T> collects information in a very simple structure of type SpecialData.
- Finally, a container of type V stores a copy of the SpecialData instance.
The last step is to understand who E and SpecialHandler are expected to be:
template < int I>
struct SpecialHandler
{
static void getName(std::string& name)
{ }
static void getEnd(std::string& end)
{ }
static int getStartOffset()
{
return 0;
}
};
template < int Y>
struct Int2Type
{
enum {
value= Y
};
};
template <int Y>
struct EXPER_ALL : public Int2Type<Y>
{};
SpecialHandler is specialized for all COMMANDS (actually XMLPARSER_SPECIAL_INDEX::COMMANDS). EXPER is a simple way to express whether or not a property should be included. For example:
As expected, EXPER_ALL considers all properties (COMMANDS) to be taken into account. But:
template <int Y>
struct EXPER_NO_CDATA: public EXPER_ALL<Y>
{};
template <>
struct EXPER_NO_CDATA<XMLPARSER_SPECIAL_INDEX::CDATA> :
public EXPER_ALL<XMLPARSER_SPECIAL_INDEX::LAST_COMMAND>
{};
EXPER_NO_CDATA, as the name says, doesn't consider CDATA; that's why CDATA index is switched to LAST_COMMAND (a command that's never actually executed). Based on this extensible mechanism, we now can write code like this:
void parseXMLEvt(const std::string& s)
{
typedef XMLParser<XMLEvtHandler, SpecialDataHandler> XI;
XI::parse<XMLParserHandlerInit<EXPER_NO_CDATA> >(s);
}
void parseXMLDOM(const std::string& s)
{
typedef XMLParser<XMLDOM, SpecialDataHandler> XI;
XI::parse<XMLParserHandlerInit<EXPER_NO_CDATA> >(s);
}
EXPER family contains the following variations:
EXPER_ALL EXPER_NO_CDATA EXPER_CONTENT_ONLY EXPER_BASIC (no CDATA, no comments, no content) EXPER_NO_CONTENT EXPER_NO_COMMENT
The code was tested with both VC++ 7.1 and g++ 3.3 compilers.
Download the Code
Download the code that accompanies this article here.
References
[1] Erich Gamma, Richard Helm, Ralph Johnson, and John M. Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley 1994.
[2] Dov Bulka and David Mayhew. Efficient C++: Performance Programming Technique, Addison-Wesley 1999.
[3] T. Veldhuizen. "Using C++ template metaprograms," C++ Report Vol. 7 No. 4 (May 1995).
