
The Story of a WML Generator

This article presents a flexible, compile-time-safe way of generating code in WML and WML-like dialects. Two contrasting solutions are discussed: one relies on C# without generics and the other on C++. A rarely used C# idiom is presented that increases the reusability of the code, but C++ templates and related techniques confer the ultimate level of flexibility.

Throughout the article, WML is used as an example of an XML dialect that imposes strict rules on the nesting of its elements, but any hierarchical language fits the bill. That’s why only a subset of WML is presented and the emphasis is on the flexible design of the element hierarchy and on the validation of the rules governing element composition.

A Short History

Although WML (Wireless Markup Language [1]), an XML-based content language developed in 1999 for WAP, will probably be phased out in the future, it remains a good standardized example of how XML can be used to create a minimalist language that is easy to generate and manipulate.

Generating WML code might look like an easy task, but generating valid WML code is something different. And when additional flexibility is required—to quickly move from WML to a related XML dialect governed by slightly different rules, for example—the task becomes even more daunting.

WML was designed for low-bandwidth, small-display devices and, as such, it concentrates on wireless transaction efficiency: an application (a deck) has screens (cards) that can be downloaded in bulk to the device and processed as needed. A number of predefined elements can be embedded in cards, following a set of rules that allow specific combinations only. For example, from the WML 1.2 DTD [2]:

<!ELEMENT wml ( head?, template?, card+ )>
<!ELEMENT card (onevent*, timer?, (do | p | pre)*)>
<!ELEMENT go (postfield | setvar)*>

Taking care of all these rules and generating valid WML code is the first goal of your endeavor.

The First Step: C#

You can start with a first implementation attempt, written in C# [3]. Only a subset of WML is used, and the compositional rules are for demonstration purposes only [4]:

namespace mlgen
{
   class GenericElement
   {
      public
      GenericElement(string element, string init)
      {
         root_ = init;
         element_ = element;
         inserted_ = false;
         used_ = false;
         openElement();
      }
      public GenericElement(string element) : this(element,
                                                   string.Empty)
      {}
      public override string ToString()
      {
         closeElement();
         return root_;
      }
      protected void genericInsert(GenericElement e)
      {
         whenInserted();
         root_ += e;
      }
      protected void genericAddProperty(string key, string value)
      {
         if (!inserted_)
         {
            root_ += " ";
            root_ += key;
            root_ += "=\"";
            root_ += value;
            root_ += "\"";
         }
      }
      private void openElement()
      {
         root_ += "<";
         root_ += element_;
      }
      private void whenInserted()
      {
         if (!inserted_)
         {
            root_ += ">";
            inserted_ = true;
         }
         root_ += '\n';
      }
      private void closeElement()
      {
         if (used_)
            return;
         used_ = !used_;
         if (inserted_)
         {
            root_ += '\n';
            root_ += "</";
            root_ += element_;
            root_ += ">";
         }
         else
         {
            root_ += "/>";
         }
      }
      private string root_;
      private readonly string element_;
      bool inserted_;
      bool used_;
   };
   class BreakImpl : GenericElement
   {
      public
         BreakImpl() : base("br")
      { }
   };
   class HeadImpl : GenericElement
   {
      public
         HeadImpl() : base("head")
      { }
      public void insert(BreakImpl e)
      {
         genericInsert(e);
      }
   };
   class ParagraphImpl : GenericElement
   {
      public
         ParagraphImpl() : base("p")
      { }
   };
   class CardImpl : GenericElement
   {
      public
         CardImpl() : base("card")
      { }
      // Public so that callers can set the card identifier.
      public void setID(string id)
      {
         genericAddProperty("id", id);
      }
      public void insert(ParagraphImpl e)
      {
         genericInsert(e);
      }
   };
   class DeckImpl : GenericElement
   {
      public
         DeckImpl() : base("wml",
            "<?xml version=\"1.0\"?>\n<!DOCTYPE wml PUBLIC " +
            "\"-//PHONE.COM//DTD WML 1.1//EN\"\n" +
            "\"http://www.phone.com/dtd/wml11.dtd\" >\n")
      { }
      public void insert(CardImpl e)
      {
         genericInsert(e);
      }
      public void insert(HeadImpl e)
      {
         genericInsert(e);
      }
   };
}

The design consists of a GenericElement class and a set of element classes that share its behavior. GenericElement is responsible for generating correct XML, while the elements implement the compositional rules through static polymorphism: “insert” is overloaded for each type of element that may be inserted into the hosting element (for example, a Deck can aggregate only Cards and Heads). The final WML representation is built by overlaying the elements that compose the deck. Note that GenericElement doesn’t implement an interface, because imposing a generic interface that fits more than WML would be restrictive. If the intent is to support only one family of dialects, however, an IGenericElement interface is perfectly plausible.

Of course, this is a simple and straightforward approach—no surprise or mystery. But, what happens if someone changes the rules of the game? A WML dialect where a Deck might take Anchors too, for example?

There is no simple answer, mainly because the filtering aspect of the elements (“insert”) cannot be properly exposed in a factory: the number of insert methods varies from element to element. The same is true if you refactor the insert methods out into Filter classes (a technique you’ll use later in the C++ implementation).

One interesting overload of the “using” keyword is that it can be used like a (very restricted) version of typedef—the so-called “using alias” [5]. This means that the implementation to be used can be selected from one place, at compile time:

using System;   // for Console

namespace mlgen
{
   using Card = mlgen.CardImpl;
   using Head = mlgen.HeadImpl;
   using Break = mlgen.BreakImpl;
   using Deck = mlgen.DeckImpl;
   using Paragraph = mlgen.ParagraphImpl;
   public class GenExec
   {
      public static void doReadJads()
      {
         Card c = new Card();
         Paragraph p = new Paragraph();
         Break br = new Break();
         Head hd = new Head();
         hd.insert(br);
         Deck dk = new Deck();
         c.insert(p);
         dk.insert(c);
         dk.insert(hd);
         //p.insert(br);   // would not compile: ParagraphImpl has no insert(BreakImpl)
         Console.WriteLine (dk);
      }
   }
}

The “using” section selects a specific implementation (in the mlgen namespace) and every element can be replaced easily if needed by a different implementation, respecting the Open-Closed principle [6]. Please note that the same idiom can be used for GenericElement—namely defining GenericElement as an alias. This allows different implementations of GenericElement without requiring the use of an interface.

But “using alias” has restrictions: the alias is visible only in the source file (compilation unit) in which it is declared, so the alias section has to be duplicated in certain cases, which can make the idiom impractical.

The C++ Dose

Moving the C# code to C++ is rewarding, but it increases complexity [7] (wml1.h):

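#include <cassert>
#include <string>

struct LateInsertCheck
{
   static void check(bool isClosed)
   {
      // The string literal is always true; it only labels the failure message.
      assert(!isClosed && "inserted after closed");
   }
};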
template <class F, class P = LateInsertCheck>
class GenericElement
{
public:
   operator const std::string& () const
   {
      closeElement();
      return root_;
   }
   template <class T>
   void insert(const T& e)
   {
      F::filter(e);
      P::check(used_);
      genericInsert(e);
   }
protected:
   GenericElement(const std::string& element,
                  const std::string& init):
   element_(element), inserted_(false), used_(false)
   {
      root_ += init;
      openElement();
   }
   GenericElement(const std::string& element):
   element_(element), inserted_(false), used_(false)
   {
      openElement();
   }
   ~GenericElement()
   { }
   void genericAddProperty(const std::string& key,
                           const std::string& value)
   {
      P::check(used_);
      P::check(inserted_);
      if (!inserted_)
      {
         root_ += " ";
         root_ += key;
         root_ += "=\"";
         root_ += value;
         root_ += "\"";
      }
   }
   template <class T>
   void genericInsert(const T& e)
   {
      whenInserted();
      root_ += e;
   }
private:
   void openElement()
   {
      root_ += "<";
      root_ += element_;
   }
   void whenInserted()
   {
      if (!inserted_)
      {
         root_ += ">";
         inserted_ = true;
      }
      root_ += '\n';
   }
   // const because it is called from the conversion operator;
   // root_ and used_ are mutable for that reason.
   void closeElement() const
   {
      if (used_)
         return;
      used_ = !used_;
      if (inserted_)
      {
         root_ += '\n';
         root_ += "</";
         root_ += element_;
         root_ += ">";
      }
      else
      {
         root_ += "/>";
      }
   }
private:
   mutable std::string root_;
   const std::string element_;
   bool inserted_;
   mutable bool used_;
};
template <class FG, class T>
class BreakImpl : public FG, public T
{
public:
   BreakImpl() : FG("br")
   { }
};
template <class FG, class T>
class HeadImpl : public FG, public T
{
public:
   HeadImpl() : FG("head")
   { }
};
template <class FG, class T>
class AnchorImpl : public FG, public T
{
public:
   AnchorImpl(const std::string& href, const std::string& body) : FG("a")
   {
      this->genericAddProperty("href", href);
      this->genericInsert(body);
   }
};
template <class FG, class T>
class ParagraphImpl : public FG, public T
{
public:
   ParagraphImpl() : FG("p")
   { }
};
template <class FG, class T>
class CardImpl : public FG, public T
{
public:
   CardImpl() : FG("card")
   { }
   void setID(const std::string& id)
   {
      //required in strict mode - to solve ambiguity
      this->genericAddProperty("id", id);
   }
};
template <class FG, class T>
class DeckImpl : public FG, public T
{
public:
   DeckImpl() : FG("wml",
      "<?xml version=\"1.0\"?>\n<!DOCTYPE wml PUBLIC "
      "\"-//PHONE.COM//DTD WML 1.1//EN\"\n"
      "\"http://www.phone.com/dtd/wml11.dtd\" >\n")
   { }
};

Filtering was moved outside elements and is attached as a policy to GenericElement. Additional run-time safety was added via LateInsertCheck—more details when discussing the run-time features. Elements are tag typed to avoid circular dependencies and extra complexity (TagTypes.h):

struct IBreakImpl{};
struct IHeadImpl{};
struct IAnchorImpl{};
struct IParagraphImpl{};
struct ICardImpl{};
struct IDeckImpl{};   // needed by the Deck typedef in wmlExecution1.h

The IxxxImpl structures are not intended to be manipulated polymorphically and one might take the same precautions as with GenericElement, which has a non-virtual protected destructor.

Please note in the code above constructs of type:

this->method();

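The qualification is required by C++ name lookup: members of a base class that depends on a template parameter are not found by unqualified lookup inside the derived template. The rule is described in detail in [8].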
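The lookup rule can be seen in isolation in the following minimal sketch. The names Base, Element, NoFilter, and demoDependentLookup are hypothetical stand-ins invented for this illustration; they are not part of wml1.h.

```cpp
#include <cassert>
#include <string>

// Hypothetical base template, standing in for GenericElement.
template <class F>
class Base
{
protected:
   void addProperty(const std::string& key, const std::string& value)
   {
      text_ += key + "=\"" + value + "\" ";
   }
   std::string text_;
};

// The base class FG depends on a template parameter, so its members are
// not found by unqualified lookup inside the derived template.
template <class FG>
class Element : public FG
{
public:
   void setID(const std::string& id)
   {
      // addProperty("id", id);     // rejected by conforming compilers
      this->addProperty("id", id);  // OK: lookup is deferred to instantiation
   }
   const std::string& text() const { return this->text_; }
};

struct NoFilter {};   // placeholder policy, only here to instantiate Base

std::string demoDependentLookup()
{
   Element< Base<NoFilter> > e;
   e.setID("main");
   return e.text();
}
```

Writing FG::addProperty(...) instead of this->addProperty(...) would also make the name dependent; the article's code uses the this-> form.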

Elements take as a template parameter the class they inherit from, thus allowing for different filters. Filters are defined separately (filters.h) and of course different implementations can coexist, the one to be used being selected at compile time. A filter acts mostly like an enabler—allowing or disallowing the compilation based on the filter parameter type.

#include <cassert>
#include "TagTypes.h"

struct BaseFilter
{};
struct ParagraphFilter
{
   static void filter(const IAnchorImpl&){}
};
struct HeadFilter
{
   static void filter(const IBreakImpl&){}
};
struct CardFilter
{
   static void filter(const IParagraphImpl&){}
};
template < class P>
struct DeckFilter
{
   static void filter(const IHeadImpl& e)
   {
      P::check(e);
   }
   static void filter(const ICardImpl& e)
   {
      P::check(e);
   }
};
struct HeadBeforeBodyCheck
{
   static bool headIsFirstInADeck ;
   static void check(const IHeadImpl& e)
   {
      assert (!headIsFirstInADeck && "head was used already");
      headIsFirstInADeck = true;
   }
   static void check(const ICardImpl& e)
   {
      assert (headIsFirstInADeck && "attempt to insert before head" );
   }
};
bool HeadBeforeBodyCheck::headIsFirstInADeck = false;
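The enabler role of a filter can be checked directly at compile time. The following sketch assumes a C++11 or later compiler, which the article's code does not require. The accepts trait, the make_void helper, and the tag and filter names are hypothetical and were invented for this illustration.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical tag types and filters, modeled on TagTypes.h and filters.h.
struct AnchorTag {};
struct ParagraphTag {};

struct EmptyFilter {};                 // no filter() overload: accepts nothing
struct ParagraphOnlyFilter
{
   static void filter(const ParagraphTag&) {}
};

// Helper that maps any list of types to void (C++11-compatible void_t).
template <class...> struct make_void { typedef void type; };

// accepts<F, T>::value is true when the call F::filter(const T&) compiles.
template <class F, class T, class = void>
struct accepts : std::false_type {};

template <class F, class T>
struct accepts<F, T,
   typename make_void<decltype(F::filter(std::declval<const T&>()))>::type>
   : std::true_type {};

// Element types derive from their tag, as the article's elements do.
struct Paragraph : ParagraphTag {};
struct Anchor : AnchorTag {};

static_assert(accepts<ParagraphOnlyFilter, Paragraph>::value,
              "paragraphs are allowed");
static_assert(!accepts<ParagraphOnlyFilter, Anchor>::value,
              "anchors are rejected");
static_assert(!accepts<EmptyFilter, Paragraph>::value,
              "an empty filter rejects everything");
```

The trait only reports what the compiler would decide anyway when insert() calls F::filter(e); it can be handy for documenting the composition rules of a dialect in one place.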

DeckFilter offers a naïve example of a composite filter—checking at run-time whether a Header was inserted before the Cards in a Deck.
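One reason the example is naïve is that the static flag is shared by every Deck in the program. The sketch below shows a per-instance variant of the same rule. It is an assumption-laden alternative, not the article's implementation: the check is an object rather than a static policy, exceptions replace assert so the rule also holds in builds that define NDEBUG, and the names HeadFirstState and accepted are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

struct HeadTag {};
struct CardTag {};

// Per-instance version of the head-before-cards rule (hypothetical class).
class HeadFirstState
{
public:
   HeadFirstState() : headSeen_(false) {}
   void check(const HeadTag&)
   {
      if (headSeen_) throw std::logic_error("head was used already");
      headSeen_ = true;
   }
   void check(const CardTag&) const
   {
      if (!headSeen_) throw std::logic_error("attempt to insert before head");
   }
private:
   bool headSeen_;
};

// Runs a sequence such as "HC" (head, then card) against a fresh state
// and reports whether every step was accepted.
bool accepted(const std::string& sequence)
{
   HeadFirstState deck;
   try
   {
      for (std::string::size_type i = 0; i < sequence.size(); ++i)
      {
         if (sequence[i] == 'H') deck.check(HeadTag());
         else                    deck.check(CardTag());
      }
   }
   catch (const std::logic_error&)
   {
      return false;
   }
   return true;
}
```

With this variant, accepted("HC") holds, while accepted("C") and accepted("HCH") do not, and two decks built in the same program no longer interfere with each other.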

Here is a usage example (wmlExecution1.h):

#include <iostream>
#include "TagTypes.h"
#include "filters.h"
#include "wml1.h"

namespace WML_TYPEDEF_HELPER
{
   typedef CardImpl< GenericElement<CardFilter>, ICardImpl> Card;
   typedef HeadImpl<GenericElement<HeadFilter>, IHeadImpl> Head;
   typedef BreakImpl< GenericElement<BaseFilter>, IBreakImpl> Break;
   typedef DeckImpl< GenericElement<DeckFilter<HeadBeforeBodyCheck> >,
      IDeckImpl> Deck;
   typedef ParagraphImpl< GenericElement<ParagraphFilter>,
      IParagraphImpl> Paragraph;
   typedef AnchorImpl< GenericElement<BaseFilter>,
      IAnchorImpl> Anchor;
}
void doReadWML()
{
   typedef WML_TYPEDEF_HELPER::Card Card;
   typedef WML_TYPEDEF_HELPER::Head Head;
   typedef WML_TYPEDEF_HELPER::Break Break;
   typedef WML_TYPEDEF_HELPER::Deck Deck;
   typedef WML_TYPEDEF_HELPER::Paragraph Paragraph;
   typedef WML_TYPEDEF_HELPER::Anchor Anchor;
   Paragraph p;
   Break br;
   Head hd;
   hd.insert(br);
   Deck dk;
   dk.insert(hd);
   Anchor a("href", "html");
   //a.insert("html");   // rejected at compile time
   Card c;
   c.setID("ID");
   p.insert(a);
   c.insert(p);
   //p.insert(a);        // asserts at run time: p is already closed
   dk.insert(c);
   //dk.insert(hd);      // asserts at run time: head was used already
   std::cout << static_cast<const std::string&>(dk) << std::endl;
}

WML_TYPEDEF_HELPER isolates the typedef declarations from the rest of the code, as an additional reusability unit.

Attempts like

a.insert("html");

or

dk.insert(p);

are rejected by the compiler.

On the other hand,

p.insert(a);

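raises an assertion violation at run time when it is called after c.insert(p): failed assertion !isClosed && "inserted after closed". This means that the paragraph p has already been closed, because it was converted to its final string form when it was inserted into the card, and it can no longer accept new children.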

Using

dk.insert(hd);

after

dk.insert(c);

has the same consequences as commenting out the first call to

dk.insert(hd);

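namely, the call to dk.insert(c) fails with the assertion headIsFirstInADeck && "attempt to insert before head", because a card reaches the deck before any head. Inserting the head a second time, with the first call still in place, trips the other check instead: !headIsFirstInADeck && "head was used already". Together, the two checks mean that the header can be used only once, before inserting any card.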
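Returning to the question raised earlier, a dialect in which a Deck also accepts Anchors, the following minimal sketch shows how such a rule change could be confined to one filter and one typedef. It is a reduced, self-contained model rather than the article's wml1.h: MiniElement, DeckOf, the tag and filter names, and the STRICT_WML and RELAXED_WML namespaces are all hypothetical.

```cpp
#include <cassert>
#include <string>

// Tag types (hypothetical names, modeled on TagTypes.h).
struct CardTag {};
struct AnchorTag {};

// Reduced stand-in for GenericElement: only the filter policy F is kept.
template <class F>
class MiniElement
{
public:
   explicit MiniElement(const std::string& name) : name_(name) {}

   template <class T>
   void insert(const T& child)
   {
      F::filter(child);        // compile-time gate, as in the article
      body_ += child.str();
   }
   std::string str() const
   {
      return body_.empty() ? "<" + name_ + "/>"
                           : "<" + name_ + ">" + body_ + "</" + name_ + ">";
   }
private:
   std::string name_;
   std::string body_;
};

struct LeafFilter {};          // no filter() overloads: nothing may be inserted
struct StrictDeckFilter
{
   static void filter(const CardTag&) {}
};
struct RelaxedDeckFilter       // hypothetical dialect rule: anchors allowed too
{
   static void filter(const CardTag&) {}
   static void filter(const AnchorTag&) {}
};

struct Card : MiniElement<LeafFilter>, CardTag
{
   Card() : MiniElement<LeafFilter>("card") {}
};
struct Anchor : MiniElement<LeafFilter>, AnchorTag
{
   Anchor() : MiniElement<LeafFilter>("a") {}
};
template <class F>
struct DeckOf : MiniElement<F>
{
   DeckOf() : MiniElement<F>("wml") {}
};

// One typedef per dialect; client code picks a namespace.
namespace STRICT_WML  { typedef DeckOf<StrictDeckFilter>  Deck; }
namespace RELAXED_WML { typedef DeckOf<RelaxedDeckFilter> Deck; }

std::string buildRelaxedDeck()
{
   RELAXED_WML::Deck dk;
   dk.insert(Card());
   dk.insert(Anchor());      // accepted by the relaxed dialect
   return dk.str();
}

std::string buildStrictDeck()
{
   STRICT_WML::Deck dk;
   dk.insert(Card());
   // dk.insert(Anchor());   // would not compile: no filter(const AnchorTag&)
   return dk.str();
}
```

The element classes stay untouched when the rule changes; only the filter and the typedef that selects it differ between the two dialects.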

Epilogue

This article presented two flexible implementations of a compile-time safe generator for WML-like XML subsets, coded in C# and C++. Easy to configure for various validation rules, the generator offers an interesting alternative to similar products based on XSLT workflow chains.

Download the Code

Download the C# source code here and the C++ source code here.

References

[1] http://www.openmobilealliance.org/tech/affiliates/wap/wapindex.html

[2] http://www.wapforum.org/DTD/

[3] mlgen.zip contains the .NET solution

[4] Implementing the entire WML specification is left as an exercise for the reader

[5] http://msdn.microsoft.com/library/default.asp?url=/library/en-us/csref/html/vclrfusingdirective.asp

[6] http://www.objectmentor.com/resources/articles/ocp.pdf

[7] wmlgenerator.zip contains the C++ solution files

[8] http://www.comeaucomputing.com/techtalk/templates/#whythisarrow
