
Text Transformation with MGrammar and the Oslo SDK

  • By Jeffrey Juday

My job involves moving data of different shapes and sizes, linking systems and business processes together. Normally, I use Integration or Data Warehousing tools. Until I started using the "M" language in the Oslo SDK CTP, I had never considered building my own Domain Specific Language (DSL) as a tool in my repertoire.

If you've been following my Oslo SDK articles (http://www.codeguru.com/columns/experts/article.php/c15779/), you've been introduced to MSchema, MGraph, and the Repository, all components of the Oslo SDK modeling backbone. MGrammar is the third component of the "M" language. However, instead of defining data structure as MSchema does, MGrammar defines data transformation, in particular the transformation of human-readable text. Continuing with the sample model I developed in the other articles, I'm going to show you how MGrammar can be employed to populate Repository data.

Oslo Overview

Oslo is composed of the components shown in Figure 1 and listed below it.

Figure 1: Oslo Architecture

Source: "Microsoft PDC 2008: A Lap Around Oslo"

  • "M", a language for composing models
  • The Repository, a SQL Server database designed for storing models
  • Quadrant, a tool for editing and viewing model data

Currently, Quadrant is available only to PDC attendees. M and the Repository come with the Oslo SDK, which is available on the Oslo Developer Center site (http://msdn.microsoft.com/en-us/oslo/default.aspx).

Oslo's goal is to deliver a foundation for building and storing models of all types. Models are application metadata formatted for runtime consumption. Separate Microsoft initiatives aim to build Oslo model-aware runtimes and tooling into applications such as Visual Studio.

As I mentioned earlier, the M language is composed of MSchema, MGraph, and MGrammar. A complete introduction to M is beyond the scope of this article. MSchema and MGraph were covered in my prior articles; this article will acquaint you with MGrammar.

MGrammar Overview

Unlike XML, text is a natural, human-consumable data medium. Although text can be semi-structured like XML, it is not standardized the way XML is. Parsing text to store it in, for example, a relational database is easy to attempt with traditional development tools but hard to do right. MGrammar bridges the gap between plain, human-readable, composable text and structured formats such as XML, making semi-structured text parsing more approachable.

In a typical MGrammar program, a developer defines the patterns to search for in the text and how each pattern is translated into MGraph. MGraph looks a lot like inline C# collections. Oslo utilizes MGraph to populate MSchema models in the Repository.

In MGrammar, a developer defines a set of Rules for transforming text into MGraph. MGrammar has three types of Rules:

  • Token rules work and look a lot like regular expressions.
  • Syntax rules can be composed of Tokens and define the MGraph produced from text input.
  • Interleave rules define text to be ignored.
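
The division of labor among the three rule kinds can be approximated outside MGrammar. The following Python sketch is an analogy only, not MGrammar syntax: regular expressions stand in for Token rules, a composed pattern plus a function that builds a nested structure stands in for a Syntax rule, and a whitespace pattern stands in for an Interleave rule. The input format ("Requirement: <text>") and every name in the sketch are assumptions made for illustration.

```python
import re

# Token-like patterns: regular expressions naming the pieces of text we care about.
KEYWORD = r"Requirement"
DESCRIPTION = r"[^\r\n]+"

# Interleave-like pattern: text that is recognized but ignored (spaces and tabs).
IGNORED = r"[ \t]*"

# Syntax-like rule: composes the tokens into a whole-line pattern.
LINE = re.compile(
    rf"^{IGNORED}{KEYWORD}{IGNORED}:{IGNORED}(?P<description>{DESCRIPTION}?){IGNORED}$"
)


def parse_requirements(text):
    """Return one {"Requirement": {...}} node per matching line of text."""
    nodes = []
    for line in text.splitlines():
        match = LINE.match(line)
        if match:
            # The "projection": decide what structure the matched text produces.
            nodes.append({"Requirement": {"Description": match.group("description")}})
    return nodes


sample = "Requirement:  Must run on Windows Server  \n\n  Requirement : Needs SQL Server"
print(parse_requirements(sample))
```

The nested dictionaries play the role that MGraph plays in Oslo: a labeled, structured value produced from loosely structured text. Lines that do not match the composed pattern are simply skipped here; a real grammar would report them as errors.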

There are other features of MGrammar. However, a complete survey of the language is beyond the scope of this article and Rules are really the core of the language. So, I'm going to focus on Rules and, in particular, Token and Syntax rules. Using a sample, I'll illustrate how some basic Token and Syntax capabilities are employed to parse text.

Sample Overview

The sample leverages the models I built in my prior article http://www.codeguru.com/columns/experts/article.php/c15779/. Model code snippets appear below.

type Requirement : Item
{
   Description : Text?;
   ApplicationId : Integer64;
}

Requirements : Requirement* where
   item.ApplicationId in MyApplications.Id;

type ServerConfiguration : Item
{
   Server : Text;
   ApplicationId : Integer64;
}

ServerConfigInfo : ServerConfiguration* where
   item.ApplicationId in MyApplications.Id;
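
Each extent in the snippets is a collection of one type plus a where constraint that ties ApplicationId to an existing entry in MyApplications. As a rough analogy in Python (not M, and not a description of how the Repository enforces the constraint), the same shape and check might look like the sketch below. The Item base type is omitted, MyApplications is not shown in this excerpt, and the ids and sample values are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Requirement:
    application_id: int                  # ApplicationId : Integer64
    description: Optional[str] = None    # Description : Text? (optional)


@dataclass
class ServerConfiguration:
    server: str                          # Server : Text
    application_id: int                  # ApplicationId : Integer64


def violations(items, application_ids):
    """Return the items whose application_id is not a known application id."""
    known = set(application_ids)
    return [item for item in items if item.application_id not in known]


# Hypothetical ids standing in for MyApplications.Id.
my_application_ids = [1, 2]

requirements = [Requirement(1, "Must run on Windows Server"), Requirement(7)]
servers = [ServerConfiguration("SRV01", 2)]

print(violations(requirements, my_application_ids))  # the Requirement with id 7
print(violations(servers, my_application_ids))       # empty list
```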


This article was originally published on February 17, 2009
