July 30, 2014

Named and Non-Capturing Groups in .NET Regular Expressions

  • March 21, 2005
  • By Tom Archer

Non-Capturing Groups

Groups are not always defined in order to create sub-matches. Sometimes a group is created as a side effect of the parentheses used to isolate part of an expression so that modifiers, operators, or quantifiers can act on that part alone. Regardless of your reason for isolating a part of the pattern, once you enclose it in parentheses, the regular expressions parser creates a Group object for it within each Match object's group collection (Groups).

An example might better explain what I mean. Say you have a pattern to search a string for occurrences of the words "on", "an", and "in":

((A|a)|(O|o)|(I|i))n\s

If you tested this pattern with the following function, which simply displays all the groups of each match, you'd find that each match results in five groups (the entire match plus the four parenthesized groups):

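// Namespaces used below: String and Exception, StringBuilder, Regex, MessageBox
using namespace System;
using namespace System::Text;
using namespace System::Text::RegularExpressions;
using namespace System::Windows::Forms;

void DisplayGroups(String* input, String* pattern)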
{
  try
  {
    StringBuilder* results = new StringBuilder();

    Regex* rex = new Regex(pattern);

    // for all the matches
    for (Match* match = rex->Match(input); 
         match->Success; 
         match = match->NextMatch())
    {
      results->AppendFormat(S"Match {0} at {1}\r\n",
                            match->Value,
                            __box(match->Index));

      // for all of THIS match's groups
      GroupCollection* groups = match->Groups;
      for (int i = 0; i < groups->Count; i++)
      {
        results->AppendFormat(S"\tGroup {0} at {1}\r\n",
                              (groups->Item[i]->Value),
                              __box(groups->Item[i]->Index));
 
      }
    }
    MessageBox::Show(results->ToString());
  }
  catch(Exception* pe)
  {
    MessageBox::Show(pe->Message);
  }
}

Figure 1 shows the results of running the following code using the DisplayGroups function:

// Example usage of the DisplayGroups function
DisplayGroups(S"Tommy sat on a chair, in a room", 
              S"((A|a)|(O|o)|(I|i))n\\s");


Figure 1: Using Parentheses in Patterns Always Creates Groups
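
Because Figure 1 is a screenshot, it may help to see the same enumeration in text form. What follows is only a rough sketch, written in Python with its standard re module rather than the .NET Regex class used in this article; the helper name list_groups is hypothetical. Note one difference from .NET: Python reports a group that did not take part in a match as None at position -1, whereas .NET returns a Group object with an empty Value.

```python
import re

def list_groups(text, pattern):
    """Return, for each match, a list of (group_number, value, start) tuples."""
    results = []
    for m in re.finditer(pattern, text):
        # Group 0 is the entire match; groups 1..N are the parenthesized groups.
        row = [(i, m.group(i), m.start(i)) for i in range(m.re.groups + 1)]
        results.append(row)
    return results

capturing = r"((A|a)|(O|o)|(I|i))n\s"
for row in list_groups("Tommy sat on a chair, in a room", capturing):
    print(f"Match {row[0][1]!r} at {row[0][2]}")
    for number, value, start in row:
        print(f"\tGroup {number}: {value!r} at {start}")
```

Each of the two matches ("on " and "in ") reports five entries: group 0 for the whole match, group 1 for the outer parentheses, and groups 2 through 4 for the three inner alternatives, only one of which actually participates in any given match.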

Therefore, for efficiency, especially if you're processing huge amounts of data with your regular expressions, define a group as "non-capturing" whenever you don't need its sub-match. You do so by placing ?: immediately after the group's opening parenthesis, as follows:

(?:(?:A|a)|(?:O|o)|(?:I|i))n\s

If you run this pattern through the DisplayGroups function, you'll see that the only group created for each match is the one that represents the entire match (see Figure 2). (There is no way to eliminate that group.)


Figure 2: The Only Groups Created Represent the Entire Match
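
To compare the two patterns side by side without the screenshots, here is a small sketch, again in Python's re module as a stand-in for .NET (an assumption for illustration; the helper names groups_per_match and matched_text are hypothetical). It counts the groups each pattern defines and checks that both patterns still match the same text.

```python
import re

def groups_per_match(pattern):
    """Number of groups each match reports, counting group 0 (the entire match)."""
    return re.compile(pattern).groups + 1

def matched_text(text, pattern):
    """The text of every match, ignoring any sub-groups."""
    return [m.group(0) for m in re.finditer(pattern, text)]

capturing = r"((A|a)|(O|o)|(I|i))n\s"
non_capturing = r"(?:(?:A|a)|(?:O|o)|(?:I|i))n\s"
text = "Tommy sat on a chair, in a room"

# Five groups per match versus one.
print(groups_per_match(capturing), groups_per_match(non_capturing))  # prints: 5 1

# Both patterns find exactly the same matches.
print(matched_text(text, capturing) == matched_text(text, non_capturing))  # prints: True
```

The ?: prefix changes only what is recorded for each match, not what the pattern matches.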

Looking Ahead

The previous two articles covered groups and group collections in regular expressions. However, one area closely tied to groups that I haven't touched on yet is captures. Therefore, the upcoming articles will explain what captures are and how they relate to matches and groups.

About the Author

Tom Archer owns his own training company, Archer Consulting Group, which specializes in educating and mentoring .NET programmers and providing project management consulting. If you would like to find out how the Archer Consulting Group can help you reduce development costs, get your software to market faster, and increase product revenue, contact Tom through his Web site.




