Compiled regular expressions in .NET

Compiled regular expressions (regex) can be a great way to improve the performance of your code, but they come with some dangers. This post describes the use-cases for compiled regex, how to code for them, and how to avoid the pitfalls that compiling regular expressions presents.

Use-cases for compiled regex

Some use-cases for regular expressions are:

  • input validation
  • token replacement
  • mail-merge functionality

The main reasons to compile a regex are:

  • boost performance
  • increase throughput
  • decrease load on the CPU at run-time
  • move the cost to application startup or compile time

How to do it (code example)

Pass RegexOptions.Compiled to the Regex constructor to create a compiled regex object, as follows:

using System.Text.RegularExpressions;

public class CompiledRegexExample
{
  // Static and readonly so the regex is compiled once and re-used on every call.
  private static readonly Regex validEmailRegex =
    new Regex(@"^[a-zA-Z0-9._%-]+@(?:[a-zA-Z0-9-]+\.)+[a-zA-Z]{2,4}$",
                RegexOptions.Compiled | RegexOptions.IgnoreCase);

  public static bool IsValidEmail(string emailAddressToValidate) {
    // Regex.IsMatch throws ArgumentNullException on null input, so treat null as invalid.
    return emailAddressToValidate != null
        && validEmailRegex.IsMatch(emailAddressToValidate);
  }
}
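
The same approach carries over to the token replacement use-case listed earlier. The following is a minimal sketch rather than a definitive implementation: the {{Name}} token syntax, the TokenReplacementExample class and the Fill method are assumptions made up for illustration. It keeps one compiled regex in a static field and re-uses it for every template.

```csharp
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TokenReplacementExample
{
  // Hypothetical token syntax for this sketch: {{Name}}.
  // Compiled once and re-used for every call to Fill.
  private static readonly Regex tokenRegex =
    new Regex(@"\{\{(\w+)\}\}", RegexOptions.Compiled);

  public static string Fill(string template, IDictionary<string, string> values)
  {
    // Replace each token with its value; unknown tokens are left untouched.
    return tokenRegex.Replace(template, match =>
    {
      string value;
      return values.TryGetValue(match.Groups[1].Value, out value)
        ? value
        : match.Value;
    });
  }
}
```

Leaving unknown tokens in place is a design choice for this sketch; a mail-merge might prefer to throw or substitute an empty string instead.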

Caveats

Be aware that compiling a regular expression incurs a performance hit when the regex is created, so the benefit only accrues when the regex is re-used. Don't compile regexes that are created often or used infrequently. Where a regex is re-used often, as in the example above, you can obtain the performance benefit by declaring it once as a static member (in a static class, or a Module in Visual Basic). Alternatively, you can pre-compile the regular expressions in a separate project (on the .NET Framework, Regex.CompileToAssembly can produce such an assembly) and add a reference to it. The regexes are then already compiled when your project loads, and you avoid recompiling them on every application startup.
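
If you want to see the creation cost for yourself, a rough measurement is easy to sketch. The RegexConstructionCost class and TimeConstruction method below are hypothetical names for this example, and a single Stopwatch reading is only a rough indication: results vary by runtime, machine and pattern.

```csharp
using System;
using System.Diagnostics;
using System.Text.RegularExpressions;

public static class RegexConstructionCost
{
  // Returns how long it takes to construct a Regex with the given options.
  public static TimeSpan TimeConstruction(string pattern, RegexOptions options)
  {
    Stopwatch stopwatch = Stopwatch.StartNew();
    Regex regex = new Regex(pattern, options);
    stopwatch.Stop();

    // Keep the instance referenced until after timing has stopped.
    GC.KeepAlive(regex);
    return stopwatch.Elapsed;
  }
}
```

Calling TimeConstruction once with RegexOptions.None and once with RegexOptions.Compiled for the same pattern gives a feel for how much extra work the compiled option does up front, and so how many matches it needs to pay for itself.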

The other thing to consider before compiling a regular expression is whether you should be using regular expressions at all. Regular expressions are terse and can easily mask underlying complexity. Be sure to use regular expressions at the correct times and in the right way. Take time to understand the worst-case time required to match your expression before you use it. There are almost always alternative methods to achieve the same result as a regular expression.
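
One way to put a ceiling on the worst case is the Regex constructor overload that takes a match timeout (available from .NET Framework 4.5). The sketch below is illustrative only: the nested-quantifier pattern, the 100 ms limit, and the BoundedRegexExample and IsMatchWithinTimeLimit names are assumptions chosen for the example, and treating a timeout as "no match" is just one possible policy.

```csharp
using System;
using System.Text.RegularExpressions;

public static class BoundedRegexExample
{
  // Nested quantifiers such as (a+)+ can backtrack heavily on input that almost matches.
  // The pattern and the 100 ms limit are illustrative assumptions.
  private static readonly Regex nestedQuantifierRegex =
    new Regex(@"^(a+)+$", RegexOptions.Compiled, TimeSpan.FromMilliseconds(100));

  public static bool IsMatchWithinTimeLimit(string input)
  {
    try
    {
      return nestedQuantifierRegex.IsMatch(input);
    }
    catch (RegexMatchTimeoutException)
    {
      // The match took longer than the limit; treat the input as not matching.
      return false;
    }
  }
}
```

A timeout limits the damage but does not fix the pattern. If an expression times out on realistic input, that is a sign to rewrite it or to use one of the non-regex alternatives.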

You can compile regular expressions in any of the Microsoft .NET languages (F#, Visual Basic or C#), and you can read more about it in the MSDN library here.

 
