Sideway
output.to from Sideway


Built-in Componet


Regular Expression




Draft for Information Only

Content

Regular Expression Behaviours
 Benefits of the NFA Engine
 .NET Framework Engine Capabilities
 Scanning of Valid Email Format
 See also
 Source/Referencee

Regular Expression Behaviours

The .NET Framework regular expression engine is a backtracking regular expression matcher that incorporates a traditional Nondeterministic Finite Automaton (NFA) engine such as that used by Perl, Python, Emacs, and Tcl. This distinguishes it from faster, but more limited, pure regular expression Deterministic Finite Automaton (DFA) engines such as those found in awk, egrep, or lex. This also distinguishes it from standardized, but slower, POSIX NFAs. The following section describes the three types of regular expression engines, and explains why regular expressions in the .NET Framework are implemented by using a traditional NFA engine.

Benefits of the NFA Engine

When DFA engines perform pattern matching, their processing order is driven by the input string. The engine begins at the beginning of the input string and proceeds sequentially to determine whether the next character matches the regular expression pattern. They can guarantee to match the longest string possible. Because they never test the same character twice, DFA engines do not support backtracking. However, because a DFA engine contains only finite state, it cannot match a pattern with backreferences, and because it does not construct an explicit expansion, it cannot capture subexpressions.

Unlike DFA engines, when traditional NFA engines perform pattern matching, their processing order is driven by the regular expression pattern. As it processes a particular language element, the engine uses greedy matching; that is, it matches as much of the input string as it possibly can. But it also saves its state after successfully matching a subexpression. If a match eventually fails, the engine can return to a saved state so it can try additional matches. This process of abandoning a successful subexpression match so that later language elements in the regular expression can also match is known as backtracking. NFA engines use backtracking to test all possible expansions of a regular expression in a specific order and accept the first match. Because a traditional NFA engine constructs a specific expansion of the regular expression for a successful match, it can capture subexpression matches and matching backreferences. However, because a traditional NFA backtracks, it can visit the same state multiple times if it arrives at the state over different paths. As a result, it can run exponentially slowly in the worst case. Because a traditional NFA engine accepts the first match it finds, it can also leave other (possibly longer) matches undiscovered.

POSIX NFA engines are like traditional NFA engines, except that they continue to backtrack until they can guarantee that they have found the longest match possible. As a result, a POSIX NFA engine is slower than a traditional NFA engine, and when you use a POSIX NFA engine, you cannot favor a shorter match over a longer one by changing the order of the backtracking search.

Traditional NFA engines are favored by programmers because they offer greater control over string matching than either DFA or POSIX NFA engines. Although, in the worst case, they can run slowly, you can steer them to find matches in linear or polynomial time by using patterns that reduce ambiguities and limit backtracking. In other words, although NFA engines trade performance for power and flexibility, in most cases they offer good to acceptable performance if a regular expression is well-written and avoids cases in which backtracking degrades performance exponentially.

.NET Framework Engine Capabilities

To take advantage of the benefits of a traditional NFA engine, the .NET Framework regular expression engine includes a complete set of constructs to enable programmers to steer the backtracking engine. These constructs can be used to find matches faster or to favor specific expansions over others.

Other features of the .NET Framework regular expression engine include the following:

  • Lazy quantifiers: ??, *?, +?, {n,m}?. These constructs tell the backtracking engine to search the minimum number of repetitions first. In contrast, ordinary greedy quantifiers try to match the maximum number of repetitions first.

  • Positive lookahead: (?=subexpression). This feature allows the backtracking engine to return to the same spot in the text after matching a subexpression. It is useful for searching throughout the text by verifying multiple patterns that start from the same position. It also allows the engine to verify that a substring exists at the end of the match without including the substring in the matched text.

  • Negative lookahead: (?!subexpression). This feature adds the ability to match an expression only if a subexpression fails to match. This is particularly powerful for pruning a search, because it is often simpler to provide an expression for a case that should be eliminated than an expression for cases that must be included. For example, it is difficult to write an expression for words that do not begin with "non".

  • Conditional evaluation: (?(expression)yes|no) and (?(name)yes|no), where expression is a subexpression to match, name is the name of a capturing group, yes is the string to match if expression is matched or name is a valid, non-empty captured group, and no is the subexpression to match if expression is not matched or name is not a valid, non-empty captured group. This feature allows the engine to search by using more than one alternate pattern, depending on the result of a previous subexpression match or the result of a zero-width assertion. This allows a more powerful form of backreference that permits, for example, matching a subexpression based on whether a previous subexpression was matched.

  • Balancing group definitions: (?<name1-name2> subexpression). This feature allows the regular expression engine to keep track of nested constructs such as parentheses or opening and closing brackets.

  • Nonbacktracking subexpressions (also known as greedy subexpressions): (?>subexpression). This feature allows the backtracking engine to guarantee that a subexpression matches only the first match found for that subexpression, as if the expression were running independent of its containing expression. If you do not use this construct, backtracking searches from the larger expression can change the behavior of a subexpression. For example, the regular expression (a+)\w matches one or more "a" characters, along with a word character that follows the sequence of "a" characters, and assigns the sequence of "a" characters to the first capturing group, However, if the final character of the input string is also an "a", it is matched by the \w language element and is not included in the captured group.

  • Right-to-left matching, which is specified by supplying the RegexOptions.RightToLeft option to a Regex class constructor or static instance matching method. This feature is useful when searching from right to left instead of from left to right, or in cases where it is more efficient to begin a match at the right part of the pattern instead of the left. As the following example illustrates, using right-to-left matching can change the behavior of greedy quantifiers. The example conducts two searches for a sentence that ends in a number. The left-to-right search that uses the greedy quantifier + matches one of the six digits in the sentence, whereas the right-to-left search matches all six digits.

  • Positive and negative lookbehind: (?<=subexpression) for positive lookbehind, and (?<!subexpression) for negative lookbehind. This feature is similar to lookahead, which is discussed earlier in this topic. Because the regular expression engine allows complete right-to-left matching, regular expressions allow unrestricted lookbehinds. Positive and negative lookbehind can also be used to avoid nesting quantifiers when the nested subexpression is a superset of an outer expression. Regular expressions with such nested quantifiers often offer poor performance.

 

Scanning of Valid Email Format

Examples of Scanning of Valid Email Format.

Vb.NET ASPX File:
<%@ Page Language="VB" %>
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
    <head>
       <title>Sample Page</title>
       <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
       <script runat="server">
           Sub Page_Load()
               Dim xstring As String = "dahhhvid.johgnes@prghjjosehware.com"
               Dim xmatchstr As String = ""
               lbl01.Text = xmatchstr
           End Sub
           Function showresult(xstring,xpattern)
               Dim xmatch As Match
               Dim xcaptures As CaptureCollection
               Dim ycaptures As CaptureCollection
               Dim xgroups As GroupCollection
               Dim xmatchstr As String = ""
               Dim xint As Integer
               Dim yint As Integer
               Dim zint As Integer
               xmatchstr = xmatchstr & "<br />Result of Regex.Match(string,""" & Replace(xpattern,"<","&lt;") & """): """
               xmatch = Regex.Match(xstring,xpattern)
               xmatchstr = xmatchstr & xmatch.value & """<br />"
               xcaptures = xmatch.Captures
               xmatchstr = xmatchstr & "->Result of CaptureCollection.Count: """
               xmatchstr = xmatchstr & xcaptures.Count & """<br />"
               For xint = 0 to xcaptures.Count - 1
                   xmatchstr = xmatchstr & "->->Result of CaptureCollection("& xint & ").Value, Index, Length: """
                   xmatchstr = xmatchstr & xcaptures(xint).Value & ", " & xcaptures(xint).Index & ", " & xcaptures(xint).Length & """<br />"
                   xgroups = xmatch.Groups
                   xmatchstr = xmatchstr & "->Result of GroupCollection.Count: """
                   xmatchstr = xmatchstr & xgroups.Count & """<br />"
                   For yint = 0 to xgroups.Count - 1
                       xmatchstr = xmatchstr & "->->Result of GroupCollection("& yint & ").Value, Index, Length: """
                       xmatchstr = xmatchstr & xgroups(yint).Value & ", " & xgroups(yint).Index & ", " & xgroups(yint).Length & """<br />"
                       ycaptures = xgroups(yint).Captures
                       xmatchstr = xmatchstr & "->->->Result of CaptureCollection.Count: """
                       xmatchstr = xmatchstr & ycaptures.Count & """<br />"
                       For zint = 0 to ycaptures.Count - 1
                           xmatchstr = xmatchstr & "->->->->Result of CaptureCollection("& zint & ").Value, Index, Length: """
                           xmatchstr = xmatchstr & ycaptures(zint).Value & ", " & ycaptures(zint).Index & ", " & ycaptures(zint).Length & """<br />"
                       Next
                   Next
               Next
               Return xmatchstr
           End Function
       </script>
    </head>
    <body>
       <% Response.Write ("<h1>This is a Sample Page of Scanning of of Valid Email Format</h1>") %>
       <p>
           <%-- Set on Page_Load --%>
           <asp:Label id="lbl01" runat="server" />
       </p>
    </body>
</html>
HTML Web Page Embedded Output:

See also

Source/Referencee


©sideway
close

References

  1. Active Server Pages, http://msdn.microsoft.com/en-us/library/aa286483.aspx, updated 20Dec2012
  2. ASP Overview, http://msdn.microsoft.com/en-us/library/ms524929%28v=vs.90%29.aspx, updated 21Dec2012
  3. ASP Best Practices, http://technet.microsoft.com/en-us/library/cc939157.aspx, updated 15Apr2013
  4. ASP Built-in Objects, http://msdn.microsoft.com/en-us/library/ie/ms524716(v=vs.90).aspx, updated 20Dec2012
  5. Response Object, http://msdn.microsoft.com/en-us/library/ms525405(v=vs.90).aspx, updated 20Dec2012
  6. Request Object, http://msdn.microsoft.com/en-us/library/ms524948(v=vs.90).aspx, updated 20Dec2012
  7. Server Object (IIS), http://msdn.microsoft.com/en-us/library/ms525541(v=vs.90).aspx, updated 20Dec2012
  8. Application Object (IIS), http://msdn.microsoft.com/en-us/library/ms525360(v=vs.90).aspx, updated 20Dec2012
  9. Session Object (IIS), http://msdn.microsoft.com/en-us/library/ms524319(8v=vs.90).aspx, updated 20Dec2012
  10. ASPError Object, http://msdn.microsoft.com/en-us/library/ms524942(v=vs.90).aspx, updated 20Dec2012
  11. ObjectContext Object (IIS), http://msdn.microsoft.com/en-us/library/ms525667(v=vs.90).aspx, updated 20Dec2012
  12. Debugging Global.asa Files, http://msdn.microsoft.com/en-us/library/aa291249(v=vs.71).aspx, updated 15Apr2013
  13. How to: Debug Global.asa files, http://msdn.microsoft.com/en-us/library/ms241868(v=vs.80).aspx, updated 15Apr2013
  14. Calling COM Components from ASP Pages, http://msdn.microsoft.com/en-us/library/ms524620(v=VS.90).aspx, updated 15Apr2013
  15. IIS ASP Scripting Reference, http://msdn.microsoft.com/en-us/library/ms524664(v=vs.90).aspx, updated 15Apr2013
  16. ASP Keywords, http://msdn.microsoft.com/en-us/library/ms524672(v=vs.90).aspx, updated 15Apr2013
  17. Creating Simple ASP Pages, http://msdn.microsoft.com/en-us/library/ms524741(v=vs.90).aspx, updated 15Apr2013
  18. Including Files in ASP Applications, http://msdn.microsoft.com/en-us/library/ms524876(v=vs.90).aspx, updated 15Apr2013
  19. ASP Overview, http://msdn.microsoft.com/en-us/library/ms524929(v=vs.90).aspx, updated 15Apr2013
  20. FileSystemObject Object, http://msdn.microsoft.com/en-us/library/z9ty6h50(v=vs.84).aspx, updated 14Mayr2013
  21. http://msdn.microsoft.com/en-us/library/windows/desktop/ms675944(v=vs.85).aspx, ADO Object Model, updated 15Apr2013
  22. ADO Fundamentals, http://msdn.microsoft.com/en-us/library/windows/desktop/ms680928(v=vs.85).aspx, updated 14Mayr2013
close

ID: 190800005 Last Updated: 8/5/2019 Revision: Ref:

IMAGE

Home (5)

Business

Management

HBR (3)

Information

Recreation

Hobbies (7)

Culture

Chinese (1097)

English (336)

Reference (66)

Computer

Hardware (149)

Software

Application (187)

Digitization (24)

Numeric (19)

Programming

Web (648)new

CSS (SC)

ASP.NET (SC)

Regular Expression (SC)

HTML

Knowledge Base

Common Color (SC)

Html Entity (Unicode) (SC)

Html 401 Special (SC)

OS (389)

MS Windows

Windows10 (SC)

.NET Framework (SC)

DeskTop (7)

Knowledge

Mathematics

Formulas (8)

Number Theory (206)

Algebra (20)

Trigonometry (18)

Geometry (18)

Calculus (67)

Complex Analysis (21)

Engineering

Tables (8)

Mechanical

Mechanics (1)

Rigid Bodies

Statics (92)

Dynamics (37)

Fluid (5)

Fluid Kinematics (5)

Control

Process Control (1)

Acoustics (19)

FiniteElement (2)

Biology (1)

Geography (1)


Copyright © 2000-2019 Sideway . All rights reserved Disclaimers last modified on 10 Feb 2019