Draft for Information Only

Content

Regular Expression Grouping Constructs
 .NET Capturing Grouping Constructs
  Matched Subexpressions
  Named Matched Subexpressions
  Balancing Group Definitions
 .NET Non-Capturing Grouping Constructs
  Noncapturing Groups
  Group Options
  Zero-Width Positive Lookahead Assertions
  Zero-Width Negative Lookahead Assertions
  Zero-Width Positive Lookbehind Assertions
  Zero-Width Negative Lookbehind Assertions
  Nonbacktracking Subexpressions
 Grouping Constructs and Regular Expression Objects
  Grouping Constructs Summary
 Examples
 See also
 Source/Reference

Regular Expression Grouping Constructs

Grouping constructs delineate the subexpressions of a regular expression and capture the substrings of an input string. The grouping constructs can be used to:

  • Match a subexpression that is repeated in the input string.
  • Apply a quantifier to a subexpression that has multiple regular expression language elements.
  • Include a subexpression in the string that is returned by the Regex.Replace and Match.Result methods.
  • Retrieve individual subexpressions from the Match.Groups property and process them separately from the matched text as a whole.

Grouping constructs can be divided into two types:

  • Capturing: matched subexpressions, named matched subexpressions, balancing group definitions.
  • Non-capturing: noncapturing groups, group options, zero-width positive lookahead assertions, zero-width negative lookahead assertions, zero-width positive lookbehind assertions, zero-width negative lookbehind assertions, nonbacktracking subexpressions.

.NET Capturing Grouping Constructs

Matched Subexpressions

The matched subexpression construct, (subexpression), captures a matched subexpression. The parameter subexpression is any valid regular expression pattern. Captures that use parentheses are numbered automatically from left to right based on the order of the opening parentheses in the regular expression, starting from one. The capture that is numbered zero is the text matched by the entire regular expression pattern.

By default, the (subexpression) language element captures the matched subexpression. But if the RegexOptions parameter of a regular expression pattern matching method includes the RegexOptions.ExplicitCapture flag, or if the n option is applied to this subexpression, the matched subexpression is not captured.

The captured group can be accessed in the following ways:

  • By using the backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax \number, where number is the ordinal number of the captured subexpression.
  • By using the named backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax \k<name>, where name is the name of a capturing group, or \k<number>, where number is the ordinal number of a capturing group. A capturing group has a default name that is identical to its ordinal number.
  • By using the $number replacement sequence in a Regex.Replace or Match.Result method call, where number is the ordinal number of the captured subexpression.
  • Programmatically, by using the GroupCollection object returned by the Match.Groups property. The member at position zero in the collection represents the entire regular expression match. Each subsequent member represents a matched subexpression.
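
The access routes listed above can be tried together in a small console program. This is a minimal sketch rather than part of the original page: the module name, the sample sentence, and the doubled-word pattern are all assumptions chosen for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions

Module NumberedGroupSketch
    Sub Main()
        ' (\w+) is capturing group 1; \1 requires the same text to appear again.
        Dim pattern As String = "\b(\w+)\s+\1\b"
        Dim input As String = "This is is a test"

        Dim m As Match = Regex.Match(input, pattern)
        Console.WriteLine(m.Groups(0).Value)   ' is is  (the entire match)
        Console.WriteLine(m.Groups(1).Value)   ' is     (captured group 1)

        ' $1 inserts the text of group 1 into the replacement string.
        Console.WriteLine(Regex.Replace(input, pattern, "$1"))   ' This is a test
    End Sub
End Module
```

The same group is reached three ways here: by \1 inside the pattern, by $1 in the replacement string, and by index through Match.Groups.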

Named Matched Subexpressions

The named matched subexpression construct, (?<name>subexpression) or (?'name'subexpression), captures a matched subexpression and lets you access it by name or by number. The parameter name is a valid group name, and the parameter subexpression is any valid regular expression pattern. The name must not contain any punctuation characters and cannot begin with a number.

If the RegexOptions parameter of a regular expression pattern matching method includes the RegexOptions.ExplicitCapture flag, or if the n option is applied to this subexpression (see Group options later in this topic), the only way to capture a subexpression is to explicitly name capturing groups.

The named captured group can be accessed in the following ways:

  • By using the named backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax \k<name>, where name is the name of the captured subexpression.

  • By using the backreference construct within the regular expression. The matched subexpression is referenced in the same regular expression by using the syntax \number, where number is the ordinal number of the captured subexpression. Named matched subexpressions are numbered consecutively from left to right after matched subexpressions.

  • By using the ${name} replacement sequence in a Regex.Replace or Match.Result method call, where name is the name of the captured subexpression.

  • By using the $number replacement sequence in a Regex.Replace or Match.Result method call, where number is the ordinal number of the captured subexpression.

  • Programmatically, by using the GroupCollection object returned by the Match.Groups property. The member at position zero in the collection represents the entire regular expression match. Each subsequent member represents a matched subexpression. Named captured groups are stored in the collection after numbered captured groups.

  • Programmatically, by providing the subexpression name to the GroupCollection object's indexer (in C#) or to its Item[String] property (in Visual Basic).
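
As an illustration of the list above, the following sketch mixes two named groups with one unnamed group so that the numbering rule (unnamed groups first, then named groups) becomes visible. The module name, the group names year and day, and the sample date string are assumptions made for this example.

```vb
Imports System
Imports System.Text.RegularExpressions

Module NamedGroupSketch
    Sub Main()
        ' Two named groups (year, day) and one unnamed group (the month).
        Dim pattern As String = "(?<year>\d{4})-(\d{2})-(?<day>\d{2})"
        Dim input As String = "2019-07-29"
        Dim m As Match = Regex.Match(input, pattern)

        Console.WriteLine(m.Groups("year").Value)   ' 2019 (access by name)
        Console.WriteLine(m.Groups(1).Value)        ' 07   (the unnamed group is numbered first)
        Console.WriteLine(m.Groups(2).Value)        ' 2019 (first named group)
        Console.WriteLine(m.Groups(3).Value)        ' 29   (second named group)

        ' ${name} and $number can be combined in one replacement string.
        Console.WriteLine(Regex.Replace(input, pattern, "${day}/$1/${year}"))   ' 29/07/2019

        ' \k<name> refers back to a named group inside the pattern itself.
        Console.WriteLine(Regex.IsMatch("abab", "(?<pair>ab)\k<pair>"))   ' True
    End Sub
End Module
```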

Balancing Group Definitions

A balancing group definition deletes the definition of a previously defined group and stores, in the current group, the interval between the previously defined group and the current group. This grouping construct has the format (?<name1-name2>subexpression) or (?'name1-name2'subexpression). The parameter name1 is the current group (optional). The parameter name2 is a previously defined group. The parameter subexpression is any valid regular expression pattern. The balancing group definition deletes the definition of name2 and stores the interval between name2 and name1 in name1. If no name2 group is defined, the match backtracks. Because deleting the last definition of name2 reveals the previous definition of name2, this construct lets you use the stack of captures for group name2 as a counter for keeping track of nested constructs such as parentheses or opening and closing brackets.

The balancing group definition uses name2 as a stack. The beginning character of each nested construct is placed in the group and in its Group.Captures collection. When the closing character is matched, its corresponding opening character is removed from the group, and the Captures collection is decreased by one. After the opening and closing characters of all nested constructs have been matched, name2 is empty.

By substituting the appropriate opening and closing characters of a nested construct, a balancing group pattern can be adapted to handle most nested constructs, such as mathematical expressions or lines of program code that include multiple nested method calls.
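
The stack behaviour described above can be sketched with a check for balanced parentheses. This example is an assumption-laden illustration, not the pattern from the Microsoft documentation: the group name open, the module name, and the test strings are hypothetical, name1 is omitted (it is optional), and the pattern also relies on the conditional construct (?(open)(?!)), which is outside the grouping constructs covered on this page.

```vb
Imports System
Imports System.Text.RegularExpressions

Module BalancingGroupSketch
    Sub Main()
        ' (?<open>\()    pushes one capture onto the "open" stack for each "(".
        ' (?<-open>\))   pops one capture for each ")"; it fails if the stack is empty.
        ' (?(open)(?!))  fails the match if any "(" is still on the stack at the end.
        Dim pattern As String = "^(?:[^()]|(?<open>\()|(?<-open>\)))*(?(open)(?!))$"

        Console.WriteLine(Regex.IsMatch("(a(b)c)", pattern))   ' True
        Console.WriteLine(Regex.IsMatch("(a(b)c", pattern))    ' False: one "(" is never closed
        Console.WriteLine(Regex.IsMatch("a)b(", pattern))      ' False: ")" arrives with nothing to pop
    End Sub
End Module
```

To handle another nested construct, such as square brackets, the three bracket characters in the pattern would be swapped for the new opening and closing characters.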

.NET Non-Capturing Grouping Constructs

Noncapturing Groups

The noncapturing group, (?:subexpression), does not capture the substring that is matched by a subexpression.

The parameter, subexpression, is any valid regular expression pattern. The noncapturing group construct is typically used when a quantifier is applied to a group, but the substrings captured by the group are of no interest.

If a regular expression includes nested grouping constructs, an outer noncapturing group construct does not apply to the inner nested group constructs.
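
A short sketch can make the effect on the GroupCollection concrete. The module name and the sample inputs are assumptions for illustration only.

```vb
Imports System
Imports System.Text.RegularExpressions

Module NoncapturingGroupSketch
    Sub Main()
        ' With a capturing group around "ab": groups 0, 1 and 2.
        Console.WriteLine(Regex.Match("ababc", "(ab)+(c)").Groups.Count)     ' 3

        ' With a noncapturing group around "ab": only groups 0 and 1.
        Console.WriteLine(Regex.Match("ababc", "(?:ab)+(c)").Groups.Count)   ' 2

        ' An outer noncapturing group does not stop an inner group from capturing.
        Dim m As Match = Regex.Match("a1b2", "(?:(\w)\d)+")
        Console.WriteLine(m.Groups.Count)      ' 2
        Console.WriteLine(m.Groups(1).Value)   ' b  (the last capture of the inner group)
    End Sub
End Module
```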

Group Options

The group options construct, (?imnsx-imnsx:subexpression), applies or disables the specified options within a subexpression.

The parameter subexpression is any valid regular expression pattern.

You can specify options that apply to an entire regular expression rather than a subexpression by using a System.Text.RegularExpressions.Regex class constructor or a static method. You can also specify inline options that apply after a specific point in a regular expression by using the (?imnsx-imnsx) language construct.

The group options construct is not a capturing group. That is, although any portion of a string that is captured by subexpression is included in the match, it is not included in a captured group nor used to populate the GroupCollection object.
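
As a rough illustration, the sketch below applies the i (ignore case) option to one part of a pattern only. The module name and the sample strings are assumptions.

```vb
Imports System
Imports System.Text.RegularExpressions

Module GroupOptionsSketch
    Sub Main()
        ' Only "hello" is matched case-insensitively; " World" stays case-sensitive.
        Dim pattern As String = "(?i:hello) World"

        Console.WriteLine(Regex.IsMatch("HELLO World", pattern))   ' True
        Console.WriteLine(Regex.IsMatch("HELLO world", pattern))   ' False

        ' The construct does not capture: only the whole-match group exists.
        Console.WriteLine(Regex.Match("HELLO World", pattern).Groups.Count)   ' 1
    End Sub
End Module
```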

Zero-Width Positive Lookahead Assertions

The construct (?=subexpression) defines a zero-width positive lookahead assertion.

The parameter, subexpression, is any regular expression pattern. For a match to be successful, the input string must match the regular expression pattern in subexpression, although the matched substring is not included in the match result. A zero-width positive lookahead assertion does not backtrack.

Typically, a zero-width positive lookahead assertion is found at the end of a regular expression pattern. It defines a substring that must be found at the end of a string for a match to occur but that should not be included in the match. It is also useful for preventing excessive backtracking. You can use a zero-width positive lookahead assertion to ensure that a particular captured group begins with text that matches a subset of the pattern defined for that captured group. For example, if a capturing group matches consecutive word characters, you can use a zero-width positive lookahead assertion to require that the first character be an alphabetical uppercase character.
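
The following sketch shows the "required but not included" behaviour for a word that must be followed by a period. The module name, the sample sentence, and the pattern are assumptions chosen for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions

Module PositiveLookaheadSketch
    Sub Main()
        ' Match a word only if a period follows it; the period is not consumed.
        Dim m As Match = Regex.Match("The dog ran. The sun is out.", "\b\w+(?=\.)")

        Console.WriteLine(m.Value)    ' ran
        Console.WriteLine(m.Length)   ' 3  (the period is not part of the match)
    End Sub
End Module
```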

Zero-Width Negative Lookahead Assertions

The construct (?!subexpression) defines a zero-width negative lookahead assertion.

The parameter, subexpression, is any regular expression pattern. For the match to be successful, the input string must not match the regular expression pattern in subexpression, although the matched string is not included in the match result.

A zero-width negative lookahead assertion is typically used either at the beginning or at the end of a regular expression. At the beginning of a regular expression, it can define a specific pattern that should not be matched when the beginning of the regular expression defines a similar but more general pattern to be matched. In this case, it is often used to limit backtracking. At the end of a regular expression, it can define a subexpression that cannot occur at the end of a match.
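
The use at the beginning of a pattern can be sketched as follows: \w+ is the general pattern, and (?!un) excludes the more specific case of words that start with "un". The module name and the word list are assumptions.

```vb
Imports System
Imports System.Text.RegularExpressions

Module NegativeLookaheadSketch
    Sub Main()
        ' Match whole words, except those that begin with "un".
        Dim pattern As String = "\b(?!un)\w+\b"

        For Each m As Match In Regex.Matches("unsure sure unity used", pattern)
            Console.WriteLine(m.Value)
        Next
        ' Prints "sure" and then "used".
    End Sub
End Module
```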

Zero-Width Positive Lookbehind Assertions

The construct (?<=subexpression) defines a zero-width positive lookbehind assertion.

The parameter, subexpression, is any regular expression pattern. For a match to be successful, subexpression must occur in the input string to the left of the current position, although subexpression is not included in the match result. A zero-width positive lookbehind assertion does not backtrack.

Zero-width positive lookbehind assertions are typically used at the beginning of regular expressions. The pattern that they define is a precondition for a match, although it is not a part of the match result.
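
A minimal sketch of the precondition idea: digits are matched only when a dollar sign stands immediately before them. The module name and the sample string are assumptions.

```vb
Imports System
Imports System.Text.RegularExpressions

Module PositiveLookbehindSketch
    Sub Main()
        ' The "$" must precede the digits but is not part of the match.
        Dim m As Match = Regex.Match("Cost: $25, quantity 10", "(?<=\$)\d+")

        Console.WriteLine(m.Value)   ' 25
        Console.WriteLine(m.Index)   ' 7  (position of the "2", not of the "$")
    End Sub
End Module
```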

Zero-Width Negative Lookbehind Assertions

The construct (?<!subexpression) defines a zero-width negative lookbehind assertion.

The parameter, subexpression, is any regular expression pattern. For a match to be successful, subexpression must not occur at the input string to the left of the current position. However, any substring that does not match subexpression is not included in the match result.

Zero-width negative lookbehind assertions are typically used at the beginning of regular expressions. The pattern that they define precludes a match in the string that follows. They are also used to limit backtracking when the last character or characters in a captured group must not be one or more of the characters that match that group's regular expression pattern. For example, if a group captures all consecutive word characters, you can use a zero-width negative lookbehind assertion to require that the last character not be an underscore (_).
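
The following sketch shows a negative lookbehind at the beginning of a pattern: a number is matched only when it is not preceded by a dollar sign. The module name and the sample string are assumptions.

```vb
Imports System
Imports System.Text.RegularExpressions

Module NegativeLookbehindSketch
    Sub Main()
        ' \b keeps the match from starting in the middle of "25".
        Dim m As Match = Regex.Match("Cost: $25, quantity 10", "(?<!\$)\b\d+")

        Console.WriteLine(m.Value)   ' 10
    End Sub
End Module
```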

Nonbacktracking Subexpressions

The construct (?>subexpression) represents a nonbacktracking subexpression (also known as an atomic group or a "greedy" subexpression).

The parameter, subexpression, is any regular expression pattern.

Ordinarily, if a regular expression includes an optional or alternative matching pattern and a match does not succeed, the regular expression engine can branch in multiple directions to match an input string with a pattern. If a match is not found when it takes the first branch, the regular expression engine can back up, or backtrack, to the point where it took the first branch and attempt the match using the second branch. This process can continue until all branches have been tried.

The (?>subexpression) language construct disables backtracking. The regular expression engine will match as many characters in the input string as it can. When no further match is possible, it will not backtrack to attempt alternate pattern matches. (That is, the subexpression matches only strings that would be matched by the subexpression alone; it does not attempt to match a string based on the subexpression and any subexpressions that follow it.)

This option is recommended if you know that backtracking will not succeed. Preventing the regular expression engine from performing unnecessary searching improves performance.
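
The difference can be seen by running the same input through a backtracking pattern and its nonbacktracking counterpart. This is a small sketch; the module name and the sample input are assumptions.

```vb
Imports System
Imports System.Text.RegularExpressions

Module NonbacktrackingSketch
    Sub Main()
        ' \d+ first takes "12345", then gives back the "5" so the literal 5 can match.
        Console.WriteLine(Regex.IsMatch("12345", "\d+5"))       ' True

        ' (?>\d+) keeps every digit it took, so the literal 5 can never match.
        Console.WriteLine(Regex.IsMatch("12345", "(?>\d+)5"))   ' False
    End Sub
End Module
```

The second call shows why the construct should only be used when giving characters back could not produce a useful match: here it turns a successful match into a failure.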

Grouping Constructs and Regular Expression Objects

Substrings that are matched by a regular expression capturing group are represented by System.Text.RegularExpressions.Group objects, which can be retrieved from the System.Text.RegularExpressions.GroupCollection object that is returned by the Match.Groups property. The GroupCollection object is populated as follows:

  • The first Group object in the collection (the object at index zero) represents the entire match.

  • The next set of Group objects represent unnamed (numbered) capturing groups. They appear in the order in which they are defined in the regular expression, from left to right. The index values of these groups range from 1 to the number of unnamed capturing groups in the collection. (The index of a particular group is equivalent to its numbered backreference. For more information about backreferences, see Backreference Constructs.)

  • The final set of Group objects represent named capturing groups. They appear in the order in which they are defined in the regular expression, from left to right. The index value of the first named capturing group is one greater than the index of the last unnamed capturing group. If there are no unnamed capturing groups in the regular expression, the index value of the first named capturing group is one.

If you apply a quantifier to a capturing group, the corresponding Group object's Capture.Value, Capture.Index, and Capture.Length properties reflect the last substring that is captured by a capturing group. You can retrieve a complete set of substrings that are captured by groups that have quantifiers from the CaptureCollection object that is returned by the Group.Captures property.
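
A quantified group makes the distinction between the Group object and its Captures collection visible. The sketch below is illustrative only; the module name and the input are assumptions.

```vb
Imports System
Imports System.Text.RegularExpressions

Module QuantifiedGroupSketch
    Sub Main()
        ' Group 1 is matched three times, once per digit.
        Dim m As Match = Regex.Match("123", "(\d)+")
        Dim g As Group = m.Groups(1)

        Console.WriteLine(g.Value)            ' 3  (only the last capture)
        Console.WriteLine(g.Captures.Count)   ' 3

        ' Every capture is still available through Group.Captures.
        For Each c As Capture In g.Captures
            Console.WriteLine("{0} at index {1}", c.Value, c.Index)
        Next
        ' Prints "1 at index 0", "2 at index 1", "3 at index 2".
    End Sub
End Module
```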

Grouping Constructs Summary

Grouping construct                                            Capturing  Backtracking
(subexpression)                                               Yes        Yes
(?<name>subexpression), (?'name'subexpression)                Yes        Yes
(?<name1-name2>subexpression), (?'name1-name2'subexpression)  Yes        Yes
(?:subexpression)                                             No         Yes
(?imnsx-imnsx:subexpression)                                  No         Yes
(?=subexpression)                                             No         No
(?!subexpression)                                             No
(?<=subexpression)                                            No         No
(?<!subexpression)                                            No
(?>subexpression)                                             No         No

Examples

Examples of Grouping Constructs
ASP.NET Code Input:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
    <head>
       <title>Sample Page</title>
       <meta http-equiv="Content-Type" content="text/html;charset=utf-8">
       <script runat="server">
           Sub Page_Load()
               Dim xstring As String = "012324 67 9011112212 Monday February 1, 2010,"
               Dim xmatchstr As String = ""
               xmatchstr = xmatchstr & "Given string: """ & xstring & """<br />"
               xmatchstr = xmatchstr & showresult(xstring,"([1-3])+3\1")
               xmatchstr = xmatchstr & showresult(xstring,"(?<zero>[1-3])+3\1")
               xmatchstr = xmatchstr & showresult(xstring,"(?'zero'[1-3])+3\1")
               xmatchstr = xmatchstr & showresult(xstring,"(?'zero'[1-3])+3\k<zero>")
               xmatchstr = xmatchstr & showresult(xstring,"(?<zero1>[1-3])+.+(?<zero2-zero1>2)")
               xmatchstr = xmatchstr & showresult(xstring,"(?<zero1>[1-3]+).+(?<zero2-zero1>2)")
               xmatchstr = xmatchstr & showresult(xstring,"(?<zero1>[1-4])+(?<zero2-zero1>2)")
               xmatchstr = xmatchstr & showresult(xstring,"(?<zero1>[1-4]+)(?<zero2-zero1>2)")
               xmatchstr = xmatchstr & showresult(xstring,"(?<zero1>[1-4])+.(?<zero2-zero1>2)")
               xmatchstr = xmatchstr & showresult(xstring,"(?<zero1>[1-4]+).(?<zero2-zero1>2)")
               xmatchstr = xmatchstr & showresult(xstring,"(?:\w)*(.+)\1")
               xmatchstr = xmatchstr & showresult(xstring,"(?:(\w))*(.+)\1")
               xmatchstr = xmatchstr & showresult(xstring,"(?:(\w))*(.)+\2")
               xmatchstr = xmatchstr & showresult(xstring,"(.)(?i:f)(\w).+\1")
               xmatchstr = xmatchstr & showresult(xstring,"(2)(?i:[\w ])+(\w)+\1")
               xmatchstr = xmatchstr & showresult(xstring,"(\w).+\1(?=.)")
               xmatchstr = xmatchstr & showresult(xstring,"(\w).+\1(?=1)")
               xmatchstr = xmatchstr & showresult(xstring,"(\w).+\1(?=.)+")
               xmatchstr = xmatchstr & showresult(xstring,"(\w).+\1(?=[\d])+")
               xmatchstr = xmatchstr & showresult(xstring,".+(.)\1(?!1)+")
               xmatchstr = xmatchstr & showresult(xstring,".+(.)\1(?!a)+")
               xmatchstr = xmatchstr & showresult(xstring,"(?<=.)(\w).+\1")
               xmatchstr = xmatchstr & showresult(xstring,"(?<=1)(\w).+\1")
               xmatchstr = xmatchstr & showresult(xstring,"(?<!(Saturday|Sunday) )\b\w+ \d{1,2}, \d{4}\b")
               xmatchstr = xmatchstr & showresult(xstring,"(?<!(.+))6.")
               xmatchstr = xmatchstr & showresult(xstring,"(?<!([01234]))\b.")
               xmatchstr = xmatchstr & showresult(xstring,"[1-4]+(\d)")
               xmatchstr = xmatchstr & showresult(xstring,"(?>[1-4]+)(\d)")
               lbl01.Text = xmatchstr
           End Sub
           ' Runs one pattern against the input and returns an HTML fragment that lists
           ' the match, each group, and every capture recorded for each group.
           Function showresult(xstring As String, xpattern As String) As String
               Dim xmatch As Match
               Dim xcaptures As CaptureCollection
               Dim ycaptures As CaptureCollection
               Dim xgroups As GroupCollection
               Dim xmatchstr As String = ""
               Dim xint As Integer
               Dim yint As Integer
               Dim zint As Integer
               ' Escape "<" in the pattern so named-group syntax is displayed, not parsed as HTML.
               xmatchstr = xmatchstr & "<br />Result of Regex.Match(string,""" & Replace(xpattern,"<","&lt;") & """): """
               xmatch = Regex.Match(xstring,xpattern)
               xmatchstr = xmatchstr & xmatch.Value & """<br />"
               ' Captures of the match itself.
               xcaptures = xmatch.Captures
               xmatchstr = xmatchstr & "->Result of CaptureCollection.Count: """
               xmatchstr = xmatchstr & xcaptures.Count & """<br />"
               For xint = 0 To xcaptures.Count - 1
                   xmatchstr = xmatchstr & "->->Result of CaptureCollection("& xint & ").Value, Index, Length: """
                   xmatchstr = xmatchstr & xcaptures(xint).Value & ", " & xcaptures(xint).Index & ", " & xcaptures(xint).Length & """<br />"
               Next
               ' Every group of the match (index 0 is the entire match), listed once,
               ' each followed by all captures recorded for that group.
               xgroups = xmatch.Groups
               xmatchstr = xmatchstr & "->Result of GroupCollection.Count: """
               xmatchstr = xmatchstr & xgroups.Count & """<br />"
               For yint = 0 To xgroups.Count - 1
                   xmatchstr = xmatchstr & "->->Result of GroupCollection("& yint & ").Value, Index, Length: """
                   xmatchstr = xmatchstr & xgroups(yint).Value & ", " & xgroups(yint).Index & ", " & xgroups(yint).Length & """<br />"
                   ycaptures = xgroups(yint).Captures
                   xmatchstr = xmatchstr & "->->->Result of CaptureCollection.Count: """
                   xmatchstr = xmatchstr & ycaptures.Count & """<br />"
                   For zint = 0 To ycaptures.Count - 1
                       xmatchstr = xmatchstr & "->->->->Result of CaptureCollection("& zint & ").Value, Index, Length: """
                       xmatchstr = xmatchstr & ycaptures(zint).Value & ", " & ycaptures(zint).Index & ", " & ycaptures(zint).Length & """<br />"
                   Next
               Next
               Return xmatchstr
           End Function
       </script>
    </head>
    <body>
       <% Response.Write ("<h1>This is a Sample Page of Grouping Constructs</h1>") %>
       <p>
           <%-- Set on Page_Load --%>
           <asp:Label id="lbl01" runat="server" />
       </p>
    </body>
</html>
HTML Web Page Embedded Output:

See also

  • Regular Expression Language - Quick Reference
  • Backtracking

Source/Reference

  • https://docs.microsoft.com/en-us/dotnet/standard/base-types/regular-expression-language-quick-reference
  • https://docs.microsoft.com/en-us/dotnet/standard/base-types/grouping-constructs-in-regular-expressions

©sideway

ID: 190700029 Last Updated: 7/29/2019 Revision: 0 Ref:

