Regular Expression
Backreference Constructs
Backreferences provide a convenient way to identify a repeated character or substring within a string. A backreference construct matches a later occurrence of a substring by referring back to the text that a capturing group matched earlier in the pattern.
.NET Backreference Constructs
In general, a backreference construct can be a numbered backreference, a named backreference, or a named numeric backreference, depending on how the capturing group is identified.
.NET defines separate language elements to refer to numbered and named capturing groups. A separate syntax is used to refer to named and numbered capturing groups in replacement strings.
Numbered Backreferences
A numbered backreference has the form \number, where number is the ordinal position of the capturing group in the regular expression. If number is not defined in the regular expression pattern, a parsing error occurs, and the regular expression engine throws an ArgumentException. In addition, if number identifies a capturing group in a particular ordinal position, but that capturing group has been assigned a numeric name different from its ordinal position, the regular expression parser also throws an ArgumentException.
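As a minimal console sketch of the \number form (the module name and the input string are made up for illustration), the pattern below finds doubled word characters and then shows the parsing error raised by a reference to a group that does not exist:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NumberedBackreferenceSketch
    Sub Main()
        ' (\w) captures one word character as group 1;
        ' \1 then requires the same character to appear again.
        Dim pattern As String = "(\w)\1"
        Dim input As String = "trellis llama webbing"

        For Each m As Match In Regex.Matches(input, pattern)
            Console.WriteLine("Found '{0}' at index {1}", m.Value, m.Index)
        Next
        ' Found 'll' at index 3
        ' Found 'll' at index 8
        ' Found 'bb' at index 16

        ' \2 refers to a group that is not defined, so the pattern fails to parse.
        Try
            Dim bad As New Regex("(\w)\2")
        Catch ex As ArgumentException
            Console.WriteLine("Parsing error: {0}", ex.Message)
        End Try
    End Sub
End Module
```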
Note the ambiguity between octal escape codes (such as \16) and \number backreferences that use the same notation. This ambiguity is resolved as follows:
The expressions \1 through \9 are always interpreted as backreferences, and not as octal codes.
If the first digit of a multidigit expression is 8 or 9 (such as \80 or \91), the expression is interpreted as a literal.
Expressions from \10 and greater are considered backreferences if there is a capturing group corresponding to that number; otherwise, they are interpreted as octal codes.
If a regular expression contains a backreference to an undefined group number, a parsing error occurs, and the regular expression engine throws an ArgumentException.
If the ambiguity is a problem, you can use the \k<name> notation, which is unambiguous and cannot be confused with octal character codes. Similarly, hexadecimal codes such as \xdd are unambiguous and cannot be confused with backreferences.
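The sketch below illustrates the octal case and the two unambiguous notations. It assumes default regular expression options (not RegexOptions.ECMAScript), and the module name and test strings are hypothetical:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module OctalAmbiguitySketch
    Sub Main()
        ' Character code 14 decimal is 16 in octal and 0E in hexadecimal.
        Dim ctrl As String = Convert.ToChar(14).ToString()

        ' The pattern defines no capturing group 16, so \16 is read as an octal code.
        Console.WriteLine(Regex.IsMatch(ctrl, "\16"))            ' True

        ' \x0E names the same character and cannot be taken for a backreference.
        Console.WriteLine(Regex.IsMatch(ctrl, "\x0E"))           ' True

        ' \k<1> can only be a backreference, never an octal code.
        Console.WriteLine(Regex.IsMatch("abab", "(ab)\k<1>"))    ' True
    End Sub
End Module
```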
Named Backreferences
A named backreference has the form \k<name> or \k'name', where name is the name of a capturing group defined in the regular expression pattern. If name is not defined in the regular expression pattern, a parsing error occurs, and the regular expression engine throws an ArgumentException.
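A short console sketch of both named forms follows. The group name word, the undefined name other, and the input sentence are chosen only for illustration:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NamedBackreferenceSketch
    Sub Main()
        Dim input As String = "the the quick brown fox fox"

        ' (?<word>\w+) captures a word; \k<word> requires the same word again.
        For Each m As Match In Regex.Matches(input, "\b(?<word>\w+)\s+\k<word>\b")
            Console.WriteLine("Repeated '{0}' at index {1}", m.Groups("word").Value, m.Index)
        Next
        ' Repeated 'the' at index 0
        ' Repeated 'fox' at index 20

        ' The single-quote form is equivalent.
        Console.WriteLine(Regex.IsMatch(input, "\b(?'word'\w+)\s+\k'word'\b"))   ' True

        ' A name that the pattern never defines causes a parsing error.
        Try
            Dim bad As New Regex("(?<word>\w+)\k<other>")
        Catch ex As ArgumentException
            Console.WriteLine("Parsing error: {0}", ex.Message)
        End Try
    End Sub
End Module
```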
Named Numeric Backreferences
In a named backreference with \k, name can also be the string representation of a number.
If name is the string representation of a number, and no capturing group has that name, \k<name> is the same as the backreference \number, where number is the ordinal position of the capture.
However, if name is the string representation of a number and the capturing group in that position has been explicitly assigned a numeric name, the regular expression parser cannot identify the capturing group by its ordinal position. Instead, it throws an ArgumentException.
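The three cases above can be sketched as follows. The module name and inputs are hypothetical; the first pattern is taken from the sample page in the Examples section:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NamedNumericBackreferenceSketch
    Sub Main()
        ' The only group is named "zero" and sits in ordinal position 1.
        ' No group is named "1", so \k<1> resolves to it by position.
        Console.WriteLine(Regex.IsMatch("aa", "(?<zero>\w)\k<1>"))   ' True

        ' The group is explicitly given the numeric name 2; \k<2> finds it.
        Console.WriteLine(Regex.IsMatch("aa", "(?<2>\w)\k<2>"))      ' True

        ' The same group cannot be reached as \k<1>, because the group in
        ' position 1 carries the numeric name 2. Parsing fails.
        Try
            Regex.IsMatch("aa", "(?<2>\w)\k<1>")
        Catch ex As ArgumentException
            Console.WriteLine("Parsing error: {0}", ex.Message)
        End Try
    End Sub
End Module
```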
Backreferences Matching
A backreference refers to the most recent definition of a group (the definition most immediately to the left, when matching left to right). When a group makes multiple captures, a backreference refers to the most recent capture.
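To illustrate the "most recent capture" rule, the sketch below uses a quantified group so that one group makes several captures. The pattern and inputs are made up for this example:

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module MostRecentCaptureSketch
    Sub Main()
        ' (\d)+ captures "1", "2", and "3" in turn; \1 refers only to the last one.
        Dim pattern As String = "(\d)+-\1"

        Dim m As Match = Regex.Match("123-3", pattern)
        Console.WriteLine(m.Success)                          ' True
        Console.WriteLine(m.Groups(1).Captures.Count)         ' 3
        Console.WriteLine(m.Groups(1).Value)                  ' 3

        ' "1" was captured earlier but is not the most recent capture,
        ' so the backreference cannot match it.
        Console.WriteLine(Regex.IsMatch("123-1", pattern))    ' False
    End Sub
End Module
```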
Examples
Examples of Backreference Constructs
ASP.NET Code Input:
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>Sample Page</title>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<script runat="server">
Sub Page_Load()
    ' Sample input that contains several repeated digits.
    Dim xstring As String = "01234 67 90112"
    Dim xmatchstr As String = ""
    xmatchstr = xmatchstr & "Given string: """ & xstring & """<br />"
    ' Numbered backreferences.
    xmatchstr = xmatchstr & showresult(xstring,"(\w).+\1")
    xmatchstr = xmatchstr & showresult(xstring,"(\w)+.+\1")
    ' Named backreferences, in both the \k<name> and \k'name' forms.
    xmatchstr = xmatchstr & showresult(xstring,"(?<zero>\w).+\k<zero>")
    xmatchstr = xmatchstr & showresult(xstring,"(?'zero'\w)+.+\k'zero'")
    ' Groups explicitly named "1", referenced by that numeric name.
    xmatchstr = xmatchstr & showresult(xstring,"(?<1>\w).+\k'1'")
    xmatchstr = xmatchstr & showresult(xstring,"(?'1'\w)+.+\k'1'")
    ' Named groups referenced by ordinal position (named numeric backreferences).
    xmatchstr = xmatchstr & showresult(xstring,"(?<zero>\w).+\k<1>")
    xmatchstr = xmatchstr & showresult(xstring,"(?'zero'\w)+.+\k<1>")
    lbl01.Text = xmatchstr
End Sub
Function showresult(xstring,xpattern)
    ' Builds an HTML report of the match, its groups, and each group's captures.
    Dim xmatch As Match
    Dim xcaptures As CaptureCollection
    Dim ycaptures As CaptureCollection
    Dim xgroups As GroupCollection
    Dim xmatchstr As String = ""
    Dim xint As Integer
    Dim yint As Integer
    Dim zint As Integer
    ' Escape "<" so that named-group syntax is displayed instead of being parsed as HTML.
    xmatchstr = xmatchstr & "<br />Result of Regex.Match(string,""" & Replace(xpattern,"<","&lt;") & """): """
    xmatch = Regex.Match(xstring,xpattern)
    xmatchstr = xmatchstr & xmatch.Value & """<br />"
    ' Captures of the match as a whole.
    xcaptures = xmatch.Captures
    xmatchstr = xmatchstr & "->Result of CaptureCollection.Count: """
    xmatchstr = xmatchstr & xcaptures.Count & """<br />"
    For xint = 0 To xcaptures.Count - 1
        xmatchstr = xmatchstr & "->->Result of CaptureCollection("& xint & ").Value, Index, Length: """
        xmatchstr = xmatchstr & xcaptures(xint).Value & ", " & xcaptures(xint).Index & ", " & xcaptures(xint).Length & """<br />"
    Next
    ' Every group in the match, followed by each capture that group made.
    xgroups = xmatch.Groups
    xmatchstr = xmatchstr & "->Result of GroupCollection.Count: """
    xmatchstr = xmatchstr & xgroups.Count & """<br />"
    For yint = 0 To xgroups.Count - 1
        xmatchstr = xmatchstr & "->->Result of GroupCollection("& yint & ").Value, Index, Length: """
        xmatchstr = xmatchstr & xgroups(yint).Value & ", " & xgroups(yint).Index & ", " & xgroups(yint).Length & """<br />"
        ycaptures = xgroups(yint).Captures
        xmatchstr = xmatchstr & "->->->Result of CaptureCollection.Count: """
        xmatchstr = xmatchstr & ycaptures.Count & """<br />"
        For zint = 0 To ycaptures.Count - 1
            xmatchstr = xmatchstr & "->->->->Result of CaptureCollection("& zint & ").Value, Index, Length: """
            xmatchstr = xmatchstr & ycaptures(zint).Value & ", " & ycaptures(zint).Index & ", " & ycaptures(zint).Length & """<br />"
        Next
    Next
    Return xmatchstr
End Function
</script>
</head>
<body>
<% Response.Write ("<h1>This is a Sample Page of Backreference Constructs</h1>") %>
<p>
<%-- Set on Page_Load --%>
<asp:Label id="lbl01" runat="server" />
</p>
</body>
</html>