
JavaScript Regular Expression with Grouping and Back References

Updated: Jul 10, 2020

In this blog post, let us learn some JavaScript regular expression basics using an example.


Regular Expressions in JavaScript


A Regular Expression, or RegEx, is a pattern used to match character combinations in a string. In JavaScript, regular expressions are also objects.
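As a small illustration of the "also objects" point, the sketch below creates the same pattern with a literal and with the RegExp constructor. The pattern ab+c and the variable names are only placeholders chosen for this example.

```javascript
// A regex literal and the RegExp constructor both produce RegExp objects.
const fromLiteral = /ab+c/;
const fromConstructor = new RegExp("ab+c");

console.log(fromLiteral instanceof RegExp);                 // true
console.log(typeof fromLiteral);                            // object
console.log(fromLiteral.source === fromConstructor.source); // true
```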

Let us start by taking some basic examples, and then explain the syntax needed to construct and understand RegExes in further detail.


Let's create a RegExp object, "re", that matches any string "S" that begins and ends with the same vowel.


String S satisfies the following constraints,

The length of string S is >= 3.

String S consists of lowercase letters only (i.e., [a-z]).


In this example we will be using Node.js, a server-side JavaScript runtime.

Node.js is a cross-platform runtime environment for running JavaScript applications outside the browser. In contrast, AngularJS is a framework for building front-end web applications in JavaScript. Node.js is not a language of its own; it runs JavaScript, typically on the server side.


I will be using the following RegEx.js code hosted in my GitHub repository.

To execute the above Node.js script, we need an input file that is supplied as standard input. The script reads the strings from it and tests each one against the regular expression.
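The RegEx.js script itself is not reproduced in this text. As a rough idea only, the core of such a script might look like the sketch below. The function name checkLines and the sample strings are hypothetical, reading from standard input is left out, and the pattern (examined later in this post) is used without flags to keep each check independent.

```javascript
// Hypothetical stand-in for the core of RegEx.js; not the author's actual script.
function checkLines(input) {
  // A regex without the g flag keeps test() stateless between calls.
  const re = /^([aeiou])\w*\1$/;
  return input
    .split("\n")
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line) => re.test(line));
}

// Hypothetical sample input, one string per line.
const sampleInput = "bcd\nabcda\neve\n";
console.log(checkLines(sampleInput)); // [ false, true, true ]
```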


I will be using the following RegExInput.txt file as the input file which contains the strings that need to be evaluated against the regular expression.

Now, let us run the script against the above input file using the command below,

node RegEx.js < RegExInput.txt

Output will look like below,


Since the first input string doesn't start and end with the same vowel, the result for it is false. The rest of the strings satisfy the condition, so the result for each of them is true.


Now, let us examine the regular expression used and learn some basics of forming such a regular expression pattern.


The regular expression used is, "/^([aeiou])\w*\1$/ig"


Regular expression literal used is, "/^([aeiou])\w*\1$/"

Regular expression literal is a RegEx pattern enclosed within forward slashes.


Regular expression pattern used is, "^([aeiou])\w*\1$".


Let's start by exploring the pattern,


^ => Matches beginning of input. If the multiline flag is set to true, also matches immediately after a line break character.


[aeiou] => Character set [aeiou] will match any one character from the set {a, e, i, o, u}.


Together, the ^ and [aeiou] patterns make sure that our regular expression matches only Strings starting with a vowel.
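A quick check of just this first part of the pattern could look like the sketch below. The variable name and the sample words are made up for illustration.

```javascript
// ^ anchors the match at the start; [aeiou] accepts exactly one vowel there.
const startsWithVowel = /^[aeiou]/;

console.log(startsWithVowel.test("apple"));  // true
console.log(startsWithVowel.test("banana")); // false: the first character is "b"
```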


The important part here is that, if the String starts with a vowel, we need to match it and also remember it. We need to remember the vowel that was matched at the start of the String so that we can check whether the same vowel appears at the end of the String.


To capture the match and remember the captured match we will be using the JavaScript "Capturing Groups".


A Capturing Group is defined by enclosing the part of the pattern to be remembered in round brackets.


([aeiou]) => Round brackets enclosing the square brackets (character set) are the notation to capture and remember the first matched vowel.
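To see what a capturing group remembers, the sketch below uses String.prototype.match with a shortened pattern. The extra \w after the group is added only so that the whole match and the captured group differ; the sample words are arbitrary.

```javascript
// Group 1 captures the leading vowel; the trailing \w consumes one more character.
const firstVowel = /^([aeiou])\w/;
const match = "orange".match(firstVowel);

console.log(match[0]); // or  (the whole match)
console.log(match[1]); // o   (what capture group 1 remembered)
console.log("grape".match(firstVowel)); // null: no vowel at the start
```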


\w* => This part of the regex pattern matches zero or more word characters, that is, alphanumeric characters and the underscore (i.e., [A-Za-z0-9_]).


The String can contain zero or more word characters after the first vowel character.
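The behaviour of \w* on its own can be checked with a sketch like the one below. The anchors ^ and $ are added here so that the whole sample string must consist of word characters; the names and samples are illustrative.

```javascript
// \w* accepts any run of word characters, including an empty run.
const wordChars = /^\w*$/;

console.log(wordChars.test(""));      // true: zero characters is allowed
console.log(wordChars.test("ab_12")); // true
console.log(wordChars.test("ab-12")); // false: "-" is not a word character
```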


Refer to the link for more information on Javascript capturing groups.


To reference the captured match we will be using the JavaScript "Back References".


Now that we have traversed the String and reached its end, we need to match the last character of the String and check the following condition,

  • Whether it matches the first matched vowel.

To reference the first captured match, we will be using a "Back Reference" with the following notation.


\1$ => In this part of the regex pattern, \1 references the character remembered by the first capturing group, and $ matches the end of input.


The last character of the String will be matched only if it is the same as the first captured match.
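Putting the capturing group and the back reference together, the pattern can be tried on a few strings as sketched below. The flags are left out here, and the variable name and sample strings are made up for illustration.

```javascript
// \1 must repeat whatever vowel group 1 captured at the start of the string.
const sameVowelEnds = /^([aeiou])\w*\1$/;

console.log(sameVowelEnds.test("abcda")); // true: starts and ends with "a"
console.log(sameVowelEnds.test("abcde")); // false: starts with "a", ends with "e"
console.log(sameVowelEnds.test("bcdb"));  // false: "b" is not a vowel
```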


Refer to the link for more information on Javascript back references.


Placing the regular expression pattern within the forward slashes will give us the regular expression literal.


Regular expression literal combined with the flags gives us the complete regular expression.


ig => This part of the regular expression signifies the flags.


i: ignore case

g: global match
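One thing worth knowing about the g flag is that it makes the regex object stateful: test( ) resumes from the lastIndex property, so reusing one global regex on different strings can give surprising results. The sketch below shows this with the flags from this post; the variable name and sample strings are illustrative. Since S is lowercase only and the pattern is anchored with ^ and $, the pattern should also work for this problem without either flag.

```javascript
const withFlags = /^([aeiou])\w*\1$/ig;

console.log(withFlags.flags);         // gi
console.log(withFlags.test("AbcdA")); // true: the i flag ignores case
console.log(withFlags.lastIndex);     // 5: the g flag remembers where the match ended
console.log(withFlags.test("eve"));   // false: the search resumes at index 5, past the end of "eve"
console.log(withFlags.lastIndex);     // 0: reset after the failed match
console.log(withFlags.test("eve"));   // true
```

Creating the regex without the g flag, or creating a fresh regex for each string, avoids this.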


Now that we have constructed the entire regular expression (/^([aeiou])\w*\1$/ig) we can use it with a regular expression method.


There are two regular expression methods available as below,

  • test( ): This method executes a search for a match between a regular expression and a specified string. Returns true or false.

  • exec( ): This method executes a search for a match in a specified string. Returns a result array or null.


In our example, we need to check whether the given String satisfies the regular expression, so we have used the test( ) method.
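The difference between the two methods can be seen in a short sketch like the one below, which uses the pattern without flags. The variable names and sample strings are illustrative.

```javascript
const pattern = /^([aeiou])\w*\1$/;

// exec() returns an array: the whole match first, then each captured group.
const result = pattern.exec("abcda");
console.log(result[0]);    // abcda
console.log(result[1]);    // a
console.log(result.index); // 0

// When nothing matches, exec() returns null and test() returns false.
console.log(pattern.exec("abcde")); // null
console.log(pattern.test("abcde")); // false
console.log(pattern.test("abcda")); // true
```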


References:

[2] https://javascript.info/regexp-backreferences


Thank you for reading. Cheers!!!

