JavaScript Regular Expressions Explained

1. Regular expression creation

JavaScript has two ways to create regular expressions:

  • The first method: write a regular expression literal directly, in the form /regular expression/
  • The second method: create a RegExp object with new RegExp('regular expression')
const re1 = /ABC\-001/;
const re2 = new RegExp('ABC\\-001');
re1; // /ABC\-001/
re2; // /ABC\-001/

Note that with the second form the pattern is an ordinary string and is subject to string escaping, so the two backslashes \\ in the string represent a single \ in the pattern.
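
To see the escaping rule in practice, the sketch below compares the two forms and then builds a pattern at run time. The variable names and the prefix value are illustrative assumptions, not part of the original example.

```javascript
// A literal and a RegExp built from a string describe the same pattern.
const literal = /ABC\-001/;
const fromString = new RegExp('ABC\\-001'); // '\\' in the string is one '\' in the pattern

console.log(literal.source);    // ABC\-001
console.log(fromString.source); // ABC\-001

// The constructor form is handy when part of the pattern is only known at run time.
const prefix = 'ABC'; // hypothetical input
const dynamic = new RegExp('^' + prefix + '-\\d{3}$');
console.log(dynamic.test('ABC-001')); // true
console.log(dynamic.test('ABC-01'));  // false
```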

2. Using Patterns

2.1 Using simple patterns

A simple pattern consists of the characters you want to find directly. For example, the pattern /abc/ matches only when the characters 'abc' appear together and in that order in a string. It matches "Hi, do you know your abc's?" and "The latest airplane designs evolved from slabcraft." In both examples, the substring 'abc' is matched. It does not match the string "Grab crab": that string contains 'ab c', but not the exact substring 'abc'.
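
The three sample strings can be checked directly. This is only a quick sketch; the variable name is arbitrary.

```javascript
const simple = /abc/;

console.log(simple.test("Hi, do you know your abc's?"));                         // true
console.log(simple.test('The latest airplane designs evolved from slabcraft.')); // true
console.log(simple.test('Grab crab'));                                           // false
```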

2.2 Using special characters

When a search needs more than a direct match, the pattern includes special characters. For example, the pattern /ab*c/ matches a single 'a' followed by zero or more 'b's (the * means zero or more occurrences of the preceding item), followed immediately by a 'c'. In the string "cbbabbbbcdebc", this pattern matches the substring "abbbbc".
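
A minimal sketch of the quantifier pattern /ab*c/ applied to a sample string; the variable names are illustrative.

```javascript
const starPattern = /ab*c/;
const found = 'cbbabbbbcdebc'.match(starPattern);

console.log(found[0]);    // abbbbc
console.log(found.index); // 3

// Zero 'b's is also acceptable, so 'ac' matches as well.
console.log(starPattern.test('ac')); // true
```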

character meaning
\ Matching follows these rules:
A backslash before a non-special character indicates that the next character is special and is not to be interpreted literally. For example, a 'd' without a preceding '\' matches a lowercase 'd'. If '\' is added, the sequence '\d' becomes special and matches a digit.
A backslash can also be used to turn the special character that follows it into a literal. For example, the pattern /a*/ matches 0 or more a's. In contrast, the pattern /a\*/ removes the special meaning of '*' and thus matches strings like "a*".
When using new RegExp("pattern"), don't forget to escape \ itself, because \ is also the escape character in string literals.
^ Matches the beginning of the input. For example, /^A/ will not match the 'A' in "an A", but will match the 'A' in "An E".
$ Matches the end of input. For example, /t$/ will not match the 't' in "eater", but will match the 't' in "eat".
* Matches the preceding expression 0 or more times. Equivalent to {0,}. For example, /bo*/ will match 'booooo' in "A ghost boooooed".
+ Matches the preceding expression one or more times. Equivalent to {1,}. For example, /a+/ matches the 'a' in "candy" and all 'a's in "caaaaaaandy".
? Matches the preceding expression 0 or 1 times. Equivalent to {0,1}. For example, /e?le?/ matches 'el' in "angel", 'le' in "angle", and 'l' in "oslo".
If it follows any of the quantifiers *, +, ?, or {}, it will make the quantifier non-greedy (match as few characters as possible), as opposed to the default greedy mode (match as many characters as possible).
For example, applying /\d+/ to "123abc" will return "123", but using /\d+?/ will only match "1".
. Matches any single character except newline. For example, /.n/ will match 'an' and 'on' but not 'nay' in "nay, an apple is on the tree".
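x|y Matches either 'x' or 'y'. For example, /green|red/ matches 'green' in "green apple" and 'red' in "red apple".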
{n} n is a positive integer, matching the previous character exactly n times.
For example, /a{2}/ will not match the 'a' in "candy", but will match all the a's in "caandy" and the first two 'a's in "caaandy".
{n,m} n and m are non-negative integers with n <= m. Matches the preceding item at least n times and at most m times. For example, /a{1,3}/ matches nothing in "cndy", the 'a' in "candy", both a's in "caandy", and the first three a's in "caaaaaaandy". Note that when matching "caaaaaaandy", the match is "aaa", even though the original string contains more a's. Do not put a space after the comma: /a{1, 3}/ is not parsed as a quantifier.
[xyz] A character set. Matches any one of the characters within the square brackets, including escape sequences. You can use a hyphen (-) to specify a range of characters. Special characters such as dot (.) and asterisk (*) have no special meaning inside a character set. They do not need to be escaped, but escaping them also works.
For example, [abcd] and [a-d] are the same. They both match the 'b' in "brisket" and the 'c' in "city". /[a-z.]+/ and /[\w.]+/ both match the whole string "test.i.ng".
[^xyz] A negated character set. That is, it matches any character not listed in the square brackets. You can use a hyphen (-) to specify a range of characters. Any character that works in a normal character set also works here. For example, [^abc] and [^a-c] are the same; they match 'r' in "brisket" and 'h' in "chop".
\b Matches a word boundary. A word boundary is a position where a word character is not followed by, or not preceded by, another word character. Note that a matched word boundary is not included in the matched content. In other words, the length of a matched word boundary is 0. For example:
/\bm/ matches the 'm' in "moon"; /oo\b/ does not match the 'oo' in "moon" because 'oo' is followed by the word character 'n'. /oon\b/ matches 'oon' in "moon" because 'oon' is at the end of the string, so it is not followed by a word character.
\d Matches a digit. Equivalent to [0-9]. For example, /\d/ or /[0-9]/ matches '2' in "B2 is the suite number.".
\D Matches a non-digit character. Equivalent to [^0-9]. For example, /\D/ or /[^0-9]/ matches 'B' in "B2 is the suite number.".
\f Matches a form feed character (U+000C).
\n Matches a newline character (U+000A).
\r Matches a carriage return character (U+000D).
\s Matches a single whitespace character, including space, tab, form feed, and line feed. Equivalent to [ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff].
For example, /\s\w*/ matches ' bar' in "foo bar.".
\S Matches a single non-whitespace character. Equivalent to [^ \f\n\r\t\v\u00a0\u1680\u2000-\u200a\u2028\u2029\u202f\u205f\u3000\ufeff].
For example, /\S\w*/ matches 'foo' in "foo bar.".
\t Matches a horizontal tab character (U+0009).
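\w Matches a single word character (a letter of the basic Latin alphabet, a digit, or an underscore). Equivalent to [A-Za-z0-9_].
For example, /\w/ matches 'a' in "apple", '5' in "$5.28", and '3' in "3D".
\W Matches a single non-word character. Equivalent to [^A-Za-z0-9_].
\n Where n is a positive integer, a back reference to the substring matched by the n-th capturing group in the pattern (groups are numbered by counting their left parentheses). For example, /apple(,)\sorange\1/ matches 'apple, orange,' in "apple, orange, cherry, peach".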
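
Several rows of the table can be verified with a few one-liners. The sketch below reuses sample strings from the table where possible; the back-reference pattern /(\w)\1/ and its inputs are illustrative additions.

```javascript
// Anchors: ^ and $ tie the match to the start and end of the input.
console.log(/^A/.test('an A')); // false
console.log(/^A/.test('An E')); // true
console.log(/t$/.test('eat'));  // true

// Word boundary: \b matches a position, so it adds no characters to the match.
console.log('moon'.match(/\bm/)[0]);   // m
console.log('moon'.match(/oo\b/));     // null
console.log('moon'.match(/oon\b/)[0]); // oon

// Quantifiers and character sets.
console.log('caaandy'.match(/a{2}/)[0]);       // aa
console.log('caaaaaaandy'.match(/a{1,3}/)[0]); // aaa
console.log('test.i.ng'.match(/[a-z.]+/)[0]);  // test.i.ng

// Back reference: \1 repeats whatever the first group captured.
console.log(/(\w)\1/.test('moon')); // true (the doubled 'o')
console.log(/(\w)\1/.test('main')); // false
```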

3. Application

3.1 Splitting a string

Using regular expressions to split strings is more flexible than using fixed characters. The usual split code is:

'a b   c'.split(' '); // ['a', 'b', '', '', 'c']

The above method cannot handle consecutive spaces, so use a regular expression instead:

'a b   c'.split(/\s+/); // ['a', 'b', 'c']

No matter how many spaces there are, the string can be split normally. Then add ',':

'a,b, c d'.split(/[\s\,]+/); // ['a', 'b', 'c', 'd']

Then add ';':

'a,b;; c d'.split(/[\s\,\;]+/); // ['a', 'b', 'c', 'd']

In this way, regular expressions can turn irregularly formatted input into a clean array.
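
One detail the examples above do not cover is input that begins or ends with a separator, which leaves empty strings in the result. A minimal sketch of a wrapper that drops them follows; splitFields is a hypothetical helper name, not a built-in.

```javascript
// Hypothetical helper: split on any run of whitespace, commas, or semicolons.
function splitFields(input) {
  return input
    .split(/[\s,;]+/)
    .filter((field) => field !== ''); // drop empty strings from leading/trailing separators
}

console.log(splitFields('a,b;; c  d')); // ['a', 'b', 'c', 'd']
console.log(splitFields(' a, b ;'));    // ['a', 'b']
```

Inside a character set the comma and semicolon need no backslash, so /[\s,;]+/ behaves the same as /[\s\,\;]+/.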

3.2 Grouping

In addition to determining whether a match occurs, regular expressions can also extract substrings. Each part of the pattern enclosed in () is a group to be extracted. For example:

^(\d{4})-(\d{4,9})$ defines two groups, which extract the area code and the local number directly from the matched string:

var re = /^(\d{4})-(\d{4,9})$/;
re.exec('0530-12306'); // ['0530-12306', '0530', '12306']
re.exec('0530 12306'); // null

After a successful match, the exec() method returns an array. The first element is the entire string matched by the regular expression, and the following elements are the substrings captured by each group, in order.

The exec() method returns null if the match fails.
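
Because exec() can return null, callers usually check the result before reading the groups. The sketch below wraps that check; parsePhone and the property names areaCode and localNumber are hypothetical and chosen only for illustration.

```javascript
// Hypothetical helper that returns the two captured parts, or null on no match.
function parsePhone(text) {
  const match = /^(\d{4})-(\d{4,9})$/.exec(text);
  if (match === null) {
    return null;
  }
  // match[0] is the whole match; the captured groups follow in order.
  const [, areaCode, localNumber] = match;
  return { areaCode, localNumber };
}

console.log(parsePhone('0530-12306')); // { areaCode: '0530', localNumber: '12306' }
console.log(parsePhone('0530 12306')); // null
```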

3.3 Greedy Matching

Note that regular expression matching is greedy by default, that is, it matches as many characters as possible. The following pattern tries to capture the trailing zeros of a number in a second group:

var re = /^(\d+)(0*)$/;
re.exec('102300'); // ['102300', '102300', '']

Because \d+ is greedy, it also consumes all the trailing 0s, so 0* can only match the empty string.

To leave the trailing 0s for the second group, \d+ must be made non-greedy (that is, matching as few characters as possible). Adding a ? after the quantifier does this:

var re = /^(\d+?)(0*)$/;
re.exec('102300'); // ['102300', '1023', '00']
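
The two behaviours can be put side by side. This is a small sketch with illustrative variable names; slice(1) simply drops the whole-match element so that only the groups are printed.

```javascript
const greedy = /^(\d+)(0*)$/.exec('102300');
const lazy = /^(\d+?)(0*)$/.exec('102300');

console.log(greedy.slice(1)); // ['102300', '']
console.log(lazy.slice(1));   // ['1023', '00']

// Without anchors, a lazy quantifier stops at the first possible match.
console.log('123abc'.match(/\d+/)[0]);  // 123
console.log('123abc'.match(/\d+?/)[0]); // 1
```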

3.4 Regular Expression Flags

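g Global search: find all matches rather than stopping after the first.
i Case-insensitive search.
m Multi-line search: ^ and $ match at the start and end of each line, not only of the whole input.
y Sticky search: the match must start exactly at the position given by the regular expression's lastIndex property.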
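
A short sketch of each flag in action follows. The sample text and variable names are assumptions made for the example.

```javascript
const sample = 'Cat\ncat\nDog';

// g: find every match instead of stopping at the first one.
console.log(sample.match(/cat/g));  // ['cat']
// i: ignore case; combined with g it finds both spellings.
console.log(sample.match(/cat/gi)); // ['Cat', 'cat']
// m: ^ and $ also match at the start and end of each line.
console.log(sample.match(/^\w+$/gm)); // ['Cat', 'cat', 'Dog']
console.log(/^\w+$/.test(sample));    // false, the input as a whole contains newlines

// y: the match must begin exactly at lastIndex.
const sticky = /cat/y;
sticky.lastIndex = 4;             // index of the first character of the second line
console.log(sticky.test(sample)); // true
sticky.lastIndex = 5;
console.log(sticky.test(sample)); // false
```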

3.5 test() method

The test() method checks whether a string contains a match for a pattern. It returns true if the string contains matching text and false otherwise.

var re = /^(\d{4})-(\d{4,9})$/;
re.test('0530-12321'); // true
re.test('0530-123ab'); // false
re.test('0530 12321'); // false
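
One behaviour worth knowing: when a regular expression has the g or y flag, test() resumes from the lastIndex property, so repeated calls on the same object can give different answers. The sketch below illustrates this; the variable names are arbitrary.

```javascript
const anchored = /^(\d{4})-(\d{4,9})$/;
console.log(anchored.test('0530-12321')); // true
console.log(anchored.test('0530-12321')); // true, no state is kept without g or y

const digitsGlobal = /\d+/g;
console.log(digitsGlobal.test('42')); // true, lastIndex is now 2
console.log(digitsGlobal.test('42')); // false, the search started at index 2
console.log(digitsGlobal.test('42')); // true, lastIndex was reset to 0 after the failure
```

For plain yes/no validation it is therefore simpler to leave the g flag off.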

4. Commonly used regular expressions (reference)

Verify email address: ^\w+([-+.]\w+)*@\w+([-.]\w+)*\.\w+([-.]\w+)*$
Verify ID number (15 or 18 digits): ^(\d{15}|\d{18})$
Mainland China mobile phone number: 1\d{10}
Mainland China landline number: (\d{4}-|\d{3}-)?(\d{8}|\d{7})
Mainland China postal code: [1-9]\d{5}
IP address: ((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)
Date (year-month-day): (\d{4}|\d{2})-((1[0-2])|(0?[1-9]))-(([12][0-9])|(3[01])|(0?[1-9]))
Date (month/day/year): ((1[0-2])|(0?[1-9]))/(([12][0-9])|(3[01])|(0?[1-9]))/(\d{4}|\d{2})
Verify number: ^[0-9]*$
Verify n-digit number: ^\d{n}$
Verify that at least n digits are present: ^\d{n,}$
Verify a number with m to n digits: ^\d{m,n}$
Verify zero or a number that starts with a non-zero digit: ^(0|[1-9][0-9]*)$
Verify a non-negative real number with 1-3 decimal places: ^[0-9]+(\.[0-9]{1,3})?$
Verify a non-zero positive integer: ^\+?[1-9][0-9]*$
Verify a non-zero negative integer: ^\-[1-9][0-9]*$
Verify non-negative integer (positive integer + 0): ^\d+$
Verify non-positive integer (negative integer + 0): ^((-\d+)|(0+))$
Verify that the length of the character is 3: ^.{3}$
Verify a string consisting of 26 English letters: ^[A-Za-z]+$
Verify a string consisting of 26 uppercase English letters: ^[A-Z]+$
Verify a string consisting of 26 lowercase English letters: ^[a-z]+$
Verify a string consisting of numbers and 26 English letters: ^[A-Za-z0-9]+$
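
Several patterns in this list have no ^ and $ anchors, so with test() they succeed when any part of the input matches. For validation, the whole input usually has to match. The sketch below wraps a few of the patterns with anchors; the validators object and the isValid function are hypothetical names introduced for this example.

```javascript
// Hypothetical lookup of a few patterns from the list, anchored with ^ and $
// so that the whole input must match.
const validators = {
  postalCode: /^[1-9]\d{5}$/,
  ipv4: /^((2[0-4]\d|25[0-5]|[01]?\d\d?)\.){3}(2[0-4]\d|25[0-5]|[01]?\d\d?)$/,
  decimal: /^[0-9]+(\.[0-9]{1,3})?$/,
};

function isValid(kind, value) {
  const pattern = validators[kind];
  if (!pattern) {
    throw new Error('Unknown validator: ' + kind);
  }
  return pattern.test(value);
}

console.log(isValid('postalCode', '100000')); // true
console.log(isValid('postalCode', '012345')); // false
console.log(isValid('ipv4', '192.168.0.1'));  // true
console.log(isValid('ipv4', '256.1.1.1'));    // false
console.log(isValid('decimal', '3.14'));      // true
console.log(isValid('decimal', '3.14159'));   // false
```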
