Regular Expressions - patrickcole/learning GitHub Wiki
Regular Expressions
Basics
A RegEx is composed of a pattern and optional flags:
/pattern/flags
Basic Pattern
// just check for the phrase 'hello':
const regex = /hello/;
console.log(regex.test(`hello world`));
// => true
To get matches in an array, use .exec():
const regex = /hello/;
const string = `hello world`;
const result = regex.exec(string);
console.log(result);
// => ['hello', index: 0, input: 'hello world', groups: undefined]
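A small sketch of reading the `.exec()` result, with illustrative variable names that are not from the original notes. It assumes the matched text is read from position 0 and the start offset from `.index`; a failed match gives `null` rather than an empty array.

```javascript
const worldPattern = /world/;
const found = worldPattern.exec('hello world');
const missing = worldPattern.exec('hello there');

// The matched text is the first array entry; index is where the match starts.
console.log(found[0], found.index);
// => world 6

// No match returns null, so check for it before reading properties.
console.log(missing);
// => null
```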
Using Flags
g: global; matches the pattern multiple times
i: case insensitive
m: multi-line mode; ^ and $ match the start and end of each line. Without this flag, they match only the start and end of the entire string
u: unicode
s: single-line (dotAll) mode; . also matches newline characters
console.log(/hello/ig.test(`HEllo`));
// => true;
// this also works:
console.log(new RegExp('hello', 'ig').test('HEllo'));
// => true;
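To see what each flag changes, here is a minimal sketch that runs the same two-line string against patterns with and without `g`, `m` and `s`. The sample text and variable names are made up for illustration.

```javascript
const text = 'Hello\nhello';

// Without g, match() returns only the first match (plus index/input details).
const first = text.match(/hello/i);
// With g, match() returns every match as a plain array.
const all = text.match(/hello/gi);
// With m, ^ and $ anchor to each line instead of the whole string.
const perLine = text.match(/^hello$/gim);
// Without m, neither line spans the whole string, so nothing matches.
const wholeOnly = text.match(/^hello$/gi);
// With s, the dot also matches the newline between the two words.
const dotAll = /Hello.hello/s.test(text);
const noDotAll = /Hello.hello/.test(text);

console.log(first[0]);   // => Hello
console.log(all);        // => [ 'Hello', 'hello' ]
console.log(perLine);    // => [ 'Hello', 'hello' ]
console.log(wholeOnly);  // => null
console.log(dotAll, noDotAll); // => true false
```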
Character Groups
Character Set
Matches any one character that is listed in the set:
const regex = /[hc]ello/;
console.log(regex.test('hello'));
// => true;
console.log(regex.test('cello'));
// => true;
console.log(regex.test('jello'));
// => false;
Negated Character Set
Matches any one character that is not in the set:
const regex = /[^hc]ello/;
console.log(regex.test('hello'));
// => false;
console.log(regex.test('cello'));
// => false;
console.log(regex.test('jello'));
// => true;
Ranges
const regex = /[a-z]ello/;
console.log(regex.test('hello'));
// => true;
console.log(regex.test('cello'));
// => true;
console.log(regex.test('jello'));
// => true;
console.log(regex.test('Hello'));
// => false; as set is all lowercase!
Combining Ranges
// ranges are listed back to back; a hyphen between them would also match a literal '-'
const regex = /[A-Z0-9]/;
console.log(regex.test('a'));
// => false;
console.log(regex.test('A'));
// => true;
console.log(regex.test('1'));
// => true;
Multiple Ranges
console.log(/^[A-Z]$/.test('A'));
// => true;
console.log(/^[A-Z]$/.test('AB'));
// => false;
console.log(/^[A-Z]$/.test('Ab'));
// => false;
console.log(/^[A-Z0-9]$/.test('1'));
// => true;
console.log(/^[A-Z0-9]$/.test('A1'));
// => false;
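Sets, negated sets and ranges are also handy outside of `.test()`. The sketch below uses a made-up input string to pull out characters with `match()` and to strip characters with `replace()`; the names are illustrative only.

```javascript
const input = 'Order #A7-b42';

// Range set: collect every uppercase letter or digit.
const upperOrDigit = input.match(/[A-Z0-9]/g);
// Negated set: remove everything that is NOT a digit.
const digitsOnly = input.replace(/[^0-9]/g, '');
// A hyphen placed last in a set is a literal hyphen, not a range.
const hasHyphen = /[a-z-]/.test('-');

console.log(upperOrDigit); // => [ 'O', 'A', '7', '4', '2' ]
console.log(digitsOnly);   // => 742
console.log(hasHyphen);    // => true
```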
Meta Characters
\d - any digit 0-9
\D - any character that is NOT a digit
\w - any alphanumeric character or underscore
\W - any character that is NOT alphanumeric or an underscore
\s - any whitespace character (spaces, tabs, newlines and Unicode spaces)
\S - any non-whitespace character
\0 - null character
\n - newline
\t - tab character
\uXXXX - unicode character, with XXXX replaced by the character's four-digit hexadecimal code
. - any character that is not a newline character, unless the s flag is provided
[^] - matches any character including newline characters
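A quick sketch of several meta characters applied to one made-up string (the sample line and variable names are assumptions for illustration). It also shows the difference between `.` and `[^]` at a newline.

```javascript
const line = 'id_42:\tready\n';

const digits = line.match(/\d+/)[0];   // first run of digits
const word = line.match(/\w+/)[0];     // first run of word characters
const nonWord = line.match(/\W/)[0];   // first character that is not a word character
const hasTab = /\t/.test(line);
const whitespaceCount = line.match(/\s/g).length; // the tab and the newline
const isLetterA = /\u0041/.test('A');  // \u0041 is the code for 'A'

// The dot stops at a newline; [^] does not.
const dotMatchesNewline = /ready.$/.test(line);
const anyMatchesNewline = /ready[^]$/.test(line);

console.log(digits, word, nonWord);  // => 42 id_42 :
console.log(hasTab, whitespaceCount, isLetterA); // => true 2 true
console.log(dotMatchesNewline, anyMatchesNewline); // => false true
```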
Quantifiers
+ - matches the preceding expression 1 or more times
const regex = /\d+/;
console.log(regex.test('1'));
// => true;
console.log(regex.test('1122'));
// => true;
console.log(regex.test('Abdd'));
// => false;
* - matches the preceding expression 0 or more times:
const regex = /hi*d/;
console.log(regex.test('hd'));
// => true; because i can still be omitted
console.log(regex.test('hid'));
// => true;
? - matches the preceding expression 0 or 1 time:
const regex = /hii?d/;
console.log(regex.test('hid'));
// => true; because second i is not provided, and that's ok due to rule (0 or 1);
console.log(regex.test('hiid'));
// => true; because second i is provided
console.log(regex.test('hiiid'));
// => false; one too many i characters
^ - matches the beginning of the string
const regex = /^h/;
console.log(regex.test('hi'));
// => true;
console.log(regex.test('bye'));
// => false;
console.log(regex.test('hello'));
// => true;
$ - matches the end of the string
// the dot is escaped so it matches a literal '.' rather than any character
const regex = /\.com$/;
console.log(regex.test('test@test.com'));
// => true;
console.log(regex.test('test@test'));
// => false;
console.log(regex.test('user@example.com'));
// => true;
console.log(regex.test('.com'));
// => true;
console.log(regex.test('com'));
// => false;
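Using ^ and $ together forces the pattern to cover the entire string instead of just part of it. A minimal sketch with illustrative names:

```javascript
const containsDigits = /\d+/;   // digits anywhere in the string
const onlyDigits = /^\d+$/;     // digits from the first character to the last

console.log(containsDigits.test('abc123'));
// => true;
console.log(onlyDigits.test('abc123'));
// => false; the string does not start with a digit
console.log(onlyDigits.test('123'));
// => true;
```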
{N} - matches exactly N occurrences
const regex = /hi{2}d/;
console.log(regex.test('hiid'));
// => true;
console.log(regex.test('hid'));
// => false;
{N,} - matches at least N occurrences of the preceding expression
const regex = /hi{2,}d/;
console.log(regex.test('hiid'));
// => true;
console.log(regex.test('hiiid'));
// => true; because at least two i characters exist
console.log(regex.test('hiiiid'));
// => true;
{N,M} - matches at least N and no more than M occurrences, where N <= M
const regex = /hi{1,2}d/;
console.log(regex.test('hid'));
// => true;
console.log(regex.test('hiid'));
// => true;
console.log(regex.test('hiiid'));
// => false;
X|Y - matches either X or Y
const regex = /(red|green) apple/;
console.log(regex.test('red apple'));
// => true;
console.log(regex.test('green apple'));
// => true;
console.log(regex.test('delicious apple'));
// => false;
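A quantifier applies only to the single token right before it, and | splits the whole pattern unless it is wrapped in a group. The sketch below (illustrative patterns, not from the original notes) shows how parentheses change both behaviours.

```javascript
// + applies to 'a' only vs. to the whole group 'ha'.
const repeatLetter = /^ha+$/;
const repeatGroup = /^(ha)+$/;

console.log(repeatLetter.test('haaa'), repeatLetter.test('hahaha'));
// => true false
console.log(repeatGroup.test('haaa'), repeatGroup.test('hahaha'));
// => false true

// Without a group, the alternatives are '^red' and 'green apple$'.
const ungrouped = /^red|green apple$/;
const grouped = /^(red|green) apple$/;

console.log(ungrouped.test('red car'));
// => true; '^red' alone is enough
console.log(grouped.test('red car'));
// => false;
console.log(grouped.test('green apple'));
// => true;
```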
Special Characters
To match characters that have a special meaning in patterns (such as +) literally, escape them with a backslash in the regex:
// this won't match the literal text 'a+b', because + is treated as a quantifier:
const regex = /a+b/;
console.log(regex.test('a+b'));
// => false;
const updated = /a\+b/;
console.log(updated.test('a+b'));
// => true;
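When the text to match comes from a variable, escaping by hand is not practical. One common approach, sketched here with a hypothetical helper named `escapeRegExp` (it is not a built-in), is to escape every special character before passing the text to `new RegExp()`.

```javascript
// Hypothetical helper: prefix each character that has special meaning
// in a pattern with a backslash. '$&' inserts the matched character.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const escaped = escapeRegExp('a+b');
const literal = new RegExp(escaped);

console.log(escaped);
// => a\+b
console.log(literal.test('a+b'));
// => true;
console.log(literal.test('aab'));
// => false; + is no longer a quantifier
```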
Examples
Match Any 10 Digit Number
const regex = /^\d{10}$/;
console.log(regex.test('39930'));
// => false;
console.log(regex.test('9294628302'));
// => true;
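Real input often carries formatting such as spaces or dashes. As a sketch, assuming that formatting should be ignored, a hypothetical helper `isTenDigitNumber` can strip non-digits with \D before applying the same pattern:

```javascript
// Hypothetical helper: remove every non-digit, then apply the 10-digit check.
function isTenDigitNumber(input) {
  const digits = input.replace(/\D/g, '');
  return /^\d{10}$/.test(digits);
}

console.log(isTenDigitNumber('(929) 462-8302'));
// => true;
console.log(isTenDigitNumber('929-4628'));
// => false; only 7 digits remain
```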
String Transformations
Capitalize First Letter of String
Pattern: /^\w/
/ : begin RegEx
^ : the beginning of the string
\w : matches any word character (alphanumeric & underscore)
/ : end RegEx
let phrase = 'the quick green alligator...';
// strings are immutable, so assign the result back to phrase
phrase = phrase.trim().replace(/^\w/, (char) => char.toUpperCase());
console.log(phrase);
// => "The quick green alligator..."
Capitalize First Letter of Each Word
Pattern: /\w\S*/g
/ : begin RegEx
\w : matches any word character (alphanumeric & underscore)
\S : matches any character that is not a whitespace character (spaces, tabs or line breaks)
* : quantifier, match 0 or more of the preceding token
/ : end RegEx
g : global search, search entire string
let phrase = 'the quick green alligator...';
phrase = phrase.replace(/\w\S*/g, (w) => w.replace(/^\w/, (c) => c.toUpperCase()));
console.log(phrase);
// => "The Quick Green Alligator..."
- Once again, can use .trim() to remove leading spaces
- Also can perform .toLowerCase() before capitalizing the first letter, if the text is mixed case (upper and lower)
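Putting the two notes above together, a minimal sketch of a hypothetical `titleCase` helper (the name and the sample input are illustrative) that trims, lowercases and then capitalizes each word:

```javascript
// Hypothetical helper combining .trim(), .toLowerCase() and the per-word pattern.
function titleCase(text) {
  return text
    .trim()
    .toLowerCase()
    .replace(/\w\S*/g, (word) => word.replace(/^\w/, (c) => c.toUpperCase()));
}

console.log(titleCase('  the QUICK green aLLigator...'));
// => "The Quick Green Alligator..."
```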