Regular Expressions - patrickcole/learning GitHub Wiki

Regular Expressions

Basics

A regular expression literal is made up of a pattern and optional flags:

/pattern/flags

Basic Pattern

// just check for the phrase 'hello':
const regex = /hello/;
console.log(regex.test(`hello world`));
// => true

To get matches in an array, use .exec():

const regex = /hello/;
const string = `hello world`;
const result = regex.exec(string);
console.log(result);

// => ['hello', index: 0, input: 'hello world', groups: undefined]

Using Flags

  • g: global; finds every match instead of stopping at the first
  • i: case-insensitive
  • m: multi-line mode; ^ and $ match the start and end of each line. Without this flag they match only the start and end of the entire string
  • u: Unicode mode
  • s: single-line (dotAll) mode; . also matches newline characters
console.log(/hello/ig.test(`HEllo`));
// => true;
// this also works:
console.log(new RegExp('hello', 'ig').test('HEllo'));
// => true;
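
The m, s and g flags are easier to see on a string that contains a line break. The snippet below is a minimal sketch; the sample string and variable names are made up for illustration. It also shows a common pitfall: a regex object with the g flag remembers its position (lastIndex) between .test() calls.

```javascript
const text = 'first line\nsecond line';

// Without m, ^ anchors only at the start of the whole string.
console.log(/^second/.test(text));
// => false
// With m, ^ also anchors at the start of each line.
console.log(/^second/m.test(text));
// => true

// Without s, . stops at a newline; with s it matches across it.
console.log(/first.+second/.test(text));
// => false
console.log(/first.+second/s.test(text));
// => true

// With g, matchAll() yields every match instead of only the first.
const positions = [...text.matchAll(/line/g)].map((m) => m.index);
console.log(positions);
// => [6, 18]

// A reused regex with g is stateful: the second call starts at lastIndex.
const stateful = /hello/g;
console.log(stateful.test('hello'));
// => true
console.log(stateful.test('hello'));
// => false
```

Because of that last behaviour, it is usually safer to leave g off a regex that is only used with .test().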

Character Groups

Character Set

Matching anything that is enclosed in set:

const regex = /[hc]ello/;
console.log(regex.test('hello'));
// => true;
console.log(regex.test('cello'));
// => true;
console.log(regex.test('jello'));
// => false;

Negated Character Set

Matching anything that is not in set:

const regex = /[^hc]ello/;
console.log(regex.test('hello'));
// => false;
console.log(regex.test('cello'));
// => false;
console.log(regex.test('jello'));
// => true;

Ranges

const regex = /[a-z]ello/;
console.log(regex.test('hello'));
// => true;
console.log(regex.test('cello'));
// => true;
console.log(regex.test('jello'));
// => true;
console.log(regex.test('Hello'));
// => false; as set is all lowercase!

Combining Ranges

// list ranges back to back; a hyphen between them would be matched literally
const regex = /[A-Z0-9]/;
console.log(regex.test('a'));
// => false;
console.log(regex.test('A'));
// => true;
console.log(regex.test('1'));
// => true;

Multiple Ranges

console.log(/^[A-Z]$/.test('A'));
// => true;
console.log(/^[A-Z]$/.test('AB'));
// => false;
console.log(/^[A-Z]$/.test('Ab'));
// => false;
console.log(/^[A-Z0-9]$/.test('1'));
// => true;
console.log(/^[A-Z0-9]$/.test('A1'));
// => false;
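
Inside a set, a hyphen denotes a range only when it sits between two characters. Placed first or last in the set, or escaped, it is treated as a literal hyphen. A short sketch (the variable names are illustrative):

```javascript
// Two ranges written back to back: uppercase letters or digits.
const alnum = /^[A-Z0-9]$/;
// Same ranges plus a literal hyphen at the end of the set.
const alnumOrHyphen = /^[A-Z0-9-]$/;

console.log(alnum.test('-'));
// => false
console.log(alnumOrHyphen.test('-'));
// => true
console.log(alnumOrHyphen.test('7'));
// => true

// Adding a quantifier lets the set match more than one character.
console.log(/^[A-Za-z0-9]+$/.test('Ab1'));
// => true
```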

Meta Characters

  • \d - any digit 0-9
  • \D - any character NOT a digit
  • \w - any alphanumeric character or underscore
  • \W - any character that is not alphanumeric or an underscore
  • \s - any whitespace character (spaces, tabs, newlines and Unicode spaces)
  • \S - any non-whitespace character
  • \0 - null character
  • \n - newline
  • \t - tab character
  • \uXXXX - the Unicode character with the four-digit hexadecimal code XXXX
  • . - any character except a newline, unless the s flag is set
  • [^] - matches any character including newline characters
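
A few of the meta characters above in action. This is only a sketch; the sample string is invented for the example.

```javascript
const sample = 'Room_42\tis open';

// \d+ finds the first run of digits.
console.log(sample.match(/\d+/)[0]);
// => '42'

// \w includes the underscore, so the match runs up to the tab.
console.log(sample.match(/\w+/)[0]);
// => 'Room_42'

// \W is the opposite of \w, so it does not match an underscore.
console.log(/\W/.test('_'));
// => false
console.log(/\W/.test('-'));
// => true

// \s+ splits on any run of whitespace (here a tab and a space).
console.log(sample.split(/\s+/));
// => ['Room_42', 'is', 'open']

// . skips newlines, [^] does not.
console.log(/^.$/.test('\n'));
// => false
console.log(/^[^]$/.test('\n'));
// => true
```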

Quantifiers

  • + - matches preceding expression 1 or more times
const regex = /\d+/;
console.log(regex.test('1'));
// => true;
console.log(regex.test('1122'));
// => true;
console.log(regex.test('Abdd'));
// => false;
  • * - matches preceding expression 0 or more times:
const regex = /hi*d/;
console.log(regex.test('hd'));
// => true; because i can still be omitted
console.log(regex.test('hid'));
// => true;
  • ? - matches preceding expression 0 or 1 time:
const regex = /hii?d/;
console.log(regex.test('hid'));
// => true; because second i is not provided, and that's ok due to rule (0 or 1);
console.log(regex.test('hiid'));
// => true; because second i is provided
console.log(regex.test('hiiid'));
// => false; one too many i characters
  • ^ - anchor rather than a quantifier: matches the beginning of the string
const regex = /^h/;
console.log(regex.test('hi'));
// => true;
console.log(regex.test('bye'));
// => false;
console.log(regex.test('hello'));
// => true;
  • $ - anchor rather than a quantifier: matches the end of the string
const regex = /\.com$/; // escape the dot so it matches a literal '.'
console.log(regex.test('test@test.com'));
// => true;
console.log(regex.test('test@test'));
// => false;
console.log(regex.test('user@example.com'));
// => true;
console.log(regex.test('.com'));
// => true;
console.log(regex.test('com'));
// => false;
  • {N} - matches exactly N occurrences
const regex = /hi{2}d/;
console.log(regex.test('hiid'));
// => true;
console.log(regex.test('hid'));
// => false;
  • {N,} - matches at least N occurrences of the preceding expression
const regex = /hi{2,}d/;
console.log(regex.test('hiid'));
// => true;
console.log(regex.test('hiiid'));
// => true; because at least two i characters exist
console.log(regex.test('hiiiid'));
// => true;
  • {N,M} - matches at least N and at most M occurrences of the preceding expression (M must not be smaller than N)
const regex = /hi{1,2}d/;
console.log(regex.test('hid'));
// => true;
console.log(regex.test('hiid'));
// => true;
console.log(regex.test('hiiid'));
// => false;
  • X|Y - alternation: matches either X or Y
const regex = /(red|green) apple/;
console.log(regex.test('red apple'));
// => true;
console.log(regex.test('green apple'));
// => true;
console.log(regex.test('delicious apple'));
// => false;
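
The parentheses in the pattern above do two jobs: they limit how far the | reaches, and they capture the matched alternative so .exec() can return it. A small sketch, with illustrative input strings:

```javascript
const fruit = /(red|green) apple/;
const result = fruit.exec('one green apple');

// Index 0 holds the full match, index 1 the first captured group.
console.log(result[0]);
// => 'green apple'
console.log(result[1]);
// => 'green'
console.log(result.index);
// => 4

// exec() returns null when nothing matches.
console.log(fruit.exec('delicious apple'));
// => null

// Without parentheses the alternatives are 'red' and 'green apple'.
console.log(/red|green apple/.test('red'));
// => true
```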

Special Characters

To match characters that have a special meaning in patterns (such as +, . or *) literally, escape them with a backslash:

// this won't check for 'a+b'; an unescaped + means "one or more a":
const regex = /a+b/;
console.log(regex.test('a+b'));
// => false;
const updated = /a\+b/;
console.log(updated.test('a+b'));
// => true;
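
When the text to match comes from a variable, the same escaping has to be applied before passing it to new RegExp(). JavaScript's long-standing API has no built-in for this, so the sketch below uses a hand-written helper; escapeRegExp is a hypothetical name, not part of the language.

```javascript
// Hypothetical helper: prefixes every regex special character with a backslash.
function escapeRegExp(text) {
  // $& in the replacement stands for the matched character itself.
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

const userInput = 'a+b';
const literal = new RegExp(escapeRegExp(userInput));

console.log(escapeRegExp(userInput));
// => 'a\+b'
console.log(literal.test('a+b'));
// => true;
console.log(literal.test('aab'));
// => false; the + is now matched literally
```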

Examples

Match Any 10 Digit Number

const regex = /^\d{10}$/;
console.log(regex.test('39930'));
// => false;
console.log(regex.test('9294628302'));
// => true;

String Transformations

Capitalize First Letter of String

Pattern: /^\w/

  • /: begin RegEx
  • ^: the beginning of the string
  • \w: matches any word character (alphanumeric & underscore)
  • /: end RegEx
let phrase = 'the quick green alligator...';
// replace() returns a new string, so assign the result back
phrase = phrase.trim().replace(/^\w/, (char) => char.toUpperCase());
console.log(phrase);
// => "The quick green alligator..."

Capitalize First Letter of Each Word

Pattern: /\w\S*/g

  • /: begin RegEx
  • \w: matches any word character (alphanumeric & underscore)
  • \S: matches any character that is not a whitespace character (spaces, tabs or line breaks)
  • *: quantifier, match 0 or more of the preceding token
  • /: end RegEx
  • g: global search, search entire string
let phrase = 'the quick green alligator...';
phrase = phrase.replace(/\w\S*/g, (w) => w.replace(/^\w/, (c) => c.toUpperCase()));
console.log(phrase);
// => "The Quick Green Alligator..."
  • As before, .trim() can be used to remove leading spaces
  • If the text is mixed case (upper and lower), call .toLowerCase() before capitalizing the first letter of each word
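
Putting the two notes above together, one possible way to wrap the transformation in a function. titleCase is a hypothetical name chosen for this sketch.

```javascript
// Hypothetical helper: trims, lowercases, then capitalizes each word.
function titleCase(text) {
  return text
    .trim()
    .toLowerCase()
    .replace(/\w\S*/g, (word) => word.replace(/^\w/, (c) => c.toUpperCase()));
}

console.log(titleCase('  tHE quICK green ALLIGATOR...'));
// => 'The Quick Green Alligator...'
```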
