Regex

  • https://regexr.com/
  • Character classes
    • [ACD] — any character in set
    • [^ACD] — any character not in set
    • . — any character except line breaks; roughly equivalent to [^\n\r]
    • [g-s] — any character whose character code lies between g and s, inclusive
    • \w — word character; equivalent to [A-Za-z0-9_]
    • \W — not a word character
    • \d — digit; equivalent to [0-9]
    • \D — not a digit
    • \s — whitespace (spaces, tabs, line breaks and other unicode spaces)
    • \S — not whitespace
  • Quantifiers
    • + — 1 or more
    • * — 0 or more
    • Curly braces:
      • {1,3} — 1 to 3
      • {3} — exactly 3
      • {3,} — 3 or more (no space inside the braces)
  • Anchors
    • ^ — beginning
    • $ — end
    • \b — word boundary: the position between a word character and a non-word character
    • \B — not a word boundary
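
A few quick checks of the character classes, quantifiers and anchors listed above. This is only a sketch; the sample strings and variable names are made up for illustration.

```javascript
// Character classes
const hasDigit = /\d/.test("room 42");            // true: "4" is a digit
const nonWord = "a-b_c".match(/\W/)[0];           // "-" (the underscore counts as a word character)
const inRange = /^[g-s]$/.test("k");              // true: "k" lies between "g" and "s"

// Quantifiers
const exactlyThree = /^\d{3}$/.test("123");       // true
const oneToThree = /^\d{1,3}$/.test("1234");      // false: four digits
const threeOrMore = /^\d{3,}$/.test("12345");     // true

// Anchors and word boundaries
const insideWord = /\bcat\b/.test("concatenate"); // false: "cat" sits inside a longer word
const standalone = /\bcat\b/.test("a cat sat");   // true
```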

Flags

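  • g — global search: find every match instead of stopping at the first
  • i — ignore case: make the match case-insensitive
  • m — multiline: ^ and $ match at the start and end of each line, not only at the start and end of the whole input
  • u — unicode: the pattern is read as Unicode code points, which enables extended escapes such as \u{1F600}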
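
A small sketch of how each flag changes the result. The sample text and variable names are assumptions chosen for illustration.

```javascript
const text = "Foo\nfoo\nbar";

// No flags: ^ and $ only match at the very start and end of the input.
const noFlags = text.match(/^foo$/);             // null

// i: case-insensitive
const ignoreCase = /^foo$/i.test("FOO");         // true

// m: ^ and $ also match around line breaks
const multiline = text.match(/^foo$/m)[0];       // "foo" (the second line)

// g combined with i and m: every matching line
const allLines = text.match(/^foo$/gim);         // ["Foo", "foo"]

// u: "." consumes a whole code point instead of a single UTF-16 code unit
const emoji = "\u{1F600}";
const withoutU = /^.$/.test(emoji);              // false: the emoji is two code units
const withU = /^.$/u.test(emoji);                // true
```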

Capturing groups

  • A regex can contain capturing groups, written with parentheses. During a replace operation, each group can be referenced in the replacement string with the $ symbol.
  • For each regex match:
    • $0 denotes the whole regex match (in JavaScript's replace() the whole match is written $& instead)
    • $1 denotes the first capturing group
    • $2 denotes the second capturing group and so on…

Capturing group example

  • Input
John, Doe | Jane, Doe
  • Regex = (\w+), (\w+)
  • Matches:
    • John, Doe:
      • Capturing groups:
        • $0 = John, Doe
        • $1 = John
        • $2 = Doe
    • Jane, Doe:
      • Capturing groups:
        • $0 = Jane, Doe
        • $1 = Jane
        • $2 = Doe
  • With Replacement = Hello $1 applied to each match above, we get
Hello John | Hello Jane
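
The same example can be sketched in JavaScript with String.prototype.replace. The g flag is needed so that both names are replaced, and $& stands for the whole match. Variable names here are illustrative only.

```javascript
const input = "John, Doe | Jane, Doe";
const pattern = /(\w+), (\w+)/g;

// $1 is the first capturing group
const greeted = input.replace(pattern, "Hello $1");  // "Hello John | Hello Jane"

// Groups can be reordered in the replacement
const swapped = input.replace(pattern, "$2 $1");     // "Doe John | Doe Jane"

// $& inserts the whole match
const wrapped = input.replace(pattern, "[$&]");      // "[John, Doe] | [Jane, Doe]"
```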

RegExp class

// via regular expression literal
const re = /ab+c/i;
 
// via constructor (named re2 because re is already declared above)
const re2 = new RegExp("ab+c", "i");
  • Methods
// finding
re.exec(str) // returns the next match as an array (with capturing groups), or null
re.test(str) // returns true if a match exists, otherwise false
  • Regex objects
    • they are stateful if the global (g) or sticky (y) flag is set
    • they store the position where the next search starts in the lastIndex property
const regex = RegExp('foo*', 'g');
const str = 'table football, foosball';
let arr;
 
while ((arr = regex.exec(str)) !== null) {
  console.log(`Found ${arr[0]}. Next starts at ${regex.lastIndex}.`);
  // Output: "Found foo. Next starts at 9."
  // Output: "Found foo. Next starts at 19."
}
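
A common consequence of this statefulness, sketched with made-up variable names: calling test() repeatedly on the same global regex gives alternating results, because each call resumes from lastIndex.

```javascript
const globalRe = /foo/g;

const first = globalRe.test("foo");    // true; lastIndex is now 3
const second = globalRe.test("foo");   // false: the search started at index 3; lastIndex resets to 0
const third = globalRe.test("foo");    // true again

// Without the g (or y) flag, lastIndex is ignored and every call starts from 0.
const plainRe = /foo/;
const alwaysTrue = plainRe.test("foo") && plainRe.test("foo");  // true
```

Resetting globalRe.lastIndex = 0 before reuse, or using a regex without g, avoids the surprise.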

Regex symbol methods

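  • RegExp objects implement the well-known symbol methods (Symbol.match, Symbol.matchAll, Symbol.search, Symbol.replace, Symbol.split), which define how the string methods below behave when a regular expression is passed as the argument
str.match(regex) // without g: first match with capturing groups; with g: all matches, no groups
str.matchAll(regex) // iterator over all matches with capturing groups (regex must have the g flag)
str.search(regex) // index of the first match, or -1 if there is none
 
// replacing
str.replace(regex, replacement) // replaces the first match, or every match if regex has the g flag
str.replaceAll(regex, replacement) // replaces every match (regex must have the g flag)
 
// split
str.split(regex, limit) // splits at each match and returns at most limit pieces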
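
A short sketch of these string methods side by side. The sample strings and names are assumptions for illustration; matchAll and replaceAll need a reasonably recent runtime.

```javascript
const s = "2021-03-04 and 2022-05-06";
const dateRe = /(\d{4})-(\d{2})-(\d{2})/;       // no g flag
const dateReAll = /(\d{4})-(\d{2})-(\d{2})/g;   // with g flag

// match: groups without g, plain list of matches with g
const firstMatch = s.match(dateRe);             // ["2021-03-04", "2021", "03", "04"]
const allMatches = s.match(dateReAll);          // ["2021-03-04", "2022-05-06"]

// matchAll: every match, each with its capturing groups
const years = [...s.matchAll(dateReAll)].map((m) => m[1]);  // ["2021", "2022"]

// search: index of the first match
const andIndex = s.search(/and/);               // 11

// replace vs replaceAll
const firstOnly = s.replace(dateRe, "$3/$2/$1");       // "04/03/2021 and 2022-05-06"
const every = s.replaceAll(dateReAll, "$3/$2/$1");     // "04/03/2021 and 06/05/2022"

// split on a regex, with and without a limit
const parts = "a, b;c".split(/[,;]\s*/);        // ["a", "b", "c"]
const limited = "a, b;c".split(/[,;]\s*/, 2);   // ["a", "b"]
```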