
Your RegEx Cheat Sheet

Regular expressions, or regexes, are a special kind of text string used to describe patterns in text. They’re extremely powerful tools for searching and modifying large amounts of text quickly, which is why they’re often used by developers and other professionals who need to deal with a lot of data. But they can also be intimidating! This cheat sheet will help you get started writing regexes and show you some helpful tricks along the way.

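In languages that support regex literals (JavaScript, Perl, and Ruby, for example), the basic format of a regex looks like this:

/pattern/modifiers

In other words, take whatever pattern you want to match, put it between the delimiters (forward slashes here), optionally add modifiers after the closing delimiter, then use that pattern to search for possible matches in your data.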

For example, say you wanted to find all the phrases in your text that matched a certain pattern of words (for example, “I love pizza” or “Bella went to school today”). If you want to match the phrase “I love pizza,” you would search for this regex: I\s+love\s+pizza. The \s part stands for any whitespace character (e.g., space, tab, or newline), and the + indicates that there can be one or more of those characters between each pair of words.

Now if you wanted to find all the phrases that matched “Bella went to school today,” you would search for this regex: Bella\s+went\s+to\s+school\s+today. As before, each \s+ matches the one or more whitespace characters that separate two words.

And here is what it means: find any string that has the word “Bella” followed by one or more whitespace characters (\s+), then the word “went” followed by more whitespace, then the words “to” and “school”, each followed by whitespace, and finally the word “today”. To match a whole word of unknown length instead, use \w+, which matches one or more word characters (letters, digits, or underscores).
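As a rough illustration, here is how the two phrases could be matched with Python’s re module. The sample text is made up for this sketch. The gaps between words are matched with \s+ (whitespace) because \w only matches letters, digits, and underscores, never spaces.

```python
import re

# Sample text invented for this example; note the double space after "went".
text = "I love pizza. Bella went  to school today. I really love pizza."

# \s+ matches one or more whitespace characters between the words.
pizza = re.findall(r"I\s+love\s+pizza", text)
bella = re.search(r"Bella\s+went\s+to\s+school\s+today", text)

# \w+ matches one whole word, so this allows any single word after "I".
any_word = re.findall(r"I\s+\w+\s+love\s+pizza", text)

print(pizza)           # ['I love pizza']
print(bella.group(0))  # Bella went  to school today
print(any_word)        # ['I really love pizza']
```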

Now that we have some background information about regexes, let’s take a look at what kinds of tasks are best suited for them.

For example, a regular expression might be used to find the position of words in a sentence or find text that is formatted with certain HTML tags. In addition, regexes can also search large amounts of data such as log files and computer folders for specific strings (e.g., names of photos). Regular expressions are also excellent at finding patterns and making replacements on large groups of data.
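The tasks above might look like this in Python. This is only a sketch: the log lines, the IMG_ file-name convention, and the date format are assumptions made up for the example.

```python
import re

# Hypothetical log content, invented for this example.
log = """2024-01-05 ERROR disk full on /dev/sda1
2024-01-05 INFO backup finished
2024-01-06 ERROR cannot open IMG_0042.jpg"""

# Find every line that reports an error (re.MULTILINE makes ^ and $ work per line).
errors = re.findall(r"^.*\bERROR\b.*$", log, flags=re.MULTILINE)

# Find photo file names such as IMG_0042.jpg.
photos = re.findall(r"\bIMG_\d+\.jpg\b", log)

# Find the position of a word in the text.
pos = re.search(r"\bbackup\b", log).start()

# Replace every date of the form YYYY-MM-DD with a placeholder.
redacted = re.sub(r"\d{4}-\d{2}-\d{2}", "<date>", log)

print(len(errors), photos, log[pos:pos + 6])
print(redacted)
```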

Now that we’ve seen how a few examples work, let’s look at where you can use them.

Most modern programming languages have built-in support for regular expressions. If you need to use them often, it is worth learning how at least one of these languages handles them, for example:
  • JavaScript has RegExp as one of its built-in objects. You may want to check out Javascript RegExp Library, which contains JavaScript regular expression functions and pre-written regexes for handling common tasks such as email validation and IP address formatting/validation.
  • Ruby has a built-in Regexp class, plus the StringScanner class (from the strscan library), which lets you step through a string one pattern at a time.
  • Python includes an extensive module called re.
  • Perl has regexes built into the language itself, and its syntax is the basis for PCRE (Perl Compatible Regular Expressions).

Writing regular expressions can be a little tricky, so you may find it helpful to check out some of the resources available on Regular Expression Tutorials and References. You might also want to read up on what makes for a good regex when using Python’s re module.

Now that you have an understanding of regexes, see the full reference guide, including symbols, ranges, grouping, assertions and some sample patterns:

Anchors
^ Start of string, or start of line in a multiline pattern
\A Start of string
$ End of string, or end of line in a multiline pattern
\Z End of string
\b Word boundary
\B Non-word boundary
\< Start of word
\> End of word
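A short Python sketch of how the anchors behave, using a made-up two-line string. Two assumptions to be aware of: Python’s re module treats \< and \> as literal angle brackets, so \b stands in for both here, and ^ and $ only match at line breaks when the re.MULTILINE flag is set.

```python
import re

# Two-line sample string invented for this example.
text = "cat concat\ncatalog cat"

# ^ and $ match only at the ends of the string by default...
start_of_string = re.findall(r"^cat", text)
end_of_string = re.findall(r"cat$", text)

# ...and at the start and end of every line in a multiline pattern.
start_of_line = re.findall(r"^cat", text, flags=re.MULTILINE)
end_of_line = re.findall(r"cat$", text, flags=re.MULTILINE)

# \A always means the start of the string, even in a multiline pattern.
anchored = re.findall(r"\Acat", text, flags=re.MULTILINE)

# \b requires a word boundary; \B requires that there is none.
whole_words = [m.start() for m in re.finditer(r"\bcat\b", text)]
inside_word = [m.start() for m in re.finditer(r"\Bcat\b", text)]

print(len(start_of_string), len(start_of_line))  # 1 2
print(len(end_of_string), len(end_of_line))      # 1 2
print(whole_words, inside_word)                  # [0, 19] [7]
```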
Quantifiers
* 0 or more
+ 1 or more
? 0 or 1
{3} Exactly 3
{3,} 3 or more
{3,5} 3, 4 or 5
Add a ? to a quantifier to make it ungreedy.
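To see the quantifiers in action, here is a minimal Python sketch. The sample strings are invented for the example.

```python
import re

numbers = "7 42 123 4567 123456"

exactly_3 = re.findall(r"\b\d{3}\b", numbers)     # ['123']
at_least_3 = re.findall(r"\b\d{3,}\b", numbers)   # ['123', '4567', '123456']
three_to_5 = re.findall(r"\b\d{3,5}\b", numbers)  # ['123', '4567']

# ? makes the preceding item optional (0 or 1).
spellings = re.findall(r"colou?r", "color colour")  # ['color', 'colour']

html = "<b>bold</b> and <i>italic</i>"

# Greedy: .+ grabs as much as possible, so the match runs to the last ">".
greedy = re.findall(r"<.+>", html)

# Ungreedy: .+? stops at the first ">" it can.
lazy = re.findall(r"<.+?>", html)

print(greedy)  # ['<b>bold</b> and <i>italic</i>']
print(lazy)    # ['<b>', '</b>', '<i>', '</i>']
```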
Assertions
?= Lookahead assertion
?! Negative lookahead
?<= Lookbehind assertion
?<! Negative lookbehind
?> Once-only Subexpression
?() Condition [if then]
?()| Condition [if then else]
?# Comment
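In a real pattern, each assertion is wrapped in parentheses, for example (?=...) for a lookahead. The Python sketch below shows the four lookaround forms and an inline comment on a made-up price list. Assertions check what surrounds a match without becoming part of it.

```python
import re

# Sample text invented for this example.
prices = "apples $5, pears 7 EUR, plums $12"

# Lookbehind: digits that come right after a dollar sign.
dollars = re.findall(r"(?<=\$)\d+", prices)          # ['5', '12']

# Lookahead: digits that are followed by " EUR".
euros = re.findall(r"\d+(?= EUR)", prices)           # ['7']

# Negative lookahead: whole numbers NOT followed by " EUR".
not_euros = re.findall(r"\b\d+\b(?! EUR)", prices)   # ['5', '12']

# Negative lookbehind: whole numbers NOT preceded by a dollar sign.
not_dollars = re.findall(r"(?<!\$)\b\d+", prices)    # ['7']

# (?#...) is an inline comment and does not affect matching.
commented = re.findall(r"\d+(?#one or more digits)", "a1b22")  # ['1', '22']

print(dollars, euros, not_euros, not_dollars, commented)
```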
Character Classes
\c Control character
\s White space
\S Non-white space
\d Digit
\D Non-digit
\w Word
\W Non-word
\x Hexadecimal digit
\O Octal digit
Groups and Ranges
. Any character except new line (\n)
(a|b) a or b
(...) Group
(?:...) Passive (non-capturing) group
[abc] Range (a or b or c)
[^abc] Not (a or b or c)
[a-q] Lower case letter from a to q
[A-Q] Upper case letter from A to Q
[0-7] Digit from 0 to 7
\x Group/subpattern number "x"
Ranges are inclusive.
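A few of these in Python, as a sketch with invented sample strings:

```python
import re

# Capturing groups: each (...) is numbered from left to right.
date = re.search(r"(\d{4})-(\d{2})-(\d{2})", "released 2023-09-14")
parts = date.groups()  # ('2023', '09', '14')

# A passive (non-capturing) group groups without taking a number.
surname = re.search(r"(?:Mr|Ms)\. (\w+)", "Ms. Rivera").group(1)  # 'Rivera'

# Alternation inside a group.
greys = re.findall(r"gr(?:a|e)y", "gray grey groy")  # ['gray', 'grey']

# Ranges and negated ranges.
octal = re.findall(r"[0-7]+", "octal 0755 and 89")  # ['0755']
letters_only = re.sub(r"[^abc]", "", "a1b2c3")      # 'abc'

# . matches any character except a new line.
dots = re.findall(r"a.c", "abc a\nc a-c")  # ['abc', 'a-c']

# \1 refers back to whatever group 1 matched (here: a repeated word).
repeats = re.findall(r"\b(\w+) \1\b", "it was the the best of of times")

print(parts, surname, greys, octal, letters_only, dots, repeats)
```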
POSIX
[:upper:] Upper case letters
[:lower:] Lower case letters
[:alpha:] All letters
[:digit:] Digits
[:alnum:] Digits and letters
[:xdigit:] Hexadecimal digits
[:punct:] Punctuation
[:blank:] Space and tab
[:space:] Blank characters
[:cntrl:] Control characters
[:graph:] Printed characters
[:print:] Printed characters and spaces
[:word:] Digits, letters and underscore
Escape Sequences
\ Escape following character
\Q Begin literal sequence
\E End literal sequence
Common Metacharacters
^ [ . $
{ * ( \
+ ) | ?
< >
The escape character is usually \
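The sketch below shows why escaping matters, using a made-up string. Python’s re module does not support \Q and \E, so the example uses re.escape to treat a whole string literally instead.

```python
import re

# Sample text invented for this example.
text = "Total: $4.50 (approx.) vs 4x50"

# Unescaped, . matches any character, so "4x50" matches too.
unescaped = re.findall(r"4.50", text)  # ['4.50', '4x50']

# Escaped, \$ and \. match a literal dollar sign and a literal dot.
escaped = re.findall(r"\$4\.50", text)  # ['$4.50']

# re.escape adds the backslashes for you when the text comes from elsewhere.
literal = "(approx.)"
found = re.findall(re.escape(literal), text)  # ['(approx.)']

print(unescaped, escaped, found)
```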
Special Characters
\n New line
\r Return
\t Tab
\v Vertical tab
\f Form feed
\xxx Octal character xxx
\xhh Hex character hh
Pattern Modifiers
g Global match
i * Case-insensitive
m * Multiline input treated as multiple lines
s * Treat string as single line
x * Allow comments and whitespace in pattern
e * Evaluate replacement
U * Ungreedy pattern
* PCRE modifier
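Python spells most of these modifiers as flags rather than letters after the pattern. The sketch below shows rough equivalents of i, m, s, and x. There is no g flag in Python because functions such as re.findall and re.sub already act on every match. The sample strings are made up.

```python
import re

# i: case-insensitive matching.
any_case = re.findall(r"regex", "Regex REGEX regex", flags=re.IGNORECASE)

# m: ^ and $ match at the start and end of every line.
first_words = re.findall(r"^\w+", "one two\nthree four", flags=re.MULTILINE)

# s: . also matches a new line, so the string is treated as a single line.
no_dotall = re.search(r"start.*end", "start\nend")
with_dotall = re.search(r"start.*end", "start\nend", flags=re.DOTALL)

# x: whitespace and comments are allowed inside the pattern.
phone = re.compile(r"""
    (\d{3})   # first three digits
    [-.\s]    # separator
    (\d{4})   # last four digits
""", re.VERBOSE)
digits = phone.search("call 555-0199").groups()

print(any_case)     # ['Regex', 'REGEX', 'regex']
print(first_words)  # ['one', 'three']
print(no_dotall is None, with_dotall is not None)  # True True
print(digits)       # ('555', '0199')
```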
String Replacement
$n nth non-passive group
$2 "xyz" in /^(abc(xyz))$/
$1 "xyz" in /^(?:abc)(xyz)$/
$` Before matched string
$' After matched string
$& Entire matched string
$+ Last matched string
Some regex implementations use \ instead of $.
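Python is one of the implementations that uses \ instead of $ in replacement strings. In this sketch, \1 and \2 stand for $1 and $2, \g<0> stands for the entire matched string ($&), and the text before and after a match is taken by slicing. The sample strings are invented.

```python
import re

# \1 and \2 refer to the first and second capturing groups.
swapped = re.sub(r"(\w+)@(\w+)\.com", r"\2 at \1", "alice@example.com")

# The two patterns from the table above: both replacements yield "xyz".
nested = re.sub(r"^(abc(xyz))$", r"\2", "abcxyz")
passive = re.sub(r"^(?:abc)(xyz)$", r"\1", "abcxyz")

# \g<0> is the entire matched string.
bracketed = re.sub(r"\d+", r"[\g<0>]", "room 12, floor 3")

# Text before and after the match, taken from the match positions.
text = "I love pizza"
m = re.search(r"love", text)
before, after = text[:m.start()], text[m.end():]

print(swapped)    # example at alice
print(nested, passive)  # xyz xyz
print(bracketed)  # room [12], floor [3]
print(repr(before), repr(after))  # 'I ' ' pizza'
```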
