std::regex is a C++ Standard Library class (since C++11, in the header <regex>) for regular expressions.
The default regular expression notation is that of ECMAScript [1], but a std::regex can alternatively use POSIX (basic or extended), awk, grep or egrep notation [2].
Character classes, adapted from [1]:
| Character class | Description |
|---|---|
| [[:alnum:]] | alphanumeric character |
| [[:alpha:]] | alphabetic character |
| [[:blank:]] | blank character |
| [[:cntrl:]] | control character |
| [[:digit:]] | decimal digit character |
| [[:graph:]] | character with graphical representation |
| [[:lower:]] | lowercase letter |
| [[:print:]] | printable character |
| [[:punct:]] | punctuation mark character |
| [[:space:]] | whitespace character |
| [[:upper:]] | uppercase letter |
| [[:xdigit:]] | hexadecimal digit character |
| [[:d:]] | decimal digit character |
| [[:w:]] | word character |
| [[:s:]] | whitespace character |
The example shows the use of:
- .: any single character
- [[:digit:]]: a digit
- ?: zero or one repeat of the preceding item
- +: one or more repeats of the preceding item
- *: zero or more repeats of the preceding item
- {2}: exactly two repeats of the preceding item
#include <cassert>
#include <regex>
#include <string>
int main()
{
  assert(!std::regex_match("", std::regex("."))); //One anything
  assert(!std::regex_match("", std::regex("[[:digit:]]"))); //One digit
  assert( std::regex_match("", std::regex("[[:digit:]]?"))); //Zero or one digit
  assert(!std::regex_match("", std::regex("[[:digit:]]+"))); //One or more digits
  assert( std::regex_match("", std::regex("[[:digit:]]*"))); //Zero or more digits
  assert( std::regex_match("", std::regex("[[:digit:]]{0}"))); //Zero digits

  assert( std::regex_match("1", std::regex("."))); //One anything
  assert( std::regex_match("1", std::regex("[[:digit:]]"))); //One digit
  assert( std::regex_match("1", std::regex("[[:digit:]]?"))); //Zero or one digit
  assert( std::regex_match("1", std::regex("[[:digit:]]+"))); //One or more digits
  assert( std::regex_match("1", std::regex("[[:digit:]]*"))); //Zero or more digits
  assert( std::regex_match("1", std::regex("[[:digit:]]{1}"))); //One digit

  assert(!std::regex_match("12", std::regex("."))); //One anything
  assert(!std::regex_match("12", std::regex("[[:digit:]]"))); //One digit
  assert(!std::regex_match("12", std::regex("[[:digit:]]?"))); //Zero or one digit
  assert( std::regex_match("12", std::regex("[[:digit:]]+"))); //One or more digits
  assert( std::regex_match("12", std::regex("[[:digit:]]*"))); //Zero or more digits
  assert( std::regex_match("12", std::regex("[[:digit:]]{2}"))); //Two digits
}

The example shows the use of:
- |: or
- (): group
- \\.: a literal dot. The backslash stops the dot from acting as a wildcard. Because the backslash is itself an escape character in a C++ string literal, it must be escaped by another backslash
#include <cassert>
#include <regex>
#include <string>
//A (simplified) Benelux (Dutch, Flemish, Luxembourg) URL:
// - has one or more alphanumeric characters
// - ends in '.nl', '.be' or '.lu'
int main()
{
  const std::regex benelux_url("[[:alnum:]]+\\.(nl|be|lu)");
  assert( std::regex_match("nu.nl", benelux_url));
  assert( std::regex_match("k3.be", benelux_url));
  assert( std::regex_match("start.lu", benelux_url));
  assert(!std::regex_match("lemonde.fr", benelux_url));
  assert(!std::regex_match("nlbelu", benelux_url));
}

Example programs and code snippets
- RegexTester: tool to test regular expressions
- is_dutch_postal_code: function to test for a Dutch postal code (e.g. 1234 AB)
- std::regex homepage
- cplusplus.com's ECMA script page
- C++ Weekly, Episode 62: std::regex
- C++ Weekly, Episode 74: std::regex optimize
- [1] Bjarne Stroustrup. The C++ Programming Language (4th edition). 2013. ISBN: 978-0-321-56384-2. Page 1071, 37.6 'Advice', item 3: 'The default regular expression notation is that of ECMAScript'
- [2] Bjarne Stroustrup. The C++ Programming Language (4th edition). 2013. ISBN: 978-0-321-56384-2. Page 1071, 37.6 'Advice', item 9: 'regex can use ECMAScript, POSIX, awk, grep and egrep notation'
