PHP Supported Metacharacters (You Must Know Them)


This article collects reference tables for the regular expression syntax that PHP supports. Regular expressions, usually shortened to regex, have become a language in their own right, much as SQL has in the database world. Regex is a language for parsing and manipulating text. Its core job is to search for a pattern in text; on top of that, we can replace matches and validate data. The tables below list the metacharacters PHP supports for regex.
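The three uses mentioned above (search, replace, validate) can be sketched as follows. This is a minimal illustration, assuming PHP's PCRE-based `preg_*` functions; the sample text and variable names are made up for the example.

```php
<?php
// Hypothetical sample text; any string would do.
$text = 'Order 66 shipped on 2004-08-29.';

// Search: preg_match() returns 1 when the pattern is found,
// and fills $m[0] with the matched text.
$found = preg_match('/\d{4}-\d{2}-\d{2}/', $text, $m);

// Replace: preg_replace() substitutes every match of the pattern.
$masked = preg_replace('/\d/', '#', $text);

// Validate: anchor the pattern so the whole string has to match.
$isDate = preg_match('/\A\d{4}-\d{2}-\d{2}\z/', '2004-08-29') === 1;
```

Patterns are written in single quotes here so that PHP passes the backslashes through to the regex engine unchanged.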

Before the tables, here is a summary of them as a mind map:

[Image: mind map of PHP metacharacters]

PHP character representations

\a Alert, \x07
\b Backspace, \x08, supported only in character class
\e Esc, \x1B
\n Newline, \x0A
\r Carriage return, \x0D
\f Form feed, \x0C
\t Horizontal tab, \x09
\octal Character specified by a three-digit octal code
\xhex Character specified by a one- or two-digit hexadecimal code
\x{hex} Character specified by any hexadecimal code
\cchar Named control character
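As a rough illustration of the table above, the same tab character can be written in several of the listed ways. This sketch assumes the `preg_match()` function and single-quoted pattern strings; the variable names are hypothetical.

```php
<?php
// Double quotes: PHP itself turns \t into a real tab character.
$subject = "a\tb";

// Single-quoted patterns hand the escape sequences to the regex engine.
$byName  = preg_match('/a\tb/', $subject);    // \t   horizontal tab
$byHex   = preg_match('/a\x09b/', $subject);  // \x09 same character by hex code
$byOctal = preg_match('/a\011b/', $subject);  // \011 same character by octal code
$byCtrl  = preg_match('/a\cIb/', $subject);   // \cI  Control-I, which is also tab

// \x{...} with a code above FF needs the u modifier (UTF-8 mode).
// "\xE2\x98\xBA" is the UTF-8 byte sequence for U+263A.
$smiley = preg_match('/\x{263A}/u', "\xE2\x98\xBA");
```

Each call should return 1, since all five patterns describe a character that is present in the subject.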

PHP character classes and class-like constructs

[...] A single character listed or contained within a listed range
[^...] A single character not listed and not contained within a listed range
[:class:] POSIX-style character class (valid only within a regex character class)
. Any character except newline (unless single-line mode, /s)
\C One byte (however, this might corrupt a Unicode character stream)
\w Word character, [a-zA-Z0-9_]
\W Nonword character, [^a-zA-Z0-9_]
\d Digit character, [0-9]
\D Nondigit character, [^0-9]
\s Whitespace character, [ \n\r\f\t]
\S Nonwhitespace character, [^ \n\r\f\t]
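A few of the classes above in action. This is only a sketch, assuming the `preg_*` functions; the sample string and variable names are invented for the example.

```php
<?php
$s = "Id: 42\tok";   // hypothetical sample text

preg_match_all('/\d/', $s, $digits);      // $digits[0] is ['4', '2']
preg_match_all('/\w+/', $s, $words);      // $words[0] is ['Id', '42', 'ok']
$parts = preg_split('/\s+/', $s);         // ['Id:', '42', 'ok']

// A POSIX class is only valid inside a bracketed character class.
preg_match('/[[:upper:]]/', $s, $upper);  // $upper[0] is 'I'

// Negated class: first character that is neither word nor whitespace.
preg_match('/[^\w\s]/', $s, $punct);      // $punct[0] is ':'

// The dot skips newlines unless the s modifier is set.
$dotDefault = preg_match('/a.b/', "a\nb");   // 0
$dotSingle  = preg_match('/a.b/s', "a\nb");  // 1
```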

PHP anchors and zero-width tests

^ Start of string, or the point after any newline if in multiline match mode, /m
\A Start of search string, in all match modes
$ End of search string, or the point before a string-ending newline, or before any newline if in multiline match mode, /m
\Z End of string, or the point before a string-ending newline, in any match mode
\z End of string, in any match mode
\G Beginning of current search
\b Word boundary; position between a word character (\w), and a nonword character (\W), the start of the string, or the end of the string
\B Not-word-boundary
(?=...) Positive lookahead
(?!...) Negative lookahead
(?<=...) Positive lookbehind
(?<!...) Negative lookbehind
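The differences between the anchors are easiest to see on a string that ends with a newline. The following is a small sketch, assuming the `preg_*` functions; the sample strings and variable names are hypothetical.

```php
<?php
$lines = "first\nsecond\n";   // hypothetical two-line subject

preg_match_all('/^\w+/', $lines, $plain);   // $plain[0] is ['first']
preg_match_all('/^\w+/m', $lines, $multi);  // $multi[0] is ['first', 'second']

// $ and \Z tolerate one string-ending newline; \z does not.
$dollar = preg_match('/second$/', $lines);   // 1
$bigZ   = preg_match('/second\Z/', $lines);  // 1
$smallZ = preg_match('/second\z/', $lines);  // 0

// \b only matches "cat" as a whole word.
$count = preg_match_all('/\bcat\b/', 'concat cat catalog');  // 1

// Positive lookahead: digits followed by "px", without consuming "px".
preg_match('/\d+(?=px)/', '12em 34px', $px);       // $px[0] is '34'

// Negative lookbehind: a number not preceded by a dollar sign.
preg_match('/(?<!\$)\b\d+/', '$10 20', $plainNum); // $plainNum[0] is '20'
```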

PHP comments and mode modifiers

i Case-insensitive matching
m ^ and $ match next to embedded \n
s Dot(.) matches newline
x Ignore whitespace, and allow comments (#) in pattern
U Inverts greediness of all quantifiers: * becomes lazy, and *? greedy
A Force match to start at beginning of subject string
D Force $ to match end of string instead of before the string-ending newline. Overridden by multiline mode
u Treat regular expression and subject strings as strings of multibyte UTF-8 characters
(?mode) Turn listed modes (one or more of imsxU) on for the rest of the subexpression
(?-mode) Turn listed modes (one or more of imsxU) off for the rest of the subexpression
(?mode:...) Turn mode (xsmi) on within parentheses
(?-mode:...) Turn mode (xsmi) off within parentheses
(?#...) Treat substring as a comment
#... Rest of line is treated as a comment in x mode
\Q Quotes all following regex metacharacters
\E Ends a span started with \Q
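Several of the modifiers above can be tried out with short patterns. This is a sketch under the assumption that the patterns are passed to `preg_match()` with the modifier letters after the closing delimiter; all sample strings and variable names are made up.

```php
<?php
$ci = preg_match('/php/i', 'PHP');   // 1: i ignores case

// x mode: whitespace and # comments inside the pattern are ignored.
$datePattern = '/
    (\d{4})   # year
    -
    (\d{2})   # month
/x';
preg_match($datePattern, '2004-08', $d);       // $d[1] is '2004', $d[2] is '08'

// U inverts greediness: .+ becomes lazy.
preg_match('/<.+>/', '<a><b>', $greedy);       // $greedy[0] is '<a><b>'
preg_match('/<.+>/U', '<a><b>', $lazy);        // $lazy[0] is '<a>'

// D makes $ match only at the very end of the string.
$loose  = preg_match('/end$/', "end\n");       // 1
$strict = preg_match('/end$/D', "end\n");      // 0

// (?i) switches a mode on from that point in the pattern.
$inlineHit  = preg_match('/a(?i)b/', 'aB');    // 1
$inlineMiss = preg_match('/a(?i)b/', 'AB');    // 0

// \Q...\E quotes metacharacters, so + is a literal plus sign.
$quoted = preg_match('/\Q1+1=2\E/', '1+1=2');  // 1
```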

PHP grouping, capturing, conditional, and control

(...) Group subpattern and capture submatch into \1, \2, ...
(?P<name>...) Group subpattern, and capture submatch into named capture group, name
\n Contains the results of the nth earlier submatch from a parentheses capture group, or a named capture group
(?:...) Groups subpattern, but does not capture submatch
(?>...) Atomic grouping
...|... Try subpatterns in alternation
* Match 0 or more times
+ Match 1 or more times
? Match 1 or 0 times
{n} Match exactly n times
{n,} Match at least n times
{x,y} Match at least x times, but no more than y times
*? Match 0 or more times, but as few times as possible
+? Match 1 or more times, but as few times as possible
?? Match 0 or 1 times, but as few times as possible
{n,}? Match at least n times, but as few times as possible
{x,y}? Match at least x times, no more than y times, and as few times as possible
*+ Match 0 or more times, and never backtrack
++ Match 1 or more times, and never backtrack
?+ Match 0 or 1 times, and never backtrack
{n,}+ Match at least n times, and never backtrack
{x,y}+ Match at least x times, no more than y times, and never backtrack
(?(condition)...|...) Match with if-then-else pattern. The condition can be the number of a capture group, or a lookahead or lookbehind construct
(?(condition)...) Match with if-then pattern. The condition can be the number of a capture group, or a lookahead or lookbehind construct
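To make the grouping, quantifier, and conditional entries more concrete, here is a short sketch. It assumes the `preg_match()` function; the subjects and variable names are hypothetical.

```php
<?php
// Numbered and named capture groups side by side.
preg_match('/(?P<year>\d{4})-(\d{2})/', '2004-08', $m);
// $m['year'] is '2004', $m[1] is '2004', $m[2] is '08'

// Backreference \1 repeats whatever group 1 captured (a doubled word here).
preg_match('/\b(\w+) \1\b/', 'it is is fine', $dup);   // $dup[0] is 'is is'

// Greedy, lazy, and possessive versions of the same quantifier.
preg_match('/".*"/', '"a" "b"', $greedy);    // $greedy[0] is '"a" "b"'
preg_match('/".*?"/', '"a" "b"', $lazy);     // $lazy[0] is '"a"'
$possessive = preg_match('/".*+"/', '"a" "b"');
// 0: .*+ takes the closing quote too and never gives it back

// Conditional: a closing parenthesis is required only if group 1
// (the opening parenthesis) took part in the match.
$cond = '/^(\()?\d+(?(1)\))$/';
$both    = preg_match($cond, '(42)');   // 1
$neither = preg_match($cond, '42');     // 1
$onlyOne = preg_match($cond, '(42');    // 0
```

Note that a named group also gets a number, which is why the year shows up under both `$m['year']` and `$m[1]`.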



Tag: metacharacter, metasequence, regular expression Category: PHP Basic Post: August 29th 2004
