
Regular expressions rock coz they can do cool stuff, but they suck coz no-one can understand them. I got sick of looking up the same basic things over and over again, so I've put this here as much for myself as anyone else. I can't take credit for all of it; some of it came from a site I found a while ago, but I don't remember the URL now. Sorry, whoever you are.

Operators

m// 'Match' operator
s/// 'Substitute' operator
tr/// 'Translate' operator
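
Here's a rough sketch of the three operators in action. The variable names and the sample sentence are made up for illustration, nothing more.

```perl
use strict;
use warnings;

my $text = "The cat sat on the mat";

# m// tests whether the pattern occurs anywhere in the string
my $has_cat = ($text =~ m/cat/) ? 1 : 0;

# s/// replaces the first match; copy first so $text stays untouched
(my $swapped = $text) =~ s/cat/dog/;

# tr/// maps characters one-for-one (here lower case to upper case)
(my $upper = $text) =~ tr/a-z/A-Z/;

print "$has_cat\n$swapped\n$upper\n";
```

The `(my $copy = $original) =~ ...` idiom is just a way to keep the original string around; applying the operator straight to `$text` would change it in place.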

Special Characters

. Any single character except newline (\n)
\b Word boundary (between word and non-word chars), ie: /\bcat\b/ matches "cat" but not "scat"
\B NOT a word boundary
\w Word character (letter, digit or underscore)
\W Non-word character
\d Digit
\D Non-digit
\s White-space
\S Non-white-space
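
A small sketch of a few of these, assuming the escapes are written with their backslashes (\b, \d, \w and so on). The sample strings are invented for the example.

```perl
use strict;
use warnings;

# \b marks a word boundary, so "cat" has to stand alone as a word
my $whole_word  = ("the cat sat" =~ /\bcat\b/) ? 1 : 0;   # matches
my $inside_word = ("scatter"     =~ /\bcat\b/) ? 1 : 0;   # no match

# \d+ grabs a run of digits, \w+ a run of word characters
my ($digits) = "order 66 shipped" =~ /(\d+)/;
my ($word)   = "  hello world"    =~ /(\w+)/;

# . matches any single character except a newline
my $dot_newline = ("a\nb" =~ /a.b/) ? 1 : 0;              # no match

print "$whole_word $inside_word $digits $word $dot_newline\n";
# prints: 1 0 66 hello 0
```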

Assertions (Position Definers)

^ Start of the string
$ End of the string
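
To see what the two anchors do, here's a minimal sketch with made-up test strings. The last line shows one quirk worth knowing: in Perl, $ also matches just before a newline at the very end of the string.

```perl
use strict;
use warnings;

my $starts      = ("catalog" =~ /^cat/)  ? 1 : 0;  # 1: "cat" is at the start
my $ends        = ("bobcat"  =~ /cat$/)  ? 1 : 0;  # 1: "cat" is at the end
my $exact       = ("cat"     =~ /^cat$/) ? 1 : 0;  # 1: whole string is "cat"
my $not_exact   = ("cats"    =~ /^cat$/) ? 1 : 0;  # 0: extra "s" before the end
my $trailing_nl = ("cat\n"   =~ /^cat$/) ? 1 : 0;  # 1: $ allows a final newline

print "$starts $ends $exact $not_exact $trailing_nl\n";
```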

Quantifiers (Numbers Of Characters)

n* Zero or more of 'n'
n+ One or more of 'n'
n? Zero or one 'n' (an optional 'n')

n{2} Exactly 2 of 'n'
n{2,} At least 2 (or more) of 'n'
n{2,4} From 2 to 4 of 'n'
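
One way to get a feel for the quantifiers is to try each one against strings of 0 to 5 'n' characters and see which lengths match in full. This is only a sketch; the helper name `matching_lengths` is something I made up for it.

```perl
use strict;
use warnings;

# Strings of 0 to 5 'n' characters to test each quantifier against
my @samples = map { 'n' x $_ } 0 .. 5;

# Return the lengths of the samples that the pattern matches in full
sub matching_lengths {
    my ($re) = @_;
    return join ',', map { length($_) } grep { /^(?:$re)$/ } @samples;
}

for my $pattern ('n*', 'n+', 'n?', 'n{2}', 'n{2,}', 'n{2,4}') {
    print "$pattern => ", matching_lengths($pattern), "\n";
}
# n*     => 0,1,2,3,4,5
# n+     => 1,2,3,4,5
# n?     => 0,1
# n{2}   => 2
# n{2,}  => 2,3,4,5
# n{2,4} => 2,3,4
```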

Groupings

() Parentheses to group expressions
(n|a) Either 'n' or 'a'
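
A short sketch of grouping with the | (either/or) character inside parentheses. The words and the "colour=grey" string are just sample data. It also shows that whatever a group matched lands in $1, $2 and so on, which the trim snippet further down relies on.

```perl
use strict;
use warnings;

# Either/or inside a group: 'gr', then 'a' or 'e', then 'y'
my @hits = grep { /^gr(a|e)y$/ } qw(gray grey groy);

# Parentheses also capture what they matched, in order, as $1, $2, ...
my ($key, $value) = "colour=grey" =~ /^(\w+)=(\w+)$/;

# A quantifier after a group applies to the whole group
my $repeated = ("abababab" =~ /^(ab){3,}$/) ? 1 : 0;

print "@hits\n$key $value\n$repeated\n";
```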

Character Classes

[1-6] A number between 1 and 6
[c-h] A lower case character between c and h
[D-M] An upper case character between D and M
[^a-z] Any character that is NOT a lower case character between a and z
[_a-zA-Z] An underscore or any letter of the alphabet
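
A quick sketch of character classes at work, with made-up inputs. Note that each class matches exactly one character, so it takes a quantifier like + to match a whole word.

```perl
use strict;
use warnings;

my $in_range  = ("4" =~ /^[1-6]$/)  ? 1 : 0;   # 1: 4 is between 1 and 6
my $out_range = ("7" =~ /^[1-6]$/)  ? 1 : 0;   # 0: 7 is outside the range
my $negated   = ("Q" =~ /^[^a-z]$/) ? 1 : 0;   # 1: Q is not a lower case letter

# Keep only the words made up entirely of letters and underscores
my @names = grep { /^[_a-zA-Z]+$/ } qw(foo _bar baz9 Qux);

print "$in_range $out_range $negated\n@names\n";
```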

And now for a nasty example:
^.{2}[a-z]{1,2}_?[0-9]*([1-6]|[a-f])[^1-9]{2}a+$

Which means:
A string beginning with any two characters, followed by either 1 or 2 lower case characters, followed by an optional underscore, followed by zero or more digits, followed by either a number between 1 and 6 or a character between a and f, followed by two characters that are not digits between 1 and 9, followed by one or more 'a' characters at the end of the string.

Whew!

So this string would be a match (I think!):
Axi_234b0Gaaa
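
Rather than guess, it's easy enough to run it. The sketch below assumes the middle group of the pattern is ([1-6]|[a-f]), which is what the description above spells out. The second test string is one I made up to show a near miss.

```perl
use strict;
use warnings;

my $re = qr/^.{2}[a-z]{1,2}_?[0-9]*([1-6]|[a-f])[^1-9]{2}a+$/;

# In list context a match returns its captures, so $group is undef on failure
my ($group) = "Axi_234b0Gaaa" =~ $re;
my $matches = defined $group ? 1 : 0;

# Near miss: "12" sits where two characters that are not 1-9 are required
my $near_miss = ("Axi_234b12aaa" =~ $re) ? 1 : 0;

print "matches: $matches, group: $group, near miss: $near_miss\n";
# prints: matches: 1, group: b, near miss: 0
```

So the string does match, with "Ax" as the two leading characters, "i" as the lower case run, "234" as the digits, "b" as the either/or group and "0G" as the two characters that are not 1 to 9.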

Useful Perl Snippets

$mystring =~ s/^\s*(.*?)\s*$/$1/; Trim leading and trailing whitespace from $mystring
$mystring =~ tr/A-Z/a-z/; Convert $mystring to all lower case
$mystring =~ tr/a-z/A-Z/; Convert $mystring to all upper case
tr/a-zA-Z//s; Compress character runs, eg: bookkeeper -> bokeper
tr/a-zA-Z/ /cs; Convert each run of non-alpha characters to a single space
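
Here are the snippets above strung together into something you can run. It's a sketch: the sample strings and variable names are my own, and the trim pattern is written with \s for whitespace.

```perl
use strict;
use warnings;

# Trim leading and trailing whitespace
my $trimmed = "   hello world  ";
$trimmed =~ s/^\s*(.*?)\s*$/$1/;

# Lower and upper case
(my $lower = "Hello World") =~ tr/A-Z/a-z/;
(my $upper = "Hello World") =~ tr/a-z/A-Z/;

# Squeeze runs of the same letter down to one
(my $squeezed = "bookkeeper") =~ tr/a-zA-Z//s;

# Turn each run of non-letters into a single space
(my $spaced = "foo--bar!!  baz") =~ tr/a-zA-Z/ /cs;

print join("\n", $trimmed, $lower, $upper, $squeezed, $spaced), "\n";
```

Without a `=~` in front, tr/// (like m// and s///) works on `$_`, which is why the last two snippets in the list have no variable named.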