Regular expressions

A regular expression (regex for short) is a pattern used to identify certain lines or parts of character strings.

Regular expressions are a basic concept in information technology. They let you describe parts of strings, and these descriptions can contain variable parts. That is what makes them so interesting.

Many Unix programs and programming languages use regexes for searching (also known as pattern matching) and, where applicable, for replacing. The following list is far from complete: vi, grep, sed, Perl, PHP, and many more.
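As a small illustration of searching and replacing, here is a minimal sketch in Python, whose `re` module uses a Perl-like syntax. Python is not on the list above; it is used here only for illustration, and the sample text and patterns are made up.

```python
import re

text = "The quick brown fox"

# Searching (pattern matching): find the first word that starts with "b".
match = re.search(r"b\w+", text)
found = match.group(0) if match else None

# Replacing: turn every run of whitespace into a single underscore.
replaced = re.sub(r"\s+", "_", text)

print(found)     # brown
print(replaced)  # The_quick_brown_fox
```

The same two operations exist in the tools named above, for example as a search in grep and as a substitution command in sed.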

As you can see in the table below, the regular expression syntax is quite similar in grep, sed and Perl. Bash globs have a much more limited syntax. They use a different character for the same function (? instead of .) and the same character for a slightly different function (* instead of .*). This probably has to do with the usefulness of the dot in filenames, where it does more than mark file extensions.
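The difference between glob and regex syntax can be tried out side by side. The sketch below assumes Python's `fnmatch` module as a stand-in for shell globbing and `re` as a stand-in for Perl-style regexes; the file names are invented.

```python
import fnmatch
import re

names = ["a.txt", "ab.txt", "abc.txt"]

# Glob style: ? is any one character, * is any run of characters (or none).
glob_one = [n for n in names if fnmatch.fnmatchcase(n, "a?.txt")]
glob_any = [n for n in names if fnmatch.fnmatchcase(n, "a*.txt")]

# Regex style: . is any one character, .* is any run of characters (or none).
# The literal dot must be escaped and the pattern anchored explicitly.
regex_one = [n for n in names if re.search(r"^a.\.txt$", n)]
regex_any = [n for n in names if re.search(r"^a.*\.txt$", n)]

print(glob_one, regex_one)  # ['ab.txt'] ['ab.txt']
print(glob_any, regex_any)  # all three names, twice
```

Note that a glob always has to cover the whole name, while a regex matches anywhere unless it is anchored with ^ and $.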

Regular expressions are also used in text editors and other applications for searching and replacing. The syntax may vary from program to program, but the basic functions remain the same. (Feel free to add your favorite application :)

match…                                Bash-glob        grep      sed       Perl
…any one character                    ?                .         .         .
…any characters (or no character)     *                .*        .*        .*      (see repetitions)
…one character from a set             [ ]              [ ]       [ ]       [ ]
…anything but the given set           [^ ]             [^ ]      [^ ]      [^ ]
…ranges in the set (e.g. a to z)      [a-z]            [a-z]     [a-z]     [a-z]

repetitions                           Bash-glob        grep      sed       Perl
zero or more of the preceding match   n.a.             *         *         *
zero or one of the preceding match    n.a.             \?        \?        ?
one or more of the preceding match    n.a.             \+        \+        +
exactly x times                       n.a.             \{x\}     \{x\}     {x}
x times or more                       n.a.             \{x,\}    \{x,\}    {x,}
x through y times                     n.a.             \{x,y\}   \{x,y\}   {x,y}

positions                             Bash-glob        grep      sed       Perl
Beginning of line                     Beg. of expr.    ^         ^         ^
End of line                           End of expr.     $         $         $
Word boundary                         ' '              \b        \b        \b
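The repetition and position operators from the Perl column can be checked with a few quick tests. This is a sketch in Python, on the assumption that its `re` module follows the Perl-style syntax for these operators; the helper `matches` and the sample strings are hypothetical.

```python
import re

def matches(pattern, text):
    """Return True if the pattern is found anywhere in text."""
    return re.search(pattern, text) is not None

checks = {
    "zero or more":  matches(r"^ab*$", "a"),          # True: no b is fine
    "zero or one":   matches(r"^colou?r$", "color"),  # True: the u is optional
    "one or more":   matches(r"^ab+$", "a"),          # False: needs at least one b
    "exactly 3":     matches(r"^a{3}$", "aaa"),       # True
    "3 or more":     matches(r"^a{3,}$", "aaaaa"),    # True
    "2 through 4":   matches(r"^a{2,4}$", "aaaaa"),   # False: five is too many
    "word boundary": matches(r"\bcat\b", "a cat sat"),    # True
    "inside a word": matches(r"\bcat\b", "concatenate"),  # False
}

for name, result in checks.items():
    print(f"{name}: {result}")
```

In grep and sed with basic regular expressions, the same tests would need the backslashed forms from the table, such as \? and \{3\}.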

Links

http://www.weitz.de/regex-coach/ - An interactive program that lets you analyze regular expressions.

http://www.regular-expressions.info/ - Probably more than you want to know about regex…

http://www.dotnetcoders.com/web/Learning/Regex/RegexTester.aspx - An alternative tester, specifically for .NET, but it seems to work "normally" and also checks "groups" (parentheses).

Regular Expression Tester

Examples

  • In this UGU-admin-tip, there are regexes to match hidden files. Since '.*' is not good (it also matches '..' and thus goes up in the directory hierarchy when used in recursive commands), my suggestion is '.[^.]*' (shell-glob) or '^\.[^\.]' (sed, grep and the like). This matches everything that starts with a dot and is followed by a 'non-dot'.
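
The hidden-file patterns above can be compared on a small, made-up directory listing. This sketch assumes Python's `fnmatch` as a stand-in for the shell glob; `fnmatch` writes the negated set as [!.] rather than [^.], so the glob is adapted accordingly.

```python
import fnmatch
import re

entries = [".", "..", ".bashrc", ".ssh", "notes.txt"]

# Naive glob: also catches "." and "..", which is the problem described above.
glob_naive = [e for e in entries if fnmatch.fnmatchcase(e, ".*")]

# Suggested glob: a dot followed by a non-dot, then anything.
# fnmatch spells the negated set [!.]; Bash also accepts [^.].
glob_safe = [e for e in entries if fnmatch.fnmatchcase(e, ".[!.]*")]

# Suggested regex (sed, grep and the like): a dot at the start, then a non-dot.
regex_safe = [e for e in entries if re.search(r"^\.[^\.]", e)]

print(glob_naive)  # ['.', '..', '.bashrc', '.ssh']
print(glob_safe)   # ['.bashrc', '.ssh']
print(regex_safe)  # ['.bashrc', '.ssh']
```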
Last modified: 2008/07/20 21:08 (external edit)