TextWrangler & Regular Expressions Primer

TextWrangler & regexp

Think of regexp as an advanced “find and replace”

With regexp you can look for characters or metacharacters

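Selectors

\t   tab
\r   line break (other tools usually write \n; Windows files end lines with \r\n)
\s   any whitespace character (space, tab or line break)
\d   any digit
\w   any word character (letter, digit or underscore)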

Examples

Find semicolons and replace them with tabs

Find empty lines and remove them

Find a literal point (full stop)
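
The three searches above can also be tried outside TextWrangler. The following is a minimal sketch that assumes Python's `re` module as a stand-in for the Find dialog; the sample text and variable names are made up for illustration. Note that Python writes line breaks as \n.

```python
import re

# Hypothetical sample data, not taken from the slides.
text = "name;city\n\nAnna;Milano\n\n\nLuca;Torino\n"

# Find semicolons and replace them with tabs.
tabbed = re.sub(r";", "\t", text)

# Remove empty lines: replace each run of line breaks with a single one.
no_blank = re.sub(r"\n+", "\n", tabbed)

# Find a literal point: the dot is escaped with a backslash,
# otherwise it would mean "any character".
points = re.findall(r"\.", "3.14 and 2.71")

print(repr(no_blank))
print(points)
```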

Metacharacters ^ $ ( ) [ ] { } \ | . * + ?

Escaping metacharacters: adding a backslash before a metacharacter tells the search to treat it as a literal character

\^ \$ \( \) \[ \] \{ \} \\ \| \. \* \+ \?
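
As a rough illustration of why escaping matters, here is a small Python sketch (Python's `re` module is assumed as a stand-in for TextWrangler; the sample string is hypothetical). It compares an unescaped pattern with an escaped one, and shows `re.escape`, which adds the backslashes automatically.

```python
import re

# Hypothetical sample string containing several metacharacters.
text = "Total (approx.): $3.50 + tax"

# Unescaped, "$3.50" does not find the price: $ means "end of line"
# and . means "any character".
unescaped = re.findall(r"$3.50", text)

# Escaped by hand, every metacharacter is taken literally.
escaped = re.findall(r"\$3\.50", text)

# re.escape adds the backslashes for us.
auto = re.findall(re.escape("(approx.)"), text)

print(unescaped, escaped, auto)
```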

Ex. 1
•  Go to “operabase > performances > Season 13/14”;
•  copy & paste the Germany list of theatres;
•  replace “ \\” with a line break;
•  replace commas with tabs.

Ex. 1b
•  Go to “List of countries by carbon dioxide emissions” on Wikipedia, copy & paste the list;
•  replace points with commas;
•  remove commas between numbers;
•  get rid of the “sources” column.

Any character

.   any single character (except a line break)

Repetitions

+   one or more (greedy: extends to the last possible match)
*   zero or more (greedy: extends to the last possible match)
+?   one or more (lazy: stops at the first possible match)
*?   zero or more (lazy: stops at the first possible match)
{3}   exact number of repetitions (here, three)

Examples

\s+   match one or more whitespace characters
cats*   match “cat” and “cats” (and also “catss”, “catsss”, ...)
value.*   match “value” and any characters after it, up to the end of the line
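
The repetition operators are easiest to compare side by side. This is a minimal sketch in Python (assumed here as a stand-in for TextWrangler's search; all sample strings are invented), including the difference between greedy and lazy matching.

```python
import re

# \s+ : one or more whitespace characters, collapsed here to a single space.
collapsed = re.sub(r"\s+", " ", "too    many   spaces")

# cats* : "cat" followed by zero or more "s".
cats = re.findall(r"cats*", "cat cats catss dog")

# value.* : "value" and everything after it on the same line.
rest = re.findall(r"value.*", "value = 42 units")

# Greedy vs lazy on a hypothetical HTML fragment.
html = "<b>one</b> and <b>two</b>"
greedy = re.findall(r"<b>.+</b>", html)   # runs until the last </b>
lazy = re.findall(r"<b>.+?</b>", html)    # stops at the first </b>

# {3} : exactly three repetitions.
three = re.findall(r"\d{3}", "12 345 6789")

print(collapsed, cats, rest, greedy, lazy, three, sep="\n")
```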

Group of characters []

Group of characters negation [^]

Examples

[azm]   match “a”, “z” or “m”
[0-9]   match digits
[a-z]   match lowercase letters
[A-Z]   match uppercase letters
[A-Za-z]   match both upper & lowercase letters ([A-z] would also match [ \ ] ^ _ `)
[0-9,.]   match digits, commas and points
[^b]   match any character except “b”
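
The character groups above can be checked quickly with a short script. This sketch assumes Python's `re` module rather than TextWrangler, and the sample strings are made up; it also shows why [A-z] is a trap compared with [A-Za-z].

```python
import re

azm = re.findall(r"[azm]", "amazement")
digits = re.findall(r"[0-9]", "Room 42b")
lower = re.findall(r"[a-z]", "Room 42b")
upper = re.findall(r"[A-Z]", "Room 42b")

# A group followed by + matches a whole run of allowed characters.
number = re.findall(r"[0-9,.]+", "1,234.5 kg")

# Negation: every character except "b".
not_b = re.findall(r"[^b]", "abba")

# [A-z] spans the ASCII range from "A" to "z", which includes
# the symbols [ \ ] ^ _ ` that sit between "Z" and "a".
too_wide = re.findall(r"[A-z]", "a_Z^")
letters = re.findall(r"[A-Za-z]", "a_Z^")

print(azm, digits, lower, upper, number, not_b, too_wide, letters, sep="\n")
```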

Ex. 2
•  Go to “Craigslist Milano for sale / wanted”;
•  view the page source code;
•  find all post links and descriptions;
•  for each line, keep the URL and the description.
Hint: the link structure is <a href="[URL]">[DESCRIPTION]</a>

Capture ()

Parentheses capture the text they match, so it can be reused in the replacement as \1, \2, ...

Examples

Find a series of digits and write “number: ” before them
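
One possible way to write this search and replace, sketched in Python (assumed as a stand-in for TextWrangler; the sample sentence is invented). The parentheses capture the digits, and \1 in the replacement puts them back.

```python
import re

# Hypothetical sample text.
text = "Order 66 shipped in 3 boxes"

# (\d+) captures a series of digits; \1 re-inserts the captured text.
labelled = re.sub(r"(\d+)", r"number: \1", text)

print(labelled)
```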

Examples

Find dates followed by a time, like “3-feb-1984 10:23”, and split them into parts
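
A sketch of how the date example could be split with several capture groups, again assuming Python's `re` module; the pattern and the tab-separated output order are illustrative choices, not the only possible solution.

```python
import re

stamp = "3-feb-1984 10:23"

# One group per part: day, month, year, hours, minutes.
pattern = r"(\d+)-([a-z]+)-(\d{4}) (\d+):(\d+)"

# Read the parts directly...
parts = re.search(pattern, stamp).groups()

# ...or rearrange them in a replacement, here as year, month, day,
# hours, minutes separated by tabs.
rearranged = re.sub(pattern, r"\3\t\2\t\1\t\4\t\5", stamp)

print(parts)
print(repr(rearranged))
```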

Start/end of line

^   line start
$   line end
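
The anchors can be tried with a few lines of Python (assumed here as a stand-in for TextWrangler; the sample text is made up). One difference to keep in mind: in Python, ^ and $ only match at every line when the `re.MULTILINE` flag is set.

```python
import re

# Hypothetical sample text with three lines.
lines = "apple pie\npine apple\napple tart"

# ^apple : "apple" only at the start of a line.
starts = re.findall(r"^apple", lines, flags=re.MULTILINE)

# apple$ : "apple" only at the end of a line.
ends = re.findall(r"apple$", lines, flags=re.MULTILINE)

# ;$ : remove a semicolon only when it closes a line.
no_trailing = re.sub(r";$", "", "a;b;\nc;d;", flags=re.MULTILINE)

print(starts, ends, repr(no_trailing))
```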
