Category: Regular Expressions

Regular Expression To Find Single Ampersands In Text

22 September, 2008 | Regular Expressions | No comments

Encoding special characters in a block of HTML or other code can be a pain because there might already be ampersands in it that are part of an existing encoding. An ampersand might already have been encoded as &amp;, or it might appear in the code as part of an if statement or similar.

Use the following regular expression to find any ampersand that hasn’t already been encoded.

([^&])&(?!#?[a-zA-Z0-9]{2,6};|\$|&)

When using replace, you can turn any unencoded ampersand into &amp; by using the following replacement string.

$1&amp;

The only problem with this expression is that it also matches a single & used as a bitwise operator in code, so those operators will be encoded as well. A doubled && is left alone.
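
As a rough sketch of the find and replace described above, here it is in Python. The post does not name a language, so Python's re module is my assumption, and encode_ampersands is a made-up helper name. Python writes the $1 backreference as \1.

```python
import re

# The expression from this post, unchanged.
UNENCODED_AMP = re.compile(r'([^&])&(?!#?[a-zA-Z0-9]{2,6};|\$|&)')

def encode_ampersands(text):
    """Turn unencoded ampersands into &amp;, keeping the character before each one."""
    # \1 puts back the character captured by ([^&]).
    return UNENCODED_AMP.sub(r'\1&amp;', text)

print(encode_ampersands("Tom & Jerry &amp; friends &#169; a && b"))
# prints: Tom &amp; Jerry &amp; friends &#169; a && b
```

Because the expression needs one character in front of the ampersand, an ampersand at the very start of the text is not matched.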

Common Regular Expressions

24 March, 2008 | Regular Expressions | No comments

Here are some of the regular expressions that I frequently use.

Find a blank line
^$
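
A small sketch of one way to use this, assuming Python's re module and a hypothetical helper called blank_line_numbers. It tests the expression against each line in turn and reports which lines are blank.

```python
import re

BLANK_LINE = re.compile(r'^$')

def blank_line_numbers(text):
    """Return the 1-based numbers of the lines that are completely empty."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if BLANK_LINE.search(line)
    ]

print(blank_line_numbers("first\n\nthird\n\n"))
# prints: [2, 4]
```

A line that contains only spaces or tabs is not counted, because the expression allows nothing at all between the start and the end of the line.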

Spaces
[ \t]+
You can use this to break a text string apart into words.
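
For example, splitting a string into words might look like this in Python. This is a minimal sketch and split_words is a name I made up for illustration.

```python
import re

def split_words(text):
    """Split text on runs of spaces and tabs."""
    # strip() first, so leading or trailing whitespace does not
    # produce empty strings at the ends of the result.
    return re.split(r'[ \t]+', text.strip())

print(split_words("  the quick\tbrown   fox "))
# prints: ['the', 'quick', 'brown', 'fox']
```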

Date
\d{1,2}(\-|\/|\.)\d{1,2}\1\d{4}
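This will match anything in the format mm/dd/yyyy, or even dd/mm/yyyy. The separator can be a dash, a slash or a dot, and the \1 backreference means the same separator has to be used both times. It does not check that the day and month are valid numbers.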
[A-Z][a-z][a-z] [0-9][0-9]*, [0-9]{4}
This will match a date written with a three-letter month name, such as Mar 24, 2007.
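
Here is a short sketch of the numeric date expression in Python, to show what the \1 backreference does. The helper name is_numeric_date is hypothetical, and I use fullmatch so the whole string has to be a date.

```python
import re

NUMERIC_DATE = re.compile(r'\d{1,2}(\-|\/|\.)\d{1,2}\1\d{4}')

def is_numeric_date(text):
    """True if the whole string looks like d/m/yyyy with one consistent separator."""
    return NUMERIC_DATE.fullmatch(text) is not None

print(is_numeric_date("03/24/2008"))  # True
print(is_numeric_date("24.03.2008"))  # True
print(is_numeric_date("03/24-2008"))  # False, the separators differ
print(is_numeric_date("99/99/2008"))  # True, the numbers are not range checked
```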

Time
([1-9]|1[0-2]):[0-5]\d(:[0-5]\d(\.\d{1,3})?)?
This will match H:MM, H:MM:SS or H:MM:SS.mmm, where the hour is 1 to 12 written without a leading zero.
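
A quick way to see which times the expression accepts, sketched in Python with a hypothetical is_time helper. It uses fullmatch, so the whole string has to be a time.

```python
import re

TIME = re.compile(r'([1-9]|1[0-2]):[0-5]\d(:[0-5]\d(\.\d{1,3})?)?')

def is_time(text):
    """True if the whole string is a 12-hour time the expression accepts."""
    return TIME.fullmatch(text) is not None

print(is_time("9:05"))          # True
print(is_time("12:30:45.123"))  # True
print(is_time("13:00"))         # False, the hour only goes up to 12
print(is_time("09:05"))         # False, a leading zero is not allowed
```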

IP Address
(((\d{1,2}|(1\d{2})|(2[0-4]\d)|25[0-5]))\.){3}((\d{1,2}|(1\d{2})|(2[0-4]\d)|25[0-5]))
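This also checks that each part of the IP address is within the range 0 to 255, so it covers 0.0.0.0 to 255.255.255.255. The check only holds if the expression has to match the whole string (for example by adding ^ and $), otherwise it can still match part of an invalid address, such as the 99.1.1.1 in 999.1.1.1.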
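
The difference between matching the whole string and searching inside it can be sketched like this in Python. The helper name is_ipv4 is my own invention.

```python
import re

IPV4 = re.compile(
    r'(((\d{1,2}|(1\d{2})|(2[0-4]\d)|25[0-5]))\.){3}'
    r'((\d{1,2}|(1\d{2})|(2[0-4]\d)|25[0-5]))'
)

def is_ipv4(text):
    """True only if the whole string is a dotted address with parts in 0 to 255."""
    return IPV4.fullmatch(text) is not None

print(is_ipv4("192.168.1.255"))  # True
print(is_ipv4("256.1.1.1"))      # False

# An unanchored search still finds a valid-looking address inside an invalid one.
print(IPV4.search("999.1.1.1").group(0))  # 99.1.1.1
```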

Email Address
([\w\-\.]+)@((\[([0-9]{1,3}\.){3}[0-9]{1,3}\])|(([\w\-]+\.)+)([a-zA-Z]{2,4}))
This uses a simple mechanism. The following expression follows the RFC standard for the email address format more closely, so it should match 99.99% of all email addresses. It only lists lowercase letters, so use it with a case-insensitive flag.
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
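
As a sketch, this is how the longer expression might be used to validate input in Python. The re.IGNORECASE flag is my addition (the expression itself only lists lowercase letters), and is_email is a hypothetical helper name.

```python
import re

EMAIL = re.compile(
    r"[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*"
    r"@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?",
    re.IGNORECASE,  # assumption: upper-case letters should be accepted too
)

def is_email(text):
    """True if the whole string matches the expression."""
    return EMAIL.fullmatch(text) is not None

print(is_email("John.Doe@Example.com"))    # True
print(is_email("john..doe@example.com"))   # False, two dots in a row
print(is_email("john@localhost"))          # False, the domain needs a dot
```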

Complete URLs
https?://(\w*:\w*@)?[-\w.]+(:\d+)?(/([\w/_.]*(\?\S+)?)?)?
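This will match most http and https URLs, including an optional user:password@ part, a port, a path and a query string. The path part only allows letters, digits, underscores, dots and slashes, so a URL with a hyphen in the path is only matched up to the hyphen.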
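
To pull URLs out of a block of text, something like the following Python sketch should do. find_urls is a made-up name, and finditer is used because the expression contains groups.

```python
import re

URL = re.compile(r'https?://(\w*:\w*@)?[-\w.]+(:\d+)?(/([\w/_.]*(\?\S+)?)?)?')

def find_urls(text):
    """Return every substring of text that the expression matches."""
    return [match.group(0) for match in URL.finditer(text)]

print(find_urls("Visit http://www.example.com:8080/path/page.html?x=1 today"))
# prints: ['http://www.example.com:8080/path/page.html?x=1']

print(find_urls("See http://example.com/my-page for details"))
# prints: ['http://example.com/my']
```

The second call shows the path being cut short at the hyphen, so check the results if your URLs contain characters outside letters, digits, underscores, dots and slashes.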

HTML Comments
<!-{2,}.*?-{2,}>
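
A sketch of stripping comments from a page with this expression in Python. The re.DOTALL flag is my assumption: without it the dot does not match a newline, so comments that span several lines would be missed. strip_html_comments is a hypothetical name.

```python
import re

# re.DOTALL lets .*? run across line breaks inside a comment.
HTML_COMMENT = re.compile(r'<!-{2,}.*?-{2,}>', re.DOTALL)

def strip_html_comments(html):
    """Remove every HTML comment, including multi-line ones."""
    return HTML_COMMENT.sub('', html)

print(strip_html_comments("<p>a</p><!-- note\nmore --><p>b</p><!-- end -->"))
# prints: <p>a</p><p>b</p>
```

The lazy .*? matters here, because it makes each comment stop at its own closing --> instead of running on to the last one in the page.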

Inline Comments
//.*
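This will match inline comments in C, PHP, Java, JavaScript etc. It will also match a // that appears inside a string or a URL, so check the results before replacing anything.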
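
Here is a small Python sketch that lists what the expression finds in a piece of source code. The sample source and the find_inline_comments name are made up for illustration.

```python
import re

INLINE_COMMENT = re.compile(r'//.*')

def find_inline_comments(source):
    """Return everything from each // to the end of its line."""
    return INLINE_COMMENT.findall(source)

sample = 'int x = 1; // counter\nurl = "http://example.com";\n'
print(find_inline_comments(sample))
# prints: ['// counter', '//example.com";']
```

The second result is not a comment at all, just the // from a URL inside a string.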

RegExLib.com: The Regular Expression Library

22 March, 2008 | Regular Expressions Websites | No comments

Writing regular expressions can sometimes be a real pain, especially if you are not used to them. Before trying to make a regular expression yourself, you might want to look for regular expressions that other people have already made. Rather than reinventing the wheel to prove you can do something, using free third-party regular expressions can save you a lot of time.

RegExLib, the regular expression library, is a great resource for finding the regular expression that you are looking for.

A good example of a regular expression that everybody seems to use at one point or another is one that matches an email address. Using the search function on the site I was able to find 98 different expressions that either matched email addresses, or had something to do with them.

What makes the site a really great resource is that every expression is given a description, a rating and an example list of what it matches and doesn’t match. You can also run a quick test on the expression to see what it can do. This way you can see straight away if the expression available is going to do the job you want it to do.

reWork: A Regular Expression Workbench

21 March, 2008 | Regular Expressions Websites | No comments

Regular expressions are a very useful tool for any programmer wanting to validate input, format strings, change words, reformat data or even split a string apart into an array. However, when you are starting out, writing them can be hard going. They are not very easy to learn, and the only way to really understand them is to practice, practice, practice.

This is where reWork steps in. It is a fully functional online regular expression workbench that lets you plug the expression and the text in at one end and shows you exactly what is being matched. This simple JavaScript program is far better than any standalone application I have seen and has more functionality than you might expect.

I have often wondered why a program wasn’t working and found that I had written an expression incorrectly so that the correct string wasn’t being found. reWork has been a real help on those occasions. I especially like the fact that I can get an expression ready and then copy the JavaScript or PHP code from the bottom of the page.

If you are starting out you should get yourself a decent regular expressions book, and then use this tool to see what an expression does and how it works. This tool is also exceedingly useful for seasoned developers. Go ahead and try it!