Posts Tagged ‘regular expression’

Delete Trailing Commas In PHP

April 14th, 2009

Converting an array of information into a string is easy, but when you are doing this for insertion into a database, a trailing comma is going to break your SQL statements.

Take the following example, which takes an array of values and converts them into a string of values. This practice is quite common in PHP database manipulation.

$values = array('one', 'two', 'three', 'four', 'five');
$string = '';
 
foreach ( $values as $val ) {
    $string .= '"'.$val.'", ';
}
 
echo $string; // prints "one", "two", "three", "four", "five",

Obviously we need to strip the trailing comma from the end of this string. To do this you can use the following function.

function deleteTrailingCommas($str)
{
    // Keep the main text ($1) and drop any run of commas or whitespace before the end of the line
    return trim(preg_replace('/(.*?)((,|\s)*)$/m', '$1', $str));
}

This function uses a regular expression to match any run of commas or whitespace after the main bulk of text and before the end of the line, and returns only the main bulk of text. The trailing commas are not returned.

Here is another example:

$string = '"one", , ,  , , , ,,';
echo $string;
$string = deleteTrailingCommas($string);
echo $string;

This prints out the following:

"one", , ,  , , , ,,
"one"
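For comparison, here is a rough JavaScript sketch of the same clean-up. It is not a port of the expression above: it assumes a simpler character class is enough, and the function name is only reused for illustration.

```javascript
// Hypothetical JavaScript counterpart to the PHP function above
function deleteTrailingCommas(str) {
  // [,\s]+$ matches one run of commas and whitespace at the very end of the string
  return str.replace(/[,\s]+$/, '');
}

console.log(deleteTrailingCommas('"one", "two", "three", ')); // "one", "two", "three"
console.log(deleteTrailingCommas('"one", , ,  , , , ,,'));    // "one"

// Joining the quoted values avoids the trailing comma in the first place
const values = ['one', 'two', 'three'];
const joined = values.map(function (v) { return '"' + v + '"'; }).join(', ');
console.log(joined); // "one", "two", "three"
```

If you control how the string is built, joining the values is usually the simpler route; the regular expression is more useful when the string arrives already formed.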

WordPress Post Friendly Code With JavaScript Replace

February 12th, 2009

I recently talked about adding code to posts and comments in WordPress and making sure that certain characters are encoded properly. To simplify things I thought I would create a little set of regular expressions that takes a sample of code and converts it into a WordPress-friendly format. It consists of the following function, which takes the value of a text area called tochange and runs some regular expression replacements on it. I have kept the expressions as simple as possible so they are quite easy to understand. The g flag on each expression means that the replacement is applied to every match in the text, not just the first.

<script type="text/javascript">
function changeIt(){
  var text = document.getElementById('tochange').value;
  text = text.replace(/&/g,'&amp;');
  text = text.replace(/"/g,'&quot;');
  text = text.replace(/'/g,'&#39;');
  text = text.replace(/</g,'&lt;');
  text = text.replace(/>/g,'&gt;');
  text = text.replace(/^\s+/mg,'&nbsp;&nbsp;');
  document.getElementById('changed').value = text;
  document.getElementById('preTag').innerHTML = text;
}
</script>

The only one which might cause an issue is the last one, with the expression ^\s+. This matches one or more whitespace characters at the beginning of a line. The m flag means that the ^ symbol matches at the start of every line rather than only at the start of the text. Note that \s also matches newline characters, so blank lines are replaced as well. You can test this function with the following HTML tags.

  <textarea id="tochange" cols="50" rows="10"></textarea>
  <input type="submit" onclick="changeIt()" />
  <textarea id="changed" cols="50" rows="10"></textarea>
  <pre id="preTag"></pre>

The first textarea is what you want to alter, the second is the altered text and the pre tag displays what the altered text will look like in your browser.
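If you want to try the replacements without a page, the same steps can be sketched as a plain function. The name encodeForPost is my own, and I have assumed that [ \t] is a better fit than \s for the indentation step so that blank lines are left alone.

```javascript
// Hypothetical helper: the same replacements as changeIt(), without the DOM
function encodeForPost(text) {
  return text
    .replace(/&/g, '&amp;')    // must run first, or the entities added below get encoded again
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/^[ \t]+/mg, '&nbsp;&nbsp;'); // assumption: [ \t] instead of \s keeps blank lines
}

const sample = 'if (a < b) {\n\n    echo "hi";\n}';
const encoded = encodeForPost(sample);
console.log(encoded);
```

The order matters: if the ampersand replacement ran last, an entity such as &lt; would be turned into &amp;lt;.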

Common Regular Expressions

March 24th, 2008

Here are some of the regular expressions that I frequently use.

Find a blank line
^$
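A quick sketch of how this might be used in JavaScript. The sample text is made up, and the m flag is needed so that ^ and $ match at every line rather than only at the ends of the whole string.

```javascript
const text = 'first\n\nsecond\n\n\nthird';

// Each match is an empty string sitting on a blank line
const blankLines = text.match(/^$/mg) || [];
console.log(blankLines.length); // 3
```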

Spaces
[ \t]+
You can use this to break a text string apart into words.
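For example, a minimal sketch using JavaScript's split (the sample strings are made up):

```javascript
const line = 'one  two\tthree \t four';
const words = line.split(/[ \t]+/);
console.log(words); // [ 'one', 'two', 'three', 'four' ]

// Leading whitespace would produce an empty first element, so trim first
const padded = '  padded line '.trim().split(/[ \t]+/);
console.log(padded); // [ 'padded', 'line' ]
```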

Date
\d{1,2}(\-|\/|\.)\d{1,2}\1\d{4}
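This will match dates in the format mm/dd/yyyy or dd/mm/yyyy, with -, / or . as the separator. The \1 backreference requires both separators to be the same. It does not check that the day and month are in range.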
[A-Z][a-z][a-z] [0-9][0-9]*, [0-9]{4}
Will match a formatted date, such as Mar 24, 2007.
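A small sketch of both date expressions in JavaScript, with made-up sample dates. The variable names are only for illustration.

```javascript
const numericDate = /\d{1,2}(\-|\/|\.)\d{1,2}\1\d{4}/;
const namedDate = /[A-Z][a-z][a-z] [0-9][0-9]*, [0-9]{4}/;

console.log(numericDate.test('03/24/2008')); // true
console.log(numericDate.test('24.03.2008')); // true
console.log(numericDate.test('03/24-2008')); // false, the two separators differ
console.log(numericDate.test('99/99/9999')); // true, the values are not range checked

console.log(namedDate.test('Mar 24, 2007'));   // true
console.log(namedDate.test('March 24, 2007')); // false, only three-letter months
```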

Time
([1-9]|1[0-2]):[0-5]\d(:[0-5]\d(\.\d{1,3})?)?
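This will match H:MM, H:MM:SS or H:MM:SS.mmm on a 12-hour clock. The hour runs from 1 to 12 with no leading zero, so 09:30 only matches from the 9 onwards.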
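Here is a sketch of the time expression used for validation. I have added ^ and $ anchors, which are not part of the expression above, on the assumption that the whole string should be a time.

```javascript
// Anchored copy of the time expression (the anchors are my addition)
const timePattern = /^([1-9]|1[0-2]):[0-5]\d(:[0-5]\d(\.\d{1,3})?)?$/;

console.log(timePattern.test('9:30'));         // true
console.log(timePattern.test('12:05:59.250')); // true
console.log(timePattern.test('13:00'));        // false, hours stop at 12
console.log(timePattern.test('09:30'));        // false, no leading zero allowed
console.log(timePattern.test('9:60'));         // false, minutes stop at 59
```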

IP Address
(((\d{1,2}|(1\d{2})|(2[0-4]\d)|25[0-5]))\.){3}((\d{1,2}|(1\d{2})|(2[0-4]\d)|25[0-5]))
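This also checks that each part of the IP address is in the range 0 to 255, so it accepts addresses from 0.0.0.0 to 255.255.255.255. The check only holds when the expression is anchored (for example with ^ and $); unanchored, it will still match part of an out-of-range address, such as the 56.1.1.1 inside 256.1.1.1.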
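The sketch below compares the expression as written with an anchored copy. The anchors and the variable names are my additions, not part of the original expression.

```javascript
const ipAnywhere = /(((\d{1,2}|(1\d{2})|(2[0-4]\d)|25[0-5]))\.){3}((\d{1,2}|(1\d{2})|(2[0-4]\d)|25[0-5]))/;
// Same expression wrapped in ^ and $ so the whole string must be an address
const ipExact = /^(((\d{1,2}|(1\d{2})|(2[0-4]\d)|25[0-5]))\.){3}((\d{1,2}|(1\d{2})|(2[0-4]\d)|25[0-5]))$/;

console.log(ipExact.test('192.168.1.255')); // true
console.log(ipExact.test('256.1.1.1'));     // false
console.log(ipExact.test('1.2.3'));         // false

// Without anchors the expression finds a valid address inside the invalid one
console.log('256.1.1.1'.match(ipAnywhere)[0]); // 56.1.1.1
```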

Email Address
([\w\-\.]+)@((\[([0-9]{1,3}\.){3}[0-9]{1,3}\])|(([\w\-]+\.)+)([a-zA-Z]{2,4}))
This is a fairly simple pattern. The following expression follows the RFC format for an email address more closely, so it should match 99.99% of all email addresses. It only lists lowercase letters, so run it with a case-insensitive flag.
[a-z0-9!#$%&'*+/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?
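As a sketch, here is the longer expression as a JavaScript literal. The ^ and $ anchors, the i flag and the escaped slashes are my additions so that it validates a whole string regardless of case; the sample addresses are made up.

```javascript
const emailPattern = /^[a-z0-9!#$%&'*+\/=?^_`{|}~-]+(?:\.[a-z0-9!#$%&'*+\/=?^_`{|}~-]+)*@(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?$/i;

console.log(emailPattern.test('john.smith@example.com'));  // true
console.log(emailPattern.test('John.Smith@Example.COM'));  // true, thanks to the i flag
console.log(emailPattern.test('no-at-sign.example.com'));  // false
console.log(emailPattern.test('a@b'));                     // false, the domain needs a dot
```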

Complete URLs
https?://(\w*:\w*@)?[-\w.]+(:\d+)?(/([\w/_.]*(\?\S+)?)?)?
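This will match most http and https URLs, including an optional user:password@ prefix, a port and a query string. The path part only allows word characters, slashes and dots, so a path containing a hyphen or percent sign is matched only up to that character.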
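A short sketch of the URL expression in JavaScript, using made-up example.com addresses. The slashes are escaped because of the literal syntax; otherwise the expression is unchanged.

```javascript
const urlPattern = /https?:\/\/(\w*:\w*@)?[-\w.]+(:\d+)?(\/([\w\/_.]*(\?\S+)?)?)?/;

const full = 'http://www.example.com:8080/path/page.html?id=1';
console.log(full.match(urlPattern)[0] === full); // true, the whole URL is matched

// The path character class has no hyphen, so the match stops early here
console.log('https://example.com/my-page'.match(urlPattern)[0]); // https://example.com/my
```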

HTML Comments
<!-{2,}.*?-{2,}>
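A sketch of stripping comments with this expression in JavaScript. The sample markup is made up. Since . does not match newlines by default, I have assumed [\s\S] as a substitute for comments that span several lines.

```javascript
const page = '<p>a</p><!-- note --><p>b</p><!--- other --->';
const stripped = page.replace(/<!-{2,}.*?-{2,}>/g, '');
console.log(stripped); // <p>a</p><p>b</p>

// A comment spanning lines is missed by . but caught by [\s\S]
const multi = '<!--\nhidden\n-->kept';
const missed = multi.replace(/<!-{2,}.*?-{2,}>/g, '');
const caught = multi.replace(/<!-{2,}[\s\S]*?-{2,}>/g, '');
console.log(missed === multi); // true
console.log(caught);           // kept
```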

Inline Comments
//.*
This will match single-line (//) comments in C, PHP, Java, JavaScript etc.
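A quick sketch with made-up lines of code. It also shows a limitation worth knowing about: the expression has no idea about strings, so the // in a URL is treated as a comment too.

```javascript
const commentPattern = /\/\/.*/;

const withComment = 'var total = 0; // running total';
console.log(withComment.match(commentPattern)[0]); // // running total

// False positive: the // inside a string literal is matched as well
const withUrl = 'var home = "http://example.com";';
console.log(withUrl.match(commentPattern)[0]); // //example.com";
```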

RegExLib.com: The Regular Expression Library

March 22nd, 2008

Writing regular expressions can sometimes be a real pain, especially if you are not used to them. Before trying to write one yourself, you might want to look for regular expressions that other people have already made. Rather than reinventing the wheel to prove you can do something, using free third-party regular expressions can save you a lot of time.

The RegExLib or regular expression library is a great resource for finding any regular expression that you are looking for.

A good example of a regular expression that everybody seems to use at one point or another is to match an email address. Using the search function on the site I was able to find 98 different expressions that either matched email addresses, or had something to do with them.

What makes the site a really great resource is that every expression is given a description, a rating and an example list of what it matches and doesn’t match. You can also run a quick test on the expression to see what it can do. This way you can see straight away if the expression available is going to do the job you want it to do.