
Convert HTML To ASCII With PHP

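The reverse of turning ASCII text into HTML is converting HTML into ASCII. The small function below does this by rewriting links, paragraph and line breaks, and bold and italic tags as plain-text equivalents, then decoding entities and stripping any remaining tags.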

function html2ascii($s){
 // convert links
 $s = preg_replace('/<a\s+.*?href="?([^\" >]*)"?[^>]*>(.*?)<\/a>/i','$2 ($1)',$s);
 
 // convert br, hr, p and div tags
 $s = preg_replace('@<(b|h)r[^>]*>@i',"\n",$s);
 $s = preg_replace('@<p[^>]*>@i',"\n\n",$s);
 // non-greedy match so that consecutive divs are handled one at a time
 $s = preg_replace('@<div[^>]*>(.*?)</div>@i',"\n".'$1'."\n",$s);
 
 // convert bold and italic tags
 $s = preg_replace('@<b[^>]*>(.*?)</b>@i','*$1*',$s);
 $s = preg_replace('@<strong[^>]*>(.*?)</strong>@i','*$1*',$s);
 $s = preg_replace('@<i[^>]*>(.*?)</i>@i','_$1_',$s);
 $s = preg_replace('@<em[^>]*>(.*?)</em>@i','_$1_',$s);
 
 // decode any entities
 $s = strtr($s,array_flip(get_html_translation_table(HTML_ENTITIES)));
 
 // decode numbered entities (the /e modifier was removed in PHP 7, so use a callback)
 $s = preg_replace_callback('/&#(\d+);/',function($m){ return chr((int)$m[1]); },$s);
 
 // strip any remaining HTML tags
 $s = strip_tags($s);
 
 // return the string
 return $s;
}
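
The numbered-entity step is the part most sensitive to the PHP version, because the /e modifier was deprecated in PHP 5.5 and removed in PHP 7. As a minimal sketch, assuming PHP 5.3 or later for closures, the same decoding can be done with preg_replace_callback. The helper name decodeNumericEntities is made up for this illustration and is not part of the function above. The built-in html_entity_decode is shown alongside as an alternative that handles named and numbered entities in one call.

```php
<?php
// Hypothetical helper (not part of html2ascii): decode decimal entities
// such as &#65; with a callback instead of the removed /e modifier.
function decodeNumericEntities($s) {
    return preg_replace_callback('/&#(\d+);/', function ($m) {
        // $m[1] holds the digits; chr() maps the code to a single byte.
        return chr((int) $m[1]);
    }, $s);
}

echo decodeNumericEntities('&#72;&#105;&#33;'), "\n";

// Alternative: html_entity_decode() covers named and numbered entities together.
echo html_entity_decode('Fish &amp; chips &#38; peas', ENT_QUOTES, 'UTF-8'), "\n";
```

Because chr() returns a single byte, this only makes sense for codes in the ASCII range, which matches the goal of the function.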

To use this function, just pass it a string of HTML. Here is an example of it at work.

$htmlString = '<p>This is some <strong>XHTML</strong> markup that <em>will</em> be<br />turned <a href="http://www.talkincode.com/" title="Talk in code">into</a> an ascii string</p>';
 
echo html2ascii($htmlString);

This produces the following output (preceded by two newlines, which come from the opening <p> tag).

This is some *XHTML* markup that _will_ be
turned into (http://www.talkincode.com/) an ascii string
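
To see where the "into (http://www.talkincode.com/)" part of that output comes from, the link rule can be run on its own. This is only a sketch: the pattern is copied unchanged from html2ascii(), while the sample string and the variable names are made up for the illustration.

```php
<?php
// Same link pattern as in html2ascii(), applied in isolation.
$pattern = '/<a\s+.*?href="?([^\" >]*)"?[^>]*>(.*?)<\/a>/i';
$html = 'See <a href="http://www.talkincode.com/" title="Talk in code">the site</a> for more.';

// $1 is the href value and $2 is the link text, so the replacement
// writes the text first and the address in parentheses after it.
$text = preg_replace($pattern, '$2 ($1)', $html);

echo $text, "\n";
```

Other attributes, such as title, are matched by the pattern but dropped from the result.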

  1. Marsel
    November 21st, 2008 at 03:29 | #1

    I got an error on line 19 -> $s = preg_replace('//e','chr(\\1)',$s);

    Warning: Wrong parameter count for chr() in C:\PHP-test\xxxxx.php (??) : regexp code on line 1

  2. November 21st, 2008 at 10:00 | #2

    You are quite right, that would never work!
    I have updated the script with the fix.
