Posts Tagged ‘words’

Randomising The Middle Of Words In PHP

November 18th, 2008

I was sent an email the other day that contained some text where the first and last letters of each word were left alone, but the middle of each word was randomized. The weird part was that the text was still readable, which is due to the way in which the brain processes words.

I wondered if I could replicate this using a PHP script. All I would need to do is split the sentence into its component words and loop through those words, randomizing the middle of each one. Clearly, it is not possible to mix up the order of letters in a word less than four characters long, so a check would be needed for this. This is what I came up with:

function mixWordMiddle($string)
{
 $words = explode(' ', $string);
 foreach ( $words as $pos => $word ) {
  // words of three characters or fewer have no middle to shuffle
  if ( strlen($word) > 3 ) {
   $chars = str_split($word);
   // take the middle letters, leaving the first and last in place
   $middle = array_slice($chars, 1, -1);
   shuffle($middle);
   $words[$pos] = $chars[0] . implode('', $middle) . $chars[count($chars)-1];
  }
 }
 echo implode(' ', $words);
}

I then tried plugging in the following text about evolution.

$string = 'In biology, evolution is the changes in the inherited traits of a population of organisms from one generation to the next. These changes are caused by a combination of three main processes: variation, reproduction, and selection.';

And came up with something like the following.

In bliygoo, eoutivoln is the cganhes in the iethirned titras of a piaplouotn of oargnsims form one gneoeatirn to the nxte. Thsee cagnhes are ceusad by a cmibitoonan of there main persocses: voaitanri, rteunodpoirc, and stoneleic.

Which is actually quite difficult to read. I thought that this might be because I had used a bit of text with too many long words, so I selected another:

$string = 'A giant Saudi oil tanker seized by pirates in the Indian Ocean is nearing the coast of Somalia, the US Navy says.';

This produced the following text.

A ganit Suadi oil taeknr seezid by ptaiers in the Ianidn Oecan is nraneig the cosat of Smiolaa, the US Navy syas.

This is just a test script, so it doesn’t take into account any punctuation: a full stop or comma attached to a word is treated as its last letter. However, the text it produces is still difficult to read, which leads me to be skeptical of the claims in the email I received.

Categories: PHP Strings

Simple Swear Filter In PHP

September 30th, 2008

Use the following function to filter out words from user input. It works with a pre-set array of words that are to be excluded; this array is looped through and each item is used to replace any instances of that word within the text. The regular expression uses \b, which matches at a word boundary, together with the i modifier so that the match ignores case. This way you don’t get the middle of words being filtered out when they are not meant to be.

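By using preg_replace_callback it is possible to run a PHP function on each match. (The e modifier of preg_replace did a similar job in older code, but it was deprecated in PHP 5.5 and removed in PHP 7.) In this case we count the number of characters in the match and use this to create a string of stars (*) of equal length.

function filterwords($text){
 $filterWords = array('gosh','darn','poo');
 foreach($filterWords as $word){
  // preg_quote() stops any special characters in the word being read as regex syntax
  $pattern = '/\b'.preg_quote($word,'/').'\b/i';
  $text = preg_replace_callback($pattern, function($match){
   // replace the match with the same number of stars
   return str_repeat('*', strlen($match[0]));
  }, $text);
 }
 return $text;
}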

When the following text is run through this function.

echo filterwords('Darn, I have a mild form of torretts, poo!');

It produces the following result.

****, I have a mild form of torretts, ***!

Categories: PHP Strings

JavaScript Word Counter

September 23rd, 2008

I found this neat tool on a site to do with search engine optimisation, which counts the number of words that are typed into a textarea. I have tried all sorts of patterns and characters and it seems very robust.

The tool uses a textarea of a form and outputs the number of words into an input box in the same form. Here is the HTML. The textarea calls a function called textCounter() every time a key is pressed.

<form onsubmit="return false;" action="" id="wordCountCalc">
<textarea name="message1" rows="10" cols="68" onkeydown="textCounter()" onkeyup="textCounter()"></textarea>
<input readonly="readonly" size="15" type="text" name="len" maxlength="10" value="0" />
</form>

The function works by removing any white space from the start of the text. It then replaces any tab characters with a space before splitting the text on runs of one or more white space characters.

The first step is to detect what browser the user is viewing the site in due to a discrepancy between how different browsers split a string apart by white space. The following snippet is used to detect browsers.

var sUserAgent = navigator.userAgent;
var isOpera = sUserAgent.indexOf("Opera")>-1;
var isIE = sUserAgent.indexOf("compatible")>-1 && sUserAgent.indexOf("MSIE")>-1 && !isOpera;
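
Sniffing the user agent string is fragile, so another option is to test the split behaviour directly. The sketch below is my own assumption rather than part of the original tool: splitKeepsTrailingEmpty is a hypothetical name, and all it does is check whether splitting a string that ends in white space leaves an empty item at the end of the array.

```javascript
// Hypothetical helper, not part of the original tool.
// Returns true when split() keeps the empty string that follows
// trailing white space, and false when the engine drops it.
function splitKeepsTrailingEmpty() {
  // "a " is expected to split into ["a", ""] where empty items are kept
  // and into ["a"] where they are dropped.
  return "a ".split(/\s+/).length === 2;
}

// Evaluate once and reuse the result.
var keepsEmpty = splitKeepsTrailingEmpty();
```

A counting function could then branch on keepsEmpty where it would otherwise branch on isIE, without caring which browser is actually in use.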

Here is the function that counts the number of words in the textarea element.

function textCounter(){
 var area = document.getElementById('wordCountCalc');
 var value = area.message1.value;
 var formcontent;
 if(value.length != 0){
  formcontent = value.replace(/^\s+/,''); // remove white space at start of string
  formcontent = formcontent.replace(/\t+/g,' '); // replace any tab characters with a space
  formcontent = formcontent.split(/\s+/); // split string by white space
  if(isIE){
   area.len.value = formcontent.length;
  }else{
   // other browsers keep an empty last item when the text ends in white space
   if(/\s$/.test(value)){
    area.len.value = formcontent.length-1;
   }else{
    area.len.value = formcontent.length;
   }
  }
 }else{
  area.len.value = 0;
 }
}
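
As an alternative that needs no browser check, the words themselves can be matched instead of splitting on the gaps between them. This is only a sketch under my own assumptions: countWords is a hypothetical name, and a word is taken to be any run of non-white-space characters, which is the same definition the split-based function relies on.

```javascript
// Hypothetical alternative, not taken from the original tool.
// Counts runs of non-white-space characters, so leading and trailing
// white space never produce empty items that need correcting for.
function countWords(text) {
  var words = text.match(/\S+/g); // match() returns null when nothing matches
  return words ? words.length : 0;
}
```

To use it with the form above, textCounter() would only need to set area.len.value to countWords(area.message1.value).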