Archive

Posts Tagged ‘search’

Search A Table With JavaScript

March 10th, 2009 3 comments

Using server-side scripts to search for things can be as complex or as simple as the situation requires. However, if you have a table of results and you just want to enable a simple JavaScript search on that table then this might be the script for you.

To search a table using JavaScript you need to split the table into rows. This can be done using the getElementsByTagName() function, which takes the name of the element that you want to capture. So to grab all of the rows as an array-like collection you need to pass the value tr. Note that this collects every row on the page, not just the rows of a single table.

var rows = document.getElementsByTagName("tr");

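We can then iterate through these rows, grabbing the column that we want to search on. First, though, we need a search form. The text box below calls the doSearch() function every time a key is released, and the onsubmit handler stops the form from being submitted.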

<form action="#" method="get" onsubmit="return false;">
<input type="text" size="30" name="q" id="q" value="" onkeyup="doSearch();" />
</form>

Next, the table. Note that I have added an additional row to the end of this table. This will be used to display a note to the user if they have entered a query that isn’t found.

<table>
<tr><td>One</td></tr>
<tr><td>Two</td></tr>
<tr><td>Three</td></tr>
<tr><td>Four</td></tr>
<tr><td>Five</td></tr>
<tr><td>Six</td></tr>
<tr><td>Seven</td></tr>
<tr><td>Eight</td></tr>
<tr style="display:none;" id="noresults">
 <td>(no listings that start with "<span id="qt"></span>")</td>
</tr>
</table>

The first thing we need to do in our search function is to prepare the search term. This turns the query string to lowercase, which we can then match to the table column.

var q = document.getElementById("q");
var v = q.value.toLowerCase();

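Now we can go through each row and try to match its first cell against the query. Queries shorter than three characters must match the start of the cell, while longer queries can match anywhere in it. If the row matches (or the query is empty) we display it; if not, we hide it.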

  for ( var i = 0; i < rows.length; i++ ) {
    var cells = rows[i].getElementsByTagName("td");
    if ( cells.length == 0 ) {
      continue; // no td cells in this row, so there is nothing to search
    }
    var fullname = cells[0].innerHTML.toLowerCase();
    if ( fullname ) {
      if ( v.length == 0 || (v.length < 3 && fullname.indexOf(v) == 0) || (v.length >= 3 && fullname.indexOf(v) > -1 ) ) {
        rows[i].style.display = "";
      } else {
        rows[i].style.display = "none";
      }
    }
  }
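The matching rule can also be pulled out into a small function that is easy to check away from the page. This is only a sketch: the name matchesQuery is made up for illustration and is not part of the script in this post.

```javascript
// Hypothetical helper, not part of the original script: returns true when
// a cell's text should stay visible for the given query.
function matchesQuery(cellText, query) {
  var text = cellText.toLowerCase();
  var v = query.toLowerCase();
  if (v.length === 0) {
    return true; // empty query: show every row
  }
  if (v.length < 3) {
    return text.indexOf(v) === 0; // short query: must match the start
  }
  return text.indexOf(v) > -1; // longer query: may match anywhere
}

console.log(matchesQuery("Three", "t"));   // true
console.log(matchesQuery("Eight", "t"));   // false
console.log(matchesQuery("Eight", "ght")); // true
```

With the sample table, typing "t" keeps only Two and Three, while typing "ght" keeps Eight.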

Here is the full function, including the code to implement the no results note.

<script type="text/javascript">
//<!--
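function doSearch() {
  var q = document.getElementById("q");
  var v = q.value.toLowerCase();
  var rows = document.getElementsByTagName("tr");
  var n = document.getElementById("noresults");
  var on = 0;
  for ( var i = 0; i < rows.length; i++ ) {
    // skip the "no results" row so it is never counted as a match
    if ( rows[i] === n ) {
      continue;
    }
    var cells = rows[i].getElementsByTagName("td");
    // skip rows without a td cell (header rows, for example)
    if ( cells.length == 0 ) {
      continue;
    }
    var fullname = cells[0].innerHTML.toLowerCase();
    if ( fullname ) {
      // short queries must match the start of the cell, longer ones can match anywhere
      if ( v.length == 0 || (v.length < 3 && fullname.indexOf(v) == 0) || (v.length >= 3 && fullname.indexOf(v) > -1 ) ) {
        rows[i].style.display = "";
        on++;
      } else {
        rows[i].style.display = "none";
      }
    }
  }
  // only touch the note if the "no results" row exists on the page
  if ( n ) {
    if ( on == 0 ) {
      n.style.display = "";
      // textContent stops the query from being interpreted as HTML
      document.getElementById("qt").textContent = q.value;
    } else {
      n.style.display = "none";
    }
  }
}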
//-->
</script>

Here is a working example of the code, with valid HTML.

Getting Started With Zend_Lucene

February 20th, 2009 1 comment

Zend_Lucene (the Zend_Search_Lucene component) is an implementation of the Lucene search engine in PHP5 and is included as part of the Zend Framework from version 1.6. Lucene implements all of the standard search engine query syntaxes (e.g. boolean and wildcard searches) and stores its index as files so it doesn’t need a database server to run. Lucene can be used if you want to add search functionality to a site but don’t want to go down the route of building a querying syntax from scratch.

To get started with Lucene you need to create an index. The following code has the effect of creating a directory on your server that Lucene will use to store and retrieve documents.

$index = Zend_Search_Lucene::create('/data/my-index');

To open the index use the following code.

$index = Zend_Search_Lucene::open('/data/my-index');

Of course your index will not contain anything so the next step is to add some documents to it.

To create a new document you need to create a new document object. This is done using the Zend_Search_Lucene_Document class.

$doc = new Zend_Search_Lucene_Document();

You can then assign fields to this document using the static functions of the Zend_Search_Lucene_Field class.

$doc->addField(Zend_Search_Lucene_Field::Text('title', 'The title of the document'));
$doc->addField(Zend_Search_Lucene_Field::Text('contents', 'The contents of the document.'));

You can also use binary data, which is useful if you have used a document scanning service and want to be able to search the data at a later date.

$doc->addField(Zend_Search_Lucene_Field::Binary('originalfile', $filedata));

Any binary data you assign like this isn’t tokenized or indexed but it is stored in the index so you would need to assign other fields so that the data can be searched for.

Once you have added your fields you can add the document using the addDocument() function of the opened index object.

$index->addDocument($doc);

If you are building a search index for a site then you might want to use the built-in HTML parsing functionality. This makes it easy for you to add either an HTML string or an HTML filename that Lucene will then index. You then add this file to the index using the addDocument() function of the opened index object. Note that when adding documents in this way you should also add the URL of the document as a field so that you can retrieve it later.

$doc = Zend_Search_Lucene_Document_Html::loadHTMLFile('http://www.talkincode.com/');
$doc->addField(Zend_Search_Lucene_Field::Text('url','http://www.talkincode.com/'));
$index->addDocument($doc);

You can also index and search Word, Excel and PowerPoint documents in much the same way as this.

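Once you have the index you can search it using an opened index object. You can also find out how big your index is and how many documents it contains by using the count() and numDocs() functions respectively. The count() function includes documents that have been deleted but not yet removed from the index, whereas numDocs() counts only the undeleted documents.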

$indexSize = $index->count();
$documents = $index->numDocs();

To construct a query and implement boolean and wildcard searching you need to use the Zend_Search_Lucene_Search_QueryParser class. The parsed query is then added to a Zend_Search_Lucene_Search_Query_Boolean object using the addSubquery() function. Passing true as the second argument marks the subquery as required.

$queryStr = 'talk';
$userQuery = Zend_Search_Lucene_Search_QueryParser::parse($queryStr);
 
$query = new Zend_Search_Lucene_Search_Query_Boolean();
$query->addSubquery($userQuery, true);
 
// do the search
$hits = $index->find($query);

The variable $hits now contains an array of Zend_Search_Lucene_Search_QueryHit objects. Each object has a property called score, which is the score of the hit result. The score is an indication (between 0 and 1) of how closely the query matched the index. The first item in the $hits array will have the highest score value. Every field that you defined for the document whilst indexing is also presented as a property of the hit object. So if you set a URL field for your document you can see a list of your documents using the following code:

$hits = $index->find($query);
foreach ($hits as $hit) {
 echo $hit->score.'<br />';
 echo $hit->url.'<br />';
}

Lucene can do a lot more than what I have briefly detailed here so I might write some posts in the future on how to refine updating, indexing and searching.

Search Engine Spider Detection With PHP

September 29th, 2008 2 comments

Part of any search engine optimisation strategy should always be that the user and the search engine see the same thing. If you start delivering different content you will either end up not performing or just getting outright banned. However, there are certain circumstances where you will want to detect the presence of a search engine spider. For example, let’s say that you had a link to a section of your site, and you wanted to add a counter to it that registered an action every time a user clicked on the link. One way of doing this would be to add a parameter to the URL of the link. If the parameter is present then it is a user going through the site and so the action will be registered. You don’t want to register an action every time a search engine bot spiders the site, so using the following function will allow you to turn off this parameter for these spiders.

function spiderDetect() {
 $agentArray = array("ArchitextSpider", "Googlebot", "TeomaAgent",
  "Zyborg", "Gulliver", "Architext spider", "FAST-WebCrawler",
  "Slurp", "Ask Jeeves", "ia_archiver", "Scooter", "Mercator",
  "crawler@fast", "Crawler", "InfoSeek Sidewinder",
  "almaden.ibm.com", "appie 1.1", "augurfind", "baiduspider",
  "bannana_bot", "bdcindexer", "docomo", "frooglebot", "geobot",
  "henrythemiragorobot", "sidewinder", "lachesis", "moget/1.0",
  "nationaldirectory-webspider", "naverrobot", "ncsa beta",
  "netresearchserver", "ng/1.0", "osis-project", "polybot",
  "pompos", "seventwentyfour", "steeler/1.3", "szukacz",
  "teoma", "turnitinbot", "vagabondo", "zao/0", "zyborg/1.0",
  "Lycos_Spider_(T-Rex)", "Lycos_Spider_Beta2(T-Rex)",
  "Fluffy the Spider", "Ultraseek", "MantraAgent","Moget",
  "T-H-U-N-D-E-R-S-T-O-N-E", "MuscatFerret", "VoilaBot",
  "Sleek Spider", "KIT_Fireball", "WISEnut", "WebCrawler",
  "asterias2.0", "suchtop-bot", "YahooSeeker", "ai_archiver",
  "Jetbot"
 );
 
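 // the user agent header is optional, so fall back to an empty string
 $theAgent = isset($_SERVER["HTTP_USER_AGENT"]) ? strtolower($_SERVER["HTTP_USER_AGENT"]) : "";
 
 foreach($agentArray as $agent){
  // strpos() returns 0 for a match at the start, so compare strictly against false
  if(strpos($theAgent, strtolower($agent)) !== false){
   return true;
  }
 }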
 return false;
}

The function works by reading the user agent string of the visitor and then comparing it to the list of user agents in an array. The comparison is a case-insensitive substring match. If the user agent is found then true is returned, otherwise the return value is false.

With this function present you can now include an if statement to see if the user agent is a search engine spider or not.

if(spiderDetect()){
 // do something for spiders
}else{
 // do something for users
}
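The check itself is just a case-insensitive substring test, so it is easy to try out away from PHP. Here is a minimal sketch of the same idea in JavaScript. The isSpider name, the shortened agent list and the sample user agent strings are all assumptions made for illustration; they are not part of the PHP function above.

```javascript
// Hypothetical JavaScript version of the check; the short list is illustrative only.
function isSpider(userAgent, agents) {
  var ua = (userAgent || "").toLowerCase(); // the header may be missing
  return agents.some(function (agent) {
    // indexOf() returns -1 when the agent name is not found anywhere
    return ua.indexOf(agent.toLowerCase()) !== -1;
  });
}

var agents = ["Googlebot", "Slurp", "ia_archiver"];
console.log(isSpider("Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)", agents)); // true
console.log(isSpider("Mozilla/5.0 (Windows NT 10.0; rv:109.0) Gecko/20100101 Firefox/115.0", agents)); // false
```

Because the test is a plain substring match, short entries in the list can match ordinary browsers by accident, so it is worth keeping the entries as specific as possible.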

Please be careful with this function. If you serve different content to users and search engine spiders you will more than likely get banned for your efforts.

Also, this might be an incomplete list of search engine spider user agents. If you know any more then please write a comment and I will add them to this list.

Categories: PHP

Enable Custom Field Searching With WordPress 2.6

September 5th, 2008 1 comment

I have previously talked about enabling custom field searching in WordPress, but that involved altering the main WordPress files, which is a big no-no.

So is there an alternative? Well, yes, otherwise I wouldn’t have bothered writing the post!

To enable custom field (also called WordPress metadata) searching you need to set up two things.

First you need to have created a custom field (or two) and added this to a number of posts.

Next, you need to have a custom search form that has the name of the field set as the name of an input box. You don’t even need the normal s input box that WordPress uses as default.

Open up your template functions.php file and add in the following three lines of code.

add_filter('posts_join','search_metadata_join');
add_filter('posts_where','search_metadata');
add_filter('posts_groupby','search_meta_groupby');

This adds three callback filters to WordPress at run time. The posts_join filter will add the meta table into our query so that we can work with it. The posts_where filter will be the main function where all of our where clauses will be built. Finally, posts_groupby will stop multiple posts appearing for the same term. All we have to do now is include the queries we need in order to enable custom field searching. I will do each of the callback functions in the order they appeared above.

function search_metadata_join($join) {
 global $wp_query, $wpdb, $wp_version;
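  // add in check for older versions; version_compare() avoids plain string comparison, which would treat '2.10' as lower than '2.3'
 if(version_compare($wp_version, '2.3', '>=')){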
  $join .= " LEFT JOIN $wpdb->postmeta AS m ON ($wpdb->posts.ID = m.post_id) ";
 }else{
  $join .= "LEFT JOIN $wpdb->postmeta ON $wpdb->posts.ID = $wpdb->postmeta.post_id ";
 }
 
 return $join;
}

The search_metadata() function has an array variable called $metaVars. This contains the names of the custom fields that you want to search through. Each item in the array is checked to see if it has a value in the query string, and if it does a clause for it is added to the $clauses array. The clauses are joined with OR, so a post matches if any of the custom fields match, which is what I wanted it to do. If none of the fields were submitted the where clause is left untouched.

function search_metadata($where) {
 global $wpdb, $wp_version;
 
 $metaVars = array('custom1','custom2');
 // the join only aliases the meta table as "m" from WordPress 2.3 onwards
 $prefix = version_compare($wp_version, '2.3', '>=') ? 'm.' : '';
 $clauses = array();
 foreach($metaVars as $metaKey){
  if(!empty($_GET[$metaKey])){
   $clauses[] = "(" . $prefix . "meta_key = '" . $metaKey . "' AND " . $prefix . "meta_value LIKE '%" . $wpdb->escape($_GET[$metaKey]) . "%')";
  }
 }
 // only alter the query when at least one custom field was submitted
 if(!empty($clauses)){
  $where .= ' AND (' . implode(' OR ', $clauses) . ') ';
 }
 return $where;
}

Finally, here is the search_meta_groupby() function. I pulled this from the wordpress.org site and adapted it slightly.

function search_meta_groupby( $groupby ){
 global $wpdb;
 
 // we need to group on post ID
 $mygroupby = "{$wpdb->posts}.ID";
 if( preg_match( "/" . preg_quote($mygroupby, "/") . "/", $groupby )) {
  // grouping we need is already there
  return $groupby;
 }
 
 if( !strlen(trim($groupby))) {
  // groupby was empty, use ours
  return $mygroupby;
 }
 
 // wasn't empty, append ours
 return $groupby . ", " . $mygroupby;
}

All this is designed to get you started in creating custom field WordPress searching. You might have to tweak the code involved here before you get it working.

Categories: Wordpress

Custom Search Form With WordPress Search Widget

July 30th, 2008 No comments

When adding a search form to your WordPress blog you will want to have control over what sort of form is displayed. It is possible to override the search form created by the widget function without having to go into the /wp-includes/widget.php file and edit the wp_widget_search function. Here is the function that is present in WordPress 2.6.

function wp_widget_search($args) {
 extract($args);
 $searchform_template = get_template_directory() . '/searchform.php';
 
 echo $before_widget;
 
 // Use current theme search form if it exists
 if ( file_exists($searchform_template) ) {
  include_once($searchform_template);
 } else { ?>
  <form id="searchform" method="get" action="<?php bloginfo('url'); ?>/"><div>
  <label class="hidden" for="s"><?php _e('Search for:'); ?></label>
  <input type="text" name="s" id="s" size="15" value="<?php the_search_query(); ?>" />
  <input type="submit" value="<?php echo attribute_escape(__('Search')); ?>" />
  </div></form>
 <?php }
 
 echo $after_widget;
}

Notice that the first thing the function tries to do is load a template file called searchform.php, and if this doesn’t exist the function prints out a standard search form. That is it basically. If you want to override the search form created by this function then just create a file called searchform.php in your template directory and create a search form inside this file. The form must have the following:

  • The action of the form should go to the home of the blog. So if your blog is located at domain.com/blog then the action should go to there.
  • The method of the form should be get, but the form will also work with a post method.
  • A text input box must be present and this must have the name of s.
  • For good search form usability you should print out the search query in the text field of the form. This can be done with the function the_search_query(), which will print out the search query, if there is one.

There are many different ways to create a search form. I found these two in two different templates, created by different people.

<h2>Search</h2>
<form method="get" id="searchform" action="<?php echo $_SERVER['PHP_SELF']; ?>">
<div class="searchbox">
<label for="s">Find:</label>
<input type="text" value="<?php echo wp_specialchars($s, 1); ?>" name="s" id="s" size="14" />
<input type="submit" id="searchsubmit" value="Search" />
</div>
</form>

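This one uses a little bit of JavaScript to show the placeholder text "search..." in the text box, clearing it when the box gains focus and restoring it if the box is left empty.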

<div id="search">
<form method="get" id="searchform" action="<?php bloginfo('home'); ?>/">
<div><input type="text" value="search..." name="s" id="s" onfocus="if (this.value == 'search...') {this.value = '';}" onblur="if (this.value == '') {this.value = 'search...';}" />
</div>
</form>
</div>

They both seem to work quite well, but I would advise that any search form you create should be copied from the original and modified. This is usually the best practice with WordPress themes as the WordPress documentation can be a little thin on the ground (or hidden) and their developers tend to use the best functions available.

Categories: Wordpress