Build a Site Search with the Google Search API
Google provides several APIs, including one for their web-based search, another for their desktop toolbar, and one for their AdWords program. Here we look at creating a custom site search with PHP and the search API.
A quick note before getting into the meat of the article. If you happened to make it through last week's lengthy instalment, you should be well prepared for this article. I'm going to do my best to make this one a bit more concise, so that readers can get to playing with the API. Apologies if it seems that I gloss over anything. Code is available for downloading at the end of the article.
The formalities: Getting started with the Google APIs
Develop Your Own Applications Using Google is the homepage for Google's APIs. From this page you can get started by registering for an API key and downloading their developer's kit with some example code, then head over to their terms of service and some help and FAQs (there is also a Google Group).
Visions of Dollar Signs, Dancing
Before anyone gets too worked up over this API, note the following from their terms of service:
Can I develop commercial applications using Google Web APIs?
You can develop any application you want, but you must abide by the Google Web APIs terms of service. One condition is you cannot create a commercial service using Google Web APIs without first obtaining written consent from Google. Another is that you can only create one account for your personal use.
The Google Search API
Much like our look at the Yahoo! Search API, we will be focusing on the Search API to build a site search.
Google's main search API offers us three methods, doGoogleSearch(), doGetCachedPage() and doSpellingSuggestion(), which do what you would expect:
- Given search terms, return results
- Given a web page, return the cached copy
- Given a word, return spelling suggestions
Getting creative, you could allow a user to do a search query, then fetch the results and the resulting cache pages, take an automated screenshot of the cached pages (trimming out the Google cache info from the top of the page) and provide a set of linked images with the search results.
Steps to building our site search
Much like the last instalment, we will be taking input, in the form of search terms from a user, and building a request that we will send to the API server. We will then take the response, unserialize it and format the results into some HTML.
Google uses the SOAP protocol (specs, W3Schools, Wikipedia) for sending data between its API server and your application. PHP 5 can handle SOAP natively; for PHP 4, however, we need to use an external library to send, receive, and unserialize our communications. As you can see, this is much the same process as with the REST-powered Yahoo! example from last week.
There are many SOAP classes available for use, and since we'll be using PHP (4) in this example, I've chosen to use NuSOAP. You may also want to check out the PEAR SOAP class.
Note: if you are playing with NuSOAP on PHP 5, it will throw an error, as the NuSOAP client class has the same name as the SOAP client built into PHP 5: soapclient (class names in PHP are case-insensitive).
Step 1: Request and receive
The first thing we need to do is build a request to send to Google. We will do this by setting our search parameters in an array:
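    // Build an array with the parameters we want to use:
    $params = array(
        'key' => 'yourGoogleAPIKeyHere', // the API key you registered for
        'q' => 'Search Terms Here',      // the search query
        'start' => 0,                    // zero-based index of the first result
        'maxResults' => 10,              // number of results to return
        'filter' => true,                // let Google filter out similar results
    );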
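If you want to page through results, the start value has to move along with the page number. Here is a minimal sketch of how that could be wrapped up. The function name buildSearchParams() and its $page and $perPage arguments are my own invention for illustration, and the cap of 10 results per request is an assumption based on the limits documented for the API at the time, so check it against the current documentation.

```php
<?php
// Hypothetical helper (not part of the Google kit or NuSOAP):
// turn a 1-based page number into the request array shown above.
function buildSearchParams($key, $query, $page = 1, $perPage = 10)
{
    // Assumption: the API returns at most 10 results per request.
    $perPage = max(1, min(10, (int) $perPage));
    // Treat anything below page 1 as page 1.
    $page = max(1, (int) $page);

    return array(
        'key'        => $key,
        'q'          => $query,
        'start'      => ($page - 1) * $perPage, // zero-based offset of the first result
        'maxResults' => $perPage,
        'filter'     => true,
    );
}
```

Calling buildSearchParams('yourGoogleAPIKeyHere', 'css', 3) would ask for results starting at offset 20.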
Site search
In order to make a site search, a couple of things need to take place.
- A form that passes the search terms must be used, and when submitted those terms must be passed into the value for q in our array above.
- As Google doesn't provide a parameter for a site search, a value of site:www.yoursite.com must be embedded into the value for q.
Taking into account those two points, the line for q would be 'q' => 'site:www.yoursite.com search terms'.
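The two points above can be sketched as one small helper that takes whatever the user typed and produces the value for q. The name buildSiteQuery() is hypothetical, and stripping any site: operator the user typed is my own design choice to keep the search on your site, not something the API requires.

```php
<?php
// Hypothetical helper: combine a site restriction with user-supplied terms.
function buildSiteQuery($site, $terms)
{
    // Collapse runs of whitespace and trim the ends of the user's input.
    $terms = trim(preg_replace('/\s+/', ' ', $terms));
    // Drop any site: operator the user typed so the search stays on our site.
    $terms = trim(preg_replace('/\bsite:\S+\s*/i', '', $terms));

    // Prepend our own site restriction; omit the space if no terms remain.
    return 'site:' . $site . ($terms === '' ? '' : ' ' . $terms);
}
```

In a form handler this might be used as 'q' => buildSiteQuery('www.yoursite.com', $_GET['q']), assuming the form field is named q.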
Moving forward: request and receive in one easy step
This next part moves quite fast, condensing a few of the steps done in last week's Yahoo! example into just a few lines of code. Rather than opening a file with PHP, we will be passing the URL and the parameters directly to the NuSOAP class:
    // include the class:
    include('nusoap.php');

    // instantiate a new soap client:
    $soapclient = new soapclient("http://api.google.com/search/beta2");

    // send the query off to the server, with our
    // parameters and using the 'doGoogleSearch' method
    $searchresults = $soapclient->call("doGoogleSearch", $params,
        "urn:GoogleSearch", "urn:GoogleSearch");
First we include the class, then we instantiate a new 'soapclient', passing it the URI for the API server. From this point we can call the server as outlined in the example.
NuSOAP returns the data from the server to us in the form of an array. Compared to the example from last week, this was certainly much simpler to get from the request to an array of data (granted, NuSOAP is composed of a lot of lines of code, and it did all of the heavy lifting).
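Before trusting that array, it is worth checking whether the call actually succeeded. As far as I know, the NuSOAP client exposes a fault property and a getError() method for this, and a fault comes back as an array with a faultstring entry; treat those details as assumptions to verify against your copy of the library. The function name soapProblem() below is hypothetical. A rough sketch:

```php
<?php
// Hypothetical helper: describe what went wrong with a SOAP call, or
// return null when nothing did. $client is expected to expose a `fault`
// property and a getError() method, as the NuSOAP client is assumed to.
function soapProblem($client, $result)
{
    // The server answered, but with a SOAP fault (for example a bad key).
    if (!empty($client->fault)) {
        $reason = (is_array($result) && isset($result['faultstring']))
            ? $result['faultstring']
            : 'unknown fault';
        return 'SOAP fault: ' . $reason;
    }

    // The request never completed properly (network or parsing trouble).
    $error = $client->getError();
    if ($error) {
        return 'Transport error: ' . $error;
    }

    return null;
}
```

Right after the call you could then write $problem = soapProblem($soapclient, $searchresults); and only go on to format results when $problem is null.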
A look at the data
If you were to print_r($searchresults) at this point, you would see something similar to the following:
    Array
    (
        [directoryCategories] => Array
            (
            )
        [documentFiltering] =>
        [endIndex] => 2
        [estimateIsExact] =>
        [estimatedTotalResultsCount] => 190
        [resultElements] => Array
            (
                [0] => Array
                    (
                        [URL] => http://www.fiftyfoureleven.com/sandbox/weblog/2004/jun/the-definitive-css-gzip-method/
                        [cachedSize] => 18k
                        [directoryCategory] => Array
                            (
                                [fullViewableName] =>
                                [specialEncoding] =>
                            )
                        [directoryTitle] =>
                        [hostName] =>
                        [relatedInformationPresent] => 1
                        [snippet] => This post is the source for the most definitive/recent/tested version of gzipping
                        your CSS.
                        [summary] =>
                        [title] => The Definitive Post on Gzipping your CSS
                    )
                [1] => Array
                    (
                        [URL] => http://www.fiftyfoureleven.com/weblog/web-development/css/applied-css-management-and-optimization
                        [cachedSize] => 44k
                        [directoryCategory] => Array
                            (
                                [fullViewableName] =>
                                [specialEncoding] =>
                            )
                        [directoryTitle] =>
                        [hostName] =>
                        [relatedInformationPresent] => 1
                        [snippet] => Building on the previous discussion about managing CSS files, this post looks at
                        the practical solutions in use to help offset the results of some of the ...
                        [summary] =>
                        [title] => Applied CSS Management and Optimization
                    )
            )
        [searchComments] =>
        [searchQuery] => site:www.fiftyfoureleven.com css
        [searchTime] => 0.093227
        [searchTips] =>
        [startIndex] => 1
    )
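Given an array shaped like the dump above, turning it into some HTML is a matter of looping over resultElements. What follows is only a sketch under that assumption. The function name renderResults() and the choice of an ordered list are mine, and everything is escaped with htmlspecialchars() as the safe default. If the API sends markup such as bold tags inside the title or snippet (I believe it can), escaping will show those tags literally, so you may prefer to strip or whitelist them instead.

```php
<?php
// Hypothetical helper: format a doGoogleSearch result array as an HTML list.
// Assumes the array layout shown in the print_r() dump above.
function renderResults($searchresults)
{
    // Nothing came back, or the array is not shaped as expected.
    if (empty($searchresults['resultElements'])) {
        return '<p>No results found.</p>';
    }

    $html = "<ol>\n";
    foreach ($searchresults['resultElements'] as $result) {
        // Escape every value before it goes into the page.
        $url     = htmlspecialchars($result['URL'], ENT_QUOTES);
        $title   = htmlspecialchars($result['title'], ENT_QUOTES);
        $snippet = htmlspecialchars($result['snippet'], ENT_QUOTES);

        $html .= '<li><a href="' . $url . '">' . $title . '</a><br />'
               . $snippet . "</li>\n";
    }
    $html .= "</ol>\n";

    return $html;
}
```

With the earlier call in place, echo renderResults($searchresults); would print one list item per result.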
Looking at the print_r output above, we can see that the estimated total number of search results can be taken from $searchresults['estimatedTotalResultsCount'], and that our results are held in
$searchresul
