devbox@COMPUTEC The Computec development blog

5 May 2010

Full-text search with ColdFusion using Sphinx

Full-text search is, and probably will remain for a long time, an interesting challenge for any database-driven application. Of course ColdFusion already offers a couple of options, though I found most of them either somewhat lacking in features or quite complicated to set up.

As we mostly run PostgreSQL as our database backend, we used to rely solely on TSearch2, that database's built-in full-text search. But over the years we have accumulated so much data, some of it nicely distributed over several tables (the dark side of normalization), that we were really yearning for a less table-based and more document-focused indexing mechanism - and for more speed than TSearch2 could deliver.
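To make "document-focused" a bit more concrete, here is a minimal sketch of the idea: the text belonging to one article is spread over several tables, and a single query flattens it into one row per document, which is the shape a full-text indexer wants to consume. Sphinx's SQL data sources work along these lines, pulling documents with a query. Everything in the example is an assumption for illustration: the `authors`, `articles` and `tags` tables are made up (not our real schema), and Python with an in-memory SQLite database is used only so the snippet runs on its own.

```python
import sqlite3

# Hypothetical, simplified schema: an article's searchable text is spread
# over three tables, as it might be in a normalized CMS database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors  (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE articles (id INTEGER PRIMARY KEY, author_id INTEGER,
                           title TEXT, body TEXT);
    CREATE TABLE tags     (article_id INTEGER, tag TEXT);

    INSERT INTO authors  VALUES (1, 'Jane Doe');
    INSERT INTO articles VALUES (10, 1, 'Sphinx basics', 'Indexing with Sphinx.');
    INSERT INTO tags     VALUES (10, 'search'), (10, 'sphinx');
""")


def fetch_documents(conn):
    """Return one flat (id, title, content) row per article.

    The joins and the subquery pull author name and tags into a single
    text column, so the indexer sees one document instead of three tables.
    """
    return conn.execute("""
        SELECT a.id,
               a.title,
               a.body || ' ' || au.name || ' ' ||
               COALESCE((SELECT group_concat(t.tag, ' ')
                         FROM tags t
                         WHERE t.article_id = a.id), '') AS content
        FROM articles a
        JOIN authors au ON au.id = a.author_id
        ORDER BY a.id
    """).fetchall()


documents = fetch_documents(conn)
for doc_id, title, content in documents:
    print(doc_id, title, content)
```

The point of the design is that the cost of joining happens once, at indexing time, rather than on every search request.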

Verity never quite met all of our needs and was a real pain to set up and maintain. CF 9's Solr, which is based on Lucene, might be a mighty step forward, but we're still running ColdFusion 8, so I really cannot say much about the handling and performance of the new indexing beast.

For our use cases (i.e. indexing articles and products in our CMS as well as our forums), Sphinx (short for SQL Phrase Index) has shown some amazing results - and we have been using it for a couple of months now. In this article I'll show you how to compile, set up and use Sphinx in your ColdFusion application to retrieve search results from documents stored in a PostgreSQL or MySQL database.

At the time of writing, the current version of Sphinx is 0.9.9-release. I assume you intend to deploy the Sphinx server on a Linux box (I'm using Debian Lenny) and are familiar with compiling code - if you're on Windows, you should get the binary distribution and see the Sphinx docs for installation.

The server you run Sphinx on may or may not be the same as your ColdFusion box, though I'd recommend setting up a separate machine with a generous amount of RAM - Sphinx keeps a good part of each index (such as the attributes and the dictionary) in memory, which goes a long way towards explaining the amazing search speed, so the more data you wish to index, the more RAM you should provide.

Next page: Compiling Sphinx
