Full-text search with ColdFusion using Sphinx
Configuration (3/3)
Sphinx indexer
Please see the docs for details. You really need to tune these settings so that they match both the resources available and the amount of data in your indexes.
indexer
{
mem_limit = 1024M
max_xmlpipe2_field = 8M
write_buffer = 12M
}
For the indexer to actually do something, we'll have to add jobs to the crontab - we'll deal with this later.
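To get a feel for how a mem_limit value relates to the machine it runs on, a small shell sketch like the following may help. The helper names size_to_bytes and mem_limit_percent are made up for this illustration and are not part of Sphinx; the sketch only assumes that sizes are written as plain bytes or with a K or M suffix, as in the block above.

```sh
# Hypothetical helpers for illustration only; they are not part of Sphinx.

# Convert a size such as "1024M", "512K" or "4096" to bytes.
size_to_bytes() {
    case $1 in
        *[Mm]) echo $(( ${1%[Mm]} * 1024 * 1024 )) ;;
        *[Kk]) echo $(( ${1%[Kk]} * 1024 )) ;;
        *)     echo $(( $1 )) ;;
    esac
}

# Print LIMIT ($1) as a whole-number percentage of TOTAL_BYTES ($2), rounded down.
mem_limit_percent() {
    limit_bytes=$(size_to_bytes "$1")
    echo $(( limit_bytes * 100 / $2 ))
}

# Example on Linux, taking total RAM from /proc/meminfo (reported there in kB):
#   total=$(awk '$1 == "MemTotal:" { printf "%.0f\n", $2 * 1024 }' /proc/meminfo)
#   mem_limit_percent 1024M "$total"
```

What share of RAM is reasonable depends on what else runs on the box, so treat the number as a starting point for tuning rather than a rule.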
Sphinx search daemon (searchd)
searchd
{
listen = 9312
log = /var/log/sphinx/searchd.log
query_log = /var/log/sphinx/query.log
read_timeout = 120
max_children = 100
pid_file = /var/run/sphinx/searchd.pid
preopen_indexes = 1
max_packet_size = 32M
crash_log_path = /var/log/sphinx/crashlog
read_buffer = 1M
}
# --eof--
listen sets the port and/or socket path that your clients connect to. In previous Sphinx versions the default port was 3312. With 0.9.9 it changed to 9312, which is now the official IANA-assigned port for the Sphinx API. The rest of the settings are well documented in both the online docs and the example config file sphinx.conf.dist, which you'll find in /etc/sphinx/ after installation.
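Since clients have to use whatever listen is set to, scripts can read the value from the config instead of hardcoding it. The following is only a sketch: searchd_setting is a hypothetical helper, and it assumes the simple layout shown above, with the section name alone on its line and one "key = value" pair per line with spaces around the equals sign.

```sh
# Hypothetical helper for illustration only; it is not part of Sphinx.
# Print the value of KEY ($2) from the "searchd { ... }" block of CONF ($1).
# Assumes "key = value" lines with whitespace around "=" and no spaces in the value.
searchd_setting() {
    awk -v key="$2" '
        $1 == "searchd" && NF == 1         { in_block = 1; next }  # section starts
        in_block && $1 == "}"              { in_block = 0 }        # section ends
        in_block && $1 == key && $2 == "=" { print $3 }
    ' "$1"
}

# Example:
#   searchd_setting /etc/sphinx/sphinx.conf listen
```

A listen line that carries an address, a socket path or a protocol suffix would need further parsing, which this sketch does not attempt.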
Sphinx indexing job
Now that everything is in place, we should do a first indexing run:
su sphinx -c "/opt/sphinx/bin/indexer \
  --config /etc/sphinx/sphinx.conf forummain"
This should output something like
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff
using config file '/etc/sphinx/sphinx.conf'...
indexing index 'forummain'...
collected 4043648 docs, 1963.5 MB
sorted 515.8 Mhits, 100.0% done
total 4043648 docs, 1963534582 bytes
total 1431.203 sec, 1371946 bytes/sec, 2825.34 docs/sec
total 21 reads, 3.403 sec, 124044.4 kb/call avg, 162.0 msec/call avg
total 508 writes, 26.715 sec, 12481.5 kb/call avg, 52.5 msec/call avg
This run took roughly 24 minutes (1431 seconds) for about four million documents. Now let's build the delta index:
su sphinx -c "/opt/sphinx/bin/indexer \
  --config /etc/sphinx/sphinx.conf forumdelta"
This will take no more than a few seconds:
Sphinx 0.9.9-release (r2117)
Copyright (c) 2001-2009, Andrew Aksyonoff
using config file '/etc/sphinx/sphinx.conf'...
indexing index 'forumdelta'...
collected 20 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 20 docs, 10456 bytes
total 0.223 sec, 46709 bytes/sec, 89.34 docs/sec
total 2 reads, 0.000 sec, 11.4 kb/call avg, 0.0 msec/call avg
total 8 writes, 0.000 sec, 13.4 kb/call avg, 0.0 msec/call avg
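If you want to keep an eye on how long the runs take as your data grows, the elapsed time can be pulled out of the indexer summary. This is a rough sketch with two hypothetical helpers, indexer_elapsed_seconds and seconds_to_minutes; it assumes the indexer prints each "total ..." statement on its own line in the format shown in the outputs above.

```sh
# Hypothetical helpers for illustration only; they are not part of Sphinx.

# Read indexer output on stdin and print the elapsed seconds taken from the
# "total <seconds> sec, <n> bytes/sec, <n> docs/sec" line.
indexer_elapsed_seconds() {
    awk '$1 == "total" && $3 == "sec," { print $2 }'
}

# Convert seconds ($1) to minutes with one decimal place.
seconds_to_minutes() {
    awk -v s="$1" 'BEGIN { printf "%.1f\n", s / 60 }'
}

# Example (the indexer command itself is the one used above):
#   secs=$(su sphinx -c "/opt/sphinx/bin/indexer --config /etc/sphinx/sphinx.conf forummain" \
#          | indexer_elapsed_seconds)
#   seconds_to_minutes "$secs"
```

For the main index run shown above, 1431.203 seconds works out to just under 24 minutes.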
Now it is time to set up the Sphinx start script and start the search daemon:
chmod +x /etc/init.d/sphinx
update-rc.d sphinx defaults
/etc/init.d/sphinx start
And finally we set up the jobs for the indexer in /etc/crontab:
# rebuild main and archive indexes from scratch once a day
10 1 * * * sphinx /opt/sphinx/bin/indexer --rotate --config /etc/sphinx/sphinx.conf forummain 2>&1 > /dev/null
30 1 * * * sphinx /opt/sphinx/bin/indexer --rotate --config /etc/sphinx/sphinx.conf forumarchive 2>&1 > /dev/null
# rebuild forumdelta index every five minutes from 00:00-00:55, 02:00-23:55,
# leaving a one hour gap from 01:00-01:55 during which the full index will be rebuilt
*/5 0,2-23 * * * sphinx /opt/sphinx/bin/indexer --rotate --config /etc/sphinx/sphinx.conf forumdelta > /dev/null
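To double-check what the delta schedule "*/5 0,2-23 * * *" actually covers, the expression can be mirrored in a few lines of shell. This is just an illustration of the cron fields; delta_job_fires and count_delta_runs are hypothetical names, and hours and minutes must be passed as plain integers without leading zeros.

```sh
# Hypothetical helpers for illustration only; cron itself does the real scheduling.

# Succeed if the forumdelta entry "*/5 0,2-23 * * *" fires at HOUR ($1) : MINUTE ($2).
delta_job_fires() {
    hour=$1 minute=$2
    # minute field */5 -> every minute divisible by 5
    # hour field 0,2-23 -> every hour of the day except 1
    [ $((minute % 5)) -eq 0 ] && [ "$hour" -ne 1 ]
}

# Count how many times per day the delta job runs.
count_delta_runs() {
    runs=0
    h=0
    while [ "$h" -le 23 ]; do
        m=0
        while [ "$m" -le 59 ]; do
            if delta_job_fires "$h" "$m"; then
                runs=$((runs + 1))
            fi
            m=$((m + 1))
        done
        h=$((h + 1))
    done
    echo "$runs"
}
```

With 12 runs in each of the 23 covered hours, count_delta_runs prints 276; nothing fires between 01:00 and 01:55, which is the window reserved for the full rebuilds.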
The additional --rotate argument tells the indexer to send SIGHUP to searchd, which causes the search daemon to reload the index from the newly created files into memory.
Concerning the server side, we're all done. Of course you should still do the standard admin stuff, like dealing with log rotation and setting up a Nagios watchdog for /opt/sphinx/bin/searchd, but that won't be covered here.
Next page: The Sphinx search component
