Making your site fast and offering search functionality is mostly covered with the integrated database selects – like with MySQL, you would probably use the MATCH AGAINST statement in a SELECT.
The problem with that statement is that it will get VERY SLOW as soon as you hit the amount of a few hundred thousand entries in your database. It becomes especially slow if you want to do live updates and quick AJAX requests to your servers like a LIVE search… answering to every keystroke of your user:
If you are in need for a specific search solution that can handle millions of datasets and responds QUICKLY for your users, we strongly recommend SphinxSearch.
Here is the quick start info for those, who want to decide, if that software is the right solution for them:
- It is a server-side software for Linux (and Windows)
- It is a server for itself: it runs on a port, just like MySQL does
- It is basically a quick INDEX retrieval engine, you may abuse it for data storage, too
- It works like a MySQL server
are the core descriptive aspects of it.
The installation and the operation of the software is pretty straight forward. The most important thing you have to understand before using it: The Sphinx server needs to be fed. You may either do that REALTIME or on a cronjob every few hours/minutes.
To give you an idea: we currently have around 30 Mio. products on tradebit. Every day merchants delete, add or modify the datasets. We run 2 different sphinx servers on dedicated machines and feed the changes periodically every 4 hours into the Sphinx server.
Here are some code / bash snippets to give you an idea:
/usr/bin/indexer –config /etc/sphinx/sphinx.conf idxAllEn –rotate
is run daily – only once and eats the whole database in less than 1 hour. While that server is under load, we use the second Sphinx server.
Before you can start to use that index, you have to create it with the SAME command, but without the “–rotate”.
To connect locally from your shell to the Sphinx Server, you use the mysql client, like you would connect to your database:
mysql -h localhost –protocol=tcp -P9306
in the default configuration. The port may be changed to suit your needs!
A full text search is then done like a SQL statement:
select id from idxTBVMANEN where match(‘test’);
and gives back the PRIMARY KEY of your full text “table”. These IDs are then used to select the original data from the database.
The trick is: selecting data from MySQL is sooooo much faster, if you have the PRIMARY KEYS already at hand! Once you have setup your servers and understood how cool SphinxSearch is: make sure the index you create is compact enough to sit in the memory and take a look at all the nice tricks the team published in their blog.