Thursday, December 17, 2009

Burst of activity in Lucene

To put it very simply, search engines move a lot of work from query time to index time. This speeds up queries at the cost of making document additions slower. Until now, Lucene-based systems have struggled with scenarios in which searchers need to see changes instantly (think Twitter Search). A variety of tricks and techniques already exist to achieve this. However, near real-time search support in Lucene itself is a boon to everyone who has been building and managing such systems, because the grunt work will be done by Lucene itself.
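To make the index-time versus query-time trade-off concrete, here is a minimal sketch of a toy inverted index. It is not how Lucene is implemented; the `TinyIndex` class and its `add` and `search` methods are hypothetical names chosen for illustration, and tokenization is assumed to be a plain whitespace split.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (illustrative only): term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        # Index-time work: tokenize the document once and record,
        # for every term, which documents contain it.
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def search(self, query):
        # Query-time work: look up each term's postings and intersect them.
        # No document text is scanned here.
        terms = query.lower().split()
        if not terms:
            return set()
        result = set(self.postings.get(terms[0], set()))
        for term in terms[1:]:
            result &= self.postings.get(term, set())
        return result


index = TinyIndex()
index.add(1, "Lucene adds near real-time search")
index.add(2, "Solr is built on Lucene")
hits = index.search("lucene search")
```

Adding a document costs more than simply storing it, because every term has to be recorded, but a query then only touches the postings for the terms it contains. In this toy version a newly added document is searchable immediately; a real engine typically batches and flushes index changes, and that delay is roughly the gap near real-time search aims to close.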

I can't wait to see this implemented. I'm in the process of rethinking a big application to use Lucene/Solr instead of hitting a database for searches. Even though this application only gets data updates twice daily, they can be numerous, and I can only imagine that having near real-time search available will make it that much easier to keep the search indexes up to date.

Sometimes I honestly can't believe all the amazing open source projects that are out there.

