Wednesday, December 29, 2010

Proximity search using SQLite's FTS feature

A few months ago I was playing with SQLite's Full Text Search (FTS) feature. I was especially interested in the NEAR operator for MATCH queries, which lets you search for a set of terms that occur within 'm' words of each other. Lucene has this feature too (obviously), through SpanQuery. This is called proximity search, if you didn't already know.

This kind of search has its limitations, and so does SQLite, especially when it comes to performance on large data sets. I chose SQLite with the SQLite-JDBC driver because it is simple to set up and gives you a SQL interface (duh!). I created the FTS table in an in-memory database and tried some simple queries. It's not too bad. I'll just file it for later.
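To make the NEAR operator concrete, here is a minimal sketch of a proximity query against an FTS table in an in-memory database. It uses Python's built-in sqlite3 module rather than the SQLite-JDBC driver, purely to keep the example self-contained, and it assumes the underlying SQLite build has FTS3 compiled in. The table name `docs`, the column `body`, the helper `near` and the sample sentences are all made up for illustration.

```python
import sqlite3

# In-memory database; "docs" and "body" are illustrative names.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE VIRTUAL TABLE docs USING fts3(body)")
conn.executemany(
    "INSERT INTO docs(body) VALUES (?)",
    [
        ("sqlite is a small embedded database engine",),
        ("sqlite ships with a surprisingly capable full text search engine",),
        ("the database talk covered lucene and not much else",),
    ],
)


def near(term_a, term_b, max_gap):
    """Return rowids where term_a and term_b have at most max_gap terms between them."""
    # NEAR must be upper case; "/N" sets the maximum number of intervening terms.
    query = f"{term_a} NEAR/{max_gap} {term_b}"
    rows = conn.execute(
        "SELECT rowid FROM docs WHERE body MATCH ? ORDER BY rowid", (query,)
    )
    return [row[0] for row in rows]


print(near("sqlite", "engine", 5))  # [1]: five words sit between the terms in row 1
print(near("sqlite", "engine", 8))  # [1, 2]: row 2 has eight words in between
print(near("sqlite", "engine", 4))  # []: no row has the terms that close together
```

Leaving out the "/N" suffix falls back to the FTS3 default of 10 intervening terms.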

Here's the code. I just create 2 streams of stock ticks (all contrived, just like the rest of the code) and try to search for patterns in the 2 series. It does not exactly do what I wanted it to, but it was fun to play with the concept.

1 comment:

Subbaraj Ramalingam said...

Nice Concept dude..interesting to play around..