Thursday, October 21, 2010

select 'Whoa..' || 'Cool!' from mozilla_history_storage

SQLite is a brilliant embedded SQL engine and it is used by software that you and I use everyday - Firefox, Android and many more. See earlier reference.
What you might not know is how extensively it is used in Firefox. History, bookmarks all use SQLite. The kind of queries that are used would put many Enterprise applications to shame. I was just curious to see what was under the hood and decided to hunt down the History and Bookmarks source code.

 "SELECT v.visit_date, COALESCE( "
   "(SELECT r.visit_type FROM moz_historyvisits_temp r "
     "WHERE v.visit_type IN ") +
       nsPrintfCString("(%d,%d) ", TRANSITION_REDIRECT_PERMANENT,
                                   TRANSITION_REDIRECT_TEMPORARY) +
       NS_LITERAL_CSTRING(" AND r.id = v.from_visit), "
   "(SELECT r.visit_type FROM moz_historyvisits r "
     "WHERE v.visit_type IN ") +
       nsPrintfCString("(%d,%d) ", TRANSITION_REDIRECT_PERMANENT,
                                   TRANSITION_REDIRECT_TEMPORARY) +
       NS_LITERAL_CSTRING(" AND r.id = v.from_visit), "
   "visit_type) "
 "FROM moz_historyvisits_temp v "
 "WHERE v.place_id = :page_id "
 "UNION ALL "
 "SELECT v.visit_date, COALESCE( "
   "(SELECT r.visit_type FROM moz_historyvisits_temp r "
     "WHERE v.visit_type IN ") +
       nsPrintfCString("(%d,%d) ", TRANSITION_REDIRECT_PERMANENT,
                                   TRANSITION_REDIRECT_TEMPORARY) +
       NS_LITERAL_CSTRING(" AND r.id = v.from_visit), "
   "(SELECT r.visit_type FROM moz_historyvisits r "
     "WHERE v.visit_type IN ") +
       nsPrintfCString("(%d,%d) ", TRANSITION_REDIRECT_PERMANENT,
                                   TRANSITION_REDIRECT_TEMPORARY) +
       NS_LITERAL_CSTRING(" AND r.id = v.from_visit), "
   "visit_type) "
 "FROM moz_historyvisits v "
 "WHERE v.place_id = :page_id "
   "AND v.id NOT IN (SELECT id FROM moz_historyvisits_temp) "
 "ORDER BY visit_date DESC LIMIT ") +

Now...see this. Imagine doing this without SQL? I am however, a little disappointed. I was expecting the code to be using some fancy NGram search like what Lucene does, but it doesn't look like it.
 SELECT DISTINCT null, 
        'place:type=%ld&sort=%ld&domain=&domainIsHost=true
          &beginTime='||:begin_time||'&endTime='||:end_time, 
        :localhost, :localhost, null, null, null, null, null, null, null 
 WHERE EXISTS( 
   SELECT h.id 
   FROM moz_places h 
   JOIN moz_historyvisits v ON v.place_id = h.id 
   WHERE h.hidden <> 1 AND h.rev_host = '.' 
     AND h.visit_count > 0 
     AND h.url BETWEEN 'file://' AND 'file:/~' 
     {QUERY_OPTIONS_VISITS} {QUERY_OPTIONS_PLACES} 
     {ADDITIONAL_CONDITIONS} 
   UNION 
   SELECT h.id 
   FROM moz_places_temp h 
   JOIN moz_historyvisits v ON v.place_id = h.id 
   WHERE h.hidden <> 1 AND h.rev_host = '.' 
     AND h.visit_count > 0 
     AND h.url BETWEEN 'file://' AND 'file:/~' 
     {QUERY_OPTIONS_VISITS} {QUERY_OPTIONS_PLACES} 
     {ADDITIONAL_CONDITIONS} 
   UNION 
   SELECT h.id 
   FROM moz_places h 
   JOIN moz_historyvisits_temp v ON v.place_id = h.id 
   WHERE h.hidden <> 1 AND h.rev_host = '.' 
     AND h.visit_count > 0 
     AND h.url BETWEEN 'file://' AND 'file:/~' 
     {QUERY_OPTIONS_VISITS} {QUERY_OPTIONS_PLACES} 
     {ADDITIONAL_CONDITIONS} 
   UNION 
   SELECT h.id 
   FROM moz_places_temp h 
   JOIN moz_historyvisits_temp v ON v.place_id = h.id 
   WHERE h.hidden <> 1 AND h.rev_host = '.' 
     AND h.visit_count > 0 
     AND h.url BETWEEN 'file://' AND 'file:/~' 
     {QUERY_OPTIONS_VISITS}  {QUERY_OPTIONS_PLACES} 
     {ADDITIONAL_CONDITIONS}         
 ) 
 UNION ALL 
 SELECT DISTINCT null, 
        'place:type=%ld&sort=%ld&domain='||host||'&domainIsHost=true
          &beginTime='||:begin_time||'&endTime='||:end_time, 
        host, host, null, null, null, null, null, null, null 
 FROM ( 
   SELECT DISTINCT get_unreversed_host(rev_host) AS host 
   FROM moz_places h 
   JOIN moz_historyvisits v ON v.place_id = h.id 
   WHERE h.rev_host <> '.' 
     AND h.visit_count > 0 
     {QUERY_OPTIONS_VISITS} {QUERY_OPTIONS_PLACES} 
     {ADDITIONAL_CONDITIONS} 
   UNION 
   SELECT DISTINCT get_unreversed_host(rev_host) AS host 
   FROM moz_places_temp h 
   JOIN moz_historyvisits v ON v.place_id = h.id 
   WHERE h.rev_host <> '.' 
     AND h.visit_count > 0 
     {QUERY_OPTIONS_VISITS} {QUERY_OPTIONS_PLACES} 
     {ADDITIONAL_CONDITIONS} 
   UNION 
   SELECT DISTINCT get_unreversed_host(rev_host) AS host 
   FROM moz_places h 
   JOIN moz_historyvisits_temp v ON v.place_id = h.id 
   WHERE h.rev_host <> '.' 
     AND h.visit_count > 0 
     {QUERY_OPTIONS_VISITS} {QUERY_OPTIONS_PLACES} 
     {ADDITIONAL_CONDITIONS} 
   UNION 
   SELECT DISTINCT get_unreversed_host(rev_host) AS host 
   FROM moz_places_temp h 
   JOIN moz_historyvisits_temp v ON v.place_id = h.id         
   WHERE h.rev_host <> '.' 
     AND h.visit_count > 0 
     {QUERY_OPTIONS_VISITS} {QUERY_OPTIONS_PLACES} 
     {ADDITIONAL_CONDITIONS}         
   ORDER BY 1 ASC 

There's plenty more where this came from:
  - mozilla-central/source/toolkit/components/places/src/nsNavHistory.cpp
  - https://developer.mozilla.org/en/Places/Query_System
  - https://developer.mozilla.org/en/Storage
  - https://developer.mozilla.org/en/Places/History_Service/Design

If you wish to use SQLite from your Java program, there are some drivers available. SQLite also supports a pure memory mode, much like my other favorite database H2.
  - http://www.xerial.org/trac/Xerial/wiki/SQLiteJDBC
  - http://code.google.com/p/sqlite4java/

2 comments:

Ashwin Jayaprakash said...

Correction.. I should've realized that the Full Text Seach inside SQLite is doing all the grunt work - sqlite.org/fts3.html

Ashwin Jayaprakash said...

Another one - sqljet - Pure Java SQLite.