Review: SQLite has improved enough to be chosen by Google for its Gears app
Feature: A second look at SQLite finds continued feature and performance improvements
SQLite has been the ultra-small SQL database that can do a complete set of SQL operations with very fast response time especially where the number of concurrent users is small (1-20). Back in late 2004 we reviewed it along side SQL Anywhere. This review will look in more depth at some of the capabilities of SQLite; but think very complete, compact and fast for local data in the 10 to 200MB range.
This has attracted a number of Web-savvy software vendors. Adobe is including SQLite in its latest Air/Apollo beta as a local database target. Google has done the same for its Google Gears tool. And the reasons are obvious. SQLite has been designed to match its name. First, it advertises itself as the low administration and no maintenance database. And this reviewer can testify to the quality of both attributes. Unzip the download and you are good to go. If you have an inkling of SQL knowledge then you are good to go in and start working with SQLite right away. And SQLite has some very nice performance results which are noted in the tables below.
But best of all, in a world of software memory bloat, SQLite is really thin. In the 3 years since our first review of SQLite, PC desktops have moved from the 256-512MB to the 1-2GB range while SQLite's main engine has moved from 275KB to 350KB - 25% growth versus 200% in memory usage for OS in general. At the same time SQLite has added much better concurrency control and locking mechanisms. So lets take a broader look at some of the SQLite virtues.
SQLite Virtues
I suspect the virtue that may have attracted
the Google engineers the most is what SQLite designers call its zero-configuration capabilities:
SQLite does not need to be "installed" before it is used. There is no "setup" procedure.
There is no server process that needs to be started, stopped, or configured. There
is no need for an administrator to create a new database instance or assign access
permissions to users. SQLite uses no configuration files. Nothing needs to be done
to tell the system that SQLite is running. No actions are required to recover after
a system crash or power failure. There is nothing to troubleshoot.
SQLite just works.
I can testify to the fact that indeed on install of 726KB and 4files as soon as I entered: sqlite3 test.db
I was off to the races and in five minutes had three of the largest baseball database tables .imported and was doing sophisticated left joins and aggregate summary SELECTs with very fast response times. So fast that these SQL queries often matched if not beat the equivalent queries done in MySQL 5.1 which is no slouch itself. And remember this is code smaller than a megabyte. Do like I have done - put it on a flash memory card and have a SQL database with you at all times.
SQLite is Blazing Fast
Another attribute that must have attracted Google, PHP, Adobe and others to SQLite is its blazing fast speed of operation. The new version increases that despite moving to full Unicode support, by tuning the locking mechanisms and refining data storage. The following Performance Tests compare SQLite 3.4 versus MySQL 5.1 on the same machine and using the same database.
The data used for the following tests were drawn from the Lahman Baseball database and have 4200 records for the Allstars table and 16,700 records for the Master table. The table below summarizes our tests which started as usual with loading up the database.
Comparison of SQLite 3.4 with MySQL 5.1 |
Test |
SQLite |
MySQL |
Comment |
| 4200 Insert data statements |
8 min 13 sec |
2 min 45 sec |
Concern: 3 times slower |
| 4200 bulkload Insert |
2 min |
2 seconds |
Concern: no fast data load for SQLite |
| 4200 record CSV load |
1 sec |
1 sec |
fast table loads via import CSV |
| 4200 record delimited load |
1 sec |
1 sec |
neither use huge surplus CPU, mem |
| Select count(*) |
<1 sec |
< 1sec |
both very fast |
| Select * from allstars |
1 sec |
1 sec |
neither use huge surplus CPU, mem |
| Select, group, having |
2 sec |
1 sec |
Use of 1MB more memory by both |
| left join allstar, master database |
3 sec |
2 sec |
both, small use of memory and CPU |
| left join allstar, master grouped |
4 sec |
5 sec |
first big memory extra by both SQLs |
| update 1 field, all records |
1 sec |
2 sec |
updates will be fast |
| delete selectively 400 records |
1 sec |
1 sec |
deletes also are fast |
| delete all records |
< 1 sec |
< 1 sec |
especially empty table |
As you can see there was some concern that loading up SQLite
would lead to some troublesome performance trade-offs - especially given that data downloads in the era of DSL, cable and WiFI will only get bigger. However, a command line .CSV import is very fast. Another concern is that in an Internet setting, even offline operations will bring multiple users to any one Gears/SQLite database. Networked database settings with many concurrent users is one area that even SQLIte designers warn users about. This essentially becomes a locking problem. However, the new 3.4 version of SQLite attacks this problem in two ways:
1)It has new SHARED and RESERVED locking semantics that allow a database to be intending to write but not yet doing so such that SHARED Readers can continue to read the database until an EXCLUSIVE lock is taken on the database. At this EXCLUSIVE point all SHARED readers and other RESERVED or EXCLUSIVE writers have cleared
the database. With an EXCLUSIVE lock only one user can access the database.
2)With the ATTACH command users can link to other SQLite databases the syntax then changes as follows:
ATTACH BMASTER;
SELECT B.NAME FROM ALLSTAR2 A, BMASTER.MASTER B WHERE B.PLAYERID == A.PLAYERID
A database transaction can use up to 10 ATTACH-ed database tables. But more importantly, all locking and rollback journaling works across attached databases - so an EXCLUSIVE lock can lock several databases. But more importantly, it can leave several other databases unlocked for other readers and writers.
The bottom line is greater throughput for a well designed SQLite site. First, more concurrent reading and writing operations are possible with the new design (writer starvation and unnecessary reader lockout are reduced with attached databases). But also, now database designers will try to group tables into natural databases where common read-write operations take place and/or create snapshot databases and views for query-only operations.
One observation during the tests which surprised this reviewer is how parsimonious both SQLite and MySQL were about using more CPU and more memory. Neither went above 10% CPU usage or added more than 1.5MB to their memory allotment to get their database operations done. This despite the fact that 90% CPU was available and 1.5GB of virtual and 500MB of real memory were available during all the tests. Now this strategy appears to be geared to providing many concurrent processes/threads with "equal opportunity" to scarce CPU and memory resources. But in the case of SQLite, where the number of processes are likely to be lower - this may be performance opportunity lost. It is worthwhile reading what SQLite is architected for here and what usage limitations are envisioned for SQLite here.
But the bottom line is that SQLite is very fast indeed. Most SQL operations on the 16,000 records of the Baseball master file are done in less than a second. For many local uses this type of response time is more than adequate. But such speed also has the virtue of making testing and debugging as well as warehousing operations very fast. One can see the speed attraction of SQLite.
However, what has this reviewer amazed is the breadth of SQL syntax that is implemented with SQLite. Update, Delete, Insert commands with Constraint and Trigger enforcement using a ACID compliant Commit and Rollback capabilities. SQLite databases are not likely to be corrupted by legitimate data operations. All in all, these area major improvements over the first versions of SQLite.
SQLite is SQL Robust
Being so compact and fast, one would expect SQLite to cut corners on its implementation of SQL. However, the querying capabilities of SQLite are also impressively complete. For example, the database provides a full set of join and union syntax including use of subqueries. These are matched with traditional where, order by, group by, and having SELECT clauses. The syntax is very close to SQL92 compliant - something that is hard to say for major databases like SQLServer, Oracle and even DB2.
In fact I found SQLite SQL to be much richer than expected of a Lite database. For example, column and table name aliases are supported along with a complete set of SQL expressions and functions. Note the robust set of functions below (note date and time functions are not included in this list):
SQLite Math String, and Utility Functions |
| abs(X) |
Return the absolute value of argument X. |
| coalesce(X,Y,...) |
Return a copy of the first non-NULL argument. If all arguments are NULL then NULL is returned. There must be at least 2 arguments. |
| glob(X,Y) |
This function is used to implement the "X GLOB Y" syntax of SQLite. The sqlite3_create_function() interface can be used to override this function and thereby change the operation of the GLOB operator. |
| ifnull(X,Y) |
Return a copy of the first non-NULL argument. If both arguments are NULL then NULL is returned. This behaves the same as coalesce() above. |
| hex(X) |
The argument is interpreted as a BLOB. The result is a hexadecimal rendering of the content of that blob. |
| last_insert_rowid() |
Return the ROWID of the last row insert from this connection to the database. This is the same value that would be returned from the sqlite_last_insert_rowid() API function. |
| length(X) |
Return the string length of X in characters. If SQLite is configured to support UTF-8, then the number of UTF-8 characters is returned, not the number of bytes. |
like(X,Y)
like(X,Y,Z) |
This function is used to implement the "X LIKE Y [ESCAPE Z]" syntax of SQL. If the optional ESCAPE clause is present, then the user-function is invoked with three arguments. Otherwise, it is invoked with two arguments only. The sqlite_create_function() interface can be used to override this function and thereby change the operation of the LIKE operator. When doing this, it may be important to override both the two and three argument versions of the like() function. Otherwise, different code may be called to implement the LIKE operator depending on whether or not an ESCAPE clause was specified. |
load_extension(X)
load_extension(X,Y) |
Load SQLite extensions out of the shared library file named X using the entry point Y. The result is a NULL. If Y is omitted then the default entry point of sqlite3_extension_init is used. This function raises an exception if the extension fails to load or initialize correctly. |
| lower(X) |
Return a copy of string X will all characters converted to lower case. The C library tolower() routine is used for the conversion, which means that this function might not work correctly on UTF-8 characters. |
ltrim(X)
ltrim(X,Y) |
Return a string formed by removing any and all characters that appear in Y from the left side of X. If the Y argument is omitted, spaces are removed. |
| max(X,Y,...) |
Return the argument with the maximum value. Arguments may be strings in addition to numbers. The maximum value is determined by the usual sort order. Note that max() is a simple function when it has 2 or more arguments but converts to an aggregate function if given only a single argument. |
| min(X,Y,...) |
Return the argument with the minimum value. Arguments may be strings in addition to numbers. The minimum value is determined by the usual sort order. Note that min() is a simple function when it has 2 or more arguments but converts to an aggregate function if given only a single argument. |
| nullif(X,Y) |
Return the first argument if the arguments are different, otherwise return NULL. |
| quote(X) |
This routine returns a string which is the value of its argument suitable for inclusion into another SQL statement. Strings are surrounded by single-quotes with escapes on interior quotes as needed. BLOBs are encoded as hexadecimal literals. The current implementation of VACUUM uses this function. The function is also useful when writing triggers to implement undo/redo functionality. |
| random(*) |
Return a pseudo-random integer between -9223372036854775808 and +9223372036854775807. |
| replace(X,Y,Z) |
Return a string formed by substituting string Z for every occurrence of string Y in string X. The BINARY collating sequence is used for comparisons. |
| randomblob(N) |
Return a N-byte blob containing pseudo-random bytes. N should be a positive integer. |
round(X)
round(X,Y) |
Round off the number X to Y digits to the right of the decimal point. If the Y argument is omitted, 0 is assumed. |
rtrim(X)
rtrim(X,Y) |
Return a string formed by removing any and all characters that appear in Y from the right side of X. If the Y argument is omitted, spaces are removed. |
| soundex(X) |
Compute the soundex encoding of the string X. The string "?000" is returned if the argument is NULL. This function is omitted from SQLite by default. It is only available the -DSQLITE_SOUNDEX=1 compiler option is used when SQLite is built. |
| sqlite_version(*) |
Return the version string for the SQLite library that is running. Example: "2.8.0" |
| substr(X,Y,Z) |
Return a substring of input string X that begins with the Y-th character and which is Z characters long. The left-most character of X is number 1. If Y is negative the the first character of the substring is found by counting from the right rather than the left. If X is string then characters indices refer to actual UTF-8 characters. If X is a BLOB then the indices refer to bytes. |
trim(X)
trim(X,Y) |
Return a string formed by removing any and all characters that appear in Y from both ends of X. If the Y argument is omitted, spaces are removed. |
| typeof(X) |
Return the type of the expression X. The only return values are "null", "integer", "real", "text", and "blob". SQLite's type handling is explained in Datatypes in SQLite Version 3. |
| upper(X) |
Return a copy of input string X converted to all upper-case letters. The implementation of this function uses the C library routine toupper() which means it may not work correctly on UTF-8 strings. |
| zeroblob(N) |
Return a BLOB consisting of N bytes of 0x00. SQLite manages these zeroblobs very efficiently. Zeroblobs can be used to reserve space for a BLOB that is later written using incremental BLOB I/O. |
When
I originally looked at SQLite,
the three most vexing problems confronting the database were the performance hit associated with full table locking, the lack of Unicode support, and the comparatively thin set of function and datatypes support by SQLite. By version 3.0 SQLite has with ATTACH-ed databases and new locking techniques markedly improved the file locking problem as well as addressed the Unicode support in this latest revision. At the same time SQLite has taken a unique approach to SQL datatypes and therefore associated functions.
Here is how it stands right now. Each value stored in an SQLite database (or manipulated by the database engine) has one of the following storage classes:
-
NULL - The value is a NULL value.
-
INTEGER - The value is a signed integer, stored in 1, 2, 3, 4, 6, or 8 bytes depending on the magnitude of the value.
-
REAL - The value is a floating point value, stored as an 8-byte IEEE floating point number.
-
TEXT - The value is a text string, stored using the database encoding (any one of UTF-8, UTF-16BE or UTF-16-LE).
-
BLOB - The value is a blob of data, stored exactly as it was input.
Now
there are two problems here. First, the DATE, TIME and TIMESTAMP datatypes are not listed. Second, like many other major SQL databases, objects like spatial co-ordinates, URL addresses or financial codings - all may be treated but not in a universal way. SQLite is no different. The underlying code structure is or is in the process of being put in place to handle a broader set of datatypes and objects. However, there is a lack of clear standards on how to implement these emerging object datatypes and standards.
For example, both Oracle and IBM DB2 have methods of handling spatial co-ordinates (and other object extensions) but they are hardly uniform or standard. Even the exact definition of DATE, TIME, and TIMESTAMP datatypes and functions varies among the major database vendors. Likewise, ODBMS-Object Database Management Systems, although maintaining some standards - each, in the end, has its own proprietary query and data operations. The state of SQLite SQL reflects this industry-wide non-standard state-of-the-art.
Lord knows the industry has been trying for over thirty years to get a series of ANSI SQL standards that all can and will adhere to. But the centrifugal force of "seeking competitive advantage" has easily outstripped the centripetal force of "adhering to common standards". And so for all practical purposes, cross database queries and operations remain non-trivial to do(and hence the creation of thousands of non-compatible information silos in the IT world). Unfortunately, SQLite inherits this broader SQL database legacy.
However, it is useful to note that in following SQL 92 syntax, SQLite is as robust and complete as say MySQL or PostgreSQL. It is easier and faster to look at a short list of what is omitted in SQLite than to compare what has been implemented - and we shall do that in the next section. But it bears repeating, this viewer sees in SQLite an implementation that is in some respects more robust(see how cleverly SQLite implements common datatypes here and then here) than its big time competitors.
SQLite's Omissions
I have already touched on the lack of object support in SQLite and the variation in the date/time support. But SQLite maintains its own list of omissions and differences with the major databases. Here are the major items:
| FOREIGN KEY constraints |
|
FOREIGN KEY constraints are parsed but are not enforced. |
| Complete trigger support |
|
There is some support for triggers but it is not complete. Missing sub features include FOR EACH STATEMENT triggers (currently all triggers must be FOR EACH ROW), INSTEAD OF triggers on tables (currently INSTEAD OF triggers are only allowed on views), and recursive triggers - triggers that trigger themselves. |
| Complete ALTER TABLE support |
|
Only the RENAME TABLE and ADD COLUMN variants of the ALTER TABLE command are supported. Other kinds of ALTER TABLE operations such as DROP COLUMN, ALTER COLUMN, ADD CONSTRAINT, and so forth are omitted. |
| Nested transactions |
|
The current implementation only allows a single active transaction. |
| RIGHT and FULL OUTER JOIN |
|
LEFT OUTER JOIN is implemented, but not RIGHT OUTER JOIN or FULL OUTER JOIN. |
| Writing to VIEWs |
|
VIEWs in SQLite are read-only. You may not execute a DELETE, INSERT, or UPDATE statement on a view. But you can create a trigger that fires on an attempt to DELETE, INSERT, or UPDATE a view and do what you need in the body of the trigger. |
| GRANT and REVOKE |
|
Since SQLite reads and writes an ordinary disk file, the only access permissions that can be applied are the normal file access permissions of the underlying operating system. The GRANT and REVOKE commands commonly found on client/server RDBMSes are not implemented because they would be meaningless for an embedded database engine. |
SQLite has barebones ALTER TABLE, CONSTRAINTS and FOREIGN KEY support. These shortcomings then are impacted because the user has limited TRIGGER support and no STORED PROCEDURES with which to develop work arounds for these missing capabilities. True there are many coding hacks and work arounds available on the SQLite Wiki - but these fixes may not fit specific user requirements.
Likewise in the SQL query arena, SQLite lacks complete JOIN, VIEW, and Subquery implementations. Again the lack of a STORED PROCEDURE capability puts the onus on the user to find SQL fixes. Fortunately, SQL has much flexibility in its ability to find alternative solutions to query and update problems. And because SQLite is Open Source and provides clear documentation as well as all its C language source code(there is , users can construct their own functions, Virtual Tables, and therefore basic operations. SQLite's stored procedure language can be thought of as C plus the BTree, Parser, and VDBE-Virtual Database Engine opcodes it makes so publicly available. Not a trivial stored procedure language; but also not mission impossible either.
However, the lack of GRANT and REVOKE command and the almost utter dependence of SQLite on the underlying OS for data security is a concern. All of SQLite data is in one place: the .db database. So once a hacker or malicious party gets hold of the database - the SQLite data is lost. In fact using SQLite would probably be the fastest way to purloin its own data.
SQLite or its users, like all other major database vendors, will have to come up with a means of encrypting stored SQLite data so that if the database file is stolen, it is not open to the taking by hackers. I think there has to be a last line of security and defense for the native database. Fortunately there are several 3rd party compression and encryption routines for SQLite - for example here. But users will have to be aware that non-standard encryption may complicate some interactions.
In contrast, I side with SQLite on the scarce need for GRANTs and REVOKEs of security privileges within the database. SQLite is designed to be embedded for use with C/C++ or Java programs. Those programs should supply the appropriate access and privileged roles plus security for SQLite data. If the users program needs to add additional security/privileges beyond what it already has - SQLite stands ready to supply the database backend to deliver those added security capabilities. I reckon user-made code will be faster, easier, and more compact in size than invoking some of the existing GRANT/REVOKE based systems provided by Oracle or SQL Server. The downside is that the code will be non-standard. Thus a glance at the SQLite wiki shows there to be interesting user proxies for GRANT and REVOKE. Bottom line: security is really the one major outstanding concern for SQLite. However, going for SQLite is its very broad Open Source community support.
SQLite: Free and Open Source
SQLite is available for free - just go here to download it. It is also ridiculously easy to install in Windows or Linux. Mac version is rumored to be coming. This will be essential if Google Gears or Adobe Apollo want to use it in a broad way, because both vendors competitive advantage is that they are decidedly cross platform. And to start using it is as simple as: sqlite3 test.db
One can almost guess the operations in sqlite3 command line processor - but it helps to know about .help and SQLite's own documentation. I have created a standalone version of the SQLite Help file (circa July 14th 2007) and info about it is available here. I can see how university professors teaching database classes would love SQLite - here is not just the source code in C but the design architecture all described and readily available to students to see how a very robust SQL database is built.
And the IT development community has not been slow to pick up on the quality and robustness available in SQLite. Back in 2004 with the launch of PHP 5, the Zend/PHP people touted SQLite and its database access code as a major new enhancement to PHP 5. As well we have mentioned the use of SQLite in Google Gears and Adobe Apollo. These are major wins for the database - and so I have decided to add it to the 3 other databases I am following - IBM DB2, MySQL, and Oracle.
I am not yet convinced whether database will remain RDBMS based, take more XML/Networking capabilities, or pickup more standard object and unstructured functionality (echoing the XML/Networking thrust seen best in DB2 Viper V2). But I am convinced small and lite has reached its take off point - and Pointbase plus SQLAnywhere are promising in different ways. But I suspect SQLite is closest to the real deal - hence our expanded coverage.
Top of Page Home Tutorials
Developers Weblog Graphics Reviews and Tips |