Engineering
March 30, 2023 (updated May 25, 2023)

SQLite Optimizations for Ultra High-Performance

SQLite works great as an in-app database but can require some configuration to achieve optimal performance. This post covers 10 SQLite optimizations that can help maintain blazingly fast performance.

Ralf Kistner

SQLite is small, fast, has a lot of built-in functionality, and works great as an in-app database. However, SQLite is designed for many different use cases, and requires some configuration for optimal performance as an in-app database. And being a C library designed to run on practically any platform, it leaves a lot up to higher-level libraries to implement.

Benchmarking Baseline Performance/Throughput

To start with, it helps to have a good idea of expected baseline performance. If an app's performance is far from this, there is likely room for optimization.

Graph showing SQLite baseline performance
This is for a single table, with small rows (around 30 bytes each), and no secondary indexes. Reads are lookups by primary key.
For tables with secondary indexes, expect up to a 5x reduction in insert performance. Update and delete throughput should be similar to insert.

Key SQLite Performance Optimizations To Consider

Here are 10 optimizations that can collectively result in significant SQLite performance gains in your application:

#1: Enable Write-Ahead Logging (WAL) And Disable Synchronous Mode

Effect: Can reduce per-transaction overhead from 30ms+ to < 1ms.

The default in SQLite is to use a rollback journal. This involves at least two disk writes for every transaction, and waiting for a filesystem sync.

WAL mode instead writes changes to a sequential write-ahead log, and then later synchronizes them back to the main database. Additionally, it allows safely using [.inline-code-snippet]synchronous = normal[.inline-code-snippet], which avoids waiting for a filesystem sync (fsync) operation in most transactions.

These settings are good for most applications:

pragma journal_mode = WAL;
pragma synchronous = normal;
pragma journal_size_limit = 6144000;

The only durability caveat is that on a system crash (not application crash), the last transaction(s) may be rolled back. This is fine for most applications — if durability is that important, you should probably rely on more than just a phone's flash storage.
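Applied from application code, the pragmas above look like this. A minimal sketch using Python's built-in sqlite3 module; the database path is illustrative:

```python
import os
import sqlite3
import tempfile

# WAL mode requires an on-disk database; this path is illustrative.
path = os.path.join(tempfile.mkdtemp(), "app.db")
conn = sqlite3.connect(path)

# Apply the recommended pragmas once per connection, right after opening it.
conn.execute("pragma journal_mode = WAL")            # sequential write-ahead log
conn.execute("pragma synchronous = normal")          # skip fsync on most commits
conn.execute("pragma journal_size_limit = 6144000")  # cap the WAL file size

mode = conn.execute("pragma journal_mode").fetchone()[0]
```

Note that [.inline-code-snippet]journal_mode[.inline-code-snippet] is persistent (stored in the database file), while [.inline-code-snippet]synchronous[.inline-code-snippet] must be set on every new connection.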

#2: Reduce Binding Overhead

Effect: Using different bindings can reduce per-transaction and per-statement overhead by 2-10x.

SQLite itself is fast, and can execute millions of individual statements per second. The official documentation mentions there is no problem with using lots of individual queries, unlike with typical client/server databases.

However, the higher-level library / SQLite bindings very often add significant overhead to each statement, which can be an order of magnitude or two more than the underlying SQLite operation. This is especially true (1) for higher-level languages, (2) when the bindings provide asynchronous operations, or (3) when the operation passes through multiple layers of bindings. For example, if a call goes from React Native → Java → JNI → SQLite and back, that adds a lot of overhead.

The only options here are:

  1. Choose a library with as little overhead as possible.
  2. Combine multiple operations into a single statement wherever possible. Use JOIN statements or JSON1 for this (see below).


Android

Avoid the built-in Android SQLite implementation — it has severe limitations:

  1. The SQLite version is different based on the Android version, and outdated even on the latest Android version.
  2. No JSON1 support.
  3. The bindings only support String arguments for queries.

If you use bindings on another platform (e.g. Flutter or React Native), make sure that it doesn't use the built-in Android SQLite under the hood. Most libraries provide an option to bundle SQLite instead of using the built-in one, if it isn't the default.

#3: Wrap Writes Into Transactions

Effect: Can increase write throughput by 2-20x.

By default, every write to the database is effectively its own transaction. The overhead per transaction is reduced in WAL mode, but not eliminated. For high throughput of writes, wrap multiple writes in the same transaction.

Even read performance can be improved with transactions, since they reduce the number of lock operations required. However, when WAL mode is used, this effect is small.
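As a sketch in Python (using the standard sqlite3 module), grouping all inserts into one explicit transaction pays the per-transaction overhead once instead of once per row:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data(id INTEGER PRIMARY KEY, value TEXT)")
rows = [(i, f"value-{i}") for i in range(1000)]

# Manage transactions manually: one BEGIN/COMMIT around all the writes,
# instead of an implicit transaction per statement.
conn.isolation_level = None
conn.execute("BEGIN")
conn.executemany("INSERT INTO data(id, value) VALUES (?, ?)", rows)
conn.execute("COMMIT")

count = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
```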

#4: Run Database Operations In A Background Thread

Effect: Avoid blocking of the UI thread.

Running operations in the background is important to avoid the application UI stuttering. For many operations, SQLite is fast enough that this is not a problem, even when running on the main thread. However, there may be an occasional query that takes longer, especially when working with large sets of data.

For this reason it's usually best to run operations on a background thread. The SQLite bindings may do this automatically (e.g. ones that provide [.inline-code-snippet]async[.inline-code-snippet] methods for Dart), or may require manual work for this. This does often add some overhead to each operation, but it is still usually fast enough to not be an issue, and is worth the trade-off of not blocking the UI.
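One common pattern is a dedicated worker thread that owns the database connection, while the UI thread only enqueues work. A minimal Python sketch; the job/result queues are illustrative, not a real library API:

```python
import queue
import sqlite3
import threading

jobs = queue.Queue()
results = queue.Queue()

def worker():
    # The worker thread owns the connection; no other thread touches it.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE kv(k TEXT, v TEXT)")
    while (job := jobs.get()) is not None:
        sql, params = job
        results.put(conn.execute(sql, params).fetchall())
        conn.commit()

t = threading.Thread(target=worker, daemon=True)
t.start()
jobs.put(("INSERT INTO kv(k, v) VALUES (?, ?)", ("a", "1")))
jobs.put(("SELECT v FROM kv WHERE k = ?", ("a",)))
results.get()                # insert returns no rows
value = results.get()[0][0]  # result of the select
jobs.put(None)               # shut the worker down
t.join()
```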

#5: Optimize Concurrency

Effect: Allow running large read or write queries in the background, without blocking time-sensitive read queries.

Even when running operations in the background, operations may still block each other. If the bindings run in single-threaded mode, one large transaction (e.g. downloading data in the background) would block all other queries.

SQLite natively supports concurrent transactions. However, each separate "connection" is effectively single-threaded. So to run multiple concurrent transactions, separate connections are needed on different threads.

In the default rollback journal mode, SQLite can run any number of concurrent read transactions, but when a transaction writes to the database, it locks it in exclusive mode (no other readers or writers can run at that point).

In WAL mode, readers can continue reading from the database while a write is in progress. The only limit is that only a single transaction can write at any one time.

If multiple transactions attempt to write at the same time, the later ones fail with [.inline-code-snippet]SQLITE_BUSY[.inline-code-snippet] and need to be retried. When only a single process uses the database at a time, it is often better to use an application-level write queue than to rely on [.inline-code-snippet]SQLITE_BUSY[.inline-code-snippet]. This provides fairer writes, avoiding issues such as writer starvation.
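An application-level write queue can be as simple as a lock that serializes writers. A Python sketch; a production version might use a FIFO queue for stricter fairness:

```python
import sqlite3
import threading

conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE log(msg TEXT)")
write_lock = threading.Lock()

def write(msg):
    # Serialize writers in the application instead of retrying on SQLITE_BUSY.
    with write_lock:
        with conn:  # one transaction per queued write
            conn.execute("INSERT INTO log(msg) VALUES (?)", (msg,))

threads = [threading.Thread(target=write, args=(f"m{i}",)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

count = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
```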

SQLite also has an experimental mode called begin concurrent, allowing concurrent write transactions, as long as they don't read or write the same pages of data. This is not part of the main build yet, and requires a custom build from a separate branch.

#6: Use Prepared Statements

Effect: Can increase per-statement throughput by up to 1.5x.

When running the same small statement many times over, the overhead of parsing the SQL could add up.

For example, instead of executing the statement each time like this:

for (var entry in entries) {
	db.execute("INSERT INTO data(id, value) VALUES (?, ?)", entry);
}

Prepare the statement once, and run it multiple times:

var stmt = db.prepare("INSERT INTO data(id, value) VALUES (?, ?)");
for (var entry in entries) {  
	stmt.execute(entry);
}

Bindings often do not expose the low-level [.inline-code-snippet]prepare()[.inline-code-snippet] call directly. When the bindings are asynchronous, the per-call overhead of the bindings can outweigh the gain from the prepared statement.

However, the bindings could:

  1. Implicitly use prepared statements for batch calls, e.g. [.inline-code-snippet]db.executeBatch("INSERT INTO data VALUES (?, ?)", entries)[.inline-code-snippet].
  2. Cache prepared statements for recently-used queries.
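Python's sqlite3 module illustrates both approaches: executemany prepares the statement once and rebinds it per row, and the module keeps an internal cache of recently compiled statements, sized by the [.inline-code-snippet]cached_statements[.inline-code-snippet] connection parameter:

```python
import sqlite3

# cached_statements controls the per-connection statement cache (default 128).
conn = sqlite3.connect(":memory:", cached_statements=256)
conn.execute("CREATE TABLE data(id INTEGER PRIMARY KEY, value TEXT)")

entries = [(i, f"v{i}") for i in range(100)]

# executemany compiles the statement once and binds each row against it,
# equivalent to the prepare-once loop above.
conn.executemany("INSERT INTO data(id, value) VALUES (?, ?)", entries)
conn.commit()

count = conn.execute("SELECT COUNT(*) FROM data").fetchone()[0]
```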

#7: Use JSON1 Functions

Effect: 1-2x the throughput of prepared statements, depending on binding overhead. If the bindings do not provide efficient alternatives for batching, this can increase throughput much more.

In recent SQLite versions, the JSON1 extension is enabled by default. Even when not storing JSON data, this makes it easy to pass in large sets of data to a query.

This is one great way to avoid per-statement overhead, regardless of the bindings used.


Example 1: Lookup multiple rows in a single statement

Typical example for a single row lookup:

db.select('SELECT * FROM users WHERE id = ?', [id]);

To query multiple rows in a single statement, use the [.inline-code-snippet]IN[.inline-code-snippet] operator:

db.select('SELECT * FROM users WHERE id IN (?, ?, ?)', [id1, id2, id3]);

However, the above typically involves generating SQL programmatically, which is error-prone, and could run into the limit on the maximum number of parameters per statement (999 or 32766 by default, depending on the SQLite version used).

Using JSON1 functions avoids that complexity, in addition to typically being faster:

db.select('SELECT * FROM users WHERE id IN (SELECT e.value FROM json_each(?) e)', [jsonEncode(ids)]);

Bulk insert

JSON1 can also be used to bulk insert data:

var data = users.map(u => u.toMap());
db.execute("INSERT INTO users(name, email) SELECT e ->> 'name', e ->> 'email' FROM json_each(?) e", [jsonEncode(data)]);

When the query gets large, use a CTE (Common Table Expression, a [.inline-code-snippet]WITH[.inline-code-snippet] statement) to split it out from the rest of the query:

db.execute("""
WITH data AS (SELECT e ->> 'name' as name, e ->> 'email' as email FROM json_each(?) e)
INSERT INTO users(name, email) SELECT data.name, data.email FROM data
""", [jsonEncode(data)]);

Bulk update

Bulk update using [.inline-code-snippet]UPDATE FROM[.inline-code-snippet]:

var data = users.map(u => u.toMap());
db.execute("""
WITH data AS (SELECT e ->> 'id' as id, e ->> 'name' as name, e ->> 'email' as email FROM json_each(?) e)
UPDATE users   
	SET email = data.email, name = data.name  
  FROM data 
  WHERE users.id = data.id
""", [jsonEncode(data)]);

Bulk delete

Bulk delete using [.inline-code-snippet]IN[.inline-code-snippet]:

db.execute("""
	DELETE FROM users WHERE users.id IN (SELECT e.value FROM json_each(?) e)
""", [jsonEncode(ids)]);
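The lookup pattern above is runnable from any bindings. For example in Python, assuming a SQLite build with JSON1 (enabled by default in recent versions):

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users(id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users(id, name) VALUES (?, ?)",
                 [(1, "ann"), (2, "bob"), (3, "cat"), (4, "dan")])

# Pass the whole id list as one JSON parameter instead of generating a
# placeholder per id.
ids = [1, 3, 4]
rows = conn.execute(
    "SELECT id, name FROM users "
    "WHERE id IN (SELECT e.value FROM json_each(?) e)",
    [json.dumps(ids)]).fetchall()

found = sorted(r[0] for r in rows)
```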

#8: Use Indexes Optimally

Effect: Read queries can be orders of magnitude faster.

Indexes can increase the performance of read queries by orders of magnitude. There are many great guides on indexing, so the basics won't be repeated here.

SQLite has a great guide on all the different optimizations the query planner can perform using indexes: https://www.sqlite.org/optoverview.html

Some tips:

  1. Make sure that the correct indexes are being used, by using [.inline-code-snippet]EXPLAIN QUERY PLAN[.inline-code-snippet]. In most cases the query planner is quite good, but it does pick a suboptimal index in some cases.
  2. If a suboptimal index is used, see this guide: https://www.sqlite.org/queryplanner-ng.html#howtofix

Note that by default, SQLite does not know how many unique values are in each column, which may cause it to choose a suboptimal index. The default recommendation is to use [.inline-code-snippet]ANALYZE[.inline-code-snippet] (or [.inline-code-snippet]pragma optimize[.inline-code-snippet]) to analyze tables, but this may cause different indexes to be used for different users, resulting in difficult-to-debug performance issues.

Instead, use one of the other options:

  1. Use [.inline-code-snippet]likely()[.inline-code-snippet] and [.inline-code-snippet]unlikely()[.inline-code-snippet] functions to provide hints on conditions that are likely to be true or false.
  2. Use the unary "+" operator to prevent certain terms from being used for an index.
  3. Pre-populate the [.inline-code-snippet]sqlite_stat1[.inline-code-snippet] table.

SQLite has support for indexes on expressions and partial indexes, which may be very useful in some cases. For example, an index can be created on a field inside a JSON document using:

CREATE INDEX myindex ON mytable(json_document ->> 'subfield');
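Checking which index a query uses with [.inline-code-snippet]EXPLAIN QUERY PLAN[.inline-code-snippet] can also be scripted. A Python sketch with an illustrative schema:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users(id INTEGER PRIMARY KEY, email TEXT)")
conn.execute("CREATE INDEX users_email ON users(email)")

# Each plan row's last column describes the strategy, and names the index
# when one is used (e.g. "SEARCH users USING INDEX users_email (email=?)").
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM users WHERE email = ?",
    ["test@example.com"]).fetchall()
detail = plan[0][-1]
```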

#9: Use Background WAL Checkpoints

Effect: Remove the occasional fsync overhead on a transaction, typically 30-100ms.

WAL checkpoints are where data is moved from the write-ahead log to the main database. This is one place that does wait synchronously for the filesystem to sync.

By default, this occurs once the WAL is greater than 1,000 pages, as part of a [.inline-code-snippet]COMMIT[.inline-code-snippet] statement. This means that every now and then, a transaction may be slightly slower to complete.

To avoid this overhead, the WAL checkpoint can be run in a separate thread on a separate database connection. This requires some low-level operations, described here.
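A sketch of the idea in Python: disable automatic checkpoints on the main connection, then checkpoint from a second connection on a background thread. Passive mode avoids blocking other readers or writers:

```python
import os
import sqlite3
import tempfile
import threading

path = os.path.join(tempfile.mkdtemp(), "app.db")
main = sqlite3.connect(path)
main.execute("pragma journal_mode = WAL")
main.execute("pragma wal_autocheckpoint = 0")  # no checkpoints on COMMIT
main.execute("CREATE TABLE data(value TEXT)")
with main:
    main.executemany("INSERT INTO data(value) VALUES (?)",
                     [("x" * 100,) for _ in range(1000)])

result = {}

def checkpoint():
    # A separate connection moves WAL pages into the main database file,
    # so commits on the main connection never pay the checkpoint fsync.
    bg = sqlite3.connect(path)
    result["status"] = bg.execute("pragma wal_checkpoint(passive)").fetchone()
    bg.close()

t = threading.Thread(target=checkpoint)
t.start()
t.join()
busy = result["status"][0]  # 0 means the checkpoint was not blocked
```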

#10: Free Up Space When Appropriate

By default, the database does not decrease in size after deleting data. There are two files that can grow indefinitely and do not free space automatically:

  • The main database file. Space is freed using a manual [.inline-code-snippet]VACUUM[.inline-code-snippet] command, or [.inline-code-snippet]auto_vacuum[.inline-code-snippet].
  • The WAL file. This can grow large if there are large transactions, or many concurrent transactions. Space is freed using [.inline-code-snippet]pragma journal_size_limit[.inline-code-snippet] or [.inline-code-snippet]pragma wal_checkpoint(truncate)[.inline-code-snippet].

While SQLite does support an incremental vacuum mode for the main database, it can increase fragmentation and decrease performance. Instead, manually run [.inline-code-snippet]VACUUM[.inline-code-snippet] to reclaim space after a lot of data has been deleted.

To estimate the amount of space that can be freed, use this query:

SELECT page_count * page_size as total_size, freelist_count * page_size as freelist_size FROM pragma_page_count(), pragma_freelist_count(), pragma_page_size();

The actual space that can be freed may be more when many pages are only partially filled.
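The estimation query and the subsequent [.inline-code-snippet]VACUUM[.inline-code-snippet] can be sketched in Python: after deleting rows, the freed pages sit on the freelist until a [.inline-code-snippet]VACUUM[.inline-code-snippet] rebuilds the file:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE data(value TEXT)")
with conn:
    conn.executemany("INSERT INTO data(value) VALUES (?)",
                     [("x" * 1000,) for _ in range(500)])
conn.execute("DELETE FROM data")  # pages move to the freelist
conn.commit()

total_size, freelist_size = conn.execute(
    "SELECT page_count * page_size, freelist_count * page_size "
    "FROM pragma_page_count(), pragma_freelist_count(), pragma_page_size()"
).fetchone()

conn.execute("VACUUM")  # rebuild the database, dropping the free pages
free_after = conn.execute(
    "SELECT freelist_count FROM pragma_freelist_count()").fetchone()[0]
```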

The WAL size is mostly managed automatically by the [.inline-code-snippet]pragma journal_size_limit[.inline-code-snippet] setting. It could still be useful to directly truncate the WAL after a [.inline-code-snippet]VACUUM[.inline-code-snippet] command. There is currently no built-in SQLite method to get the WAL size, but this can be done by checking the size of the [.inline-code-snippet]-wal[.inline-code-snippet] file using standard filesystem methods.

If there are many free pages (e.g. more than 10MB), run the following to reclaim the space:

vacuum;
pragma wal_checkpoint(truncate);

Note that even though this frees up space, it temporarily requires more storage while running. This means it can't be used to free up space when the device storage is already full.


Both these statements require a write lock, and may block writing to the database for a long time if the database is large. It is best to run this while the device is idle, to avoid slowing down or blocking other operations.

On iOS, schedule a background processing task.

On Android, use WorkManager to schedule the operation when the device is idle.

Optimized SQLite Library For Flutter

Application developers should not have to care about most of this complexity. To assist with this in Flutter applications, we’ve created a library (sqlite_async) that contains many of the above optimizations built-in, including:

  1. Using WAL mode by default.
  2. Transactions are asynchronous and concurrent by default, by using a pool of connections, each running in a separate Isolate.
  3. As an escape for complex performance-sensitive operations, synchronous database operations can be run directly on the underlying database connection in the database Isolate.
  4. List, Map and custom Object arguments are automatically encoded as JSON to simplify JSON1 operations.
  5. Use SQL queries directly — no code generation required.

Additional support is planned for:

  1. Automatic background WAL checkpoints.
  2. Helper libraries for automatic [.inline-code-snippet]VACUUM[.inline-code-snippet] support when the device is idle (Flutter applications).

If you prefer using an ORM and typed queries, look at the great [.inline-code-snippet]drift[.inline-code-snippet] library instead. While it doesn’t have all the same optimizations out of the box, most of it can be enabled with some configuration.