IndexedDB, camels, and committees

IndexedDB is a W3C standard for storing collections of objects locally in your browser. I have used it extensively to add offline support to an app, and I strongly dislike it. I am not the only one in this camp. I think it represents a huge problem for the web. And one that was easy to avoid.

The only alternative for storing structured data locally

If you need to store large amounts of data locally in your browser you have three options:

Local storage. A simple key/value store with a synchronous API. It is limited to 5MB in most browsers, but the user can choose to increase the limit when it is reached. It is an HTML5 standard supported by all browsers.
WebSQL. A relational database store. It is not a standard and it is only supported by Chrome and Safari, both using SQLite as the underlying technology.
IndexedDB. The standard we are talking about in this article. Supported by all modern browsers.

Local storage is used by many apps to offer offline persistence. It is very simple to use: store strings indexed by key using a synchronous API. That does the job for many apps: save data as a serialized JSON, and deserialize it for rendering it. You can’t compare it to WebSQL or IndexedDB though. For example, you can’t query specific records. It is not designed to manage large sets of data.

Given WebSQL limited support, if you want to support Firefox and Internet Explorer in addition to Safari and Chrome, IndexedDB is your only choice.

Design by committee at its finest

Have you ever heard that a camel is a horse designed by a committee? I think IndexedDB is the perfect example:

Conceived from scratch by a group of experts as an intellectual exercise, that is, without specific product needs driving its design
The API is difficult to work against
It lacks essential features for real-world scenarios

Ugly verbose API

Check this fragments from the IndexedDB spec:

// The following example looks up all books in the database by author using an index and a cursor.
var tx = db.transaction("books", "readonly");
var store = tx.objectStore("books");
var index = store.index("by_author");

var request = index.openCursor(IDBKeyRange.only("Fred"));
request.onsuccess = function() {
  var cursor = request.result;
  if (cursor) {
    // Called for each matching record.
    report(cursor.value.isbn, cursor.value.title, cursor.value.author);
    cursor.continue();
  } else {
    // No more matching records.
    report(null);
  }
};

// The following example looks up a single book in the database by title using an index.
var tx = db.transaction("books", "readonly");
var store = tx.objectStore("books");
var index = store.index("by_title");

var request = index.get("Bedrock Nights");
request.onsuccess = function() {
  var matching = request.result;
  if (matching !== undefined) {
    // A match was found.
    report(matching.isbn, matching.title, matching.author);
  } else {
    // No match was found.
    report(null);
  }
};

This code is very verbose and difficult to read. It is curious that aesthetics was on the radar of Mozilla when pushing forward IndexedDB:

We were resolved that using strings representing SQL commands lacked the elegance of a “web native” JavaScript API and started looking at alternatives

It is no surprise that many IndexedDB wrappers have popped up. I have worked with Dexie, and it is a wonderful library that makes working with IndexedDB much easier. Sadly, IndexedDB problems are deeper than its syntax.

Nice features you don’t need, missing features you do

When you are working with large sets of data you need querying. Sadly, IndexedDB support for querying is very poor.

Queries are based on the IDBKeyRange interface. It lets you express a range of keys to fetch by delimiting its bounds (with >, <, = constraints).

This allows you to express queries like people with age between 20 and 40. Or people whose last name is Smith.

That’s it, believe it or not.

Could you query the people which age is 20 OR whose last name is Smith? No, logical OR is not supported.

Could you query the people which age is 20 AND whose last name is Smith? Logical AND is not supported either, but there is a workaround that involves an undocumented capability: composite indexes. Just create an index on age and last name. It is worth mentioning that Internet Explorer doesn’t support multi-entry indexes.

Ok, so could you query the people which age is older than 20 AND which last name is Smith? Nope! The multi-entry index workaround only works for exact matches.

Now, let’s search the database. Could you query people that contain Smith as part of its last name? No. String matching is not supported other than starts with using a lower bound for the desired match. And ignoring case is not supported either. Full text searching for certain words can be achieved, again, via a clunky workaround involving multi-entry indexes. This implies splitting the words manually and creating and maintaining a field with an index for it.

So, could you join the data from 2 stores? No, sorry, no join operation in IndexedDB. You need to join things manually during cursor iterations. What about N+1 problems? ¯\(ツ)/¯. The theory is that, since we are in the NoSQL realms, you can store records within records to avoid joins. But what if this makes no sense in your domain? Again, ¯\(ツ)/¯.

What about paginating content? Well you don’t have limit and offset as with SQL, but you can advance cursors (offset) and count iterations (limit). The code will be ugly and verbose, but it is indeed supported.

Another problem I have experienced is that inserting many records is slow. A typical scenario for bulk inserting is syncing all your data the first time the app launches. Check these numbers for inserting 5000 records using this benchmark:

Browser	IndexedDB	WebSQL
Chrome	2.8s	0.105s
Safari	0.408 s	0.073s
Firefox	0.339s	—

Compared to Web SQL, it is 28 times slower in Chrome, 4 times slower in Safari and 3 times slower in Firefox. That’s the factor you could cut initial sync time by if you used WebSQL instead of IndexedDB.

So with IndexedDB, you have a verbose and poorly designed API, which very little querying capabilities that are slower than WebSQL for inserting in bulk, and not faster than it for other queries.

There is an upcoming 2.0 version of IndexedDB that recently shipped in Chrome. Sadly it is built on top of the previous version, it’s not a new API. Of all the new features, I like getting data in bulk (getAll), but it doesn’t really address any of the problems mentioned above.

Developer happiness?

The main reason of Mozilla for pushing against Web SQL is that all the existing implementations used SQLite. The team working on the Web SQL spec deferred a lot of details to the SQLite implementation, and the W3C didn’t like this.

SQLite is extremely robust, solid and used everywhere. I have seen it work with multi-GB databases and tables containing millions of records, and it performed like a breeze. It was already available in some browsers. You have all the power of SQL at your disposal. Moreover, the relational paradigm is well known among developers, as well as it is making it work with objects.

A little bit of pragmatism could have changed the course of Web history. Had Web SQL spec been completed, nothing would have prevented other vendors to implement a better alternative to SQLite. It is the kind of project that Google could address (e.g., a faster SQLite-like alternative from Google powering Chrome now!).

It reminds me of the story of Hibernate. Everyone agreed Entity Beans in EJBs 2 sucked. Hibernate was an open source ORM that everyone loved. They created the JPA standard heavily inspired by it. Luckily, they didn’t say: this is too much like Hibernate, let’s think of an entirely new approach from scratch! They just embraced a proven solution developers loved. And it worked. Well, to me, IndexedDB is EJB 2.

By the way, Firefox’s IndexedDB implementation is based on SQLite, which is ironic if you think about it.

A serious problem for the web

I think IndexedDB represents a serious problem for the web. It is the only option for storing large amounts of structured data in a web app if you want to support all the main browsers.

How data is stored locally is a core part of apps that aim to offer a desktop-like experience. To me, IndexedDB means that a whole range of apps is much harder to implement, compared to their native counterparts. Of course, Android, Apple and Microsoft use SQLite as their local storage technology for native apps.

If I had to create a web app that needed to store a significant amount of structured data locally, I would seriously consider going with Web SQL, only supporting Chrome and Safari, and offering an Electron-powered standalone app users can install if they want, to mitigate the problem. Not supporting IE and Firefox makes me sad, but this might be a pragmatic approach valid for some.

I wish the W3C would retake the work on Web SQL. The web would benefit enormously from it.