Getting Started with IndexedDB for Big Data Storage

It’s not always necessary to send a user’s data to the server: you could choose to store some information in the browser. It’s ideal for device-specific settings such as interface configurations (e.g. light/dark modes) or private data which should not be transmitted across the network (e.g. an encryption key).

Consider the Performance API for timing network and page events. You could upload all data the moment it’s available but that’s a considerable volume of information which, ironically, would hit page performance. A better option would be to store it locally, process statistics using a Web Worker, and upload the results when the browser is less busy.

Two cross-browser client-side storage APIs are available:

  1. Web Storage

    Synchronous name-value pair storage saved permanently (localStorage) or for the current session (sessionStorage). Browsers permit up to 5MB of Web Storage per domain.

  2. IndexedDB

    An asynchronous NoSQL-like name-value store which can save data, files, and blobs. At least 1GB should be available per domain and it can reach up to 60% of the remaining disk space.
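As a rough illustration of the first option, the sketch below stores a small settings object in Web Storage. Web Storage holds strings only, so the object is serialized as JSON. The function names `saveSettings` and `loadSettings` and the `settings` key are invented for this example; the `storage` argument is expected to be `localStorage` or `sessionStorage` when run in a browser.

```javascript
// hypothetical helpers: persist a small settings object in Web Storage
// `storage` is localStorage or sessionStorage (anything with getItem/setItem)
function saveSettings(storage, settings, key = 'settings') {
  // Web Storage only stores strings, so serialize to JSON
  storage.setItem(key, JSON.stringify(settings));
}

function loadSettings(storage, fallback = {}, key = 'settings') {
  const raw = storage.getItem(key);
  if (raw === null) return fallback; // nothing stored yet

  try {
    return JSON.parse(raw);
  }
  catch (e) {
    return fallback; // stored value was not valid JSON
  }
}

// usage in a browser:
// saveSettings(localStorage, { theme: 'dark' });
// const theme = loadSettings(localStorage, { theme: 'light' }).theme;
```

Both calls are synchronous and block the main thread, which is one reason this approach suits small values only.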

Another storage option, WebSQL, is available in some editions of Chrome and Safari. However, it has a 5MB limit, is inconsistent, and was deprecated in 2010.

In this tutorial we’ll store timing data for the page and all assets. Web Storage can be slow and its capacity is too limited, so IndexedDB is the better option.

All example code is available on GitHub should you want to try IndexedDB on your own site.

What is IndexedDB?

IndexedDB was first implemented in 2011 and became a W3C standard in January 2015. It has good browser support although its callback and events-based API seems clunky now we have ES2015+. This article demonstrates how to write a Promise-based wrapper so you can use chaining and async/await.

Note the following IndexedDB terms:

  • database — a top-level store. A domain can create any number of IndexedDB databases but it’s unusual to see more than one. Only pages within the same domain can access the database.

  • object store — a name/value store for related data items. It’s similar to a collection in MongoDB or tables in a relational database.

  • key — a unique name used to reference every record (value) in the object store. It can be generated using an autoIncrement number or set to any unique value within the record.

  • index — another way to organize data in an object store. Search queries can only examine the key or an index.

  • schema — the definition of object stores, keys, and indexes.

  • version — a version number (integer) assigned to a schema. IndexedDB offers automated versioning so you can update databases to the latest schema.

  • operation — database activities such as creating, reading, updating, or deleting records.

  • transaction — a set of one or more operations. A transaction guarantees all its operations either succeed or fail. It cannot fail some and not others.

  • cursor — a way to iterate over records without having to load all into memory at once.

Developing and debugging a database

In this tutorial, you will create an IndexedDB database named performance. It contains two object stores:

1. navigation

This stores page navigation timing information (redirects, DNS lookups, page loading, file sizes, load events, etc.). A date will be added to use as the key.

2. resource

This stores resource timing information (timings for other resources such as images, stylesheets, scripts, Ajax calls, etc.). A date will be added, but two or more assets could load at the same time, so an auto-incrementing ID will be used as the key. Indexes will be created for the date and the name (the resource’s URL).

All Chrome-based browsers have an Application tab where you can examine storage space, artificially limit the capacity, and wipe all data. The IndexedDB entry in the Storage tree allows you to view, update, and delete object stores, indexes, and individual records. Firefox’s panel is named Storage.

You can also run your application in incognito mode so all data is deleted once the browser window is closed.

Connecting to an IndexedDB database

A wrapper class created in indexeddb.js checks for IndexedDB support using:

if ('indexedDB' in window) // ...

It then opens a database connection using indexedDB.open() by passing:

  1. the database name, and
  2. an optional version integer.

const dbOpen = indexedDB.open('performance', 1);

Three important event handler functions must be defined:

  1. dbOpen.onerror runs when an IndexedDB connection cannot be established.

  2. dbOpen.onupgradeneeded runs when the version required (1) is greater than the current version (0 when the database is not defined). A handler function must run IndexedDB methods such as createObjectStore() and createIndex() to create the storage structures.

  3. dbOpen.onsuccess runs when the connection has been established and any upgrades have completed. The connection object in dbOpen.result is used in all subsequent data operations. It is assigned to this.db in the wrapper class.

The wrapper constructor code:

// IndexedDB wrapper class: indexeddb.js
export class IndexedDB {

  // connect to IndexedDB database
  constructor(dbName, dbVersion, dbUpgrade) {

    return new Promise((resolve, reject) => {

      // connection object
      this.db = null;

      // no support
      if (!('indexedDB' in window)) return reject('not supported');

      // open database
      const dbOpen = indexedDB.open(dbName, dbVersion);

      if (dbUpgrade) {

        // database upgrade event
        dbOpen.onupgradeneeded = e => {
          // the upgrade transaction is passed too so existing stores can be altered
          dbUpgrade(dbOpen.result, e.oldVersion, e.newVersion, dbOpen.transaction);
        };

      }

      // success event handler
      dbOpen.onsuccess = () => {
        this.db = dbOpen.result;
        resolve( this );
      };

      // failure event handler
      dbOpen.onerror = () => {
        reject(`IndexedDB error: ${ dbOpen.error ? dbOpen.error.message : 'unknown' }`);
      };

    });

  }

  // more methods coming later...

}

A performance.js script loads this module and instantiates a new IndexedDB object named perfDB after the page has loaded. It passes the database name (performance), version (1), and an upgrade function. The indexeddb.js constructor calls the upgrade function with the database connection object, the current database version, and the new version:

// performance.js
import { IndexedDB } from './indexeddb.js';

window.addEventListener('load', async () => {

  // IndexedDB connection
  const perfDB = await new IndexedDB(
    'performance',
    1,
    (db, oldVersion, newVersion) => {

      console.log(`upgrading database from ${ oldVersion } to ${ newVersion }`);

      switch (oldVersion) {

        case 0: {

          const
            navigation = db.createObjectStore('navigation', { keyPath: 'date' }),
            resource = db.createObjectStore('resource', { keyPath: 'id', autoIncrement: true });

          resource.createIndex('dateIdx', 'date', { unique: false });
          resource.createIndex('nameIdx', 'name', { unique: false });

        }


      }

  });

  // more code coming later...

});

At some point it will become necessary to change the database schema — perhaps to add new object stores, indexes, or data updates. In that situation, you must increment the version (from 1 to 2). The next page load will trigger the upgrade handler again so you can add a further block to the switch statement, e.g. to create an index named durationIdx on the duration property in the resource object store:

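case 1: {
  // existing stores can only be reached through the upgrade transaction,
  // which is dbOpen.transaction inside onupgradeneeded; this assumes the
  // wrapper passes it to the upgrade callback as a fourth "transaction" argument
  const resource = transaction.objectStore('resource');
  resource.createIndex('durationIdx', 'duration', { unique: false });
}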

The usual break at the end of each case block is omitted. When someone accesses the application for the first time, the case 0 block will run followed by case 1 and all subsequent blocks. Anyone already on version 1 would run the updates starting at case 1. IndexedDB schema update methods include createObjectStore() and deleteObjectStore() on the database connection, and createIndex() and deleteIndex() on an object store.
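To illustrate the fall-through behaviour without touching a real database, the following sketch records which upgrade steps would run for a given starting version. The function name `upgradeSteps`, the step descriptions, and the `case 2` block are placeholders invented for this example.

```javascript
// hypothetical helper: list the upgrade steps that run for a given old version
function upgradeSteps(oldVersion) {

  const steps = [];

  switch (oldVersion) {

    // no break statements: execution falls through to every later case
    case 0:
      steps.push('create navigation and resource stores');

    case 1:
      steps.push('create durationIdx index');

    case 2:
      steps.push('placeholder for a future schema change');

  }

  return steps;

}

console.log(upgradeSteps(0).length); // 3: a new visitor runs every step
console.log(upgradeSteps(1).length); // 2: version 1 skips the first step
console.log(upgradeSteps(3).length); // 0: already up to date
```

Some linters flag intentional fall-through, so a short comment in each case block can help future maintainers.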

Everyone who loads the page will be on the same version — unless they have the app running in two or more tabs. To avoid conflicts, the database connection onversionchange handler can be added to indexeddb.js which prompts the user to reload the page:

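// version change handler: add inside dbOpen.onsuccess once this.db is set
// (the versionchange event fires on the connection, not the open request)
this.db.onversionchange = () => {

  this.db.close();
  alert('Database upgrade required - reloading...');
  location.reload();

};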

You can now add the performance.js script to a page and run it to check that object stores and indexes are created (DevTools Application or Storage panels):

<script type="module" src="./performance.js"></script>

Record performance statistics

All IndexedDB operations are wrapped in a transaction. The following process is used:

  1. Create a database transaction object. This defines one or more object stores (single string or array of strings) and the access type: "readonly" for fetching data, or "readwrite" for inserts and updates.

  2. Create a reference to an objectStore() within the scope of the transaction.

  3. Run any number of add() (inserts only) or put() methods (inserts and updates).

Add a new update() method to the IndexedDB class in indexeddb.js:

  // store item
  update(storeName, value, overwrite = false) {

    return new Promise((resolve, reject) => {

      // new transaction
      const
        transaction = this.db.transaction(storeName, 'readwrite'),
        store = transaction.objectStore(storeName);

      // ensure values are in array
      value = Array.isArray(value) ? value : [ value ];

      // write all values
      value.forEach(v => {
        if (overwrite) store.put(v);
        else store.add(v);
      });

      transaction.oncomplete = () => {
        resolve(true); // success
      };

      transaction.onerror = () => {
        reject(transaction.error); // failure
      };

    });

  }

This adds or updates (if the overwrite parameter is true) one or more values in the named store and wraps the whole transaction in a Promise. The transaction.oncomplete event handler runs when the transaction auto-commits at the end of the function and all database operations are complete. A transaction.onerror handler reports errors.

IndexedDB error events bubble up from the request that failed to its transaction and then to the database connection. You could create a single onerror handler on the database which receives all errors. Like DOM events, propagation can be stopped with event.stopPropagation().
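A catch-all handler of that kind might look like the sketch below. The function name `logAllErrors` and its `log` parameter are assumptions made for illustration; `db` is the connection object obtained from `dbOpen.result`.

```javascript
// hypothetical helper: attach one error handler to the database connection
function logAllErrors(db, log = console.error) {

  db.onerror = event => {
    // event.target is the request that failed; its error is a DOMException
    const error = event.target && event.target.error;
    log('IndexedDB error:', error ? error.name : 'unknown');
  };

  return db;

}

// usage inside the wrapper's onsuccess handler (assumption):
// this.db = logAllErrors(dbOpen.result);
```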

The performance.js script can now report page navigation metrics:

  // record page navigation information
  const
    date = new Date(),

    nav = Object.assign(
      { date },
      performance.getEntriesByType('navigation')[0].toJSON()
    );

  await perfDB.update('navigation', nav);

and resource metrics:

  const res = performance.getEntriesByType('resource').map(
    r => Object.assign({ date }, r.toJSON())
  );

  await perfDB.update('resource', res);

In both cases, a date property is added to the cloned timing objects so it becomes possible to search for data within a specific period.

Reading performance records

IndexedDB searching is rudimentary compared to other databases. You can only fetch records by their key or an indexed value. You cannot use equivalents of SQL JOIN or functions such as AVERAGE() and SUM(). All record processing must be handled with JavaScript code; a background Web Worker thread could be a practical option.
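As an example of that kind of processing, the sketch below calculates the equivalent of COUNT, SUM, and AVERAGE over an array of records, such as one returned by a getAll() request. The function name `summarize` and the sample records are invented for illustration.

```javascript
// hypothetical helper: calculate count, sum, and average of a numeric property
function summarize(records, property) {

  let count = 0, sum = 0;

  for (const record of records) {
    const value = record[property];
    // skip records where the property is missing or not a number
    if (typeof value === 'number' && !Number.isNaN(value)) {
      count++;
      sum += value;
    }
  }

  return { count, sum, average: count ? sum / count : null };

}

// sample records shaped like resource timing entries (values are invented)
const sample = [
  { name: 'a.css', duration: 10 },
  { name: 'b.js', duration: 30 },
  { name: 'c.png' } // no duration: ignored
];

console.log(summarize(sample, 'duration')); // { count: 2, sum: 40, average: 20 }
```

Because the function only needs a plain array, it could run unchanged inside a Web Worker.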

You can retrieve a single record by passing its key to an object store or index’s .get() method and defining an onsuccess handler:

// EXAMPLE CODE
const

  // new readonly transaction
  transaction = db.transaction('resource', 'readonly'),

  // get resource object store
  resource = transaction.objectStore('resource'),

  // fetch record 1
  request = resource.get(1);

// request complete
request.onsuccess = () => {
  console.log('result:', request.result);
};

// request failed
request.onerror = () => {
  console.log('failed:', request.error);
};
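The onsuccess and onerror pattern above is the same for every request, so it can be wrapped once and reused with async/await. The following is a minimal sketch; the helper name `requestToPromise` is an assumption and is not part of the wrapper class shown earlier.

```javascript
// hypothetical helper: convert any IDBRequest into a Promise
function requestToPromise(request) {

  return new Promise((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });

}

// usage inside an async function, assuming `db` is an open connection:
// const store = db.transaction('resource', 'readonly').objectStore('resource');
// const record = await requestToPromise(store.get(1));
```

Note that a transaction closes once it has no pending requests, so awaiting unrelated Promises between requests on the same transaction is likely to fail.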

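Methods similar to .get() on object stores and indexes include getKey(query), getAll(query, count), getAllKeys(query, count), and count(query).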

The query can also be a KeyRange argument to find records within a range, e.g. IDBKeyRange.bound(1, 10) returns all records with key between 1 and 10 inclusive:

request = resource.getAll( IDBKeyRange.bound(1, 10) );

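KeyRange options include IDBKeyRange.lowerBound(x), IDBKeyRange.upperBound(x), IDBKeyRange.bound(lower, upper), and IDBKeyRange.only(x).

The lowerBound, upperBound, and bound methods have optional exclusive flags, e.g. IDBKeyRange.bound(1, 10, true, false) matches keys greater than 1 (but not 1 itself) and less than or equal to 10.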
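The effect of the exclusive flags can be checked with a plain JavaScript imitation of the range test. This is an illustration only: the function `inBounds` is invented here, it assumes keys of one simple type (numbers, strings, or dates), and in a browser the real check is available through the `includes()` method of a key range object.

```javascript
// illustration only: mimic how a bounded key range decides whether a key matches
function inBounds(key, lower, upper, lowerOpen = false, upperOpen = false) {

  // an "open" bound excludes the boundary value itself
  const aboveLower = lowerOpen ? key > lower : key >= lower;
  const belowUpper = upperOpen ? key < upper : key <= upper;

  return aboveLower && belowUpper;

}

// equivalent of IDBKeyRange.bound(1, 10, true, false)
console.log(inBounds(1, 1, 10, true, false));  // false: lower bound excluded
console.log(inBounds(10, 1, 10, true, false)); // true: upper bound included
console.log(inBounds(1, 1, 10));               // true: both bounds inclusive by default
```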

Reading a whole dataset into an array becomes impractical as the database grows larger. IndexedDB provides cursors which can iterate through each record one at a time. The .openCursor() method is passed a KeyRange and optional direction string ("next", "nextunique", "prev", or "prevunique").

Add a new fetch() method to the IndexedDB class in indexeddb.js to search an object store or index with upper and lower bounds with a callback function that is passed the cursor. Two further methods are also required:

  1. index(storeName, indexName) — returns either an object store or index on that store, and
  2. bound(lowerBound, upperBound) — returns an appropriate KeyRange object.

  // get items using cursor
  fetch(storeName, indexName, lowerBound = null, upperBound = null, callback) {

    const
      request = this.index(storeName, indexName)
        .openCursor( this.bound(lowerBound, upperBound) );

    // pass cursor to callback function
    request.onsuccess = () => {
      if (callback) callback(request.result);
    };

    request.onerror = () => {
      console.error('fetch failed:', request.error); // failure
    };

  }


  // start a new read transaction on object store or index
  index(storeName, indexName) {

    const
      transaction = this.db.transaction(storeName),
      store = transaction.objectStore(storeName);

    return indexName ? store.index(indexName) : store;

  }


  // get bounding object
  bound(lowerBound, upperBound) {

    // compare with null so falsy keys such as 0 are still valid bounds
    if (lowerBound != null && upperBound != null) return IDBKeyRange.bound(lowerBound, upperBound);
    else if (lowerBound != null) return IDBKeyRange.lowerBound(lowerBound);
    else if (upperBound != null) return IDBKeyRange.upperBound(upperBound);

  }

The performance.js script can now retrieve page navigation metrics, e.g. return all domContentLoadedEventEnd during June 2021:

  // fetch page navigation objects in June 2021
  perfDB.fetch(
    'navigation',
    null, // not an index
    new Date(2021, 5, 1), // lower: 1 June 2021 (months are zero-based)
    new Date(2021, 6, 1), // upper: 1 July 2021
    cursor => { // callback function

      if (cursor) {
        console.log(cursor.value.domContentLoadedEventEnd);
        cursor.continue();
      }

    }
  );
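Month boundaries are easy to get slightly wrong because JavaScript months are zero-based. As a sketch, a small helper can build both bounds from a year and month; the name `monthRange` and its return shape are assumptions made for this example.

```javascript
// hypothetical helper: first instant of a month and first instant of the next
// monthIndex is zero-based like the Date constructor (5 = June)
function monthRange(year, monthIndex) {
  return {
    lower: new Date(year, monthIndex, 1),
    upper: new Date(year, monthIndex + 1, 1) // Date rolls month 12 over to January
  };
}

const june = monthRange(2021, 5);
console.log(june.lower.getMonth(), june.upper.getMonth()); // 5 6

// in a browser, exclude the upper bound so midnight on 1 July is not matched:
// IDBKeyRange.bound(june.lower, june.upper, false, true);
```

The dates use the visitor's local time zone, which matches the `new Date()` value stored with each record.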

Similarly, you can calculate the average download time for a specific file and report it back to OpenReplay:

  // calculate average download time using index
  let
    filename = 'http://mysite.com/main.css',
    count = 0,
    total = 0;

  perfDB.fetch(
    'resource', // object store
    'nameIdx',  // index
    filename,   // matching file
    filename,
    cursor => { // callback

      if (cursor) {

        count++;
        total += cursor.value.duration;
        cursor.continue();

      }
      else {

        // all records processed
        if (count) {

          const avgDuration = total / count;

          console.log(`average duration for ${ filename }: ${ avgDuration } ms`);

          // report to OpenReplay (only when its asayer tracker is loaded)
          if (typeof asayer !== 'undefined') asayer.event(`${ filename }`, { avgDuration });

        }

      }

    }
  );

In both cases, the cursor object is passed to the callback function where it can:

  1. obtain the record value with cursor.value
  2. advance to the next record with cursor.continue()
  3. move forward N records with cursor.advance(N)
  4. update the record with cursor.update(data), or
  5. delete the record with cursor.delete()

cursor is null when all matching records have been processed.
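Because the end of the data is signalled by a null cursor, cursor iteration can also be wrapped in a Promise that resolves once every record has been visited. The sketch below is one possible approach; the helper name `eachRecord` and the convention of returning false to stop early are assumptions, not part of IndexedDB.

```javascript
// hypothetical helper: walk a cursor request and resolve once it is exhausted
// `request` is the IDBRequest returned by openCursor()
// `onRecord` receives each record value; return false to stop early
function eachRecord(request, onRecord) {

  return new Promise((resolve, reject) => {

    let processed = 0;

    // onsuccess fires once per record, then a final time with a null cursor
    request.onsuccess = () => {

      const cursor = request.result;
      if (!cursor) return resolve(processed); // no more records

      processed++;
      if (onRecord(cursor.value) === false) return resolve(processed);

      cursor.continue(); // triggers the next onsuccess

    };

    request.onerror = () => reject(request.error);

  });

}

// usage inside an async function, assuming `store` is an object store or index:
// const total = await eachRecord(store.openCursor(), value => console.log(value));
```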

Check Remaining Storage Space

Browsers allocate a significant volume of storage to IndexedDB but it will eventually run out. The new Promise-based StorageManager API can calculate the space remaining for the domain:

(async () => {

  if (!navigator.storage) return;

  const
    estimate = await navigator.storage.estimate(),

    // calculate remaining storage in MB
    available = Math.floor((estimate.quota - estimate.usage) / 1024 / 1024);

  console.log(`${ available } MB remaining`);

})();

The API is not supported in IE or older versions of Safari. As the limit is approached, you could choose to remove older records.
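One way to remove older records is sketched below. It assumes the navigation object store described earlier (keyed by date); the function names `remainingMB` and `pruneIfLow` and the default thresholds of 50 MB and 30 days are arbitrary values chosen for illustration.

```javascript
// hypothetical helper: convert a storage estimate into remaining megabytes
function remainingMB(estimate) {
  return Math.floor((estimate.quota - estimate.usage) / 1024 / 1024);
}

// hypothetical helper: delete navigation records older than maxAgeDays
// once less than minFreeMB remains; resolves to true when a delete ran
// assumes `db` is an open connection and the store is keyed by date
async function pruneIfLow(db, minFreeMB = 50, maxAgeDays = 30) {

  // give up quietly when the StorageManager API is unavailable
  if (typeof navigator === 'undefined' || !navigator.storage || !navigator.storage.estimate) return false;

  const estimate = await navigator.storage.estimate();
  if (remainingMB(estimate) >= minFreeMB) return false;

  const cutoff = new Date(Date.now() - maxAgeDays * 24 * 60 * 60 * 1000);

  return new Promise((resolve, reject) => {
    const transaction = db.transaction('navigation', 'readwrite');
    // delete() accepts a key range, so one call removes every older record
    transaction.objectStore('navigation').delete(IDBKeyRange.upperBound(cutoff));
    transaction.oncomplete = () => resolve(true);
    transaction.onerror = () => reject(transaction.error);
  });

}
```

The resource store is keyed by an auto-incrementing ID rather than a date, so pruning it would need a cursor over its dateIdx index instead.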

Conclusion

IndexedDB is one of the older and more complex browser APIs but you can add wrapper methods to adopt Promises and async/await. Pre-built libraries such as idb can help if you’d rather not do that yourself.

Despite its drawbacks and some unusual design decisions, IndexedDB remains the fastest and largest browser-based data store.