Persisting state locally

If you're going to build a highly reliable web application, or if you want your app to be usable while offline, you have to get comfortable with caching data.

Caching and cache invalidation can cause a lot of confusion. Let's start by defining two major categories of things we'll be caching:

  1. The assets required to run the "application shell." This includes the HTML, JavaScript, and CSS needed to render and run the application. None of this should ever be assumed to be secret or sensitive. These are the public assets that are required to run your app. They should have zero access control because they are not sensitive in and of themselves.

  2. The data fetched by the application when it is running. This may very well be sensitive, user-specific data. The point is, this is data fetched and used by the application. It is not the application itself.

When I talk about persisting "state" locally in this chapter, I'm talking entirely about #2.

How to store it

There are a few options for where we could potentially put this type of data:

  1. localStorage: Dead simple, but synchronous and blocking. Local Storage is fine for small, simple things like access tokens, but it's not great for larger pieces of data or for data that we will read and write frequently. It never reliably self-expires.
  2. IndexedDB: Overwrought API, but async and can store more data. Never reliably self-expires.
  3. cookies: Simple, but inefficient as the data gets sent along with all requests. On the plus side, we can tell the browser to expire it at a future point.
  4. window.caches: The caches API is how ServiceWorkers store responses to requests. It could certainly be used to cache API calls too. I've seen people do this, but personally, I disagree with that choice because you lose the separation between user-fetched data and application assets. Also, it is not as commonly available as IndexedDB.

Given the list of pros and cons, IndexedDB seems like the best choice as long as we can overcome its downsides. But perhaps we can simplify the API a bit. All we need for our purposes is a simple key-value store where we can say "cache this stuff at this key." As it turns out, this is a solved problem. There's a tiny little library called idb-keyval written by Jake Archibald, one of the original architects of ServiceWorker. It gives us a nice promise-based API for treating IndexedDB like a key-value store, and it handles all the ugliness of creating a DB, etc. The best part is that it does all this in about 500 bytes of code, which is puny once you've minified and gzipped it. So, let's roll with that.

When in our Redux app should we do the persisting?

Where our persistence code should live, and what should trigger it, is something I've waffled on a bit in various apps. Initially, we may be tempted to think about data persistence as an action that should be dispatched; after all, that's how we trigger most things, right? I would suggest that's not ideal for this. Unless we're going to indicate in the UI specifically what was cached for offline use, we probably should think of caching as just a side effect of using the app.

In fact, for the vast majority of applications, you'll likely want to treat this cache as an optional enhancement. If we're unable to read in existing cached data when the app loads, it shouldn't cause the app to error; it should just fail silently. But if we have cached data that matches our version number (more on this later) and we can read it successfully, great! That will make our app start up more quickly and will give the user something to look at on the first render, even if offline.

Let's identify what we'd like this to look like. For me the answer is usually this:

After certain actions, I want to persist the state from certain reducers.

The piece about only persisting the state of certain reducers makes it much simpler to "re-hydrate" our app when it starts up again. You see, the way Redux works is that the initial state you give Redux when you call createStore doesn't have to include data for every reducer. With a root reducer built by combineReducers, it will just grab each key individually and pass it to the reducer with the same name. So if we don't want to persist everything in our application state (which we often don't), that's ok. Instead, it makes sense to only persist the things that will save us time next time the app starts: typically this means data that we've fetched from an API. I try to isolate these pieces of data into their own reducers so they can be persisted and re-inflated as a whole without needing to modify the data at all. Also, if we use the approach of storing the metadata along with the data, it makes perfect sense to cache all of that as well. This way, the app knows exactly how stale the data is when it starts up.
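To make the "each key goes to the reducer with the same name" idea concrete, here is a minimal sketch. The combineReducers below is a simplified stand-in written for illustration, not Redux's actual implementation, and the users and ui reducers are hypothetical. It shows that a cached object containing only one slice is enough: any reducer whose slice is missing simply falls back to its own default state.

```javascript
// Simplified stand-in for Redux's combineReducers (illustration only).
const combineReducers = reducers => (state = {}, action) => {
  const nextState = {}
  Object.keys(reducers).forEach(key => {
    // Each reducer only sees its own slice. A missing slice is
    // `undefined`, which triggers the reducer's default argument.
    nextState[key] = reducers[key](state[key], action)
  })
  return nextState
}

// Hypothetical reducers: `users` holds fetched data, `ui` does not.
const users = (state = { data: null, lastFetch: null }, action) =>
  action.type === 'FETCH_USERS_SUCCESS'
    ? { data: action.payload, lastFetch: action.timestamp }
    : state
const ui = (state = { sidebarOpen: false }, action) => state

const rootReducer = combineReducers({ users, ui })

// Pretend this came out of the cache: only `users` was persisted.
const cached = {
  users: { data: [{ id: 1, name: 'Ada' }], lastFetch: 1519938218320 }
}

// '@@INIT' is a placeholder for the internal action Redux dispatches
// when a store is created.
const initialState = rootReducer(cached, { type: '@@INIT' })
console.log(initialState.users.lastFetch) // 1519938218320
console.log(initialState.ui) // { sidebarOpen: false }
```

Because the cached slice carries its own lastFetch metadata, the app can tell how stale the data is as soon as it boots.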

Ok, so how do we actually do this?

First, we have to decide where we want to put the code that makes this happen. My approach is Redux middleware. This is because middleware is essentially the only place we can see what actions are being dispatched. If you'll recall, the callback we provide to store.subscribe() gets called without any arguments; it doesn't tell us what action(s) were just dispatched, it just tells us "Hey, something happened!".

We need some "smart" middleware that can asynchronously persist stuff when certain things happen.

To do this, we need a list of action types and the names of their reducers. This way, we can look at all the action types as they come through, and figure out what (if any) reducers we should persist as a result. Typically, this is something we know ahead of time, and it doesn't necessarily need to be dynamic.

Let's write some middleware:

("Store next action" remember?)

// this little tiny helper library is just
// requestIdleCallback with a fallback if it
// doesn't exist.
import ric from 'ric-shim'
// our caching library (more on this in a bit)
import cacheLibrary from './somewhere/over/the/rainbow'

// By making this an object where the keys
// are the action types we can very quickly
// and efficiently determine whether a given
// action should cause persistence.
// By making the value an array, we can optionally
// persist several different reducers from a single
// action type.
const actionsToPersist = {
  FETCH_USERS_SUCCESS: ['users']
}

const persistMiddleware = store => next => action => {
  const result = next(action)
  const shouldPersist = actionsToPersist[action.type]

  if (shouldPersist) {
    // We often don't have any urgency here
    // we can just tell the browser to do this
    // when it's not busy. Which is exactly
    // what requestIdleCallback lets us do.
    ric(() => {
      const appState = store.getState()
      shouldPersist.forEach(reducerName => {
        const stateToPersist = appState[reducerName]
        cacheLibrary.set(reducerName, stateToPersist)
      })
    })
  }

  return result
}
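If you want to sanity-check middleware like this outside a browser, one option is to pass the cache and the scheduler in as arguments instead of importing them. The sketch below is a variation on the code above, not a replacement for it: createPersistMiddleware, fakeCache, fakeStore, and runImmediately are all made-up names for illustration.

```javascript
// Hypothetical factory: same logic as persistMiddleware, but the cache,
// the scheduler, and the action map are passed in rather than imported.
const createPersistMiddleware = ({ cache, schedule, actionsToPersist }) =>
  store => next => action => {
    const result = next(action)
    const reducerNames = actionsToPersist[action.type]
    if (reducerNames) {
      schedule(() => {
        const appState = store.getState()
        reducerNames.forEach(name => cache.set(name, appState[name]))
      })
    }
    return result
  }

// Stand-ins for a real cache, requestIdleCallback, and a Redux store.
const written = new Map()
const fakeCache = { set: (key, value) => written.set(key, value) }
const runImmediately = fn => fn()
let state = { users: null, ui: { sidebarOpen: false } }
const fakeStore = { getState: () => state }

const middleware = createPersistMiddleware({
  cache: fakeCache,
  schedule: runImmediately,
  actionsToPersist: { FETCH_USERS_SUCCESS: ['users'] }
})

// `next` normally hands the action on to the reducers. Here it just
// updates our fake state directly.
const next = action => {
  if (action.type === 'FETCH_USERS_SUCCESS') {
    state = { ...state, users: action.payload }
  }
  return action
}
const dispatch = middleware(fakeStore)(next)

dispatch({ type: 'TOGGLE_SIDEBAR' }) // not in the map, nothing is written
dispatch({ type: 'FETCH_USERS_SUCCESS', payload: ['ada'] })
console.log([...written.keys()]) // [ 'users' ]
```

Note that the state is read inside the scheduled callback, after next(action) has run, so what gets persisted is the state that results from the action.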

Re-inflating state when our app starts up

Now we get to use the initialData argument we mentioned when we first introduced createStore. The only unfortunate part here is that we generally have to read our cached state before we can create our store.

There isn't a store.replaceState() or store.mergeState() method on the store, so we have to have our data up front. The sooner in the app "boot process" we can start this cache read, the better, because the app will likely have to wait a tiny bit for it to finish.

Many of my apps end up starting like this:

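import { createStore } from 'redux'
// illustrative path: point this at your own root reducer
import rootReducer from './reducers'
import cacheLibrary from './some-cache-helper'

cacheLibrary.getAll().then(data => {
  const store = createStore(rootReducer, data)
  // ...render the app with the store
})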
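Since the cache is meant to be an optional enhancement, it may be worth guarding this read so that a failure can't keep the app from starting. A minimal sketch, assuming a getAll function that returns a promise like the one above; readCacheSafely is a hypothetical helper name, and the two getAll functions below are stand-ins.

```javascript
// Hypothetical helper: resolves with the cached state, or with
// `undefined` if the read throws or rejects for any reason.
const readCacheSafely = getAll =>
  Promise.resolve()
    .then(() => getAll())
    .catch(() => undefined)

// Stand-ins for a cache read that fails and one that works.
const brokenGetAll = () => Promise.reject(new Error('IndexedDB unavailable'))
const workingGetAll = () => Promise.resolve({ users: ['ada'] })

const safeReads = Promise.all([
  readCacheSafely(brokenGetAll),
  readCacheSafely(workingGetAll)
])

safeReads.then(([fromBroken, fromWorking]) => {
  console.log(fromBroken) // undefined
  console.log(fromWorking) // { users: [ 'ada' ] }
})
```

Passing undefined as the second argument to createStore is the same as passing no initial state at all, so the app would simply start empty.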

Note: I should mention that it would be possible to add support for replacing the state. One way to do this would be to write a higher-order reducer that wraps our root reducer and adds support for a REPLACE_STATE action type that swaps out state. Personally, I tend not to do this, because I prefer to know what state I have when the app boots up so that my reactors know whether or not they need to trigger data fetches.
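For the curious, a higher-order reducer along those lines could look roughly like this. It is only a sketch: withReplaceState is a made-up name, and this version shallow-merges the payload over the current state (rather than replacing it wholesale) on the assumption that the cached data only covers some reducers.

```javascript
const REPLACE_STATE = 'REPLACE_STATE'

// Wraps a root reducer. REPLACE_STATE is handled here; every other
// action is passed through to the wrapped reducer untouched.
const withReplaceState = rootReducer => (state, action) => {
  if (action.type === REPLACE_STATE) {
    // Shallow merge: slices in the payload win, other slices are kept.
    return { ...state, ...action.payload }
  }
  return rootReducer(state, action)
}

// Hypothetical root reducer that never changes its state.
const baseReducer = (
  state = { users: null, ui: { sidebarOpen: false } },
  action
) => state

const reducer = withReplaceState(baseReducer)
let current = reducer(undefined, { type: '@@INIT' })
current = reducer(current, {
  type: REPLACE_STATE,
  payload: { users: ['ada'] }
})
console.log(current) // { users: [ 'ada' ], ui: { sidebarOpen: false } }
```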

There be dragons

Ok fine, so we can now persist state from specific reducers when specific actions occur and start with that data when the app starts up. We're done, right!?! No.

There are some challenges:

  1. What happens when the "shape" of our reducer state changes?

Let's assume we ship our app to lots of happy users who, as a result, end up persisting a bunch of stuff to their browser's IndexedDB. Later, we update our app to keep slightly differently shaped data in one of the reducers that are getting persisted. We renamed some keys or changed some other aspect of the state in the reducer that is being persisted. Now our users go to open up our app and, since our new code expects the data to be shaped differently than what we pulled out of cache, our app crashes before it can even start up.

This type of error is terrible because it's going to keep crashing! Until they go clear their local IndexedDB, it's not going to work! The same thing is going to happen again if they refresh. My point is, you can really screw things up here.

We have to have some way to version our cache. Without this, we can really make a mess. Fortunately, this isn't all that difficult to do. The more difficult part is realizing when you've made a change that will require you to bump the version number and then remembering to do that along with the release.

  2. What about the age of the data?

As I mentioned, IndexedDB doesn't just expire itself after a while. Sure, the browser may dump that data at some point to conserve space, but generally, we should assume this data will live a long time. When the app starts back up the version is not the only thing that matters. We also want to ensure the data is not older than our specified threshold. So we want to be able to pass a maximum age parameter when we retrieve our data so that we can ignore anything older.

  3. What about personal, user-specific data?

We do not, under any circumstances, want to load data from another user from cache! These days, most people have personal devices which certainly makes this less likely to happen. However, it would cause a tremendous amount of distrust and is therefore completely unacceptable. If the persisted data is public anyway, it's not a big deal, but most apps these days let users do something with their data. So, a lot of times we're persisting data that's tied to a specific user.

The easy answer is to make sure you always "clean up" when a user explicitly logs out. But that's not always something we can control. Sometimes authentication is handled on a separate URL. So what if they don't log out, but instead their session expires, and the next time the app runs someone else has logged in?

Handling these issues with an abstraction

All three of the issues mentioned above can be handled by using an abstraction layer on top of the simple idb-keyval library. We can wrap it so that whenever we cache something, it also stores a version string and a timestamp. By doing this, we can fairly easily handle the issues I just described above.

By storing a timestamp, we can have the cache.getAll() method only return the keys that are less than some specified maximum age. In this way, we'd simply ignore anything older. Additionally, we can make it so that if you try to retrieve something but pass a version number that is different from the one we cached, it deletes that cache key instead of returning it.

So as an example, we may have a users key in our cache that contains something like this:

{
  date: 1519938218320,
  version: '1',
  // the content of the reducer
  data: { ... }
}

As I mentioned, if the version we specify when retrieving it doesn't match the stored one, we can make it so that the abstraction returns null as if nothing was there, while deleting the contents behind the scenes.
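Here is a rough sketch of what such a wrapper might look like. Everything in it is an assumption made for illustration: createCache and createMemoryStore are hypothetical names, the version and maxAge options are bound once when the cache is created, and an in-memory Map stands in for IndexedDB so the example can run anywhere. The stand-in's method names (get, set, del, keys) are modeled on the functions idb-keyval exports.

```javascript
// In-memory stand-in for a promise-based key-value store.
const createMemoryStore = () => {
  const map = new Map()
  return {
    get: async key => map.get(key),
    set: async (key, value) => { map.set(key, value) },
    del: async key => { map.delete(key) },
    keys: async () => [...map.keys()]
  }
}

// Hypothetical wrapper: stamps every entry with a version and a date
// on the way in, and checks both on the way out.
const createCache = (store, { version, maxAge, now = Date.now }) => {
  const set = (key, data) => store.set(key, { version, date: now(), data })

  const get = async key => {
    const entry = await store.get(key)
    if (!entry) return null
    if (entry.version !== version) {
      // Wrong version: delete it and act as if nothing was there.
      await store.del(key)
      return null
    }
    // Too old: ignore it.
    if (now() - entry.date > maxAge) return null
    return entry.data
  }

  const getAll = async () => {
    const result = {}
    for (const key of await store.keys()) {
      const data = await get(key)
      if (data !== null) result[key] = data
    }
    return result
  }

  return { set, get, getAll }
}

// Usage with a fake clock so the ages are predictable.
const demo = async () => {
  const store = createMemoryStore()
  let time = 1000
  const clock = () => time
  const v1 = createCache(store, { version: '1', maxAge: 500, now: clock })
  await v1.set('users', { ada: true })
  await v1.set('posts', ['hello'])

  time = 1200 // both entries are 200ms old
  const fresh = await v1.getAll()

  time = 1600 // 'users' is now 600ms old
  await v1.set('posts', ['hello again'])
  const afterExpiry = await v1.getAll()

  // Same store, new version string: old entries are deleted on read.
  const v2 = createCache(store, { version: '2', maxAge: 500, now: clock })
  const afterBump = await v2.getAll()
  const remainingKeys = await store.keys()
  return { fresh, afterExpiry, afterBump, remainingKeys }
}

const demoResult = demo()
demoResult.then(result => console.log(result))
```

In this sketch stale entries are only ignored, while entries with the wrong version are actively deleted, which matches the behavior described above.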

With those fairly basic capabilities, we've dealt with the versioning problem and the age problem. What about the user-specific data issue?

Since "version" is just a simple value that is compared to see if it's a perfect match, all we have to do is combine the version number from our config with a value that is associated with a user. Depending on how you handle authentication, there are a few different ways of doing this.

It may be tempting to use a userID for this, but the trouble is that sometimes you don't even have an ID for the user until you've fetched something from the API. So really, something tied to that session is probably best. If your application uses token-based authentication, some slice of the token itself is a simple way to do it. Applications frequently use localStorage for auth tokens. Using localStorage for this kind of thing makes sense to me, by the way. Typically these are short/quick reads. But the point is, if this is the case, you'd likely have a token available on application boot, and then you could use the first few characters from the token along with the version number from config to build a version string! It's probably inadvisable to use the whole token, since there's no reason to put sensitive information in more places than it needs to be.

But you could, for example, do something like this:

import { createStore } from 'redux'
import ms from 'milliseconds'
import config from './config'
// illustrative paths: point these at your own root reducer and cache wrapper
import rootReducer from './reducers'
import cacheHelper from './some-cache-helper'

// default to an empty string so the slice() below can't throw
let token = ''

try {
  // reading localStorage can throw (for example, when storage is disabled)
  token = localStorage.token || ''
} catch (e) {}

// we can grab a few characters from our token to combine with
// our version number from our config. In token-based auth systems
// that token basically *is* your session, so in this way we've
// successfully tied our data cache to that session.
const version = config.cacheVersion + '|' + token.slice(0, 10)

cacheHelper.getAll({ version, maxAge: ms.weeks(4) })
  .then(data => {
    const store = createStore(rootReducer, data)

    // ...etc
  })

For what it's worth, I've recently turned this versioned caching approach into a little library that I use. It's called money-clip and is available on npm.

Note: There are several popular open-source persistence libraries for Redux. The most prominent are redux-persist and redux-storage. I don't use either. I feel that they make things harder than they need to be, plus both are larger than Redux itself in terms of file size. By contrast, the approach I've described above is a minuscule amount of code and provides a straightforward approach for cache invalidation.

Using session hint cookies instead

If you're using cookies for authentication, you probably have a secure, HTTP-only cookie that contains the session ID. This information should not be made available to JavaScript. The ability to shield items like this from JavaScript is precisely why the HTTP-only flag exists. But, what you can do is also set a second cookie whenever a user successfully authenticates, and store a randomly generated string, like a UUID, that is intentionally exposed to JavaScript (is not HTTP-only). The app can then read this cookie value and use it to build a version string for the cache. In this way, you've again tied the ability to read the cache to that given session without exposing the actual session cookie to JavaScript. Instead, it merely serves as a hint to the app.

This approach can be extended to also give the application metadata about the session. For example, I've used a session hint cookie to store a little JSON-encoded object that includes the timestamp when the session will expire. By making the app aware of this type of metadata, you can warn users that their session will soon expire, etc.
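As a rough illustration, reading such a hint cookie and turning it into a version string might look like the sketch below. The cookie name session_hint, its id and expires fields, and the helper names are all assumptions made up for this example. The functions take the cookie string as an argument so they can be tried outside a browser; in the app you would pass in document.cookie.

```javascript
// Finds one cookie's raw value in a string like document.cookie.
const readCookie = (cookieString, name) => {
  const pairs = cookieString.split(';').map(pair => pair.trim())
  const match = pairs.find(pair => pair.startsWith(name + '='))
  return match ? match.slice(name.length + 1) : null
}

// Hypothetical format: the server sets `session_hint` to URL-encoded
// JSON such as {"id":"<random string>","expires":<ms timestamp>}.
const readSessionHint = cookieString => {
  const raw = readCookie(cookieString, 'session_hint')
  if (!raw) return null
  try {
    return JSON.parse(decodeURIComponent(raw))
  } catch (e) {
    // A malformed hint is treated the same as no hint at all.
    return null
  }
}

// Combine the version from config with the session-specific hint.
const buildCacheVersion = (cacheVersion, hint) =>
  hint ? cacheVersion + '|' + hint.id : null

const exampleCookies =
  'theme=dark; session_hint=' +
  encodeURIComponent(JSON.stringify({ id: '3f2b9c1e', expires: 1519941818320 }))

const hint = readSessionHint(exampleCookies)
console.log(buildCacheVersion('1', hint)) // 1|3f2b9c1e
console.log(hint.expires) // 1519941818320
```

When there is no hint cookie, buildCacheVersion returns null, which the app could treat as a signal to skip the cache read entirely.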

Chapter recap

  1. Client-side persistence is necessary for building network-independent applications.
  2. In my opinion, this type of data is best stored in IndexedDB.
  3. Instead of dispatching actions specifically for persisting things, persistence should happen as a side effect of using the app.
  4. We can write a piece of Redux middleware that looks for specific action types that will lead to persisting the contents of individual reducers. The actual persistence can be done in a requestIdleCallback to ensure minimal impact on performance.
  5. I recommend the idb-keyval library with a thin wrapper around it for this type of thing.
  6. Cache a timestamp and version along with the data.
  7. When reading from cache, determine the max acceptable age and pass in a version number.
  8. If there's a version mismatch the data should be deleted.
  9. Combine something tied to the user's particular session along with a version number from config to build the version number passed to the cache.
  10. One option is to use a portion of an auth token.
  11. If doing cookie-based sessions, you can create a second "session hint" cookie with a UUID, and optionally session metadata, and read that from the application to build a version string tied to the user.
