Cache busting in Node.js dynamic ESM imports
I’m porting JSDB to ECMAScript Modules (ESM) and one of the issues I had to look into was module cache invalidation.
JSDB is my little in-memory native JavaScript database that writes JavaScript operations to append-only JavaScript logs that have UMD headers. It loads these tables either via a dynamic require() call or, for very large tables, by streaming them in and evaluating them line by line.[1]
When a table is loaded in, it is stored in the require cache. Then, during a session, data might be written to it. If the table is then closed, we must invalidate the old version in the cache in case the table is opened again in the same session. Otherwise, all the changes that were written to it in that session will be lost, as the old version is read back in from the cache.
In the current CommonJS version, I lazily used a module called decache without giving it too much thought. The module works as described but is not entirely necessary for my simple use case (JSDB modules cannot include other modules, so there isn’t a complicated cache hierarchy to navigate). I could easily have removed entries from the module cache manually by deleting keys on the cache property of the require function.[2]
Module cache invalidation in CommonJS
To test this out yourself, create a file called module.cjs with the following content, which exports a random number:
module.exports = (() => Math.random())()
Then, create a file called index.cjs and require your module twice:
const path = require('path')
const modulePath = path.resolve('./module.cjs')
console.log(require(modulePath))
console.log(require(modulePath))
Run this and note that the same random number is displayed twice. The immediately-invoked function expression (IIFE) in your module only runs the first time the module is loaded; after that, its value is served from the cache.
To clear the cache, we simply have to remove its entry from the require.cache object.[3] If you console.log(require.cache) after the initial call to the require() function in your code, you will see output similar to the following:
[Object: null prototype] {
  '/home/aral/sandbox/common-js-require-cache/index.cjs': Module {
    // …
  },
  '/home/aral/sandbox/common-js-require-cache/module.cjs': Module {
    id: '/home/aral/sandbox/common-js-require-cache/module.cjs',
    path: '/home/aral/sandbox/common-js-require-cache',
    exports: 0.6628156804529053,
    // …
  }
}
To remove it from the cache, we simply delete its key. Add the following line of code between the two console.log() statements:
delete require.cache[modulePath]
(This is also why we used path.resolve() in the original example: the keys of require.cache are absolute paths, so we need the absolute path to the module.)
Now, when you run it, you should see two different numbers output as the module is loaded in fresh both times.
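If you find yourself doing this in more than one place, the two steps can be wrapped in a small helper. What follows is only a minimal sketch: requireFresh is a name I made up for illustration (it is not part of Node.js), and it assumes the module you are reloading does not require other modules of its own.

```javascript
// requireFresh is a hypothetical helper name, not a Node.js API.
function requireFresh(modulePath) {
  // require.resolve() returns the absolute filename that
  // require.cache uses as its key for this module.
  const resolvedPath = require.resolve(modulePath)

  // Forget the cached copy (a no-op if the module was never loaded)…
  delete require.cache[resolvedPath]

  // …so that this call evaluates the module again.
  return require(resolvedPath)
}
```

Keep in mind that a relative path passed to this helper is resolved relative to the file that contains requireFresh(), so passing an absolute path, as in the example above, is the safer choice.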
Cache invalidation in ESM with CommonJS-style requires
If I wanted to, I could still have JSDB output its tables in CommonJS or UMD modules even though I was using ESM for it. Why would I want to do that?
Well, for one thing, there is a very important difference in dynamic module loading between CommonJS and ESM: the former is synchronous while the latter is asynchronous. And refactoring previously-synchronous code to be asynchronous has cascading effects throughout your application and can complicate your library’s interface.
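To make that difference concrete, here is a minimal sketch that you could save as a CommonJS file (the variable names are mine and the built-in path module is just a convenient stand-in for a table). The require() call hands back the module’s exports straight away, whereas import() hands back a promise you have to wait on.

```javascript
// require() is synchronous: the exports are available immediately.
const viaRequire = require('node:path')

// import() is asynchronous: you get a Promise that resolves
// to the module namespace object at some later point.
const viaImport = import('node:path')

console.log(typeof viaRequire.join)        // function
console.log(viaImport instanceof Promise)  // true

viaImport.then(namespace => {
  console.log(typeof namespace.join)       // function
})
```

Every function that sits between your public API and that import() call has to become asynchronous too, which is the cascading effect mentioned above.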
So if I were lazy or if I didn’t want to make my API asynchronous (which is less of an issue now that we have top-level await), I could simply use the createRequire() method in the built-in module called module to create my own require() function that works under ESM. I detail how to do this in my previous CommonJS to ESM in Node.js post.
Here’s what the above example would look like if that’s the direction you wanted to go.
Your module.cjs would stay the same and you’d load it in from an index.mjs:
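import path from 'path'
import { createRequire } from 'module'
// Create a CommonJS-style require() function that works inside an ES module.
const require = createRequire(import.meta.url)
const modulePath = path.resolve('./module.cjs')
console.log(require(modulePath))
// Remove the cached module so that the next require() loads it afresh.
delete require.cache[modulePath]
console.log(require(modulePath))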
Cache invalidation in ESM with dynamic imports
Note: This will leak memory and eventually crash your system. Every unique URL you import adds another module to the ESM cache, and garbage collecting stale ESM modules is not a solved problem as of February 2021.
The ESM version of JSDB, however, will be 100% ESM and so it will write out ESM modules and use dynamic import() calls to load them in.[4]
Cache invalidation with dynamic imports is different and in some ways much easier. You don’t have a require.cache object to work with like you do in CommonJS. Instead, you can use a trick as old as the web itself (well, almost): just tack a unique query string onto the URL of the module you’re loading.
So this is what cache invalidation looks like in pure ESM:
// index.mjs
const modulePath = './module.mjs'
async function importFresh(modulePath) {
  const cacheBustingModulePath = `${modulePath}?update=${Date.now()}`
  return (await import(cacheBustingModulePath)).default
}
console.log(await importFresh(modulePath))
console.log(await importFresh(modulePath))
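The trick works because Node.js keys its ESM module cache by the full URL of the module, query string included. The following is a minimal sketch that shows this, not part of JSDB: the demo() function, the temporary module it writes and the ?update=1 and ?update=2 values are all made up for illustration.

```javascript
// Built-ins are loaded with import() inside the function so that this
// snippet runs unchanged whether you save it as .mjs or .cjs.
async function demo() {
  const fs = await import('node:fs')
  const os = await import('node:os')
  const path = await import('node:path')
  const { pathToFileURL } = await import('node:url')

  // Write a throwaway module that exports a random number.
  const directory = fs.mkdtempSync(path.join(os.tmpdir(), 'esm-cache-demo-'))
  const file = path.join(directory, 'module.mjs')
  fs.writeFileSync(file, 'export default Math.random()\n')
  const moduleUrl = pathToFileURL(file).href

  // Same URL twice: the second import is served from the cache.
  const first = (await import(`${moduleUrl}?update=1`)).default
  const again = (await import(`${moduleUrl}?update=1`)).default

  // Different query string: the module is evaluated again.
  const fresh = (await import(`${moduleUrl}?update=2`)).default

  return { first, again, fresh }
}
```

Here, first and again hold the same number while fresh holds a different one. It also hints at where the memory leak comes from: each distinct URL is a separate module as far as the cache is concerned, and none of them are ever evicted.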
One caveat is that the relative path I used in the example above works because the importFresh() function is in the same file as the code that calls it. If you were to refactor it out to a common module, you would have to use absolute paths. Otherwise, module resolution would occur relative to the file that contains your importFresh() function, not relative to the file that calls it.
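One way around that caveat is to have the caller say what its relative paths are relative to. This is only a sketch under my own assumptions and not necessarily how any published module does it: the baseUrl parameter and the importCount counter are names I’ve introduced here. The counter is there because Date.now() only has millisecond resolution, so two calls within the same millisecond would otherwise produce the same URL and hit the cache.

```javascript
// Hypothetical counter: keeps URLs unique even when two calls
// happen within the same millisecond.
let importCount = 0

// baseUrl is a hypothetical extra parameter: the caller passes
// its own import.meta.url so relative paths resolve from there.
async function importFresh(modulePath, baseUrl) {
  const url = new URL(modulePath, baseUrl)
  url.searchParams.set('update', `${Date.now()}-${importCount++}`)
  return (await import(url.href)).default
}
```

From the calling module, you would then write await importFresh('./module.mjs', import.meta.url). If you move the function into its own file, add the export keyword in front of it.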
To npm install or not to npm install, that is the question
I actually went ahead and made a tiny Node module called @small-tech/import-fresh.
(But, as I mention in the readme, consider whether you really need to add yet another dependency to your project or whether you can get away with just using vanilla JavaScript based on what you know from above.)
[1] If you’re using JSDB with datasets of these sizes, you will also have to configure your system and Node.js to increase the heap size, etc. While there are legitimate use cases for such large datasets, it’s also possible that you’re up to no good. (Don’t use JSDB to farm people for their data or profile them or I’ll haunt you in your nightmares.)
[2] Remember that functions are objects in JavaScript, so they can have properties attached to them just like any other object.
[3] In more complicated applications, your modules will require other modules and thus you’ll have to navigate the cache hierarchy and invalidate child modules, etc. Save yourself some trouble and use a ready-made module like decache (there are quite a number of them).
[4] In my tests with medium-sized datasets (~227MB), I didn’t perceive any substantial performance difference between dynamic loading in CommonJS vs ESM. The former is ever so slightly faster in real terms but they both have the same O(N) linear time complexity.