At first, it was called “DLL Hell”. Then “JAR Hell”. “Assembly Hell”. Now, it’s fallen under the label of “NPM-Gate”, but it always comes back to the same basic thing: software developers need to think about their software build and runtime dependencies as a form of Supply Chain Management. Failure to do so—on both the part of the supplier and the consumer—leads to the breakdown of civilization and everything we hold dear.
For those who haven’t seen the disaster that was NPM-Gate, allow me to refer you to a few links:
- The Timeline. Probably the closest thing to an “official” history of what happened.
- Ars Technica’s views on the subject
- Business Insider UK also reported on the mess
The upshot, however, was that the removal (rightly or wrongly) of a package that a core package (ExpressJS, among others) depended on led to an obscene number of sites suffering an outage.
Interpreting the deploy
To some, particularly those coming from the native, Java or .NET persuasion, this missing dependency may seem odd—after all, it would really only make itself felt when the project was compiled, not at runtime. If the project had already been “built” (which, in the NodeJS world, would mean having already downloaded all of the dependencies and deployed), then why would the package’s subsequent removal really make all that much of a difference?
Here, we come to the part where conventions and cultural habits come into play: when a NodeJS-based system is deployed, it’s typically deployed without its dependencies. NodeJS, unlike those other environments, is an interpreter, not a compiler or virtual machine. Thus, the habit among the NodeJS community is to deploy the source code along with a manifest file (package.json) that in turn describes all of the dependencies. Then, as part of the deploy, one issues the command to pull the dependencies down (npm install), and the application is ready to go.
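To make the point concrete, here is a minimal sketch of what such a manifest carries. The manifest contents (package name, version range) are hypothetical, and listDeclaredDependencies is an illustrative helper of my own, not part of npm. It only shows that the deployed artifact names its dependencies rather than containing them.

```javascript
// Hypothetical package.json contents, parsed into an object.
// The package name and version range are made up for illustration.
const manifest = {
  name: "my-app",
  version: "1.0.0",
  dependencies: {
    express: "^4.13.0",
  },
};

// Illustrative helper (not an npm API): list the "name@range" specs
// that an install step would have to fetch from the registry.
function listDeclaredDependencies(pkg) {
  const deps = pkg.dependencies || {};
  return Object.keys(deps)
    .sort()
    .map((name) => `${name}@${deps[name]}`);
}

console.log(listDeclaredDependencies(manifest));
```

Everything in that list has to be fetched at deploy time, along with whatever those packages depend on in turn. If any one of them is gone from the registry, the install step fails.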
“OK, fine, then each time they deploy, there’s a problem. I still don’t see why this caused so much outage across the Internet”, might be the reasonable reply. And this is where we get into why this matters to people outside the NodeJS world.
The NodeJS community, you see, has been at Ground Zero for a lot of the DevOps movement: many of the ideas and concepts around DevOps have been put into play in NodeJS environments. After all, think about it: write your source, then commit the changes, and boom, everything is ready for a deploy. So it’s trivial (in a way) to wire this up into a full-blown DevOps pipeline, particularly since most NodeJS-based systems use REST/HTTP APIs for the back-end, which are always much, much easier to test automatically than other middleware options.
So when you go “all in” on the DevOps thing, and start doing daily—or hourly—releases, suddenly a break in your deployment process becomes a really big deal. Particularly if your “rollback” strategy is based around the idea of doing a re-deploy of old code, as opposed to having a full backup of the server (or server container image) that you can simply spin up without having to actually run the deployment script.
This kind of event, which qualifies as a black swan event if ever there was one, still leaves quite a few lessons that developers can learn from.
First of all, there’s the obvious one that suggests that developers need to pay close attention to their dependencies. Except in this case, when the dependency simply disappeared, there was no real warning, and no way to avoid the problem. (Except for the obvious, “Well, don’t use that package then!”, which wouldn’t have applied in this case, since it wasn’t a direct dependency, but one that was loaded by another dependency—which means that nobody in the NodeJS world actually understood that they were one 11-line package away from being horribly busted.)
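A rough sketch of how an indirect dependency hides in the tree: the graph below is entirely hypothetical (only left-pad is a real package name here), and findDependencyPath is an illustrative helper, not an npm API. It walks the graph depth-first and reports one chain from the application down to the package in question.

```javascript
// Hypothetical dependency graph: each key lists the packages it depends on directly.
const graph = {
  "my-app": ["web-framework"],
  "web-framework": ["template-helper", "router"],
  "template-helper": ["left-pad"],
  "router": [],
  "left-pad": [],
};

// Depth-first search for one chain of dependencies from `root` to `target`.
// Returns the chain as an array of names, or null if `target` is unreachable.
function findDependencyPath(deps, root, target, seen = new Set()) {
  if (root === target) return [root];
  if (seen.has(root)) return null; // guard against cycles
  seen.add(root);
  for (const child of deps[root] || []) {
    const rest = findDependencyPath(deps, child, target, seen);
    if (rest) return [root, ...rest];
  }
  return null;
}

console.log(findDependencyPath(graph, "my-app", "left-pad").join(" -> "));
```

In this made-up graph, my-app never mentions left-pad in its own manifest, yet it cannot install without it. On a real project, npm ls followed by a package name answers the same question against the installed tree.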
Beyond that, though, there are some interesting questions to be asked, and some important lessons to be learned.
Question: Who owns code in a public repository?
Part of the discussion here is over the rights to code published to a repository like npm (or Maven, or NuGet, or …). Suppose I release my code under a license that clearly states that I retain ownership of the IP (as most open-source licenses do), while the repository’s terms state that it obtains ownership of code made available through it. Whose claim prevails? (The company that runs the npm registry maintains their license document here, but I am nowhere close to being a lawyer, and I don’t know what npm’s legal claim is to any module published through them.)
I honestly don’t know the answer to this one, and I’m not entirely sure the legal community does, either. There were no lawsuits filed as a result of this whole debacle, but honestly there probably were grounds for one. By whom, against whom, for what, I have no idea.
Question: How reliable is a module in a public repo?
Everyone assumes that any module you fetch out of a public repository is solid and worth using. Sort of. The popular ones, anyway. Right?
Except that left-pad (the 11-line module in question) had been downloaded well over a half-million times, and yet….
How do we judge not just the quality of the code in the module (and who actually does a full quality review on a package referenced from a public repository?), but also the reliability of the developer(s) who published it? To what standard do we hold them?
Question: How much liability does a producer assume?
If I put a package into the repo, and you use it, and then I yank the package back out of the repo, and your system is broken, do I have any liability? Granted, all packages typically carry legal text that says “You are using it all at your own risk”, but frankly those kinds of disclaimers are only as good as the paper they are printed on (figuratively speaking), because a legal disclaimer has never stopped a suit from being filed.
(It was explained to me thusly a number of years ago: If the skating rink at which you rent your skates doesn’t take proper care of the equipment, including and not limited to the skates themselves, then they are liable for injuries you sustain, regardless of the disclaimer you signed. Were the skates properly maintained or not? That’s clearly for a jury to decide, and most judges will not hold the fact that you signed a disclaimer to mean that you thereby agreed to use the skates without any assumption of proper maintenance implied. Hence, the disclaimer/waiver doesn’t eliminate all possibility of a legal suit being successfully brought.)
Question: How much liability do you want to assume?
This applies equally to both producers of open source components, as well as to consumers.
As a consumer, it’s a pretty easy equation: If your app depends on a package that has a 99% reliability factor, that feels pretty good. If, however, your app depends on two packages, each of which has a 99% reliability factor, your application’s reliability is now 0.99 * 0.99, or 0.9801.
Do this for approximately a hundred packages, and suddenly your 99% reliability factor realistically isn’t. Fifteen packages is 0.86; fifty packages is 0.60. (It drops off by about 1% per package up until about the 50 mark or so.) By the time you get to a hundred packages, your reliability is now down to 0.36 and some change.
That’s not entirely encouraging.
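The arithmetic above is easy to check. This is a minimal sketch, assuming every package is needed and that each one fails independently of the others (a simplification; real failures are often correlated). The function name combinedReliability is my own.

```javascript
// Reliability of an app that needs every one of `count` packages to be available,
// assuming each is independently available with probability `perPackage`.
function combinedReliability(perPackage, count) {
  return Math.pow(perPackage, count);
}

// Print the combined figure for a few dependency counts.
for (const count of [1, 2, 15, 50, 100]) {
  console.log(count, combinedReliability(0.99, count).toFixed(4));
}
```

Swapping in a different per-package figure shows how sensitive the result is: the exponent is the number of packages, so even small per-package risks compound quickly.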
As a producer, you are taking on social liability, if not actual legal liability, not only to verify that your package works and/or is free of any bugs (up to a reasonable point), but also to ensure that it will be maintained over time. This includes making sure your package in turn reflects changes to the packages upon which it depends, too.
Side note: I just ran a Yeoman generator (“node-express-mongo”), and was greeted with the following display. As near as I can tell (I haven’t tracked any of this down yet), these warnings all indicate that a package being installed depends on outdated versions of other packages.
npm WARN deprecated firstname.lastname@example.org: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated email@example.com: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated firstname.lastname@example.org: graceful-fs v3.0.0 and before will fail on node releases >= v7.0. Please update to graceful-fs@^4.0.0 as soon as possible. Use 'npm ls graceful-fs' to find it in the tree.
npm WARN deprecated email@example.com: lodash@<3.0.0 is no longer maintained. Upgrade to lodash@^4.0.0.
npm WARN deprecated CSSselect@0.7.0: the module is now available as 'css-select'
npm WARN deprecated CSSwhat@0.4.7: the module is now available as 'css-what'
npm WARN deprecated firstname.lastname@example.org: graceful-fs v3.0.0 and before will fail on node releases >= v7.0. Please update to graceful-fs@^4.0.0 as soon as possible. Use 'npm ls graceful-fs' to find it in the tree.
npm WARN deprecated email@example.com: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
npm WARN deprecated firstname.lastname@example.org: ReDoS vulnerability parsing Set-Cookie https://nodesecurity.io/advisories/130
npm WARN deprecated email@example.com: this package has been reintegrated into npm and is now out of date with respect to npm
npm WARN deprecated firstname.lastname@example.org: lodash@<3.0.0 is no longer maintained. Upgrade to lodash@^4.0.0.
npm WARN deprecated email@example.com: Jade has been renamed to pug, please install the latest version of pug instead of jade
npm WARN deprecated firstname.lastname@example.org: Please update to minimatch 3.0.2 or higher to avoid a RegExp DoS issue
This collection does not exactly inspire confidence. This list also seems to be getting longer and longer every time I run a Yeoman generator and install packages. What’s worse, there’s no single entity to which we can point the finger—every single package maintainer inside the npm repository needs to commit to keeping up with all the changes across the entire repository, or else this system slowly breaks down due to entropy.
And remember, entropy always wins.
Question: How much do repository entities learn from each other?
Sonatype, the company that “owns” the Maven repository, went so far as to blog about the whole situation, and it’s well worth the read. They take a fairly selfish (meaning, from their own) perspective on the situation, talking about what lessons the corporate entity that owns npm should take away, but there are some good nuggets in there. How much has the NuGet team read up on this? Or the Haskell community? What was the thought process from the folks who maintain Ruby gems? And so on.
Supply Chain Management
All of these are the kinds of questions that any manufacturing company has had to wrestle with, under the larger term “Supply Chain Management”. If you currently run a software development department, and you currently use libraries that are developed out-of-house (which is to say, everybody), then you owe it to your customers and consumers and operations staff and executive management to read up on this subject and find some ideas for how to manage your software supply chain.
And no, there are no books on the subject of which I’m aware, because, let’s face it, compared to the latest me-too Single Page Application framework, Software Supply Chain Management is about as sexy as bridge physics or crop insurance.
Until the unthinkable happens, anyway.