NuGet - The New "DLL Hell"
May 30, 2017
.NET Programming
Since software first started relying on external code to function properly, we coders have had trouble keeping everything in sync. Because program code depends on operating system components, the language runtime and compiler, and most likely third-party components, there are many moving parts that have to stay compatible for our final product to work. And more often than not, one or more of those components gets updated without our software expecting it, and all of a sudden we are spending hours or days hunting down bugs - all while keeping anxious users at bay.
Each major iteration of languages and build systems attempts to find the perfect way to decouple these tightly coupled pieces.
Previous Attempts - OCX / .NET
Visual Basic introduced OCX controls that needed to be registered - but only one version of each control could be installed per system. So if a new version got installed by one piece of software (or the OS), then older software referencing a previous version more often than not broke.
Microsoft .NET promised to fix this by allowing side-by-side installations. This meant that multiple versions of the same component could be installed on the same system, and each software package kept its own copy in the same folder as its executable code. Except for the runtime itself, which was updated at the whim of Windows Update (or some unknowing IT person). And then there is the GAC (Global Assembly Cache), which registers commonly used components system-wide just like an old OCX. Sure, you were supposed to be able to have your own local version, but that usually didn't work.
So what now? The latest trend is to make a bunch of tiny packages that each contain a minute amount of code. The theory is that one can install only those tiny pieces that are required for one's program to run and thus keep the footprint as small as possible. But weren't linkers designed for this? Furthermore, the actual runtime components are part of this series of packages, which should provide each program with an isolated runtime. In theory this sounds perfect - no shared .NET runtime, no external dependencies (OCX, GAC, or otherwise). Only the OS is now left to chance - but that will always be the case...
Enter NuGet
All of these tiny code packages are hosted in an ecosystem labelled "NuGet" (at least in the Microsoft-centric world). Each "nugget" package is self-describing, such that all dependencies are published with the package. To take a widely used example, there is a nugget called Newtonsoft.Json that makes working with modern web-based services a breeze.
The latest version is 10.0.2 as of this writing. A well-formed nugget that uses this functionality will allow versions including and after the one it was written against (for example, 9.0.0 and above). But, and this is a big "but", a component writer can also specify a version range, e.g., 7.0.0 - 9.0.0. That seems great: specific versions can be tested and guaranteed to work properly. And until version 10 came out, it was near-perfect.
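To make that notation concrete, here is roughly what the dependency declarations look like inside a package's .nuspec file. The two entries show the two alternative forms - an open-ended minimum versus an explicit range - and would not appear together in one real package; the bracket syntax is NuGet's standard interval notation.

<dependencies>
  <!-- "9.0.0 and above": a plain version number is an inclusive minimum -->
  <dependency id="Newtonsoft.Json" version="9.0.0" />
  <!-- An explicit range: 7.0.0 up to and including 9.0.0 -->
  <dependency id="Newtonsoft.Json" version="[7.0.0, 9.0.0]" />
</dependencies>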
Now suppose you, as the programmer, need to include two nuggets that both reference Newtonsoft.Json. One limits itself to versions 7 through 9, and the other is built against the current version 10. What happens? In short - nothing good. Since two versions of the same package cannot co-exist (identity is determined by the package name only, NOT the name plus version), you are stuck choosing one of the two.
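Sketched out with two hypothetical packages - call them PackageA and PackageB, names invented purely for illustration - the dead end looks like this: the two ranges share no common version, so no single copy of Newtonsoft.Json can satisfy both.

<!-- From the hypothetical PackageA.nuspec: pinned to the 7.x-9.x range -->
<dependency id="Newtonsoft.Json" version="[7.0.0, 9.0.0]" />

<!-- From the hypothetical PackageB.nuspec: requires 10.0.0 or newer -->
<dependency id="Newtonsoft.Json" version="10.0.0" />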
So what about the other? You either need to beg and plead with the author who chose the older version to update their source, write that functionality yourself, or change the program's requirements. And trust me, none of those options is pretty, and they become downright unacceptable when deadlines and budgets need to be met.
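In practice, "choosing one" on the classic .NET Framework usually means an assembly binding redirect in app.config that forces every request for Newtonsoft.Json onto the single version you ship. This is only a sketch of that stopgap - it works only when the chosen version's API is still compatible with what the older nugget expects, and the public key token should be verified against the assembly you actually deploy.

<configuration>
  <runtime>
    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Newtonsoft.Json"
                          publicKeyToken="30ad4fe6b2a6aeed"
                          culture="neutral" />
        <!-- Send every older version to the one 10.0.x copy we deploy -->
        <bindingRedirect oldVersion="0.0.0.0-10.0.0.0"
                         newVersion="10.0.0.0" />
      </dependentAssembly>
    </assemblyBinding>
  </runtime>
</configuration>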
The Future
So how do we, the software industry, solve this dilemma? I wish I had an answer. I think one possibility is to force NuGet to munge the version number and package name together so that two major versions, e.g., 9 and 10, can co-exist in the same project.
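Purely as a thought experiment - NuGet has no such syntax today, and the ".v9"/".v10" identifiers below are invented - the idea would make the major version part of the package identity, so a packages.config could list both:

<packages>
  <package id="Newtonsoft.Json.v9"  version="9.0.1"  targetFramework="net462" />
  <package id="Newtonsoft.Json.v10" version="10.0.2" targetFramework="net462" />
</packages>

Of course, the two assemblies would also need distinct identities (or namespaces) so the compiler and runtime could tell their types apart, which is where the real work would be.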
Another answer would be to compile ALL referenced code into each package and create an actual linker that would reduce final program size based on what code is actually used, rather than forcing all of it to be distributed with each program. I know disk space is cheap these days, but does that really mean we should be inefficient with it? And wouldn't smaller code run faster? I remember back when I was programming in C++ and marveling at how small a final "release" build would be.
In the short term, we will continue to fight "DLL hell" with the tools we have and just budget that in as part of the development process.