Micro-libraries are really good actually: they're highly modular, self-contained code, which often makes it really easy to understand what's going on.
Another advantage is that because they're so minimal and self-contained, they're often "completed", because they achieved what they set out to do. So there's no need to continually patch them for security updates, or at least you need to do it less often, and it's less likely that you'll be dealing with breaking changes.
The UNIX philosophy is also built on the idea of small programs that, just like micro-libraries, do one thing and do it well, and of composing those things to make larger things.
I would argue the problem is how dependencies in general are added to projects, which the blog author pointed out with left-pad. Copy-paste works, but I would argue the best way is to fork the libraries and add submodules to your project. Then if you want to pull a new version of the library, you can update the fork and review the changes. It's an explicit approach to managing it that can prevent a lot of pitfalls like malicious actors, breaking changes leading to bugs, etc.
Micro-libraries anywhere else are everything you said: building blocks that come after a little study of the language and its stdlib and will speed up development of non-trivial programs.
In JS and NPM they are a plague, because they promise to be a substitute for competence in basic programming theory, for competence in JS, for the gaps and bad APIs inside JS itself, and for de-facto standards of the programming community like the oldest functions in libc.
There are a lot of ways to pad a number in JS, and a decent dev would keep their own utility library, or hell, a function to copy-paste for that. But no. npm users are taught to fire and forget, and to update everything, with no concept of vendoring (which would have made incidents like left-pad, faker and colors less maddening, and vendoring is even built into npm, and it's very good!). For years they've been copy-pasting into the wrong window, really: they should copy-paste blocks of code, not npm commands. And God help you if you type out your npm commands by hand, because bad actors have caught on to the trend and published millions of libraries with a hundred different scams waiting for fat fingers.
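As an aside, here's the kind of tiny helper I mean, as a sketch (modern JS already ships String.prototype.padStart, added in ES2017, so this is a one-liner you keep in your own utils file rather than a dependency):

    // Left-pad a value to a given width using the built-in padStart.
    function leftPad(value, width, fill = " ") {
      return String(value).padStart(width, fill);
    }

    leftPad(42, 5, "0"); // "00042"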
Given that backend JS is optimizing for reducing cost whatever the price, becoming a Smalltalk for the browser and for PHP devs, you would expect some kind of standard to emerge for having a single way to do routine stuff. Instead, in JS-world you get TypeScript, and in the future maybe WASM. JS is just doomed. Like, we are doomed if JS isn't, to be honest.
The current web stack is very complicated: the HTML and CSS DOM is a rat's nest, and that's a superficial example. Adding asynchronous RPC pushes that way over the top. Luckily HTTP has been through the production crucible for 30 years. What I see is not a cost center but a reflection of basic truths:
- UI and data representation is hard.
- Developers use the tools they know about.
But about micro-libraries, the first point isn't very important because this is a social problem. There is no standard library for NPM that does what people need. There should be a curated function library of this crap.
In my imaginary vision there are several, and people would look askance when you don't use one without an obvious reason. I very much sympathize with the desire to burn-it-all, but I like that I can use LetsEncrypt and am cognizant that there is a lot of thought and raw technological progress bound up in all this.
They are not designed for modern use cases, and certainly not well-designed according to the modern understanding of that word. In fact they did not stand the test of time as something worthy of preservation; they merely survived and mutated, because it is extremely hard to replace them.
And of course we should blame the technology: the sole purpose of a standard is to be used by people. If people struggle with it the standard is unfit for its purpose.
You are 100% right but it's no reason to pick on web in particular! When you hit a site, you may have a virtual HTML environment running a JIT virtual machine from a server that's running a virtual python environment on a virtual python environment on a virtual machine on a virtual machine, and all that runs on a virtual x86 processor which in reality is a series of microcode processors. Yes, yes, it would be so much simpler to have your web go straight to the source, microcode, and yes, things would be simple and fast... .. but ... but ABSTRACTION!!!! LETS ABSTRACT EVERYTHING EVER AS MANY TIMES AS POSSIBLE WITH AS MANY ABSTRACTION LAYERS AS POSSIBLE IT GENERALIZES TO THE GENERAL CASE ABSTRACTION GENERALIZES ALL CASES WOOOOWWWW THE COMPUTER SCIENCE OF IT ALL
> And of course we should blame the technology: the sole purpose of a standard is to be used by people. If people struggle with it the standard is unfit for its purpose.
'People' is equivocal here. While complex, and (as you point out, in so many words) an evolved standard, the HTML/CSS/JS stack is arguably one of humanity's greatest achievements, up there with the invention of paper, or perhaps cuneiform.
It's imperfect, like all living standards, but it manages to be ubiquitous, expressive, and _useful_. And for those of us who grew up with these standards, they are second nature.
Like piano, mastering these standards gives you the ability to express complex UI concepts with grace and alacrity.
Don't smash up your parents' piano simply because practicing scales is a chore :)
> Don't smash up your parents' piano simply because practicing scales is a chore :)
I witnessed the evolution of UIs from Turbo Vision to modern web and mobile frameworks. My first commercial website went live in 1999. No, UI is not hard and doesn't require any mastery. As a matter of fact, building a decent UI for a client-server application is a simple task with the right tools and processes. The modern Web is not your parents' piano; you can call it elegant only if you have seen nothing else. It's a fridge full of mostly expired cheap food, from which you have to cook a decent meal for a party. It is possible, no doubt. Without food we will die, so we have to cook it. Instead we could just go shopping.
QUIC is hilarious because they ended up fitting everything and the kitchen sink into the protocol for every conceivable need, just because 15 years ago firewalls blocked this or that, and the WebSocket reimplementation of the thing that was blocked only runs in a browser.
The first draft of HTTP/3 was called "HTTP/2 Semantics Using The QUIC Transport Protocol". I'm sorry, I should have put that more shoddily and called it "GoogleHTTP".
I think it's ugly but okay-ish right now. What is very, very bad is the tooling, and someone should remind people that Facebook and Google do things that serve Facebook- and Google-scale needs (billions of users, thousands of devs working asynchronously, no time).
What I end up thinking (maybe I'm wrong) is that node.js must be nuked out of the backend, and on the frontend maybe devs should either keep it under 15 libraries and write custom code for the rest, or use a language that transpiles to JS like TS, Flutter, Nim, Go or what have you.
Maybe JS should be nuked out of tooling too; sometimes it's actively damaging and sometimes dead slow. Use something else if wrangling asset files is a problem.
If you want a DX where backend is frontend, you must use one of the only three maintained language families that can do that without actively damaging you or your users: a Smalltalk (like Pharo), a Lisp (like Clojure/ClojureScript), or Java.
I'd prefer to see a completely new renderer backed by a containerized JVM with some DSL for UI as a replacement for modern browsers, and an application-first, encrypted-by-default binary protocol replacing HTTP (one reference implementation for debugging it would be fine).
Could you link to somebody who is teaching npm users to "fire and forget?" Someone who is promising a substitute for competence in basic programming theory? Clearly you and I do not consume the same content.
This is just a discourse based on "I need to churn out something, I need it fast, and I didn't start in the web game when Backbone and E4X were solid corporate choices". If you are not in a hurry, work in a solid team and have a good attention span, a lot of the clickbait idiocy around JS may not happen to you. It's just that the lone inexperienced guy is one of millions of inexperienced guys who are taught the wrong way every day.
I'm presenting you one of countless examples: a lot of coding bootcamps teach React, maybe with TS, maybe with JS.
The UNIX philosophy is being a bit abused for this argument. Most systems that fall under the UNIX category are more or less like a large batteries-included standard library: lots of little composable units that ship together. UNIX in practice is not about getting a bare system and randomly downloading things from a bunch of disjointed places like tee and cat and head and so on, and then gluing them together and perpetually having to keep them updated independently.
They ship together because all of those small composable units, that were once developed by random people, were turned into a meta-package at some point. I agree with you that randomly downloading a bunch of disjointed things without auditing and forking them isn't good practice.
I'm also not arguing against a large popular project with a lot of contributors if it's made up of a lot of small, modular, self-contained code that's composed together and customizable. All the smaller tools will probably work seamlessly together. I think UNIX still operates under this sort of model (the BSDs).
There's a lot of code duplication and bad code out there, and way too much software that you can't really modify or customize for your use case because that becomes an afterthought. Even if you did learn a larger codebase, if it's not made up of smaller modular parts, then whatever you modify has a significantly higher chance of breaking once the library gets updated: because it's not modular, you ended up changing internal code, and the library authors aren't going to worry about breaking changes for someone maintaining a fork that touches their internals.
> all of those small composable units, that were once developed by random people, were turned into a meta-package at some point
No they weren’t. Every UNIX I used in the 80s and 90s shipped with those little composable building blocks as part of the OS, and GNU bundled them in things like coreutils forever. It’s not like there was some past time when there were independent little things like cat and wc and so on written by random people on the internet that somehow got bundled into a meta-package after they existed. That didn’t happen.
So are different functions and modules in the standard library of a large batteries-included language like Python, Java or Go.
But in fact, most of the traditional utilities you know and love were first implemented by a small team at Bell Labs, with the most "core" of what we now call coreutils (e.g. "ls") being written by Ken Thompson and Dennis Ritchie themselves. The other significant chunk of utilities (commands like vi, more, less, netstat, head) was developed by a group of people at UC Berkeley and released later as the Berkeley Software Distribution (BSD).
GNU Coreutils, as we know them, are mostly re-implementations of the original Unix commands (incidentally, a lot of them were initially implemented by two guys as well: Richard Stallman and David MacKenzie). This is no small feat, but the GNU coreutils authors took good care to maintain compatibility with existing UNIX tools (and later with the POSIX standard when it was released). They didn't just implement commands in a vacuum and wait for someone to come along and bundle them together. It's worth noting that if you're using a Mac, you're not even using GNU coreutils. Most of the core commands in macOS are derived from FreeBSD, which traces its lineage back to the original UNIX implementation.
The fact is, most of the individual commands that are now in coreutils were never released individually by themselves - they were generally released as a distribution and usually developed together and specifically to interact with other tools. The Unix Philosophy could not have been developed if Unix did not have a basic decent set of core utilities to begin with, and in fact the core utilities like cat predate Unix pipe support (only introduced in version 3).
The availability of a reliable set of core system commands which are guaranteed to be available and follow a strict IEEE/ISO standard(!) was pretty important for the development of an ecosystem of non-core commands built on top of them. Imagine what would have happened if some commands used fd 0 for stdin and fd 1 for stdout, but others used a different file descriptor or perhaps an entirely different output mechanism, such as a system call or sending data through a socket. Interoperability would be much harder.
But this is exactly the case in JavaScript, especially within the NPM ecosystem. The lack of a proper standard library means that every developer has to cobble together the most minuscule of utilities that are usually available as part of the standard library in other ecosystems. But what's worse is that all the various libraries don't play very well with each other. You don't have enough standard base types that you can pass between the different libraries.
For instance, there are no reliable types for describing time (besides the crappy built-in Date), input/output streams (until recently), an HTTP client, sockets, URLs, IP addresses, random number generators, filesystem APIs, etc. Some of this stuff (like the Fetch API and SubtleCrypto) exists in Node.js but has been locked behind feature flags since forever, which effectively means that you cannot use them. In the earlier days of the JS ecosystem things were much worse, since we didn't even have a standard way of specifying byte arrays (Uint8Array) or even Promises.
> randomly downloading things from a bunch of disjointed places like tee and cat and head and so on, and then gluing them together and perpetually having to keep them updated independently.
I have distressing news about my experience using Linux in the '90s
> So there's no need to continually patch them for security updates, or at least you need to do it less often, and it's less likely that you'll be dealing with breaking changes.
Regardless of how supposedly good or small the library is, the frequency at which you need to check for updates is the same. It doesn't have anything to do with the perceived or original quality of the code. Every 3rd party library has at least a dependency on the platform, and platforms are big; they have vulnerabilities and introduce breaking changes. Then there's the question of trust and the consistency of your delivery process. You won't adapt your routines to the specifics of every tiny piece of 3rd party code, so you probably check for updates regularly and for everything at once. Then their size is no longer an advantage.
> Copy-paste works, but I would argue the best way is to fork the libraries and add submodules to your project. Then if you want to pull a new version of the library, you can update the fork and review the changes.
This sounds "theoretical" and is not going to work at scale. You cannot seriously expect application-level developers to understand the low-level details of every dependency they want to use. For a meaningful code review of merges they must be domain experts, otherwise the effectiveness of such an approach will be very low: they will inevitably have to trust the authors and just merge without going into details.
They don't need to understand the low-level details of the dependencies. People can create metapackages out of a bunch of self-contained libraries that have been audited and forked, and devs can pull in the metapackages. The advantage is the modularity, which makes the code easier to audit and more self-contained.
When's the last time ls, cat, date, tar, etc. needed to be updated on your Linux system? Probably almost never. And composing them together always works. This set of Linux tools (call it sbase, ubase, plan9 tools, etc.) is one version of a metapackage. How often does a very large package need to be updated for bug fixes, security patches, or new versions?
It's probably still a good example. Looking up the CVEs for various search terms:
coreutils: 17 results
linux kernel: 6752 results
x11: 184 results
qt: 152 results
gtk: 68 results
docker: 340 results
rust: 455 results
python: 940 results
node: 110 results
javascript: 5657 results
firefox: 3268 results
chrome: 3763 results
safari: 1465 results
webkit: 1346 results
The large monolithic codebases have a lot more CVEs. I'd also argue that patching a fix on code made up of small, modular parts is much easier to do, and much lower hanging fruit for any casual developer to submit a PR for a fix.
What you call an audited metapackage is nothing other than a non-micro-library. The property that it has been audited/assembled/designed/tested in conjunction and would be updated as a whole is exactly the benefit that non-micro-libraries provide.
It's not the same as a non-micro-library, because a non-micro-library is a bunch of internal code that isn't as well documented and isn't self-contained, and parts of it may come and go as the library continues to churn. I can't easily change a large monolithic library or project to better optimize for my use case. I could do that much more easily if the large thing were composed of a bunch of small self-contained things.
I was assuming that the constituents of the metapackage aren’t completely independent. TFA is about micro-libraries the size of is-number and left-pad. If you only consider completely independent libraries of that type, you won’t get very far. A more realistic take would be hundreds of such micro-libraries that aren’t self-contained, but instead build on each other to successively provide more powerful functions. And then the more transitive dependencies become more like “internal code” in regular libraries, and you’ll get the same churn, and you’ll be similarly unable to change things due to the interdependencies. I don’t see how you can end up with a comparable functionality a regular library provides without the same potential of running into the issues you note.
What if you're using a larger library and you wanted to swap out a sorting algorithm for one that's optimized for your use case?
I would say that the API boundary being more modular and explicit makes it possible to actually do those kinds of swaps if the larger library is composed of smaller modular code, in ways that you wouldn't be able to if it's buried in a bunch of internal code -- you would have to fork the library in that case.
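Here's a rough sketch of what I mean (hypothetical API, not any particular library): the sort step is exposed as a replaceable parameter instead of being buried in internals.

    // The library exposes its sort step as a parameter (inversion of control).
    const defaultSort = (xs) => [...xs].sort((a, b) => a.score - b.score);

    function rankByScore(items, sort = defaultSort) {
      return sort(items).map((item, i) => ({ ...item, rank: i + 1 }));
    }

    // The caller can inject a sort optimized for their data (say, a radix sort
    // for integer scores) without forking the library:
    // rankByScore(items, myRadixSort);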
> How often does a very large package need to be updated for bug fixes, security patches, or new versions?
I don't think you understand my comment, because you are asking the wrong question again. It is not how often you need to actually update one dependency, but how often you need to check for updates that matters. That has to be done frequently no matter what, and it must be automated. E.g. despite the low number of CVEs in coreutils you have to check them very often, because the impact can be very high. Once you have this process in place, there's no advantage in using micro-libraries. I'd actually expect that in a micro-library environment most breaking changes happen when a library becomes unsupported and you need to choose an alternative, making the migration process more complicated than in the case of a bigger framework which just changed its API.
Maybe I'm not understanding your argument. Are you saying that if all these larger programs wrote all those utilities from scratch, it makes it so that if someone messed something up, the rest of the large programs are unaffected?
All of those larger programs can mess things up in different ways, opening up all sorts of attack vectors and bugs. And a smaller utility that's much more easily auditable and easier to contribute fixes to is arguably less likely to have attack vectors in the first place.
I'm not sure you have to check large libraries less often. I would argue at least as much if not more often, because it's harder for people to audit larger codebases and to also contribute fixes to them. A significantly smaller number of devs can (or could, if they had bandwidth/time) understand a large self-contained codebase than a very small self-contained codebase.
I think that if a larger library is made of many smaller, modular, self-contained parts that are composed together, and you can swap out any smaller building block for something that fits your use case (inversion of control), then that's a really good way to write a large library. Sloppier code might find its way in due to time/resource constraints, though, or the authors might not notice that some of the code isn't entirely modular/composable/swappable.
> I would argue at least as much if not more often
The same. The frequency of checks does not depend on the library size; it depends on the risk profile of your application, so library size cannot be considered an advantage with regards to updates. In most cases an upgrade to a newer version can be done automatically, because it is an unrealistic expectation that developers will review the code of every dependency and understand it. Breaking changes likely occur with the same frequency (rare), albeit for different reasons, and the impact can be on the same scale for tiny and large dependencies.
If these libraries are so small, self-contained and "completed", why not just copy-paste these functions?
Submodules can work too, but do you really need these extra lines in your build scripts, extra files and directories, and the import lines just for a five line function? Copy-pasting is much simpler, with maybe a comment referring to the original source.
Note: there may be some legal reasons for keeping "micro-libraries" separate, or for not using them at all, though. But IANAL, as they say.
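For illustration, copy-pasting with attribution can look something like this (the URL and function are made up):

    // Vendored from https://example.com/tiny-clamp (MIT), copied 2024-05-01,
    // rather than added as an npm dependency.
    function clamp(n, min, max) {
      return Math.min(Math.max(n, min), max);
    }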
As soon as source code is in your repo, it's way more likely to get touched. I'd never open that box, because I don't want to waste time with my team touching code that they shouldn't when reviewing.
If you want the same functionality, build it according to the conventions in the codebase and strip out everything else that isn't required for the exact use case (since it's not a library anymore)
">The UNIX philosophy is also build on the idea of small programs, just like micro-libraries, of doing one thing and one thing well, and composing those things to make larger things."
The Unix philosophy is also built on willful neglect of systems thinking. The complexity of a system isn't in the complexity of its parts but in the complexity of the interactions of its parts.
Putting ten micro-libraries together, even if each is simple, doesn't mean you have a simple program; in fact it doesn't even mean you have a working program, because that depends entirely on how your libraries play together. When you implement the content of micro-libraries yourself, you have to be conscious at the very least not just of what your code does, but of how it works, and that's a good first defense against putting parts together that don't fit.
It's not a willful neglect of systems thinking. Functional programmers have been able to build very large programs made primarily of pure functions that are composed together. And it makes it much easier to debug as well, because everything is self-contained and you can easily decompose parts of the program. Same with the effectful code as well, leveraging things like algebraic effects.
> The UNIX philosophy is also built on the idea of small programs that, just like micro-libraries, do one thing and do it well, and of composing those things to make larger things.
They have small programs, but those are not separate projects. For example, all the basic Linux utilities are developed and distributed as part of the GNU coreutils package.
It's the same as having a modular library with multiple functions in it that you can choose from. In fact the problem is that functions like isNumber shouldn't even be libraries; they should be in the language's standard library itself.
> I would argue the problem is how dependencies in general are added to projects
But you need the functionality anyway, so there are two possible dependencies: on your own code, or on someone else's code. Either way you can't avoid a dependency, and it comes at a cost.
If you don't know how to code the functionality, or it would take too much time, a library is the outcome. But if you need leftPad or isNumber as an external dependency, that's so far in the other direction that it's practically a sign of incompetence.
I don't care. The problem in this case is that if you install "isNumber", you use someone else's definition of what a number is. If you ever have to check if something is a number, you should do that according to your own specs, not hope someone else got it right. In this case, strings with all kinds of weirdness seem to be allowed, and perhaps that's not acceptable.
The mere fact that you can write a bunch of slightly different versions because of JavaScript's flaws/features should be a sign that you just can't grab an arbitrary implementation and hope it matches your use case, and especially not if you don't or can't pin the version.
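As a sketch (these helpers are illustrative, not from any package), compare two perfectly reasonable but incompatible specs:

    // Strict spec: only finite JS numbers count.
    const isFiniteNumber = (x) => typeof x === "number" && Number.isFinite(x);

    // Looser spec: also accept numeric strings, and now you have to decide
    // about empty strings, whitespace, hex, exponents, and so on.
    const isNumericString = (s) =>
      typeof s === "string" && s.trim() !== "" && !Number.isNaN(Number(s));

    isFiniteNumber("42");    // false
    isNumericString("42");   // true
    isNumericString("0x1A"); // true, maybe not what your spec wants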
> The UNIX philosophy is also built on the idea of small programs that, just like micro-libraries, do one thing and do it well, and of composing those things to make larger things.
This year I started learning FORTH, and it's very much this philosophy. To build a building, you don't start with a three-story slab of marble. You start with hundreds of perfect little bricks, and fit them together.
If you come from a technical ecosystem outside the Unix paradigm, it can be hard to grasp.
Yeah, exactly! FORTH looks really awesome, I haven't gotten around to learning it much though. I heard it's addictive and fun.
Yeah, it's all concatenative programming: FORTH, Unix pipes, function composition as monoids, effect composition as Kleisli composition and monads, etc.
It makes it super useful for code readability (once you're familiar with the paradigm), and debugging, since you can split up and decompose any parts of your program to inspect and test those in isolation.
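In plain JS the same idea looks something like this (a toy sketch, not a library):

    // pipe() glues tiny, individually testable functions together,
    // much like a shell pipeline.
    const pipe = (...fns) => (x) => fns.reduce((acc, fn) => fn(acc), x);

    const trim = (s) => s.trim();
    const words = (s) => s.split(/\s+/);
    const count = (xs) => xs.length;

    const wordCount = pipe(trim, words, count);
    wordCount("  the unix philosophy  "); // 3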
I mean, sort has put on some weight over the years, sure. But if it were packaged up for npm people would call it a micro-library and tell you to just copy it into your own code.
Yet, they are still a lot bigger than most micro-libraries. And more complex. And most of them tend to be parts of the same package (coreutils). So no, they have nothing in common with microlibraries. Not in concept, not in how they are used, shipped or maintained.
Yeah, the thing is that `yes` isn't a standalone project; it is usually part of a bigger project such as coreutils (https://github.com/coreutils/coreutils/).
For the comparison to be valid you would have to split up coreutils into roughly 100 individual repositories and replace many of the implementations with ones that are trivial, buggy, and/or unmaintained that pose a supply chain attack risk because it gets hard to keep track of what's maintained, by whom and how. Coreutils is close to 100kLOC and its programs aren't packaged individually. It is far, far from the random mess that are microlibraries in NPM.
less (17kLOC), awk (43kLOC) and grep (4kLOC) are separate projects, but some of those require a bit more insight than much application code these days, so it makes sense that they are individual projects.
> For the comparison to be valid you would have to split up coreutils into roughly 100 individual repositories and replace many of the implementations with ones that are trivial, buggy, and/or unmaintained that pose a supply chain attack risk because it gets hard to keep track of what's maintained, by whom and how.
You could paraphrase that as: core utilities that ship with my operating system should obviously be more reliable than random code fetched from the internet.
> For the comparison to be valid
I was responding to the limited scope of your statement "Yet, they are still a lot bigger than most micro-libraries. And more complex." Utilities like `yes` and `true` are neither big nor complex. The man pages are longer than the source code necessary to replace them.
Right! So if it is indeed so easy to understand what is going on, why would you need to make it an external dependency that can update itself behind your back?
If you understand what is going on, paste it into your tree.
> Micro-libraries are really good actually, they're highly modular, self-contained code
Well I think that is the point, they're not self-contained. You are adding mystery stuff and who knows how deep the chain of dependencies go. See the left-pad fiasco that broke so much stuff, because the chain of transitive dependencies ran deep and wide.
NPM is a dumpster fire in this regard. I try to avoid it - is there a flag you can set to say "no downstream dependencies" or something when you add a dependency? At least that way you can be sure things really are self-contained.
There is a "no downstream dependencies" option; it's called writing/auditing everything yourself. Everything else -- be it libraries, monolithic SaaS platforms, a coworker's PR, etc. -- is a trade off between your time and your trust. Past that, we're all just playing musical chairs with where to place that trust. There's no right answer.
Do you know what else is all of that? Writing the five lines of code by hand. Or just letting a LLM generate it. This and everything else I want to reply has already been covered in the article.
Nothing wrong with that either; like I said, copy-paste works too. A lot of minimalistic programs will just copy code in from another project.
Forking the code and using that is arguably nicer though, IMO: it makes it easier to pull in new updates, and to track changes and bug fixes. I've tried both and find this approach nicer overall.
Micro libraries are ok - TFA even says you can use self-contained blocks as direct source.
Micro-dependencies are a god damn nuisance, especially with all the transitive micro-dependencies that come along, often with different versions, alternative implementations, etc.
Basically, enforce that all libraries have lock files and when you install a dependency use the exact versions it shipped with.
Edit: Can someone clarify why this doesn't work? Wouldn't it make installing node packages work the same way as it does in python, ruby, and other languages?
I'm not sure why you're getting downvoted. The left-pad incident on npm primarily impacted projects that didn't have lockfiles or were not pinning exact versions of their dependencies. I knew a few functional programmers that would freeze the dependencies to an exact version before lockfiles came around, just to ensure it's reproducible and doesn't break in the future. Part of what was to blame was bad developer practice. I like npm ci.
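If you want to enforce pinning on top of a lockfile, a rough sketch of a CI check could look like this (assuming a standard package.json; illustrative only, not a replacement for lockfiles or npm ci):

    // Flag any dependency declared with a range (^, ~, etc.) instead of an exact version.
    const { readFileSync } = require("fs");

    const pkg = JSON.parse(readFileSync("package.json", "utf8"));
    const deps = { ...pkg.dependencies, ...pkg.devDependencies };

    const unpinned = Object.entries(deps).filter(
      ([, version]) => !/^\d+\.\d+\.\d+$/.test(version) // anything not exactly "1.2.3"-style
    );

    if (unpinned.length > 0) {
      console.error("Unpinned dependencies:", unpinned);
      process.exit(1);
    }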
Well if you think about it then yes, but thinking and making sure is a waste of energy and time. Also typing the boilerplate.
This energy could be spent elsewhere.
Like similarly it is not that hard to pick clothes for the day, but it is much easier if you always have the same clothes easily available and can move on with your day.
Why even stop at micro-libraries? Instead of "return num - num === 0" why not create the concept of pico-libraries people can use, like "return isNumberSubtractedFromItselfZero(num)"? It's basically plain English, right?
You could say that if all the popular web frameworks in use today were rewritten to import and use hundreds of thousands of pico-libraries, their codebases would be, as you say, composed of many highly modular, self-contained pieces that are easy to understand.