Why would someone want to use this instead of, say, base images made specifically for containers, like Alpine?
And for languages like Go (in their examples): why/how would anyone get such huge container images in the first place? Doesn't Go give you a neat statically linked binary?
One word... libc :-) There's tons of info on the internet about the gotchas in Alpine. Even fly.io gave up on using/supporting Alpine based container images and they are probably one of the most competent and capable engineering teams out there.
Honestly, I didn't know that. I have only used Alpine-based images for some pure Python web servers and was fairly happy with the size, with no bugs so far. So what is the go-to distro for container base images these days? Ubuntu minimal?
And yeah, libc was a pain for us even for AppImages. You'd think that something as fundamental as the C library would be standardized across Unixes...
Alpine's libc (musl) is intentionally different from the glibc that most distros ship, and Alpine (at least, some of its members) explicitly states that Alpine is not GNU/Linux, while most other distros people use are :)
Distroless is picking up a lot of interest, especially with the recent uptick in its adoption in the Kubernetes community.
Good question. Alpine is already small enough that it seems a little odd to go to elaborate measures to reduce image size further. It seems better to me to start with a minimalist image and add only what your app needs than to start with a huge image containing everything, install your app, and rely on something like this to find exactly the things you don't need and remove them without making any mistakes.
You can use whatever base image you want, let's say ubuntu:latest (I don't like Alpine). Base images normally include a lot of stuff that has no place in a container. Why would I need a tool for ext4 management inside a container? It makes no sense, so for production, throw it out. That's what docker-slim does.
It also gets rid of vulnerabilities in programs that your application doesn't use, simply by removing those programs.
This removes far more than a multi-stage Docker build ever would. Do you need bash, dash, passwd, or the many other binaries and files that are in the image by default? No, you don't. The only way to do anything similar to what docker-slim does is with a scratch image, which doesn't work unless you copy in everything you need.
Still seems kind of silly. If you base everything on ubuntu minimal, you'll only have the one copy of that base image, which is a fraction of the size of the `docker` and `dockerd` binaries added together. No server running docker will have a problem keeping one or two versions of ubuntu minimal on it.
But if you go around "minifying" all your applications independently, you won't have that shared base layer. One application needs `sh` and another doesn't? Now you get two entire base layers, one with it and one without. Sure, each image's total size will be less, but the size of all your different images added up will be greater because you killed the sharing.
If for some reason the 29 megs of ubuntu minimal (or even fewer for alpine) are a problem (which they aren't on your server that already has over a hundred megs of `docker` binaries), then the right solution is to better control layer sharing. Ensure that you don't have different base layers between your applications. And then--strictly for kicks and giggles--you could minify that base layer to the minimal set of what all your images require. To save a 51K `passwd` binary (woohoo!).
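To put rough numbers on the layer-sharing argument above, here is a minimal Python sketch. Only the 29 MB base size comes from this thread; every other size, the image names, and the `total_disk_usage` helper are made up for illustration, and it assumes the minifier flattens each image into its own unshared layer.

```python
def total_disk_usage(images):
    """Sum each distinct layer once, the way a layer-sharing store would."""
    unique_layers = {}
    for layers in images.values():
        for layer_id, size_mb in layers:
            unique_layers[layer_id] = size_mb
    return sum(unique_layers.values())


# Hypothetical sizes in MB; only the 29 MB base figure is from the thread.
BASE = ("ubuntu-minimal", 29)

# Three apps that all sit on the same base layer.
shared = {
    "app-a": [BASE, ("app-a-code", 10)],
    "app-b": [BASE, ("app-b-code", 12)],
    "app-c": [BASE, ("app-c-code", 8)],
}

# Assumption: after independent minification each image is one private layer
# that still carries the part of the base it actually needs.
minified = {
    "app-a": [("app-a-slim", 22)],
    "app-b": [("app-b-slim", 25)],
    "app-c": [("app-c-slim", 20)],
}

shared_total = total_disk_usage(shared)
minified_total = total_disk_usage(minified)

if __name__ == "__main__":
    print("shared base, total MB:", shared_total)
    print("minified, total MB:", minified_total)
```

With these invented numbers every minified image is smaller on its own (22 MB versus 39 MB for app-a), yet the total on disk goes up because the base is no longer stored once. Different numbers can of course flip the result, so measure your own images.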
One question: is it possible, in any way, that passwd or any other leftover binary you don't need has a security vulnerability that, if someone got into the container one way or another (most likely through your app), could cause trouble on the host?
Hint: yes, it is, and that could be a giant problem.
The problem is not about removing though. The problem is what/who guarantees that nothing broke after all these files are removed? Especially in obscure code paths in nested dependencies?
With something like Alpine Linux or Ubuntu minimal, you trust the package maintainers to make sure that if you use Python in your Docker image, it works like it worked for them. Here, it just says "Yes (it is safe)! Either way, you should test your Docker images.".
As a bad example, if a library used by your application uses a different "theme" requiring different files at night and different files during the day, you might still say "it worked during my tests" but things definitely broke and the only thing you can blame is this overzealous tool.
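A toy Python sketch of that failure mode, assuming the tool decides what to keep by watching which files get opened during a test run. The `theme_file` behaviour, the file names, and the test hours are all hypothetical.

```python
def theme_file(hour):
    """Hypothetical library behaviour: the resource it opens depends on the hour."""
    return "theme-day.css" if 6 <= hour < 18 else "theme-night.css"


def files_seen_during_test(test_hours):
    """Files a tracer would record if the app only ever ran at these hours."""
    return {theme_file(hour) for hour in test_hours}


# Everything the app can need over a full day.
all_needed = files_seen_during_test(range(24))

# Assumption: the test suite only runs during office hours.
kept = files_seen_during_test([10, 14])

# Files the minified image would be missing at night.
missing = all_needed - kept

if __name__ == "__main__":
    print("missing after minification:", sorted(missing))
```

The tests pass and the image still breaks at night, which is the whole problem: observation only covers the code paths you actually exercised.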
That bad example was from back when I was trying to make AppImages for an application we used. At first all we did was recursively collect all the libraries reported by ldd. Then it turned out some libraries were only being dlopen'ed by other libraries under specific circumstances and we missed them. So we manually added those libraries. Then it turned out that we missed the config files and other resources used by those libraries. Eventually we shipped all the files belonging to all the distro packages used by the libraries we used and left it at that.
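For the curious, the first step looked roughly like the Python sketch below. This is a reconstruction, not the script we actually used: `parse_ldd` and `ldd_paths` are hypothetical helper names, and the regexes assume the usual glibc `ldd` output format.

```python
import re
import subprocess

# "libssl.so.3 => /lib/x86_64-linux-gnu/libssl.so.3 (0x00007f...)"
_ARROW_LINE = re.compile(r"^\s*(\S+)\s+=>\s+(\S+)\s+\(0x[0-9a-f]+\)\s*$")
# "/lib64/ld-linux-x86-64.so.2 (0x00007f...)"
_ABSOLUTE_LINE = re.compile(r"^\s*(/\S+)\s+\(0x[0-9a-f]+\)\s*$")


def parse_ldd(output):
    """Return the set of library paths that ldd resolved to real files.

    Lines such as "linux-vdso.so.1 (0x...)" (no file on disk) and
    "libfoo.so => not found" are skipped.
    """
    paths = set()
    for line in output.splitlines():
        match = _ARROW_LINE.match(line)
        if match:
            paths.add(match.group(2))
            continue
        match = _ABSOLUTE_LINE.match(line)
        if match:
            paths.add(match.group(1))
    return paths


def ldd_paths(binary):
    """Run ldd on a binary and return the resolved library paths."""
    result = subprocess.run(
        ["ldd", binary], capture_output=True, text=True, check=True
    )
    return parse_ldd(result.stdout)
```

Note what this cannot see: anything loaded with dlopen at runtime, plus the config files and other resources those libraries read. Those are exactly the things that bit us.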
In some cases I essentially make sure my whole app stays in the image by using --include-path flags, so that docker-slim only removes the things I absolutely don't need.
They don't even contain a kernel - Docker containers use the host kernel. Container runtimes based on VMs like Firecracker's firecracker-containerd typically supply the kernel themselves.
On the other hand, scratch is a special image that contains literally nothing, 0 bytes. Docker doesn't have to talk to the network to download it, it doesn't have to be built, it's just a tarball of nothing.
scratch is thus infinitely smaller than distroless since it has no size.
It's also not suitable for Rust, since Rust applications will commonly link against OpenSSL for TLS, against a libc implementation for things like network operations and threads, and so on.
Rust can link to rustls instead of OpenSSL, musl targets are very convenient (and can statically link musl), and https://crates.io/crates/trust-dns-resolver can be used instead of doing DNS via libc.
Because of such tooling / ecosystem, it's more convenient to use scratch from Rust than from most languages (including C)