What is the requirement for privileged containers? The post never explains it.

samnco · on April 19, 2017

Privileged containers are required for the GPU to be shared with the containers.

By default, the bundle comes with an "auto" setting, which activates privileged containers only when GPUs are detected.

You can force it to "false" to disable that, but then you won't be able to run GPU workloads.

Or you can force it to "yes" and have privileged containers enabled all the time.

Does that answer the question? Not sure if I understood it right.

puzzle · on April 19, 2017

The Kubernetes docs don't say anything about having to use privileged containers for GPU support. Privileged containers are given tens of Linux capabilities; which of those are actually needed in your setup? Or, conversely, which specific step would fail for an unprivileged container?

Wanting to use a GPU shouldn't require the power to change the clock, switch UIDs, chown files, mess with logs, reboot the machine, etc.
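
For anyone who wants to check which capabilities a container actually holds, here is a minimal Python sketch that decodes the `CapEff` bitmask from `/proc/self/status`. The helper names (`decode_caps`, `effective_caps`) are made up for illustration, and the sample mask in the last lines is only an example value. The bit positions follow the Linux capability numbering.

```python
# Capability names indexed by bit position (Linux capability numbering).
CAP_NAMES = [
    "CAP_CHOWN", "CAP_DAC_OVERRIDE", "CAP_DAC_READ_SEARCH", "CAP_FOWNER",
    "CAP_FSETID", "CAP_KILL", "CAP_SETGID", "CAP_SETUID", "CAP_SETPCAP",
    "CAP_LINUX_IMMUTABLE", "CAP_NET_BIND_SERVICE", "CAP_NET_BROADCAST",
    "CAP_NET_ADMIN", "CAP_NET_RAW", "CAP_IPC_LOCK", "CAP_IPC_OWNER",
    "CAP_SYS_MODULE", "CAP_SYS_RAWIO", "CAP_SYS_CHROOT", "CAP_SYS_PTRACE",
    "CAP_SYS_PACCT", "CAP_SYS_ADMIN", "CAP_SYS_BOOT", "CAP_SYS_NICE",
    "CAP_SYS_RESOURCE", "CAP_SYS_TIME", "CAP_SYS_TTY_CONFIG", "CAP_MKNOD",
    "CAP_LEASE", "CAP_AUDIT_WRITE", "CAP_AUDIT_CONTROL", "CAP_SETFCAP",
    "CAP_MAC_OVERRIDE", "CAP_MAC_ADMIN", "CAP_SYSLOG", "CAP_WAKE_ALARM",
    "CAP_BLOCK_SUSPEND", "CAP_AUDIT_READ",
]


def decode_caps(mask_hex):
    """Turn a hex capability mask (as printed in /proc) into a list of names."""
    mask = int(mask_hex, 16)
    return [name for bit, name in enumerate(CAP_NAMES) if mask & (1 << bit)]


def effective_caps(status_path="/proc/self/status"):
    """Read the effective capability set of the current process (Linux only)."""
    with open(status_path) as f:
        for line in f:
            if line.startswith("CapEff:"):
                return decode_caps(line.split()[1])
    return []


# Example mask (illustrative value only); inside a container you would call
# effective_caps() instead and compare privileged vs. unprivileged runs.
sample = decode_caps("00000000a80425fb")
print("CAP_SYS_TIME" in sample, "CAP_SYS_BOOT" in sample)
```

Running `effective_caps()` once in a privileged pod and once in an unprivileged one would show how much extra power the privileged flag hands over.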

marcoceppi · on April 19, 2017

Since the GPU libraries are hosted on the node, the privileged flag is typically required to make that possible. I'm sure there will be improvements so that privileged is no longer required, but today it's mostly a requirement to get anything useful out of containers tapping into the GPU.

That said, if you set the allow-privileged flag to false, the GPU drivers will still be installed, but you may not be able to make use of the CUDA cores.

puzzle · on April 19, 2017

That's weird, because all the times I tried the experimental support, it didn't need privileged containers. From the YAML files, it looks like it's using hostPath directories, but those don't require special privileges, unless you need to write to them:

https://kubernetes.io/docs/concepts/storage/volumes/#hostpat...

I suspect that there is a bug somewhere.

puzzle · on April 19, 2017

Ah, wait:

https://github.com/madeden/blogposts/blob/master/k8s-gpu-clo...

You don't need to mount the /dev entries into the container at all. The experimental support creates them automatically for you when you are using GPU resources. Perhaps it's the device nodes, not the libraries, that required privileges?
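
To make the point concrete, here is a rough Python sketch that builds an unprivileged GPU pod manifest and prints it as JSON (which `kubectl` accepts as well as YAML). Everything specific in it is an assumption for illustration: the function name `gpu_pod_manifest`, the image name, the library paths, and the resource name `alpha.kubernetes.io/nvidia-gpu`, which is what the experimental support used around that time. Adjust all of them to your cluster.

```python
import json


def gpu_pod_manifest(name, image, command, gpus=1,
                     gpu_resource="alpha.kubernetes.io/nvidia-gpu",
                     host_lib_dir="/usr/lib/nvidia"):
    """Build a pod spec that requests GPUs without asking for privileges.

    No securityContext is set and no /dev entries are mounted: the GPU
    request is what should make the device nodes appear in the container.
    The driver libraries come from a read-only hostPath mount.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "restartPolicy": "Never",
            "containers": [{
                "name": name,
                "image": image,
                "command": command,
                # Requesting the GPU as a resource; hypothetical resource name.
                "resources": {"limits": {gpu_resource: gpus}},
                "volumeMounts": [{
                    "name": "nvidia-libs",
                    "mountPath": "/usr/local/nvidia/lib64",  # assumed path
                    "readOnly": True,
                }],
            }],
            "volumes": [{
                "name": "nvidia-libs",
                "hostPath": {"path": host_lib_dir},  # assumed host path
            }],
        },
    }


pod = gpu_pod_manifest("gpu-test", "example/cuda-image", ["nvidia-smi"])
print(json.dumps(pod, indent=2))
```

The output can be saved to a file and passed to `kubectl create -f`. If the pod only works once `privileged: true` is added, that would point at a specific step worth reporting.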

samnco · on April 20, 2017

Hello,

OK, I gave it a try and you are absolutely right. I could run nvidia-smi without mounting /dev/nvidia0, which is cool.

I was also able to run it unprivileged. I guess my mistake was to believe the example from the docs and not test without it.

Thanks for sharing that, I'll update my charts and the post accordingly.

puzzle · on April 20, 2017

Awesome! Happy to hear that more containers will run without unneeded privileges.

samnco · on April 19, 2017

Aaah, that is interesting. Let me dive into this later today and test my charts without that. It would actually make my life way easier for charting. I picked that up (the /dev stuff) from very early-stage work and never questioned it again. Thanks for pointing that out.