
nyrikki · yesterday at 11:43 PM

Not really S3, and I haven't touched LXC in a long time, but this may help on the OCI side. I apologize if this is redundant to you.

Remember that UID mapping in user namespaces is just a facade: an offset and a range, typically based on subuid[0] and subgid[1] today.

You can tell what those offsets are from inside the container with `cat /proc/self/uid_map`, or from the host with `cat /proc/$PID/uid_map` for the container's PID.

     $ cat /proc/self/uid_map
         0       1000          1
         1     100000      65536
Here you can see that UID 0 in the container maps to UID 1000 on the host, with a length of 1.

Container UIDs starting at 1 map to host UIDs starting at 100000, for a length of 65536.
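The translation the kernel performs from those rows is simple arithmetic, which can be sketched in shell (the map values below are the ones from the example uid_map above; `map_uid` is just an illustrative helper, not a real utility):

```shell
#!/bin/sh
# Translate a container UID to its host UID using one uid_map row:
#   <container_start> <host_start> <length>
map_uid() {
    container_uid=$1 container_start=$2 host_start=$3 length=$4
    # The UID must fall inside the mapped range to be valid.
    if [ "$container_uid" -ge "$container_start" ] &&
       [ "$container_uid" -lt $((container_start + length)) ]; then
        echo $((host_start + container_uid - container_start))
    else
        echo "unmapped" >&2
        return 1
    fi
}

# Second row of the example map: container UIDs starting at 1 map
# to host UIDs starting at 100000, length 65536.
map_uid 1000 1 100000 65536   # -> 100999 on the host
```

So a file created by container UID 1000 shows up on the host as owned by 100999 under this particular map.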

With subuid/subgid you can assign ranges to the user that instantiates the container; in the following, two users launch containers:

     $ cat /etc/subuid
     debian:100000:65536
     runner:165536:65536

     $ cat /etc/subgid
     debian:100000:65536
     runner:165536:65536
Assuming you pass in the host UID/GID, that is how you can configure a compatible user at the entrypoint.
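As a hedged sketch of how this ties together in an unprivileged LXC container's config (the container name, the `/srv/shared` path, and the bind mount are illustrative; the ranges match the `debian` entries in the subuid/subgid files above):

```
# Illustrative snippet for an unprivileged container's config file,
# e.g. ~/.local/share/lxc/<name>/config
# Map container UIDs/GIDs 0..65535 onto host 100000..165535:
lxc.idmap = u 0 100000 65536
lxc.idmap = g 0 100000 65536
# Bind-mount a host directory into the container (trusted setups only):
lxc.mount.entry = /srv/shared srv/shared none bind,create=dir 0 0
```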

But note that only highly trusted containers should ever use host bind mounts; it is often much safer to mount NFS internally.

Host bind mounts of network filesystems, if that is what you are doing, are also fragile as far as data loss goes. I am an object-store fan, but I wanted to give you the above info since it seems hard for people to find.

I would highly encourage you to look into the history of security problems with host bind mounts to see the whack-a-mole required to keep them safe, and whether that fits your risk appetite. But if you choose to use them, setting up dedicated UID/GID mappings and setting the external host to the expected effective ID of the container users is a better approach than using idmapped mounts, etc.
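Concretely, if the container runs its service as UID 1000 under the `debian` mapping above, the host-side directory should be owned by the mapped ID. A sketch of the arithmetic (the `/srv/shared` path is illustrative, and the chown must be run as root on the host):

```shell
#!/bin/sh
# Expected host-side owner = subuid range start + container UID.
# For the debian range starting at 100000 and container UID 1000:
host_uid=$((100000 + 1000))
host_gid=$((100000 + 1000))
echo "$host_uid:$host_gid"   # -> 101000:101000

# Then, as root on the host (illustrative path):
#   chown 101000:101000 /srv/shared
```

Files chowned this way appear inside the container as owned by UID/GID 1000, which is what "setting the external host to the expected effective ID" means in practice.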

[0] https://man7.org/linux/man-pages/man5/subuid.5.html

[1] https://man7.org/linux/man-pages/man5/subgid.5.html


Replies

zenoprax · today at 7:23 AM

I appreciate the detailed explanation of UID/GID mapping.

> it is often much safer to use mount NFS internally

This is the config I'm trying to move away from! I don't see how an unprivileged LXC with a bind mount is worse than a privileged container with NFS, FUSE, and nesting enabled (I need all of that if I can't aggregate on the host).

NFS and CIFS within the container require kernel-level access, and therefore the LXC must be privileged. I'd rather have a single defined path.

I tried to get around this using FUSE, but it creates its own issues with snapshots/backups (fsfreeze).

If my solution works for a regular LXC it will probably work for Podman.