Lab 5.0 - SELinux container integration

Return to Workshop

In this section, we’ll cover the basics of SELinux and containers. SELinux policy prevents a lot of break out situations where the other security mechanisms fail. By default, podman processes are labeled with container_runtime_t and they are prevented from doing (almost) all SELinux operations. But processes within containers do not know that they are running within a container. SELinux-aware applications are going to attempt to do SELinux operations, especially if they are running as root. With RHEL, we write an SELinux policy that says that the container process running as container_runtime_t can only read/write files with the container_file_t label.

!Namespaced

SELinux is not namespaced. Since we do not want SELinux aware applications failing when they run in containers, it was decided to make libselinux appear that SELinux is running to the container processes. In doing so, the libselinux library makes the following checks:

  • The /sys/fs/selinux directory is mounted read/write.

  • The /etc/selinux/config file exists.

If both of these conditions are not met, libselinux will report to calling applications that SELinux is disabled.

Warmup Exercise

Start by running a container that mounts /sys/fs/selinux as read-only then runs a command (id -Z) that requires an SELinux enabled kernel. This is called bind mounting and is done with the -v option. Once the command runs, the container terminates.

podman run --rm -v /sys/fs/selinux:/sys/fs/selinux:ro registry.access.redhat.com/ubi8/ubi id -Z
id: --context (-Z) works only on an SELinux-enabled kernel

As noted above, our workaround to make the container aware that SELinux is running on the host is to mount the /sys/fs/selinux directory as read/write and touch the SELinux configuration file.

Give this a try and confirm the expected SELinux label is reported. You will need to run the container as root with the --privileged flag:

sudo podman run -it --rm -v /sys/fs/selinux:/sys/fs/selinux:rw --privileged registry.access.redhat.com/ubi8/ubi bash

Enter the container and try an SELinux command. it should fail because both conditions have not been met:

id -Z
id: --context (-Z) works only on an SELinux-enabled kernel

Now touch the config file and try again. It should succeed. Note the SELinux label that is returned and the file label:

touch /etc/selinux/config
id -Z
unconfined_u:system_r:spc_t:s0
ls -lZ /etc/selinux/config
-rw-r--r--. 1 root root system_u:object_r:container_file_t:s0:c214,c1004 0 Mar  4 20:16 /etc/selinux/config

Exit the container:

exit

Bind Mounts and Labeling

As you learned above, bind mounts allow a container to mount a directory on the host for general application usage. This lab will help you understand how selinux behaves in different bind mounting scenarios.

On the bastion, let’s create a directory named /data. Make note of the file permissions. Could a container read or write to this directory?

sudo mkdir /data
sudo chmod 755 /data
ls -ld /data
drwxr-xr-x. 2 root root 6 Oct 29 16:33 /data/

Open the permissions on /data to all users:

sudo chmod 777 /data

Run bash in a container and volume mount the /data directory on host to the /data directory in the container’s file system:

podman run --rm -it -v /data:/data registry.access.redhat.com/ubi8/ubi bash

Notice that the shell prompt changes to [root@33659a200e02 /]# when you enter the container namespace.

Did the mount succeed? How can you check?

Run the df command to verify the volume is mounted to the /data directory.

df /data
Filesystem                      Size  Used Avail Use% Mounted on
/dev/mapper/rhel_rhel8b17-root   47G   17G   31G  35% /data

Try to list the contents of /data:

ls /data
ls: cannot open directory /data: Permission denied

Try to create a file in the /data directory? The command will also fail. Why? Hint: Is SELinux is involved? How would you investigate?

date > /data/date.txt
bash: /data/date.txt: Permission denied
exit

The sealert tool will analyze and decode the audit.log file and make suggestions on how to remedy the problem.

To investigate, run the sealert tool on the bastion system:

sudo sealert -a /var/log/audit/audit.log
100% done
found 1 alerts in /var/log/audit/audit.log
--------------------------------------------------------------------------------

SELinux is preventing /usr/bin/bash from write access on the directory data.

..
..
..

There sealert tool produces a lot of information here. It should report that SELinux is preventing /usr/bin/bash from writing to the /data directory. It should also report that to allow bash to have write access to /data, the SELinux label must be changed on that directory.

From the container host, examine the SELinux label of /data and note the type is default_t. From the discussion at the beginning of this lab, you know this is an SELinux label mismatch.

Based on the default SELinux policy for containers, The label of the process does not match the label of the target directory.

ls -ldZ /data
drwxrwxrwx. 2 root root system_u:object_r:default_t:s0 6 Oct 29 16:33 /data/

Take sealerts’s suggestion of changing the label type of the /data directory to container_file_t:

sudo chcon --type container_file_t /data

Confirm that /data is now correctly labeled:

ls -ldZ /data
drwxrwxrwx. 2 root root system_u:object_r:container_file_t:s0 6 Oct 29 16:33 /data/

To allow this container to write to the /data , we also need to change the owner of the directory to ec2-user on the client. Why is this?

sudo chown ec2-user /data

Check the permissions and labels again:

ls -ldZ /data
drwxrwxrwx. 2 ec2-user root system_u:object_r:container_file_t:s0 22 Apr 22 15:54 /data/

Now run the container again and try to write into /data as you did above. Did the write succeed?

podman run --rm -it -v /data:/data registry.access.redhat.com/ubi8/ubi bash
ls /data
date > /data/date.txt

Notice the directory permissions in the container. The owner is root (user namespaces in action):

ls -ldZ /data
drwxr-xr-x. 2 root nobody unconfined_u:object_r:container_file_t:s0 6 May  8 18:39 /data

Time to exit the container namespace:

exit

Finally, check the directory on the host. You should see the file that was created with the correct ownership

ls -lZ /data
total 4
-rw-r--r--. 1 ec2-user users system_u:object_r:container_file_t:s0 29 Apr 22 15:54 date.txt

Private Mounts

Now you’ll let podman create the SELinux labels. To change a label in the container context, you can add either of two suffixes :z or :Z to the volume mount. These suffixes tell podman to relabel file objects on the shared volumes. The :Z option tells podman to label the content with a private unshared label.

Repeat the scenario above but instead add the :Z option to bind mount the /private directory then try to create a file in the /private directory from the container’s namespace.

First examine the default label for any new directory:

sudo mkdir /private
sudo chown ec2-user /private
ls -dlZ /private
drwxr-xr-x. 2 ec2-user root unconfined_u:object_r:default_t:s0 6 Apr  6 13:17 /private

Now run a container in the background that bind mounts /private using the :Z option:

podman run -d --name sleepy -v /private:/private:Z registry.access.redhat.com/ubi8/ubi sleep 9999
07c5aebd894182119668feddf4849d1f75bc5a81a84db222169e5f9b9efa625c

Examine the label again:

ls -dlZ /private
drwxr-xr-x. 2 ec2-user root system_u:object_r:container_file_t:s0:c422,c428 6 Apr  6 13:17 /private

Note the addition of a unique Multi-Category Security (MCS) label (c422,c428) to the directory. SELinux takes advantage of MCS separation to ensure that the processes running in the container can only write to files with the same MCS Label.

Stop and remove the container:

podman rm -f sleepy

Shared Mounts

Repeat the scenario above but instead add the :z option for the bind mount then try to create a file in the /shared directory from the container’s namespace. The :z option tells podman that two containers share the volume content. As a result, podman labels the content with a shared content label. Shared volume labels allow all containers to read/write content.

Create a directory named /shared and examine the label:

sudo mkdir /shared
sudo chown ec2-user /shared
ls -dlZ /shared
drwxr-xr-x. 2 ec2-user root unconfined_u:object_r:default_t:s0 6 Apr  6 14:09 /shared

Now run a container that bind mounts /shared using :z, and then create a file in /shared:

podman run -it --rm --name sleepy -v /shared:/shared:z registry.access.redhat.com/ubi8/ubi bash
date > /shared/file01.txt
exit

On the host, notice the correct SELinux label on the shared directory:

ls -lZ /shared
-rw-r--r--. 1 ec2-user ec2-user system_u:object_r:container_file_t:s0 29 Apr  6 14:11 file01.txt

Repeat with a second container. It should succeed.

podman run -it --rm --name sleepier -v /shared:/shared:z registry.access.redhat.com/ubi8/ubi bash
date > /shared/file02.txt
exit

On the host, confirm the shared directory contains the files created by the containers.

ls -lZ /shared
-rw-r--r--. 1 ec2-user ec2-user system_u:object_r:container_file_t:s0 29 Apr  6 14:11 file01.txt
-rw-r--r--. 1 ec2-user ec2-user system_u:object_r:container_file_t:s0 29 Apr  6 14:15 file02.txt

Read-Only Containers

Imagine a scenario where an application gets compromised. The first thing the bad guy wants to do is to write an exploit into the container, so the next time the application starts up, it starts up with the exploit in place. If the container was read-only it would prevent leaving a backdoor in place and be forced to start the cycle from the beginning.

Container engines added a read-only feature but it presents challenges since many applications need to write to temporary directories like /run or /tmp and when these directories are read-only, the apps fail. Red Hat’s approach leverages tmpfs. It’s a nice solution to this problem because it eliminates data exposure on the host. As a recommended practice, run all applications in production in this mode and only allow write operations to known directories.

To experiment with this feature, run a read-only container and specify a few writable file systems using the --tmpfs option:

podman run --rm -it --name tmpfs --read-only --tmpfs /run --tmpfs /tmp registry.access.redhat.com/ubi8/ubi bash

Now, try the following. What fails and what succeeds? Why?

mkdir /newdir
mkdir: cannot create directory '/newdir': Read-only file system
mkdir /run/newdir
mkdir /tmp/newdir
exit

We’ve covered a lot of ground here on Dan’s favorite topic. You should feel good.


Workshop Details

Domain Red Hat Logo
Workshop
Student ID

Return to Workshop