In addition to authorization policies that control what a user can do, OpenShift Container Platform provides security context constraints (SCC) that control the actions that a pod can perform and what it has the ability to access. Administrators can manage SCCs using the CLI. This is a great way to lock down you individual application to make sure that they have hardened settings. This applies to applications running on OpenShift.
SCCs are objects that define a set of conditions that a pod must run with in order to be accepted into the system. They allow an administrator to control the following:
Running of privileged containers.
Capabilities a container can request to be added.
Use of host directories as volumes.
The SELinux context of the container.
The user ID.
The use of host namespaces and networking.
Allocating an FSGroup that owns the pod’s volumes
Configuring allowable supplemental groups
Requiring the use of a read only root file system
Controlling the usage of volume types
Configuring allowable seccomp profiles
Change to system
.
oc login -u system:admin
Get a current list of SCCs.
oc get scc
Get a closer look at the restricted
SCC.
oc describe scc restricted
Name: restricted
Priority: <none>
Access:
Users: <none>
Groups: system:authenticated
Settings:
Allow Privileged: false
Default Add Capabilities: <none>
Required Drop Capabilities: KILL,MKNOD,SYS_CHROOT,SETUID,SETGID (1)
Allowed Capabilities: <none> (2)
Allowed Volume Types: configMap,downwardAPI,emptyDir,persistentVolumeClaim,secret
Allow Host Network: false
Allow Host Ports: false
Allow Host PID: false
Allow Host IPC: false
Read Only Root Filesystem: false
Run As User Strategy: MustRunAsRange
UID: <none>
UID Range Min: <none>
UID Range Max: <none>
SELinux Context Strategy: MustRunAs (3)
User: <none>
Role: <none>
Type: <none>
Level: <none>
FSGroup Strategy: MustRunAs
Ranges: <none>
Supplemental Groups Strategy: RunAsAny
Ranges: <none>
1 | Linux Capabilities to be dropped. |
2 | Linux Capabilities that could be added. |
3 | SELinux options |
Create a SCC
As most people at this workshop are security minded people we will skip over adding permissions and capabilities to containers, let’s look at how tighten things up. We will look at dropping certain Linux Capabilities in this SCC file.
---
kind: SecurityContextConstraints
apiVersion: v1
metadata:
name: drop-capabilities
allowPrivilegedContainer: false
runAsUser:
type: RunAsAny
seLinuxContext:
type: RunAsAny
fsGroup:
type: RunAsAny
supplementalGroups:
type: RunAsAny
users:
- drop-user
groups:
- drop-group
requiredDropCapabilities:
- KILL
- MKNOD
- SYS_CHROOT
- SETPCAP
- NET_BIND_SERVICE
- NET_RAW
- SYS_CHROOT
- MKNOD
- AUDIT_WRITE
- SETFCAP
Copy the text above. Type vim drop-capabilities.yml , Press i for
Insert, then cut and paste control + v , then escape and write the file esc ,
:wq .
|
oc create -f drop-capabilities.yml
Create Service Account
oc create serviceaccount section9
Describe service account
oc describe serviceaccount section9
Add to service account
drop-capabilities.yml
to the service accountoc adm policy add-scc-to-user drop-capabilities system:serviceaccount:sso:section9
Now lets view the policy again and see that our service account was added.
oc describe scc drop-capabilities
Name: drop-capabilities
Priority: <none>
Access:
Users: dc17-user,system:serviceaccount:sso:section9
Groups: dc17-group
Settings:
Allow Privileged: false
Default Add Capabilities: <none>
Required Drop Capabilities: KILL,MKNOD,SYS_CHROOT,SETPCAP,NET_BIND_SERVICE,NET_RAW,SYS_CHROOT,MKNOD,AUDIT_WRITE,SETFCAP (1)
Allowed Capabilities: <none> (2)
Allowed Volume Types: awsElasticBlockStore,azureDisk,azureFile,cephFS,cinder,configMap,downwardAPI,emptyDir,fc,flexVolume,flocker,gcePersistentDisk,gitRepo,glusterfs,iscsi,nfs,persistentVolumeClaim,photonPersistentDisk,quobyte,rbd,secret,vsphere
Allow Host Network: false
Allow Host Ports: false
Allow Host PID: false
Allow Host IPC: false
Read Only Root Filesystem: false
Run As User Strategy: RunAsAny
UID: <none>
UID Range Min: <none>
UID Range Max: <none>
SELinux Context Strategy: RunAsAny
User: <none>
Role: <none>
Type: <none>
Level: <none>
FSGroup Strategy: RunAsAny
Ranges: <none>
Supplemental Groups Strategy: RunAsAny
Ranges: <none>
1 | Linux Capabilities to be dropped. |
2 | Linux Capabilities are allowed, currently none. |
Seccomp (secure computing mode) is used to restrict the set of system calls applications can make, allowing cluster administrators greater control over the security of workloads running in OpenShift Container Platform. Seccomp is applied at the individual host level. The ability to lock down certain hosts in a cluster can help to make you architecture more secure by running untrusted code or internet facing servers in a more restricted mannor than the rest of the nodes in the cluster.
Seccomp support is achieved via two annotations in the pod configuration:
seccomp.security.alpha.kubernetes.io/pod: profile applies to all containers in the pod that do not override
container.seccomp.security.alpha.kubernetes.io/<container_name>: container-specific profile override
Applications use seccomp
to restrict the set of system calls they can make.
Recently, container runtimes have begun adding features to allow the runtime to
interact with seccomp
on behalf of the application, which eliminates the need
for applications to link against libseccomp
directly. Adding support in the
Kubernetes API for describing seccomp
profiles
will allow administrators
greater control over the security of workloads running in Kubernetes.
The systemd seccomp facility is based on a whitelist of system calls that can be made, rather than a full filter specification.
<div class="alert alert-warning">
<span class="pficon pficon-warning-triangle-o"></span>
Containers are run with unconfined seccomp settings by default.
</div>
cat /boot/config-`uname -r` | grep CONFIG_SECCOMP=
Seccomp support is achieved via two metadata annotations in the pod configuration:
annotations:
seccomp.security.alpha.kubernetes.io/pod (1)
annotations:
container.seccomp.security.alpha.kubernetes.io/<container_name> (2)
1 | profile applies to all containers in the pod that do not override |
2 | container-specific profile override |
Here’s an example of a pod that uses the unconfined profile:
apiVersion: v1
kind: Pod
metadata:
name: trustworthy-pod
annotations:
seccomp.security.alpha.kubernetes.io/pod: unconfined (1)
spec:
containers:
- name: trustworthy-container
image: sotrustworthy:latest
1 | Use Kubernetes Pods metadata lables to define the Seccomp profile. This one is using a unconfined profile. |
Here’s an example of a Pod that uses a profile called example-explorer-profile
. This is a sample program that only can set permissions on files and move them to diffrent locations.
apiVersion: v1
kind: Pod
metadata:
name: explorer
annotations:
container.seccomp.security.alpha.kubernetes.io/explorer: localhost/example-explorer-profile (1)
spec:
containers:
- name: explorer
image: gcr.io/google_containers/explorer:1.0
args: ["-port=8080"]
ports:
- containerPort: 8080
protocol: TCP
volumeMounts:
- mountPath: "/mount/test-volume"
name: test-volume
volumes:
- name: test-volume
emptyDir: {}
1 | This refers to a custom file policy that resides on the localhost and will apply syscall restrictions to the Pod/containers via Secccomp . |
{
"defaultAction": "SCMP_ACT_ERRNO",
"archMap": [
{
"architecture": "SCMP_ARCH_X86_64",
"subArchitectures": [
"SCMP_ARCH_X86",
"SCMP_ARCH_X32"
]
},
{
"architecture": "SCMP_ARCH_AARCH64",
"subArchitectures": [
"SCMP_ARCH_ARM"
]
},
{
"architecture": "SCMP_ARCH_MIPS64",
"subArchitectures": [
"SCMP_ARCH_MIPS",
"SCMP_ARCH_MIPS64N32"
]
},
{
"architecture": "SCMP_ARCH_MIPS64N32",
"subArchitectures": [
"SCMP_ARCH_MIPS",
"SCMP_ARCH_MIPS64"
]
},
{
"architecture": "SCMP_ARCH_MIPSEL64",
"subArchitectures": [
"SCMP_ARCH_MIPSEL",
"SCMP_ARCH_MIPSEL64N32"
]
},
{
"architecture": "SCMP_ARCH_MIPSEL64N32",
"subArchitectures": [
"SCMP_ARCH_MIPSEL",
"SCMP_ARCH_MIPSEL64"
]
},
{
"architecture": "SCMP_ARCH_S390X",
"subArchitectures": [
"SCMP_ARCH_S390"
]
}
],
"syscalls": [
{
"names": [
"accept",
"accept4",
"access",
"adjtimex",
"alarm",
"alarm",
"bind",
"brk",
"capget",
"capset",
"chdir",
"chmod",
"chown",
"chown32",
"clock_getres",
"clock_gettime",
"clock_nanosleep",
"close",
"connect",
"copy_file_range",
"creat",
"dup",
"dup2",
"dup3",
"epoll_create",
"epoll_create1",
"epoll_ctl",
"epoll_ctl_old",
"epoll_pwait",
"epoll_wait",
"epoll_wait_old",
"eventfd",
"eventfd2",
"execve",
"execveat",
"exit",
"exit_group",
"faccessat",
"fadvise64",
"fadvise64_64",
"fallocate",
"fanotify_mark",
"fchdir",
"fchmod",
"fchmodat",
"fchown",
"fchown32",
"fchownat",
"fcntl",
"fcntl64",
"fdatasync",
"fgetxattr",
"flistxattr",
"flock",
"fork",
"fremovexattr",
"fsetxattr",
"fstat",
"fstat64",
"fstatat64",
"fstatfs",
"fstatfs64",
"fsync",
"ftruncate",
"ftruncate64",
"futex",
"futimesat",
"getcpu",
"getcwd",
"getdents",
"getdents64",
"getegid",
"getegid32",
"geteuid",
"geteuid32",
"getgid",
"getgid32",
"getgroups",
"getgroups32",
"getitimer",
"getpeername",
"getpgid",
"getpgrp",
"getpid",
"getppid",
"getpriority",
"getrandom",
"getresgid",
"getresgid32",
"getresuid",
"getresuid32",
"getrlimit",
"get_robust_list",
"getrusage",
"getsid",
"getsockname",
"getsockopt",
"get_thread_area",
"gettid",
"gettimeofday",
"getuid",
"getuid32",
"getxattr",
"inotify_add_watch",
"inotify_init",
"inotify_init1",
"inotify_rm_watch",
"io_cancel",
"ioctl",
"io_destroy",
"io_getevents",
"ioprio_get",
"ioprio_set",
"io_setup",
"io_submit",
"ipc",
"kill",
"lchown",
"lchown32",
"lgetxattr",
"link",
"linkat",
"listen",
"listxattr",
"llistxattr",
"_llseek",
"lremovexattr",
"lseek",
"lsetxattr",
"lstat",
"lstat64",
"madvise",
"memfd_create",
"mincore",
"mkdir",
"mkdirat",
"mknod",
"mknodat",
"mlock",
"mlock2",
"mlockall",
"mmap",
"mmap2",
"mprotect",
"mq_getsetattr",
"mq_notify",
"mq_open",
"mq_timedreceive",
"mq_timedsend",
"mq_unlink",
"mremap",
"msgctl",
"msgget",
"msgrcv",
"msgsnd",
"msync",
"munlock",
"munlockall",
"munmap",
"nanosleep",
"newfstatat",
"_newselect",
"open",
"openat",
"pause",
"pipe",
"pipe2",
"poll",
"ppoll",
"prctl",
"pread64",
"preadv",
"preadv2",
"prlimit64",
"pselect6",
"pwrite64",
"pwritev",
"pwritev2",
"read",
"readahead",
"readlink",
"readlinkat",
"readv",
"recv",
"recvfrom",
"recvmmsg",
"recvmsg",
"remap_file_pages",
"removexattr",
"rename",
"renameat",
"renameat2",
"restart_syscall",
"rmdir",
"rt_sigaction",
"rt_sigpending",
"rt_sigprocmask",
"rt_sigqueueinfo",
"rt_sigreturn",
"rt_sigsuspend",
"rt_sigtimedwait",
"rt_tgsigqueueinfo",
"sched_getaffinity",
"sched_getattr",
"sched_getparam",
"sched_get_priority_max",
"sched_get_priority_min",
"sched_getscheduler",
"sched_rr_get_interval",
"sched_setaffinity",
"sched_setattr",
"sched_setparam",
"sched_setscheduler",
"sched_yield",
"seccomp",
"select",
"semctl",
"semget",
"semop",
"semtimedop",
"send",
"sendfile",
"sendfile64",
"sendmmsg",
"sendmsg",
"sendto",
"setfsgid",
"setfsgid32",
"setfsuid",
"setfsuid32",
"setgid",
"setgid32",
"setgroups",
"setgroups32",
"setitimer",
"setpgid",
"setpriority",
"setregid",
"setregid32",
"setresgid",
"setresgid32",
"setresuid",
"setresuid32",
"setreuid",
"setreuid32",
"setrlimit",
"set_robust_list",
"setsid",
"setsockopt",
"set_thread_area",
"set_tid_address",
"setuid",
"setuid32",
"setxattr",
"shmat",
"shmctl",
"shmdt",
"shmget",
"shutdown",
"sigaltstack",
"signalfd",
"signalfd4",
"sigreturn",
"socket",
"socketcall",
"socketpair",
"splice",
"stat",
"stat64",
"statfs",
"statfs64",
"symlink",
"symlinkat",
"sync",
"sync_file_range",
"syncfs",
"sysinfo",
"syslog",
"tee",
"tgkill",
"time",
"timer_create",
"timer_delete",
"timerfd_create",
"timerfd_gettime",
"timerfd_settime",
"timer_getoverrun",
"timer_gettime",
"timer_settime",
"times",
"tkill",
"truncate",
"truncate64",
"ugetrlimit",
"umask",
"uname",
"unlink",
"unlinkat",
"utime",
"utimensat",
"utimes",
"vfork",
"vmsplice",
"wait4",
"waitid",
"waitpid",
"write",
"writev"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {},
"excludes": {}
},
{
"names": [
"personality"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 0,
"valueTwo": 0,
"op": "SCMP_CMP_EQ"
}
],
"comment": "",
"includes": {},
"excludes": {}
},
{
"names": [
"personality"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 8,
"valueTwo": 0,
"op": "SCMP_CMP_EQ"
}
],
"comment": "",
"includes": {},
"excludes": {}
},
{
"names": [
"personality"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 131072,
"valueTwo": 0,
"op": "SCMP_CMP_EQ"
}
],
"comment": "",
"includes": {},
"excludes": {}
},
{
"names": [
"personality"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 131080,
"valueTwo": 0,
"op": "SCMP_CMP_EQ"
}
],
"comment": "",
"includes": {},
"excludes": {}
},
{
"names": [
"personality"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 4294967295,
"valueTwo": 0,
"op": "SCMP_CMP_EQ"
}
],
"comment": "",
"includes": {},
"excludes": {}
},
{
"names": [
"sync_file_range2"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"arches": [
"ppc64le"
]
},
"excludes": {}
},
{
"names": [
"arm_fadvise64_64",
"arm_sync_file_range",
"sync_file_range2",
"breakpoint",
"cacheflush",
"set_tls"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"arches": [
"arm",
"arm64"
]
},
"excludes": {}
},
{
"names": [
"arch_prctl"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"arches": [
"amd64",
"x32"
]
},
"excludes": {}
},
{
"names": [
"modify_ldt"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"arches": [
"amd64",
"x32",
"x86"
]
},
"excludes": {}
},
{
"names": [
"s390_pci_mmio_read",
"s390_pci_mmio_write",
"s390_runtime_instr"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"arches": [
"s390",
"s390x"
]
},
"excludes": {}
},
{
"names": [
"open_by_handle_at"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"caps": [
"CAP_DAC_READ_SEARCH"
]
},
"excludes": {}
},
{
"names": [
"bpf",
"clone",
"fanotify_init",
"lookup_dcookie",
"mount",
"name_to_handle_at",
"perf_event_open",
"setdomainname",
"sethostname",
"setns",
"umount",
"umount2",
"unshare"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"caps": [
"CAP_SYS_ADMIN"
]
},
"excludes": {}
},
{
"names": [
"clone"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 0,
"value": 2080505856,
"valueTwo": 0,
"op": "SCMP_CMP_MASKED_EQ"
}
],
"comment": "",
"includes": {},
"excludes": {
"caps": [
"CAP_SYS_ADMIN"
],
"arches": [
"s390",
"s390x"
]
}
},
{
"names": [
"clone"
],
"action": "SCMP_ACT_ALLOW",
"args": [
{
"index": 1,
"value": 2080505856,
"valueTwo": 0,
"op": "SCMP_CMP_MASKED_EQ"
}
],
"comment": "s390 parameter ordering for clone is different",
"includes": {
"arches": [
"s390",
"s390x"
]
},
"excludes": {
"caps": [
"CAP_SYS_ADMIN"
]
}
},
{
"names": [
"reboot"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"caps": [
"CAP_SYS_BOOT"
]
},
"excludes": {}
},
{
"names": [
"chroot"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"caps": [
"CAP_SYS_CHROOT"
]
},
"excludes": {}
},
{
"names": [
"delete_module",
"init_module",
"finit_module",
"query_module"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"caps": [
"CAP_SYS_MODULE"
]
},
"excludes": {}
},
{
"names": [
"acct"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"caps": [
"CAP_SYS_PACCT"
]
},
"excludes": {}
},
{
"names": [
"kcmp",
"process_vm_readv",
"process_vm_writev",
"ptrace"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"caps": [
"CAP_SYS_PTRACE"
]
},
"excludes": {}
},
{
"names": [
"iopl",
"ioperm"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"caps": [
"CAP_SYS_RAWIO"
]
},
"excludes": {}
},
{
"names": [
"settimeofday",
"stime",
"clock_settime"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"caps": [
"CAP_SYS_TIME"
]
},
"excludes": {}
},
{
"names": [
"vhangup"
],
"action": "SCMP_ACT_ALLOW",
"args": [],
"comment": "",
"includes": {
"caps": [
"CAP_SYS_TTY_CONFIG"
]
},
"excludes": {}
}
]
}