Slides from my talk about LXC and Docker.

Overview

  • Introduction
  • LXC and Docker in a Nutshell
  • Isolation and Resource Management
  • Security Considerations
  • Container Applications
  • Outlook

Introduction


Virtualization Categories

  • full virtualization: enables running an unmodified OS.
    • Examples: Parallels, VirtualBox, Xen
  • paravirtualization: enables running a modified guest system (kernel)
    • Examples: Xen
  • OS-level virtualization: enables running an isolated process (tree)
    • Examples: OpenVZ, LXC, BSD-jails, Linux-VServer, Solaris Zones
    • Virtualized Containers: LXC, Docker

Are containers virtualization at all?


Why should I care about container virtualization?

  • Lightweight (almost no overhead)
  • Docker Inc.: $55M in venture capital
  • Docker Supporters:
    • RedHat (OpenShift)
    • Microsoft (Azure, Windows Server)
    • Google (GCE)
    • Amazon (AWS Beanstalk)

What is LXC?

LXC: LinuX Containers

From https://linuxcontainers.org:

  • LXC is a userspace interface for the Linux kernel containment features.
  • … it lets Linux users easily create and manage system or application containers.

Started in 2008, implemented in C/Python


What is Docker?

From http://www.docker.com:

  • Build, Ship and Run Any App, Anywhere
  • Docker - An open platform for distributed applications for developers and sysadmins.
    • portable, lightweight runtime and packaging tool
    • cloud service for sharing applications and automating workflows
  • As a result, IT can ship faster and run the same app, unchanged, on laptops, data center VMs, and any cloud.

Started in 2013, implemented mostly in Go


LXC and Docker in a Nutshell


LXC Installation
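
A minimal sketch of an installation on Debian/Ubuntu (package name as of 2014; exact steps vary by distribution):

# install the LXC userspace tools
sudo apt-get install lxc
# verify that the kernel provides the required features
sudo lxc-checkconfig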


LXC System Container

Agenda:

  • Create a Debian wheezy LXC container
  • Install and test openssh-server
  • Cleanup container

Tools:

  • lxc-create / lxc-start / lxc-stop / lxc-destroy
  • lxc-attach: run a shell in a running container
  • lxc-ls: list containers (and their status)
  • pstree: show process tree on host

**Documentation: LXC manpages**


LXC System Container
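
A hedged sketch of the demo flow above (the container name wheezy01 is an example; template options differ between distributions and LXC versions):

# create a Debian system container from the debian template
sudo lxc-create -n wheezy01 -t debian
# start it in the background and list containers with their status
sudo lxc-start -n wheezy01 -d
sudo lxc-ls --fancy
# get into the running container and install/test openssh-server
sudo lxc-attach -n wheezy01 -- apt-get install -y openssh-server
# on the host: the container processes show up in the process tree
pstree -a | less
# cleanup: stop and destroy the container
sudo lxc-stop -n wheezy01
sudo lxc-destroy -n wheezy01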


Docker

Agenda:

  • Create salt-master image
  • Run salt-master container
  • Publish salt-master image

Docker Installation
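
A hedged sketch for Ubuntu at the time of the talk (the convenience script and package names may change):

# quick install via the official convenience script
curl -sSL https://get.docker.com/ | sh
# alternatively via the distribution package on Ubuntu 14.04
sudo apt-get install docker.io
# verify that the daemon is up
sudo docker info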


Images

…consist of:

  • Filesystem (Layer)
  • Meta-Information:
    • Exposed ports
    • Mountable volumes
    • Entrypoint / Command
    • (typically) derived from a parent layer

…are built from:

  • Template aka Dockerfile

The Dockerfile

FROM ubuntu
RUN export DEBIAN_FRONTEND=noninteractive && \
    apt-get update && \
    apt-get install -y salt-master && \
    apt-get clean

EXPOSE 4505
COPY master /etc/salt/master

VOLUME ["/etc/salt/pki"]
VOLUME ["/srv/"]

ENTRYPOINT [ "/usr/bin/salt-master" ]

Build Image
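
Assuming the Dockerfile above is in the current directory (the tag saltmaster is an example):

# build the image from the Dockerfile in . and tag it
docker build -t saltmaster .
# list local images
docker images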


Containers

Containers are instances of images

  • Create & Start = Run
  • Can be stopped and restarted
  • Have their own writable filesystem layer on top

Run Container
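
A hedged example of running the image built above (container name, host paths and ports are examples):

# run the salt-master container in the background,
# publishing the exposed port and mounting the declared volumes
docker run -d --name salt \
    -p 4505:4505 \
    -v /srv/salt:/srv \
    -v /etc/salt/pki:/etc/salt/pki \
    saltmaster
# show running containers and follow the log output
docker ps
docker logs -f salt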


Publish aka “Push” Image

Images with an appropriate tag can be pushed / pulled.

  • docker [push|pull] imagename

Tagging:

  • Tag Format: [REGISTRYHOST/][USERNAME/]NAME[:TAG]
    • Public (“official”): busybox, ubuntu:trusty, redis
    • Public (user): martinhoefling/salt-minion:latest
    • Private: localhost:5000/saltmaster

Registries:

  • Public registry (aka Docker Hub)
  • Private Registry
    • docker run -p 5000:5000 registry

Push/Pull to/from Registry
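
A hedged example with the private registry started above (the image name is an example; a plain-HTTP registry may additionally require the daemon's --insecure-registry option, depending on the Docker version):

# tag the local image for the private registry and push it
docker tag saltmaster localhost:5000/saltmaster
docker push localhost:5000/saltmaster
# pull it again, e.g. on another host that can reach the registry
docker pull localhost:5000/saltmaster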


Execute into running container

One of the most requested features; new in Docker 1.3

Example Usecases:

  • Run a shell to inspect a running container
  • Run a utility script, e.g. mongodump a database

Usage:

docker exec -t -i mycontainer /bin/bash

Should not be (ab)used to start multiple daemon processes per container (Docker antipattern)

See also Best Practices


Isolation and Resource Management

LXC and Docker use the same or similar kernel and userland functionality.


cgroups

Control groups (cgroups) allow limiting the resources of a process tree, e.g. memory, CPU, device access, …


cgroups
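
A minimal sketch of what such a demo can look like, using the cgroup v1 memory controller directly (assumes it is mounted under /sys/fs/cgroup/memory, the common default at the time):

# create a new cgroup and limit its memory to 256 MB
sudo mkdir /sys/fs/cgroup/memory/demo
echo $((256*1024*1024)) | sudo tee /sys/fs/cgroup/memory/demo/memory.limit_in_bytes
# move the current shell into the cgroup; all child processes inherit the limit
echo $$ | sudo tee /sys/fs/cgroup/memory/demo/tasks
# Docker exposes the same mechanism via run flags
docker run -m 256m -c 512 ubuntu true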


namespace isolation aka “namespaces”

Isolation of:

  • processes (PID namespace); the parent PID namespace sees all PIDs
  • network (network namespace); separates physical and virtual NICs
  • hostname (UTS namespace)
  • file system layout (mount namespace)
  • users and groups (user namespace); maps users including root (e.g. to unprivileged users in the parent namespace)
  • System V inter-process communication (IPC namespace); separates shared memory segments, semaphores and message queues

Namespaces are created via the clone syscall (with CLONE_NEW* flags) instead of a plain fork. A process's namespaces are visible under /proc/$PID/ns/.

Verbose Article


namespaces demo
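
A hedged sketch using util-linux's unshare (requires a reasonably recent util-linux and root privileges):

# start a shell in new UTS, PID and mount namespaces
sudo unshare --uts --pid --mount --fork /bin/bash
# changing the hostname only affects this UTS namespace
hostname container01
# after remounting /proc, the new PID namespace starts counting at 1
mount -t proc proc /proc && ps ax
# the namespaces of a process are visible as symlinks
ls -l /proc/$$/ns/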


Aufs

Aufs ("another union filesystem") allows creating a filesystem union by merging separate filesystems as layers into one virtual filesystem.

SSD / HDD merged home directory:

mount -t aufs -o br=/home_ssd=rw:/mnt/storage/home_hdd=rw,udba=reval none /home

Read-write volatile home on top of read-only NFS home:

mount -t aufs -o br=/volatile=rw:/mnt/home_nfs=ro,udba=none none /home

Further security features used

  • AppArmor / SELinux: provide Mandatory Access Control via profiles
  • seccomp-bpf (LXC): blacklist or whitelist system calls
  • capability drop: dropping capabilities that are not required (see the sketch after this list)

Optional:

  • GRSEC / PAX enabled kernel
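
A hedged Docker example of dropping capabilities (--cap-drop / --cap-add are available since Docker 1.2; the image and the exact capability set are examples and depend on the service):

# drop all capabilities and re-add only what the service needs
docker run -d --cap-drop ALL \
    --cap-add NET_BIND_SERVICE --cap-add CHOWN \
    --cap-add SETUID --cap-add SETGID \
    nginx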

Summarizing Talk


Further Reading


Security Considerations


Full / Para-virtualization

Main attack vector is the hypervisor

Exemplary Xen weakness: CVE-2014-7188

  • Improper MSR range used for x2APIC emulation

OS-level / Container Virtualization

Main attack vector: … not so simple …

  • Proper implementation of kernel isolation, i.e. namespaces, cgroups
  • Security of syscalls (kernel)
  • Exotic filesystems / Aufs (Docker)
  • Proper configuration of SELinux / AppArmor
  • Direct device access is dangerous! Avoid if possible.

Some Remarks on Secure Application

  • Providing isolated environments instead of running everything on one instance: better than no isolation
  • Isolating simple services (networking only, unprivileged user, no device access): mostly safe in the default configuration
  • Services with hardware access: can be difficult to lock down
  • Multi-tenant platforms: considerable effort to lock down and stay secure

Further Reading / Opinions


Container Applications


Discosrv: Container as Build Environment

FROM ubuntu
MAINTAINER Martin Hoefling <martin.hoefling@gmx.de>
ENV GOPATH /root/go
RUN export DEBIAN_FRONTEND=noninteractive && \
    apt-get update && \
    apt-get install -y golang git mercurial && \
    apt-get clean && \
    mkdir /root/go && cd /root/go && \
    go get github.com/syncthing/discosrv && \
    apt-get remove -y --purge golang git mercurial && \
    apt-get autoremove -y && \
    cp bin/discosrv /usr/local/bin && rm -rf /root/go

EXPOSE 22026/udp
ENTRYPOINT /usr/local/bin/discosrv

Benefit:

  • Resulting image (layer) of only 7 MB
  • Service with isolated environment ready to start

Aptly: Containerizing tools

FROM ubuntu:trusty
RUN echo "deb http://repo.aptly.info/ squeeze main" > \ 
    /etc/apt/sources.list.d/aptly.list && \
    apt-get update && \
    apt-get install -y aptly ca-certificates && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* /tmp/* /var/tmp/*

COPY aptly.conf /etc/aptly.conf
VOLUME ["/aptly"]
ENTRYPOINT ["/usr/bin/aptly"]

Aptly: Using a containerized tool

Shell wrapper:

#!/bin/bash
docker run --rm -v /mnt/aptly:/aptly aptlyimage "$@"

Run aptly:

$ aptly mirror create wheezy-security http://security.debian.org/ wheezy/updates main
$ aptly mirror update wheezy-security
$ aptly db cleanup

Benefits:

  • Portable to other machines, running Docker
  • Isolated upgrades / environment

Salt Formula: Configuration Management Testdrive

FROM martinhoefling/salt-minion:debian
MAINTAINER Martin Hoefling <martin.hoefling@gmx.de>

COPY dovecot /srv/salt/dovecot
COPY pillar.example /srv/pillar/example.sls

RUN echo "file_client: local" > /etc/salt/minion.d/local.conf && \
    echo "base:" > /srv/pillar/top.sls  && \
    echo "  '*':" >> /srv/pillar/top.sls  && \
    echo "    - example" >> /srv/pillar/top.sls && \    
    salt-call --local --retcode-passthrough state.sls dovecot

Run Test:

docker build .

Web Project: Dockerize Jenkins Pipeline

Jenkins pipeline for a web project (Python / JavaScript).

  • Python:
    • PyLint (Virtualenv)
    • Nosetests (Virtualenv)
  • JavaScript:
    • Various lints (Node packages)
    • Jasmine tests (Node packages)
  • End to end (CasperJS):
    • api, portal and realtime Tornado servers (Virtualenv)
    • frontend build (Node packages, Bower components)
    • elasticsearch, mongo, redis
    • CasperJS client (Node packages)

Web Project: Dockerize Jenkins Pipeline

The Jenkins Dependency Image (JeDI):

FROM myregistry:5000/trusty:latest

RUN DEBIAN_FRONTEND=noninteractive apt-get install -y tar git nodejs python 
RUN npm update -g && npm install -g grunt-cli bower casperjs
RUN pip3 install virtualenv

COPY dependencies/ /opt/
RUN virtualenv -p /usr/bin/python3 /opt/backend && \
    cd /opt/backend/ && ./bin/pip3 install -r requirements-dev.txt
RUN cd /opt && npm install && \
    cd /opt/frontend/portal/ && npm install
RUN cd /opt/frontend/ && bower --allow-root install

COPY start.sh /usr/local/bin/start.sh

EXPOSE 5000 5001 44444

VOLUME ["/sourcetree", "/data/log"]
ENTRYPOINT ["/usr/local/bin/start.sh"]

No sources in the container; a rebuild is only required when dependencies change!


Web Project: Dockerize Jenkins Pipeline

The JeDI in the Build Pipeline

  • Update source tree & copy build dependencies
  • Build image (instantaneous if dependencies are the same)
  • Run all tests / lints / builds in parallel. Each:
    • Spin up container(s) with mounted source tree (see the sketch after this list)
    • Run testsuites / app / frontend build in parallel
    • Destroy containers
    • Evaluate Logs
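
A hedged sketch of one such parallel step (assumes the JeDI image above and that its start.sh entrypoint dispatches on its first argument; image name and paths are examples; $WORKSPACE is provided by Jenkins):

#!/bin/bash
# run one test suite in a throwaway container with the current source tree mounted
docker run --rm \
    -v "$WORKSPACE":/sourcetree \
    -v "$WORKSPACE/logs":/data/log \
    myregistry:5000/jedi:latest nosetests
# fail the build if any collected log reports a failure
! grep -R "FAILED" "$WORKSPACE/logs/"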

Benefit:

  • **Full Jenkins pipeline run in 2:50 minutes (before: ~16 minutes)**

Outlook

  • Rocket (App Container engine): Docker competitor from the CoreOS people.
  • Kubernetes (Docker orchestration): from Google (GoogleCloudPlatform, GCE); cluster provisioning is based on Salt
  • Docker Machine and Docker Swarm
  • LXD (LXC-based container "hypervisor", with an OpenStack Nova plugin): from the makers of LXC.
  • Mesosphere “Datacenter Operating System”

LXC vs. Docker (*)

The Battle May Be Over, but the War Has Just Begun!

(*) www.flockport.com/lxc-vs-docker