FOSS4G Boston 2017

### About me

* Software Architect @ [Reinventing Geospatial, Inc.
                        (RG<span style="text-transform:
                            lowercase;">i</span>)](http://rgi-corp.com)
                        * Level 20 CLI Wizard (_can cast 9th level GDAL commands_)
                        * Geospatial API development in Java, Android, Javascript
                        * My [GitHub](https://github.com/stevendlander) and
                        [GitLab](https://gitlab.com/stevendlander)
                        * See my
                        [previous](https://github.com/GitHubRGI/FOSS4G-Seoul-2015/blob/master/Friday/1125-1150_BuildingCI/BuildingCI.pdf)
                        [talks](https://github.com/GitHubRGI/FOSS4G-Seoul-2015/blob/master/Thursday/1125-1150_OGCGeoPackageInPractice/OGCGeoPackageInPractice.pdf)
                        from FOSS4G Seoul 2015

Note:
                        My name is Steve Lander, and I have been in the
                        geospatial domain for going on 7 years now in various
                        roles, most recently as a project technical lead and
                        architect.  Applied research is a good way of
                        describing the sorts of projects I have been working on
                        recently, for example, API development for tile
                        caching and storage. I also happen to dabble in the
                        dark art of GDAL command line utilities.
                        ---
                        ### Outline

* Why even use containers? How will they benefit me?
                        * Ok...what are some ways I can make one?  How do I get
                        started?
                        * What are some more advanced topics?

Note:
                        What will we be discussing today: How are containers
                        fundamentally different from other technologies and why
                        is that a good thing?  What are some easy ways to get
                        started making containers that help me do my job?  And
                        finally, for those of you well-versed in Docker and
                        similar programs already, what are some advanced topics
                        to take you further.

### What is a container?

The delta between the files on your system and another
                        file system running in a separate space.

Note:
                        Through file permissions, SELinux, and I am sure many
                        other methods, containers exist as separate
                        filesystems that use as much of your base operating
                        system as possible.
                        ---
                        ### Container vs. VM

A VM is a _copy_ of an entire computer _within_ your
                        computer.  Efficiency is lost in this duplication.

Note:
                        Do you really need everything a VM brings with it?
                        With VMs you end up paying for things you don't
                        necessarily need.
                        ---
                        ### Relevant use cases

Note:
                        From here on, I am going to assume Docker is your tool
                        of choice for manipulating containers.  There are other
                        options such as Vagrant, however, should you be
                        interested in other offerings.
                        ---
                        ### A build environment

Build your applications consistently and reliably

```Dockerfile
                        # Start from this base linux image...
                        FROM debian:8.9 # aka "jessie", but be specific!

# Install this software to the container from a pkg manager
                        RUN apt-get update -y && apt-get install -y \
                            curl \
                            wget \
                            openjdk-8

# Customize my environment they way I need it
                        ENV JAVA_HOME /usr/bin/java
                        ```

Note:
                        * Containers work well as build environments.  If you
                        have ever used TravisCI through GitHub, you have been
                        using containers already.
                        * Your Dockerfile is a recipe on how to make the same
                        thing over and over exactly the same way, so it pays to
                        ensure specific versions of software are used instead
                        of the latest ones.  You wouldn't want updates to your
                        base OS breaking your software unexpectedly!
                        ---
                        ### A build environment (cont'd)

Avoid polluting your desktop environment with
                        dependencies you don't need outside of your development
                        environment

```Dockerfile
                        FROM debian:8.9

# Move to a new folder to work in, creating it if necessary
                        WORKDIR /workspace

# I install so many packages, it isn't worth listing them all
                        COPY install_packages.sh install_packages.sh
                        RUN /bin/bash install_packages.sh

# Build GDAL from scratch (let's say I need MrSID)...
                        COPY gdal_build_script_of_doom.sh gdal_build_script_of_doom.sh
                        RUN /bin/bash gdal_build_script_of_doom.sh
                        ```

Note:
                        Maybe your program needs tons of python libraries that
                        you would not otherwise use on other projects.  Perhaps
                        you build a library like GDAL from source to get
                        additional driver support not otherwise provided due to
                        licensing restrictions along with some SWIG bindings to
                        other languages. In either of these cases the computer
                        you rely on to do your job stays neat and tidy, and
                        when you don't need the dependencies, simply delete
                        your container.
                        ---
                        ### A build environment (cont'd)

Reduce the time it takes for other people to build,
                        run, test, and contribute to your project

```Dockerfile
                        # A Docker container with my code
                        FROM debian:8.9

WORKDIR /workspace
                        RUN git clone https://github.com/stevendlander/foss4g2017 \
                            && cd foss4g2017

COPY helperScript.sh helperScript.sh

EXPOSE 8080 # My web service port
                        EXPOSE 8000 # Debugging port
                        ENV JPDA_ADDRESS 8000
                        ENV JPDA_TRANSPORT dt_socket
                        ```

Note:
                        Create a container for other developers to be dropped
                        into a command prompt, ready to start debugging from
                        their IDE over HTTP.  If you ever have had to spend
                        days getting your environment configured, I hope you
                        can appreciate how much time and frustration this can
                        save.
                        ---
                        ### Local builds

Use a simple container to build your code and
                        artifacts

```bash
                        docker run --rm \
                            -v /your/code:/workspace \
                            --entrypoint cd /workspace && mvn build \
                                maven:3.5-jdk-7
                        ```

Note:
                        This example will mount your code directory to the
                        Docker container workspace folder, then run a maven
                        build command within it.  Once it is done, the
                        container will be cleaned up.  Since you mounted your
                        code directory to the container, all the files it
                        created during your build will stick around on YOUR
                        file system.
                        ---
                        ### Continuous integration

* Create a build-ready container with static analysis
                        tools, security scan tools, code coverage libraries
                        * Save the container with a friendly name
                        ```bash
                        docker commit <hash> my_java_app:1.0
                        ```
                        * Publish it to [DockerHub](https://hub.docker.com)
                        ```bash
                        docker push my_java_app:1.0
                        ```

Note:
                        Load up a machine with everything you need to automate
                        your project to give you as much information on its
                        health as possible.  Save your machine and push it to
                        either DockerHub or another repository of your
                        choosing.
                        ---
                        ### Continuous integration (cont'd :D)

Source that container into your .gitlab-ci.yml file

```yaml
                        test:my_test_phase:
                            image: my_java_app:1.0
                            script:
                                - ./gradlew build
                                - ./gradlew javadocs
                        ```

Note:
                        Use your new container to automate how you build your
                        application in the future. If you forget something,
                        container versions can be updated by changing to 1.0
                        to a 1.1.

### Container creation

Do this
                        ```Dockerfile
                        ...
                        RUN git clone <repo> \
                            && cd <repo> \
                            && ./gradle assemble
                        ...
                        ```
                        Not this
                        ```Dockerfile
                        ...
                        RUN git clone <repo>
                        RUN cd <repo>
                        RUN ./gradlew assemble
                        ...
                        ```

Note:
                        Docker likes to make what it calls layers.  Layers can
                        be rolled back if they fail to build so you don't have
                        to start a long complicated build from scratch.  This
                        sounds useful, and it is if you plan for it, but
                        normally it just creates unnecessary bloat.
                        ---
                        ### Container creation (cont'd)

Do this
                        ```Dockerfile
                        COPY filename.sh new_filename.sh
                        ```
                        Not this
                        ```Dockerfile
                        ADD filename.sh new_filename.sh
                        ```

Note:
                        The ADD command can behave unexpectedly in some
                        instances, when you probably just wanted to simply add
                        a file to a container.  If that is your intent, use
                        copy.  If you really did want local-only tar file
                        extraction or remote URL support however, then by all
                        means, ADD away.
                        ---
                        ### Container creation (cont'd)

* Use the FROM command to inherit from an image that is 
                        already useful, such as Tomcat, GDAL, PostGIS, etc
                        * Stick to one purpose per container, so scaling
                        becomes easier.  More on that later...
                        * Treat your containers as ephemeral

Note:
                        Don't do extra work if you don't have to, right?  If
                        you don't need a GDAL driver that you can only get by
                        compiling from source, then just start with a known
                        good GDAL container.  Keeping your container's purpose
                        modest becomes very important when you start using them
                        together, as we will talk about later.  Lastly, don't
                        expect a new container to be able to pick up where you
                        last left off with the last one.  Unless you mount a
                        directory with state information, it won't work like
                        you would hope.

### Advanced topics
                        
                        Note:
                        By now you should all be drinking the container
                        kool-aid.  But how can you take your Level 19 CLI
                        Wizarding skills to the 20th level?
                        ---
                        ### Orchestration

Many containers all working together in a coherent way.

Examples:
                        * GeoServer + PostGIS
                        * NodeJS front-end webapp + Tomcat REST endpoint +
                        Postgres data store
                        * Tomcat serving Leaflet + MapProxy + RabbitMQ

Note:
                        Once you separate out your containers by
                        responsibility, you can combine them in ways that make
                        sense.  Maybe your development environment extends
                        GeoServer and uses OSM data stored in PostGIS.  Perhaps
                        you have a NodeJS webapp and want Tomcat and GDAL to
                        handle some raster manipulation.  Maybe Tomcat serves out
                        a Leaflet front end that you have added some custom
                        functionality to create tile packages by sending jobs
                        to a messaging server, which can get picked up by
                        multiple instances of MapProxy.
                        ---
                        ### 12 factor applications

* A dozen helpful considerations to keep in mind when making
                        containers that scale
                        * Some range from the obvious (use SCM, duh) to the
                        kinda-scary (put your credentials in ENV variables)
                        * Quick read: [12factor.net](https://12factor.net)

Note:
                        That brings me to my next point: we should really start
                        to think about building scalable apps from the get-go.
                        These might not all apply to the application you are
                        building but I would urge you to at least keep them in
                        the front of your mind when making a container you
                        think might end up as a legit application.
                        ---
                        ### Alpine

* Small footprint linux distribution, perfect for a
                        production container
                        * No, really, it is SUPER small (_Under 4MB_)
                        * Based on Busybox (apk instead of apt)

Note:
                        To get the most efficiency out of your containerized
                        applications, you want them to be as small as possible.
                        This is where Alpine, a stripped down linux
                        distribution, comes in.  Alpine is based on busybox, so
                        it kinda-sorta acts like android linux.  Many official
                        docker images such as Postgres, Tomcat, and Node, for
                        example, also publish an alpine version of their
                        releases so docker users downstream can use their
                        images in production environments.
                        ---
                        ### Gripes >_<
                        
                        Build environment containers typically run as _root_
                        which a linux host will honor
                        
                        Note:
                        So I find it frustrating when I run my Java build in my
                        build environment container and once it is done, I have
                        a build directory full of files that belong to root.
                        This is because docker containers typically run as the
                        root user.  Windows hosts don't have that problem but I
                        still won't switch, no offense.
                        ---
                        ### Gripes >_< (cont'd)

Containers can only run on like-platforms, i.e.,
                        Windows-Windows, Linux-Linux

Note:
                        If you want to run a Windows host
                        on your linux machine, I am afraid you are out of luck
                        unless you get spicy with a Virtualbox set up.  Windows
                        can get around this problem because the Docker platform
                        there uses Boot2Docker as a base image with Virtualbox.
                        ---
                        ### Gripes >_< (cont'd)

Windows containers are _getting there_...

Note:
                        Windows containers do exist but I admit to having
                        little experience with them.  As far as using them for
                        Windows builds or continuous integration, I can't say
                        they would substantial benefit over a Jenkins machine
                        running your builds.
                        ---
                        ### Gripes >_< (cont'd)

...but they still cause headaches with HyperV and
                        other emulation platforms like Android when your host
                        is Windows

Note:
                        Even when you are on Windows, rocking a linux
                        container, expect some headaches with virtualization
                        bios settings even in modern hardware.  Android
                        emulators don't place nice in the same sandbox. My
                        coworkers have switched to linux on their laptops to
                        avoid this issue.