Testing Nix Packages in Docker

In this blog post, we will cover developing Nix packages and testing them in Docker.  We will set up a container with the proper environment to build Nix packages, and then we will test-build existing packages from the nixpkgs repo.  This lays the foundation for using Docker to test your own Nix packages.

First, a quick introduction to Nix. Nix is a purely functional package manager. Nix expressions, written in a simple functional language, declaratively define each package.  A Nix expression describes everything that goes into a package build action, known as a derivation.  Because it’s a functional language, it’s easy to support building variants of a package: turn the Nix expression into a function and call it any number of times with the appropriate arguments. Due to the hashing scheme, variants don’t conflict with each other in the Nix store.  More details regarding Nix packages and NixOS can be found in the Nix documentation.

To begin, we must write the Dockerfile that will be used to build the development environment.  We need to declare the base image to be used (CentOS 7), install a few dependencies in order to install Nix, and clone the nixpkgs repo.  We also need to set up the nix user and permissions:

FROM centos:7

MAINTAINER "Fernando J Pando" <nando@********.com>

RUN yum -y install bzip2 perl-Digest-SHA git

RUN adduser nixuser && groupadd nixbld && usermod -aG nixbld nixuser

RUN mkdir -m 0755 /nix && chown nixuser /nix

USER nixuser

WORKDIR /home/nixuser

We clone the nixpkgs GitHub repo (this can be set to any fork or branch for testing):

RUN git clone https://github.com/NixOS/nixpkgs.git

We download the Nix installer (latest stable at the time of writing, 1.11.4):

RUN curl -o install-nix-1.11.4 https://nixos.org/releases/nix/nix-1.11.4/install

We are now ready to set up the environment that will allow us to build Nix packages in this container.  There are several environment variables that need to be set for this to work, and we can pull them from the environment script provided by the Nix installation (~/.nix-profile/etc/profile.d/nix.sh).  For easy reference, here is that file:

# Set the default profile.
if ! [ -L "$NIX_LINK" ]; then
    echo "creating $NIX_LINK" >&2
    /nix/store/xmlp6pyxi6hi3vazw9821nlhhiap6z63-coreutils-8.24/bin/ln -s "$_NIX_DEF_LINK" "$NIX_LINK"
fi

export PATH=$NIX_LINK/bin:$NIX_LINK/sbin:$PATH

# Subscribe the user to the Nixpkgs channel by default.
if [ ! -e $HOME/.nix-channels ]; then
    echo " nixpkgs" > $HOME/.nix-channels
fi

# Append ~/.nix-defexpr/channels/nixpkgs to $NIX_PATH so that
# <nixpkgs> paths work when the user has fetched the Nixpkgs
# channel.
export NIX_PATH=${NIX_PATH:+$NIX_PATH:}nixpkgs=$HOME/.nix-defexpr/channels/nixpkgs

# Set $SSL_CERT_FILE so that Nixpkgs applications like curl work.
if [ -e /etc/ssl/certs/ca-certificates.crt ]; then # NixOS, Ubuntu, Debian, Gentoo, Arch
    export SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
elif [ -e /etc/ssl/certs/ca-bundle.crt ]; then # Old NixOS
    export SSL_CERT_FILE=/etc/ssl/certs/ca-bundle.crt
elif [ -e /etc/pki/tls/certs/ca-bundle.crt ]; then # Fedora, CentOS
    export SSL_CERT_FILE=/etc/pki/tls/certs/ca-bundle.crt
elif [ -e "$NIX_LINK/etc/ssl/certs/ca-bundle.crt" ]; then # fall back to cacert in Nix profile
    export SSL_CERT_FILE="$NIX_LINK/etc/ssl/certs/ca-bundle.crt"
elif [ -e "$NIX_LINK/etc/ca-bundle.crt" ]; then # old cacert in Nix profile
    export SSL_CERT_FILE="$NIX_LINK/etc/ca-bundle.crt"
fi

We pull out the relevant environment variables for our CentOS 7 base image:

ENV USER=nixuser

ENV HOME=/home/nixuser

ENV _NIX_DEF_LINK=/nix/var/nix/profiles/default

ENV SSL_CERT_FILE=/etc/pki/tls/certs/ca-bundle.crt

ENV NIX_LINK=$HOME/.nix-profile


ENV NIX_PATH=${NIX_PATH:+$NIX_PATH:}nixpkgs=$HOME/.nix-defexpr/channels/nixpkgs

RUN echo " nixpkgs" > $HOME/.nix-channels
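One subtlety worth noting: the ${NIX_PATH:+$NIX_PATH:} in the NIX_PATH line is standard shell parameter expansion (which Dockerfile ENV also supports), and it only prepends the existing value plus a colon when NIX_PATH is already set. A quick sketch of the behavior:

```shell
# Demonstrates the ${VAR:+word} expansion used in the NIX_PATH line:
# if NIX_PATH is unset or empty, the prefix disappears; if it is set,
# the existing value plus a ':' separator is prepended.
unset NIX_PATH
echo "${NIX_PATH:+$NIX_PATH:}nixpkgs=/home/nixuser/.nix-defexpr/channels/nixpkgs"
# prints: nixpkgs=/home/nixuser/.nix-defexpr/channels/nixpkgs

NIX_PATH="existing=/some/path"
echo "${NIX_PATH:+$NIX_PATH:}nixpkgs=/home/nixuser/.nix-defexpr/channels/nixpkgs"
# prints: existing=/some/path:nixpkgs=/home/nixuser/.nix-defexpr/channels/nixpkgs
```

This way NIX_PATH entries never end up glued together or with a stray leading colon.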


Now that the environment has been set up, we are ready to install nix:

RUN bash /home/nixuser/install-nix-1.11.4

The last step is to set the working directory to the cloned github nixpkgs directory, so testing will execute from there when the container is run:

WORKDIR /home/nixuser/nixpkgs

At this point, we are ready to build the container:

docker build -t nand0p/nixpkgs-devel .

Alternatively, you can pull this container image from Docker Hub:

docker pull nand0p/nixpkgs-devel

(NOTE: The Docker Hub image will contain a version of the nixpkgs repo from the time the container image was built.  If you are testing on the bleeding edge, always build the container fresh before testing.)

Once the container is ready, we can now begin test building an existing nix package:

docker run -ti nand0p/nixpkgs-devel nix-build -A nginx

The above command will test-build the nginx package inside your new Docker container, along with all of its dependencies.  In order to test the package interactively and ensure the resulting binary works as expected, the container can be launched with bash:

docker run -ti nand0p/nixpkgs-devel bash
[nixuser@aa2c8e29c5e9 nixpkgs]$

Once you have shell inside the container, you can build and test run the package:

[nixuser@aa2c8e29c5e9 nixpkgs]$ nix-build -A nginx

[nixuser@aa2c8e29c5e9 nixpkgs]$ /nix/store/7axviwfzbsqy50zznfxb7jzfvmg9pmwx-nginx-1.10.1/bin/nginx -V
nginx version: nginx/1.10.1
built by gcc 5.4.0 (GCC)
built with OpenSSL 1.0.2h 3 May 2016
TLS SNI support enabled
configure arguments: --prefix=/nix/store/7axviwfzbsqy50zznfxb7jzfvmg9pmwx-nginx-1.10.1 --with-http_ssl_module --with-http_v2_module --with-http_realip_module --with-http_addition_module --with-http_xslt_module --with-http_image_filter_module --with-http_geoip_module --with-http_sub_module --with-http_dav_module --with-http_flv_module --with-http_mp4_module --with-http_gunzip_module --with-http_gzip_static_module --with-http_auth_request_module --with-http_random_index_module --with-http_secure_link_module --with-http_degradation_module --with-http_stub_status_module --with-ipv6 --with-file-aio --add-module=/nix/store/vbfb8z3hgymbaz59wa54qf33yl84jii7-nginx-rtmp-module-v1.1.7-src --add-module=/nix/store/ps5s8q9v91l03972gw0h4l9iazx062km-nginx-dav-ext-module-v0.0.3-src --add-module=/nix/store/46sjsnbkx7rgwgn3pgigcfaygrs2cx30-headers-more-nginx-module-v0.26-src --with-cc-opt='-fPIE -fstack-protector-all --param ssp-buffer-size=4 -O2 -D_FORTIFY_SOURCE=2' --with-ld-opt='-pie -Wl,-z,relro,-z,now'

Thanks for reading, and stay tuned for part II, where we will modify a nix package for our custom needs, and test it against our own fork of the nixpkgs repo.

Refactoring CD Pipelines – Part 1: Chef-Solo in AWS AutoScaling Groups

We often overlook similarities between new CD Pipeline code we’re writing today and code we’ve already written. In addition, we might sometimes rely on the ‘copy, paste, then modify’ approach to make quick work of CD Pipelines supporting similar application architectures. Despite the short-term gains, what often results is code sprawl and maintenance headaches. This blog post series is aimed at helping to reduce that sprawl and improve code re-use across CD Pipelines.

This mini-series references two real but simple applications, hosted in Github, representing the kind of organic growth phenomenon we often encounter and sometimes even cause, despite our best intentions. We’ll refactor them each a couple of times over the course of this series, with this post covering CloudFormation template reuse.

Chef’s a great tool… let’s misuse it!

During the deployment of an AWS AutoScaling Group, we typically rely on the user data section of the Launch Configuration to configure new instances in an automated, repeatable way. A common automation practice is to use chef-solo to accomplish this. We carefully chose chef-solo as a great tool for immutable infrastructure approaches. Both applications have CD Pipelines that leverage it as a scale-time configuration tool by reading a JSON document describing the actions and attributes to be applied.

It’s all roses

It’s a great approach: we sprinkle in a handful or two of CloudFormation parameters to support our Launch Configuration, embed the chef-solo JSON in the user data, and decorate it with references to the CloudFormation parameters. Voila, we’re done! The implementation hardly took any time (probably less than an hour per application if you could find good examples on the internet), and each time we need a new CD Pipeline, we can just stamp out a new CloudFormation template.

Figure 1: Launch Configuration user data (as plain text)


Figure 2: CloudFormation parameters (corresponding to Figure 1)

Well, it’s mostly roses…

Why is it, then, that a few months and a dozen or so CD Pipelines later, we’re spending all our time debugging and doing maintenance on what should be minor tweaks to our application configurations? New configuration parameters take hours of trial and error, and new application pipelines can be copied and pasted into place, but even then it takes hours to scrape out the previous application’s specific needs from its CloudFormation template and replace them.

Fine, it’s got thorns, and they’re slowing us down

Maybe our great solution could have been better? Let’s start with the major pitfall to our original approach: each application we support has its own highly-customized CloudFormation template.

  • lots of application-specific CFN parameters exist solely to shuttle values to the chef-solo JSON
  • fairly convoluted user data, containing an embedded JSON structure and parameter references, is a bear to maintain
  • tracing parameter values from the CD Pipeline, traversing the CFN parameters into the user data… that’ll take some time to debug when it goes awry

One path to code reuse

Since we’re referencing two real GitHub application repositories that demonstrate our current predicament, we’ll continue using those repositories to present our solution via a code branch named Phase1 in each repository. At this point, we know our applications share enough of a common infrastructure approach that they should be sharing that part of the CloudFormation template.

The first part of our solution will be to extract the ‘differences’ from the CloudFormation templates between these two application pipelines. That should leave us with a common skeleton to work with, minus all the Chef-specific items and user data, which will allow us to push the CFN template into an S3 bucket to be shared by both application CD pipelines.

The second part will be to add back the required application specificity, but in a way that migrates those differences from the CloudFormation templates to external artifacts stored in S3.

Taking it apart

Our first tangible goal is to make the user data generic enough to support both applications. We start by moving the inline chef-solo JSON to its own plain JSON document in each application’s pipeline folder (/pipelines/config/app-config.json). Later, we’ll modify our CD pipelines so they can make application and deployment-specific versions of that file and upload it to an S3 bucket.
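As a concrete illustration, the extracted app-config.json is just an ordinary chef-solo node document. The run list and attribute names below are hypothetical placeholders, not the actual contents of either repository:

```json
{
  "run_list": [
    "recipe[my_app::default]"
  ],
  "my_app": {
    "environment": "staging",
    "artifact_url": "https://example.com/artifacts/my-app.war"
  }
}
```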

Figure 3: Before/after comparison (diff) of our Launch Configuration User Data

Left: original user data; Right: updated user data

The second goal is to make a single, vanilla CloudFormation template. Since we orphaned these Chef-only CloudFormation parameters by removing the parts of the user data referencing them, we can remove them. The resulting template’s focus can now be on meeting the infrastructure concerns of our applications.

Figure 4: Before/after comparison (diff) of the CloudFormation parameters required

Left: original CFN parameters; Right: pared-down parameters


At this point, we have eliminated all the differences between the CloudFormation templates, but now they can’t configure our application! Let’s fix that.

Reassembling it for reuse

Our objective now is to make our Launch Configuration user data truly generic so that we can actually reuse our CloudFormation template across both applications. We do that by scripting it to download the JSON that Chef needs from a specified S3 bucket. At the same time, we enhance the CD Pipelines by scripting them to create application and deploy-specific JSON, and to push that JSON to our S3 bucket.
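Sketched at a pseudocode level, the generic user data boils down to two steps; the bucket and variable names here are illustrative, not the repositories' actual code:

```shell
# 1. Fetch the deploy-specific Chef JSON that the CD pipeline uploaded.
aws s3 cp "s3://${CONFIG_BUCKET}/${CHEF_JSON_KEY}" /tmp/node.json

# 2. Converge the instance with chef-solo using that document.
chef-solo -c /etc/chef/solo.rb -j /tmp/node.json
```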

Figure 5: Chef JSON stored as a deploy-specific object in S3

The S3 key is unique to the deployment.

To stitch these things together, we add back one CloudFormation parameter, ChefJsonKey, required by both CD Pipelines; its value at execution time will be the S3 key from which the Chef JSON will be downloaded. (Since our CD Pipeline has created that file, it’s primed to provide that parameter value when it executes the CloudFormation stack.)
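For example, a pipeline might derive the deploy-specific key for ChefJsonKey with nothing more than string assembly; the naming scheme below is an assumption for illustration:

```shell
# Assemble a deploy-specific S3 key for the Chef JSON.
# APP_NAME and DEPLOY_ID would come from the CD pipeline's context.
APP_NAME="my-app"
DEPLOY_ID="deploy-42"
CHEF_JSON_KEY="${APP_NAME}/${DEPLOY_ID}/app-config.json"
echo "$CHEF_JSON_KEY"
# prints: my-app/deploy-42/app-config.json
```

The same value is then passed as the ChefJsonKey parameter when the pipeline executes the CloudFormation stack.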

Two small details are left. First, we give our AutoScaling Group instances the ability to download from that S3 bucket. Second, now that we’re convinced our CloudFormation template is as generic as it needs to be, we upload it to S3 and have our CD Pipelines reference it as an S3 URL.

Figure 6: Our S3 bucket structure ‘replaces’ the /pipeline/config folder 

The templates can be maintained in GitHub.

That’s a wrap

We now have a vanilla CloudFormation template that supports both applications. When an AutoScaling group scales up, the new servers will download a Chef JSON document from S3 in order to execute chef-solo.  We were able to eliminate that template from both application pipelines and still get all the benefits of Chef-based server configuration.

See these GitHub repositories referenced throughout the article:

In Part 2 of this series, we’ll continue our refactoring effort with a focus on the CD Pipeline code itself.

Authors: Jeff Dugas and Matt Adams

Interested in working with (and sometimes misusing) configuration management tools like Chef, Puppet, and Ansible? Stelligent is hiring!


Containerized CI Solutions in AWS – Part 1: Jenkins in ECS

In this first post of a series exploring containerized CI solutions, I’m going to be addressing the CI tool with the largest market share in the space: Jenkins. Whether you’re already running Jenkins in a more traditional virtualized or bare metal environment, or if you’re using another CI tool entirely, I hope to show you how and why you might want to run your CI environment using Jenkins in Docker, particularly on Amazon EC2 Container Service (ECS). If I’ve done my job right and all goes well, you should have run a successful Jenkins build on ECS well within a half hour from now!

For more background information on ECS and provisioning ECS resources using CloudFormation, please feel free to check out Stelligent’s two-part blog series on the topic.

An insanely quick background on Jenkins

Jenkins is an open source CI tool written in Java. One of its strengths is the very large collection of plugins available, including one for ECS. The Amazon EC2 Container Service Plugin can launch containers on your ECS cluster that automatically register themselves as Jenkins slaves, execute the appropriate Jenkins job on the container, and then automatically remove the container/build slave afterwards.

But, first…why?

But before diving into the demo, why would you want to run your CI builds in containers? First, containers are portable, which, especially when also utilizing Docker for your development environment, will give you a great deal of confidence that if your application builds in a Dockerized CI environment, it will build successfully locally and vice-versa. Next, even if you’re not using Docker for your development environment, a containerized CI environment will give you the benefit of an immutable build infrastructure where you can be sure that you’re building your application in a new ephemeral environment each time. And last but certainly not least, provisioning containers is very fast compared to virtual machines, which is something that you will notice immediately if you’re used to spinning up VMs/cloud instances for build slaves like with the Amazon EC2 Plugin.

As for running the Jenkins master on ECS, one benefit is fast recovery if the Jenkins EC2 instance goes down. When using EFS for Jenkins state storage and a multi-AZ ECS cluster like in this demo, the Jenkins master will recover very quickly in the event of an EC2 container instance failure or AZ outage.

Okay, let’s get down to business…

Let’s begin: first launch the provided CloudFormation stack by clicking the button below:


You’ll have to enter these parameters:

  • AvailabilityZone1: an AZ that your AWS account has access to
  • AvailabilityZone2: another accessible AZ in the same region as AvailabilityZone1
  • InstanceType: EC2 instance type for ECS container instances (must be at least t2.small for this demo)
  • KeyPair: a key pair that will allow you to SSH into the ECS container instances, if necessary
  • PublicAccessCIDR: a CIDR block that will have access to view the public Jenkins proxy and SSH into container instances
    • NOTE: Jenkins will not automatically be secured by a user and password, so this parameter can be used to secure your Jenkins master by limiting network access to the provided CIDR block.  If you’d like to limit access to Jenkins to only your public IP address, enter “[YOUR_PUBLIC_IP_ADDRESS]/32” here, or if you’d like to allow access to the world (and then possibly secure Jenkins yourself afterwards) enter “0.0.0.0/0”.
It’s almost this easy, but just click Launch Stack once

Okay, the stack is launching—so what’s going on here?

Ohhh, now I get it

In a nutshell, this CloudFormation stack provisions a VPC containing a multi-AZ ECS cluster, and a Jenkins ECS service that uses Amazon Elastic File System (Amazon EFS) storage to persist Jenkins data. For ease of use, this CloudFormation stack also contains a basic NGINX reverse proxy that allows you to view Jenkins via a public endpoint. Jenkins and NGINX each consist of an ECS service, an ECS task definition, and a classic ELB (internal for Jenkins, and Internet-facing for the proxy).

In actuality, I think that a lot of organizations would choose to keep Jenkins internal in a private subnet and rely on a VPN for outside access to Jenkins. Instead, to keep things relatively simple, this stack only creates public subnets and relies on security groups for network access control.


There are a couple of reasons why running a Jenkins master on ECS is a bit complicated. One is an ECS limitation that allows you to associate only one load balancer with an ECS service, while Jenkins runs as a single Java application that listens for web traffic on one port and for JNLP connections from build slaves on another port (defaults are 8080 and 50000, respectively). When launching a workload in ECS, using an Elastic Load Balancer for service discovery as I’m doing in this example, and provisioning using CloudFormation, you need to use a Classic Load Balancer that listens on both Jenkins ports (listening on multiple ports is not currently possible with the recently released Application Load Balancer).
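In CloudFormation, that dual-port requirement looks roughly like the following Classic Load Balancer fragment. This is a sketch using Jenkins' default ports; the resource name, scheme, and subnet references are illustrative, not the demo stack's actual template:

```json
"JenkinsELB": {
  "Type": "AWS::ElasticLoadBalancing::LoadBalancer",
  "Properties": {
    "Scheme": "internal",
    "Listeners": [
      { "LoadBalancerPort": "8080",  "InstancePort": "8080",  "Protocol": "HTTP" },
      { "LoadBalancerPort": "50000", "InstancePort": "50000", "Protocol": "TCP" }
    ],
    "Subnets": [ { "Ref": "Subnet1" }, { "Ref": "Subnet2" } ]
  }
}
```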

Another complication is that Jenkins stores its state in XML on disk, as opposed to some other CI tools that allow you to use an external database to store state (examples coming later in this blog series). This is why I chose to use EFS in this stack—when requiring persistent data in an ECS container, you must be able to sync Docker volumes between your ECS container instances because a container for your service can run on any container instance in the cluster. EFS provides a valuable solution to this issue by allowing you to mount an NFS file system that is shared amongst all the container instances in your cluster.

Coffee break!

Depending on how long you took to digest that fancy diagram and my explanation, feel free to grab a cup of coffee; the stack took about 7-8 minutes to complete successfully during my testing. When you see that beautiful CREATE_COMPLETE in the stack status, continue on.

Jenkins configuration

One of the CloudFormation stack outputs is PublicJenkinsURL; navigate to that URL in your browser and you should see the Jenkins home page (at least within a minute, once the instance is in service):


To make things easier, let’s click ENABLE AUTO REFRESH (in the upper-right) right off the bat.

Then click Manage Jenkins > Manage Plugins, navigate to the Available tab, and select these two plugins (you can filter the plugins by each name in the Filter text box):

  • Amazon EC2 Container Service Plugin
  • Git plugin
    • NOTE: there are a number of “Git” plugins, but you’ll want the one that’s just named “Git plugin”

And click Download now and install after restart.

Select the Restart Jenkins when installation is complete and no jobs are running checkbox at the bottom, and Jenkins will restart after the plugins are downloaded.


When Jenkins comes back after restarting, go back to the Jenkins home screen, and navigate to Manage Jenkins > Configure System.

Scroll down to the Cloud section, click Add a new cloud > Amazon EC2 Container Service Cloud, and enter the following configuration (substituting the CloudFormation stack output where indicated):

  • Name: ecs
  • Amazon ECS Credential: – none –  (because we’re using the IAM role of the container instance instead)
  • Amazon ECS Region Name: us-east-1  (or the region you launched your stack in)
  • ECS Cluster: [CloudFormation stack output: JenkinsConfigurationECSCluster]
  • Click Advanced…
  • Alternative Jenkins URL: [CloudFormation stack output: JenkinsConfigurationAlternativeJenkinsURL]
  • Click ECS slave templates > Add
    • Label: jnlp-slave-with-java-build-tools
    • Docker Image: cloudbees/jnlp-slave-with-java-build-tools:latest
    • Filesystem root: /home/jenkins
    • Memory: 512
    • CPU units: 512

And click Save at the bottom.

That should take you back to the Jenkins home page again. Now click New Item, and enter an item name of aws-java-sample, select Freestyle project, and click OK.


Enter the following configuration:

  • Make sure Restrict where this project can be run is selected and set:
    • Label Expression: jnlp-slave-with-java-build-tools
  • Under Source Code Management, select Git and enter:
    • Repository URL: https://github.com/awslabs/aws-java-sample.git
  • Under Build, click Add build step > Execute shell, and set:
    • Command: mvn package

Click Save.

That’s it for the Jenkins configuration. Now click Build Now on the left side of the screen.


Under Build History, you’re going to see a “pending – waiting for next available executor” message, which will switch to a progress bar when the ECS container starts.  When the progress bar appears (it might take a couple of minutes for the first build while ECS downloads the Docker build slave image, but after this it should only take a few seconds when the image is cached on your ECS container instance), click it and you’ll see the console output for the build:


And Success!

Okay, Maven is downloading a bunch of dependencies…and more dependencies…and more dependencies…and finally building…and see that “Finished: SUCCESS”? Congratulations, you just ran a build in an ECS Jenkins build slave container!

Next Steps

One thing that you may have noticed is that we used a Docker image provided by CloudBees (the enterprise backers of Jenkins). For your own projects, you might need to build and use a custom build slave Docker image. You’ll probably want to set up a pipeline for each of these Docker builds (and possibly publish to Amazon ECR), and configure an ECS slave template that uses this custom image. One caveat: Jenkins slaves need to have Java installed, which, depending on your build dependencies, may increase the size of your Docker image somewhat significantly (well, relatively so for a Docker image). For reference, check out the Dockerfile of a bare-bones Jenkins build slave provided by the Jenkins project on Docker Hub.

Next Next Steps

Pretty cool, right? Well, while it’s the most popular, Jenkins isn’t the only player in the game—stay tuned for a further exploration and comparison of containerized CI solutions on AWS in this blog series!

Interested in Docker, Jenkins, and/or working someplace where your artful use of monkey GIFs will finally be truly appreciated? Stelligent is hiring!

DevOps in AWS Radio: Orchestrating Docker containers with AWS ECS, ECR and CodePipeline (Episode 4)

In this episode, Paul Duvall and Brian Jakovich from Stelligent cover recent DevOps in AWS news and speak about the AWS EC2 Container Service (ECS), AWS EC2 Container Registry (ECR), HashiCorp Consul, AWS CodePipeline, and other tools in providing Docker-based solutions for customers. Here are the show notes:

DevOps in AWS News

Episode Topics

  1. Benefits of using ECS, ECR, Docker, etc.
  2. Components of ECS, ECR and Service Discovery
  3. Orchestrating and automating the deployment pipeline using CloudFormation, CodePipeline, Jenkins, etc. 

Blog Posts

  1. Automating ECS: Provisioning in CloudFormation (Part 1)
  2. Automating ECS: Orchestrating in CodePipeline and CloudFormation (Part 2)

About DevOps in AWS Radio

On DevOps in AWS Radio, we cover topics around applying DevOps principles and practices such as Continuous Delivery in the Amazon Web Services cloud. This is what we do at Stelligent for our customers. We’ll bring listeners into our roundtables and speak with engineers who’ve recently published on our blog and we’ll also be reaching out to the wider DevOps in AWS community to get their thoughts and insights.

The overall vision of this podcast is to describe how listeners can create a one-click (or “no click”) implementation of their software systems and infrastructure in the Amazon Web Services cloud so that teams can deliver software to users whenever there’s a business need to do so. The podcast will delve into the cultural, process, tooling, and organizational changes that can make this possible including:

  • Automation of
    • Networks (e.g. VPC)
    • Compute (EC2, Containers, Serverless, etc.)
    • Storage (e.g. S3, EBS, etc.)
    • Database and Data (RDS, DynamoDB, etc.)
  • Organizational and Team Structures and Practices
  • Team and Organization Communication and Collaboration
  • Cultural Indicators
  • Version control systems and processes
  • Deployment Pipelines
    • Orchestration of software delivery workflows
    • Execution of these workflows
  • Application/service Architectures – e.g. Microservices
  • Automation of Build and deployment processes
  • Automation of testing and other verification approaches, tools and systems
  • Automation of security practices and approaches
  • Continuous Feedback systems
  • Many other Topics…

Introduction to Serverspec: What is Serverspec and How do we use it at Stelligent? (Part 1)

In this three-part series, I’ll explain Serverspec infrastructure testing, provide some concrete examples demonstrating how we use it at Stelligent, and discuss how Stelligent has extended Serverspec with custom AWS resources and matchers. We will also walk through writing some new matchers for an AWS resource. If you’re new to Serverspec, don’t worry, I’ll be covering the basics!

What is Serverspec?

Serverspec is an integration testing framework written on top of the Ruby RSpec DSL, with custom resources and matchers that form expectations targeted at infrastructure. Serverspec tests verify the actual state of your infrastructure (bare-metal servers, virtual machines, cloud resources) and ask the question: is it configured correctly? Tests can be driven by many of the popular configuration management tools, like Puppet, Ansible, CFEngine, and Itamae.

Serverspec allows for infrastructure code to be written using Test Driven Development (TDD), by expressing the state that the infrastructure must provide and then writing infrastructure code that implements those expectations. The biggest benefit is that with a suite of Serverspec expectations in place, developers can refactor infrastructure code with a high degree of confidence that a change does not produce any undesirable side-effects.

Anatomy of a Serverspec test

Since Serverspec is an extension on top of RSpec, it shares the same DSL syntax and language features as RSpec. The commonly used features of the DSL are outlined below. For suggestions on writing better specs, see the appropriately named Better Specs project.

describe 'testGroupName' do
  it 'testName' do
    expect(resource).to matcher 'matcher_parameter'
  end
end

Concrete example:

describe 'web site' do
  it 'responds on port 80' do
    expect(port 80).to be_listening 'tcp'
  end
end

Describe blocks can be nested to provide a hierarchical description of behavior, such as:

describe 'DevopsEngineer' do
  describe 'that test infrastructure' do
    it 'can refactor easily' do
    end
  end
end

However, the context block is an alias for describe and provides richer semantics. The above example can be re-written using context instead of nested describes.

describe 'DevopsEngineer' do
  context 'that test infrastructure' do
    it 'can refactor easily' do
    end
  end
end

There are two syntaxes that can be used for expressing expectations: should and expect. The should form was deprecated with RSpec 3.0, because it can produce unexpected results for some edge cases, which are detailed in this RSpec blog post. Better Specs advises that new projects always use the expect syntax. The Serverspec website’s examples use the should syntax as a matter of preference, but Serverspec works well with expect. Even though the should syntax is more concise and arguably more readable, this post adheres to the expect syntax, since the should syntax is deprecated and has potential for edge-case failures. Examples of both are provided because the reader is likely to see both in practice.

Expect Syntax:

describe 'web site' do
  it 'responds on port 80' do
    expect(port 80).to be_listening
  end
end

Should Syntax:

describe port(80) do
  it { should be_listening }
end

A resource, typically represented as a Ruby class, is the first argument to the expect block and represents the unit under test about which the assertions must be true or false. Matchers form the basis of positive and negative expectations/assertions, acting as predicates on a resource. They are implemented as custom classes or methods on the resource, and they can take parameters so that matchers can be generalized and configured. In the case of the port resource, its be_listening matcher can be passed any one of the following parameters: tcp, udp, tcp6, or udp6, like this: be_listening 'tcp'

Positive and negative cases are achieved when matchers are combined with either expect(…).to or expect(…).not_to. Serverspec provides custom resources and matchers for integration testing of infrastructure code, and Stelligent has created some custom resources for AWS, called serverspec-aws-resources.  The following shows the basic expectation form and a concrete example.

Basic form:

expect(resource).to matcher optional_arguments

Example using port resource:

expect(port 80).to be_listening 'tcp'

Test driven development with Serverspec

At Stelligent, we treat infrastructure as code and that code should be written using the same design principles as traditional application software. In that spirit we employ test driven development (TDD) when writing infrastructure code. TDD helps focus our attention on what the infrastructure code needs to do and sets a clear definition of done. It promotes a lean and agile development process where we only build enough infrastructure code to make the tests pass.

A great example of how Serverspec supports test driven development is writing the Chef cookbooks that are used to build AMIs on some projects at Stelligent. Locally, a developer can use Serverspec, Test Kitchen, Vagrant, and a virtualization provider to write the Chef cookbook. A great tutorial on getting set up with these tools and writing your first test is available in Learning Chef: A Guide to Configuration Management and Automation.

Using this toolchain allows the developer to adhere to the four pillars of test-driven infrastructure: writing, running, provisioning, and feedback. These pillars are necessary to accomplish the red-green-refactor technique, the basis of TDD.

First, we make it RED by writing a failing Serverspec test and running kitchen verify. Test Kitchen then takes care of running the targeted platform on the virtualization provider, provisioning the platform, and providing feedback. At this point we have not written the Chef cookbook to pass the Serverspec test, so Test Kitchen reports a failure as the feedback.

Then, we make it GREEN by writing a cookbook that minimally satisfies the test requirements. After the node is converged with the Chef cookbook we can run kitchen verify and see that the test passes. We are now safe to refactor with the confidence that even a trivial change will not produce an unintended side-effect.

The same Serverspec tests that were used locally are then used during the build pipeline to verify the actual state of the infrastructure resources in AWS. On every commit, the continuous delivery pipeline executes the Serverspec integration tests against running provisioned resources and fails the build stage if an error occurs. This automated testing provides short iteration cycles and a high degree of confidence on each commit, supporting an agile methodology for infrastructure development and an evolutionary architecture process.
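As a rough sketch of how a pipeline stage can gate on the test suite's exit code, consider the following. The `run_integration_tests` wrapper is hypothetical; in a real stage it would invoke `rspec` against the provisioned host:

```shell
#!/usr/bin/env bash
# Sketch of a CD-pipeline gate: run the integration test suite against a
# target host and fail the stage on any error. In a real pipeline the
# wrapper below would run something like:
#   TARGET_HOST="$1" rspec spec/ --format documentation
set -euo pipefail

run_integration_tests() {
  # Hypothetical wrapper; simulated as passing for this sketch.
  return 0
}

if run_integration_tests "ec2-instance.example.com"; then
  echo "stage=PASSED"
else
  echo "stage=FAILED"
  exit 1   # a non-zero exit fails the pipeline stage
fi
```

Because the stage exits non-zero on failure, the CD orchestrator halts the pipeline before the AMI is promoted.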

How we utilize Serverspec at Stelligent

Provisioning of AMIs

Provisioning AMIs in the continuous delivery pipeline is done with a combination of Packer and Chef cookbooks. Packer is a tool used in our continuous delivery pipelines to build AWS machine images from a declarative JSON document that describes the provisioning. The Chef provisioner is declared in the Packer config file along with the Chef run-list. Building the AMI is typically done in three stages: build, test, and promote. During the build stage, Packer launches an EC2 instance from the base AMI and executes each of the provisioners in the Packer config file; in the case of the Chef provisioner, it converges the running EC2 node. When Packer completes successfully, the AMI is stored in AWS and its ID is passed to the test stage.

The test stage launches a new EC2 instance from the AMI ID produced by the build stage and executes the Serverspec tests against the running EC2 instance. If the tests pass, the AMI is promoted and stored in AWS, available for other pipelines to utilize as an artifact. In order to promote code reuse and speed up AMI pipelines, we have broken down AMI creation into multiple pipelines that depend on the results of one another. It looks something like this: base-ami > hardened-ami > application-ami. Each of these AMI pipelines has its own Serverspec tests that ensure the AMI's requirements are met. We do the same thing for our Jenkins servers, with additional tests to verify that the instance's jobs are configured correctly.

Documenting existing servers

We document existing infrastructure before moving it to AWS. A pattern that we use is to elicit a client specification of the existing infrastructure and then translate it to an executable Serverspec definition. This allows us to test the spec against the existing infrastructure to see if there are any errors in the client provided specification.

For example, we may be told the client has a CentOS instance running a Tomcat web server to serve their web application. The specification is translated into a suite of Serverspec tests and executed against the existing infrastructure, and the test asserting that Tomcat is installed fails. After debugging the instance, we find that it is actually running the JBoss application server. We then update the test to match reality and execute it again. After this explorative process is done, we have an executable specification that we can use to test the existing infrastructure, and also use as a guide when building our new infrastructure in AWS.

In conclusion

Serverspec is an invaluable tool in your toolkit to support test-driven infrastructure. It's built on top of the widely adopted Ruby RSpec DSL. It can be executed both locally to support development and as part of your continuous delivery pipeline, and it allows you to take a test driven development approach to writing infrastructure code. The resulting suite of tests forms a specification that, when automated as part of a CD pipeline, provides a high degree of confidence in the infrastructure code.

Deploying Kubernetes on CoreOS to AWS

Linux containers are a relatively new abstraction framework with exciting implications for Continuous Integration and Continuous Delivery patterns. They allow appropriately designed applications to be tested, validated, and deployed in an immutable fashion at much greater speed than with traditional virtual machines. When it comes to production use, however, an orchestration framework is desirable to maintain a minimum number of container workers, load balance between them, schedule jobs, and the like. An extremely popular method of doing this is to use AWS EC2 Container Service (ECS) with the Amazon Linux distribution; however, if you find yourself making the “how do we run containers” decision, it pays to explore other technology stacks as well.

In this demo, we’ll launch a Kubernetes cluster of CoreOS on AWS. Kubernetes is a container orchestration platform that utilizes Docker to provision, run, and monitor containers on Linux. It is developed primarily by Google, and is similar to container orchestration projects they run internally. CoreOS is a lightweight Linux distribution optimized for container hosting. Related projects are Docker Inc’s “Docker Datacenter,” RedHat’s Atomic, and RancherOS.

1) Download the kubernetes package. Releases are at . This demo assumes version 1.1.8.

2) Download the coreos-kubernetes package. Releases are at . This demo assumes version 0.4.1.

3) Extract both. coreos-kubernetes provides a kube-aws binary used for provisioning a Kubernetes cluster in AWS (using CloudFormation), while the kubernetes package is used for its kubectl binary.

tar xzf kubernetes.tar.gz
tar xzf kube-aws-PLATFORM-ARCH.tar.gz (kube-aws-darwin-amd64.tar.gz for Macs, for instance)

4) Setup your AWS credentials and profiles

5) Generate a KMS key to use for the cluster (change the region from us-west-1 if desired, but you will need to change it everywhere). Make a note of the ARN of the generated key; it will be used in the cluster.yaml later.

aws kms --region=us-west-1 create-key --description="kube-aws assets"
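The create-key call returns a JSON document. As a sketch of pulling the ARN out for later use (the response below is a stand-in sample, since live AWS credentials are not assumed here):

```shell
#!/usr/bin/env bash
# Sketch: extract the KeyMetadata.Arn field from `aws kms create-key`
# output. The JSON below is a sample stand-in; in practice you would
# capture it with:
#   response=$(aws kms --region=us-west-1 create-key --description="kube-aws assets")
response='{
  "KeyMetadata": {
    "KeyId": "12345678-1234-1234-1234-123456789012",
    "Arn": "arn:aws:kms:us-west-1:111122223333:key/12345678-1234-1234-1234-123456789012"
  }
}'

# Pull the Arn value out with sed (jq would be cleaner if available:
#   kms_arn=$(echo "$response" | jq -r .KeyMetadata.Arn)
kms_arn=$(echo "$response" | sed -n 's/.*"Arn": "\([^"]*\)".*/\1/p')
echo "KMS ARN: $kms_arn"
```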

6) Get a sample cluster.yaml. This is a configuration file later used for generating the AWS CloudFormation scripts and associated resources used to launch the cluster.

mkdir my-cluster; cd my-cluster
~/darwin-amd64/kube-aws init --cluster-name=YOURCLUSTERNAME \
 --external-dns-name=FQDNFORCLUSTER --region=us-west-1 \
 --availability-zone=us-west-1c --key-name=VALIDEC2KEYNAME \
 --kms-key-arn="arn:aws:kms:us-west-1:xxxxxxxxxx:key/xxxxxxxxxxxxxxxxxxx"

7) Modify the cluster.yaml with appropriate settings. “externalDNSName” expects an FQDN that will either be configured automatically, if you provide a Route53 zone ID for “hostedZoneId”, or that you will configure AFTER provisioning has completed. This becomes the Kubernetes controller endpoint used by the Kubernetes control tooling.

Note that a new VPC is created for the Kubernetes cluster unless you configure it to use an existing VPC. You specify a region in the cluster.yaml, and if you don’t specify an Availability Zone then the “A” AZ will be used by default.
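Putting the settings above together, the relevant cluster.yaml fields look roughly like this (a sketch: key names follow the kube-aws 0.4.x sample file, and every value is a placeholder):

```yaml
# Sketch of a minimal cluster.yaml; all values are placeholders.
clusterName: YOURCLUSTERNAME
externalDNSName: FQDNFORCLUSTER      # the controller endpoint FQDN
hostedZoneId: ZXXXXXXXXXXXXX         # optional: lets kube-aws manage Route53
keyName: VALIDEC2KEYNAME
region: us-west-1
availabilityZone: us-west-1c
kmsKeyArn: "arn:aws:kms:us-west-1:xxxxxxxxxx:key/xxxxxxxxxxxxxxxxxxx"
```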

8) Render the CFN templates, validate them, then launch the cluster:

~/darwin-amd64/kube-aws render
~/darwin-amd64/kube-aws validate
~/darwin-amd64/kube-aws up


This will set up a short-term Certificate Authority (365 days) and SSL certs (90 days) for communication, and then launch a cluster via CloudFormation. It also stores data about the cluster for use with kubectl.

9) After the cluster has come up, an EIP will be output. Assign this EIP to the FQDN you used for externalDNSName in cluster.yaml if you did not allow kube-aws to configure this automatically via Route53. This is important, as it's how the tools will try to reach the cluster.

10) You can then start playing with the cluster. My sample session:

# Display active Kubernetes nodes
~/kubernetes/platforms/darwin/amd64/kubectl --kubeconfig=kubeconfig get nodes
NAME         STATUS                     AGE
<node-1>     Ready,SchedulingDisabled   19m
<node-2>     Ready                      19m
<node-3>     Ready                      19m
# Display name and EIP of the cluster
~/darwin-amd64/kube-aws status
Controller IP: a.b.c.d
# Launch the "nginx" Docker image as container instance "my-nginx"
# 2 replicas, wire port 80
~/kubernetes/platforms/darwin/amd64/kubectl --kubeconfig=kubeconfig run my-nginx --image=nginx --replicas=2 --port=80
deployment "my-nginx" created
# Show process list
~/kubernetes/platforms/darwin/amd64/kubectl --kubeconfig=kubeconfig get po
NAME                        READY     STATUS    RESTARTS   AGE
my-nginx-2494149703-2dhrr   1/1       Running   0          2m
my-nginx-2494149703-joqb5   1/1       Running   0          2m
# Expose port 80 on the my-nginx instances via an Elastic Load Balancer
~/kubernetes/platforms/darwin/amd64/kubectl --kubeconfig=kubeconfig expose deployment my-nginx --port=80 --type=LoadBalancer
service "my-nginx" exposed
# Show result for the service
~/kubernetes/platforms/darwin/amd64/kubectl --kubeconfig=kubeconfig get svc my-nginx -o wide
my-nginx 10.x.0.y 80/TCP 3m run=my-nginx
# Describe the my-nginx service. This will show the CNAME of the ELB that
# was created and which exposes port 80
~/kubernetes/platforms/darwin/amd64/kubectl --kubeconfig=kubeconfig describe service my-nginx
Name: my-nginx
Namespace: default
Labels: run=my-nginx
Selector: run=my-nginx
Type: LoadBalancer
IP: 10.x.0.y
LoadBalancer Ingress:
Port: <unset> 80/TCP
NodePort: <unset> 31414/TCP
Endpoints: 10.a.b.c:80,10.d.e.f:80
Session Affinity: None
 FirstSeen LastSeen Count From SubobjectPath Type Reason Message
 --------- -------- ----- ---- ------------- -------- ------ -------
 4m 4m 1 {service-controller } Normal CreatingLoadBalancer Creating load balancer
 4m 4m 1 {service-controller } Normal CreatedLoadBalancer Created load balancer

Thus we have created a three-node Kubernetes cluster (one controller, two workers) running two copies of the nginx container. We then set up an ELB to balance traffic between the instances.

Kubernetes certainly has operational complexity to trade off against features and robustness. There are a lot of moving parts to maintain, and at Stelligent we tend to recommend AWS ECS for situations in which it will suffice.


Stelligent is hiring! Do you enjoy working on complex problems like figuring out ways to automate all the things as part of a deployment pipeline? Do you believe in the “everything-as-code” mantra? If your skills and interests lie at the intersection of DevOps automation and the AWS cloud, check out the careers page on our website.

Stelligent Bookclub: “Building Microservices” by Sam Newman

At Stelligent, we put a strong focus on education and so I wanted to share some books that have been popular within our team. Today we explore the world of microservices with “Building Microservices” by Sam Newman.

Microservices are an approach to distributed systems that promotes the use of small, independent services within a software solution. By adopting microservices, teams can achieve better scaling and gain autonomy, which allows them to choose their technologies and iterate independently of other teams.

Microservices are an alternative to the development of a monolithic codebase in many organizations – a codebase that contains your entire application and where new code piles on at alarming rates. Monoliths become difficult to work with as interdependencies within the code begin to develop.

As a result, a change to one part of the system could unintentionally break a different part, which in turn might lead to hard-to-predict outages. This is where Newman’s argument about the benefits of microservices really comes into play.

  • Reasons to split the monolith
    • Increase pace of change
    • Security
    • Smaller team structure
    • Adopt the proper technology for a problem
    • Remove tangled dependencies
    • Remove dependency on databases for integration
    • Less technical debt

By splitting monoliths at their seams, we can slowly transform a monolithic codebase into a group of microservices. Each service is loosely coupled and highly cohesive; as a result, changes within a microservice do not change its function as seen by other parts of the system. Each element works as a black box where only the inputs and outputs matter. When splitting a monolith, databases pose some of the greatest challenges; as a result, Newman devotes a significant chunk of the book to explaining various useful techniques to reduce these dependencies.

Ways to reduce dependencies

  • A clear, well-documented API
  • Loose coupling and high cohesion within a microservice
  • Enforce standards on how services can interact with each other

Though Newman’s argument for the adoption of microservices is spot-on, his treatment of continuous delivery and scaling microservices is shallow. For anyone who has a background in CD or has read “Continuous Delivery,” these sections do not deliver. For example, he discusses machine images at great length but only lightly brushes over build pipelines. The issue I ran into with scaling microservices is that Newman suggests each microservice should ideally be put on its own instance, where it exists independently of all other services. Though this would be nice to have, it is highly unlikely to happen in a production environment where cost is a consideration. Though he does talk about using traditional virtualization, Vagrant, Linux containers, and Docker to host multiple services on a single host, he remains platform-agnostic and general. As a result, he misses the opportunity to talk about services like Amazon ECS, Kubernetes, or Docker Swarm. Combining these technologies with reserved cloud capacity would be a real-world example that I feel would have added a lot to this section.

Overall Newman’s presentation of microservices is a comprehensive introduction for IT professionals. Some of the concepts covered are basic but there are many nuggets of insight that are worth reading for. If you are looking to get a good idea about how microservices work, pick it up. If you’re looking to advance your microservice patterns or suggest some, feel free to comment below!

Interested in working someplace that gives all employees an impressive book expense budget? We’re hiring.

Automating Habitat with AWS CodePipeline

This article outlines a proof-of-concept (POC) for automating Habitat operations from AWS CodePipeline. Habitat is Chef’s new application automation platform that provides a packaging system that results in apps that are “immutable and atomically deployed, with self-organizing peer relationships.”  Habitat is an innovative technology for packaging applications, but a Continuous Delivery pipeline is still required to automate deployments.  For this exercise I’ve opted to build a lightweight pipeline using CodePipeline and Lambda.

An in-depth analysis of how to use Habitat is beyond the scope of this post, but you can get a good introduction by following their tutorial. This POC essentially builds a CD pipeline to automate the steps described in the tutorial, and builds the same demo app (mytutorialapp). It covers the “pre-artifact” stages of the pipeline (Source, Commit, Acceptance), but keep an eye out for a future post which will flesh out the rest.

Also be sure to read the article “Continuous deployment with Habitat” which provides a good overview of how the developers of Habitat intend it to be used in a pipeline, including links to some repos to help implement that vision using Chef Automate.

Technology Overview


The application we’re automating is called mytutorialapp. It is a simple “hello world” web app that runs on nginx. The application code can be found in the hab-demo repository.


The pipeline is provisioned by a CloudFormation stack and implemented with CodePipeline. The pipeline uses a Lambda function as an action executor. This Lambda function delegates command execution to an EC2 instance via an SSM Run Command (aws:runShellScript). The pipeline code can be found in the hab-demo-pipeline repository. Here is a simplified diagram of the execution mechanics:



The CloudFormation stack that provisions the pipeline also creates several supporting resources.  Check out the pipeline.json template for details, but here is a screenshot to show what’s included:


Pipeline Stages

Here’s an overview of the pipeline structure. For the purpose of this article I’ve only implemented the Source, Commit, and Acceptance stages. This portion of the pipeline will get the source code from a git repo, build a Habitat package, build a Docker test environment, deploy the Habitat package to the test environment, run tests on it and then publish it to the Habitat Depot. All downstream pipeline stages can then source the package from the Depot.

  • Source
    • Clone the app repo
  • Commit
    • Stage-SourceCode
    • Initialize-Habitat
    • Test-StaticAnalysis
    • Build-HabitatPackage
  • Acceptance
    • Create-TestEnvironment
    • Test-HabitatPackage
    • Publish-HabitatPackage
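In the pipeline template, each of these actions invokes the same Lambda function, with user parameters telling it which script to run. A CodePipeline action definition along these lines would do it (a sketch; the FunctionName and the shape of UserParameters are this project's conventions, not fixed by CodePipeline):

```json
{
  "name": "Build-HabitatPackage",
  "actionTypeId": {
    "category": "Invoke",
    "owner": "AWS",
    "provider": "Lambda",
    "version": "1"
  },
  "configuration": {
    "FunctionName": "pipeline-runner",
    "UserParameters": "{\"stage\": \"Commit\", \"action\": \"Build-HabitatPackage\"}"
  },
  "runOrder": 4
}
```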

Action Details

Here are the details for the various pipeline actions. These action implementations are defined in a “pipeline-runner” Lambda function and invoked by CodePipeline. Upon invocation, the scripts are executed on an EC2 box that gets provisioned at the same time as the code pipeline.

Commit Stage


Stage-SourceCode: Pulls down the source code artifact from S3 and unzips it.


Initialize-Habitat: Sets Habitat environment variables and generates/uploads a key to access my origin on the Habitat Depot.


Test-StaticAnalysis: Runs static analysis on the shell scripts using bash -n.
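Since bash -n parses a script without executing any of it, it makes a cheap syntax gate for this action. A quick self-contained illustration (the file names are just for the demo):

```shell
#!/usr/bin/env bash
# Demonstrate `bash -n` as a static-analysis gate: it parses a script
# without executing it, so syntax errors fail fast and side effects
# never run.
workdir=$(mktemp -d)

# A syntactically valid script: bash -n exits 0 without running it.
printf 'echo "hello world"\n' > "$workdir/good.sh"
bash -n "$workdir/good.sh" && echo "good.sh: syntax OK"

# A broken script (an `if` with no closing `fi`): bash -n exits non-zero.
printf 'if [ true ]; then\n  echo "oops"\n' > "$workdir/bad.sh"
bash -n "$workdir/bad.sh" 2>/dev/null || echo "bad.sh: syntax error"

rm -rf "$workdir"
```

In the pipeline action, a non-zero exit from bash -n fails the stage before the package build ever starts.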


Build-HabitatPackage: Builds the Habitat package.

Acceptance Stage


Create-TestEnvironment: Creates a Docker test environment by running a Habitat package export command inside the Habitat Studio.


Test-HabitatPackage: Runs a Bats test suite which verifies that the web server is running and the “hello world” page is displayed.


Publish-HabitatPackage: Uploads the Habitat package to the Depot. In a later pipeline stage, a package deployment can be sourced directly from the Depot.

Wrapping up

This post provided an early look at a mechanism for automating Habitat deployments from AWS CodePipeline. There is still a lot of work to be done on this POC project so keep an eye out for later posts that describe the mechanics of the rest of the pipeline.

Do you love Chef and Habitat? Do you love AWS? Do you love automating software development workflows to create CI/CD pipelines? If you answered “Yes!” to any of these questions then you should come work at Stelligent. Check out our Careers page to learn more.


DevOps in AWS Radio: Automating Compliance with AWS Config and Lambda (Episode 3)

In this episode, Paul Duvall and Brian Jakovich from Stelligent cover recent DevOps in AWS news and speak about Automating Compliance using AWS Config, Config Rules and AWS Lambda. Here are the show notes:

About DevOps in AWS Radio

On DevOps in AWS Radio, we cover topics around applying DevOps principles and practices such as Continuous Delivery in the Amazon Web Services cloud. This is what we do at Stelligent for our customers. We’ll bring listeners into our roundtables and speak with engineers who’ve recently published on our blog and we’ll also be reaching out to the wider DevOps in AWS community to get their thoughts and insights.

The overall vision of this podcast is to describe how listeners can create a one-click (or “no click”) implementation of their software systems and infrastructure in the Amazon Web Services cloud so that teams can deliver software to users whenever there’s a business need to do so. The podcast will delve into the cultural, process, tooling, and organizational changes that can make this possible including:

  • Automation of
    • Networks (e.g. VPC)
    • Compute (EC2, Containers, Serverless, etc.)
    • Storage (e.g. S3, EBS, etc.)
    • Database and Data (RDS, DynamoDB, etc.)
  • Organizational and Team Structures and Practices
  • Team and Organization Communication and Collaboration
  • Cultural Indicators
  • Version control systems and processes
  • Deployment Pipelines
    • Orchestration of software delivery workflows
    • Execution of these workflows
  • Application/service Architectures – e.g. Microservices
  • Automation of Build and deployment processes
  • Automation of testing and other verification approaches, tools and systems
  • Automation of security practices and approaches
  • Continuous Feedback systems
  • Many other Topics…

ChefConf 2016

Highlights and Announcements

ChefConf 2016 was held in Austin, TX last week, and as usual, Stelligent sent a few engineers to take advantage of the learning and networking opportunities. This year’s conference was a 4 day event with a wide variety of workshops, lectures, keynote speeches, and social events. Here is a summary of the highlights and announcements.

Chef Automate

Chef Automate was one of the big announcements at ChefConf 2016. It is billed as “One platform that delivers DevOps workflow, automated compliance, and end-to-end pipeline visibility.” It brings together Chef for infrastructure automation, InSpec for compliance automation, and Habitat for application automation, and delivers a full-stack deployment pipeline along with customizable dashboards for comprehensive visibility. Chef Compliance and Chef Delivery have been phased out as stand-alone products and replaced by Chef Automate as the company’s flagship commercial offering.



Habitat

Habitat was actually announced back in June, but it was a big focus of ChefConf 2016. There were five info sessions and a Habitat Zone for networking with and learning from other community members. Habitat is an open source project that focuses on application automation and provides a packaging system that results in apps that are “immutable and atomically deployed, with self-organizing peer relationships.” Here are the key features listed on the project website:

  • Habitat is unapologetically app-centric. It’s designed with best practices for the modern application in mind.
  • Habitat gives you a packaging format and a runtime supervisor with deployment coordination and service discovery built in.
  • Habitat packages contain everything the app needs to run with no outside dependencies. They are isolated, immutable, and auditable.
  • The Habitat supervisor knows the packaged app’s peer relationships, upgrade strategy, and policies for restart and security. The supervisor is also responsible for runtime configuration and connecting to management services, such as monitoring.

Habitat packages have the following attributes:


Chef Certification

A new Chef Certification program was also announced at the conference. It is a badge-based program where passing an exam for a particular competency earns you a badge. A certification is achieved by earning all the required badges for that learning track. The program is in an early adopter phase and not all badges are available yet. Here’s what those tracks look like right now:

Chef Certified Developer



Chef Certified Windows Developer



Chef Certified Architect



Join Us

Do you love Chef? Do you love AWS? Do you love automating software development workflows to create CI/CD pipelines? If you answered “Yes!” to any of these questions then you should come work at Stelligent. Check out our Careers page to learn more.