Stelligent Bookclub: “Building Microservices” by Sam Newman

At Stelligent, we put a strong focus on education and so I wanted to share some books that have been popular within our team. Today we explore the world of microservices with “Building Microservices” by Sam Newman.

Microservices are an approach to distributed systems that promotes the use of small, independent services within a software solution. By adopting microservices, teams can achieve better scaling and gain autonomy that allows them to choose their technologies and iterate independently from other teams.

Microservices are an alternative to the development of a monolithic codebase in many organizations – a codebase that contains your entire application and where new code piles on at alarming rates. Monoliths become difficult to work with as interdependencies within the code begin to develop.

As a result, a change to one part of the system could unintentionally break a different part, which in turn might lead to hard-to-predict outages. This is where Newman’s argument about the benefits of microservices really comes into play.

  • Reasons to split the monolith
    • Increase pace of change
    • Security
    • Smaller team structure
    • Adopt the proper technology for a problem
    • Remove tangled dependencies
    • Remove dependency on databases for integration
    • Less technical debt

By splitting monoliths at their seams, we can slowly transform a monolithic codebase into a group of microservices. Each service is loosely coupled and highly cohesive; as a result, changes within a microservice do not change its function to other parts of the system. Each element works as a black box where only the inputs and outputs matter. When splitting a monolith, databases pose some of the greatest challenges; as a result, Newman devotes a significant chunk of the book to explaining various useful techniques to reduce these dependencies.

Ways to reduce dependencies

  • Clear, well-documented APIs
  • Loose coupling and high cohesion within a microservice
  • Enforce standards on how services can interact with each other

Though Newman’s argument for the adoption of microservices is spot-on, his explanation of continuous delivery and scaling microservices is shallow. For anyone who has a background in CD or has read “Continuous Delivery,” these sections do not deliver. For example, he talks about machine images at great length but only lightly brushes over build pipelines. The issue I ran into with scaling microservices is that Newman suggests each microservice should ideally be put on its own instance where it exists independently of all other services. Though this would be nice to have, it is highly unlikely to happen in a production environment where cost is a consideration. Though he does talk about using traditional virtualization, Vagrant, Linux containers, and Docker to host multiple services on a single host, he remains platform agnostic and general. As a result he misses the opportunity to talk about services like Amazon ECS, Kubernetes, or Docker Swarm. Combining these technologies with reserved cloud capacity would be a real-world example that I feel would have added a lot to this section.

Overall Newman’s presentation of microservices is a comprehensive introduction for IT professionals. Some of the concepts covered are basic but there are many nuggets of insight that are worth reading for. If you are looking to get a good idea about how microservices work, pick it up. If you’re looking to advance your microservice patterns or suggest some, feel free to comment below!

Interested in working someplace that gives all employees an impressive book expense budget? We’re hiring.

Serverless Delivery: Orchestrating the Pipeline (Part 3)

In the second post of this series, we focused on how to get our serverless application running with Lambda, API Gateway and S3. Our application is now able to run on a serverless platform, but we still have not applied the fundamentals of continuous delivery that we talked about in the first part of this series.

In this third and final part of this series on serverless delivery, we will implement a continuous delivery pipeline using serverless technology. Our pipeline will be orchestrated by CodePipeline with actions implemented in Lambda functions. The definition of the CodePipeline resource as well as the Lambda functions that support it are all defined in the same CloudFormation stack that we looked at last week.

Visualize The Goal

To help visualize what we are building, here is a picture of what the final pipeline looks like.


If you’re new to CodePipeline, let’s go over a few important terms:

  • Job – An execution of the pipeline.
  • Stage – A group of actions in the pipeline. Stages never run in parallel to each other and only one job can actively be running in a stage at a time. If a stage is currently running and a new job arrives at the stage it will block until the prior job completes. If multiple new jobs arrive, only the newest will be run while the rest will be dropped.
  • Action – A step to be performed. Actions can be in parallel or series to each other. In this pipeline, all our actions will be implemented by Lambda invocations.
  • Artifact – Each action can declare input and output artifacts that will be stored in an S3 bucket. These are objects that it will either expect to have before it runs, or objects that it will produce and make available after it runs.

The pipeline we have built for our application consists of the following four stages:

  • Source – The source stage has only one action to acquire the source code for later stages.
  • Commit – The commit stage has two actions that are responsible for:
    • Resolving project dependencies
    • Processing (e.g., compile, minify, uglify) the source code
    • Static analysis of the code
    • Unit testing of the application
    • Packaging the application for subsequent deployment
  • Acceptance – The acceptance stage has actions that will:
    • Update the Lambda function from latest source
    • Update S3 bucket with latest artifacts
    • Update API Gateway
    • End-to-end testing of the application
  • Production – The production stage performs the same steps as the Acceptance stage but against the production Lambda, S3 and API Gateway resources

Here is a more detailed picture of the pipeline. We will spend the rest of this post breaking down each step of the pipeline.


Start with Source Stage

Diagram Step: 1

The source stage only has one action in it, a 3rd party action provided by GitHub. The action will register a hook with the repo that you provide to kick off a new job for the pipeline whenever code is pushed to the GitHub repository. Additionally, the action will pull the latest code from the branch you specified and zip it up into an object in an S3 bucket for later actions to reference.

  "Name": "Source",
  "Actions": [
      "InputArtifacts": [],
      "Name": "Source",
      "ActionTypeId": {
        "Category": "Source",
        "Owner": "ThirdParty",
        "Version": "1",
        "Provider": "GitHub"
      "Configuration": {
        "Owner": "stelligent",
        "Repo": "dromedary",
        "Branch": "serverless",
        "OAuthToken": "XXXXXX",
      "OutputArtifacts": [
          "Name": "SourceOutput"
      "RunOrder": 1


This approach helps solve a common challenge with source code management using Lambda. Obviously no one wants to upload code through the console, so many end up using CloudFormation to manage their Lambda functions. The challenge is that the CloudFormation Lambda resource expects your code to be zipped in an S3 bucket. This means you either need to use S3 as the “source of truth” for your source code, or have a process to keep it in sync from the real “source of truth”. By building a pipeline, you can keep your source in GitHub and use the next actions that we are about to go through to deploy the Lambda function.

Build from Commit Stage

Diagram Steps: 2,3,4

The commit stage of the pipeline consists of two actions that are implemented with Lambda invocations. The first action is responsible for resolving the application dependencies via NPM. This can be an expensive operation taking many minutes, and is needed by many downstream actions, so the dependencies are zipped up and become an output artifact of this first action. Here are the details of the action:

  • Download & Unzip – Get the source artifact from S3 and unzip it into a temp directory
  • Run NPM – Run npm install in the extracted source folder (see the sketch after this list)
  • Zip & Upload – Zip up the source folder with its dependencies in node_modules and upload the artifact to S3
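
As an illustration of the Run NPM step, the action’s Lambda might shell out to npm roughly as follows. This is a hedged sketch, not the project’s actual code; the runNpmInstall helper name and the availability of an npm binary in the Lambda environment are assumptions.

// Hypothetical sketch: run "npm install" against the directory where the
// source artifact was unzipped, and resolve once dependencies are in place.
var exec = require('child_process').exec;

function runNpmInstall(sourceDir) {
  return new Promise(function(resolve, reject) {
    // Assumes an npm binary is available to the Lambda runtime.
    exec('npm install', { cwd: sourceDir }, function(err, stdout, stderr) {
      if (err) {
        return reject(new Error('npm install failed: ' + stderr));
      }
      resolve(stdout);
    });
  });
}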

Downloading the input artifact is accomplished with the following code:

var artifact = null;
// find the named artifact among the job's input artifacts
event["CodePipeline.job"].data.inputArtifacts.forEach(function (a) {
  if (a.name == artifactName && a.location.type == 'S3') {
    artifact = a;
  }
});

if (artifact != null) {
  var params = {
    Bucket: artifact.location.s3Location.bucketName,
    Key: artifact.location.s3Location.objectKey
  };
  return getS3Object(params, destDirectory);
} else {
  return Promise.reject("Unknown Source Type:" + JSON.stringify(sourceOutput));
}

Likewise, the output artifact is uploaded with the following:

var artifact = null;
// find the named artifact among the job's output artifacts
event["CodePipeline.job"].data.outputArtifacts.forEach(function (a) {
  if (a.name == artifactName && a.location.type == 'S3') {
    artifact = a;
  }
});

if (artifact != null) {
  var params = {
    Bucket: artifact.location.s3Location.bucketName,
    Key: artifact.location.s3Location.objectKey
  };
  return putS3Object(params, zipfile);
} else {
  return Promise.reject("Unknown Source Type:" + JSON.stringify(sourceOutput));
}


Diagram Steps: 5,6,7

The second action in the commit stage is responsible for acquiring the source and dependencies, processing the source code, performing static analysis, running unit tests and packaging the output artifacts. This is accomplished by a Lambda action that invokes a Gulp task on the project. This allows the details of these steps to be defined in Gulp alongside the source code, so they can change at a different pace than the pipeline. Here is the CloudFormation for this action:

      "Name": "SourceInstalledOutput"
    "UserParameters": "task=package&DistSiteOutput=dist/”

  "OutputArtifacts": [
      "Name": "DistSiteOutput"
      "Name": "DistLambdaOutput"

Notice the UserParameters setting defined in the resource above. CodePipeline treats it as an opaque string that is passed into the Lambda function. I chose to use a query string format to pass multiple values into the Lambda function. The task parameter defines which Gulp task to run, and the DistSiteOutput and DistLambdaOutput parameters tell the Lambda function where to find the artifacts it should upload to S3.
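
For illustration, here is one way the Lambda function might unpack that query string using Node’s querystring module. This is a hypothetical sketch, not the dromedary implementation, and the logged parameter names are just the ones mentioned above.

var querystring = require('querystring');

exports.handler = function(event, context) {
  var jobData = event["CodePipeline.job"].data;
  // e.g. "task=package&DistSiteOutput=dist/..." (exact values may differ)
  var userParams = querystring.parse(jobData.actionConfiguration.configuration.UserParameters);

  console.log("Gulp task to run: " + userParams.task);
  console.log("Site artifact location: " + userParams.DistSiteOutput);
  console.log("Lambda artifact location: " + userParams.DistLambdaOutput);
  // ...invoke the requested Gulp task here...
};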

For more details on how to implement CodePipeline actions in Lambda, check out the entire source of these functions at index.js or read the post Mocking CodePipeline with Lambda.
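
If you have not written a CodePipeline action in Lambda before, the overall shape is roughly the following. This is a minimal, hypothetical sketch; doActionWork is a placeholder for the action’s real work, not a function from the project.

var AWS = require('aws-sdk');
var codepipeline = new AWS.CodePipeline();

exports.handler = function(event, context) {
  var jobId = event["CodePipeline.job"].id;

  doActionWork(event)                       // placeholder for the action's real work
    .then(function() {
      // tell CodePipeline the action succeeded so the pipeline can continue
      codepipeline.putJobSuccessResult({ jobId: jobId }, function(err) {
        if (err) { context.fail(err); } else { context.succeed("done"); }
      });
    })
    .catch(function(e) {
      // report failure so the pipeline stops and surfaces the error
      codepipeline.putJobFailureResult({
        jobId: jobId,
        failureDetails: { type: 'JobFailed', message: String(e) }
      }, function() { context.fail(e); });
    });
};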

Test in Acceptance Stage

Diagram Steps: 8,9,10,11

The Acceptance stage is responsible for acquiring the packaged application artifacts and deploying the application to a test environment and then running a Gulp task to execute the end-to-end tests against that environment. Let’s look at the details of each of these four actions in this stage:

  • Deploy App – The Lambda for the application is updated with the code from the Commit stage as a new version. Additionally, the test alias is moved to this new version. As you may recall from part 2, this alias is used by the test stage of the API Gateway to determine which version of the Lambda function to invoke.


  • Deploy API – At this point, this is a no-op. My goal is to have this action use a Swagger file in the source code to update the API Gateway resources, methods, and integrations. This would allow API changes to take effect on each build, whereas the current solution requires an update of the CloudFormation stack outside the pipeline to change the API Gateway.
  • Deploy Site – This action publishes all static content (HTML, CSS, JavaScript and images) to a test S3 bucket. Additionally, it publishes a config.json file to the bucket that the application uses to determine the endpoint for the APIs; a sample of that file is shown after this list.
  • End-to-end Testing – This action invokes a Gulp task to run the functional tests. Additionally, it sets an environment variable with the endpoint URL of the application for the Gulp process to test against.
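
Here is a hypothetical sample of the config.json mentioned in the Deploy Site action; the actual keys used by dromedary may differ, but the idea is simply to point the static site at the right API Gateway stage endpoint:

{
  "apiBaseurl": "https://abc123.execute-api.us-east-1.amazonaws.com/test/"
}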

Sidebar: Continuation Token

One challenge of using Lambda for actions is the current 300 second function execution timeout limit. If you have an action that will take longer than 300 seconds (e.g., launching a CloudFormation stack) you can utilize the continuation token. A continuation token is an opaque value that you can return to CodePipeline to indicate that you are not complete with your action yet. CodePipeline will then reinvoke your action, passing in the continuation token you provided in the prior invocation.

The following code uses UserParameters as the maximum number of attempts and uses continuationToken as the number of prior attempts. If the action needs more time, it compares maxAttempts with priorAttempts and, if there are still attempts available, it calls into CodePipeline to signal success, passing a continuation token to indicate that the action needs to be reinvoked.

var jobData = event["CodePipeline.job"].data;
var maxAttempts = parseInt(jobData.actionConfiguration.configuration.UserParameters) || 0;
var priorAttempts = parseInt(jobData.continuationToken) || 0;

if (priorAttempts < maxAttempts) {
    console.log("Retrying later.");

    var params = {
        jobId: event["CodePipeline.job"].id,
        continuationToken: (priorAttempts + 1).toString()
    };

    // "codepipeline" is an AWS.CodePipeline client assumed to be in scope; signaling
    // success with a continuation token makes CodePipeline reinvoke the action.
    codepipeline.putJobSuccessResult(params, context.done);
}

Deploy from Production Stage

The Production stage uses the same action definitions from the Acceptance stage to deploy and test the application. The only difference is that it passes in the production S3 bucket name and Lambda ARN to deploy to.

I spent time considering how to do a Blue/Green deployment with this environment. Blue/Green deployment is an approach to reduce deployment risk by launching a duplicate environment for code changes (green environment) and then cutting over traffic from the existing (blue environment) to the new environment. This also affords a safe and quick rollback by switching traffic back to the old (blue) environment.

I looked into doing a DNS based Blue/Green using Route53 Resource Records. This would be accomplished by creating a new API Gateway and Lambda function for each job and using weighted routing to move traffic over from the old API Gateway to the new API Gateway.
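
To make the weighted-routing idea concrete, here is a rough sketch using the AWS SDK for JavaScript. The hosted zone ID, record name, hostnames and weights are placeholders for illustration, not values from this project.

var AWS = require('aws-sdk');
var route53 = new AWS.Route53();

// Upsert a weighted CNAME for one side of the blue/green pair.
function setWeight(setIdentifier, apiHostname, weight, callback) {
  route53.changeResourceRecordSets({
    HostedZoneId: 'Z_EXAMPLE_ZONE_ID',
    ChangeBatch: {
      Changes: [{
        Action: 'UPSERT',
        ResourceRecordSet: {
          Name: 'api.example.com.',
          Type: 'CNAME',
          SetIdentifier: setIdentifier,   // e.g. 'blue' or 'green'
          Weight: weight,                 // relative share of traffic
          TTL: 60,
          ResourceRecords: [{ Value: apiHostname }]
        }
      }]
    }
  }, callback);
}

// Shift 10% of traffic to the green API Gateway, keep 90% on blue.
// setWeight('green', 'abc123.execute-api.us-east-1.amazonaws.com', 10, done);
// setWeight('blue',  'def456.execute-api.us-east-1.amazonaws.com', 90, done);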

I’m not convinced this level of complexity would provide much value, however, because given the way Lambda manages versions and API Gateway manages deployments, you can roll changes back very quickly by moving the Lambda version alias. One limitation, though, is that you cannot do a canary deployment with a single API Gateway and Lambda version aliases. I’m curious what your thoughts are on this; ping me on Twitter @Stelligent with #ServerlessDelivery.

Sidebar: Gulp + CloudFormation

You’ll also notice that there is a gulpfile.js in the dromedary-serverless repo to make it easier to launch and manage the CloudFormation stack. Specifically, you can run gulp pipeline:up  to launch the stack, gulp pipeline:wait  to wait for the pipeline creation to complete and gulp pipeline:status  to see the status of the stack and its outputs. This code has been factored out into its own repo named serverless-pipeline if you’d like to add this type of integration between Gulp and CloudFormation in your own project.

Try it out!

Want to experiment with this stack in your own account? The CloudFormation templates are available for you to run with the link below. First, you’ll want to fork the dromedary repository into your GitHub account. Then you’ll need to provide the following parameters to the stack:

  • Hosted Zone – you'll need to set up a Route53 hosted zone in your account before launching the stack, for the Route53 resource records to be created in.
  • Test DNS Name – a fully qualified hostname (within the Hosted Zone you created) for the test resources.
  • Production DNS Name – a fully qualified hostname (within the Hosted Zone you created) for the production resources.
  • OAuth2 Token – your OAuth2 token (see here for details)
  • User Name – your GitHub username



In this series, we have addressed how to achieve the fundamentals of continuous delivery in a serverless environment. Let’s review those fundamentals and how we addressed them:

  • Continuous – We used CodePipeline to run a series of actions against every commit to our GitHub repository.
  • Quality – We built static analysis, unit tests and end-to-end tests into our pipeline and ran them for every commit.
  • Automated – The provisioning of the pipeline and the application was done from a single CloudFormation stack.
  • Reproducible – Other than creation of a Route53 Hosted Zone, there were no prerequisites to running this CloudFormation stack.
  • Serverless – All the tools chosen were AWS managed services, including Lambda, API Gateway, S3, Route53 and CodePipeline. No servers were harmed in the making of this series.

Please follow us on Twitter to be informed of future articles on Serverless Delivery and other exciting topics.  Also, keep your eye out for a new book set to be released later this year by Stelligent CTO, Paul Duvall, on Continuous Delivery in AWS – which will contain a chapter on serverless delivery.



Serverless Delivery: Bootstrapping the Pipeline (Part 2)

In the first of this three part series on Serverless Delivery, we took a look at the high level architecture of running a continuous delivery pipeline with CodePipeline + Lambda. Our objective is to run the Dromedary application in a serverless environment with a serverless continuous delivery pipeline.

Before we can build the pipeline, we need to have the platform in place to deploy our application to. In this second part of the series we will look at what changes need to be made to a Node.js Express application to run in Lambda and the CloudFormation templates needed to create the serverless resources to host our application.

Prepare Your Application

Lambdas are defined as a function that takes in an event object containing data elements mapped to it in the API Gateway Integration Request. An application using Express in Node.js however expects its request to be initiated from an HTTP request on a socket. In order to run your Express application as a Lambda function, you’ll need some code to mediate between the two frameworks.

Although the best approach would be to have your application natively support the Lambda event, this may not always be feasible. Therefore, I have created a small piece of code to serve as a mediator and put it outside of the Dromedary application in its own module named lambda-express for others to leverage.


Install the module with npm install --save lambda-express and then use it in your Express application to define the Lambda handler:

var lambdaExpress = require('lambda-express');
var express = require('express');
var app = express();

// ... initialize the app as usual ...

// create a handler function that maps lambda inputs to express
exports.handler = lambdaExpress.appHandler(app);

In the dromedary application, this is available in a separate index.js file. You’ll also notice in the dromedary application, that it passes a callback function rather than the express app to the appHandler function. This allows it to use information on the event to configure the application, in this case via environment variables:

exports.handler = lambdaExpress.appHandler(function(event, context) {
  process.env.DROMEDARY_DDB_TABLE_NAME = event.ddbTableName;

  var app = require('./app.js');
  return app;
});
You now have an Express application that will be able to respond to Lambda events that are generated from API Gateway. Now let’s look at what resources need to be defined in your CloudFormation templates. For the dromedary application, these templates are defined in a separate repository named dromedary-serverless in the pipeline/cfn directory.

Define Website Resources

Sample – site.json

Buckets need to be defined for test and production stages of the pipeline to host the static content of the application. This includes the HTML, CSS, images and any JavaScript that will run in the browser. Each bucket will need a resource like what you see below.

"TestBucket" : {
  "Type" : "AWS::S3::Bucket",
  "Properties" : {
    "AccessControl" : "PublicRead",
    "BucketName" : “”,
    "WebsiteConfiguration" : {
      "IndexDocument" : "index.html"

Here are the important pieces to pay attention to:

  • AccessControl – set to PublicRead, assuming this is a public website.
  • WebsiteConfiguration – add an IndexDocument  entry to define the home page for the bucket.
  • BucketName – needs to match exactly the Name of the Route53 resource record you create. For example, if you are going to set up a DNS record for a given hostname, then the bucket name must be that same hostname.

We will also want Route53 resource records for each of the test and production buckets. Here is a sample record:

"TestSiteRecord": {
  "Type": "AWS::Route53::RecordSetGroup",
  "Properties": {
    "HostedZoneId": “Z00ABC123DEF”,
    "RecordSets": [{
      "Name": “”,
      "Type": "A",
      "AliasTarget": {
        "HostedZoneId": “Z3BJ6K6RIION7M”,
        "DNSName": “"

Make sure that your record does the following:

  • Name – must be the same as the bucket name above, with a period at the end.
  • HostedZoneId – should be specific to your account and the zone you are hosting.
  • AliasTarget – references the zone id and endpoint for S3 in the region where you created the bucket. The zone ids and endpoints can be found in the AWS General Reference Guide.

Declare Lambda Functions

Sample – app.json

Lambda functions will need to be declared for test and production stages of the pipeline to serve the Express application. Each function will only be stubbed out in the CloudFormation template so the API Gateway resources can reference it. Each execution of a CodePipeline job will deploy the latest version of the code as a new version of the function. Here is a sample declaration of the Lambda function in CloudFormation:

"TestAppLambda": {
  "Type" : "AWS::Lambda::Function",
  "Properties" : {
    "Code" : {
      "ZipFile": { "Fn::Join": ["n", [
        "exports.handler = function(event, context) {",
        " Error(500));",
    "Description" : "serverless application",
    "Handler" : "index.handler",
    "MemorySize" : 384,
    "Timeout" : 10,
    "Role" : {“Ref”: “TestAppLambdaTrustRole”},
    "Runtime" : "nodejs"

Notice the following about the Lambda resource:

  • ZipFile – the default implementation of the function is provided inline. Notice it just returns a 500 error. This will be replaced by real code when CodePipeline runs; a sketch of how that replacement might look follows this list.
  • MemorySize – this is the only control you have over the system resources allocated to your function. CPU performance is determined by the amount of memory you allocate, so if you need more CPU, increase this number. Your cost is directly related to this number, as is the duration of each invocation. There is a sweet spot you need to find where you get the shortest durations for the system resources allocated.
  • Timeout – the maximum time (in seconds) a given invocation of the function may run before it is forcibly terminated. The maximum value for this is 300 seconds.
  • Role – the ARN of the IAM role that you want to assign to your function when it runs. You’ll want to have "Principal": {"Service": ["lambda.amazonaws.com"]} in the AssumeRolePolicyDocument to grant the Lambda service access to the sts:AssumeRole action. You’ll also want to include in the policy access to CloudWatch Logs with "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"].
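
For context on how the pipeline later replaces that stub, here is a hedged sketch of publishing new code as a version and repointing a stage alias with the AWS SDK for JavaScript. This is not the project’s actual index.js; the function and parameter names are illustrative.

var AWS = require('aws-sdk');
var lambda = new AWS.Lambda();

function deployNewVersion(functionName, artifactBucket, artifactKey, aliasName, callback) {
  lambda.updateFunctionCode({
    FunctionName: functionName,
    S3Bucket: artifactBucket,   // bucket holding the pipeline artifact
    S3Key: artifactKey,         // zip produced by the Commit stage
    Publish: true               // create an immutable new version
  }, function(err, data) {
    if (err) { return callback(err); }
    // Point the stage alias (e.g. "test") at the freshly published version.
    lambda.updateAlias({
      FunctionName: functionName,
      Name: aliasName,
      FunctionVersion: data.Version
    }, callback);
  });
}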

Define API Gateway and Stages

Sample – api-gateway.json

Our Lambda function requires something to receive HTTP requests and deliver them as Lambda events. Fortunately, the AWS API Gateway is a perfect solution for this need. A single API Gateway definition with two stages, one for test and one for production, will be defined to provide public access to the Lambda function defined above. Unfortunately, CloudFormation does not yet have support for API Gateway. However, Andrew Templeton has created a set of custom resources that do a great job of filling the gap. For each package, you will need to create a Lambda function in your CloudFormation template, for example:

"ApiGatewayRestApi": {
  "Type" : "AWS::Lambda::Function",
  "Properties" : {
    "Code" : {
      "S3Bucket": “dromedary-serverless-templates”,
      "S3Key": ""
    "Description" : "Custom CFN Lambda",
    "Handler" : "index.handler",
    "MemorySize" : 128,
    "Timeout" : 30,
    "Role" : { "Ref": "ApiGatewayCfnLambdaRole" },
    "Runtime" : "nodejs"


Make sure the role that you create and reference from the Lambda above contains policy statements giving it access to apigateway:* actions, as well as granting the custom resource access to PassRole on the role defined for API integration:

  "Effect": "Allow",
  "Resource": [
    { "Fn::GetAtt": [ "ApiIntegrationCredentialsRole", "Arn" ] }
  "Action": [


Defining the API Gateway above consists of the following six resources. For each one, I will highlight only the important properties to be aware of as well as a picture from the console of what the CloudFormation resource creates:

  1. cfn-api-gateway-restapi – this is the top level API definition.
    • Name – although not required, you ought to provide a unique name for the API
  2. cfn-api-gateway-resource – a resource (or path) for the API. In the reference application, I’ve created just one root resource that is a wildcard and captures all sub paths.  The sub path is then passed into lambda-express as a parameter and mapped into a path that Express can handle.
    • PathPart – define a specific path of your API, or {subpath}  to capture all paths as a variable named subpath
    • ParentId – This must reference the RootResourceId from the restapi resource
      "ParentId": { "Fn::GetAtt": [ "RestApi", "RootResourceId" ] }
  3. cfn-api-gateway-method – defines the contract for a request to an HTTP method (e.g., GET or POST) on the path created above.
    • HttpMethod – the method to support (GET)
    • RequestParameters – a map of parameters on the request to expect and pass down to the integration
      "RequestParameters": {
        "method.request.path.subpath": true
  4. cfn-api-gateway-method-response – defines the contract for the response from an HTTP method defined above.
    • HttpMethod – the method to support
    • StatusCode – the code to return for this response (e.g., 200)
    • ResponseParameters – a map of the headers to include in the response
      "ResponseParameters": {
        "method.response.header.Access-Control-Allow-Origin": true,
        "method.response.header.Access-Control-Allow-Methods": true,
        "method.response.header.Content-Type": true
  5. cfn-api-gateway-integration – defines where to send the request for a given method defined above.
    • Type – for Lambda function integration, choose AWS
    • IntegrationHttpMethod – for Lambda function integration, choose POST
    • Uri – the AWS service URI to integrate with. For Lambda, an illustrative example appears after this list. Notice that the function name and version are not in the URI; instead, stage variables take their place. This way the different stages (test and production) can control which Lambda function and which version of that function to call.
    • Credentials – The role to run as when invoking the Lambda function
    • RequestParameters – The mapping of parameters from the request to integration request parameters
      "RequestParameters": {
        "integration.request.path.subpath": "method.request.path.subpath"
    • RequestTemplates – The template for the JSON to pass to the Lambda function. This template captures all the context information from API Gateway that lambda-express will need to create the request that Express understands:
      "RequestTemplates": {
        "application/json": {
          "Fn::Join": ["n",[
            " "stage": "$context.stage",",
            " "request-id": "$context.requestId",",
            " "api-id": "$context.apiId",",
            " "resource-path": "$context.resourcePath",",
            " "resource-id": "$context.resourceId",",
            " "http-method": "$context.httpMethod",",
            " "source-ip": "$context.identity.sourceIp",",
            " "user-agent": "$context.identity.userAgent",",
            " "account-id": "$context.identity.accountId",",
            " "api-key": "$context.identity.apiKey",",
            " "caller": "$context.identity.caller",",
            " "user": "$context.identity.user",",
            " "user-arn": "$context.identity.userArn",",
            " "queryString": "$input.params().querystring",",
            " "headers": "$input.params().header",",
            " "pathParams": "$input.params().path",",
          ] ]
  6. cfn-api-gateway-integration-response – defines how to send the response back to the API client
    • ResponseParameters – a mapping from the integration response to the method response declared above. Notice the CORS headers, which are necessary because the hostname for the API Gateway is different from the hostname provided in the Route53 resource record for the S3 bucket. Without these, the browser would deny AJAX requests from the site to these APIs. You can read more about CORS in the API Gateway Developer Guide. Also notice that the response Content-Type is pulled from the contentType attribute in the JSON object returned from the Lambda function:
      "ResponseParameters": {
        "method.response.header.Access-Control-Allow-Origin": "'*'",
        "method.response.header.Access-Control-Allow-Methods": "'GET, OPTIONS'",
        "method.response.header.Content-Type": "integration.response.body.contentType"
    • ResponseTemplates – a template of how to create the response from the Lambda invocation. In the reference application, the Lambda function returns a JSON object with a payload attribute containing a Base64 encoding of the response payload:
      "ResponseTemplates": {
        "application/json": "$util.base64Decode( $input.path('$.payload') )"


Additionally, there are two deployment resources created, one for test and one for production. Here is an example of one:

  • cfn-api-gateway-deployment – a deployment has a name that will be the prefix for all resources defined.
    • StageName – “test” or “prod”
    • Variables – a list of variables for the stage. This is where the function name and version are defined for the integration URI defined above:
      "Variables": {
        "AppFunctionName": "MyFunctionName”,
        "AppVersion": "prod"


Deploy via Swagger

I would prefer to replace most of the above API Gateway CloudFormation with a Swagger file in the application source repository and have the pipeline use the import tool to create API Gateway deployments from the Swagger. There are a few challenges with this approach. First, creating the Swagger file requires including the AWS extensions, which have a bit of a learning curve. This challenge is made easier by the fact that you can create the API Gateway via the console and then export the Swagger. The other challenge is that the import tool is a Java-based application that requires Maven to run. This may be difficult to get working in a Lambda invocation from CodePipeline, especially given the 300-second timeout. I will, however, be spending some time researching this option and will blog about the results.

Stay Tuned!

Now that we have all the resources in place to deploy our application to, we can build a serverless continuous delivery pipeline with CodePipeline and Lambda. Next week we conclude this series with the third and final part looking at the details of the CloudFormation template for the pipeline and each stage of the pipeline as well as the Lambda functions that support them. Be sure to check it out!


Serverless Delivery: Architecture (Part 1)

If your application tech stack doesn’t need servers, why should your continuous delivery pipeline? Serverless applications deserve serverless delivery!

The software development discipline of continuous delivery has had a tremendous impact on decreasing the cost and risk of delivering changes while simultaneously increasing code quality by ensuring that software systems are always in a releasable state. However, when applying the tools and techniques that exist for this practice to serverless application frameworks and platforms, sometimes existing toolsets do not align well with these new approaches. This post is the first in a three-part series that looks at how to implement the same fundamental tenets of continuous delivery while utilizing tools and techniques that complement the serverless architecture in Amazon Web Services (AWS).

Here are the requirements of the serverless delivery pipeline:

  • Continuous – The pipeline must be capable of taking any commit on master that passes all test cases to production
  • Quality – The pipeline must include unit testing, static code analysis and functional testing of the application
  • Automated – The provisioning of the pipeline and the application must be done from a single CloudFormation command
  • Reproducible – The CloudFormation template should be able to run on a new AWS account with no additional setup other than creation of a Route53 Hosted Zone
  • Serverless – All layers of the application must run on platforms that meet the definition of serverless as described in the next section

What is Serverless?

What exactly is a serverless platform? Obviously, there is still hardware at some layer in the stack to host the application. However, what Amazon has provided with Lambda is a platform where developers no longer need to think about the following:

  • Operating System – no need to select, secure, configure, administer or patch the OS
  • Servers – no cost risk of over-provisioning and no performance risk of under-provisioning
  • Capacity – no need to monitor utilization and scale capacity based on load
  • High Availability – compute resources are available across multiple AZs

In summary, a serverless platform is one on which an application can be deployed without having to provision or administer any of the resources within the platform. Just show up with some code and the platform handles all the ‘ilities’.


The diagram above highlights the components that are used in this serverless application. Let’s look at each one individually:

  • Node.js + Express – In the demo application for this blog series, we will be deploying a Node.js JavaScript application that leverages the Express framework into Lambda. There is a small bit of “glue” code that we will highlight later in the series to adapt the Lambda contract to the Express contract.
  • Lambda – This is where your application logic runs. You deploy your code here but do not need to specify the number of servers or size of the servers to run the code on. You only pay for the number of requests and amount of time those requests take to execute.
  • API Gateway – The API Gateway exposes your Lambda function at an HTTP endpoint. It provides capabilities such as authorization, policy enforcement, rate limiting and data transformation as a service that is entirely managed by Amazon.
  • DynamoDB – Dynamic data is stored in a DynamoDB table. DynamoDB is a NoSQL datastore that has both infinitely scalable storage capacity and throughput capacity which is entirely managed by Amazon.
  • S3 – All static content including HTML, CSS, images and client-side JavaScript is stored in an S3 bucket and served as a static website directly from S3.

The diagram below compares the pricing for running a Node.js application with Lambda and API Gateway versus a pair of EC2 instances and an ELB. Notice that for the m4.large, the break even is around two million requests per day. It is important to mention that 98.8% of the cost of the serverless deployment is the API Gateway. The cost of running the application out of Lambda is insignificant relative to the costs in using API Gateway.


This cost analysis shows that applications/environments with low transaction volume can realize cost savings by running on Lambda + API Gateway, but the cost of API Gateway will become cost prohibitive at higher scale.
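
As a rough sanity check of that break-even point, here is an illustrative back-of-the-envelope calculation. The prices are assumptions that vary by region and date and are not taken from the diagram; the point is only that both options land in the same ballpark near two million requests per day.

// Illustrative only: all prices below are assumptions, not figures from the diagram.
var requestsPerDay = 2000000;                 // the cited break-even point
var requestsPerMonth = requestsPerDay * 30;

// Serverless side (ignoring Lambda GB-second charges, which are comparatively small)
var apiGatewayPerMillion = 3.50;              // assumed $/million requests
var lambdaPerMillion = 0.20;                  // assumed request charge
var serverlessCost = (requestsPerMonth / 1e6) * (apiGatewayPerMillion + lambdaPerMillion);

// EC2 side: a pair of m4.large instances plus an ELB, running all month
var m4LargeHourly = 0.12;                     // assumed on-demand $/hour
var elbMonthly = 20;                          // assumed, excluding data processing
var ec2Cost = 2 * m4LargeHourly * 730 + elbMonthly;

console.log("Serverless: ~$" + serverlessCost.toFixed(0) + "/month");   // ~$222
console.log("EC2 + ELB:  ~$" + ec2Cost.toFixed(0) + "/month");          // ~$195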

What is Serverless Delivery?

Serverless delivery is just the application of serverless platforms to achieve continuous delivery fundamentals. Specifically, a serverless delivery pipeline does not include tools such as Jenkins or resources such as EC2, Autoscaling Groups, and ELBs.


The diagram above shows the technology used to accomplish serverless delivery for the sample application. Let’s look at what each component provides:

  • AWS CodePipeline – Orchestrates various Lambda tasks to move code that was checked into GitHub forward towards production.
  • AWS S3 Artifact Bucket – Each action in the pipeline can create a new artifact. The artifact becomes an output from the action in the pipeline and is stored in an S3 bucket to become an input for the next action.
  • AWS Lambda – You create Lambda functions to do the work of individual actions in the pipeline. For example, running a gulp task on a repository is handled by a Lambda function.
  • NPM and Gulp – NPM is used for resolving all the dependencies of a given repository. Gulp is used for defining the tasks of the repository, such as running unit tests and packaging artifacts.
  • Testing tools – JSHint is used for performing static analysis of the code, Mocha is a unit test framework for JavaScript and Chai is a BDD assertion library for describing the unit tests.
  • AWS CloudFormation – CFN templates are used to create and update all these resources in a serverless stack.

Serverless delivery for traditional architectures?

Although this serverless delivery architecture could be applied to more traditional application architectures (e.g., a Java application on EC2 and ELB resources) the challenge might be having pipeline actions complete within the 300 second Lambda maximum. For example, running Maven phases within a single Lambda function invocation, including the resolving of dependencies, compilation, unit testing and packaging would likely be difficult. There may be opportunities to split up the goals into multiple invocations and persist state to S3, but that is beyond the scope of this series.


The pricing model for Lambda is favorable for applications that have idle time, and the cost grows linearly with the number of executions. The diagram below compares the pricing for running the pipeline with Lambda and CodePipeline against a Jenkins server running on an EC2 instance. For best performance, the Jenkins server ought to run on an m4.large instance, but just to highlight the savings, m3.medium and t2.micro instances were evaluated as well. Notice that for the m4.large, the break-even happens only after you are doing over 600 builds per day, and even with a t2.micro, the break-even doesn’t happen until well over 100 builds per day.

In conclusion, running a continuous delivery pipeline with CodePipeline + Lambda is very attractive based on the cost efficiency of utility pricing, the simplicity of using a managed services environment, and the tech stack parity of using Node.js for both the application and the pipeline.

Stay Tuned!

Next week we will dive into part two of this series, looking at what changes need to be made for an Express application to run in Lambda and the CloudFormation templates needed to create a serverless delivery pipeline. Finally, we will conclude with part three going into the details of each stage of the pipeline and the Lambda functions that support them.


Continuous Delivery in the Cloud: Dynamic Configuration (Part 4 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application. In part 2 I went over how we use this CD pipeline to deliver software from checkin to production. In part 3, we focused on how CloudFormation is used to script the virtual AWS components that create the Manatee infrastructure. A list of topics for each of the articles is summarized below:

Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – In-depth look at the CD Pipeline;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration –  What you’re reading now;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

In this part of the series, I am going to explain how we dynamically generate our configuration and avoid property files whenever possible. Instead of using property files, we store and retrieve configuration on the fly – as part of the CD pipeline – without predefining these values in a static file (i.e. a properties file) ahead of time. We do this using two methods: AWS SimpleDB and CloudFormation.

SimpleDB is a highly available non-relational data storage service that only stores strings in key value pairs. CloudFormation, as discussed in Part 3 of the series, is a scripting language for allocating and configuring AWS virtual resources.

Using SimpleDB

Throughout the CD pipeline, we often need to manage state across multiple Jenkins jobs. To do this, we use SimpleDB. As the pipeline executes, values that will be needed by subsequent jobs get stored in SimpleDB as properties. When the properties are needed, we use a simple Ruby script to return the key/value pair from SimpleDB and then use it as part of the job. The values being stored and retrieved range from IP addresses and domain names to AMI (Amazon Machine Image) IDs.

So what makes this dynamic? As Jenkins jobs or CloudFormation templates are run, we often end up with properties that need to be used elsewhere. Instead of hard coding all of the values to be used in a property file, we create, store and retrieve them as the pipeline executes.

Below is the CreateTargetEnvironment Jenkins job script that creates a new target environment from a CloudFormation script, production.template:

if [ "$deployToProduction" == "true" ]; then

# Create CloudFormation Stack
ruby /usr/share/tomcat6/scripts/aws/create_stack.rb ${STACK_NAME} ${WORKSPACE}/production.template ${HOST} ${JENKINSIP} ${SSH_KEY} ${SGID} ${SNS_TOPIC}

# Load SimpleDB Domain with Key/Value Pairs
ruby /usr/share/tomcat6/scripts/aws/load_domain.rb ${STACK_NAME}

# Pull and store variables from SimpleDB
host=`ruby /usr/share/tomcat6/scripts/aws/showback_domain.rb ${STACK_NAME} InstanceIPAddress`

# Run Acceptance Tests
cucumber features/production.feature host=${host} user=ec2-user key=/usr/share/tomcat6/.ssh/id_rsa

Referenced above in the CreateTargetEnvironment code snippet, this is the load_domain.rb script, which iterates over a file and sends key/value pairs to SimpleDB:

require 'rubygems'
require 'aws-sdk'
load File.expand_path('../../config/aws.config', __FILE__)

stackname = ARGV[0]

file = File.open("/tmp/properties", "r")

sdb = AWS::SimpleDB.new

AWS::SimpleDB.consistent_reads do
  domain = sdb.domains["stacks"]
  item = domain.items["#{stackname}"]

  file.each_line do |line|
    key,value = line.split '='
    item.attributes.set("#{key}" => "#{value}")
  end
end

Referenced above in the CreateTargetEnvironment code snippet, this is the showback_domain.rb script, which connects to SimpleDB and returns a key/value pair:

load File.expand_path('../../config/aws.config', __FILE__)

item_name = ARGV[0]
key = ARGV[1]

sdb = AWS::SimpleDB.new

AWS::SimpleDB.consistent_reads do
  domain = sdb.domains["stacks"]
  item = domain.items["#{item_name}"]

  item.attributes.each_value do |name, value|
    if name == "#{key}"
      puts "#{value}".chomp
    end
  end
end

In the CreateTargetEnvironment code snippet above, we store the outputs of the CloudFormation stack in a temporary file. We then iterate over the file with the load_domain.rb script and store the key/value pairs in SimpleDB.

Following this, we make a call to SimpleDB with the showback_domain.rb script and return the instance IP address (created in the CloudFormation template) and store it in the host variable. host is then used by cucumber to ssh into the target instance and run the acceptance tests.

Using CloudFormation

In our CloudFormation templates we allocate multiple AWS resources. Every time we run a template, different resources are created. For example, in our jenkins.template we create a new IAM user; every time we run the template, a different IAM user with different credentials is created. We need a way to reference these resources. You can reference resources within other resources throughout the script using the Ref function in CloudFormation. Using Ref, you can dynamically refer to values of other resources, such as an IP address or domain name.

In the script below, we create an IAM user, reference that user to create AWS access keys, and then store them in environment variables:

"CfnUser" : {
  "Type" : "AWS::IAM::User",
  "Properties" : {
    "Path": "/",
    "Policies": [{
      "PolicyName": "root",
      "PolicyDocument": {

"HostKeys" : {
  "Type" : "AWS::IAM::AccessKey",
  "Properties" : {
    "UserName" : { "Ref": "CfnUser" }

"# Add AWS Credentials to Tomcatn",
"echo "AWS_ACCESS_KEY=", { "Ref" : "HostKeys" }, "" >> /etc/sysconfig/tomcat6n",
"echo "AWS_SECRET_ACCESS_KEY=", {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}, "" >> /etc/sysconfig/tomcat6n",

We can then use these access keys in other scripts by referencing the $AWS_ACCESS_KEY and $AWS_SECRET_ACCESS_KEY environment variables.

How is this different from typical configuration management?

Typically in many organizations, there’s a big property file with hard-coded key/value pairs that gets passed into the pipeline. The pipeline executes using the given parameters and cannot scale or change without a user modifying the property file. Because all of the properties are hard coded, if the property file pins the IP of an EC2 instance and that instance goes down for whatever reason, the pipeline doesn’t work until someone fixes the property file. There are more effective ways of doing this when using the cloud. The cloud provides on-demand resources that are constantly changing. These resources will have different IP addresses, domain names, etc. associated with them every time.

With dynamic configuration, there are no property files; every property is generated as part of the pipeline.

With this dynamic approach, the pipeline values change with every run. As new cloud resources are allocated, the pipeline adjusts itself automatically without the need for users to constantly modify property files. This leads to less time spent debugging the cumbersome property-file management issues that plague many companies.

In the next part of our series – which is all about Deployment Automation – we’ll go through scripting and testing your deployment using industry-standard tools. In this next article, you’ll see how to orchestrate deployment sequences and configuration using Capistrano.

Continuous Delivery in the Cloud: CD Pipeline (Part 2 of 6)

In part 1 of this series, I introduced the Continuous Delivery (CD) pipeline for the Manatee Tracking application and how we use this pipeline to deliver software from checkin to production. In this article I will take an in-depth look at the CD pipeline. A list of topics for each of the articles is summarized below.

Part 1: Introduction – Introduction to continuous delivery in the cloud and the rest of the articles;
Part 2: CD Pipeline – What you’re reading now;
Part 3: CloudFormation – Scripted virtual resource provisioning;
Part 4: Dynamic Configuration – “Property file less” infrastructure;
Part 5: Deployment Automation – Scripted deployment orchestration;
Part 6: Infrastructure Automation – Scripted environment provisioning (Infrastructure Automation)

The CD pipeline consists of five Jenkins jobs. These jobs are configured to run one after the other. If any one of the jobs fails, the pipeline fails and that release candidate cannot be released to production. The five Jenkins jobs are listed below (further details of these jobs are provided later in the article).

  1. A job that sets the variables used throughout the pipeline (SetupVariables)
  2. Build job (Build)
  3. Production database update job (StoreLatestProductionData)
  4. Target environment creation job (CreateTargetEnvironment)
  5. A deployment job (DeployManateeApplication) which enables a one-click deployment into production.

We used Jenkins plugins to add features to the core Jenkins configuration and extend the standard Jenkins setup. The plugins we use for the Sea to Shore Alliance Continuous Delivery configuration are listed below.

Parameterized Trigger:
Copy Artifact:
Build Pipeline:

The parameterized trigger, build pipeline and S3 plugins are used for moving the application through the pipeline jobs. The Ant, Groovy, and Grails plugins are used for running the build for the application code. The Subversion plugin is used for polling and checking out from version control.

Below, I describe each of the jobs that make up the CD pipeline in greater detail.

SetupVariables: Jenkins job used for entering the necessary property values, which are propagated through the rest of the pipeline.

Parameter: STACK_NAME
Type: String
Where: Used in both CreateTargetEnvironment and DeployManateeApplication jobs
Purpose: Defines the CloudFormation Stack name and SimpleDB property domain associated with the CloudFormation stack.

Parameter: HOST
Type: String
Where: Used in both CreateTargetEnvironment and DeployManateeApplication jobs
Purpose: Defines the CNAME of the domain created in the CreateTargetEnvironment job. The DeployManateeApplication job uses it when it dynamically creates configuration files. For instance, given a hostname of the form test.<your-domain>, test would be the HOST.

Parameter: PRODUCTION_IP
Type: String
Where: Used in the StoreProductionData job
Purpose: Sets the production IP for the job so that it can SSH into the existing production environment and run a database script that exports the data and uploads it to S3.

Parameter: deployToProduction
Type: Boolean
Where: Used in both CreateTargetEnvironment and DeployManateeApplication jobs
Purpose: Determines whether to use the development or production SSH keypair.

In order for the parameters to propagate through the pipeline, we pass the current build parameters using the parameterized build trigger plugin.

Build: Compiles the Manatee application’s Grails source code and creates a WAR file.

To do this, we utilize the Jenkins Grails plugin and run Grails targets such as compile and prod war. Next, we archive the Grails migrations for use in the DeployManateeApplication job, and then the job pushes the Manatee WAR to S3, which is used as an artifact repository.

Lastly, using the parameterized build trigger plugin, we trigger the StoreProductionData job with the current build parameters.

StoreProductionData: This job performs a pg_dump (PostgreSQL dump) of the production database and then stores it in S3 for the environment creation job to use when building up the environment. Below is a snippet from this job.

ssh -i /usr/share/tomcat6/development.pem -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no ec2-user@${PRODUCTION_IP} ruby /home/ec2-user/database_update.rb

On the target environments created using the CD pipeline, a database script is stored. The script goes into the PostgreSQL database and runs a pg_dump. It then pushes the pg_dump SQL file to S3 to be used when creating the target environment.

After the SQL file is stored successfully, the CreateTargetEnvironment job is triggered.

CreateTargetEnvironment: Creates a new target environment using a CloudFormation template to create all the AWS resources, and calls Puppet to provision the environment itself from a base operating system to a fully working target environment ready for deployment. Below is a snippet from this job.

if [ "$deployToProduction" == "true" ]; then

# Create CloudFormation Stack
ruby ${WORKSPACE}/config/aws/create_stack.rb ${STACK_NAME} ${WORKSPACE}/infrastructure/manatees/production.template ${HOST} ${JENKINSIP} ${SSH_KEY} ${SGID} ${SNS_TOPIC}

# Load SimpleDB Domain with Key/Value Pairs
ruby ${WORKSPACE}/config/aws/load_domain.rb ${STACK_NAME}

# Pull and store variables from SimpleDB
host=`ruby ${WORKSPACE}/config/aws/showback_domain.rb ${STACK_NAME} InstanceIPAddress`

# Run Acceptance Tests
cucumber ${WORKSPACE}/infrastructure/manatees/features/production.feature host=${host} user=ec2-user key=/usr/share/tomcat6/.ssh/id_rsa

# Publish notifications to SNS
sns-publish --topic-arn $SNS_TOPIC --subject "New Environment Ready" --message "Your new environment is ready. IP Address: $host. An example command to ssh into the box would be: ssh -i development.pem ec2-user@$host This instance was created by $JENKINS_DOMAIN" --aws-credential-file /usr/share/tomcat6/aws_access

Once the environment is created, a set of Cucumber tests is run to ensure it’s in the correct working state. If any test fails, the entire pipeline fails and the developer is notified that something went wrong. Otherwise, if they pass, the DeployManateeApplication job is kicked off and an AWS SNS email notification with information on how to access the new instance is sent to the developer.

DeployManateeApplication: Runs a Capistrano script which coordinates the deployment through a series of steps. A snippet from this job is displayed below.

if [ "$deployToProduction" != "true" ]; then


cap deploy:setup stack=${STACK_NAME} key=${SSH_KEY}

sed -i "s@manatee0@${HOST}@" ${WORKSPACE}/deployment/features/deployment.feature

host=`ruby ${WORKSPACE}/config/aws/showback_domain.rb ${STACK_NAME} InstanceIPAddress`
cucumber deployment/features/deployment.feature host=${host} user=ec2-user key=${SSH_KEY} artifact=

sns-publish --topic-arn $SNS_TOPIC --subject "Manatee Application Deployed" --message "Your Manatee Application has been deployed successfully. You can view it by going to http://$host/wildtracks This instance was deployed to by $JENKINS_DOMAIN" --aws-credential-file /usr/share/tomcat6/aws_access

This deployment job is the final piece of the delivery pipeline; it pulls together all of the pieces created in the previous jobs to successfully deliver working software.

During the deployment, the Capistrano script SSHes into the target server, deploys the new WAR and updated configuration changes, and restarts all services. Then the Cucumber tests are run to ensure the application is available and running successfully. Assuming the tests pass, an AWS SNS email gets dispatched to the developer with information on how to access their new development application.

We use Jenkins as the orchestrator of the pipeline. Jenkins executes a set of scripts and passes around parameters as it runs each job. Because of the role Jenkins plays, we want to make sure it’s treated the same way as the application – meaning versioning and testing all of our changes to the system. For example, if a developer modifies the create environment job configuration, we want the ability to revert back if necessary. Due to this requirement, we version the Jenkins configuration: the jobs, plugins and main configuration. To do this, a script is executed each hour using cron.hourly that checks for new jobs or updated configuration and commits them to version control.

The CD pipeline that we have built for the Manatee application enables any change in the application, infrastructure, database or configuration to move through to production seamlessly using automation. This allows any new features, security fixes, etc. to be fully tested as they get delivered to production at the click of a button.

In the next part of our series – which is all about using CloudFormation – we’ll go through a CloudFormation template used to automate the creation of a Jenkins environment. In this next article, you’ll see how CloudFormation procures AWS resources and provisions our Jenkins CD Pipeline environment.

Continuous Delivery in the Cloud Case Study

A Case Study on using 100% Cloud-based Resources with Automated Software Delivery

We help – typically large – organizations create one-click software delivery systems so that they can deliver software in a more rapid, reliable and repeatable manner (AKA Continuous Delivery). The only way this works is when Development works with Operations. As has been written elsewhere in this series, this means changing the hearts and minds of people, because most organizations are used to working in ‘siloed’ environments. In this entry, I focus on implementation by describing a real-world case study in which our team of Systems and Software Engineers brought Continuous Delivery Operations to the Cloud.

For years, we’ve helped customers with Continuous Integration and Testing, so more of our work was with Developers and Testers. Several years ago, we hired a Sys Admin/Engineer/DBA who was passionate about automation. As a result, we began assembling multiple two-person “DevOps” teams consisting of a Software Engineer and a Systems Engineer, both of whom are big-picture thinkers and not just “Developers” or “Sys Admins”. These days, we put together these targeted teams of Continuous Delivery and Cloud experts with hands-on experience as Software Engineers and Systems Engineers so that organizations can deliver software as quickly and as often as the business requires.

A couple of years ago we already had a few people in the company experimenting with Cloud infrastructures, so we thought this would be a great opportunity to provide cloud-based delivery solutions. In this case study, I cover a project we are currently working on for a large organization. It is a new Java-based web services project, so we’ve been able to implement solutions using our recommended software delivery patterns rather than being constrained by legacy tools or decisions. However, as I note, we aren’t without constraints on this project. If I were you, I’d call “BS!” on any “case study” in which everything went flawlessly and assume it was an extremely small or theoretical project in the author’s mind. This is the real deal. Enough said, on to the case study.

AWS Tools

Fast Facts

Industry: Healthcare, Public Sector
Profile: The customer is making available to all, free of charge, a series of software specifications and open source software modules that together make up an oncology-extended Electronic Health Record capability.
Key Business Issues: The customer wanted all team members to have “unencumbered” access to infrastructure resources without the usual “request and wait” queue-based procedures present in most organizations
Stakeholders: Over 100 people consisting of Developers, Testers, Analysts, Architects, and Project Management.
Solution: Continuous Delivery Operations in the Cloud
Key Tools/Technologies: Amazon Web Services – AWS (Elastic Compute Cloud (EC2), Simple Storage Service (S3), Elastic Block Storage (EBS), etc.), Jenkins, JIRA Studio, Ant, Ivy, Tomcat and PostgreSQL

The Business Problem
The customer was used to dealing with long, drawn-out processes with Operations teams that lacked agility. They were accustomed to submitting Word documents via email to an Operations team, attending multiple meetings and getting their environments set up weeks or months later. We were compelled to develop a solution that reduced or eliminated these problems, which are all too common in many large organizations (Note: each problem is identified with a letter and number, for example P1, and referred to later):

  1. Unable to deliver software to users on demand (P1)
  2. Queued requests for provisioned instances (P2)
  3. Unable to reprovision precise target environment configuration on demand (P3)
  4. Unable to provision instances on demand (P4)
  5. Configuration errors in target environments presenting deployment bottlenecks while Operations and Development teams troubleshoot errors (P5)
  6. Underutilized instances (P6)
  7. No visibility into the purpose of an instance (P7)
  8. No visibility into the costs of an instance (P8)
  9. Users cannot terminate instances (P9)
  10. Increased Systems Operations personnel costs (P10)

Our Team
We put together a four-person team to create a solution for delivering software and managing the internal Systems Operations for this 100+ person project. We also hired a part-time Security expert. The team consists of two Systems Engineers and two Software Engineers focused on Continuous Delivery and the Cloud. One of the Software Engineers is the Solutions Architect/PM for our team.

Our Solution
We began with the end in mind based on the customer’s desire for unencumbered access to resources. To us, “unencumbered” did not mean without controls; it meant providing automated services over queued “request and wait for the Ops guy to fulfill the request” processes. Our approach is that every resource is in the cloud: Software as a Service (SaaS), Platform as a Service (PaaS) or Infrastructure as a Service (IaaS), to reduce operations costs (P10) and increase efficiency. In doing this, effectively all project resources are available on demand in the cloud. We have also automated the software delivery process to Development and Test environments and are working on one-click delivery to production. I’ve identified the problems we’re solving – from above – in parentheses (P1, P8, etc.). The solution includes:

  • On-Demand Provisioning – All hardware is provided via EC2’s virtual instances in the cloud, on demand (P2). We’ve developed a “Provisioner” (PaaS) that provides any authorized team member the capability to click a button and get their project-specific target environment (P3) in the AWS cloud – thus providing unencumbered access to hardware resources (P4). The Provisioner provides all authorized team members the capability to monitor instance usage (P6) and adjust accordingly. Users can terminate their own virtual instances (P9). A minimal command-line sketch of this kind of provisioning appears after this list.
  • Continuous Delivery Solution so that the team can deliver software to users on demand (P1):
    • Automated build script using Ant – used to drive most of the other automation tools
    • Dependency Management using Ivy. We will be adding Sonatype Nexus
    • Database Integration/Change using Ant and Liquibase
    • Automated Static Analysis using Sonar (with CheckStyle, FindBugs, JDepend, and Cobertura)
    • Test framework hooks for running JUnit, etc.
    • Remote deployment using custom Ant scripts that use Java Secure Channel, along with Web container configuration. However, we will be starting a process of moving to a more robust tool such as ControlTier to perform deployments
    • Automated document generation using Grand, SchemaSpy (ERDs) and UMLGraph
    • Continuous Integration server using Hudson
    • Continuous Delivery pipeline system – we are customizing Hudson to emulate a Deployment Pipeline
  • Issue Tracking – We’re using the JIRA Studio SaaS product from Atlassian (P10), which provides issue tracking, version-control repository, online code review and a Wiki. We also manage the relationship with the vendor and perform the user administration including workflow management and reporting.
  • Development Infrastructure – There were numerous tools selected by the customer for Requirements Management and Test Management and Execution, including HP QC, LoadRunner, SoapUI and Jama Contour. Many of these tools were installed on the EC2 instances and are managed by our team
  • Instance Management – Any authorized team member is able to monitor virtual instance usage by viewing a web-based dashboard (P6, P7, P8) we developed. This helps to determine instances that should no longer be in use or may be eating up too much money. There is a policy that test instances (e.g. Sprint Testing) are terminated no less than every two weeks. This promotes ephemeral environments and test automation.
  • Deployment to Production – Much of the pre-production infrastructure is in place, but we will be adding some additional automation features to make it available to users in production (P1). The deployment sites are unique in that we aren’t hosting a single instance used by all users and it’s likely the software will be installed at each site. One plan is to deploy separate instances to the cloud or to virtual instances that are shipped to the user centers

  • System Monitoring and Disaster Recovery – Using CloudKick to notify us of instance errors or anomalies. EC2 provides us with some monitoring as well. We will be implementing a more robust monitoring solution using Nagios or something similar in the coming months. Through automation and supporting process, we’ve implemented a disaster recovery solution.
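The Provisioner itself is a custom web application, so the following is only a minimal command-line sketch of the underlying idea: launch a preconfigured instance and tag it so its purpose, owner and cost allocation are visible on the dashboard. It uses the present-day AWS CLI rather than the tooling we actually built on, and the AMI ID, key name and tag values are hypothetical.

# Sketch only: launch a project-specific instance and tag it for visibility (hypothetical values)
INSTANCE_ID=$(aws ec2 run-instances --image-id ami-12345678 --instance-type t2.micro --key-name project-dev --query 'Instances[0].InstanceId' --output text)

# Tag the instance so anyone can see its purpose, owner and cost allocation on the dashboard
aws ec2 create-tags --resources ${INSTANCE_ID} --tags Key=Purpose,Value="Sprint Testing" Key=Owner,Value=jdoe Key=CostCenter,Value=ehr-project

# Authorized users can terminate their own instances when they are done with them
aws ec2 terminate-instances --instance-ids ${INSTANCE_ID}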

The benefits are primarily around removing common bottlenecks from processes so that software can be delivered to users and team members more often. Also, we think our approach of providing on-demand services over queue-based requests increases agility and significantly reduces costs. Here are some of the benefits:

  • Deliver software more often – to users and internally (testers, managers, demos)
  • Deliver software more quickly – since the software delivery process is automated, we identify the SVN tag and click a button to deliver the software to any environment
  • Software delivery is rapid, reliable and repeatable. All resources can be reproduced with a single click – source code, configuration, environment configuration, database and network configuration are all checked in, versioned and part of a single delivery system.
  • Increased visibility to environments and other resources – All preconfigured virtual hardware instances are available for any project member to provision without needing to submit forms or attend countless meetings

Here are some of the tools we are using to deliver this solution. Some of the tools were chosen by our team exclusively and some by other stakeholders on the project.

  • AWS EC2 – Cloud-based virtual hardware instances
  • AWS S3 – Cloud-based storage. We use S3 to store temporary software binaries and backups
  • AWS EBS – Elastic Block Storage. We use EBS to attach PostgreSQL data volumes
  • Ant – Build Automation
  • CloudKick – Real-time Cloud instance monitoring
  • ControlTier – Deployment Automation. Not implemented yet.
  • HP LoadRunner – Load Testing
  • HP Quality Center (QC) – Test Management and Orchestration
  • Ivy – Dependency Management
  • Jama Contour – Requirements Management
  • Jenkins – Continuous Integration Server
  • JIRA Studio – Issue Tracking, Code Review, Version-Control, Wiki
  • JUnit – Unit and Component Testing
  • Liquibase – Automated database change management
  • Nagios – or Zenoss. Not implemented yet
  • Nexus – Dependency Management Repository Manager (not implemented yet)
  • PostgreSQL – Database used by the Development team. We’ve written scripts that automate database change management (a minimal sketch of the idea appears after this list)
  • Provisioner (Custom Web-based) – Target Environment Provisioning and Virtual Instance Monitoring
  • Puppet – Systems Configuration Management
  • QTP – Test Automation
  • SoapUI – Web Services Test Automation
  • Sonar – code quality analysis (Includes CheckStyle, PMD, Cobertura, etc.)
  • Tomcat/JBoss – Web container used by Development. We’ve written scripts to automate the deployment and container configuration
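The database change scripts themselves aren’t shown here; the following is only a minimal sketch of the kind of Liquibase invocation they wrap (ours are driven through Ant), with a hypothetical change log path, database name and credentials.

# Sketch only: apply pending database changes to the Development PostgreSQL instance (hypothetical values)
liquibase --driver=org.postgresql.Driver --classpath=lib/postgresql.jar --changeLogFile=db/changelog.xml --url="jdbc:postgresql://localhost:5432/devdb" --username=dev --password=dev update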

Solutions we’re in the process of Implementing
We’re less than a year into the project and have much more work to do. Here are a few projects we’re in the process of implementing or will be starting soon:

  • System Configuration Management – We’ve started using Puppet, but we will be expanding how it’s used in the future
  • Deployment Automation – The move to a more robust deployment automation tool such as ControlTier
  • Development Infrastructure Automation – Automating the provisioning and configuration of tools such as HP QC in a cloud environment

What we would do Differently
Typically, if we were to start a Java-based project and recommend tools around testing, we might choose the following tools for testing, requirements and test management, based on the particular need:

  • Selenium with SauceLabs
  • JIRA Studio for Test Management
  • JIRA Studio for Requirements Management
  • JMeter – or other open source tool – for Load Testing

However, like most projects, there are many stakeholders who have their preferred approach and tools they are familiar with, the same way our team does. Overall, we are pleased with how things are going so far, and the customer is happy with the infrastructure and approach that is in place at this time. I could probably do another case study on dealing with multiple SaaS vendors, but I will leave that for another post.

There’s much more I could have written about what we’re doing, but I hope this gives you a decent perspective on how we’ve implemented a DevOps philosophy with Continuous Delivery and the Cloud, and how this has led our customer to a more service-based, unencumbered and agile environment.

DevOps in the Cloud LiveLessons (Video)

DevOps in the Cloud LiveLessons walks viewers through the process of putting together a complete continuous delivery platform for a working software application written in Ruby on Rails, along with examples in other development platforms such as Grails and Java on the companion website. These applications are deployed to Amazon Web Services (AWS), which is an infrastructure as a service, commonly referred to as “the cloud”. Paul M. Duvall goes through the pieces that make up this platform, including infrastructure and environments, continuous integration, build and deployment scripting, testing and databases. Viewers will also learn configuration management and collaboration practices and techniques, along with what the nascent terms DevOps, continuous delivery and continuous deployment are all about. Finally, since this LiveLesson focuses on deploying to the cloud, viewers will learn the ins and outs of many of the services that make up the AWS cloud infrastructure. DevOps in the Cloud LiveLessons includes contributions by Brian Jakovich, who is a Continuous Delivery Engineer at Stelligent.

DevOps in the Cloud LiveLessons

Visit to download the complete continuous delivery platform examples that are used in these LiveLessons.

Lesson 1:
Deploying a Working Software Application to the Cloud provides a high-level introduction to all parts of the Continuous Delivery system in the Cloud. In this lesson, you’ll be introduced to deploying a working software application into the cloud.

Lesson 2:
DevOps, Continuous Delivery, Continuous Deployment and the Cloud covers how to define motivations and differentiators around Continuous Delivery, DevOps, Continuous Deployment and the Cloud. The lesson also covers diagramming software delivery using spaghetti diagrams and value-stream maps.

Lesson 3:
Amazon Web Services covers the basics of the leading Infrastructure as a Service provider. You’ll learn how to use the AWS Management Console, launch and interact with Elastic Compute Cloud (EC2) instances, define security groups to control access to EC2 instances, set up an elastic load balancer to distribute load across EC2 instances, set up Auto Scaling, and monitor resource usage with CloudWatch.

Lesson 4:
Continuous Integration shows how to set up a Continuous Integration (CI) environment, which is the first step to Continuous Integration, Continuous Delivery and Continuous Deployment.

Lesson 5:
Infrastructure Automation covers how to fully script an infrastructure so that you can recreate any environment at any time using AWS CloudFormation and the infrastructure automation tool Puppet. You’ll also learn about the “Chaos Monkey” tool made popular by Netflix – a tool that randomly and automatically terminates instances.

Lesson 6:
Building and Deploying Software teaches the basics of building and deploying a software application. You’ll learn how to create and run a scripted build, create and run a scripted deployment in Capistrano, manage dependent libraries using Bundler and an Amazon S3-backed repository, and deploy the software to various target environments including production using the Jenkins CI server. You will also learn how anyone on the team can use Jenkins to perform self-service deployments on demand.

Lesson 7:
Configuration Management covers the best approaches to versioning everything in a way where you have a single source of truth and can look at the software system and everything it takes to create the software as a holistic unit. You’ll learn how to work from the canonical version, version configurations, set up a dynamic configuration management database that reduces the repetition, and develop collective ownership of all artifacts.

Lesson 8:
Database covers how to entirely script a database, upgrade and downgrade a database, use a database sandbox to isolate changes from other developers and, finally, version all database changes so that they can run as part of a Continuous Delivery system.

Lesson 9:
Testing covers the basics of writing and running various tests as part of a Continuous Integration process. You’ll learn how to write simple unit tests that run quickly at the code level, infrastructure tests, and deployment tests – sometimes called smoke tests. You will also learn how to get feedback on the test results from the CI system.

Lesson 10:
Delivery Pipeline demonstrates how to use the Build Pipeline plug-in in Jenkins to create a delivery pipeline for the commit, acceptance, load & performance and Production stages so that software can potentially be delivered to users with every change.

Continuous Delivery with Jenkins, CloudFormation and Puppet Online Course

This 3-hour online course is ideal for Developers or Sys Admins who want to know how to script a complete Continuous Delivery platform in the Cloud using Jenkins, AWS and Puppet. By the end of the course, you will have a working Continuous Integration system on AWS with all of the source code. The course is led by the award-winning author of Continuous Integration and Stelligent CTO, Paul Duvall. The next course is Tuesday, August 14, 2012 at 1 PM EDT.

Attendees should be at least a junior-level Developer or Linux Sys Admin

In the course, you will:

  1. Use AWS CloudFormation to script the provisioning of AWS resources such as EC2, Route 53 and S3
  2. Script the entire Jenkins and target environment infrastructure using Puppet
  3. Create a fully-functioning Continuous Delivery platform using Jenkins and AWS
  4. Participate in discussions on Continuous Delivery patterns and best practices

You will also be able to email the trainer after the course with questions on setting up your CI platform in AWS.

Register now

There will be 15+ separate exercises that you will go through in creating a working Jenkins system in AWS. If you do not have an AWS account, you will need to sign up by registering with Amazon Web Services as a part of the course pre-work (a credit card to pay AWS is required). The course material will be provided to students prior to the session. You will spend approximately $10 on AWS fees during the 3-hour training session.

You will be provided access to all of the source code and instructional material prior to, during and after the course. If you have any questions, contact us at A website with some of the example content for the course is located at

Test-Driven Everything

In a lot of ways, developer-time testing is a solved problem. That’s not to say that test-driven development is always easy. There are still plenty of people and technical issues to sort through when you decide to start testing your application aggressively. That said, I think that bringing testing into every aspect of your software development lifecycle presents a much greater challenge, and is therefore more fun to tackle.

I get asked frequently how to automate testing for every team member. The reasoning usually goes something like this, “If test-driven development is good for programmers, why wouldn’t it be good for business analysts?” Feel free to replace “business analysts” with anyone on the project team – customer, project manager, quality assurance engineer, etc. It may seem a little radical. Fortunately, I agree with them in principle.

In fact, I like to say that tests are the primary artifacts that drive the communication from one team member to another. More precisely, I mean an executable test that gets run with every build and fails when the application no longer does what we want. In this post, I describe how my team and I got this working on a very successful project.

I’ll set the stage by describing the application, team makeup and development process. The application was a “back-end” system, meaning that it had no graphical user interface. Messages flowed into the system, which in turn updated the state of the database and orchestrated various interactions with other external systems. After much processing, the application spit out a message to indicate success, failure or some variation of the two. The entire project ran full time for a little more than one year. The team consisted of about 15 people consistently, sometimes going as high as 25. We had a fantastic war room, large enough for everyone on the team. And we had a very well-defined workflow for our Stories. At each step in this process, there was a heavy focus on testing.


Yep, that’s right, two passes through Quality Assurance. I’ll come back to why. And, just in case you think this sounds a little heavyweight, a Story would flow through this process in less than 2 weeks with lots of opportunity for the all-important feedback loop. We usually had about 6 Stories flowing in parallel to keep everyone on the team fully utilized.

Step one was prioritization with the customer. As the business customers reviewed and prioritized Stories, they would conduct meetings to determine what each Story really meant. Inevitably during this meeting interesting edge cases would arise. If the edge case was in scope for the Story, it would be documented in the body of the Story card. Those edge cases were the first stab at testing. Each case was essentially a comment from the customers saying “…and don’t forget to test that the application will handle this scenario”. Over time, the customers learned that by documenting these interesting variations right alongside the Story, they were helping to make the application better.

Next, the Story would get a thorough analysis. That meant looking into the guts of the functionality to make sure that every aspect was written down for the developer. As a part of that, the analysts would create tables of test scenarios. In fact some Stories consisted mostly of test cases. It’s surprising how well one can communicate complex functionality by giving examples. The amount of verbiage in the Story decreased as the analysts started to think more about how to test the functionality, instead of how to describe the functionality.

The analysts would also incorporate those edge cases identified by the business customers into the test cases. Frequently, the analysts would realize even more unique variants, and those too…would be documented in the Story as an example test case.

Ok, so far, this doesn’t sound much different than what everyone does. The difference was, those tables of test cases identified by the analysts were executable! Ok…you caught me…they were nearly executable. We taught the analysts how to create test cases consumable by the testing frameworks Fit and FitNesse. The output from the analysis process was a textual description of the desired functionality with embedded executable (nearly) example test cases.

The third step in the process was a stop by the quality assurance folks. They would review the Story and think about how to test the functionality. They frequently discovered even more variations and potential impacts on the rest of the system’s functionality. They possessed an intimate understanding of the custom FitNesse fixtures defined by the team. Therefore, they were able to streamline the testing by using meaningful variables, random number generators to increase the tests’ completeness, and so on.

At this point, the Story was truly executable. Meaning, we could run it in FitNesse. It would of course fail, because the functionality hadn’t been implemented by the development team, but we would get a red bar. Red bars make a good starting point for writing code.

And here we are! Developers get to write some code to make that test pass.

We referred to our development approach as “Fit-First Development”. Just like test-driven development, where you start by writing a broken unit test, we would start with a broken Fit test and write just enough code to make it pass. Actually, we would drill down from the Fit test into the code, writing tests all the way along. We would sometimes expand the Fit test. We would always write unit tests, in this case using JUnit. As we got the unit tests passing we would pop our way back up to the Fit test. Sometimes, we would get the Fit test working one table row at a time until the correct functional implementation emerged.

As the tech lead on the team that pair programmed to write code all day long, I found the approach to be exhilarating. Never before had I felt so assured that the code my pair and I wrote actually did what the business wanted.

After the developers completed the code, i.e. made the Fit test green bar, they would pass it back to the QA team. One point here…The QA team was sitting in the same room. As mentioned, everyone including the customer was sitting in the same war room. Although my description might make it sound like chunks of work were being lobbed over the wall, that wasn’t the case. We talked back and forth continuously. There were times when the developers would encounter a contradiction between the plain English description and the test scenarios. In those cases, it was a matter of minutes before the analysts and QA people were hunched over the developer work station determining the correct answer.

So, the Story, along with the code flowed back into the hands of the QA team. They would do a thorough review of the functionality. Sometimes they would add more test scenarios that had become obvious, which might in turn result in the Story heading back into development (the aforementioned feedback loop). Once QA was satisfied, they would call the customer representative over to do a final functionality walk-through. Since it was a back-end system, the Fit test plan provided a visual mechanism that even a technically challenged person could relate to. When the customer representative was satisfied, the final sign-off was provided. In this manner, each and every Story went from start to finish.

A side-effect of this approach was that a huge regression test suite was created step-by-step throughout the course of the project. The executable Stories were stored in Subversion, right alongside the code. The Stories, and the tests contained therein, became first-class artifacts that warranted ongoing maintenance to ensure that functionality developed months ago didn’t inadvertently get broken.

The other aspect of this was that we included the entire FitNesse wiki in Subversion, too. We used lightweight, in-memory containers to simulate the production environment. That meant that anyone on the team could check out the project from Subversion and run the entire regression test suite locally. Nobody had to connect to a central server. Nobody needed their own database schema. You could check out from Subversion, double-click a batch/shell script, and off you go. Everything just worked!
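That batch/shell script isn’t reproduced in this post, but a minimal sketch of the idea, checking out the project and running the whole FitNesse suite headlessly, might look like the following; the repository URL, suite name and jar location are hypothetical.

#!/bin/bash
# Sketch only: check out the project and run the full Fit regression suite locally (hypothetical names)
svn checkout http://svn.example.com/project/trunk project && cd project

# Run the FitNesse suite from the command line; a non-zero exit means at least one test failed
java -jar lib/fitnesse-standalone.jar -d fitnesse -c "RegressionSuite?suite&format=text"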

Was it easy to make everything just work? Yes. It was easy because we started the project knowing that we wanted to make everything just work. It was the culmination of a couple years of trying to get better and better about automating the entire process. But now, I would seldom knowingly do it any other way. Automate everything. Check in everything. Make your Stories an executable description. And treat those Test Scenarios just like you would your application’s code.

Originally authored by Paul Julius at