Docker Swarm Mode on AWS

Docker Swarm Mode is the latest entrant in a large field of container orchestration systems. Docker Swarm was originally released as a standalone product that ran master and agent containers on a cluster of servers to orchestrate the deployment of containers. This changed with the release of Docker 1.12 in July of 2016. Docker Swarm Mode is now officially part of Docker Engine, built right into every installation of Docker. Swarm Mode brought many improvements over the standalone Swarm product, including:

  • Built-in Service Discovery: Docker Swarm originally included drivers to integrate with Consul, etcd, or ZooKeeper for service discovery. However, this required the setup of a separate cluster dedicated to service discovery. The Swarm Mode manager nodes now assign a unique DNS name to each service in the cluster and load balance requests across the running containers in those services.
  • Mesh Routing: One of the most distinctive features of Docker Swarm Mode is Mesh Routing. All of the nodes within a cluster are aware of the location of every container within the cluster via gossip. This means that if a request arrives on a node that is not currently running the service for which that request was intended, the request will be routed to a node that is running a container for that service. Nodes therefore don’t have to be purpose-built for specific services: any node can run any service, and every node can be load balanced equally, reducing complexity and the number of resources needed for an application.
  • Security: Docker Swarm Mode uses TLS encryption for communication between services and nodes by default.
  • Docker API: Docker Swarm Mode utilizes the same API that every user of Docker is already familiar with. No need to install or learn additional software.
  • But wait, there’s more! Check out some of the other features at Docker’s Swarm Mode Overview page.
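To see the routing mesh in action, here is a quick sketch (nginx is just a stand-in image, and the node address is a placeholder for any node in your cluster):

```shell
# Publish port 8080 for a single-replica service.
docker service create --name web --replicas 1 --publish 8080:80 nginx

# Every node in the swarm answers on port 8080, even nodes not running the
# container; the routing mesh forwards the request to a node that is.
curl http://<any-node-ip>:8080
```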

For companies facing increasing complexity in Docker container deployment and management, Docker Swarm Mode provides a convenient, cost-effective, and performant tool to meet those needs.

Creating a Docker Swarm cluster


For the sake of brevity, I won’t reinvent the wheel and go over manual cluster creation here. Instead, I encourage you to follow the fantastic tutorial on Docker’s site.

What I will talk about, however, is the new Docker for AWS tool that Docker recently released. This is an AWS CloudFormation template that can be used to quickly and easily set up all of the necessary resources for a highly available Docker Swarm cluster, and because it is a CloudFormation template, you can edit it to add any additional resources, such as Route53 hosted zones or S3 buckets, to your application.

One of the very interesting features of this tool is that it dynamically configures the listeners for your Elastic Load Balancer (ELB). Once you deploy a service on Docker Swarm, the built-in management service that is baked into instances launched with Docker for AWS will automatically create a listener for any published ports for your service. When a service is removed, that listener will subsequently be removed.

If you want to create a Docker for AWS stack, read over the list of prerequisites, then click the Launch Stack button below. Keep in mind you may have to pay for any resources you create. If you are deploying Docker for AWS into an older account that still has EC2-Classic, or wish to deploy Docker for AWS into an existing VPC, read the FAQ here for more information.


Deploying a Stack to Docker Swarm

With the release of Docker 1.13 in January of 2017, major enhancements were added to Docker Swarm Mode that greatly improved its ease of use. Docker Swarm Mode now integrates directly with Docker Compose v3 and officially supports the deployment of “stacks” (groups of services) via docker-compose.yml files. With the new properties introduced in Docker Compose v3, it is possible to specify node affinity via tags, rolling update policies, restart policies, and desired scale of containers. The same docker-compose.yml file you would use to test your application locally can now be used to deploy to production. Here is a sample service with some of the new properties:

version: "3"
services:
  vote:
    image: dockersamples/examplevotingapp_vote:before
    ports:
      - 5000:80
    networks:
      - frontend
    deploy:
      replicas: 2
      update_config:
        parallelism: 1
        delay: 10s
      restart_policy:
        condition: on-failure
      placement:
        constraints: [node.role == worker]

While most of the properties within this YAML structure will be familiar to anyone used to Docker Compose v2, the deploy property is new to v3. The replicas field indicates the number of containers to run within the service. The update_config  field tells the swarm how many containers to update in parallel and how long to wait between updates. The restart_policy field determines when a container should be restarted. Finally, the placement field allows container affinity to be set based on tags or node properties, such as Node Role. When deploying this docker-compose file locally, using docker-compose up, the deploy properties are simply ignored.

Deployment of a stack is incredibly simple. Follow these steps to download Docker’s example voting app stack file and run it on your cluster.

SSH into any one of your Manager nodes with the user 'docker' and the EC2 Keypair you specified when you launched the stack.
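In concrete terms, that looks something like the following (the key path and manager address are placeholders for your own values):

```shell
# Substitute your own keypair file and a manager node's public IP or DNS name.
ssh -i ~/.ssh/my-swarm-keypair.pem docker@<manager-public-ip>
```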

curl -O

docker stack deploy -c docker-stack.yml vote

You should now see Docker creating your services, volumes and networks. Now run the following command to view the status of your stack and the services running within it.

docker stack ps vote

You’ll get output similar to this:


This shows the container id, container name, container image, node the container is currently running on, its desired and current state, and any errors that may have occurred. As you can see, the vote_visualizer.1 container failed at run time, so it was shut down and a new container spun up to replace it.

This sample application opens up three ports on your Elastic Load Balancer (ELB): 5000 for the voting interface, 5001 for the real-time vote results interface, and 8080 for the Docker Swarm visualizer. You can find the DNS name of your ELB either by going to the EC2 Load Balancers page of the AWS console, or by viewing your CloudFormation stack’s Outputs tab on the CloudFormation page of the AWS console. Here is an example of the CloudFormation Outputs tab:


DefaultDNSTarget is the URL you can use to access your application.

If you access the Visualizer on port 8080, you will see an interface similar to this:


This is a handy tool to see which containers are running, and on which nodes.

Scaling Services

Scaling services is as simple as running the command docker service scale SERVICENAME=REPLICAS, for example:

docker service scale vote_vote=3

will scale the vote service to 3 containers, up from 2. Because Docker Swarm uses an overlay network, it is able to run multiple containers of the same service on the same node, allowing you to scale your services as high as your CPU and memory allocations will allow.
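After scaling, you can confirm that the replica count has converged with the service commands (the service name below follows the vote stack from earlier; adjust it to your own):

```shell
docker service ls            # lists each service with its running/desired replica counts
docker service ps vote_vote  # shows which node each replica of the vote service landed on
```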

Updating Stacks

If you make any changes to your docker-compose file, updating your stack is incredibly easy. Simply run the same command you used to create your stack:

docker stack deploy -c docker-stack.yml vote

Docker Swarm will update any services that were changed from the previous version, adhering to any update_config settings specified in the docker-compose file. In the case of the vote service specified above, only one container will be updated at a time, and a 10-second delay will occur after the first container is successfully updated before the second container is updated.

Next Steps

This was just a brief overview of the capabilities of Docker Swarm Mode in Docker 1.13. For further reading, feel free to explore the Docker Swarm Mode and Docker Compose docs. In another post, I’ll be going over some of the advantages and disadvantages of Docker Swarm Mode compared to other container orchestration systems, such as ECS and Kubernetes.

If you have any experiences with Docker Swarm Mode that you would like to share, or have any questions on any of the materials presented here, please leave a comment below!


Docker Swarm Mode

Docker for AWS

Docker Example Voting App Github

Introduction to Amazon Lightsail

At re:Invent 2016 Amazon introduced Lightsail, the newest addition to the list of AWS Compute Services. It is a quick and easy way to launch a virtual private server within AWS.

As someone who moved into the AWS world from an application development background, this sounded pretty interesting to me. Getting started with deploying an app can be tricky, especially if you want to do it with code and scripting rather than going through the web console. CloudFormation is an incredible tool, but I can’t be the only developer to look at the user guide for deploying an application and then decide that doing my demo from localhost wasn’t such a bad option. There is a lot to learn there, and it can be frustrating because you just want to get your app up and running, but before you can even start working on that you have to figure out how to create your VPC correctly.

Lightsail takes care of that for you. The required configuration is minimal: you pick a region, an allocation of compute resources (i.e. memory, CPU, and storage), and an image to start from. They even offer images tailored to common developer setups, so it is possible to just log in, download your source code, and be off to the races.

No, you can’t use Lightsail to deploy a highly available, load-balanced application, but if you are new to working with AWS it is a great way to get a feel for what you can do without being overwhelmed by all the possibilities. Plus, once you get the hang of it you have a better foundation for branching out to some of the more comprehensive solutions offered by Amazon.

Deploying a Node App From Github

Let’s look at a basic example.  We have source code in Github, we want it deployed on the internet.  Yes, this is the monumental kind of challenge that I like to tackle every day.


You will need:

  • An AWS Account
    • At this time Lightsail is only supported in us-east-1 so you will have to try it out there.
  • The AWS Command Line Interface
    • The Lightsail CLI commands are relatively new so please make sure you are updated to the latest version
  • About ten minutes

Step 1 – Create an SSH Key

First of all, let’s create an SSH key for connecting to our instance. This is not required, since Lightsail has a default key that you can use, but it is generally better to avoid using shared keys.
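With the AWS CLI, creating and saving a key looks roughly like this (the key pair name is an arbitrary example):

```shell
# Create a Lightsail key pair and save the private key locally.
aws lightsail create-key-pair \
  --key-pair-name lightsail-demo-key \
  --query privateKeyBase64 \
  --output text > lightsail-demo-key.pem

# Lock down permissions so ssh will accept the key.
chmod 400 lightsail-demo-key.pem
```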

Step 2 – Create a User Data Script

The user data script is how you give instructions to AWS to tailor an instance to your needs.   This can be as complex or simple as you want it to be. For this case we want our instance to run a Node application that is in Github.  We are going to use Amazon Linux for our instance so we need to install Node and Git then pull down the app from Github.

Take the following snippet and save it to a file. Feel free to modify it if you have a different app that you would like to try deploying.
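As a sketch of what such a user data script can look like (this assumes Amazon Linux, the NodeSource setup script for the Node 6.x line, and a placeholder repository URL that you should swap for your own app):

```shell
#!/bin/bash
# Install Git and a recent Node.js (Amazon Linux did not ship one at the time).
yum update -y
yum install -y git
curl --silent --location https://rpm.nodesource.com/setup_6.x | bash -
yum install -y nodejs

# Pull down the application and start it. The repository URL is a placeholder.
git clone https://github.com/your-user/your-node-app.git /opt/app
cd /opt/app
npm install
npm start &
```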

A user data script is not the only option here. Lightsail supports a variety of preconfigured images. For example, it has a WordPress image, so if you needed a WordPress server you wouldn’t have to do anything but launch it. It also supports creating an instance snapshot, so you could start an instance, log in, do any necessary configuration manually, and then save that snapshot for future use.

That being said, once you start to move beyond what Lightsail provides you will find yourself working with instance user data for a variety of purposes, and it is nice to get a feel for it with some basic examples.

Step 3 – Launch an Instance

Next we have to actually create the instance. We simply call the create-instances command and reference the key and user data script we just made.

This command has several parameters so let’s run through them quickly:

  • instance-names – Required: This is the name for your server.
  • availability-zone – Required: An availability zone is an isolated datacenter within a region.  Since Lightsail isn’t concerned with high-availability deployments, we just have to choose one.
  • blueprint-id – Required: The blueprint is the reference to the server image.
  • bundle-id – Required: The set of specs that describe your server.
  • [user-data-file] – Optional: This is the file we created above.  If no script is specified your instance will have the functionality provided by the blueprint, but no capabilities tailored to your needs.
  • [key-pair-name] – Optional: This is the private key that we will use to connect to the instance.  If this is not specified there is a default key that is available through the Lightsail console.
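Put together, the call looks something like this (all of the names and IDs below are example values; `aws lightsail get-blueprints` and `aws lightsail get-bundles` list the valid options):

```shell
aws lightsail create-instances \
  --instance-names node-demo \
  --availability-zone us-east-1a \
  --blueprint-id amazon_linux \
  --bundle-id nano_1_0 \
  --user-data file://userdata.sh \
  --key-pair-name lightsail-demo-key
```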

It will take about a minute for your instance to be up and running.  If you have the web console open you can see when it is ready.
Once we are running it is time to check out our application…

Or not.

Step 4 – Troubleshooting

Let’s log in and see where things went wrong.  We will need the key we created in the first step and the IP address of our instance.  You can see that in the web console, or pull the instance data through another CLI command.
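For example, a sketch of grabbing the public IP with the CLI and logging in (the instance and key names are examples from earlier in this post, and Amazon Linux’s default user is ec2-user):

```shell
# Fetch the instance's public IP address.
aws lightsail get-instance \
  --instance-name node-demo \
  --query 'instance.publicIpAddress' \
  --output text

# Then connect with your key, substituting the IP that was returned.
ssh -i lightsail-demo-key.pem ec2-user@<public-ip>
```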

When you are troubleshooting any sort of AWS virtual server a good place to start is by checking out: /var/log/cloud-init-output.log


The cloud-init-output.log file contains the output from the instance launch commands.  That includes the commands run by Amazon to configure the instance as well as any commands from your user data script.  Let’s take a look…


OK… that actually looks like it started the app correctly.  So what is the problem?  Well, if you looked at the application linked above and actually read the README (which, frankly, sounds exhausting) you probably already know…

If we take a look at the firewall settings for the instance’s networking, the problem becomes clear: the port our application listens on is not open.


Step 5 – Update The Instance Firewall

We can fix this!  AWS manages the VPC that your instance is deployed in, but you still have the ability to control access.  We just have to open the port that our application is listening on.
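With the CLI that is a single call. This sketch assumes the app listens on port 3000 (a common Node default); adjust it to whatever port your application actually uses:

```shell
aws lightsail open-instance-public-ports \
  --instance-name node-demo \
  --port-info fromPort=3000,toPort=3000,protocol=tcp
```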

Then if we reload the page: Success!


Wrapping Up

That’s it, you are deployed!  If you’re familiar with Heroku you are probably not particularly impressed right now, but if you tried to use AWS to script out a simple app deployment in the past and got fed up the third time your CloudFormation stack rolled back due to incorrectly configured subnets, I encourage you to give Lightsail a shot.

Did you find this post interesting? Are you passionate about working with the latest AWS technologies? If you are, Stelligent is hiring and we would love to hear from you!

More Info

Lightsail Announcement

Full Lightsail CLI Reference

DevOps in AWS Radio: AWS CodeBuild (Episode 5)

In this episode, Paul Duvall and Brian Jakovich from Stelligent cover recent DevOps in AWS news and speak about the release of AWS CodeBuild and how you can integrate the service with other services on AWS.

Here are the show notes:

DevOps in AWS News

Episode Topics

  1. Benefits of using AWS CodeBuild along with alternatives
  2. Which programming languages/platforms and operating systems does CodeBuild support?
  3. What’s the pricing model?
  4. How does the buildspec.yml file work?
  5. How do you run CodeBuild (SDK, CLI, Console, CloudFormation)?
  6. Any Jenkins integrations?
  7. Which source providers does CodeBuild support?
  8. How to integrate CodeBuild with the rest of Developer Tools Suite

Related Blog Posts

  1. An Introduction to AWS CodeBuild
  2. Deploy to Production using AWS CodeBuild and the AWS Developer Tools Suite

About DevOps in AWS Radio

On DevOps in AWS Radio, we cover topics around applying DevOps principles and practices such as Continuous Delivery in the Amazon Web Services cloud. This is what we do at Stelligent for our customers. We’ll bring listeners into our roundtables and speak with engineers who’ve recently published on our blog and we’ll also be reaching out to the wider DevOps in AWS community to get their thoughts and insights.

The overall vision of this podcast is to describe how listeners can create a one-click (or “no click”) implementation of their software systems and infrastructure in the Amazon Web Services cloud so that teams can deliver software to users whenever there’s a business need to do so. The podcast will delve into the cultural, process, tooling, and organizational changes that can make this possible including:

  • Automation of
    • Networks (e.g. VPC)
    • Compute (EC2, Containers, Serverless, etc.)
    • Storage (e.g. S3, EBS, etc.)
    • Database and Data (RDS, DynamoDB, etc.)
  • Organizational and Team Structures and Practices
  • Team and Organization Communication and Collaboration
  • Cultural Indicators
  • Version control systems and processes
  • Deployment Pipelines
    • Orchestration of software delivery workflows
    • Execution of these workflows
  • Application/service Architectures – e.g. Microservices
  • Automation of Build and deployment processes
  • Automation of testing and other verification approaches, tools and systems
  • Automation of security practices and approaches
  • Continuous Feedback systems
  • Many other Topics…

Sharing for the People: Stelligentsia Publications

Many moons ago, Jonny coined the term “Stelligentsia” to refer to our small, merry band of technologists at the time. Times have changed and the team has grown by a factor of 10, but we strive to live up to the name as all things DevOps and AWS continue to evolve.


We find the best way to do this is not only to continually learn and improve, but to also share our knowledge with others – a core value at Stelligent. We believe that you are more valuable when you share knowledge with as many people as possible than when you keep it locked inside your head, granting only paying customers access to the principles and practices derived from these experiences.

Over the years at Stelligent, we’ve published many resources that are available to everyone. In this post, we do our best to list some of the blogs, articles, open source projects, refcardz, books, and other publications authored by Stelligentsia. If we listed everything, it’d likely be less useful, so we did our best to curate the publications listed in this post. We hope this collection provides a valuable resource to you and your organization.

Open Source

You can browse our entire GitHub organization online. The listing below represents the most useful, relevant repositories in our organization.



Here are the top 10 blog entries from Stelligentsia in 2016:

For many more detailed posts from Stelligentsia, visit the Stelligent Blog.

Series and Categories

  • Continuous Security – Five-part Stelligent Blog Series on applying security into your deployment pipelines
  • CodePipeline – We had over 20 CodePipeline-related posts in 2016 that describe how to integrate the deployment pipeline orchestration service with other tools and services
  • Serverless Delivery – Parts 1, 2, and 3

AWS Partner Network



Over the years, we have published many articles across the industry. Here are a few of them.

IBM developerWorks: Automation for the People

You can find a majority of the articles in this series by going to Automation for the People. Below, we provide a list of all accessible articles:

IBM developerWorks: Agile DevOps

You can find 7 of the 10 articles in this series by going to Agile DevOps. Below, we provide a list of all 10 articles:


  • Continuous Delivery Patterns – A description and visualization of Continuous Delivery patterns and antipatterns derived from the book on Continuous Delivery and Stelligent’s experiences.
  • Continuous Integration Patterns and Antipatterns – Reviews Patterns (a solution to a problem) and Anti-Patterns (ineffective approaches sometimes used to “fix” a problem) in the CI process.

Videos and Screencasts

Collectively, we have spoken at large conferences, meetups, and other events. We’ve included video recordings of some of these along with some recorded screencasts on DevOps, Continuous Delivery, and AWS.

AWS re:Invent

How-To Videos

  • DevOps in the Cloud – DevOps in the Cloud LiveLessons walks viewers through the process of putting together a complete continuous delivery platform for a working software application written in Ruby on Rails
  • DevOps in AWS – DevOps in AWS focuses on how to implement the key architectural construct in continuous delivery: the deployment pipeline. Viewers receive step-by-step instructions on how to do this in AWS based on an open-source software system that is in production. They also learn the DevOps practices teams can embrace to increase their effectiveness.

Stelligent’s YouTube



  • DevOps in AWS Radio – How to create a one-click (or “no click”) implementation of software systems and infrastructure in the Amazon Web Services cloud so that teams can deliver software to users whenever there’s a business need to do so. The podcast delves into the cultural, process, tooling, and organizational changes that make this possible.

Social Media


  • Continuous Integration: Improving Software Quality and Reducing Risk – Through more than forty CI-related practices using application examples in different languages, readers learn that CI leads to more rapid software development, produces deployable software at every step in the development lifecycle, and reduces the time between defect introduction and detection, saving time and lowering costs. With successful implementation of CI, developers reduce risks and repetitive manual processes, and teams receive better project visibility.

Additional Resources

Stelligent Reading List – A reading list for all Stelligentsia that we update from time to time.

Stelligent is hiring! Do you enjoy working on complex problems like figuring out ways to automate all the things as part of a deployment pipeline? Do you believe in the “one-button everything” mantra? If your skills and interests lie at the intersection of DevOps automation and the AWS cloud, check out the careers page on our website.


AWS CodeBuild is Here

At re:Invent 2016, AWS introduced AWS CodeBuild, a new service that compiles source code, runs tests, and produces ready-to-deploy software packages.  AWS CodeBuild handles provisioning, management, and scaling of your build servers.  You can either use pre-packaged build environments to get started quickly, or create custom build environments using your own build tools.  CodeBuild charges by the minute for compute resources, so you aren’t paying for a build environment while it is not in use.

AWS CodeBuild Introduction

Stelligent engineer Harlen Bains has posted An Introduction to AWS CodeBuild to the AWS Partner Network (APN) Blog.  In the post he explores the basics of AWS CodeBuild and then demonstrates how to use the service to build a Java application.

Integrating AWS CodeBuild with AWS Developer Tools

In the follow-up post, Deploy to Production using AWS CodeBuild and the AWS Developer Tools Suite, Stelligent CTO and AWS Community Hero Paul Duvall expands on how to integrate and automate the orchestration of CodeBuild with the rest of the AWS Developer Tools suite – including AWS CodeDeploy, AWS CodeCommit, and AWS CodePipeline – using AWS’ provisioning tool, AWS CloudFormation.  He goes over the benefits of automating all the actions and stages into a deployment pipeline, while also providing an example with a detailed screencast.

In the Future

Look to the Stelligent Blog for announcements, evaluations, and guides on new AWS products.  We are always looking for engineers who love to make things work better, faster, and just get a kick out of automating everything.  If you live and breathe DevOps, continuous delivery, and AWS, we want to hear from you.

Big Data at AWS re:Invent 2016

AWS re:Invent 2016 has kicked off for me in the realm of Big Data. It’s a challenging topic and one of great interest to companies around the globe so it was a no-brainer to be hanging around with folks at The Mirage for the Big Data talks. This blog post will be a quick write up on some interesting topics, announcements and features of the various tools covered today.

Big Data in AWS

The Big Data Mini Con had no announcements for new services. However, Amazon’s ecosystem of Big Data tools is growing rapidly, and we got a sweet introduction to what is currently available. Here are some of the more interesting ones:

  • Import/Export Snowball – A nearly indestructible, petabyte-scale means of importing data into or exporting data out of Amazon S3.
  • Kinesis – AWS’s flavor of data streaming and real-time analytics processing.
  • Redshift – A petabyte scale data warehouse solution as a service.
  • EMR – Apache Ecosystem / Hadoop as a service.
  • Data Pipeline – Data orchestration service for inter-AWS and on-premise workflows.
  • S3 – Durable, infinitely scalable, distributed object storage in the cloud.
  • Direct Connect – Up to a 10 Gbps direct connection from your VPC to your on-premises network.
  • Machine Learning – A real-time predictive modeling service.
  • Quick Sight – A business intelligence, data visualization and analytics tool.

Announcement: Data Transformation with Lambda on Kinesis Streams

A new feature is coming to Kinesis: the ability to transform your streaming data with Lambda. The idea is to have a Lambda function transform your data as it comes in, instead of relying on an application running in EC2 to process the stream. It’s another very effective way of controlling costs and reducing the overhead of dealing with scaling yourself. In order to kickstart acceptance, Amazon intends to provide a library of templates for common transformation use cases.

Amazon EMR

Recently, autoscaling in Amazon Elastic MapReduce (EMR) became generally available. You can now configure your EMR clusters to scale based on metrics in AWS CloudWatch. It’s no dummy either: not only will it perform scaling operations based on your actual processing throughput, but it will also optimize your instance time.

  • The number of metrics available in CloudWatch for EMR clusters is staggering, and it’s this level of integration that makes autoscaling super intelligent. Instead of relying on abstract information about CPU and memory, which can be hit or miss depending on your workloads, you can configure scaling events to happen on real throughput metrics such as MapSlotsOpen, ReduceSlotsOpen or AppsPending, based on which tools you’re running.
  • Instance time optimization is built into your EMR cluster’s autoscaling. It will automatically give you full-utilization of your instance before terminating due to a scale-down event. So when you scale-up and purchase an hour of EC2 capacity, you will get the entire hour of extra horsepower before it scales down. This way you get all of the capacity you paid for versus paying for the full hour and only utilizing a few minutes of it.
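As a rough sketch of attaching such a policy with the CLI (the cluster and instance-group IDs are placeholders, the thresholds are arbitrary, and the JSON shape is an approximation of the policy document rather than a definitive reference):

```shell
# Scale out by one instance whenever YARN has applications waiting for capacity.
cat > scaling-policy.json <<'EOF'
{
  "Constraints": { "MinCapacity": 2, "MaxCapacity": 10 },
  "Rules": [
    {
      "Name": "ScaleOutOnAppsPending",
      "Action": {
        "SimpleScalingPolicyConfiguration": {
          "AdjustmentType": "CHANGE_IN_CAPACITY",
          "ScalingAdjustment": 1,
          "CoolDown": 300
        }
      },
      "Trigger": {
        "CloudWatchAlarmDefinition": {
          "ComparisonOperator": "GREATER_THAN",
          "MetricName": "AppsPending",
          "Period": 300,
          "Threshold": 0,
          "EvaluationPeriods": 1
        }
      }
    }
  ]
}
EOF

aws emr put-auto-scaling-policy \
  --cluster-id j-XXXXXXXXXXXXX \
  --instance-group-id ig-XXXXXXXXXXXX \
  --auto-scaling-policy file://scaling-policy.json
```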

Announcement: Advanced Spot Provisioning & Spot Block support

A feature coming soon to EMR is Advanced Spot Provisioning, an extension to the spot instance reservations specifically tailored for distributed systems in AWS. This new feature will allow you to configure spot instance reservations for a list of instance types. You will be able to have a range of instance sizes running in your cluster and have spot instances requested differently for both your core node fleet and your task node fleet. The provisioning tool will select the most optimal instance and availability zone based on the capacity and price you have configured.

In addition to the optimizations of spot provisioning I mentioned above, EMR will also take advantage of Spot Instance Blocks. With traditional spot instances, you can reserve instances with great discounts at the risk of losing that capacity when normal demand increases. With Spot Instance Blocks, you can block off 1 to 6 hours of spot instance capacity. Spot Instance Blocks are priced differently than Spot Instances, but can be a big source of cost reduction in your data processing architecture for larger workloads.

Compute & Storage Decoupling

The final concept with EMR that was really driven home today was the decoupling of your compute and storage resources. In a traditional setup you typically have storage and compute bound together, meaning that when you need more storage you end up with more compute as well, and vice versa.

With the latest iteration of Amazon EMR (5.2.0, recently released), HBase can now be fully integrated with EMRFS, which uses Amazon S3 for storage. By moving your storage into an S3-backed solution, you no longer have to scale your cluster for storage demands, you get effectively infinite scalability, and you take advantage of S3’s eleven nines of durability. Traditional HDFS is still installed on EMR, so you can take advantage of a distributed local data store as needed.

Amazon Redshift

Amazon Redshift is an exceptional tool for Data Warehousing and it’s one of my favorite services offered by AWS. If you haven’t already, take some time to dive deep into the documentation and understand the complexities behind maximizing your architecture. This session was an excellent starter into concepts like data processing, distribution keys, sort keys, and compression.


Optimizing your queries in Redshift, like any other database, is going to be based around your keys. Since this is a lengthy topic, I’ll give you a quick overview of the important keys in Redshift. I recommend watching this session online or reviewing the documentation to fully understand how to architect them.

  • Distribution keys are used to physically store rows together and collate. Using the distribution key in all your JOINs is a best practice for performance — even though it may seem redundant.
  • Sort Keys are columns you specify for Redshift to optimize queries with. Redshift can skip entire blocks of data by referencing header data internally thus dramatically improving query performance.
  • Interleaved Sort Keys are designed for very large data sets and provide more performance as the table size increases. You can create interleaved sort keys with up to eight columns.
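To make the keys concrete, here is a sketch of a table definition that uses them. The table, columns, endpoint, and database names are invented for illustration; you would run this through your usual SQL client against your own cluster:

```shell
# Hypothetical schema: distribute on the join column, sort on the filter column.
psql "host=<cluster-endpoint> port=5439 dbname=analytics user=admin" <<'SQL'
CREATE TABLE sales (
  sale_id     BIGINT,
  customer_id BIGINT,
  sale_date   DATE,
  amount      DECIMAL(10,2)
)
DISTSTYLE KEY
DISTKEY (customer_id)
SORTKEY (sale_date);
SQL
```

Joining other tables on customer_id keeps those joins collocated on each node slice, and range filters on sale_date let Redshift skip blocks via the sort key.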

AWS Schema Conversion Tool

The schema conversion tool can be pointed to an existing database to copy its schema into a Redshift cluster and recommend schema changes for compatibility. Currently, it works with Oracle, Netezza, Greenplum, Teradata and more recently Redshift itself.

The schema conversion tool now supports Redshift-to-Redshift conversion as a means of optimizing your existing data structure. By analyzing your existing Redshift cluster, the tool can provide recommendations for distribution keys and sort keys. Since the optimization service only provides recommendations based on existing usage and a very flat view of your data, you’ll want to test extensively before making a hard switchover.

Wrap Up

Today’s talks were enlightening, and I eagerly await the new big-data-as-a-service products to be announced in the coming days. If you want more information about these topics or want to hear the case studies yourself, I recommend you watch the sessions when they are released:

  • BDM205 – Big Data Mini Con State of the Union
  • BDM401 – Deep Dive: Amazon EMR Best Practices & Design Patterns
  • BDM402 – Best Practices for Data Warehousing with Amazon Redshift

For those of you at re:Invent, some of these sessions will be repeated and I highly recommend them.

If you see my ugly mug the next couple days, be sure to say hello — I’d love to know what you’re doing with Big Data, DevOps or AWS.

Going to AWS re:Invent is one of many excellent perks for Stelligent engineers. Good news: we’re hiring! Check out our Careers page.

Stelligent CTO and Co-Founder, Paul Duvall named an AWS Community Hero

Paul Duvall, Stelligent CTO and Co-Founder, has been named an AWS Community Hero. The AWS Community Heroes program is designed to recognize and honor individuals who have had a real impact within the AWS community. Among those are AWS experts who go above and beyond to share their experience and knowledge through social media, blogs, events, user groups, and workshops.

“To be first nominated and ultimately selected as an AWS Community Hero is a great honor,” said Duvall. “I have to add that it is very natural for me to promote AWS, which we have recommended to all who know Stelligent since we first gained exposure to it in 2009, and to which we have exclusively focused our efforts since 2013. AWS is, without a doubt, the best cloud service provider.”

Duvall is constantly exploring how to build solutions in AWS that are in line with continuous delivery principles, and he is currently working on a new book, Enterprise DevOps in AWS, a spiritual successor to his critically acclaimed Continuous Integration released in 2007.

“Paul has established crucial tenets at Stelligent that have shaped everyone here, and we couldn’t be more pleased that AWS has recognized his impact, not just here at Stelligent, but within the AWS community-at-large,” noted Rob Daly, Stelligent CEO and Co-Founder. Two of Stelligent’s core values are “sharing” and “continuous improvement,” and all Stelligent employees are held to those values, with Duvall leading through example. For instance, he shares his AWS expertise and experience through frequent posts on Stelligent’s blog and helps his colleagues craft theirs, as well. Duvall also leads efforts with various AWS product teams, offering insight and R&D effort to explore and contribute feedback regarding various AWS services, both in beta and general availability.

Along with Paul, Stelligent engineers have been responsible for dozens of AWS- and DevOps-related blog posts in the last year alone. “Every post on our blog is dedicated to achieving continuous delivery while adhering to AWS best practices,” added Jonny Sywulak, Stelligent’s Blog Czar and Senior DevOps Automation Engineer. “We obsess over customers, and we dedicate ourselves to applying what we believe are essential practices to achieve the aims of continuous delivery. This acknowledgement of Paul underscores the value of sharing experiences to advance technology and to create awareness about how Stelligent can help.”

More details are available at our press release.

About Stelligent
Stelligent is a technology services company that provides DevOps Automation in the Amazon Web Services (AWS) cloud. We aim for “one-click deployment.” Our reason for being is to help our customers gain the ability to continuously deploy their software, when they want to, and with confidence. We’ve been providing Continuous Delivery solutions in AWS since 2009. Follow @Stelligent on Twitter to learn more.

Designing Applications for Failure

I recently had the opportunity to attend an AWS bootcamp at their Herndon, VA office and a short presentation given by their team on Designing for Failure. It opened my eyes to the reality of application design when dealing with failure or even basic exception handling.

One of the defining characteristics between a good developer and a great one is how they deal with failures. Good developers will handle the obvious failure cases in their code – checking for unexpected input, catching library exceptions, and sometimes edge cases. Great developers go further and ask: why do we build resilient applications, and what happens to the end user when something fails?

In this blog post, I’ll share with you the key points that a great developer follows when designing resilient applications.

Why build resilient applications?


There are two main reasons that we design applications for failure. As you can probably guess from the horrifying image above, the first reason is User Experience. It’s no secret that you will see user attrition and lost revenue if you cannot shield your end users from issues outside their control. The second reason is Business Services. All business-critical systems require resiliency, and the difference between 99.7% uptime and 99.99% could be hours of lost revenue or interrupted business services.

Given an application load of 1 billion requests per month, 99.7% uptime means more than 2 hours of downtime per month, versus just about 4 minutes at 99.99%. Ouch!
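The back-of-the-envelope math behind those numbers is easy to check; here is a quick sketch (assuming a 30-day month):

```python
def monthly_downtime_minutes(uptime_pct):
    """Minutes of allowable downtime per 30-day month at a given uptime percentage."""
    minutes_per_month = 30 * 24 * 60  # 43,200 minutes in a 30-day month
    return minutes_per_month * (1 - uptime_pct / 100)

# 99.7% uptime allows ~129.6 minutes (2+ hours) of downtime per month;
# 99.99% allows only ~4.3 minutes.
print(monthly_downtime_minutes(99.7))
print(monthly_downtime_minutes(99.99))
```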

Werner Vogels, the CTO of Amazon Web Services, once said at a previous re:Invent, “Everything fails, all the time.” It’s a devastating reality and it’s something we all must accept. No matter how mathematically improbable, we simply cannot eliminate all failures. It’s how we reduce the impact of those failures that improves the overall resiliency of our applications.

Graceful Degradation

The way we reduce the impact of failure on our users and business is through graceful degradation. Conceptually it’s very simple – we want to continue to operate in the face of a failure, in some degraded capacity. Keeping with the premise that applications fail all the time, you’ve probably experienced degraded services without even realizing it – and that is the ultimate goal.


Caching is the first layer of defense when dealing with a failure. Depending on your application’s reliance on up-to-the-second information, you should consider caching everything. It’s very easy for developers to reject caching because they always want the freshest information for their users. However, when the difference between a happy customer and a sad one is serving data that is a few minutes old, choose the happy customer.

As an example, imagine you have a fairly advanced web application. What can you cache?

  • Full HTML pages with CloudFront
  • Database records with ElastiCache
  • Page Fragments with tools such as Varnish
  • Remote API calls from your backend with ElastiCache
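The last item in that list is worth sketching. Here is a minimal TTL cache wrapped around a remote call; in production you would back this with ElastiCache rather than an in-process dict, and the class and fetch callback here are illustrative, not a real library API:

```python
import time

class TTLCache:
    """Serve slightly stale data instead of failing when the source is down."""

    def __init__(self, ttl_seconds=300):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (value, expires_at)

    def get_or_fetch(self, key, fetch):
        value, expires_at = self._store.get(key, (None, 0.0))
        if time.time() < expires_at:
            return value  # cache hit: at most a few minutes old
        try:
            value = fetch()  # e.g. a remote API call from your backend
        except Exception:
            if key in self._store:
                return self._store[key][0]  # degraded: serve expired data
            raise  # nothing cached to fall back on
        self._store[key] = (value, time.time() + self.ttl)
        return value
```

The important detail is the except branch: when the backend call fails, an expired entry is still better than an error page.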


As applications get more complex, we rely on more external services than ever before. Whether it’s a 3rd-party service provider or your microservices architecture at work, failures are common and often transient. A common pattern for dealing with transient failures on these types of requests is to implement retry logic. Using exponential backoff or a Fibonacci sequence, you can retry for some time before eventually throwing an exception. It’s important to fail fast and avoid triggering rate limiting on your source, so don’t retry indefinitely.
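A sketch of that retry pattern with exponential backoff and jitter follows; the function name, attempt count, and delays are illustrative defaults:

```python
import random
import time

def retry_with_backoff(operation, max_attempts=5, base_delay=0.1, max_delay=5.0):
    """Retry a transient operation, backing off exponentially between attempts,
    then re-raise (fail fast) once the attempts are exhausted."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # don't continue indefinitely
            delay = min(max_delay, base_delay * (2 ** attempt))
            time.sleep(delay + random.uniform(0, delay))  # jitter spreads out retries
```

Capping both the delay and the attempt count keeps you from hammering a struggling dependency or tripping its rate limits.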

Rate Limiting

In the case of denial-of-service attacks, self-imposed or otherwise, your primary defense is rate limiting based on a context. You can limit the number of requests to your application based on user data, source address, or both. By imposing a limit on requests, you can improve your performance during a failure by reducing both the actual load and the load imposed by your retry logic. Also consider using an exponential backoff or Fibonacci increase to help mitigate particularly demanding services.

For example, during a peak in capacity that cannot be met immediately, a reduction in load would allow your application’s infrastructure to respond to the demand (think auto scaling) before completely failing.
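One simple way to implement per-context rate limiting is a token bucket keyed by user or source address; the sketch below is illustrative, not any specific library:

```python
import time

class TokenBucket:
    """Per-context rate limiter: each key earns tokens over time and
    spends one per request; requests beyond the budget are rejected."""

    def __init__(self, rate_per_sec, burst):
        self.rate = rate_per_sec
        self.burst = burst
        self._buckets = {}  # context key (user id, source IP) -> (tokens, last_seen)

    def allow(self, key):
        tokens, last_seen = self._buckets.get(key, (self.burst, time.time()))
        now = time.time()
        tokens = min(self.burst, tokens + (now - last_seen) * self.rate)  # refill
        if tokens >= 1:
            self._buckets[key] = (tokens - 1, now)
            return True
        self._buckets[key] = (tokens, now)
        return False  # reject: shed this request
```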

Fail Fast

When your application is running out of memory, threads, or other resources, you can help recovery time by failing fast. You should return an error as soon as the problem is detected. Not only will your users be happier not waiting on your application to respond, you will also avoid cascading the delay into dependent services.

Static Fallback

Whether you’re rate limiting or simply cannot fail silently, you’ll need something to fall back to. A static fallback is a way to provide at least some response to your end users, rather than leaving them with confusing error output or no response at all. It’s always better to return content that makes sense in the context of the user, and you’ve probably seen this before if you’re a frequent user of sites like Reddit or Twitter.


In the case of our example web application, you can configure Route53 to fallback to HTML pages and assets served from Amazon S3 with very little headache. You could set this up today!

Fail Silently

When all of your layers of protection have failed to preserve your service, it’s time to fail silently. Failing silently is when you rely on your logging, monitoring and other infrastructure to respond to your errors with the least impact to the end user. It’s better to return a 200 OK with no content and log your errors on the backend than to return a 500 Internal Server Error, a similar HTTP status code, or, worse yet, a nasty stack trace/log dump.
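In code, the pattern looks something like the sketch below: log the failure for operators, hand the user an empty 200. The handler shape is illustrative and not tied to any particular framework:

```python
import logging

logger = logging.getLogger("app")

def handle_request(render):
    """Last line of defense: the user gets an empty OK response,
    the error goes to the logs for monitoring to pick up."""
    try:
        return 200, render()
    except Exception:
        logger.exception("render failed; failing silently")
        return 200, ""  # no stack trace, no 500
```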

Failing Fast and You

There are two patterns that you can implement to improve your ability to fail fast: Circuit Breaking and Load Shedding. Generally, you want to leverage your monitoring tools such as CloudWatch and your logs to detect failure early and begin mitigating the impact as soon as possible. At Stelligent, we strongly recommend automation in your infrastructure, and these two patterns are automation at its finest.

Circuit Breaking

Circuit breaking is purposefully degrading service in response to failure events that appear in your logging or monitoring system. You can utilize any of the degradation patterns mentioned above inside the circuit. Finally, by implementing health checks into your service, you can restore normal operation as soon as possible.

Load Shedding

Load shedding is a method of failing fast that occurs at the networking level. Like circuit breaking, you can rely on monitoring data to reroute traffic from your application to a Static Fallback that you have configured. For example, Route53 has failover support built right in that would allow you to use this pattern right away.


Provision a hosted Git repo with AWS CodeCommit using CloudFormation

Recently, AWS announced that you can now automate the provisioning of a hosted Git repository with AWS CodeCommit using CloudFormation. This means that in addition to the console, CLI, and SDK, you can use declarative code to provision a new CodeCommit repository – providing greater flexibility in versioning, testing, and integration.

In this post, I’ll describe how engineers can provision a CodeCommit Git repository in a CloudFormation template. Furthermore, you’ll learn how to automate the provisioning of a deployment pipeline that uses this repository as its Source action to deploy an application using CodeDeploy to an EC2 instance. You’ll see examples, patterns, and a short video that walks you through the process.


Here are the prerequisites for this solution:

These will be explained in greater detail in the Deployment Steps section.

Architecture and Implementation

In the figure below, you see the architecture for launching a pipeline that deploys software to an EC2 instance from code stored in a CodeCommit repository. You can click on the image to launch the template in CloudFormation Designer.

  • CloudFormation – All of the resource generation of this solution is described in CloudFormation, which is a declarative code language that can be written in JSON or YAML.
  • CodeCommit – With the addition of the AWS::CodeCommit::Repository resource, you can define your CodeCommit Git repositories in CloudFormation.
  • CodeDeploy – CodeDeploy automates the deployment to the EC2 instance that was provisioned by the nested stack.
  • CodePipeline – I’m defining CodePipeline’s stages and actions in CloudFormation code which includes using CodeCommit as a Source action and CodeDeploy for a Deploy action (For more information, see Action Structure Requirements in AWS CodePipeline).
  • EC2 – A nested CloudFormation stack is launched to provision a single EC2 instance on which the CodeDeploy agent is installed. The CloudFormation template called through the nested stack is provided by AWS.
  • IAM – An Identity and Access Management (IAM) Role is provisioned via CloudFormation which defines the resources that the pipeline can access.
  • SNS – A Simple Notification Service (SNS) Topic is provisioned via CloudFormation. The SNS topic is used by the CodeCommit repository for notifications.

CloudFormation Template

In this section, I’ll show code snippets from the CloudFormation template that provisions the entire solution. The focus of my samples is on the CodeCommit resources. There are several other resources defined in this template including EC2, IAM, SNS, CodePipeline, and CodeDeploy. You can find a link to the template at the bottom of this post.


In the code snippet below, you see that I’m using the AWS::CodeCommit::Repository CloudFormation resource. The repository name is provided as a parameter to the template. I created a trigger to receive notifications when the master branch gets updated, using an SNS Topic as a dependent resource that is created in the same CloudFormation template. This is based on the sample code provided by AWS.

        "RepositoryDescription":"CodeCommit Repository",


In this CodePipeline snippet, you see how I’m using the CodeCommit repository resource as an input for the Source action in CodePipeline. In doing this, it polls the CodeCommit repository for any changes. When it discovers changes, it initiates an instance of the deployment pipeline in CodePipeline.



You can see an illustration of this pipeline in the figure below.



Since costs can vary widely in using certain AWS services and other tools, I’ve provided a cost breakdown and some sample scenarios to give you an idea of what your monthly spend might look like. The AWS Cost Calculator can assist in establishing cost projections.

  • CloudFormation – No additional cost
  • CodeCommit – If you’re on a small project with fewer than six users, there’s no additional cost. See AWS CodeCommit Pricing for more information.
  • CodeDeploy – No additional cost
  • CodePipeline – $1 a month per pipeline unless you’re using it as part of the free tier. For more information, see AWS CodePipeline pricing.
  • EC2 – Approximately $15/month if you’re running one t1.micro instance 24/7. See AWS EC2 Pricing for more information.
  • IAM – No additional cost
  • SNS – Considering you probably won’t have over 1 million Amazon SNS requests for this particular solution, there’s no cost. For more information, see AWS SNS Pricing.

So, for this particular sample solution, you’ll spend around $16/month if you run the EC2 instance for an entire month. If you just run it once and terminate it, you’ll spend a little over $1.


Here are some patterns to consider when using CodeCommit with CloudFormation.

  • CodeCommit Template – While this solution embeds the CodeCommit creation as part of a single CloudFormation template, it’s unlikely you’ll be updating the CodeCommit repository definition with every application change. Instead, you might create a template that focuses on the CodeCommit creation and run it as part of an infrastructure pipeline that gets updated when new CloudFormation code is committed to it.
  • Centralized Repos – Most likely, you’ll want to host your CodeCommit repositories in a single AWS account and use cross-account IAM roles to share access across accounts in your organization. While you can create CodeCommit repos in any AWS account, it’ll likely lead to unnecessary complexity when engineers want to know where the code is located.

The last is more of a conundrum than a pattern. As one of my colleagues posted in Slack:

I’m stuck in a recursive loop…where do I store my CloudFormation template for my CodeCommit repo?

Good question. I don’t have a good answer for that one just yet. Anyone have thoughts on this one? It gets very “meta”.

Deployment Steps

There are three main steps in launching this solution: preparing an AWS account, launching the stack, and testing the deployment. Each is described in more detail in this section.

Step 1. Prepare an AWS Account

  1. If you don’t already have an AWS account, create one by following the on-screen instructions. Part of the sign-up process involves receiving a phone call and entering a PIN using the phone keypad. Be sure you’ve signed up for the CloudFormation service.
  2. Use the region selector in the navigation bar of the console to choose the Northern Virginia (us-east-1) region.
  3. Create a key pair. To do this, in the navigation pane of the Amazon EC2 console, choose Key Pairs, Create Key Pair, type a name, and then choose Create.

Step 2. Launch the Stack

Click on the Launch Stack button below to launch the CloudFormation stack. Before you launch the stack, review the architecture, configuration, security, and other considerations discussed in this post. To download the template, click here.

Time to deploy: Approximately 7 minutes

The template includes default settings that you can customize by following the instructions in this post.

Create Details

Here’s a listing of the key AWS resources that are created when this stack is launched:

  • IAM – InstanceProfile, Policy, and Role
  • CodeCommit Repository – Hosts the versioned code
  • EC2 instance – with CodeDeploy agent installed
  • CodeDeploy – application and deployment
  • CodePipeline – deployment pipeline with CodeCommit Integration

CLI Example

Alternatively, you can launch the same stack from the command line as shown in the samples below.

Base Command

From an instance that has the AWS CLI installed, you can use the following snippet as a base command prepended to one of two options described in the Parameters section below.

aws cloudformation create-stack --profile {AWS Profile Name} --stack-name {Stack Name} --capabilities CAPABILITY_IAM --template-url ""

I’ve provided two ways to run the command – from a custom parameters file or from the CLI.

Option 1 – Custom Parameters JSON File

By attaching the command below to the base command, you can pass parameters from a file as shown in the sample below.

--parameters file:///localpath/to/example-parameters-cpl-cfn.json
Option 2 – Pass Parameters on CLI

Another way to launch the stack from the command line is to provide custom parameters populated with parameter values as shown in the sample below.

--parameters ParameterKey=EC2KeyPairName,ParameterValue=stelligent-dev ParameterKey=EmailAddress, ParameterKey=RepoName,ParameterValue=my-cc-repo

Step 3. Test the Deployment

Click on the CodePipelineURL Output in your CloudFormation stack. You’ll see that the pipeline has failed on the Source action. This is because the Source action expects a populated repository and it’s empty. The way to resolve this is to commit the application files to the newly-created CodeCommit repository. First, you’ll need to clone the repository locally. To do this, get the CloneUrlSsh Output from the CloudFormation stack you launched in Step 2. A sample command is shown below. You’ll replace {CloneUrlSsh} with the value from the CloudFormation stack output. For more information on using SSH to interact with CodeCommit, see the Connect to the CodeCommit Repository section at: Create and Connect to an AWS CodeCommit Repository.

git clone {CloneUrlSsh}
cd {localdirectory}

Once you’ve cloned the repository locally, download the sample application files from and place the files directly into your local repository. Do not include the SampleApp_Linux folder. Go to the local directory and type the following to commit and push the new files to the CodeCommit repository:

git add .
git commit -am "add all files from the AWS sample linux codedeploy application"
git push

Once these files have been committed, the pipeline will discover the changes in CodeCommit and run a new pipeline instance and both stages and actions should succeed as a result of this change.

Access the Application

Once the CloudFormation stack has successfully completed, go to CodeDeploy and select Deployments. For example, if you’re in the us-east-1 region, the URL might look like: (You can also find this link in the CodeDeployURL Output of the CloudFormation stack you launched). Next, click on the link for the Deployment Id of the deployment you just launched from CloudFormation. Then, click on the link for the Instance Id. From the EC2 instance, copy the Public IP value and paste into your browser and hit enter. You should see a page like the one below.


Commit Changes to CodeCommit

Make some visual changes to the index.html (look for background-color) and commit these changes to your CodeCommit repository to see these changes get deployed through your pipeline. You perform these actions from the directory where you cloned the local version of your CodeCommit repo (in the directory created by your git clone command). To push these changes to the remote repository, see the commands below.

git commit -am "change bg color to burnt orange"
git push

Once these changes have been committed, CodePipeline will discover the changes made to your CodeCommit repo and initiate a new pipeline. After the pipeline is successfully completed, follow the same instructions for launching the application from your browser. You’ll see that the color of the index page of the application has changed.


How-To Video

In this video, I walk through the deployment steps described above.

Additional Resources

Here are some additional resources you might find useful:


In this post, you learned how to define and launch a CloudFormation stack that provisions a CodeCommit Git repository in code. Additionally, the example included the automation of a CodePipeline deployment pipeline (which included the CodeCommit integration) along with creating and running the deployment on an EC2 instance using CodeDeploy.

Furthermore, I described the prerequisites, architecture, implementation, costs, patterns and deployment steps of the solution.

Sample Code

The code for the examples demonstrated in this post is available online. Let us know if you have any comments or questions @stelligent or @paulduvall.

Stelligent is hiring! Do you enjoy working on complex problems like figuring out ways to automate all the things as part of a deployment pipeline? Do you believe in the “one-button everything” mantra? If your skills and interests lie at the intersection of DevOps automation and the AWS cloud, check out the careers page on our website.

Microservices Platform with ECS

Architecting applications with microservices is all the rage with developers right now, but running them at scale with cost efficiency and high availability can be a real challenge. In this post, we will address this challenge by looking at an approach to building microservices with Spring Boot and deploying them with CloudFormation on AWS EC2 Container Service (ECS) and Application Load Balancers (ALB). We will start with describing the steps to build the microservice, then walk through the platform for running the microservices, and finally deploy our microservice on the platform.

Spring Boot was chosen for the microservice development as it is a very popular framework in the Java community for building “stand-alone, production-grade Spring based Applications” quickly and easily. However, since ECS is just running Docker containers, you can substitute your preferred development framework for Spring Boot and the platform described in this post will still be able to run your microservice.

This post builds upon a prior post called Automating ECS: Provisioning in CloudFormation that does an awesome job of explaining how to use ECS. If you are new to ECS, I’d highly recommend you review that before proceeding. This post will expand upon that by using the new Application Load Balancer that provides two huge features to improve the ECS experience:

  • Target Groups: Previously, in a “Classic” Elastic Load Balancer (ELB), all targets had to be able to handle all possible types of requests that the ELB received. Now with target groups, you can route different URLs to different target groups, allowing heterogeneous deployments. Specifically, you can have two target groups that handle different URLs (e.g. /bananas and /apples) and use the ALB to route traffic appropriately.
  • Per Target Ports: Previously in an ELB, all targets had to listen on the same port for traffic from the ELB. In ECS, this meant that you had to manage the ports that each container listened on. Additionally, you couldn’t run multiple instances of a given container on a single ECS container instance since they would have different ports. Now, each container can use an ephemeral port (next available assigned by ECS) making port management and scaling up on a single ECS container instance a non-issue.

The infrastructure we create will look like the diagram below. Notice that there is a single shared ECS cluster and a single shared ALB with a target group, EC2 Container Registry (ECR) and ECS Service for each microservice deployed to the platform. This approach enables a cost efficient solution by using a single pool of compute resources for all the services. Additionally, high availability is accomplished via an Auto Scaling Group (ASG) for the ECS container instances that spans multiple Availability Zones (AZ).

Setup Your Development Environment

You will need to install the Spring Boot CLI to get started. The recommended way is to use SDKMAN! for the installation. First install SDKMAN! with:

 $ curl -s "" | bash

Then, install Spring Boot with:

$ sdk install springboot

Alternatively, you could install with Homebrew:

$ brew tap pivotal/tap
$ brew install springboot

Scaffold Your Microservice Project

For this example, we will be creating a microservice to manage bananas. Use the Spring Boot CLI to create a project:

$ spring init --build=gradle --package-name=com.stelligent --dependencies=web,actuator,hateoas -n Banana banana-service

This will create a new subdirectory named banana-service with the skeleton of a microservice in src/main/java/com/stelligent and a build.gradle file.

Develop the Microservice

Development of the microservice is a topic for an entire post of its own, but let’s look at a few important bits. First, the application is defined in BananaApplication:

@SpringBootApplication
public class BananaApplication {

  public static void main(String[] args) {;
  }
}

The @SpringBootApplication annotation marks the location to start component scanning and enables configuration of the context within the class.

Next, we have the controller class, which contains the declaration of the REST routes.

@RestController
@RequestMapping(path = "/bananas")
public class BananaController {

  @RequestMapping(method = RequestMethod.POST)
  public @ResponseBody BananaResource create(@RequestBody Banana banana) {
    // create a banana...
  }

  @RequestMapping(path = "/{id}", method = RequestMethod.GET)
  public @ResponseBody BananaResource retrieve(@PathVariable long id) {
    // get a banana by its id
  }
}


These sample routes handle a POST of JSON banana data to /bananas for creating a new banana, and a GET from /bananas/1234 for retrieving a banana by its id. To view a complete implementation of the controller including support for POST, PUT, GET, PATCH, and DELETE as well as HATEOAS for links between resources, check out

Additionally, to look at how to accomplish unit testing of the services, check out the tests created in using WebMvcTest, MockMvc and Mockito.

Create Microservice Platform

The platform will consist of a separate CloudFormation stack that contains the following resources:

  • VPC – To provide the network infrastructure to launch the ECS container instances into.
  • ECS Cluster – The cluster that the services will be deployed into.
  • Auto Scaling Group – To manage the ECS container instances that contain the compute resources for running the containers.
  • Application Load Balancer – To provide load balancing for the microservices running in containers. Additionally, this provides service discovery for the microservices.


The template is available at platform.template. The AMIs used by the Launch Configuration for the EC2 Container Instances must be the ECS optimized AMIs:

      AMIID: ami-2b3b6041
      AMIID: ami-ac6872cd
      AMIID: ami-03238b70
      AMIID: ami-fb2f1295
      AMIID: ami-43547120
      AMIID: ami-bfe095df
      AMIID: ami-c78f43a4
      AMIID: ami-e1e6f88d

Additionally, the EC2 Container Instances must have the ECS Agent configured to register with the newly created ECS Cluster:

    Type: AWS::AutoScaling::LaunchConfiguration
              command: !Sub |
                echo ECS_CLUSTER=${EcsCluster}  >> /etc/ecs/ecs.config

Next, an Application Load Balancer is created for the later stacks to register with:

    Type: AWS::ElasticLoadBalancingV2::LoadBalancer
      - !Ref PublicSubnetAZ1
      - !Ref PublicSubnetAZ2
      - !Ref PublicSubnetAZ3
    Type: AWS::ElasticLoadBalancingV2::Listener
      LoadBalancerArn: !Ref EcsElb
      - Type: forward
        TargetGroupArn: !Ref EcsElbDefaultTargetGroup
      Port: '80'
      Protocol: HTTP

Finally, we have a Gradle task in our build.gradle for upserting the platform CloudFormation stack, based on a custom task named StackUpTask defined in buildSrc.

task platformUp(type: StackUpTask) {
    region project.region
    stackName "${project.stackBaseName}-platform"
    template file("ecs-resources/platform.template")
    waitForComplete true
    capabilityIam true
    if(project.hasProperty('keyName')) {
        stackParams['KeyName'] = project.keyName
    }
}

Simply run the following to create/update the platform stack:

$ gradle platformUp

Deploy Microservice

Once the platform stack has been created, there are two additional stacks to create for each microservice. First, there is a repo stack that creates the EC2 Container Registry (ECR) for the microservice. This stack also creates a target group for the microservice and adds the target group to the ALB with a rule for which URL path patterns should be routed to the target group.

The second stack is for the service and creates the ECS task definition based on the version of the docker image that should be run, as well as the ECS service which specifies how many tasks to run and the ALB to associate with.

The reason for the two stacks is that you must have the ECR provisioned before you can push a docker image to it, and you must have a docker image in the ECR before creating the ECS service. Ideally, you would create the repo stack once, then configure a CodePipeline job to continuously push code changes to ECR as new images and update the service stack to reference the newly pushed image.


The entire repo template is available at repo.template. An important new resource to check out is the ALB Listener Rule that provides the URL patterns that should be handled by the new target group that is created:

    Type: AWS::ElasticLoadBalancingV2::ListenerRule
      - Type: forward
        TargetGroupArn: !Ref EcsElbTargetGroup
      - Field: path-pattern
      Values: ["/bananas"]
      ListenerArn: !Ref EcsElbListenerArn
      Priority: 1

The entire service template is available at service.template, but notice that the ECS Task Definition uses port 0 for HostPort. This allows for ephemeral ports that are assigned by ECS to remove the requirement for us to manage container ports:

    Type: AWS::ECS::TaskDefinition
      - Name: banana-service
        Cpu: '10'
        Essential: 'true'
        Image: !Ref ImageUrl
        Memory: '300'
        - HostPort: 0
          ContainerPort: 8080
      Volumes: []

Next, notice how the ECS Service is created and associated with the newly created Target Group:

    Type: AWS::ECS::Service
    Properties:
      Cluster: !Ref EcsCluster
      DesiredCount: 6
      DeploymentConfiguration:
        MaximumPercent: 100
        MinimumHealthyPercent: 0
      LoadBalancers:
        - ContainerName: microservice-exemplar-container
          ContainerPort: '8080'
          TargetGroupArn: !Ref EcsElbTargetGroupArn
      Role: !Ref EcsServiceRole
      TaskDefinition: !Ref MicroserviceTaskDefinition

Finally, we have a Gradle task in our service build.gradle for upserting the repo CloudFormation stack:

task repoUp(type: StackUpTask) {
  region project.region
  stackName "${project.stackBaseName}-repo-${}"
  template file("../ecs-resources/repo.template")
  waitForComplete true
  capabilityIam true

  stackParams['PathPattern'] = '/bananas'
  stackParams['RepoName'] =
}

And then another to upsert the service CloudFormation stack:

task serviceUp(type: StackUpTask) {
  region project.region
  stackName "${project.stackBaseName}-service-${}"
  template file("../ecs-resources/service.template")
  waitForComplete true
  capabilityIam true

  stackParams['ServiceDesiredCount'] = project.serviceDesiredCount
  stackParams['ImageUrl'] = "${project.repoUrl}:${project.revision}"

  mustRunAfter dockerPushImage
}

And finally, a task to coordinate the management of the stacks and the build/push of the image:

task deploy(dependsOn: ['dockerPushImage', 'serviceUp']) {
  description "Upserts the repo stack, pushes a docker image, then upserts the service stack"
}

dockerPushImage.dependsOn repoUp

This then provides a simple command to deploy new or update existing microservices:

$ gradle deploy

A similar build.gradle file can be defined in other microservices to deploy them to the same platform.

Blue/Green Deployment

When gradle deploy is run, the existing service stack is updated to use a new task definition that references a new Docker image in ECR. This CloudFormation update causes ECS to do a rolling replacement of the containers, launching new containers with the new image and killing containers with the old image.
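The replacement behavior is governed by the DeploymentConfiguration shown in the service resource earlier. As a hedged sketch (these values are not from the original template), if you prefer zero-downtime rolling updates over the stop-then-start behavior that MaximumPercent 100 / MinimumHealthyPercent 0 produces, you could instead use:

```yaml
# Illustrative alternative: with these values ECS launches replacement
# tasks (up to 2x DesiredCount) before stopping old ones, so the service
# never drops below its full desired capacity during an update.
DeploymentConfiguration:
  MaximumPercent: 200
  MinimumHealthyPercent: 100
```

Note that this requires enough spare cluster capacity (and, with HostPort 0, enough ephemeral ports) to briefly run both generations of tasks side by side.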

However, if you are looking for a more traditional blue/green deployment, you could accomplish this by creating a new service stack (the green stack) with the new Docker image, rather than updating the existing one. The new stack would attach to the existing ALB target group, at which point you could update the existing service stack (the blue stack) to no longer reference the target group, taking it out of service without killing the containers.
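The green stack could be scripted as a variant of the serviceUp task shown earlier; a minimal sketch, where the greenServiceUp task name and the -service-green stack-name suffix are illustrative, not part of the original build:

```groovy
// Hypothetical task -- name and stack-name suffix are illustrative.
// Creates a separate "green" service stack alongside the existing
// "blue" one, pointing at the newly pushed image.
task greenServiceUp(type: StackUpTask) {
  region project.region
  stackName "${project.stackBaseName}-service-green"
  template file("../ecs-resources/service.template")
  waitForComplete true
  capabilityIam true

  stackParams['ServiceDesiredCount'] = project.serviceDesiredCount
  stackParams['ImageUrl'] = "${project.repoUrl}:${project.revision}"

  mustRunAfter dockerPushImage
}
```

Once the green stack is healthy behind the target group, the blue stack can be updated or deleted at leisure, giving an easy rollback path in the meantime.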

Next Steps

Stay tuned for future blog posts that build on this platform by accomplishing service discovery in a more decoupled manner through the use of Eureka as a service registry, Ribbon as a service client, and Zuul as an edge router.

Additionally, this solution isn't complete, since there is no Continuous Delivery pipeline defined yet. Look for an additional post showing how to use CodePipeline to orchestrate the movement of changes to the microservice source code into production.

The code for the examples demonstrated in this post is located at . Let us know if you have any comments or questions @stelligent.

Are you interested in building resilient applications in AWS? Stelligent is hiring!