Next-Generation Managed Services are Self-Service

The traditional managed services provider (MSP) model is broken and needs disruption. The next-generation managed services model is about guiding customers in a self-service manner.

The key drivers causing customers to seek cloud providers like Amazon Web Services (AWS) include the agility and cost efficiencies they afford. The agility helps customers be more responsive to their users. At the same time, this same agility delivered by cloud providers can also result in customers being overwhelmed in determining best practices and patterns for deploying and operating on the cloud. Moreover, as more companies realize how software is a strategic asset for their business, they often don’t want to simply outsource all their IT needs to yet another provider. They want speed and agility, but not by being at the mercy of an IT provider. They want to leverage best practices while gaining the autonomy in obtaining these capabilities in a self-service manner.

In this post, I contrast the traditional MSP to a next-generation MSP driven by DevOps and automation. In learning how MSP features can help increase agility when delivered from a new model, you should be able to better choose providers who align best with your business outcomes.

The Traditional MSP

To begin, let’s have a look at the typical capabilities of an MSP. They include:

  • Access Management – Creating user accounts and permissions to infrastructure resources
  • Change Management – Ensure changes are applied in a controlled manner
  • Continuity Management – Prevent loss via disaster recovery techniques such as backups, high availability, and restoration
  • Incident Management – Get support to fix problems
  • Patch Management – Keep infrastructure up to date and compliant
  • Provisioning Management – Provision and configure infrastructure
  • Reporting – Get access to metrics, logs, and recommendations for improvement
  • Security Management – Ensure infrastructure is secure

With a traditional MSP, these types of services are typically provided through an opaque model in which customers are reliant on the MSP to perform remedial actions to fix most problems. This is because the MSP often has the credentials and knowledge to make changes to a largely manually-provisioned infrastructure or a hodgepodge of “automated” scripts that are not provided as a system to customers.

Next-Generation MSP on AWS

A next-gen MSP provides customers capabilities in a self-service manner enabling them to get up and running quickly with a fully-automated infrastructure while benefiting from the expert guidance provided by the MSP. This means customers don’t need someone on the MSP’s support team to – say – restart a server or perform a backup. This is because these services are provided to customers through self-service means. Instead, the reason a customer might need a next-gen MSP is for their best practices expertise in architecture and automation to help more quickly guide them to better solutions.

What Does Next-Gen Look Like?

What do each of the capabilities described in the first section look like in a next-generation MSP model? At their core, they’re self service. Customers of the MSP might have a team from the MSP get their infrastructure up and running but there should be nothing preventing the customer from provisioning everything themselves either. Furthermore, there should be a way for customers to get their applications running on the infrastructure using repeatable frameworks as well.

Let’s have a look at the types of capabilities a next-gen MSP on AWS might offer:

  • Access Management  – A customer interfaces with an API and/or console provided by the MSP that automates the provisioning of AWS Organizations, AWS Accounts, and IAM users and permissions. Possible ToolsAWS Organizations, AWS IAM, AWS Service Catalog, and automation through AWS CloudFormation and other tools.
  • Change Management – Customers interface with the API/Console to manage how changes are deployed on their infrastructure. For example, they might want to modify RDS database configuration settings or the AMI the EC2 instances use. Customers can make a request to the MSP to apply or schedule these changes or they can apply the changes themselves using frameworks provided by the MSP. These changes flow through an approval process configured by the customer. Possible Tools: AWS Service Catalog, AWS CloudFormation, AWS CloudWatch Dashboards, Configuration Management Tools, and custom automation.
  • Continuity Management – Customers can schedule disaster recovery processes and scenarios through an API/Console. This includes scheduling data, storage, and source backups. It might also include the ability to schedule disaster recovery drills with experts from the MSP. Moreover, the automation provided in the DevOps frameworks provided by the MSP should support resilient, high availability solutions that can maintain the necessary infrastructure even when parts of it fails so that users do not experience errors when parts of the underlying infrastructure fails. Possible Tools: Amazon EC2 Systems Manager, AWS CodeCommitAWS Shield, Custom Reports, Amazon Glacier, AWS Service Catalog, AWS Auto Scaling, AWS CloudFormation, and Configuration Management Tools might be used. Also, tools for automation of backing up EBS volumes, RDS database snapshots, etc.
  • Incident Management – Customers can contact MSP support experts at any time of day to help guide them to solutions through various mechanisms including real-time chat, chatbots, online systems, and the phone. However, the MSP should never be required to be present to fix an infrastructure error. This is because the MSP should provide the customer access to authorized individuals who are capable of making infrastructure changes in a governed manner – if they choose to do so. The MSP can also handle daily activities of investigating and resolving alarms or incidents. Possible ToolsAmazon Connect, AWS Step Functions, Amazon Polly, Amazon Lex, Amazon CloudWatch – Logs, Events, and Monitoring, AWS CloudTrail, New Relic (App & Performance Monitoring), AWS Config (and Config Rules), and AWS EC2 Systems Manager
  • Patch Management – The MSP can manage all customer OS patching activities to help keep infrastructure resources current and secure. This would include applying updates or patches that are released from OS vendors  in a timely and consistent manner to minimize the impact on the customers’ business. Critical security patches are applied as needed, while others are applied based on the patch schedule when customers make the request. The customer can also apply these changes through governance mechanisms provided by the MSP. Possible Tools: AWS EC2 Systems Manager, AWS Service Catalog
  • Provisioning Management – The MSP launches and manages infrastructure stacks via a framework that provisions these stacks as code that builds users, security infrastructure, networks, environments, services, and deployment pipelines. The MSP should provide these same capabilities to customers as well so that they are capable of making these changes with or without the MSP. Possible Tools: AWS CloudFormation, Configuration Management Tools, and custom automation.
  • Reporting – Customers get access to the data using to manage your infrastructure, including Amazon S3 logs, CloudTrail logs, instance logs, and real-time data from the AWS Managed Services APIs. Customer can also get real-time advice through automated systems provide by the MSP. The MSP should also walk customers through metrics, their impact, as well as recommendations to optimize platform usage. Possible Tools: Amazon CloudWatch Dashboards, AWS Trusted Advisor, custom automation, and web portals
  • Security Management – The next-gen MSP provides customers information protection of assets and keeps the infrastructure secure by providing anti-malware protection, intrusion detection, and intrusion prevention systems. Possible Tools: Amazon VPC, AWS Parameter Store, AWS WAF, Amazon Inspector, AWS Shield, AWS Config and Config Rules, AWS CloudTrail, and Security Monitoring as a Service

The overarching goal of the next-generation delivery model is to provide the capability of 100% self-service capabilities for customers as part of a shared responsibility model. Alternatively, the customer might choose for the MSP to manage everything for them. In this case, the customer should be able to take over the management of the infrastructure at any time if the MSP is not meeting its needs. Customer-centric MSPs will do this by creating fully automated, continuous, and autonomic services.

Scenario: Deployment Pipeline Management

Here’s an example scenario in how a next-generation MSP might provide a deployment pipeline monitoring and guidance service to customers.

The MSP uses an open-source framework that provisions all the necessary AWS environment, deployment pipeline, and application resources to run a highly-available, secure application on AWS. Each deployment pipeline is configured to send AWS CodePipeline statistics via AWS CloudWatch Events. These events are configured to submit notifications through Amazon SNS and AWS Lambda so that all necessary parties are informed via email and Slack. What’s more, the CodePipeline statistics are aggregated and made available through Amazon CloudWatch Dashboards. All of this is configured through configuration files that are versioned in the customer’s version-control repository and automated via the open-source framework.

Once the MSP DevOps Engineers receive failure alerts through Slack, email, or the Dashboard, they help guide the customers’ engineers in resolving errors in AWS cpl-failureCodePipeline and/or its integrations with other tools like AWS CodeBuild, AWS CodeDeploy, AWS CloudFormation, static analysis, or tests. The expertise provided in this “stop the line” model helps quickly resolve issues that arise making them less costly to fix and increasing high velocity feedback between customers and its users. You might see some MSPs provide real-time expertise through automated conversational bots enabled through services like Amazon Lex. There’s a lot of space for innovation in providing these services to companies.

What’s Next?

Going forward, we expect customers to demand more self-service capabilities from their providers. Providers will enable these self-service capabilities through systematic automation and a focus on the user experience in how these features are provided to IT consumers.

Additional Resources

Application Auto Scaling with Amazon ECS

In this blog post, you’ll see an example of Application Auto Scaling for the Amazon ECS (EC2 Container Service). Automatic scaling of the container instances in your ECS cluster has been a feature for quite some time, but until recently you were not able to scale the tasks in your ECS service with built-in technology from AWS. In May of 2016, Automatic Scaling with Amazon ECS was announced which allowed us to configure elasticity into our deployed container services in Amazon’s cloud.

Developer Note: Skip to the “CloudFormation Examples” section to skip right to the code!

Why should you auto scale your container services?

Efficient and effective scaling of your microservices is why you should choose automatic scaling of your containers. If your primary goals include fault tolerance or elastic workloads, then leveraging a combination of cloud technology for autoscaling and infrastructure as code are the keys to success. With AWS’ Automatic Application Autoscaling, you can quickly configure elasticity into your architecture in a repeatable and testable way.

Introducing CloudFormation Support

For the first few months of this new feature it was not available in AWS CloudFormation. Configuration was either a manual process in the AWS Console or a series of API calls made from the CLI or one of Amazon’s SDKs. Finally, in August of 2016, we can now manage this configuration easily using CloudFormation.

The resource types you’re going to need to work with are:

The ScalableTarget and ScalingPolicy are the new resources that configure how your ECS Service behaves when an Alarm is triggered. In addition, you will need to create a new Role to give access to the Application Auto Scaling service to describe your CloudWatch Alarms and to modify your ECS Service — such as increasing your Desired Count.

CloudFormation Examples

The below examples were written for AWS CloudFormation in the YAML format. You can plug these snippets directly into your existing templates with minimal adjustments necessary. Enjoy!

Step 1: Implement a Role

These permissions were gathered from the various sources in AWS documentation.

  Type: AWS::IAM::Role
      - Effect: Allow
        - sts:AssumeRole
     Path: "/"
     - PolicyName: ECSBlogScalingRole
         - Effect: Allow
           - ecs:UpdateService
           - ecs:DescribeServices
           - application-autoscaling:*
           - cloudwatch:DescribeAlarms
           - cloudwatch:GetMetricStatistics
           Resource: "*"

Step 2: Implement some alarms

The below alarm will initiate scaling based on container CPU Utilization.

  Type: AWS::CloudWatch::Alarm
    AlarmDescription: Containers CPU Utilization High
    MetricName: CPUUtilization
    Namespace: AWS/ECS
    Statistic: Average
    Period: '300'
    EvaluationPeriods: '1'
    Threshold: '80'
    - Ref: AutoScalingPolicy
    - Name: ServiceName
        - YourECSServiceResource
        - Name
    - Name: ClusterName
        Ref: YourECSClusterName
    ComparisonOperator: GreaterThanOrEqualToThreshold

Step 3: Implement the ScalableTarget

This resource configures your Application Scaling to your ECS Service and provides some limitations for its function. Other than your MinCapacity and MaxCapacity, these settings are quite fixed when used with ECS.

  Type: AWS::ApplicationAutoScaling::ScalableTarget
    MaxCapacity: 20
    MinCapacity: 1
      - "/"
      - - service
        - Ref: YourECSClusterName
        - Fn::GetAtt:
          - YourECSServiceResource
          - Name
      - ApplicationAutoScalingRole
      - Arn
    ScalableDimension: ecs:service:DesiredCount
    ServiceNamespace: ecs

Step 4: Implement the ScalingPolicy

This resource configures your exact scaling configuration — when to scale up or down and by how much. Pay close attention to the StepAdjustments in the StepScalingPolicyConfiguration as the documentation on this is very vague.

In the below example, we are scaling up by 2 containers when the alarm is greater than the Metric Threshold and scaling down by 1 container when below the Metric Threshold. Take special note of how MetricIntervalLowerBound and MetricIntervalUpperBound work together. When unspecified, they are effectively infinity for the upper bound and negative infinity for the lower bound. Finally, note that these thresholds are computed based on aggregated metrics — meaning the Average, Minimum or Maximum of your combined fleet of containers.

  Type: AWS::ApplicationAutoScaling::ScalingPolicy
    PolicyName: ECSScalingBlogPolicy
    PolicyType: StepScaling
      Ref: AutoScalingTarget
    ScalableDimension: ecs:service:DesiredCount
    ServiceNamespace: ecs
      AdjustmentType: ChangeInCapacity
      Cooldown: 60
      MetricAggregationType: Average
      - MetricIntervalLowerBound: 0
        ScalingAdjustment: 2
      - MetricIntervalUpperBound: 0
        ScalingAdjustment: -1

Wrapping It Up

Amazon Web Services continues to provide excellent resources for automation, elasticity and virtually unlimited scalability. As you can see, with a couple solid examples underfoot you can very quickly build in that on-demand elasticity and inherent fault tolerance. After you have your tasks auto scaled, I recommend you check out the documentation on how to scale your container instances also to provide the same benefits to your ECS cluster itself.

Deploying Microservices? Let mu help!

With support for ECS Application Auto Scaling coming soon to Stelligent mu, it offers the fastest and most comprehensive platform for deploying microservices as containers.

Want to learn more about mu from its creators? Check out the DevOps in AWS Radio’s podcast or find more posts in our blog.

Additional Resources

Here are some of the supporting resources discussed in this post.

We’re Hiring!

Like what you’ve read? Would you like to join a team on the cutting edge of DevOps and Amazon Web Services? We’re hiring talented engineers like you. Click here to visit our careers page.



Enforcing Compliance with AWS Organizations

You have a large organization with several development teams that work on various software projects that support your business. A year ago, you brought in a consultant that told you to use multiple AWS accounts because there were benefits to be gained. For example, using multiple accounts we can contain the damage from a possible security breach and isolate work by teams so that others don’t inadvertently disrupt that work. But there are also issues that we must deal with.
When a company has more than one AWS account and especially many AWS accounts, it becomes difficult to manage those accounts. How do we know that all teams are using good security policies? How do we take advantage of billing incentives for using more and more of an AWS resource? How do we manage the billing in general for all of those accounts? And if a company is in a business that requires them to comply with a set of standards such as PCI or HIPAA, how can we guarantee that teams are using only services that are certified compliant? And how can we automate the creation of new accounts in a way that they are properly configured, to begin with?

What Are AWS Organizations?

AWS organizations allow companies with multiple AWS accounts to manage those accounts from a billing and administrative perspective from a single root account. Why is this important? Until Organizations came along, I like to think of having multiple accounts as being like the Wild West. Each account was on its own and there was no way to manage all of them from one place. Users had no way to apply policies, manage permissions, or manage billing from a “company” perspective. AWS Organizations give us the tools we need to bring these accounts together and control them all in a predictable way.

Service Control Policies (SCPs)

Service Control Policies allow us to define the services that an account can access. In our case, we know that we want to allow access to only the services that are HIPAA compliant. Any service that isn’t compliant should not be allowed to be used by the teams. Using the root account, we can push this policy out to all accounts that we have within our organization.

Organizational Units (OUs)

Most organizations have accounts that have different requirements. Using the example above, some accounts may have to be HIPAA compliant while others may be used for other purposes and do not have to follow any guidelines. AWS Organizations gives us the ability to group accounts into Organizational Units.

Organizational Units allow us to split our accounts into separate groups and apply different policies to those groups. Continuing with the example from above, we can have an OU for all accounts that must be HIPAA compliant and an OU for accounts that are general purpose. All accounts in the HIPAA OU will be restricted to only the services that are HIPAA certified while the accounts in the general purpose OU have access to all AWS services. The rules that are applied to an OU even overrule account administrators. If an admin accidentally logs into an account and specifically sets permissions in that account to allow access to a service that has been restricted at the OU level, the OU rule that was applied to the account will still block that access.

OUs can be up to 5 levels deep. You can have multiple OUs inside of an OU. This allows even more granular control over accounts. As an example, let’s assume that some of our HIPAA accounts also handle patient transactional data. This means that we are dealing with both PCI and HIPAA data in those accounts. We can create an OU inside of our HIPAA account that restricts access to only services that are PCI compliant. The result is at the first level we have accounts that can only access HIPAA compliant services. In the PCI OU under the HIPAA OU we have accounts that can only access services which are HIPAA compliant AND PCI compliant.

One thing that must be remembered is that the root or “master” account cannot be restricted. Even if it is placed within an OU, none of the AWS services will be restricted to this account. Therefore, it is essential that the root account is not used by anyone other than the administrator of all accounts.

Account Creation Automation

It is often the case that a company will grow and will add teams as they are needed. These new teams will sometimes need their own set of accounts to work in to avoid disrupting the work of other teams. AWS Organizations provides the ability to automate this task. We can create an account, attach policies to this account, and add this account to the appropriate group all through the Organizations API. Not only is this useful for new teams, but it is also useful when developers need test accounts that need to be created quickly, then deleted when work within that account is finished.

How Does All of That Help Me?

Let’s take a look at an example and apply the tools above to solve the problems that companies with multiple accounts face. Let’s assume we have a health care company with a wide range of systems under their control. Some systems house identifiable patient data, which requires those systems to be HIPAA compliant, and some systems simply house generic data that can be used to generate high-level reports. The latter systems do not require any special treatment. One other platform the company has allows patients to log in and make payments. This platform allows users to store their credit card data for future transactions, which means the services they use must be PCI compliant.

Where Do We Start?

Before we begin we need to gather our requirements. We know that our company must be both HIPAA and PCI compliant so we can start by breaking the teams down into groups of standards they must follow.

Compliance Number of Teams
HIPAA and PCI (These overlap from the previous groups) 4
None 3

Once we have our teams broken out into groups, we need to know how many accounts each team has. Or this example, we are going to assume each team has 4 accounts:  Dev, Test, QA, PROD. Note that we have a group of 4 accounts that overlap in service restriction requirements. Unfortunately, Organizations will not allow an account to belong to 2 Organizational Units that are at the same hierarchical level. We will discuss the details of how to achieve this later when we create our OUs and begin adding accounts to them.

Once we have our accounts grouped we are ready to start planning our organization. The resulting Organization will have this overall structure:

cloudcraft - AWS Orgs


It’s worth noting at this point that AWS organizations treat accounts differently depending on how they were originally created. The Organizations API provides the ability to remove an account from the Organization, but only if that account was invited to join the organization. If the account was created by the organization, that account cannot be removed from the organization without deleting the account entirely. The Organizations API also does not provide the ability to delete an account, no matter how it was created. To delete an account, you must log into that account and do that manually. These limitations may influence how companies want to handle bringing accounts into an organization.

One other important fact we need to know is that the account that owns the user we use to create the Organization will become the master account. Make sure never to create an Organization from an account that needs to have policies applied to it. A master account will always have “root” access, even if it is moved to an Organizational Unit that restricts services. The services of the master account cannot be restricted and the wide-open policies will always override anything that is more restrictive.

Once we have our account information, let’s move on to creating the organization.

Creating an Organization

Before we begin, we need to make sure we have the AWS Command Line tools installed on the OS of your choice. Organizations can also be managed using the AWS SDK for your language of choice, but we’re going to use the command line tools for this example. Again, make sure we are using a user from the account we want to be the master. Make sure that user is configured with your CLI tools. Once our configs have been verified, we can issue the following command:

Minimum permissions for your user:

  • organizations:CreateOrganization
aws organizations create-organization --feature-set ALL

Notice that we are passing in a parameter to the create-organization command called “feature-set”. This tells AWS what control the organization will have over our accounts. There are 2 options we can pass in here:  ALL, CONSOLIDATED_BILLING. The ALL parameter value enables consolidated billing and also allows the organization to put policies in place that can restrict the services the account can access. This is the default value if this parameter is omitted. A value of CONSOLIDATED_BILLING will allow the new organization to consolidate the billing of all accounts under the master account. The Organization will not be allowed to restrict the services each account has access to. For our company, we need ALL functionality so we retain the ability to control access for some accounts to only HIPAA and PCI compliant services.

After running this command, we get back a response from AWS

{ "Organization": { "AvailablePolicyTypes": [{ "Status": "ENABLED", "Type": "SERVICE_CONTROL_POLICY" }], "MasterAccountId": "111111111111", "MasterAccountArn": "arn:aws:organizations::111111111111:account/o-exampleorgid/111111111111", "MasterAccountEmail": "", "FeatureSet": "ALL", "Id": "o-exampleorgid", "Arn": "arn:aws:organizations::111111111111:organization/o-exampleorgid" } }

We need to capture the “Id” value and keep that for future use.

Let’s Add Some Accounts

Inviting Accounts

Now that we have a newly created Organization, we can start adding our accounts to our organization. As mentioned above, there are 2 ways to add an account to an Organization. The first method and the one we’ll be using primarily for our example is to send an invitation to our accounts that already exist.

I want to reiterate that it’s important to note here that any account we invite to our Organization can be removed at any time. If we want our accounts tied to this Organization without the option to be removed (as a way of ensuring our policies are always in place), we need to create that account from within the Organization. Any resources would have to be migrated from the existing account to the new account.

To send an invitation to an existing account, we can issue the following command:

Minimum permissions for your users:

  • organizations:DescribeOrganization
  • organizations:InviteAccountToOrganization
aws organizations invite-account-to-organization --target '{"Type": "ACCOUNT", "Id": "ACCOUNT_ID_NUMBER"}'

We are passing in a data structure to the target parameter of the command. In this example, we are passing in the account ID. The key Type can also have values of EMAIL or ORGANIZATION. In those cases, we would set the Id to the appropriate value.

Another optional parameter that we could have passed as “notes”. If we want to include additional information in the email that is auto-generated by Organizations, we can pass that information using the “notes” parameter.

The response from this command should look like this:

  "Handshake": {
    "Action": "INVITE",
    "Arn": "arn:aws:organizations::111111111111:handshake/o-exampleorgid/invite/h-examplehandshakeid111",
    "ExpirationTimestamp": 1482952459.257,
    "Id": "h-examplehandshakeid111",
    "Parties": [{
      "Id": "o-exampleorgid",
      "Type": "ORGANIZATION"
      "Id": "",
      "Type": "EMAIL"
    "RequestedTimestamp": 1481656459.257,
    "Resources": [{
      "Type": "MASTER_EMAIL",
      "Value": ""
      "Type": "MASTER_NAME",
      "Value": "Org Master Account"
      "Value": "FULL"
      "Type": "ORGANIZATION",
      "Value": "o-exampleorgid"
      "Type": "EMAIL",
      "Value": ""
    "State": "OPEN"

Once again, we are interested in the “Id” value of the “Handshake” object. Each time we run the command to invite an account, we will receive this “Id” back in the response. We need to record that value for each account we invite so we can use it in the next step to accept the invitation.

Accepting Invitations

The process of inviting and adding an account to an organization is a “handshake” transaction. An invitation is sent to the account we want to add to our organization and the “owner” of that account must log in and accept that invitation. Fortunately for us, this can also be accomplished through the CLI. Again, we need to make sure our CLI is configured with a principal user that has the IAM permissions to accept that handshake. Once we have the CLI configured, we can issue the following command:

Minimum permissions for your user:

  • organizations:ListHandshakesForAccount
  • organizations:AcceptHandshake
  • organizations:DeclineHandshake
aws organizations accept-handshake --handshake-id HANDSHAKE_ID

The handshake ID that is being passed into this command was given to us in the response of the command to send the invitation.

Remember that we can also send and accept invitations through the console. For users with a few accounts, this may be acceptable. But if you are dealing with more than a few accounts you are definitely going to want to automate this process.


AWS has set a limit on a number of invitations that can be sent per day of 20. If you need to send more than that, contact customer support and they will up your limit.

Using Organizational Units

Here’s where the real power of Organizations starts to show. Now that we have our accounts added to the Organization we need to group them into OUs and restrict the services that can be used within those accounts. Before we started creating the Organization, we took the time to group our accounts by the compliance standard they needed to adhere to. We can use that information to help us create our OUs to move our accounts into. Looking at our chart we can see that we have four different types of accounts. We have HIPAA compliant, PCI compliant, HIPAA and PCI compliant, and accounts that require no restrictions at all. We are going to create three top-level OUs and one OU that is within either the PCI or the HIPAA OU. Because we are simply overlapping 2 sets of compliance standards, it really doesn’t matter which OU we use as a parent.

We’ll start by creating the three top-level OUs. We can issue the following commands to create those:

Minimum permissions for your user:

  • organizations:CreateOrganizationalUnit
aws organizations create-organizational-unit --parent-id PARENT_ORG_ID --name HipaaOU
aws organizations create-organizational-unit --parent-id PARENT_ORG_ID --name PciOU
aws organizations create-organizational-unit --parent-id PARENT_ORG_ID --name GeneralOU

We now have three top-level Organizational Units that we can add accounts to. We have already invited all existing accounts to our Organization. They reside at the top-level of our Org. To place those accounts into the proper OU we need to issue the “move” command on each account.

Minimum permissions for your user:

  • organizations:MoveAccount
aws organizations move-account --account-id ACCOUNT_ID --source-parent-id PARENT_ORG_ID --destination-parent-id OU_ID

We will need to issue this command for each account we need to move to an OU. We need to make sure we are using the correct destination ID to place the account into the proper OU.

We need to repeat the last 2 steps to create the sub OU for our overlapping HIPAA and PCI accounts. This time around the PARENT_ORG_ID will be changed from the ID of the organization itself to the ID of the organizational unit we want to create this sub OU in. We will create this OU within the HipaaOU that we created in the previous step.

And we can move those accounts that require both HIPAA and PCI compliance into this new OU using the same command we used to move the other accounts.

Service Control Policies

Simply moving accounts into OUs accomplishes nothing on its own. In order to take advantage of the power of these new OUs, we need to apply policies that will restrict the services that the accounts within the OU can access. At the time of this writing, Service Control Policies are the only policies that can be applied to an OU.

In order to apply a Service Control Policy to our account, we need to create a policy file that we can pass into the create-policy command. We could place this text within the command itself, but with the number of services we need to include and the fact that we have to escape characters, that approach is error-prone and very messy. Here’s what our policy file will look like

  “Version”: “2012-10-17”,
  “Statement”: [{
    “Effect”: “Allow”,
    “Action”: [
    “Resource”: “*”

In the above policy file, we are explicitly allowing a few services. There are many more HIPAA compliant services, but for the sake of this example, we are going to limit the policy to these three services.


It needs to be mentioned here that Service Control Policies which are applied to an OU will not grant any user any rights. We are not pushing this policy as a way to give each user in the accounts in the OU access to these services. This policy is in place as a way to restrict the permissions that can be applied to a user. And they will apply to all users, including administrators.

It’s also worth noting that the policies we are putting in place to restrict services assume that the “Allow *” policies have been removed from the root, OU, and individual accounts. If “Allow *” is still in place in any of these locations, the above policy will have no effect on the account(s) it is applied to.

We need to create two additional policy files, one for each additional OU type. Because we removed the “Allow *” policy from all accounts, OUs, and the root Organization, we will need to create a policy file for our GeneralOU that allows all services for that OU. We will reuse the PCI policy file for the sub OU that allows both HIPAA and PCI services.

Once we have our policy files in place, we can start creating those policies:

Minimum permissions for your user:

  • organizations:CreatePolicy
aws organizations create-policy --content file://allow_hipaa_policy.json --name AllowHipaaServices --type SERVICE_CONTROL_POLICY --description "This policy allows all HIPAA services"
aws organizations create-policy --content file://allow_pci_policy.json --name AllowPCIServices --type SERVICE_CONTROL_POLICY --description "This policy allows all PCI services"
organizations create-policy --content file://allow_all_policy.json --name AllowAllServices --type SERVICE_CONTROL_POLICY --description "This policy allows all services"

We have created three new policies that now need to be attached to our OUs.

Minimum permissions for your user:

  • organizations:AttachPolicy
aws organizations attach-policy --policy-id HIPAA_POLICY_ID --target-id HIPAA_OU_ID
aws organizations attach-policy --policy-id PCI_POLICY_ID --target-id HIPAA_OU_ID
aws organizations attach-policy --policy-id GENERAL_POLICY_ID --target-id HIPAA_OU_ID
aws organizations attach-policy --policy-id PCI_POLICY_ID --target-id HIPAA_PCI_OU_ID

Let’s take the time to examine what is happening here. We know that we have removed all permissions for all services for our root Organization, OUs, and accounts. We created policies that allow services that are compliant with HIPAA and PCI respectively. And we know that when we apply those policies to our OUs, the accounts within that OU will now have access to those services. In the case of the sub OU that allows both PCI and HIPAA services, the sub OU that has the overlapping accounts will inherit the services that are allowed by the HIPAA OU. Applying the AllowPCIServices policy to the sub OU will mean that in addition to the services it inherited, it will also be allowed to access the services which are PCI compliant.


Success! We have created a new Organization, invited our accounts into that organization, and grouped those accounts into OUs so we could ensure each group of accounts is compliant to the required standards. When dealing with a few accounts, working from the command line is fine. For larger amounts of accounts, it is highly recommended to script this process out.

AWS Organizations helps companies manage multiple accounts from a billing and policy standpoint. The use of Organizations helps reduce accidental security policies that violate compliance laws that companies may have to follow. It also reduces the time and effort required to create new accounts by providing an API that allows the auto-creation of new accounts with the correct policies already attached. Users can be restricted to the accounts they need access to and blocked from the accounts they don’t. All companies that have multiple accounts can benefit from the features provided by Organizations.

About Stelligent
Stelligent is an APN Advanced Consulting Partner and hold the AWS DevOps Competency. As a technology services company that provides DevOps Automation on Amazon Web Services (AWS) Cloud, we aim for “one-click deployment.” Our reason for being is to help our customers gain the ability to continuously deploy their software, when they want to, and with confidence. We’ve been providing DevOps Automation solutions on AWS since 2009. Follow @Stelligent on Twitter. Learn more at

Stelligent is an APN Launch Partner for the AWS Management Tools Addition to the AWS Service Delivery Program

Stelligent, an AWS Partner Network (APN) Advanced Consulting Partner specializing exclusively in DevOps Automation on the Amazon Web Services (AWS) Cloud, announce that it is a launch partner for four additional services in the AWS Service Delivery Program: AWS CloudFormationAWS CloudTrail, AWS Config, and Amazon EC2 Systems Manager. This means that Stelligent has demonstrated a successful track record of delivering specific AWS services and a demonstrated ability to provide expertise in a particular service or skill area.

800x200_Management-01 (1)

“The ability to deploy high-quality code in hours, not months, is something that we can help any company – including many in the Fortune 500 – achieve,” said Paul Duvall, Stelligent CTO and co-founder. “Using AWS Management Tools along with other AWS services we can drastically reducing our customers’ development times, while increasing the rate at which they can introduce new features.”

The AWS Service Delivery Program highlights APN Partners with a track record of delivering specific AWS services to customers. Attaining an AWS Service Delivery Distinction allows partners to differentiate themselves by showcasing to AWS customers areas of specialization.

The four AWS Management Tools included in the AWS Service Delivery Program include (Source AWS):

  • AWS CloudFormation – Create and manage a collection of related AWS resources, provisioning and updating them in an orderly and predictable fashion.
  • AWS CloudTrail – Track user activity and API usage
  • AWS Config – Record and evaluate configurations of your AWS resources
  • Amazon EC2 Systems Manager – Easily configure and manage Amazon EC2 and on-premises systems

Stelligent uses these AWS Management Tools in creating DevOps Automation solutions for customers so they can release new features to users, on demand, and reduce the costs of delivering software by reducing overall lead time. Resulting benefits include the following:

● the ability to release software with every successful change
● significant reduction of cycle time
● increased confidence in what is deployed
● increase in ability to experiment
● reduction of overall costs

“We are proud to work with AWS to deliver DevOps Automation solutions to our customers, allowing them to release new features to users whenever they choose,” said Duvall. “Being a launch partner in the AWS Management Tools addition to the AWS Service Delivery Program means a lot to us — this is what we live and breathe, and we do so exclusively for our customers targeting AWS. We obsess over customers, and we obsess over applying what we believe are essential practices to achieve the aims of continuous delivery. This acknowledgement will help us reach still more customers who value that passion.”

About Stelligent
Stelligent is an APN Advanced Consulting Partner and hold the AWS DevOps Competency. As a technology services company that provides DevOps Automation on Amazon Web Services (AWS) Cloud, we aim for “one-click deployment.” Our reason for being is to help our customers gain the ability to continuously deploy their software, when they want to, and with confidence. We’ve been providing DevOps Automation solutions on AWS since 2009. Follow @Stelligent on Twitter. Learn more at

DevOps in AWS Radio: Serverless (Episode 8)

In this episode, Paul Duvall and Brian Jakovich cover recent DevOps in AWS news and speak with Mike Roberts and John Chapin from Symphonia about Serverless architectures, DevOps, and AWS.

Here are the show notes:

DevOps in AWS News

Episode Topics

  1. Pros and Cons of Serverless architectures
  2. Symphonia’s Serverless speciality
  3. How is DevOps and Continuous Delivery fit into Serverless
  4. Continuous Experimentation and Serverless architectures
  5. Types of applications or services are most suitable or not suitable for Serverless
  6. O’Reilly report: “What is Serverless?”
  7. Serverless architectures resources and people in the space
  8. Vendor lockin

Additional Resources

About DevOps in AWS Radio

On DevOps in AWS Radio, we cover topics around applying DevOps principles and practices such as Continuous Delivery in the Amazon Web Services cloud. This is what we do at Stelligent for our customers. We’ll bring listeners into our roundtables and speak with engineers who’ve recently published on our blog and we’ll also be reaching out to the wider DevOps in AWS community to get their thoughts and insights.

The overall vision of this podcast is to describe how listeners can create a one-click (or “no click”) implementation of their software systems and infrastructure in the Amazon Web Services cloud so that teams can deliver software to users whenever there’s a business need to do so. The podcast will delve into the cultural, process, tooling, and organizational changes that can make this possible including:

  • Automation of
    • Networks (e.g. VPC)
    • Compute (EC2, Containers, Serverless, etc.)
    • Storage (e.g. S3, EBS, etc.)
    • Database and Data (RDS, DynamoDB, etc.)
  • Organizational and Team Structures and Practices
  • Team and Organization Communication and Collaboration
  • Cultural Indicators
  • Version control systems and processes
  • Deployment Pipelines
    • Orchestration of software delivery workflows
    • Execution of these workflows
  • Application/service Architectures – e.g. Microservices
  • Automation of Build and deployment processes
  • Automation of testing and other verification approaches, tools and systems
  • Automation of security practices and approaches
  • Continuous Feedback systems
  • Many other Topics…

Service discovery for microservices with mu

mu is a tool that makes it simple and cost-efficient for developers to use AWS as the platform for running their microservices.  In this fourth post of the blog series focused on the mu tool, we will use mu to setup Consul for service discovery between multiple microservices.  

Why do I need service discovery?

One of the biggest benefits of a microservices architecture is that the services can be deployed independently of one another.  However, this presents a new challenge in that it becomes difficult for clients to know the list of containers to use when invoking the service.  Here are three different approaches to address this challenge:

  • Load balancer per microservice: Create a load balancer for every microservice and add/remove containers to the load balancer as deployments and scaling events occur.  The endpoint address of the load balancer is then shared with clients through some manual process.

cloudcraft - Microservices - multip.png

There are three concerns with this approach.  First, the endpoint address of the load balancer must never change or else all the clients will be broken and require updates to take the new endpoint address.  This can be addressed via DNS CNAME records, but still requires that the name chosen for the record must not change.  Second, there is the additional cost of a load balancer for every microservice.  Finally, there is additional latency introduced with adding a load balancer between each microservice invocation.

  • Shared load balancer: Create a load balancer that is shared by all microservices in an environment.  The load balancer must have rules for each microservice to route requests by URI patterns.


The concern with this approach is that all traffic is now flowing through a single load balancer which can become a constraint in scaling the entire system.  Additionally, the load balancer becomes a shared resource amongst all the microservice teams, potentially impacting a team’s ability to operate independently of other teams.

  • Client load balancer: Load balancing from within the client is an approach in which the client has an awareness of all the containers in-service for a given microservice.  The client can then load balance between the containers when invoking the microservice.  This approach requires a system to provide service registration and service discovery.   

cloudcraft - mu-bananaservice-v3

The benefit with this approach is there are no longer load balancers between each microservice request so all the concerns with those prior approaches are addressed.  However, a new type microservice, an edge service, will need to be deployed to allow clients outside the microservice environment (that do not have access to service discovery) to invoke the service.

The preferred approach is the third approach which uses service discovery and client side load balancing within the microservice environment and edge routing with traditional load balancing for clients outside the microservice environment.  This approach provides the lowest latency and most loosely coupled solution for microservice invocation.

Let mu help!

The environment that mu creates for your microservice can manage the provisioning of Consul for service discovery and registration of your microservices.  Consul is a sort of phonebook for microservices.  It provides APIs for services to register their endpoints and for clients to lookup the endpoints.

Let’s demonstrate this by adding an additional milkshake service to the invoke the banana service from the first post.  Additionally, we will create a zuul router service to provide an edge service via Netflix’s Zuul.  Zuul is a proxy service that serves as the front door for all requests from outside the microservice environment.  Zuul will use Consul for service discovery to determine where best to route the incoming request.  Additionally, Zuul provides an excellent location to enforce policies such as authentication, authorization or logging on all incoming requests.

Enabling Consul and Edge Router

The first thing we will want to do is set up our edge router with Zuul.  This is just a matter of adding the @EnableZuulProxy and @EnableDiscoveryClient annotations to the Spring Boot application:

public class ZuulRouterApplication {

   public static void main(String[] args) {, args);

Zuul is configured via the application.yml file in src/main/resources.  For each service that we want exposed via the edge router, we add URI path patterns:

    name: zuul-router
      path: /milkshakes/**
      stripPrefix: false
      path: /bananas/**
      stripPrefix: false

In order to enable Consul in your environment, you need to update the environment definition in the mu.yml file.  Additionally, you need to configure Spring Cloud Consul to connect to the docker host ip address for service discovery.  We will also want to configure Spring Cloud to not register with Consul, since mu will already configure the Registrator agent on your ECS container instances:

 - name: acceptance
     maxSize: 5
     provider: consul
 - name: production

  name: zuul-router
  port: 8080
  - /*
      provider: GitHub
      repo: cplee/zuul-router
      image: aws/codebuild/java:openjdk-8

Create Milkshake Service

Now we can create a new service to manage the creation of milkshakes.  The service looks very similar to the banana service, with the exception of declaring a Spring RestTemplate annotated with @LoadBalanced to enable client side loadbalancing via Ribbon.


public class MilkshakeApplication {

  RestTemplate restTemplate(){
     return new RestTemplate();

Now we can use the RestTemplate to make calls directly to the banana service.  Ribbon will do a lookup in Consul for a service named banana-service and replace it in the URL with one of the container’s IP and port:

public class BananaProvider implements FlavorProvider {

  private RestTemplate restTemplate;

  private List<Map<String,Object>> getAll() {
    ParameterizedTypeReference<List<Map<String, Object>>> typeRef =
            new ParameterizedTypeReference<List<Map<String, Object>>>() {};

    ResponseEntity<List<Map<String, Object>>> exchange =
  "http://banana-service/bananas",HttpMethod.GET,null, typeRef);

    return exchange.getBody();

Try it out!

After we have deployed all three services, we can use mu to confirm that all are running as expected.

~ ❯❯❯ mu env show acceptance                                                                                                                                                                                                       

Environment:    acceptance
Cluster Stack:  mu-cluster-dev (UPDATE_COMPLETE)
VPC Stack:      mu-vpc-dev (UPDATE_COMPLETE)
Bastion Host:
Base URL:

Container Instances:
|    EC2 INSTANCE     |   TYPE   |     AMI      |     AZ     | CONNECTED | STATUS | # TASKS | CPU AVAIL | MEM AVAIL |
| i-08e3edc8c644f0534 | t2.micro | ami-62d35c02 | us-west-2b | true      | ACTIVE |       3 |       604 |       139 |
| i-05bc14a67e53889e1 | t2.micro | ami-62d35c02 | us-west-2a | true      | ACTIVE |       3 |       604 |       139 |
| i-0b56a0d9572531e9e | t2.micro | ami-62d35c02 | us-west-2c | true      | ACTIVE |       3 |       604 |       139 |
| i-05b2188a5c575fbeb | t2.micro | ami-62d35c02 | us-west-2b | true      | ACTIVE |       1 |       624 |       739 |

|      SERVICE      |         IMAGE             |      STATUS      |     LAST UPDATE     |
| milkshake-service | milkshake-service:9e4bcd9 | CREATE_COMPLETE  | 2017-05-12 11:33:05 |
| zuul-router       | zuul-router:3d4795c       | UPDATE_COMPLETE  | 2017-05-12 12:09:47 | 
| banana-service    | banana-service:3b62124    | UPDATE_COMPLETE  | 2017-05-12 11:32:55 |

We can then use curl to get a list of all the bananas available via the banana-service:

curl -s | jq
    "pickedAt": null,
    "peeled": null,
    "links": [
        "rel": "self",
        "href": ""

Next we try to create a milkshake using the milkshake-service:

~ ❯❯❯ curl -s -d "{}" -H "Content-Type: application/json"\?flavor\=Banana | jq                                                                         
  "timestamp": "2017-05-15T19:12:56.640+0000",
  "status": 500,
  "error": "Internal Server Error",
  "exception": "org.springframework.web.client.HttpClientErrorException",
  "message": "429 Not enough bananas to make the shake.",
  "path": "/milkshakes"

Looks like there aren’t enough bananas to create a milkshake.  Let’s create another one:

~ ❯❯❯ curl -s -d "{}" -H "Content-Type: application/json"

~ ❯❯❯ curl -s | jq                                                                                                                         
    "pickedAt": null,
    "peeled": null,
    "links": [
        "rel": "self",
        "href": ""
    "pickedAt": null,
    "peeled": null,
    "links": [
        "rel": "self",
        "href": ""

Now let’s try again creating a milkshake:

~ ❯❯❯ curl -s -d "{}" -H "application/json"\?flavor\=Banana | jq                                                                      
  "id": 3,
  "flavor": "Banana"

This time it worked, and if we query the list of bananas again, we see that 2 have been deleted for the milkshake:

~ ❯❯❯ curl -s | jq                                                                                                                        


Decomposing a monolithic application into microservices presents an interesting challenge in enabling services to invoke one another while still keeping the services loosely coupled.  Using a client side load balancer like Ribbon along with a service discovery tool like Consul provide an excellent solution to this challenge.  As demonstrated in this post, mu makes it simple to enable service discovery in your microservice environment to help achieve this solution.  Head over to stelligent/mu on GitHub and get started!

Additional Resources

Did you find this post interesting? Are you passionate about working with the latest AWS technologies? If so, Stelligent is hiring and we would love to hear from you!

Docker lifecycle automation and testing with Ruby in AWS

My friend and colleague, Stephen Goncher and I got to spend some real time recently implementing a continuous integration and continuous delivery pipeline using only Ruby. We were successful in developing a new module in our pipeline gem that handles many of the docker engine needs without having to skimp out on testing and code quality. By using the swipely/docker-api gem we were able to write well-tested, DRY pipeline code that can be leveraged by future users of our environment with confidence.

Our environment included the use of Amazon Web Service’s Elastic Container Registry (ECR) which proved to be more challenging to implement than we originally considered. The purpose of this post is to help others implement some basic docker functionality in their pipelines more quickly than we did. In addition, we will showcase some of the techniques we used to test our docker images.

Quick look at the SDK

It’s important that you make the connection in your mind now that each interface in the docker gem has a corresponding API call in the Docker Engine. With that said, it would be wise to take a quick stroll through the documentation and API reference before writing any code. There’s a few methods, such as Docker.authenticate! that will require some advanced configuration that is vaguely documented and you’ll need to combine all the sources to piece them together.

For those of you who are example driven learners, be sure to check out an example project on github that we put together to demonstrate these concepts.

Authenticating with ECR

We’re going to save you the trouble of fumbling through the various documentation by providing an example to authenticate with an Amazon ECR repository. The below example assumes you have already created a repository in AWS. You’ll also need to have an instance role attached to the machine you’re executing this snippet from or have your API key and secret configured.

Snippet 1. Using ruby to authenticate with Amazon ECR

require 'aws-sdk-core'
require 'base64'
require 'docker'

# AWS SDK ECR Client
ecr_client =

# Your AWS Account ID
aws_account_id = '1234567890'

# Grab your authentication token from AWS ECR
token = ecr_client.get_authorization_token(
 registry_ids: [aws_account_id]

# Remove the https:// to authenticate
ecr_repo_url = token.proxy_endpoint.gsub('https://', '')

# Authorization token is given as username:password, split it out
user_pass_token = Base64.decode64(token.authorization_token).split(':')

# Call the authenticate method with the options
Docker.authenticate!('username' => user_pass_token.first,
                     'password' => user_pass_token.last,
                     'email' => 'none',
                     'serveraddress' => ecr_repo_url)

Pro Tip #1: The docker-api gem stores the authentication credentials in memory at runtime (see: Docker.creds.) If you’re using something like a Jenkins CI server to execute your pipeline in separate stages, you’ll need to re-authenticate at each step. Here’s an example of how the sample project accomplishes this.

Snippet 2. Using ruby to logout

Docker.creds = nil

Pro Tip #2: You’ll need to logout or deauthenticate from ECR in order to pull images from the public/default repository.

Build, tag and push

The basic functions of the docker-api gem are pretty straightforward to use with a vanilla configuration. When you tie in a remote repository such as Amazon ECR there can be some gotcha’s. Here are some more examples of the various stages of a docker image you’ll encounter with your pipeline. Now that you’re authenticated, let’s get to doing some real work!

The following snippets assume you’re authenticated already.

Snippet 3. The complete lifecycle of a basic Docker image

# Build our Docker image with a custom context
image = Docker::Image.build_from_dir(
 { 'dockerfile' => 'ubuntu/Dockerfile' }

# Tag our image with the complete endpoint and repo name
image.tag(repo: '',
          tag: 'latest')

# Push only our tag to ECR
image.push(nil, tag: 'latest')

Integration Tests for your Docker Images

Here at Stelligent, we know that the key to software quality is writing tests. It’s part of our core DNA. So it’s no surprise we have some method to writing integration tests for our docker images. The solution will use Serverspec to launch the intermediate container, execute the tests and compile the results while we use the docker-api gem we’ve been learning to build the image and provide the image id into the context.

Snippet 5. Writing a serverspec test for a Docker Image

require 'serverspec'

describe 'Dockerfile' do
 before(:all) do
   set :os, family: :debian
   set :backend, :docker
   set :docker_image, '123456789' # image id

 describe file('/usr/local/apache2/htdocs/index.html') do
   it { should exist }
   it { should be_file }
   it { should be_mode 644 }
   it { should contain('Automation for the People') }

 describe port(80) do
   it { should be_listening }

Snippet 6. Executing your test

$ rspec spec/integration/docker/stelligent-example_spec.rb

You’re Done!

Using a tool like the swipely/docker-api to drive your automation scripts is a huge step forward in providing fast, reliable feedback in your Docker pipelines compared to writing bash. By doing so, you’re able to write unit and integration tests for your pipeline code to ensure both your infrastructure and your application is well-tested. Not only can you unit test your docker-api implementation, but you can also leverage the AWS SDK’s ability to stub responses and take your testing a step further when implementing with Amazon Elastic Container Repository.

See it in Action

We’ve put together a short (approx. 5 minute) demo of using these tools. Check it out from github and take a test drive through the life cycle of Docker within AWS.

Working with cool tools like Docker and its open source SDKs is only part of the exciting work we do here at Stelligent. To take your pipeline a step further from here, you should check out mu — a microservices platform that will deploy your newly tested docker containers. You can take that epic experience a step further and become a Stelligentsia because we are hiring innovative and passionate engineers like you!

Microservice testing with mu: injecting quality into the pipeline

mu is a tool that makes it simple and cost-efficient for developers to use AWS as the platform for running their microservices.  In this second post of the blog series focused on the mu tool, we will use mu to incorporate automated testing in the microservice pipeline we built in the first post.  

Why should I care about testing?

Most people, when asked why they want to adopt continuous delivery, will reply that they want to “go faster”.  Although continuous delivery will enable teams to get to production quicker, people often overlook the fact that it will also improve the quality of the software…at the same time.

Martin Fowler, in his post titled ContinuousDelivery, says you’re doing continuous delivery when:

  • Your software is deployable throughout its lifecycle
  • Your team prioritizes keeping the software deployable over working on new features
  • Anybody can get fast, automated feedback on the production readiness of their systems any time somebody makes a change to them
  • You can perform push-button deployments of any version of the software to any environment on demand

It’s important to recognize that the first three points are all about quality.  Only when a team focuses on injecting quality throughout the delivery pipeline can they safely “go faster”.  Fowler’s list of continuous delivery characteristics is helpful in assessing when a team is doing it right.  In contrast, here is a list of indicators that show when a team is doing it wrong:

  • Testing is done late in a sprint or after multiple sprints
  • Developers don’t care about quality…that is left to the QA team
  • A limited number of people are able to execute tests and assess production readiness
  • Majority of tests require manual execution

This problem is only compounded with microservices.  By increasing the number of deployable artifacts by a factor of 10x or 100x, you are increasing the complexity of the system and therefore the volume of testing required.  In short, if you are trying to do microservices and continuous delivery without considering test automation, you are doing it wrong.

Let mu help!

blog1The continuous delivery pipeline that mu creates for your microservice will run automated tests that you define on every execution of the pipeline.  This provides quick feedback to all team members as to the production readiness of your microservice.

mu accomplishes this by adding a step to the pipeline that runs a CodeBuild project to execute your tests.  Any tool that you can run from within CodeBuild can be used to test your microservice.

Let’s demonstrate this by adding automated tests to the microservice pipeline we created in the first post for the banana service.

Define tests with Postman

First, we’ll use Postman to define a test collection for our microservice.  Details on how to use Postman are beyond the scope of this post, but here are few good videos to learn more:

I started by creating a test collection named “Bananas”.  Then I created requests in the collection for the various REST endpoints I have in my microservice.  The requests use a Postman variable named “BASE_URL” in the URL to allow these tests to be run in other environments.  Finally, I defined tests in the JavaScript DSL that is provided by Postman to validate the results match my expectations.

Below, you will find an example of one of the requests in my collection:


Once we have our collection created and we confirm that our tests pass locally, we can export the collection as a JSON file and save it in our microservices repository.  For this example, I’ve exported the collection to “src/test/postman/collection.json”.


Run tests with CodeBuild

Now that we have our end to end tests defined in a Postman collection, we can use Newman to run these tests from CodeBuild.  The pipeline that mu creates will check for the existence of a file named buildspec-test.yml and if it exists, will use that for running the tests.  

There are three important aspects of the buildspec:

  • Install the Newman tool via NPM
  • Run our test collection with Newman
  • Keep the results as a pipeline artifact

Here’s the buildspec-test.yml file that was created:

version: 0.1

## Use newman to run a postman collection.  
## The env.json file is created by the pipeline with BASE_URL defined

      - npm install newman --global
      - newman run -e env.json -r html,json,junit,cli src/test/postman/collection.json

    - newman/*

The final change that we need to make for mu to run our tests in the pipeline is to specify the image for CodeBuild to use for running our tests.  Since the tool we use for testing requires Node.js, we will choose the appropriate image to have the necessary dependencies available to us.  So our updated mu.yml file now looks like:

- name: acceptance
- name: production
  name: banana-service
  port: 8080
  - /bananas
      provider: GitHub
      repo: myuser/banana-service
      image: aws/codebuild/java:openjdk-8
      image: aws/codebuild/eb-nodejs-4.4.6-amazonlinux-64:2.1.3

Apply these updates to our pipeline my running mu:

$ mu pipeline up
Upserting Bucket for CodePipeline
Upserting Pipeline for service 'banana-service' …

Commit and push our changes to cause a new run of the pipeline to occur:

$ git add --all && git commit -m "add test automation" && git push

We can see the results by monitoring the build logs:

$ mu pipeline logs -f
2017/04/19 16:39:33 Running command newman run -e env.json -r html,json,junit,cli src/test/postman/collection.json
2017/04/19 16:39:35 newman
2017/04/19 16:39:35
2017/04/19 16:39:35 Bananas
2017/04/19 16:39:35
2017/04/19 16:39:35  New Banana
2017/04/19 16:39:35   POST [200 OK, 354B, 210ms]
2017/04/19 16:39:35     Has picked date
2017/04/19 16:39:35     Not peeled
2017/04/19 16:39:35
2017/04/19 16:39:35  All Bananas
2017/04/19 16:39:35   GET [200 OK, 361B, 104ms]
2017/04/19 16:39:35     Status code is 200
2017/04/19 16:39:35     Has bananas
2017/04/19 16:39:35
2017/04/19 16:39:35
2017/04/19 16:39:35                           executed    failed
2017/04/19 16:39:35
2017/04/19 16:39:35               iterations         1         0
2017/04/19 16:39:35
2017/04/19 16:39:35                 requests         2         0
2017/04/19 16:39:35
2017/04/19 16:39:35             test-scripts         2         0
2017/04/19 16:39:35
2017/04/19 16:39:35       prerequest-scripts         0         0
2017/04/19 16:39:35
2017/04/19 16:39:35               assertions         5         0
2017/04/19 16:39:35
2017/04/19 16:39:35  total run duration: 441ms
2017/04/19 16:39:35
2017/04/19 16:39:35  total data received: 331B (approx)
2017/04/19 16:39:35
2017/04/19 16:39:35  average response time: 157ms
2017/04/19 16:39:35


Adopting continuous delivery for microservices demands the injection of test automation into the pipeline.  As demonstrated in this post, mu gives you the freedom to choose whatever test framework you desire and executes those test for you on every pipeline execution.  Only once your pipeline is doing the work of assessing the microservice readiness for production can you achieve the goal of delivering faster while also increasing quality.

In the upcoming posts in this blog series, we will look into:

  • Custom Resources –  create custom resources like DynamoDB with mu during our microservice deployment
  • Service Discovery – use mu to enable service discovery via `Consul` to allow for inter-service communication
  • Additional Use Cases – deploy applications other than microservices via mu, like a wordpress stack

Until then, head over to stelligent/mu on GitHub and get started!

Additional Resources

Did you find this post interesting? Are you passionate about working with the latest AWS technologies? If so, Stelligent is hiring and we would love to hear from you!

Introducing mu: a tool for managing your microservices in AWS

mu is a tool that Stelligent has created to make it simple and cost-efficient for developers to use AWS as the platform for running their microservices.  In this first post of the blog series focused on the mu tool, we will be introducing the motivation for the tool and demonstrating the deployment of a microservice with it.  

Why microservices?

The architectural pattern of decomposing an application into microservices has proven extremely effective at increasing an organization’s ability to deliver software faster.  This is due to the fact that microservices are independently deployable components that are decoupled from other components and highly cohesive around a single business capability.  Those attributes of a microservice yield smaller team sizes that are able to operate with a high level of autonomy to deliver what the business wants at the pace the market demands.

What’s the catch?

When teams begin their journey with microservices, they usually face cost duplication on two fronts:  infrastructure and re-engineering. The first duplication cost is found in the “infrastructure overhead” used to support the microservice deployment.  For example, if you are deploying your microservices on AWS EC2 instances, then for each microservice, you need a cluster of EC2 instances to ensure adequate capacity and tolerance to failures.  If a single microservice requires 12 t2.small instances to meet capacity requirements and we want to be able to survive an outage in 1 out of 4 availability zones, then we would need to run 16 instances total, 4 per availability zone.  This leaves an overhead cost of 4 t2.small instances.  Then multiply this cost by the number of microservices for a given application and it is easy to see that the overhead cost of microservices deployed in this manner can add up quickly.

Containers to the rescue!

An approach to addressing this challenge of overhead costs is to use containers for deploying microservices.  Each microservice would be deployed as a series of containers to a cluster of hosts that is shared by all microservices.  This allows for greater density of microservices on EC2 instances and allows the overhead to be shared by all microservices.  Amazon ECS (EC2 Container Service) provides an excellent platform for deploying microservices as containers.  ECS leverages many AWS services to provide a robust container management solution.  Additionally, a developer can use tools like CodeBuild and CodePipeline to create continuous delivery pipelines for their microservices.

That sounds complicated…

This approach leads to the second duplication cost of microservices: the cost of “reengineering”.  There is a significant learning curve for developers to learn how to use all these different AWS resources to deploy their microservices in an efficient manner.  If each team is using their autonomy to engineer a platform on AWS for their microservices then a significant level of engineering effort is being duplicated.  This duplication not only causes additional engineering costs, but also impedes a team’s ability to deliver the differentiating business capabilities that they were commissioned to do in the first place.

Let mu help!

To address these challenges, mu was created to simplify the declaration and administration of the AWS resources necessary to support microservices.  mu is a tool that a developer uses from their workstation to deploy their microservices to AWS quickly and efficiently as containers.  It codifies best practices for microservices, containers and continuous delivery pipelines into the AWS resources it creates on your behalf.  It does this from a simple CLI application that can be installed on the developer’s workstation in seconds.  Similar to how the Serverless Framework improved the developer experience of Lambda and API Gateway, this tool makes it easier for developers to use ECS as a microservices platform.

Additionally, mu does not require any servers, databases or other AWS resources to support itself.  All state information is managed via CloudFormation stacks.  It will only create resources (via CloudFormation) necessary to run your microservices.  This means at any point you can stop using mu and continue to manage the AWS resources that it created via AWS tools such as the CLI or the console.

Core components

The mu tool consists of three main components:

  • Environments – an environment includes a shared network (VPC) and cluster of hosts (ECS and EC2 instances) necessary to run microservices as clusters.  The environments include the ability to automatically scale out or scale in based on resource requirements across all the microservices that are deployed to it.  Many environments can exist (e.g. development, staging, production)
  • Services – a microservice that will be deployed to a given environment (or environments) as a set of containers.
  • Pipeline – a continuous delivery pipeline that will manage the building, testing, and deploying of a microservice in the various environments.


Installing and configuring mu

First let’s install mu:

$ curl -s | sh

If you’re appalled at the idea of curl | bash installers, then you can always just download the latest version directly.

mu will use the same mechanism as aws-cli to authenticate with the AWS services.  If you haven’t configured your AWS credentials yet, the easiest way to configure them is to install the aws-cli and then follow the aws configure instructions:

$ aws configure
AWS Secret Access Key [None]: wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
Default region name [None]: us-west-2
Default output format [None]: json

Setup your microservice

In order for mu to setup a continuous delivery pipeline for your microservice, you’ll need to run mu from within a git repo.  For this demo, we’ll be using the stelligent/banana-service repo for our microservice.  If you want to follow along and try this on your own, you’ll want to fork the repo and clone your fork.

Let’s begin with cloning the microservice repo:

$ git clone
$ cd banana-service

Next, we will initialize mu configuration for our microservice:

$ mu init --env
Writing config to '/Users/casey.lee/Dev/mu/banana-service/mu.yml'
Writing buildspec to '/Users/casey.lee/Dev/mu/banana-service/buildspec.yml'

We need to update the mu.yml that was generated with the URL paths that we want to route to this microservice and the CodeBuild image to use:

- name: acceptance
- name: production
  name: banana-service
  port: 8080
  - /bananas
      provider: GitHub
      repo: myuser/banana-service
      image: aws/codebuild/java:openjdk-8

Next, we need to update the generated buildspec.yml to include the gradle build command:

version: 0.1
      - gradle build
    - '**/*'

Finally, commit and push our changes:

$ git add --all && git commit -m "mu init" && git push

Create the pipeline

Make sure you have GitHub token with repo and admin:repo_hook scopes to provide to the pipeline in order to integrate with your GitHub repo.  Then you can create the pipeline:

$ mu pipeline up
Upserting Bucket for CodePipeline
Upserting Pipeline for service 'banana-service' ...

Now that the pipeline is created, it will build and deploy for every commit to your git repo.  You can monitor the status of the pipeline as it builds and deploys the microservice:

$ mu svc show

Pipeline URL:
|   STAGE    |  ACTION  |                 REVISION                 |   STATUS    |     LAST UPDATE     |
| Source     | Source   | 1f1b09f0bbc3f42170b8d32c68baf683f1e3f801 | Succeeded   | 2017-04-07 15:12:35 |
| Build      | Artifact |                                        - | Succeeded   | 2017-04-07 15:14:49 |
| Build      | Image    |                                        - | Succeeded   | 2017-04-07 15:19:02 |
| Acceptance | Deploy   |                                        - | InProgress  | 2017-04-07 15:19:07 |
| Acceptance | Test     |                                        - | -           |                   - |
| Production | Approve  |                                        - | -           |                   - |
| Production | Deploy   |                                        - | -           |                   - |
| Production | Test     |                                        - | -           |                   - |


You can also monitor the build logs:

$ mu pipeline logs -f
[Container] 2017/04/07 22:25:43 Running command mu -c mu.yml svc deploy acceptance 
[Container] 2017/04/07 22:25:43 Upsert repo for service 'banana-service' 
[Container] 2017/04/07 22:25:43   No changes for stack 'mu-repo-banana-service' 
[Container] 2017/04/07 22:25:43 Deploying service 'banana-service' to 'dev' from '' 

Once the pipeline has completed deployment of the service, you can view logs from service:

$ mu service logs -f acceptance                                                                                                                                                                         
  .   ____          _          __ _ _
 /\\ / ___'_ __ _ _(_)_ __ __ _ \ \ \ \
( ( )\___ | '_ | '_| | '_ \/ _` | \ \ \ \
 \\/  ___)| |_)| | | | | || (_| | ) ) ) )
  ' | ____| .__|_| |_|_| |_\__, | / / / /

:: Spring Boot ::        (v1.4.0.RELEASE) 
2017-04-07 22:30:08.788  INFO 5 --- [           main] com.stelligent.BananaApplication         : Starting BananaApplication on 6a4d5544d9de with PID 5 (/app.jar started by root in /) 
2017-04-07 22:30:08.824  INFO 5 --- [           main] com.stelligent.BananaApplication         : No active profile set, falling back to default profiles: default 
2017-04-07 22:30:09.342  INFO 5 --- [           main] ationConfigEmbeddedWebApplicationContext : Refreshing org.springframework.boot.context.embedded.AnnotationConfigEmbeddedWebApplicationContext@108c4c35: startup date [Fri Apr 07 22:30:09 UTC 2017]; root of context hierarchy 
2017-04-07 22:30:09.768  INFO 5 --- [           main] com.stelligent.BananaApplication         : Starting BananaApplication on 7818361f6f45 with PID 5 (/app.jar started by root in /) 

Testing the service

Finally, we can get the information about the ELB endpoint in the acceptance environment to test the service:

$ mu env show acceptance                                                                                                                                                                        

Environment:    acceptance
Cluster Stack:  mu-cluster-dev (UPDATE_COMPLETE)
VPC Stack:      mu-vpc-dev (UPDATE_COMPLETE)
Bastion Host:
Base URL:
Container Instances:
|    EC2 INSTANCE     |   TYPE   |     AMI      |     AZ     | CONNECTED | STATUS | # TASKS | CPU AVAIL | MEM AVAIL |
| i-093b788b4f39dd14b | t2.micro | ami-62d35c02 | us-west-2a | true      | ACTIVE |       3 |       604 |       139 |

|    SERVICE     |                                IMAGE                                |      STATUS      |     LAST UPDATE     |
| banana-service | | CREATE_COMPLETE  | 2017-04-07 15:25:43 |

$ curl -s | jq

    "pickedAt": "2017-04-10T10:34:27.911",
    "peeled": false,
    "links": [
        "rel": "self",
        "href": ""


To cleanup the resources that mu created, run the following commands:

$ mu pipeline term
$ mu env term acceptance
$ mu env term production


As you can see, mu addresses infrastructure and engineering overhead costs associated with microservices.  It makes deployment of microservices via containers simple and cost-efficient.  Additionally, it ensures the deployments are repeatable and non-dramatic by utilizing a continuous delivery pipeline for orchestrating the flow of software changes into production.

In the upcoming posts in this blog series, we will look into:

  • Test Automation –  add test automation to the continuous delivery pipeline with mu
  • Custom Resources –  create custom resources like DynamoDB with mu during our microservice deployment
  • Service Discovery – use mu to enable service discovery via Consul to allow for inter-service communication
  • Additional Use Cases – deploy applications other than microservices via mu, like a wordpress stack

Until then, head over to stelligent/mu on GitHub and get started.  Keep in touch with us in our Gitter room and share your feedback!

Additional Resources

Did you find this post interesting? Are you passionate about working with the latest AWS technologies? If you are Stelligent is hiring and we would love to hear from you!

Docker Swarm Mode on AWS

Docker Swarm Mode is the latest entrant in a large field of container orchestration systems. Docker Swarm was originally released as a standalone product that ran master and agent containers on a cluster of servers to orchestrate the deployment of containers. This changed with the release of Docker 1.12 in July of 2016. Docker Swarm Mode is now officially part of docker-engine, and built right into every installation of Docker. Swarm Mode brought many improvements over the standalone Swarm product, including:

  • Built-in Service Discovery: Docker Swarm originally included drivers to integrate with Consul, etcd or Zookeeper for the purposes of Service Discovery. However, this required the setup of a separate cluster dedicated to service discovery. The Swarm Mode manager nodes now assign a unique DNS name to each service in the cluster, and load balances between the running containers in those services.
  • Mesh Routing: One of the most unique features of Docker Swarm Mode is  Mesh Routing. All of the nodes within a cluster are aware of the location of every container within the cluster via gossip. This means that if a request arrives on a node that is not currently running the service for which that request was intended, the request will be routed to a node that is running a container for that service. This makes it so that nodes don’t have to be purpose built for specific services. Any node can run any service, and every node can be load balanced equally, reducing complexity and the number of resources needed for an application.
  • Security: Docker Swarm Mode uses TLS encryption for communication between services and nodes by default.
  • Docker API: Docker Swarm Mode utilizes the same API that every user of Docker is already familiar with. No need to install or learn additional software.
  • But wait, there’s more! Check out some of the other features at Docker’s Swarm Mode Overview page.

For companies facing increasing complexity in Docker container deployment and management, Docker Swarm Mode provides a convenient, cost-effective, and performant tool to meet those needs.

Creating a Docker Swarm cluster


For the sake of brevity, I won’t reinvent the wheel and go over manual cluster creation here. Instead, I encourage you to follow the fantastic tutorial on Docker’s site.

What I will talk about however is the new Docker for AWS tool that Docker recently released. This is an AWS Cloudformation template that can be used to quickly and easily set up all of the necessary resources for a highly available Docker Swarm cluster, and because it is a Cloudformation template, you can edit the template to add any additional resources, such as Route53 hosted zones or S3 buckets to your application.

One of the very interesting features of this tool is that it dynamically configures the listeners for your Elastic Load Balancer (ELB). Once you deploy a service on Docker Swarm, the built-in management service that is baked into instances launched with Docker for AWS will automatically create a listener for any published ports for your service. When a service is removed, that listener will subsequently be removed.

If you want to create a Docker for AWS stack, read over the list of prerequisites, then click the Launch Stack button below. Keep in mind you may have to pay for any resources you create. If you are deploying Docker for AWS into an older account that still has EC2-Classic, or wish to deploy Docker for AWS into an existing VPC, read the FAQ here for more information.


Deploying a Stack to Docker Swarm

With the release of Docker 1.13 in January of 2017, major enhancements were added to Docker Swarm Mode that greatly improved its ease of use. Docker Swarm Mode now integrates directly with Docker Compose v3 and officially supports the deployment of “stacks” (groups of services) via docker-compose.yml files. With the new properties introduced in Docker Compose v3, it is possible to specify node affinity via tags, rolling update policies, restart policies, and desired scale of containers. The same docker-compose.yml file you would use to test your application locally can now be used to deploy to production. Here is a sample service with some of the new properties:

version: "3"

    image: dockersamples/examplevotingapp_vote:before
      - 5000:80
      - frontend
      replicas: 2
        parallelism: 1
        delay: 10s
        condition: on-failure
        constraints: [node.role == worker]

While most of the properties within this YAML structure will be familiar to anyone used to Docker Compose v2, the deploy property is new to v3. The replicas field indicates the number of containers to run within the service. The update_config  field tells the swarm how many containers to update in parallel and how long to wait between updates. The restart_policy field determines when a container should be restarted. Finally, the placement field allows container affinity to be set based on tags or node properties, such as Node Role. When deploying this docker-compose file locally, using docker-compose up, the deploy properties are simply ignored.

Deployment of a stack is incredibly simple. Follow these steps to download Docker’s example voting app stack file and run it on your cluster.

SSH into any one of your Manager nodes with the user 'docker' and the EC2 Keypair you specified when you launched the stack.

curl -O

docker stack deploy -c docker-stack.yml vote

You should now see Docker creating your services, volumes and networks. Now run the following command to view the status of your stack and the services running within it.

docker stack ps vote

You’ll get output similar to this:


This shows the container id, container name, container image, node the container is currently running on, its desired and current state, and any errors that may have occurred. As you can see, the vote_visualizer.1 container failed at run time, so it was shut down and a new container spun up to replace it.

This sample application opens up three ports on your Elastic Load Balancer (ELB): 5000 for the voting interface, 5001 for the real-time vote results interface, and 8080 for the Docker Swarm visualizer. You can find the DNS Name of your ELB by either going to the EC2 Load Balancers page of the AWS console, or viewing your Cloudformation stack Outputs tab in the Cloudformation page of the AWS Console. Here is an example of the Cloudformation Outputs tab:


DefaultDNSTarget is the URL you can use to access your application.

If you access the Visualizer on port 8080, you will see an interface similar to this:


This is a handy tool to see which containers are running, and on which nodes.

Scaling Services

Scaling services is as simple as running the command docker service scale SERVICENAME=REPLICAS, for example:

docker service scale vote_vote=3

will scale the vote service to 3 containers, up from 2. Because Docker Swarm uses an overlay network, it is able to run multiple containers of the same service on the same node, allowing you to scale your services as high as your CPU and Memory allocations will allow.

Updating Stacks

If you make any changes to your docker-compose file, updating your stack is incredibly easy. Simply run the same command you used to create your stack:

docker stack deploy -c docker-stack.yml vote

Docker Swarm will update any services that were changed from the previous version, and adhere to any update_configs specified in the docker-compose file. In the case of the vote service specified above, only one container will be updated at a time, and a 10 second delay will occur once the first container is successfully updated before the second container is updated.

Next Steps

This was just a brief overview of the capabilities of Docker Swarm Mode in Docker 1.13. For further reading, feel free to explore the Docker Swarm Mode and Docker Compose docs. In another post, I’ll be going over some of the advantages and disadvantages of Docker Swarm Mode compared to other container orchestration systems, such as ECS and Kubernetes.

If you have any experiences with Docker Swarm Mode that you would like to share, or have any questions on any of the materials presented here, please leave a comment below!


Docker Swarm Mode

Docker for AWS

Docker Example Voting App Github