Apply project management practices to cloud delivery

Ultimately, when you’re creating a system for delivering software, you’re writing software, and in many ways it’s no different from writing any other software system. This means that estimating how long each and every activity will take, down to the day, is nearly impossible – even when you’re using the exact same technology. This is because each project is different: the people are different, how much they want to automate is different, and how the software and infrastructure are architected is always different. That said, your stakeholders often need an overall estimate so that they can procure funds and other resources. On many projects, there is contention over how much can be accomplished within a certain amount of time.
Our approach is to identify the number of weeks for which stakeholders can procure these funds and resources and implement the deployment production line features within this constraint. Usually, there’s a minimum feasible amount of time to get something useful in place. For most of our projects these days, this is at least six weeks of two or more cloud delivery engineers implementing these solutions.
We timebox all activities and let our customers decide what gets implemented in each “phase” – if you will. The overall project breakdown looks like this:

  • In the first 10% of the project time, we perform the project setup
  • By the time 20% of the project is complete, we have a CI server and the deployment production template in place
  • At 40%, we’ve implemented as many as possible of the steps that our customer has ordered in the acceptance stage of the deployment production line
  • At 50%, we’ve implemented as many as possible of the steps that our customer has ordered in the exploratory and self-service deployment steps of the deployment production line
  • At 60%, we’ve implemented as many as possible of the steps that our customer has ordered in the capacity stage of the deployment production line
  • At 80%, we’ve implemented as many as possible of the steps that our customer has ordered in the pre-production stage of the deployment production line

Timeboxed Pipeline

By the time the project is complete, we’ve implemented as many as possible of the steps that our customer has ordered in the production and cross-cutting stages of the deployment production line.
While it’s not perfect, this approach allows us to deliver useful features to our customers based on their priorities, without getting stuck implementing one feature at the cost of all the others. In other words, they’ve always got something working. Not every step in their software delivery process may be automated, but what exists is something they can extend and build upon, rather than a collection of half-implemented features.
So, here’s a high-level example of a 12-week implementation effort.

  • We complete the project setup by the end of week 1
  • We complete the Continuous Integration Server and Deployment Pipeline Template by the end of week 2
  • We complete the commit stage by the end of week 2
  • We complete the ordered acceptance stage steps by the end of week 5
  • We complete the ordered exploratory stage and self-service deployment steps by the end of week 6
  • We complete the ordered capacity stage steps by the end of week 7
  • We complete the ordered pre-production stage steps by the end of week 10
  • We complete the ordered production stage steps by the end of week 11
  • We complete the ordered cross-cutting steps by the end of week 12

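The timebox arithmetic above can be sketched as a small calculation: given the total number of weeks, the percentage boundaries map to end-of-week deadlines. The phase names and percentages come from the breakdown above; the rounding rule is our assumption, chosen so that a 12-week project reproduces the example schedule.

```python
# Phase boundaries as a fraction of total project time, per the
# breakdown above. Mapping "production and cross-cutting" to 100%
# is implied by the text.
PHASES = [
    ("project setup", 0.10),
    ("CI server, pipeline template, and commit stage", 0.20),
    ("acceptance stage", 0.40),
    ("exploratory and self-service deployment", 0.50),
    ("capacity stage", 0.60),
    ("pre-production stage", 0.80),
    ("production and cross-cutting stages", 1.00),
]

def schedule(total_weeks):
    """Map each phase to its end-of-week deadline (rounded to the
    nearest week; the rounding rule is an assumption, not policy)."""
    return [(name, round(pct * total_weeks)) for name, pct in PHASES]

for name, week in schedule(12):
    print(f"end of week {week:2d}: {name}")
```

For a 12-week engagement this yields project setup by week 1, the acceptance stage by week 5, pre-production by week 10, and everything by week 12 – matching the example schedule above.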
To emphasize again: we commit to addressing the stages in each phase, but we only commit to completing the steps that fit within the timeboxed phase. These timeboxes are based on the overall number of weeks the customer has committed to the project. Our customer stakeholder is responsible for ordering the features so that we implement the most important ones first. This way, they have something working at any point in time.

Apply project delivery practices

We adhere to a few specific practices in how we deliver our solutions to customers. Some of the ideas are borrowed from how Amazon develops software for its customers.
Before we write one line of code, we do the following:

  • Write the Case Study (Challenge, Solution, Benefits) for the solution we’re developing – as we expect it to be written at the conclusion of our work
  • Write the initial FAQ (up to 15 or so Q/A) for the solution
  • Write the README as we expect the system to be used
  • Provide a Lightweight UI, strawman or mockups of the solution. This may be CD/Jenkins/Go screenshots, etc.
  • Illustrate the initial Infrastructure Architecture Diagram
  • Write a framework for the User Manual
  • Write a framework for the “non step def” Acceptance Tests (e.g. in Cucumber)
  • Apply a five-step heuristic to all of our work, as described in the Five Steps to Continuous section of this article

When we state that we have a “fully working CD system and software system in AWS”, it means the following:
Step 1: New AWS Account
We assume that the project begins at the point of creating a new AWS account; whether it is a new account or not, the CD system should be capable of running on one. We’ve learned from experience that it’s easy to make implicit assumptions that certain AWS-specific assets already exist. So, the best practice we adhere to is to provision all infrastructure resources on a new AWS account and ferret out any implicit dependencies. Typical dependencies include relying on specific S3 buckets, relying on region-specific resources, expecting specific IAM users to exist, and so on.
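One way to ferret out these dependencies before a run is a preflight check that compares the resources the pipeline expects against what actually exists in the account. This is a minimal sketch: the resource names are hypothetical, and in practice the "existing" sets would come from AWS API calls (e.g. boto3's `list_buckets` and `list_users`) rather than being passed in.

```python
# Hypothetical required resources; a real pipeline would read these
# from its version-controlled configuration.
REQUIRED_BUCKETS = {"pipeline-artifacts", "app-config"}
REQUIRED_IAM_USERS = {"deploy-bot"}

def preflight(existing_buckets, existing_users):
    """Return the list of missing dependencies. The 'existing' sets
    are parameters (rather than live AWS lookups) so the check
    itself stays testable."""
    missing = []
    missing += [f"s3://{b}" for b in sorted(REQUIRED_BUCKETS - set(existing_buckets))]
    missing += [f"iam:{u}" for u in sorted(REQUIRED_IAM_USERS - set(existing_users))]
    return missing

# A brand-new AWS account contains none of these, so every implicit
# dependency surfaces immediately:
print(preflight(existing_buckets=set(), existing_users=set()))
```

Running this against an empty account is exactly the exercise described above: anything the pipeline silently assumed to exist shows up in the missing list.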
Step 2: Given the following resources

  • Instructions (e.g. Github README, GDrive, Wiki, etc.)
  • Instructions for recreating the complete software system are fully described step by step so that any technical professional can follow them. While comprehensive, the instructions should be no more than three printed pages.
  • Version-Control Repositories (e.g. Github) – All application code, configuration, infrastructure code, data and automated tests to recreate the complete software system is contained in these repos. Moreover, all system code (i.e. the code/configuration to build CI server, etc.) is in version control as well.
  • S3 – Location and name of the required S3 buckets and contents (i.e. if they’re not present, the system won’t work) are clearly described in the README along with how to transfer contents between AWS accounts (as applicable)
  • Binary Repositories
    • JARs/DLLs, etc. (Commons, etc.)
      • If applicable, these may be hosted in a Nexus, Artifactory, S3, etc. repo
    • Binary packages (For example, gems, yum, apt-get repos: Apache, MySQL, Tomcat, etc. packages)
    • Any external dependencies are clearly described in the README

Step 3: Launch the CI/CD Platform (e.g. Jenkins/Go)

  • Operator gets the version-controlled source code
  • Operator configures the “External Configuration”: The documentation describes how to enter basic configuration data into a very limited number of configuration datasources (should not take more than 5 minutes to enter one time). For example, application repo URLs, credentials, etc. Moreover, the README describes how the Operator can modify this configuration data on an ongoing basis, as appropriate.
  • Operator runs a single command to launch a fully-configured Jenkins stack in any well-supported AWS region where all the AWS products being used in the stack are available
  • Feedback mechanisms have been configured so that the right people are notified on the pipeline status after each commit. The feedback setup configuration is part of the External Configuration.

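The "single command" in the step above might look like the following CloudFormation launch. This is a sketch, not Stelligent's actual tooling: the stack name, template path, parameter names, and region are all illustrative.

```shell
# Launch a fully configured Jenkins stack from a version-controlled
# CloudFormation template. All names and paths below are hypothetical.
aws cloudformation create-stack \
  --stack-name jenkins-cd \
  --template-body file://infrastructure/jenkins.template.json \
  --parameters ParameterKey=KeyName,ParameterValue=cd-keypair \
  --capabilities CAPABILITY_IAM \
  --region us-east-1
```

The point is that everything the stack needs – template, parameters, External Configuration – is already in version control, so the Operator supplies nothing beyond this one invocation.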
Step 4: Working System
When code is committed to the version-control repositories, the fully configured deployment production line creates the full software system (application, configuration, infrastructure and data) until it’s ready to release to production or reports a failure. Failures are specified in automated tests that are committed to the version-control repo and run as part of the pipeline.
From Stelligent’s perspective, we have more control over the infrastructure tests than over other types of tests, so the infrastructure should be operational, whereas we might need to troubleshoot other parts of the software system to get the application working.
The CD system is run – even if no internal dependencies have changed – at least once a day to verify that external dependencies haven’t broken the system. The CD system and infrastructure should adhere to the Stelligent Information Security Policy.
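The daily run with no changes is typically just a scheduled trigger. As a sketch (the host, job name, and credentials are hypothetical), a crontab entry could invoke Jenkins' remote build API once a day:

```shell
# Crontab entry: trigger the pipeline daily even when nothing has
# changed, to catch breakage in external dependencies. The job name,
# host, and credentials are illustrative; Jenkins' remote build API
# (POST /job/<name>/build) requires a user and API token.
0 6 * * * curl -fsS -X POST \
    --user "ci-bot:${JENKINS_API_TOKEN}" \
    "https://jenkins.example.com/job/deployment-pipeline/build"
```

An equivalent alternative is a periodic build trigger configured inside the CI server itself, which keeps the schedule in the pipeline's own version-controlled configuration.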
Typical Deliverables and example links: