Testing AWS Config rules using potemkin-decorator

Automated integration testing of a Python AWS Config rule is a challenging but necessary undertaking to ensure that the rule provides accurate results. Creating resources, waiting for the results to show up in AWS Config, testing the results and tearing down the resources takes several minutes in the best case, and it can take many times longer. Potemkin-decorator streamlines the integration testing process and, with correct configuration, can minimize the test duration, facilitating automated integration testing in deployment pipelines.

Why AWS Config

AWS Config provides a method for enterprises to “measure” their cloud environment for purposes of change control, audit/compliance, and security analysis. The simplicity of this tool, providing a boolean result (COMPLIANT/NON_COMPLIANT) for a particular resource (an EC2 instance) against a particular rule (instance launched less than 60 days ago), gives it flexibility and utility. AWS Config provides many “managed” rules to allow rapid use of this tool in a check-the-box fashion. The flip side of check-the-box is that the rules are limited to general use cases. Most enterprises will find their needs quickly surpass what managed rules can do, leading to the development of custom Config rules.

Custom Config rules need to be coded. And like any code, custom Config rules need to be tested to ensure they are providing accurate results. Unit tests go a long way, but not far enough. Config rules must be tested in a real-world environment to ensure their accuracy. Here are some examples of how a rule can pass all unit tests but still not provide accurate results in the real world:

The Boto3 stubbed out responses do not exactly match the actual response, causing a KeyError and no evaluations to be returned to AWS Config.
The evaluation returned to AWS Config uses the resource name where Config is expecting the resource id (or vice versa), causing the resulting evaluation to not be associated with an actual resource.
A Boto3 stubbed out call uses a camelCase variable where in AWS the variable is PascalCase, causing boto3 response errors and no evaluations to be returned to AWS Config.
When a resource is deleted the Config rule does not return a NOT_APPLICABLE evaluation, causing the deleted resources to remain in the Config results.
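
The resource id and deletion cases above can be made concrete with a small sketch. It is not taken from any particular rule, and the helper name build_evaluation is hypothetical. The dictionary keys are assumed to follow the configuration item AWS Config passes to a change-triggered rule and the evaluation shape accepted by put_evaluations.

```python
def build_evaluation(configuration_item, compliance_type, annotation=""):
    """Build one evaluation dict for AWS Config's PutEvaluations call.

    `configuration_item` is the dict AWS Config hands to a
    change-triggered rule. The function name is hypothetical.
    """
    # Deleted resources are reported as NOT_APPLICABLE so that a stale
    # result does not stay attached to a resource that no longer exists.
    deleted_statuses = ("ResourceDeleted", "ResourceDeletedNotRecorded")
    if configuration_item["configurationItemStatus"] in deleted_statuses:
        compliance_type = "NOT_APPLICABLE"

    evaluation = {
        "ComplianceResourceType": configuration_item["resourceType"],
        # Report the id from the configuration item (not the name) so the
        # result is associated with the recorded resource.
        "ComplianceResourceId": configuration_item["resourceId"],
        "ComplianceType": compliance_type,
        "OrderingTimestamp": configuration_item["configurationItemCaptureTime"],
    }
    if annotation:
        evaluation["Annotation"] = annotation
    return evaluation
```

A unit test can check this shape, but only an integration test confirms that AWS Config actually attaches the result to the intended resource.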

Manual Integration Testing

Testing the multi-component system of a custom Config rule and AWS Config can be done manually or in code. When starting out with custom Config rules it is relatively easy to do this manually.

Deploy the custom Config rule to a test environment
Create compliant and non compliant resources
Wait for the resources to show up in AWS Config
Validate the results

Some downsides of manual integration testing are

Humans make mistakes or fail to run tests.
There are no visible results. There is nothing in the code to show what tests were run, the particular settings for each test and the results of those tests.
Manual integration testing can only be enforced by a manual process (documenting the tests in the pull request, the reviewer ensuring that test results were included and reviewing the results).

Automated Integration Testing

Having code to perform integration tests is a way to overcome these downsides. Using CloudFormation to spin up resources and python/pytest tests to ensure the actual AWS Config results match the expected results removes manual process from testing. The CloudFormation template and pytest code can be reviewed as part of the merge request or as a pipeline step before deploying to production.

Downsides of automated integration testing include

Code complexity (create resources, wait for Config results, validate results, destroy resources)
Duration of tests (In the best case scenario, spinning up a simple CloudFormation template with an S3 bucket and then waiting for the results to show up in AWS Config takes about 3 minutes.)
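
To give a sense of the "wait for Config results" step, here is a minimal polling sketch. It is not potemkin-decorator's implementation; the function name and its parameters are hypothetical, and config_client is assumed to behave like a boto3 Config client. Pagination is ignored for brevity.

```python
import time


def wait_for_compliance(config_client, rule_name, resource_id,
                        timeout=300, interval=10, sleep=time.sleep):
    """Poll AWS Config until `rule_name` has evaluated `resource_id`.

    Returns the ComplianceType string, or raises TimeoutError.
    Hypothetical helper; only the first page of results is inspected.
    """
    waited = 0
    while True:
        response = config_client.get_compliance_details_by_config_rule(
            ConfigRuleName=rule_name
        )
        for result in response.get("EvaluationResults", []):
            identifier = result["EvaluationResultIdentifier"]
            qualifier = identifier["EvaluationResultQualifier"]
            if qualifier["ResourceId"] == resource_id:
                return result["ComplianceType"]
        if waited >= timeout:
            raise TimeoutError(
                f"no evaluation for {resource_id} after {timeout}s"
            )
        # The evaluation is asynchronous, so back off and ask again.
        sleep(interval)
        waited += interval
```

Passing sleep as a parameter keeps the helper testable without real waiting; in a real test the default time.sleep would be used.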

The potemkin-decorator takes care of the code complexity, but does not directly address the duration of tests. Automating tests using potemkin-decorator will provide significant benefit over manual tests only if the tests can be written to have an acceptable duration. Here the focus changes from the theoretical to the practical: looking at real-world testing strategies and finding the best way to minimize the testing duration.

All of the following tests use an AWS Config rule to ensure S3 buckets have KMS default bucket encryption.

Testing One Resource

Creating one resource and validating the results in AWS Config will be the baseline test. The CloudFormation template encrypted_bucket.yml creates an encrypted S3 bucket with a parameter for the encryption type. The pytest code for this test is:

Running the pytest results in:

This test completed successfully and took almost 3 minutes to complete.

Multiple Resources, Multiple Tests

An S3 bucket can have 3 states for default bucket encryption: not enabled, enabled with AES256 and enabled with aws:kms. To ensure full coverage, the integration test will need to create resources that match each of these conditions.
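
The mapping from these three states to expected results can be sketched as follows. This is an illustration of the decision, not the rule used in this article. The function name is hypothetical, and the input is assumed to have the shape of the ServerSideEncryptionConfiguration element of an S3 get_bucket_encryption response, or None when default encryption is not enabled.

```python
def evaluate_bucket_encryption(encryption_config):
    """Return the expected compliance type for a bucket.

    Only KMS default encryption counts as COMPLIANT; AES256 and
    "not enabled" are both NON_COMPLIANT. Hypothetical helper.
    """
    if not encryption_config:
        # Default bucket encryption is not enabled.
        return "NON_COMPLIANT"
    for rule in encryption_config.get("Rules", []):
        default = rule.get("ApplyServerSideEncryptionByDefault", {})
        if default.get("SSEAlgorithm") == "aws:kms":
            return "COMPLIANT"
    # Encryption is enabled, but with something other than aws:kms.
    return "NON_COMPLIANT"
```

One COMPLIANT state and two NON_COMPLIANT states is why three resources are needed for full coverage.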

A new unencrypted_bucket.yml CloudFormation template is needed. Then two additional tests are added to the above for the NON_COMPLIANT conditions

Encrypted with AES256
Unencrypted

And the results are

$ pytest test/integration/

Not surprisingly, the run takes three times as long as one test.

Minimizing the Duration of Testing Multiple Resources

The creation of a resource initiates an AWS Config evaluation of that resource. This evaluation is an asynchronous process and can take several minutes to complete. Even using S3 buckets, which get created relatively quickly, our total integration test duration is over nine minutes. This time would be significantly longer for resources like RDS instances, which take a long time to provision, or if testing required many more COMPLIANT/NON_COMPLIANT conditions.

With one test per condition, the testing flow looks like this

Create resource
Wait for AWS Config evaluation
Evaluate test
Destroy resource
Repeat above 2 more times

To speed up testing the steps need to be done in parallel

Create resources
Wait for AWS Config evaluations
Evaluate tests
Destroy resources
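
The difference between the two flows can be illustrated without touching AWS. In this sketch each integration test is replaced by a stand-in that only sleeps, which is an assumption made for illustration; all of the names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def simulated_test(wait_seconds):
    """Stand-in for one integration test that mostly waits on AWS."""
    time.sleep(wait_seconds)
    return "passed"


def run_serial(waits):
    """Run the stand-in tests one after another; total is the sum."""
    start = time.perf_counter()
    results = [simulated_test(w) for w in waits]
    return results, time.perf_counter() - start


def run_parallel(waits):
    """Run the stand-in tests at the same time; total is about the max."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=len(waits)) as pool:
        results = list(pool.map(simulated_test, waits))
    return results, time.perf_counter() - start
```

With three equal waits, run_serial takes roughly three times as long as run_parallel, which mirrors the nine minutes versus three minutes seen with the real tests.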

Three possible solutions were examined for doing steps in parallel:

pytest-parallel
pytest-xdist
One test (one CloudFormation template, 3 assertions)

pytest-parallel

pytest-parallel can run tests concurrently in multiple threads, which requires the tests themselves to be thread safe. That requirement might cause problems with the existing unit tests. Testing confirms this is the case.

The existing boto3/stubber unit tests are not thread safe and would require significant changes before pytest-parallel could be used. Unless there is no better option, we pass on this one.

pytest-xdist

pytest-xdist runs tests in parallel across separate worker processes, so tests do not share in-process state and thread safety is not a concern. Doing the same check of unit tests as above gives successful results.

And … drum roll please … running the integration tests we created above

The time is down to essentially the same as a single integration test.

One Test

One test means using one CloudFormation template containing all three resources (all_three_buckets.yml). This one test will have three assertions to validate the AWS Config results for all three resources.

And the test run

The time is about the same as pytest-xdist.

Solution Discussion

Two of our three potential solutions for reducing our AWS Config integration test duration worked well. Both got the testing time down to the same as a single test. Can we reduce this further? Unlikely. Most of the time for a single test is spent waiting for the asynchronous Config rule evaluation to complete. The remaining time is spent creating and deleting the stack. And we have to do all three of these steps, so this testing time is as good as it gets.

Which should you use? The answer will depend on your particular circumstances

pytest-xdist

Advantages

Tests can be written using isolation (one assertion per test) like unit tests.

Disadvantages

Additional complexity (install additional packages, change the command used in the pipeline)
Unknown effects of parallel execution

If you have control of your pipelines, then pytest-xdist might be the better option. It will require testing to ensure there are no other effects of parallelism. If you have shared pipelines, the effort of testing pytest-xdist and adding it to that pipeline might not be worth the benefit of having isolated tests for each integration test condition.

Performance Note: Because integration tests are mostly waiting on CloudFormation and AWS Config, running several pytest-xdist workers at once did not seem to be a concern. This was validated by running 6 parallel tests on a 2 CPU machine. The CPU usage during this test was minimal.

One Test

Advantages

No changes to packages or pipeline
No testing of pytest add-on functionality required

Disadvantages

Loss of isolation

On the surface there is a loss of isolation as multiple assertions are being run per test. But integration tests are by design not isolated. Each test interacts with both AWS Config and AWS CloudFormation. If assertion isolation is not a significant concern, then the one test method is probably the best choice. And if your environment does not allow modifications to a shared pipeline, this is also your solution.

Performance Note: Scaling this solution to more integration tests should not increase the total time noticeably assuming that CloudFormation can create all of the resources in parallel.

Streamlined One Test

The use of multiple config_rule_wait_for_resources calls in one test made for repetitive code and excessive Boto3 calls to AWS Config, as each resource was checked in series. A new method was added to potemkin-decorator to streamline the code required for one test. This method checks all resources in parallel, compares the actual results to the expected results and returns True or False. Here is the one test rewritten with the new config_rule_wait_for_compliance_results method.
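
The idea behind checking every resource in each polling pass can be sketched as follows. This is not potemkin-decorator's implementation or signature. The function name, the fetch_results callable and the {resource_id: compliance_type} shape are all assumptions made for illustration.

```python
import time


def wait_for_expected_results(fetch_results, expected, timeout=300,
                              interval=10, sleep=time.sleep):
    """Poll until every expected resource has a result, then compare.

    `fetch_results` is a callable returning {resource_id: compliance_type}
    for everything the rule has evaluated so far; `expected` has the same
    shape. Returns True only if every expected resource matches, and
    False on a mismatch or a timeout. Hypothetical helper.
    """
    waited = 0
    while True:
        actual = fetch_results()  # one lookup covers all resources
        if all(resource_id in actual for resource_id in expected):
            return all(
                actual[resource_id] == compliance
                for resource_id, compliance in expected.items()
            )
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

A test built on a helper like this needs a single assertion on the returned boolean, at the cost of a less specific failure message than one assertion per resource.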

Final Thoughts

Using potemkin-decorator and one of two testing methodologies enables automated integration tests of AWS custom Config rules with a reasonable duration. These can be used in a local development environment or in a CI pipeline. The use of automated testing provides significant benefit over manual testing and should therefore be considered by any enterprise with a significant investment in custom Config rules.

If you have other needs for automated integration testing with AWS resources, consider using potemkin-decorator to simplify the process. And if your use case could be simplified by service-specific tooling (similar to how potemkin-decorator has the AWS Config specific methods), submit a feature request. Or even better, send us a pull request.

Featured Image By -jkb- – Own work, CC BY-SA 3.0
