AWS Integration Testing for boto with potemkin-decorator

Test Automation for Integrating with External Services

Developing test automation code for an interface to an “external” service is always a difficult proposition. There is a spectrum of techniques for developing reproducible tests against an external service. At one end of the spectrum are “mocking” techniques, and at the other end are “live” integration tests involving the actual external service.

Test Automation for boto

The boto library is a Python client library for interfacing with AWS services. Developing test automation for code that uses boto has the same difficulties as integration testing against any external service. In effect, an AWS service is the “external service”.

The spectrum of techniques for testing boto code from least realistic to most includes:

pytest Mocking
- Mocking intercepts any boto calls and replaces the responses with expected responses.
boto Stubbing
- Stubbing intercepts calls to boto as mocking does, but includes a “schema” for checking whether the expectations are legitimate.
Service Virtualization or “Simulation”
- Simulating involves setting up a locally controlled service to behave like the AWS service. Test code can then hit that local service “over the wire”. Tooling in this space includes moto and localstack.
Live testing
- Live testing involves actually sending requests to an AWS service and verifying that the live responses match expectations.
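
As a minimal sketch of the first technique, the following test replaces the S3 client with a `unittest.mock.MagicMock`, so no request ever leaves the process. The `count_objects` function and the bucket name are hypothetical and exist only for illustration; `list_objects_v2` and its `Contents` response key are part of the real S3 client API.

```python
from unittest.mock import MagicMock


def count_objects(s3_client, bucket):
    """Business logic under test: count the keys on the first page of a bucket listing."""
    response = s3_client.list_objects_v2(Bucket=bucket)
    # S3 omits "Contents" entirely when the bucket is empty.
    return len(response.get("Contents", []))


def test_count_objects_with_mocked_client():
    # The mock stands in for boto3.client("s3").
    fake_s3 = MagicMock()
    fake_s3.list_objects_v2.return_value = {
        "KeyCount": 2,
        "Contents": [{"Key": "a.txt"}, {"Key": "b.txt"}],
    }

    assert count_objects(fake_s3, "example-bucket") == 2
    # Verify the code under test asked for the right bucket.
    fake_s3.list_objects_v2.assert_called_once_with(Bucket="example-bucket")
```

Passing the client in as an argument, rather than creating it inside the function, is what makes the substitution this easy.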

Benefits and Drawbacks

Each of these techniques can provide valuable feedback, but each also has trade-offs.

Mocking/Stubbing

Benefits

A test that mocks out AWS runs in total isolation from AWS. The test can provide feedback in milliseconds. This is especially useful when running tests in a CI/CD pipeline.
Setting up the “initial conditions” for the test is mostly straightforward since the responses are completely faked.

Drawbacks

Since the responses are replaced with expectations, they can be COMPLETELY DIVORCED FROM REALITY. This approach is great for testing how downstream business logic behaves given a response from AWS, but the faked response may not match what AWS would actually return, which invalidates the test.
Spaghetti/legacy code containing boto requests “all over the place” may not be straightforward to mock out.
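
The first drawback can be made concrete with a small sketch. The `is_encrypted` function below is hypothetical and contains a deliberate bug: it reads a top-level `Rules` key, whereas the real `get_bucket_encryption` response nests the rules under `ServerSideEncryptionConfiguration`. Because the mock returns whatever the test author invented, the test passes anyway.

```python
from unittest.mock import MagicMock


def is_encrypted(s3_client, bucket):
    """Return True when the bucket reports a default encryption rule."""
    response = s3_client.get_bucket_encryption(Bucket=bucket)
    # Deliberate bug for illustration: the real response keeps the rules
    # under response["ServerSideEncryptionConfiguration"]["Rules"].
    return bool(response.get("Rules"))


def test_is_encrypted_passes_against_invented_response():
    fake_s3 = MagicMock()
    # The test author guessed the response shape, and guessed wrong.
    fake_s3.get_bucket_encryption.return_value = {
        "Rules": [{"ApplyServerSideEncryptionByDefault": {"SSEAlgorithm": "AES256"}}]
    }

    # The test passes even though the code would misreport a real bucket.
    assert is_encrypted(fake_s3, "example-bucket") is True
```

Fed a response in the shape AWS actually returns, the same function reports an encrypted bucket as unencrypted, and nothing in the mocked test reveals that.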

Simulation

Benefits

A good simulator can allow for writing tests that feel like they are “live” over-the-wire, but without hitting the actual system. These tests may not run as quickly as a mocked test but can still run very fast given they are hitting a local process.
Such tests can still be run as part of a CI/CD pipeline.

Drawbacks

A bad simulator gives confusing or false results, which is worse than having no simulation. For example, mileage varies on the reliability of responses provided by moto and localstack. Sometimes they are consistent with what AWS would return, other times they are not.
If AWS provided a simulator for each service and ensured that it behaved reliably, this approach would be much more powerful. As it stands, third parties develop moto and localstack, and they do not necessarily keep up with new service APIs as they are released.
Setting up a simulator as part of an initial condition for an automated test is much more complex than mocking. On the other hand, tooling around docker-compose can mitigate the complexity (like Arquillian-Cube in the Java world).

Live Testing

Benefits

A “live” test with AWS returns realistic results; no guessing required.
AWS is about as stable as it gets, so it’s unlikely that a live integration test will hold up a CI/CD pipeline because AWS is down.

Drawbacks

Setting up the initial conditions in AWS can be just as complex as the boto code that is under test, which makes these tests harder to develop.
These tests can be very slow to set up and run. Tests will likely run in minutes instead of seconds. In a CI/CD pipeline, as few as three or four of these tests could sabotage the fast feedback developers are depending upon.
These tests require credentials to access an AWS account. This may restrict the ability to run such tests in a CI/CD pipeline, depending on the permissions that the CI/CD pipeline is allowed to have in AWS.
The act of creating actual AWS resources (over and over again) will incur a monetary cost.

Initial Conditions in Live AWS Integration Testing

While there are many drawbacks to live AWS integration testing to consider, there comes a time when it is necessary. Especially as a developer is first experimenting with a given AWS service, observing how the service actually behaves is critical.

The potemkin-decorator attempts to address the difficulties of setting up initial conditions for live integration tests. Instead of having to set up the initial conditions by writing boto code that is possibly as complex as the code under test, it allows capturing initial conditions as a CloudFormation template. CloudFormation is a far simpler “language” to develop in than boto.

For example, in developing boto code that operates on an S3 bucket, the following CloudFormation template could capture the initial conditions:
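
A minimal sketch of such a template, assuming a single `BucketName` parameter and a `BucketName` output (both names are illustrative and not taken from the original template), with AES256 default encryption to match the `aes256_bucket.yml` file name used later:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Illustrative initial conditions - one S3 bucket with AES256 default encryption

Parameters:
  # Assumed parameter name; supplied by the test through the decorator
  BucketName:
    Type: String

Resources:
  Bucket:
    Type: AWS::S3::Bucket
    Properties:
      BucketName: !Ref BucketName
      BucketEncryption:
        ServerSideEncryptionConfiguration:
          - ServerSideEncryptionByDefault:
              SSEAlgorithm: AES256

Outputs:
  # Assumed output name; Ref on a bucket resolves to the bucket name
  BucketName:
    Value: !Ref Bucket
```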

The following pytest code invokes the potemkin.CloudFormationStack decorator.

It invokes CreateStack on the aes256_bucket.yml template with the specified dictionary of Parameter values.
It creates a stack that starts with the name TestStack and ends with a timestamp suffix. The full name is wired into the pytest method as the stack_name argument so that the stack resources can be queried.
The stack outputs from CreateStack are turned into a vanilla dictionary and wired into the test method as the stack_outputs argument.
The test method is called. From here, the test code can expect the bucket exists, operate upon it, and make assertions about the outcomes of the operations.
The stack is torn down.
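
The lifecycle in the steps above can be sketched with a toy decorator. This is not potemkin-decorator's implementation or its API: `FakeCloudFormation`, `cloudformation_stack`, and all of their parameters are hypothetical stand-ins that run entirely in memory. The sketch only shows the create, inject, test, and tear-down ordering.

```python
import time


class FakeCloudFormation:
    """In-memory stand-in for CloudFormation that records lifecycle events."""

    def __init__(self):
        self.events = []

    def create_stack(self, name, template_path, parameters):
        self.events.append(("create", name))
        # Pretend the template's outputs simply echo its parameters.
        return dict(parameters)

    def delete_stack(self, name):
        self.events.append(("delete", name))


def cloudformation_stack(cfn, template_path, stack_name_stem, parameters):
    """Create a stack, pass its name and outputs to the test, then delete it."""
    def decorator(test_function):
        def wrapper():
            # The stem plus a timestamp suffix keeps stack names unique per run.
            stack_name = f"{stack_name_stem}{int(time.time())}"
            stack_outputs = cfn.create_stack(stack_name, template_path, parameters)
            try:
                return test_function(stack_outputs=stack_outputs, stack_name=stack_name)
            finally:
                # Tear down even when the test body raises.
                cfn.delete_stack(stack_name)
        # Keep the test's name so runners report it under the original name.
        wrapper.__name__ = test_function.__name__
        return wrapper
    return decorator


cfn = FakeCloudFormation()


@cloudformation_stack(cfn, "aes256_bucket.yml", "TestStack", {"BucketName": "example-bucket"})
def test_bucket(stack_outputs, stack_name):
    cfn.events.append(("test", stack_name))
    assert stack_name.startswith("TestStack")
    assert stack_outputs["BucketName"] == "example-bucket"
```

Putting the teardown in a `finally` block is the important design choice in a live setting: a failing assertion must not leave billable resources behind.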

To execute this test, first:

Make sure python3 is installed. For more information please see: https://docs.python.org/3/using/index.html
Ensure valid AWS credentials exist for the myprofile AWS profile
Save the CloudFormation template as aes256_bucket.yml
Save the test code as bucket_test.py

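Then install pytest, boto3, and potemkin-decorator (for example with pip), and run pytest against bucket_test.py.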

AWS Config Rule Testing

Developing test automation for AWS Config Rules has similar challenges to developing arbitrary boto code. The first way to test rule code is to separate the business logic for the rule from any calls to boto and to test it in isolation (i.e. mocking/stubbing). That said, given a need to do a “live” test on an AWS Config Rule for ultimate confirmation, potemkin-decorator can be used to set up compliant or non-compliant resources for the test.

An additional complexity in doing live testing with AWS Config is that rules run asynchronously. The API includes a call to start the evaluation of a given rule, but it does not return the results synchronously. Further, just because Config Service reports that a rule evaluation was successful after a resource was created, that doesn’t guarantee the resource was actually evaluated (for internal rules as opposed to custom rules). Therefore, a live test must wait/poll for the given resource to appear in the results before asserting compliance or non-compliance. The potemkin-decorator component includes a helper method (evaluate_config_rule_and_wait_for_resource) for invoking AWS Config Rules and waiting for the results.
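
The wait/poll step can be sketched as follows. This is not the implementation of evaluate_config_rule_and_wait_for_resource: the function name, its parameters, and the timeout defaults are assumptions made for illustration. The two client calls, `start_config_rules_evaluation` and `get_compliance_details_by_config_rule`, and the response keys are from the boto3 Config client. The client is passed in so the loop can be exercised with a fake.

```python
import time


def wait_for_resource_compliance(config_client, rule_name, resource_id,
                                 timeout_seconds=300, poll_seconds=10,
                                 sleep=time.sleep):
    """Start a rule evaluation, then poll until resource_id has a result.

    Returns the ComplianceType string, or raises TimeoutError.
    Pagination (NextToken) is ignored to keep the sketch short.
    """
    config_client.start_config_rules_evaluation(ConfigRuleNames=[rule_name])

    waited = 0
    while waited <= timeout_seconds:
        response = config_client.get_compliance_details_by_config_rule(
            ConfigRuleName=rule_name
        )
        for result in response.get("EvaluationResults", []):
            qualifier = result["EvaluationResultIdentifier"]["EvaluationResultQualifier"]
            if qualifier["ResourceId"] == resource_id:
                return result["ComplianceType"]
        # The resource has not been evaluated yet; wait and ask again.
        sleep(poll_seconds)
        waited += poll_seconds

    raise TimeoutError(f"{resource_id} never appeared in results for {rule_name}")
```

A test would then assert on the returned value, for example that it equals `NON_COMPLIANT` for a resource created specifically to violate the rule.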

For example, in testing the eip-attached Config Rule, the following CloudFormation template creates a non-compliant Elastic IP resource:
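
A minimal sketch of such a template, assuming an output named `EipAllocationId` (the resource and output names are illustrative and not taken from the original template). The Elastic IP is allocated but never associated with an instance or network interface, which is what makes it non-compliant:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Description: Illustrative initial conditions - one unattached Elastic IP

Resources:
  UnattachedEip:
    Type: AWS::EC2::EIP
    Properties:
      Domain: vpc

Outputs:
  # Assumed output name; the allocation ID identifies the EIP resource
  EipAllocationId:
    Value: !GetAtt UnattachedEip.AllocationId
```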

The following pytest code invokes the potemkin.CloudFormationStack decorator.

It invokes CreateStack on the template to create the unattached Elastic IP.
The stack outputs from CreateStack are turned into a vanilla dictionary and wired into the test method; in this case they carry the resource ID for the EIP.
The test method is called. From here, the test code invokes the eip-attached Config Rule and waits until the newly created resource appears in the detailed results for that rule.
The test asserts that the EIP is non-compliant.
The stack is torn down.

Conclusion

Testing boto code can be difficult, but there are many techniques that can be applied to ensure quality software. Live “integration testing” with AWS should most likely be done sparingly, but when it becomes necessary, potemkin-decorator can help improve developer velocity.
