It’s 7:37 AM on a Sunday.

You’re in the Security Operations Center (SOC) and alarms and emails are seemingly being triggered everywhere. You and a colleague are combing through dashboards and logs to determine what is causing these alerts.

After running around with your “hair on fire” for around 30 minutes, you finally determine that someone leaked administrator access keys by committing them to a public Git repository. With these keys, the attacker was able to run up about $500 in Amazon EC2 spend within half an hour, most likely for cryptomining.

Ironically, you learn of this compromise not from AWS or from your logging and monitoring systems, but from unusual changes in your billing activity.

You delete the access keys and review the AWS CloudTrail logs to determine what other activity was performed by any users on the AWS account. Fortunately, you’d previously enabled CloudTrail log file integrity validation, so you’re confident you’re viewing all the recent and valid AWS API calls.

You and your colleague come to the quick realization that it could’ve been much, much worse.

You walk through the remaining steps of your incident response and remediation workflow. You poke around to see if there are other resources that are vulnerable to an attack like this. While you ultimately perform a much more exhaustive post mortem, one of the first things you notice is that your Amazon DynamoDB tables were not encrypted. By viewing the CloudTrail logs, you notice the attacker did not access these tables.

Unfortunately, this scenario is all too common: humans are the ones detecting and remediating a security incident. In this post, you’ll learn how to use automation to detect and remediate these types of incidents so that humans are largely removed from the detection and remediation loop.

AWS CAF

AWS has published and regularly updates the AWS Cloud Adoption Framework (AWS CAF). In the CAF, there are three business perspectives (Business, People, and Governance) and three technical perspectives (Platform, Security, and Operations). In the Security perspective, there are five core pillars. They are:

Identity & Access Management
Detective Controls
Infrastructure Security
Data Protection
Incident Response

As you’ll learn, you can use concepts based on the Detective Controls pillar to help prevent, detect, and respond automatically to anomalies by leveraging a deployment pipeline that creates resources to monitor and remediate these incidents.

Encrypting all the things

Let’s imagine that your organization has a directive control stating that all AWS resources must be encrypted in transit and at rest. While we could also discuss the lack of detective controls that would have caught the breach described in the introduction, this example focuses on what you noticed during the post-incident investigation: the DynamoDB tables were not encrypted. Of course, this violates your encryption directive.

Figure 1 illustrates a workflow for detecting when resources are not encrypted. In this case, we’ll focus only on DynamoDB but the same approach can be used for other resources too.

Figure 1 – Workflow for Encryption Detection and Incident Response on AWS

Here are the high-level steps in the workflow:

Step 1 – An engineer defines a DynamoDB table, along with other AWS resources, in an automated provisioning tool such as AWS CloudFormation. They commit their changes to a Git source code repository.
Step 2 – An AWS CodePipeline pipeline (itself provisioned through CloudFormation as part of a bootstrapping process) starts and runs preventive checks against all CloudFormation templates in the Git repository using the open source cfn_nag framework, which notifies engineers of security vulnerabilities. Most notably, cfn_nag provides rules for detecting whether encryption has been defined as part of provisioning certain AWS resources.
Step 3 – CodePipeline calls CloudFormation to configure the DynamoDB resources within the AWS account.
Step 4 – AWS Config Rules monitor changes to AWS services. Running one of the AWS managed Config rules invokes an AWS Lambda function, which discovers changes to DynamoDB resources and flags the tables that are not encrypted as non-compliant.
Step 5 – Once AWS Config flags a non-compliant resource, an AWS Lambda function is called to send Slack messages (via AWS Chatbot) to developers who have recently committed code related to DynamoDB provisioning.
Step 6 – The Slack messages received by the developers contain a detailed “best practice” implementation so that the engineer can commit the fix as code to the Git repository, where it is applied as part of the next change to the deployment pipeline that provisions the DynamoDB tables.
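To make Steps 1 and 6 more concrete, here is a minimal sketch of what an encrypted table definition might look like when expressed as a CloudFormation template. The function name, the logical ID OrdersTable, and the id key attribute are illustrative assumptions, not part of the original workflow; the part that matters for the encryption directive is the SSESpecification property.

```python
import json


def encrypted_table_template(logical_id: str = "OrdersTable") -> dict:
    """Return a minimal CloudFormation template with one encrypted table.

    The logical ID and the "id" key attribute are hypothetical placeholders.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            logical_id: {
                "Type": "AWS::DynamoDB::Table",
                "Properties": {
                    "AttributeDefinitions": [
                        {"AttributeName": "id", "AttributeType": "S"}
                    ],
                    "KeySchema": [
                        {"AttributeName": "id", "KeyType": "HASH"}
                    ],
                    "BillingMode": "PAY_PER_REQUEST",
                    # The property the encryption directive cares about.
                    "SSESpecification": {"SSEEnabled": True},
                },
            }
        },
    }


if __name__ == "__main__":
    # Emit the template as JSON so it could be committed to the repository.
    print(json.dumps(encrypted_table_template(), indent=2))
```

Generating the template from code is only one option; a hand-written YAML or JSON template committed to the repository serves the same purpose.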

This scenario provides an automated incident response workflow for data protection across an AWS account. Similar solutions can be deployed across multiple AWS accounts using a combination of services such as Amazon CloudWatch Events, Amazon GuardDuty, Amazon Macie, Amazon Inspector, AWS CloudFormation StackSets, AWS Organizations, and so on.

Preventive Control: cfn_nag

cfn_nag is an open source static analysis framework for discovering security vulnerabilities in AWS CloudFormation templates. cfn_nag provides the following features:

Allows developers to find obvious security flaws in CloudFormation templates before doing a deployment
Provides flexible controls for rule application including whitelists, blacklists, and fine-grained suppressions
Supports custom rule development for enterprise-specific security violations

In this example encryption detection workflow, cfn_nag is called from a deployment pipeline defined in AWS CodePipeline. This way, with every code change, the pipeline can notify team members of problems before the infrastructure is launched, which keeps those vulnerabilities from ever reaching a running environment.
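To illustrate the kind of check such a preventive control performs, here is a simplified sketch in Python. It is not cfn_nag's implementation (cfn_nag is a separate tool with its own rule set); the function name and the sample template are assumptions made for this example. It walks a parsed template and reports DynamoDB tables that do not explicitly enable server-side encryption.

```python
from typing import Any, Dict, List


def find_unencrypted_tables(template: Dict[str, Any]) -> List[str]:
    """Return logical IDs of DynamoDB tables without SSEEnabled set to true."""
    offenders = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::DynamoDB::Table":
            continue
        properties = resource.get("Properties") or {}
        sse = properties.get("SSESpecification") or {}
        # Templates may carry booleans as True or as the string "true".
        # Anything else (missing, false, an intrinsic function) is flagged.
        if str(sse.get("SSEEnabled")).lower() != "true":
            offenders.append(logical_id)
    return sorted(offenders)


# Hypothetical template used only to exercise the check.
SAMPLE_TEMPLATE = {
    "Resources": {
        "EncryptedTable": {
            "Type": "AWS::DynamoDB::Table",
            "Properties": {"SSESpecification": {"SSEEnabled": True}},
        },
        "PlainTable": {"Type": "AWS::DynamoDB::Table", "Properties": {}},
        "Bucket": {"Type": "AWS::S3::Bucket"},
    }
}

if __name__ == "__main__":
    for name in find_unencrypted_tables(SAMPLE_TEMPLATE):
        print(f"FAIL: DynamoDB table {name} does not enable encryption")
```

In a pipeline stage, a non-empty result would typically fail the build so the change never reaches CloudFormation.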

Other Preventive Controls

There are other automated controls you can enable as part of a deployment pipeline to detect and prevent vulnerabilities from entering your infrastructure. Tools like Checkmarx and SonarQube can run thousands of static analysis rules covering issues such as SQL injection and cross-site scripting.

Detective Control: Config Rules

AWS Config is a service that detects state changes across multiple supported AWS services. On top of it, you can define AWS Config Rules, which evaluate those changes using logic that runs in AWS Lambda. In this scenario, the dynamodb-table-encryption-enabled managed AWS Config Rule discovers changes to DynamoDB resources and flags unencrypted tables as non-compliant.

AWS Config Rules provides 86+ managed rules that are predefined and managed by AWS. There’s also a curated repository of Config Rules developed by the community that you can leverage. Finally, you can define custom Config Rules in Lambda or generate them using the Rule Development Kit.
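As a rough illustration of what a custom Config Rule in Lambda might look like, here is a minimal sketch that evaluates a DynamoDB table for encryption. It is not the implementation of the managed dynamodb-table-encryption-enabled rule. The function names are hypothetical, and the shape of the configuration payload (an SSEDescription object with a Status field, modelled on the DynamoDB DescribeTable response) is an assumption you should verify against real configuration items in your account.

```python
import json
from typing import Any, Dict


def evaluate_table(configuration_item: Dict[str, Any]) -> Dict[str, Any]:
    """Build one evaluation record for a configuration item."""
    if configuration_item.get("resourceType") != "AWS::DynamoDB::Table":
        compliance = "NOT_APPLICABLE"
    else:
        configuration = configuration_item.get("configuration") or {}
        # Assumption: encryption state is exposed as SSEDescription.Status.
        sse = configuration.get("SSEDescription") or {}
        if sse.get("Status") == "ENABLED":
            compliance = "COMPLIANT"
        else:
            compliance = "NON_COMPLIANT"
    return {
        "ComplianceResourceType": configuration_item["resourceType"],
        "ComplianceResourceId": configuration_item["resourceId"],
        "ComplianceType": compliance,
        "OrderingTimestamp": configuration_item["configurationItemCaptureTime"],
    }


def lambda_handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Entry point: AWS Config passes invokingEvent as a JSON string."""
    invoking_event = json.loads(event["invokingEvent"])
    evaluation = evaluate_table(invoking_event["configurationItem"])
    # A deployed function would report the result back to AWS Config here,
    # for example with boto3's put_evaluations call and event["resultToken"].
    # That call is omitted so the sketch runs without AWS credentials.
    return evaluation
```

Keeping the evaluation logic in a pure function, separate from the handler, makes it straightforward to unit test in the same deployment pipeline that provisions the rule.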

Other Detective Controls

You can also leverage AWS services such as Amazon CloudWatch Events, Amazon GuardDuty, Amazon Inspector, and Amazon Macie to detect security and compliance issues as part of a detection and remediation workflow that can be enabled through a deployment pipeline.

Responsive Controls: AWS Chatbot and AWS Lambda

Once AWS Config flags a non-compliant resource, an AWS Lambda function is called to notify Slack via AWS Chatbot. Slack receives a message from AWS Chatbot and displays a detailed “best practice” implementation so that the engineer can commit the fix as code to the Git repository, where it is applied as part of the next change to the deployment pipeline. At the time of writing, DynamoDB only allows you to encrypt tables when you’re creating them, so there’s no auto-remediation scenario that can take place.
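Here is a minimal sketch of the message-building part of such a responsive Lambda function. The event shape is an assumption modelled on AWS Config compliance change events (a detail object carrying configRuleName, resourceId, and newEvaluationResult.complianceType); the function name, message wording, and remediation snippet are hypothetical. Delivery to Slack through AWS Chatbot is deliberately left out, since it depends on how your notification topic and channel are configured.

```python
from typing import Any, Dict, Optional

# Hypothetical "best practice" snippet included in the notification.
REMEDIATION_SNIPPET = """\
SSESpecification:
  SSEEnabled: true"""


def build_notification(event: Dict[str, Any]) -> Optional[Dict[str, str]]:
    """Return a subject and message for a non-compliant table, else None."""
    detail = event.get("detail") or {}
    result = detail.get("newEvaluationResult") or {}
    if result.get("complianceType") != "NON_COMPLIANT":
        return None  # Nothing to report for compliant resources.
    table = detail.get("resourceId", "unknown")
    rule = detail.get("configRuleName", "unknown")
    message = (
        f"Config rule {rule} flagged DynamoDB table {table} as unencrypted.\n"
        "Add the following to the table definition in your CloudFormation "
        "template and commit the change:\n"
        f"{REMEDIATION_SNIPPET}"
    )
    return {"subject": f"DynamoDB table {table} is not encrypted", "message": message}


if __name__ == "__main__":
    sample_event = {
        "detail": {
            "configRuleName": "dynamodb-table-encryption-enabled",
            "resourceId": "orders",
            "newEvaluationResult": {"complianceType": "NON_COMPLIANT"},
        }
    }
    notification = build_notification(sample_event)
    if notification is not None:
        print(notification["subject"])
        print(notification["message"])
```

Returning None for compliant resources lets the calling function skip the notification step entirely, so engineers only hear about tables that need attention.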

Alternatively, you can use Amazon Lex directly for chat capabilities, link contextually to detailed knowledge bases, and make automated decisions about which issues to remediate automatically and which ones warrant sending detailed code snippets to engineers.

Summary

I provided an example showing how you can completely rethink your approach to security and compliance. By thinking from a fully automated perspective, you can focus human effort on designing systems that detect and remediate problems and then recommend solutions to prevent them from happening again.

You learned which tools and AWS services can be used to create a fully automated preventive and detective workflow for a data protection scenario. These include cfn_nag, AWS Config Rules, AWS CodePipeline, AWS CloudFormation, and others.

Resources

Stelligent Amazon Pollycast