Continuous Code Auditing for SOC 2 Compliance

With the growing adoption of continuous deployment, more and more organizations have increased their deployment frequency in order to get their latest products into the hands of their customers as quickly as possible. A complement to this fast-paced deployment practice is continuous auditing. Both of these approaches constitute a leftward shift from the more traditional methodologies of software development.

This increased rate of deployment can lead to the occasional need for an emergency hotfix. Like many organizations, we use GitHub to build and deploy our software. In some cases, emergency fixes are required to avoid real-world consequences, including downtime or the degradation of services. Having these changes documented as code (rather than manipulating resources in a web console) is preferred. In many cases, optimizing for mean time to repair (MTTR) is more advantageous than optimizing for mean time to resolve. At times, an emergency hotfix can solve the problem immediately, leaving time for an appropriate retrospective and for work on a longer-term solution to the underlying problem.

At Rewind, we follow a change management process as part of SOC 2 compliance. This process allows for emergency changes, whereby a change can be approved without the usual number of reviews. There are times when an emergency fix may need to be released to production quickly, but when production engineers or the like are able to change production in an emergency, those changes should be audited.

Working in conjunction with our SOC 2 auditor, we have developed (and open-sourced) tooling that uses GitHub’s search syntax to scan for pull requests within a specified time window. If the search query finds any pull requests, they are logged in AWS CloudWatch Logs, and the relevant CloudWatch Alarms are sent to SNS and then picked up by our operations team.
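As a minimal sketch of what such a search query could look like, the Ruby snippet below builds one for pull requests merged inside a time window. The qualifiers `org:`, `is:pr`, `is:merged`, `merged:` and `review:` are standard GitHub search syntax. The function name, the example organization, and the choice of `review:none` as the marker for an emergency change are assumptions made for illustration, not the actual query used by the tool.

```ruby
require 'time'

# Build a GitHub search query for pull requests merged inside a time window.
# Assumption: "review:none" stands in for whatever marks an emergency change.
def build_search_query(org:, from:, to:, extra: 'review:none')
  # GitHub accepts ISO 8601 timestamps in range qualifiers (start..end).
  window = "#{from.getutc.iso8601}..#{to.getutc.iso8601}"
  ["org:#{org}", 'is:pr', 'is:merged', "merged:#{window}", extra].join(' ')
end

query = build_search_query(
  org: 'example-org', # hypothetical organization name
  from: Time.utc(2021, 6, 1),
  to: Time.utc(2021, 6, 2)
)
puts query
```

The resulting string can be passed to GitHub's issue and pull request search, either in the web UI or through an API client.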

This notifies us of any emergency changes so that we can track the reason for each one (required for SOC 2 auditing) and triage them if necessary. This allows us to trust but verify, and also to retain these records for an extended period of time, since GitHub only stores audit logs for 90 days. Certifications such as SOC 2 require a well-documented change management process and procedure, including how to deal with emergency changes and how these changes are audited.

Figure: Continuous auditing of pull requests. A diagram of the Rewind continuous code auditing process.

How to Set Up a Continuous Code Monitor

The solution uses AWS SAM and its CLI to build, package, and deploy an AWS Lambda (written in Ruby). To reduce the size of the Lambda package, we leveraged AWS Lambda layers to package up our dependencies separately.

See the following CloudFormation snippet:

AuditorLambdaFunction:
  Type: AWS::Serverless::Function # More info about Function Resource: https://github.com/awslabs/serverless-application-model/blob/master/versions/2016-10-31.md#awsserverlessfunction
  Properties:
    CodeUri: src/
    Handler: lambda.handler
    MemorySize: 384
    ReservedConcurrentExecutions: 1
    Role: !GetAtt LambdaRole.Arn
    Runtime: ruby2.7
    Timeout: 300
    Layers:
      - !Ref AuditorLambdaLayer
    Environment:
      Variables:
        GITHUB_ORG_NAME: !Ref GitHubOrgName
        GITHUB_TOKEN_SSM_PATH: !Ref GitHubTokenSSMPath
        LAST_TIME_CHECKED_SSM_PATH: !Ref LastTimeCheckedSSMPath
    Tags:
      function: github-pr-auditor
      service: common
      platform: common
      lambda: github-pr-auditor
      region: !Ref AWS::Region
AuditorLambdaLayer:
  Type: AWS::Serverless::LayerVersion
  Properties:
    LayerName: github-pr-auditor-dependencies
    Description: Dependencies for github-pr-auditor
    ContentUri: lambda_layer
    CompatibleRuntimes:
      - ruby2.7
    RetentionPolicy: Retain
  Metadata:
    BuildMethod: makefile

As you can see above, the Function depends on a single layer called AuditorLambdaLayer. The layer is packaged separately and contains all of the dependencies defined in the Gemfile.lock that are necessary to run the application. One caveat we ran into was that SAM does not package Ruby gems (for Lambda layers) in the way that the Lambda runtime expects. Luckily, we found this issue and were able to work around it by using a custom Makefile to reorganize the files in the layer.
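To illustrate how the environment variables in the function definition might be used at run time, here is a hedged Ruby sketch of the core of an audit run. It is not the code of the open-sourced tool: the function name `run_audit`, the ISO 8601 checkpoint format, and the injected callables (standing in for SSM Parameter Store, the GitHub search client, and the logger) are all assumptions, chosen so the example runs without AWS or GitHub access. A real handler would presumably read the GitHub token from the path in GITHUB_TOKEN_SSM_PATH in a similar way.

```ruby
require 'time'

# Hypothetical core of one audit run. Collaborators are injected as callables
# so the sketch can run locally with in-memory stand-ins.
def run_audit(env:, now:, read_param:, write_param:, search:, log:)
  checkpoint_path = env.fetch('LAST_TIME_CHECKED_SSM_PATH')
  # Assumption: the checkpoint is stored as an ISO 8601 timestamp.
  since = Time.parse(read_param.call(checkpoint_path))
  org = env.fetch('GITHUB_ORG_NAME')

  pull_requests = search.call(org, since, now)
  pull_requests.each { |pr| log.call("#{pr[:url]} is non-compliant!") }

  # Advance the checkpoint only after the search succeeded, so a failed run
  # covers the same window again on the next invocation.
  write_param.call(checkpoint_path, now.getutc.iso8601)
  pull_requests.length
end

# Usage with in-memory stand-ins (all values below are made up).
params = { '/auditor/last-checked' => '2021-06-01T00:00:00Z' }
logged = []
count = run_audit(
  env: {
    'GITHUB_ORG_NAME' => 'example-org',
    'LAST_TIME_CHECKED_SSM_PATH' => '/auditor/last-checked'
  },
  now: Time.utc(2021, 6, 2),
  read_param: ->(path) { params.fetch(path) },
  write_param: ->(path, value) { params[path] = value },
  search: ->(_org, _since, _until) { [{ url: 'https://github.com/example-org/app/pull/1' }] },
  log: ->(line) { logged << line }
)
```

Keeping the last-checked time outside the function means consecutive runs cover adjacent windows, regardless of how often the schedule fires.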

After ensuring that the Lambda itself ran as expected, we set up an AWS Events Rule to run the Lambda on a schedule. The following CloudFormation snippet defines this:

LambdaSchedule:
  Type: "AWS::Events::Rule"
  Properties:
    Description: >
      A schedule for the Lambda function.
    ScheduleExpression: !Ref LambdaRate
    State: ENABLED
    Targets:
      - Arn: !Sub ${AuditorLambdaFunction.Arn}
        Id: LambdaSchedule

Schedule Expressions for Rules can be defined as strings such as “rate(24 hours)” or “rate(5 minutes)” depending on how frequently you want to run the code.
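If you want to sanity-check a rate expression before passing it as the LambdaRate parameter, a small helper along these lines could convert it to seconds. This is only an illustrative sketch: the helper name `rate_to_seconds` is hypothetical, and it handles `rate(...)` expressions only, not cron expressions.

```ruby
# Seconds per supported unit of a rate expression.
UNIT_SECONDS = { 'minute' => 60, 'hour' => 3600, 'day' => 86_400 }.freeze

# Convert a rate expression such as "rate(5 minutes)" into seconds.
def rate_to_seconds(expression)
  match = /\Arate\((\d+) (minute|hour|day)(s?)\)\z/.match(expression)
  raise ArgumentError, "unsupported expression: #{expression}" unless match

  value = match[1].to_i
  plural = match[3] == 's'
  raise ArgumentError, 'value must be positive' unless value.positive?
  # The unit must be singular for a value of 1 and plural otherwise.
  raise ArgumentError, "unit does not agree with value: #{expression}" if (value == 1) == plural

  value * UNIT_SECONDS.fetch(match[2])
end

puts rate_to_seconds('rate(24 hours)')
puts rate_to_seconds('rate(5 minutes)')
```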

The last piece of the puzzle was to ensure that alarms would notify us when a certain type of log appeared.

EmergencyChangeAlarm:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmName: !Ref AlarmEmergencyChangeName
    AlarmDescription: A GitHub PR emergency change was merged
    MetricName: GitHubEmergencyChange
    Namespace: GitHubAuditing
    Statistic: Sum
    Period: 300
    EvaluationPeriods: 1
    Threshold: 1
    TreatMissingData: notBreaching
    AlarmActions:
      - !Ref AlarmSNSTopicArn
    ComparisonOperator: GreaterThanOrEqualToThreshold
EmergencyChangeFilter:
  Type: AWS::Logs::MetricFilter
  Properties:
    LogGroupName: !Ref LambdaLogGroup
    FilterPattern: |-
      "is non-compliant!"
    MetricTransformations:
      - MetricValue: "1"
        MetricNamespace: GitHubAuditing
        MetricName: GitHubEmergencyChange

The MetricFilter matches log lines containing the text “is non-compliant!” and records a value of 1 in the GitHubEmergencyChange metric for each match. We consider the resulting alarm to be an “Emergency Change”. These alarms are then sent to the configured AWS SNS topic, so that our operations team is notified in a timely manner.
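The interplay between the filter and the alarm can be modeled locally. The Ruby sketch below is not AWS code; it is a simplified model, under the settings shown above (Statistic: Sum, Threshold: 1, EvaluationPeriods: 1, TreatMissingData: notBreaching), of how log lines from a single five-minute period would translate into an alarm state. The function names and sample log lines are hypothetical.

```ruby
# Simplified local model of the metric filter and alarm, not AWS code.
FILTER_PHRASE = 'is non-compliant!'

# The metric filter emits a value of 1 for every log line containing the phrase.
def metric_values(log_lines)
  log_lines.select { |line| line.include?(FILTER_PHRASE) }.map { 1 }
end

# Evaluate one period: the sum of emitted values is compared with the threshold.
def alarm_state(values, threshold: 1)
  return 'OK' if values.empty? # no data points: treated as notBreaching

  values.sum >= threshold ? 'ALARM' : 'OK'
end

quiet_period = ['Checked 12 pull requests']
noisy_period = ['https://github.com/example-org/app/pull/1 is non-compliant!']

puts alarm_state(metric_values(quiet_period))
puts alarm_state(metric_values(noisy_period))
```

With a threshold of 1, a single matching log line in a period is enough to trigger the alarm, which fits the goal of reviewing every emergency change.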

More Advanced Auditing

If you’re lucky enough to have a subscription to GitHub Enterprise, there are additional methods for auditing changes, such as querying GitHub’s Audit Log API. There are good examples in The GitHub Enterprise Audit log API for GraphQL beginners. This enables a far more granular approach to auditing, allowing many other types of actions to be monitored as well.

Wrapping Up

SOC 2 compliance is all about checks and balances. As a fast-growing company, we need to quickly and continuously deploy code changes. But we need to balance that with the requirement to audit changes that do not follow the usual PR review process. Rather than asking users to track these, we have created automated tooling to ensure we are following our own change management procedure.

Dave Gallant

Dave is an experienced software developer who loves tooling, systems, security, and clouds. He’s constantly learning and reading new things. When he isn’t at a keyboard, Dave enjoys traveling the world to admire different architectural styles and sample local cuisines.
