Automatically Enforce Policies on Your Terraform Modules using OPA and Terratest

Yoriyasu Yano
Published in Gruntwork
Nov 8, 2021


Many organizations have business and legal requirements that must be continuously enforced on their infrastructure. These requirements most often stem from compliance frameworks like HIPAA and PCI.

For example, in order to enforce HIPAA compliance in your organization, it is imperative that your infrastructure resources are tagged consistently and systematically so that you can trace which resources contain Protected Health Information (PHI).

Traditionally, these requirements are expressed as policies that are enforced by humans. However, humans are the weakest link when it comes to enforcing policies.

Suppose you have a database that contains PHI, and you want to ensure that the database always carries the PHI tags. Here is an example module call that deploys such an RDS database:

module "rds_with_phi" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-data-storage.git//modules/rds?ref=v0.17.5"
  name   = "main"
  tags = {
    includes-phi = "true"
  }
}

With a human-driven policy, you might enforce this at the code review phase, where the team collaborates to ensure the RDS database is tagged accordingly. The weakness of this approach is that the check happens only once. It is not enough to verify your IaC a single time; you want these invariants to keep holding as the code changes in the future.

For example, what if the business requirements change such that the tag needs to be renamed to has-phi? Or what if the module is refactored to source the tags from different locations, and as a result the tags are dropped accidentally? You want to ensure that your IaC stays compliant as requirements are updated. Is there any way to encode these policies as code, and automatically enforce them on the infrastructure code through automation?

With Open Policy Agent (aka OPA) and Terratest, you can continuously check and enforce your policies at all stages of the Software Development Lifecycle (SDLC). With a fully implemented pipeline, developers no longer need to continuously check if the policies are violated for every code review.

For example, with the RDS example above, if a developer accidentally makes an update that drops the tags, you can use OPA and Terratest to immediately get a failure in your CI build that looks like this:

=== RUN   TestEnforceTaggingOnLiveModules
TestEnforceTaggingOnLiveModules
Running terraform files in ../modules/rds through `opa eval`
on policy ../policies/enforce_tagging.rego
TestEnforceTaggingOnLiveModules
Failed opa eval on file main.json
(policy enforce_tagging.rego; query data.enforce_tagging.allow)
--- FAIL: TestEnforceTaggingOnLiveModules (0.05s)

You can use this pipeline to enforce all kinds of policies on your IaC:

  • Enforcing resource tagging
  • Ensuring modules come from an approved source
  • Enforcing end-to-end encryption (in-transit and at-rest)
  • Ensuring firewalls aren’t open to the public (e.g., 0.0.0.0/0)
  • And more!

In this post, we will cover how we can use Open Policy Agent with Terratest to build out this kind of pipeline.

What is Open Policy Agent?

From the official website:

The Open Policy Agent (OPA, pronounced “oh-pa”) is an open source, general-purpose policy engine that unifies policy enforcement across the stack. OPA provides a high-level declarative language that lets you specify policy as code and simple APIs to offload policy decision-making from your software. You can use OPA to enforce policies in microservices, Kubernetes, CI/CD pipelines, API gateways, and more.

This makes OPA a useful tool for enforcing various governance policies on your Terraform code.

The main interface of OPA is the opa CLI. The OPA CLI can be run directly in the terminal, or as a webserver that serves the policy checks behind a REST-ful API. Policies are written as code using Rego, a purpose-built language for OPA that allows you to declaratively specify policies. The opa CLI takes input data as JSON and checks the Rego policies against it.

Writing an OPA Policy

Now let’s use OPA to enforce that the includes-phi tag is defined on our RDS module calls. We first need to express this requirement in Rego:

package enforce_tagging

# Only allow this if all the RDS module blocks have the tags
# attribute set, and the tags attribute contains the
# "includes-phi" key.
allow = true {
    count(rds_blocks) == count(rds_with_tags)
}

# The set of module blocks that call the Gruntwork RDS module.
rds_blocks[module_label] {
    some module_label, i
    module := input.module[module_label][i]
    startswith(
        module.source,
        "git::git@github.com:gruntwork-io/terraform-aws-data-storage.git//modules/rds",
    )
}

# The set of Gruntwork RDS module blocks that have the right tags
# set.
rds_with_tags[module_label] {
    some module_label, i

    # Only select the modules that are in the rds_blocks set.
    module := input.module[module_label][i]
    rds_blocks[module_label]

    # Make sure the tags attribute is set
    module.tags

    # And make sure the tags attribute has a key called includes-phi
    module.tags["includes-phi"]
}

Rego is a declarative language. Each block in the source defines a new variable or set, and the contents of the block determine what goes into it. Here is a breakdown of what is happening in the enforce_tagging policy above:

  • Define a variable called allow and assign it true if the number of RDS module blocks equals the number of RDS module blocks that have tags set and include the includes-phi key. Since the second set is a subset of the first, equal counts mean every RDS module block is tagged.
  • Define a set called rds_blocks which iterates over all the objects in the module key of the input and only includes those whose source attribute starts with the Gruntwork RDS module source. This is handled by the some keyword, which defines iteration variables. With the expression input.module[module_label] and some module_label, OPA will evaluate the expression with every key of the module object in the input, binding the key to module_label for each iteration. This set is then further filtered by the startswith expression, taking advantage of the fact that OPA only adds an item to a set when every expression in the block evaluates to true.
  • Define a set called rds_with_tags which iterates over the same module blocks, keeps only those that are in the set rds_blocks, and of those includes the ones that have the tags attribute defined with the includes-phi key in it.

The two sets (rds_blocks and rds_with_tags) can use a little more explanation, in particular how iteration works in OPA. The some keyword is used to define iterator variables. That is, when you index a list or object in Rego with a variable defined with some, OPA will automatically iterate over each element, binding the index or key to the variable at each step.

So the expression input.module[module_label][i] is equivalent to the following pseudo code:

for module_label in input.module:
    for i in input.module[module_label]:
        input.module[module_label][i]

Given that, this Rego policy expects input of the following form:

{
  "module": {
    "MODULE_LABEL": [{
      "source": "MODULE_SOURCE",
      // ... other module inputs ...
    }]
  }
}

Combining all this together, this policy enforces that every module block in the Terraform code that calls the Gruntwork RDS module has a tags attribute with the includes-phi key set.
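To make the iteration and set logic concrete, here is a minimal Go sketch of roughly the same check written imperatively. It is not how OPA evaluates Rego, and the names moduleCall, terraformFile, and allowTagging are hypothetical. It assumes the input has the JSON shape shown above and that tags is written as a literal map in the Terraform code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// rdsSourcePrefix is the module source prefix the policy matches on.
const rdsSourcePrefix = "git::git@github.com:gruntwork-io/terraform-aws-data-storage.git//modules/rds"

// moduleCall (hypothetical name) holds the two attributes the check reads.
type moduleCall struct {
	Source string                 `json:"source"`
	Tags   map[string]interface{} `json:"tags"`
}

// terraformFile (hypothetical name) mirrors the input shape:
// module label -> list of module blocks.
type terraformFile struct {
	Module map[string][]moduleCall `json:"module"`
}

// allowTagging reports whether every module label that calls the RDS module
// also carries the includes-phi tag, by comparing the sizes of the two sets.
func allowTagging(raw []byte) (bool, error) {
	var tf terraformFile
	if err := json.Unmarshal(raw, &tf); err != nil {
		return false, err
	}
	rdsBlocks := map[string]bool{}   // plays the role of rds_blocks
	rdsWithTags := map[string]bool{} // plays the role of rds_with_tags
	for label, blocks := range tf.Module { // some module_label
		for _, m := range blocks { // some i
			if !strings.HasPrefix(m.Source, rdsSourcePrefix) {
				continue
			}
			rdsBlocks[label] = true
			// Like a Rego expression, a missing key or a literal false fails.
			if v, ok := m.Tags["includes-phi"]; ok && v != false {
				rdsWithTags[label] = true
			}
		}
	}
	return len(rdsBlocks) == len(rdsWithTags), nil
}

func main() {
	input := []byte(`{
	  "module": {
	    "rds_with_phi": [{
	      "source": "git::git@github.com:gruntwork-io/terraform-aws-data-storage.git//modules/rds?ref=v0.17.5",
	      "name": "main",
	      "tags": {"includes-phi": "true"}
	    }]
	  }
	}`)
	ok, err := allowTagging(input)
	if err != nil {
		fmt.Println("invalid input:", err)
		return
	}
	fmt.Println("allow:", ok)
}
```

The Rego version expresses the same two loops and filters in a few declarative lines, which is the main reason to write the policy in Rego rather than in a general-purpose language.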

Now that we have a policy, let’s run opa to evaluate it.

Using OPA

We can use the opa CLI to evaluate policies written in Rego against JSON inputs using the eval command. Unfortunately, OPA currently doesn’t natively support parsing HCL, and thus we can’t use the CLI against the Terraform code directly. However, we can use a handy utility from the community, hcl2json, for this purpose.

Download the hcl2json utility and run it to convert the main.tf we wrote previously to JSON format:

hcl2json main.tf > main.json

Now that we have the contents of the Terraform file in JSON format, we can evaluate our policy against it using the opa CLI:

# Assuming the file enforce_tagging.rego contains the policy above:
opa eval --fail \
  -i ./main.json \
  -d ./enforce_tagging.rego \
  'data.enforce_tagging.allow'

This command call means the following:

  • Evaluate the policy specified in ./enforce_tagging.rego (specified with the -d flag).
  • Evaluate the policy against the JSON data in ./main.json (specified with the -i flag).
  • Query for the data data.enforce_tagging.allow after evaluating the policy (the positional arg passed to eval).
  • Fail the command if the query data is undefined, or the result is empty (specified with the --fail flag).

This means that the command will only be successful if the allow variable is defined. With our policy, that is the case only if every RDS module block in the source has a tags attribute that contains the includes-phi key.

When you run this command, it should exit with a zero exit code and the following output:

{
  "result": [
    {
      "expressions": [
        {
          "value": true,
          "text": "data.enforce_tagging.allow",
          "location": {
            "row": 1,
            "col": 1
          }
        }
      ]
    }
  ]
}

The expressions list in the result shows you the value of each of the elements that you queried. The above only contains the result of the allow variable, but you can also query for the rds_blocks set by passing in data.enforce_tagging.rds_blocks instead. Suppose you modify the main.tf to have another module block:

module "vpc" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-vpc.git//modules/vpc-app?ref=v0.17.5"
}

module "rds_with_phi" {
  source = "git::git@github.com:gruntwork-io/terraform-aws-data-storage.git//modules/rds?ref=v0.17.5"
  name   = "main"
  tags = {
    includes-phi = "true"
  }
}

When you run OPA, it should only select the module block that calls the RDS module:

$ opa eval --fail \
  -i ./main.json \
  -d ./enforce_tagging.rego \
  'data.enforce_tagging.rds_blocks'
{
  "result": [
    {
      "expressions": [
        {
          "value": [
            "rds_with_phi"
          ],
          "text": "data.enforce_tagging.rds_blocks",
          "location": {
            "row": 1,
            "col": 1
          }
        }
      ]
    }
  ]
}

Note how it correctly selected the module block with the label rds_with_phi, and ignored the vpc module block.
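If you want to consume this output in a script rather than read it by eye, the JSON is straightforward to decode. The following Go sketch uses only the standard library; the names evalOutput and queryValues are hypothetical, and it models only the result, expressions, value, and text fields visible in the output above.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// evalOutput (hypothetical name) mirrors the parts of the `opa eval`
// JSON output shown above. Other fields, such as location, are ignored.
type evalOutput struct {
	Result []struct {
		Expressions []struct {
			Value json.RawMessage `json:"value"`
			Text  string          `json:"text"`
		} `json:"expressions"`
	} `json:"result"`
}

// queryValues returns the raw JSON value of each queried expression,
// keyed by the query text. An empty result ({}) yields an empty map.
func queryValues(raw []byte) (map[string]string, error) {
	var out evalOutput
	if err := json.Unmarshal(raw, &out); err != nil {
		return nil, err
	}
	values := map[string]string{}
	for _, r := range out.Result {
		for _, e := range r.Expressions {
			values[e.Text] = string(e.Value)
		}
	}
	return values, nil
}

func main() {
	raw := []byte(`{"result":[{"expressions":[{"value":["rds_with_phi"],"text":"data.enforce_tagging.rds_blocks","location":{"row":1,"col":1}}]}]}`)
	values, err := queryValues(raw)
	if err != nil {
		fmt.Println("decode error:", err)
		return
	}
	// Prints the raw JSON value of the queried set.
	fmt.Println(values["data.enforce_tagging.rds_blocks"])
}
```

Keeping the value as raw JSON avoids committing to a Go type, since a query can return a boolean, a set, or any other JSON value.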

What about when the source fails the check? Try updating the main.tf module to comment out the tags attribute and rerun the check. You should see the command exit with a non-zero exit code, with an empty result ({}). This is because the allow variable becomes undefined when it finds an RDS module block that doesn’t have the tags attribute.
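Because the pass or fail signal is the exit code, a wrapper script only needs to run the command and inspect its exit status. Here is a minimal Go sketch of that idea using os/exec. The function names opaEvalArgs and checkPolicy are hypothetical, and the sketch assumes the opa binary is on your PATH and that the file paths match the ones used above.

```go
package main

import (
	"errors"
	"fmt"
	"os/exec"
)

// opaEvalArgs (hypothetical helper) builds the argument list for the
// `opa eval` call used throughout this section.
func opaEvalArgs(inputPath, policyPath, query string) []string {
	return []string{"eval", "--fail", "-i", inputPath, "-d", policyPath, query}
}

// checkPolicy (hypothetical helper) runs opa and maps its exit status to a
// result. It returns (true, nil) on a zero exit code, (false, nil) when opa
// ran but exited non-zero, and an error when opa could not be started.
func checkPolicy(inputPath, policyPath, query string) (bool, error) {
	cmd := exec.Command("opa", opaEvalArgs(inputPath, policyPath, query)...)
	out, err := cmd.CombinedOutput()

	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		// Non-zero exit: the query was undefined or empty, or opa itself
		// reported an error. The captured output helps tell them apart.
		fmt.Printf("opa eval exited with code %d:\n%s\n", exitErr.ExitCode(), out)
		return false, nil
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	passed, err := checkPolicy("./main.json", "./enforce_tagging.rego", "data.enforce_tagging.allow")
	if err != nil {
		fmt.Println("could not run opa:", err)
		return
	}
	fmt.Println("policy passed:", passed)
}
```

A wrapper like this works for one file and one policy, but it still leaves the HCL to JSON conversion and the discovery of files to you.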

At this point, we have the ingredients for automating these checks in a CI/CD pipeline, but it can be cumbersome to do these in a pipeline, especially if you have many policies, and many Terraform modules. We can use Terratest to further automate these checks in an efficient manner.

Enforce OPA using Terratest

Terratest is a Go library that provides patterns and helper functions for testing infrastructure defined as code, with first class support for Terraform, Packer, Docker, Kubernetes, and more. Recently, we added support for OPA to the library (in v0.38.1).

You can use Terratest to automatically run OPA policies against your Terraform modules. Normally, you can’t run OPA policies directly against Terraform code because OPA does not support HCL inputs. To check your Terraform code, you need to first convert it to JSON using a tool like hcl2json. This can be cumbersome when you want to check every module in your repos. Terratest makes that easier by reducing that logic to a single function (test_structure.OPAEvalAllTerraformModules) which will:

  • Find all the Terraform modules in a folder. You can configure which folders to look in and exclude using the ValidationOptions struct.
  • For each Terraform module, find all the tf files in that module.
  • For each tf file, convert it to JSON using the same routine as hcl2json.
  • For each converted JSON file, run through the OPA checks.
  • Do all of this concurrently. That is, all the OPA checks for each file found will be done in parallel.
  • Report results per Terraform module. That is, the Terraform module containing the failing tf file will be reported as failing.

Using this function, you can drop in a single go test file to run your OPA policies against all the Terraform modules in a repo!

Suppose you have the following folder structure for your Terraform repositories:

.
├── examples
│   ├── bar-example
│   │   └── main.tf
│   ├── baz-example
│   │   └── main.tf
│   └── foo-example
│       └── main.tf
├── modules
│   ├── bar
│   │   └── main.tf
│   ├── baz
│   │   └── main.tf
│   └── foo
│       └── main.tf
└── policies
    └── enforce_tagging.rego

In this setup, you have Terraform modules in the modules folder, and each subfolder contains a Terraform module. Additionally, you have an examples folder that contains example usage of each of the Terraform modules, which are also Terraform modules themselves. You want to continuously check the OPA policy in the policies folder against all those modules.

To do that, add a new folder test and place a file called enforce_opa_test.go in there containing the following:

package testvalidate

import (
	"os"
	"path/filepath"
	"testing"

	"github.com/gruntwork-io/terratest/modules/opa"
	test_structure "github.com/gruntwork-io/terratest/modules/test-structure"
	"github.com/stretchr/testify/require"
)

func TestWithOPAEvalAllTerraformModules(t *testing.T) {
	t.Parallel()

	cwd, err := os.Getwd()
	require.NoError(t, err)

	// Look for Terraform modules starting at the directory
	// above the `test` folder.
	opts, err := test_structure.NewValidationOptions(
		filepath.Join(cwd, ".."), nil, nil)
	require.NoError(t, err)

	// Configure OPA to run the `enforce_tagging.rego` policy
	// in `FailUndefined` mode
	rulePath := filepath.Join(cwd, "../policies/enforce_tagging.rego")
	opaOpts := &opa.EvalOptions{
		FailMode: opa.FailUndefined,
		RulePath: rulePath,
	}
	test_structure.OPAEvalAllTerraformModules(
		t, opts, opaOpts, "data.enforce_tagging.allow")
}

This sets up the OPAEvalAllTerraformModules function with the following settings:

  • Look for Terraform modules starting at the directory right above where we are. This will be relative to the test folder, and thus will look at the repository root.
  • Run OPA with the policy at ../policies/enforce_tagging.rego, again relative to the test folder.
  • When running OPA, query for data.enforce_tagging.allow and run in FailUndefined mode so that the checks fail if the allow variable is undefined.

To finish the setup, initialize the test folder as a go module so that it can pull the terratest dependency:

cd test
# Update this to your terraform repo
go mod init github.com/gruntwork-io/infrastructure-modules/test
go mod tidy

This will create two files, go.mod and go.sum, which track all the go modules that are needed to run the test.

Once you have the go modules files, you can now run the test by calling go test in the test folder:

# Run this from the test directory. If you ran the previous
# commands, you are already there and can skip the cd.
cd test
go test -v .

This single command will now run the opa eval check on every single Terraform module in your repository!

Note that this will check a single policy file against a single query. To run multiple checks, you can implement that in the policy source.

For example, imagine you had three separate policies you wanted to enforce, each one written in the same manner as the enforce_tagging policy above. Each policy is encoded in its own source file, e.g., enforce_source.rego, enforce_tagging.rego, and enforce_no_allow_all.rego. You can combine these by importing the packages they define and creating a single allow directive that aggregates the three:

package enforce_policies

import data.enforce_source
import data.enforce_tagging
import data.enforce_no_allow_all_network

# Only pass the policy if all the imported checks evaluate to true.
allow = true {
    enforce_source.allow == true
    enforce_tagging.allow == true
    enforce_no_allow_all_network.allow == true
}

When the check fails, Terratest will automatically rerun opa eval with the query set to data, which will allow you to see all the OPA variables that are defined. This way, you can debug which OPA check failed.

Check out the terraform-opa-example folder in the Terratest repository for live example usage of the OPA functionality.

Summary

In this post we covered:

  • A canonical use case of policy checks on Terraform source code.
  • How to write OPA policies in Rego.
  • How to use the opa CLI to check policies against Terraform source code.
  • How to use Terratest to automate opa calls.

In most cases, you identify non-compliance retroactively: you run checks against the live AWS environment after things have been deployed. With OPA and Terratest, you can perform the checks before any infrastructure goes live! This approach allows you to ensure your infrastructure stays compliant with company policies over time, as you make changes.

To get a fully implemented and tested collection of Terraform modules that meet compliance standards like CIS and HIPAA enforced with OPA, check out Gruntwork.io.


Staff level Startup Engineer with 10+ years experience (formerly at Gruntwork)