Testing Infrastructure in CI: Validating Terraform, Docker, and K8s Configs

Infrastructure code — Terraform modules, Dockerfiles, Kubernetes manifests, Helm charts — is responsible for the environment your application runs in. When it breaks, your application goes down, your data is at risk, or your security posture degrades. Yet most teams test their infrastructure far less rigorously than their application code.

Building CI checks for infrastructure configuration catches problems before they reach production: misconfigurations, security vulnerabilities, syntax errors, and policy violations.

What to Test in Infrastructure CI

A complete infrastructure CI pipeline covers several layers:

Syntax validation: Is this valid HCL/YAML/Dockerfile syntax?
Linting: Does this follow best practices and style conventions?
Security scanning: Are there security misconfigurations or vulnerabilities?
Policy validation: Does this comply with organizational policies?
Plan/dry-run: What would this change do? (Terraform plan, kubectl dry-run)
Unit tests: Do modules behave as expected with test inputs?
Integration tests: Does the infrastructure actually work when applied?

Not every project needs all layers, but most benefit from at least the first five.
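
One way to wire these layers together is a small wrapper that runs each check in order and reports every failing layer at the end, rather than stopping at the first failure. The sketch below is a minimal illustration, not a prescribed layout: the names run_layer, report_layers, and FAILED_LAYERS are hypothetical, and the commented wiring reuses commands shown later in this article.

```shell
#!/bin/sh
# Hypothetical helpers: collect failures across layers instead of
# aborting on the first one, so a single CI run reports everything.

FAILED_LAYERS=""

# run_layer NAME COMMAND [ARGS...]
# Runs COMMAND and appends NAME to FAILED_LAYERS if it exits nonzero.
run_layer() {
  local name="$1"
  shift
  echo "==> ${name}"
  if ! "$@"; then
    FAILED_LAYERS="${FAILED_LAYERS} ${name}"
  fi
}

# report_layers
# Prints a summary and returns 1 if any layer failed.
report_layers() {
  if [ -n "${FAILED_LAYERS}" ]; then
    echo "Failed layers:${FAILED_LAYERS}"
    return 1
  fi
  echo "All layers passed"
}

# Example wiring (commented out so the sketch has no side effects):
# run_layer "syntax"   terraform validate
# run_layer "lint"     tflint --recursive
# run_layer "security" checkov -d . --framework terraform
# report_layers
```

Ordering the layers from cheapest to most expensive keeps feedback fast, while collecting failures avoids a fix-one-rerun-find-another loop.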

Terraform: Validation and Testing Pipeline

Start with built-in Terraform validation:

# Validate HCL syntax and internal consistency
terraform init -backend=false
terraform validate

Add tflint for style and best-practice checking:

# Install tflint
curl -s https://raw.githubusercontent.com/terraform-linters/tflint/master/install_linux.sh | bash

# Install the plugins declared in .tflint.hcl, then lint every module
tflint --init
tflint --recursive

Configure tflint rules in .tflint.hcl:

# .tflint.hcl
plugin "aws" {
  enabled = true
  version = "0.30.0"
  source  = "github.com/terraform-linters/tflint-ruleset-aws"
}

rule "terraform_required_version" {
  enabled = true
}

rule "terraform_required_providers" {
  enabled = true
}

# Flag EC2 instance types that do not exist
rule "aws_instance_invalid_type" {
  enabled = true
}

For security scanning, use tfsec or Checkov:

# tfsec: scan for security misconfigurations
docker run --rm -v "$(pwd):/src" aquasec/tfsec /src

# Checkov: broader policy checking
pip install checkov
checkov -d . --framework terraform

Checkov will flag issues like:

Check: CKV_AWS_18: "Ensure the S3 bucket has access logging enabled"
FAILED for resource: aws_s3_bucket.app_uploads

Check: CKV_AWS_19: "Ensure the S3 bucket has server-side-encryption enabled"
FAILED for resource: aws_s3_bucket.app_uploads
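
When the full report is long, it can help to reduce it to one line per finding, for example for a PR comment. The following sketch assumes the default CLI text layout shown above, where each `Check:` line is followed by a `FAILED for resource:` line. The function name summarize_failed_checks and the file name checkov.txt are hypothetical, and Checkov's JSON output (`-o json`) is likely a sturdier input for anything beyond a quick summary.

```shell
# summarize_failed_checks FILE
# Prints "CHECK_ID<TAB>RESOURCE" for every failed check in a saved
# Checkov CLI report. Assumes the text layout shown above.
summarize_failed_checks() {
  awk '
    /^Check: /              { id = $2; sub(/:$/, "", id) }
    /FAILED for resource: / { print id "\t" $NF }
  ' "$1"
}

# Usage (commented out so the sketch has no side effects):
# checkov -d . --framework terraform > checkov.txt || true
# summarize_failed_checks checkov.txt
```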

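To test how a module actually behaves, use Terratest. It applies the module to a real cloud account, asserts on the outputs, and destroys the resources afterward, so it sits closer to an integration test than a unit test and needs credentials to run: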

// test/vpc_test.go
package test

import (
    "testing"
    "github.com/gruntwork-io/terratest/modules/terraform"
    "github.com/stretchr/testify/assert"
)

func TestVpcModule(t *testing.T) {
    terraformOptions := &terraform.Options{
        TerraformDir: "../modules/vpc",
        Vars: map[string]interface{}{
            "name":        "test-vpc",
            "cidr_block":  "10.0.0.0/16",
            "environment": "test",
        },
    }

    // Clean up after the test
    defer terraform.Destroy(t, terraformOptions)

    // Apply the Terraform config
    terraform.InitAndApply(t, terraformOptions)

    // Verify outputs
    vpcId := terraform.Output(t, terraformOptions, "vpc_id")
    assert.NotEmpty(t, vpcId)

    subnetIds := terraform.OutputList(t, terraformOptions, "private_subnet_ids")
    assert.Len(t, subnetIds, 3) // Expect 3 AZs
}

Combine it all in a GitHub Actions workflow:

# .github/workflows/terraform-ci.yml
name: Terraform CI

on:
  pull_request:
    paths:
      - 'infrastructure/**'
      - 'modules/**'

jobs:
  validate:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: '1.7.0'

      - name: Terraform fmt check
        run: terraform fmt -check -recursive

      - name: Terraform validate
        run: |
          terraform init -backend=false
          terraform validate

      - name: Set up tflint
        uses: terraform-linters/setup-tflint@v4

      - name: Run tflint
        run: |
          tflint --init
          tflint --recursive

      - name: Run Checkov
        uses: bridgecrewio/checkov-action@v12
        with:
          directory: .
          framework: terraform
          soft_fail: false

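  plan:
    runs-on: ubuntu-latest
    needs: validate
    permissions:
      contents: read
      id-token: write        # needed when the AWS role is assumed via GitHub OIDC
      pull-requests: write   # needed to comment on the pull request
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: '1.7.0'
          # Disable the wrapper so redirected output contains only the plan text
          terraform_wrapper: false

      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_PLAN_ROLE_ARN }}
          aws-region: us-east-1

      - name: Terraform plan
        run: |
          terraform init
          terraform plan -out=plan.tfplan
          # Render the saved plan as text for the PR comment step
          terraform show -no-color plan.tfplan > plan.txt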

      - name: Post plan to PR
        uses: actions/github-script@v7
        with:
          script: |
            const planOutput = require('fs').readFileSync('plan.txt', 'utf8');
            await github.rest.issues.createComment({
              issue_number: context.issue.number,
              owner: context.repo.owner,
              repo: context.repo.repo,
              body: `## Terraform Plan\n\`\`\`\n${planOutput}\n\`\`\``,
            });
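
Reviewers tend to care most about destructive changes, so a pipeline can also fail, or require extra approval, when the plan would destroy resources. This sketch reads the `Plan: N to add, N to change, N to destroy.` summary line from a text rendering of the plan, such as the plan.txt read by the comment step above. The function names and the MAX_DESTROY variable are hypothetical, and parsing `terraform show -json` output would be more robust than matching text.

```shell
# plan_destroy_count FILE
# Prints how many resources the plan would destroy, or 0 if the file
# has no "Plan:" summary line (for example when there are no changes).
plan_destroy_count() {
  local count
  count="$(sed -n 's/^Plan:.* \([0-9][0-9]*\) to destroy\..*$/\1/p' "$1" | head -n 1)"
  echo "${count:-0}"
}

# check_destroy_budget FILE [MAX]
# Returns nonzero when the plan destroys more than MAX resources.
# MAX defaults to the hypothetical MAX_DESTROY variable, then to 0.
check_destroy_budget() {
  local destroyed max
  destroyed="$(plan_destroy_count "$1")"
  max="${2:-${MAX_DESTROY:-0}}"
  if [ "${destroyed}" -gt "${max}" ]; then
    echo "Plan destroys ${destroyed} resource(s); limit is ${max}"
    return 1
  fi
  echo "Plan destroys ${destroyed} resource(s); within limit"
}

# Usage (commented out so the sketch has no side effects):
# check_destroy_budget plan.txt
```

A default limit of zero makes every destroy an explicit decision: someone has to raise the limit for that change before the check passes.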

Docker: Linting and Security Scanning

Hadolint lints Dockerfiles against best practices:

# Install hadolint
docker pull hadolint/hadolint

# Lint a Dockerfile
docker run --rm -i hadolint/hadolint < Dockerfile

Common hadolint findings and their fixes:

# DL3008: Pin versions in apt-get install
# BAD:
RUN apt-get install -y curl

# GOOD:
RUN apt-get install -y curl=7.88.1-10+deb12u5

# DL3025: Use JSON notation for CMD
# BAD:
CMD php artisan serve

# GOOD:
CMD ["php", "artisan", "serve"]

# DL3006: Always tag the version of the image explicitly
# BAD:
FROM php

# GOOD:
FROM php:8.2-fpm-alpine

# DL3002: Last USER should not be root
# Add a non-root user for production (Alpine/BusyBox syntax shown)
RUN addgroup -S appgroup && adduser -S appuser -G appgroup
USER appuser

For vulnerability scanning in built images, use Trivy:

# .github/workflows/docker-ci.yml
jobs:
  build-and-scan:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Lint Dockerfile
        uses: hadolint/hadolint-action@v3.1.0
        with:
          dockerfile: Dockerfile

      - name: Build image
        run: docker build -t myapp:${{ github.sha }} .

      - name: Scan image with Trivy
        # In real pipelines, pin to a release tag or commit SHA; master can change without notice
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:${{ github.sha }}
          format: 'table'
          exit-code: '1'
          severity: 'CRITICAL,HIGH'
          ignore-unfixed: true

Kubernetes: Manifest Validation and Policy Enforcement

kubeval validates Kubernetes manifests against the API schema (the project is no longer maintained, and its README recommends kubeconform as a replacement):

# Install kubeval
curl -L https://github.com/instrumenta/kubeval/releases/latest/download/kubeval-linux-amd64.tar.gz | tar xz
sudo mv kubeval /usr/local/bin/

# Validate manifests
kubeval --strict manifests/*.yaml

# Validate against a specific Kubernetes version
kubeval --kubernetes-version 1.28.0 manifests/*.yaml

For policy enforcement, use Conftest with OPA policies:

# policy/deployment.rego
package main

import future.keywords.contains
import future.keywords.if

# The pod-level securityContext must set runAsNonRoot: true
deny contains msg if {
    input.kind == "Deployment"
    not input.spec.template.spec.securityContext.runAsNonRoot
    msg := sprintf("Deployment '%s' must run as non-root", [input.metadata.name])
}

# Every container must declare a memory limit
deny contains msg if {
    input.kind == "Deployment"
    container := input.spec.template.spec.containers[_]
    not container.resources.limits.memory
    msg := sprintf("Container '%s' in Deployment '%s' must have memory limits", [
        container.name,
        input.metadata.name,
    ])
}

# No container may use the mutable 'latest' tag
deny contains msg if {
    input.kind == "Deployment"
    container := input.spec.template.spec.containers[_]
    endswith(container.image, ":latest")
    msg := sprintf("Container '%s' must not use the 'latest' tag", [container.name])
}

Run Conftest in CI:

conftest test manifests/*.yaml --policy policy/

For Helm charts, use helm lint and helm template for validation:

# Lint the chart
helm lint charts/myapp

# Render templates and pipe to kubeval
helm template charts/myapp | kubeval --strict

# Run Conftest on rendered templates
helm template charts/myapp | conftest test - --policy policy/
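
Charts are often deployed with different values per environment, and a template that renders cleanly with the defaults can still break with production overrides. Here is a minimal sketch of validating each combination, assuming one values file per environment. The function name validate_chart and the file names in the usage comment are hypothetical.

```shell
# validate_chart CHART_DIR VALUES_FILE...
# Renders the chart once per values file and validates each result.
# Returns nonzero if any render or validation fails.
validate_chart() {
  local chart="$1" failed=0 values rendered
  shift
  for values in "$@"; do
    echo "==> ${chart} with ${values}"
    rendered="$(mktemp)"
    # Render to a file first so a template error is not masked by a pipe
    if ! helm template "${chart}" --values "${values}" > "${rendered}" \
      || ! kubeval --strict "${rendered}"; then
      failed=1
    fi
    rm -f "${rendered}"
  done
  return "${failed}"
}

# Usage (commented out so the sketch has no side effects):
# validate_chart charts/myapp \
#   charts/myapp/values-staging.yaml \
#   charts/myapp/values-prod.yaml
```

The loop keeps going after a failure so that one run reports every broken environment rather than only the first.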

Putting It Together: A Unified Infrastructure Pipeline

# .github/workflows/infrastructure-ci.yml
name: Infrastructure CI

on:
  pull_request:
    paths:
      - 'infrastructure/**'
      - 'docker/**'
      - 'charts/**'
      - 'manifests/**'

jobs:
  terraform:
    name: Terraform
    runs-on: ubuntu-latest
    defaults:
      run:
        working-directory: infrastructure/
    steps:
      - uses: actions/checkout@v4
      - uses: hashicorp/setup-terraform@v3
      - run: terraform fmt -check -recursive
      - run: terraform init -backend=false && terraform validate
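      # `uses` steps ignore defaults.run.working-directory, so pass the path explicitly
      - uses: bridgecrewio/checkov-action@v12
        with:
          directory: infrastructure/
          framework: terraform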

  docker:
    name: Docker
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: hadolint/hadolint-action@v3.1.0
      - run: docker build -t myapp:test .
      - uses: aquasecurity/trivy-action@master
        with:
          image-ref: myapp:test
          severity: CRITICAL,HIGH
          exit-code: '1'

  kubernetes:
    name: Kubernetes
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Install kubeval
        run: |
          curl -L https://github.com/instrumenta/kubeval/releases/latest/download/kubeval-linux-amd64.tar.gz | tar xz
          sudo mv kubeval /usr/local/bin/
      - name: Validate manifests
        run: kubeval --strict manifests/*.yaml
      - name: Policy check
        uses: instrumenta/conftest-action@v0.1.0
        with:
          files: manifests/*.yaml
          policy: policy/

The goal is to make infrastructure issues visible before they merge — not after they break production at midnight.
