Infrastructure as Code (IaC) transforms how we manage cloud resources. Instead of clicking through consoles or running ad-hoc commands, you define infrastructure declaratively and let tools handle the rest. Terraform has emerged as the leading IaC tool, working across multiple cloud providers with a consistent workflow.
Why Infrastructure as Code?
The Problems It Solves
- Reproducibility: Same config produces same infrastructure
- Version control: Track changes, review, rollback
- Collaboration: Multiple team members can work safely
- Documentation: Config is the documentation
- Disaster recovery: Rebuild infrastructure quickly
Why Terraform?
- Multi-cloud: AWS, Azure, GCP, and hundreds of providers
- Declarative: Describe desired state, not steps
- State management: Tracks what exists
- Plan before apply: Preview changes
- Module ecosystem: Reusable components
Getting Started
Installation
You can install Terraform directly or use a version manager for easier upgrades. The version manager approach is recommended for teams that need to work with multiple Terraform versions across different projects.
```bash
# macOS
brew install terraform

# Windows (Chocolatey)
choco install terraform

# Or use tfenv for version management
brew install tfenv
tfenv install 1.7.0
tfenv use 1.7.0
```
Basic Structure
Every Terraform project starts with a few key files. This example shows the essential structure for deploying an AWS EC2 instance. You will typically organize your configuration across multiple files as your project grows.
```hcl
# main.tf
terraform {
  required_version = ">= 1.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0"
    }
  }
}

provider "aws" {
  region = var.aws_region
}

resource "aws_instance" "web" {
  ami           = var.ami_id
  instance_type = var.instance_type

  tags = {
    Name        = "web-server"
    Environment = var.environment
  }
}
```
The required_providers block pins your provider version, preventing unexpected changes when providers release updates. Always specify version constraints for production infrastructure. The ~> 5.0 constraint allows any 5.x release, such as 5.1 or 5.37, but blocks the 6.0 major version.
Variables
Variables make your Terraform code reusable across environments. Define them in a separate file for clarity and easier management.
```hcl
# variables.tf
variable "aws_region" {
  description = "AWS region"
  type        = string
  default     = "us-east-1"
}

variable "instance_type" {
  description = "EC2 instance type"
  type        = string
  default     = "t3.micro"
}

variable "environment" {
  description = "Environment name"
  type        = string
}
```
Note that environment has no default, making it a required variable. This forces you to explicitly set it, preventing accidental deployments to the wrong environment.
You can set variable values in a .tfvars file for each environment. A file named terraform.tfvars is loaded automatically; other files must be passed explicitly at plan or apply time. Keep these files separate from your main configuration.
```hcl
# terraform.tfvars
aws_region    = "us-west-2"
instance_type = "t3.small"
environment   = "production"
```
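One common pattern is a separate .tfvars file per environment, selected with -var-file. The file names here are illustrative:

```bash
# terraform.tfvars is loaded automatically; other files must be named explicitly.
# staging.tfvars and production.tfvars are example names.
terraform plan -var-file=staging.tfvars
terraform apply -var-file=production.tfvars
```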
Outputs
Outputs expose values from your infrastructure for use by other systems or for display after apply. They form the interface between your Terraform configuration and the outside world.
```hcl
# outputs.tf
output "instance_ip" {
  description = "Public IP of the instance"
  value       = aws_instance.web.public_ip
}

output "instance_id" {
  description = "Instance ID"
  value       = aws_instance.web.id
}
```
You can reference these outputs in CI/CD pipelines or use them as inputs to other Terraform modules. They are also useful for scripting post-deployment steps.
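For example, a deployment script can read outputs with the terraform output command; the ssh step below is illustrative:

```bash
# Print a single output without quotes, suitable for scripting
terraform output -raw instance_ip

# Illustrative post-deployment step using that value
ssh ec2-user@"$(terraform output -raw instance_ip)" 'sudo systemctl status nginx'

# Or dump all outputs as JSON for other tooling
terraform output -json
```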
Core Workflow
Terraform has a simple but powerful workflow. You will run these commands hundreds of times throughout the lifecycle of your infrastructure.
```bash
# Initialize (download providers)
terraform init

# Preview changes
terraform plan

# Apply changes
terraform apply

# Show current state
terraform show

# Destroy resources
terraform destroy
```
Always run terraform plan before apply. Review the plan carefully, especially the resources that will be destroyed or replaced. A resource being replaced usually means downtime, and understanding why Terraform wants to replace rather than update in place is critical.
State Management
Remote State
Never store state locally in production. Remote state enables team collaboration and prevents state file corruption from concurrent operations.
This configuration stores state in S3 with DynamoDB for locking. This is the standard approach for AWS-based teams.
```hcl
# backend.tf
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "prod/terraform.tfstate"
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}
```
The encrypt = true setting ensures your state file is encrypted at rest. This is important because state files can contain sensitive data like database passwords, API keys, and other secrets that Terraform needs to manage your resources.
State Locking
State locking prevents concurrent modifications that could corrupt your state file. You need to create the DynamoDB table before configuring the backend.
```hcl
resource "aws_dynamodb_table" "terraform_locks" {
  name         = "terraform-locks"
  billing_mode = "PAY_PER_REQUEST"
  hash_key     = "LockID"

  attribute {
    name = "LockID"
    type = "S"
  }
}
```
This is a chicken-and-egg problem: you need to create this table with local state first, then migrate to remote state. Many teams create this table manually or with a separate bootstrapping configuration.
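One way to bootstrap, sketched below; the exact sequence depends on your setup. Apply the bucket and table with local state first, then add the backend block and migrate:

```bash
# 1. With no backend block present, create the S3 bucket and DynamoDB
#    table using local state
terraform apply

# 2. Add the backend "s3" block shown above, then migrate the local
#    state file into the remote backend
terraform init -migrate-state
```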
State Commands
Sometimes you need to manipulate state directly. These commands help you recover from mistakes or refactor resources without recreating them.
```bash
# List resources in state
terraform state list

# Show specific resource
terraform state show aws_instance.web

# Move resource (rename)
terraform state mv aws_instance.old aws_instance.new

# Remove from state (keep resource)
terraform state rm aws_instance.web

# Import existing resource
terraform import aws_instance.web i-1234567890abcdef0
```
Use terraform import when you need to bring manually-created resources under Terraform management. This is common when adopting IaC for existing infrastructure or when recovering from resources created outside of Terraform.
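Terraform 1.5 added config-driven import as an alternative to the CLI command: declare an import block and run plan. The instance ID here is a placeholder:

```hcl
# Config-driven import (Terraform 1.5+). Terraform can also generate the
# matching resource configuration for you:
#   terraform plan -generate-config-out=generated.tf
import {
  to = aws_instance.web
  id = "i-1234567890abcdef0"
}
```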
Modules
Creating a Module
Modules let you package and reuse infrastructure patterns. A module is just a directory with Terraform files that accepts inputs and produces outputs.
```text
modules/
└── vpc/
    ├── main.tf
    ├── variables.tf
    └── outputs.tf
```
Here is a module that creates a VPC with public subnets. Notice how it uses variables for all configurable values, making it flexible for different use cases.
```hcl
# modules/vpc/main.tf
resource "aws_vpc" "main" {
  cidr_block           = var.cidr_block
  enable_dns_hostnames = true
  enable_dns_support   = true

  tags = {
    Name = var.name
  }
}

resource "aws_subnet" "public" {
  count             = length(var.public_subnets)
  vpc_id            = aws_vpc.main.id
  cidr_block        = var.public_subnets[count.index]
  availability_zone = var.availability_zones[count.index]

  tags = {
    Name = "${var.name}-public-${count.index + 1}"
  }
}
```
The count meta-argument creates multiple subnets from a list, making the module flexible for different availability zone configurations. This pattern is common when you need to create similar resources across multiple AZs or regions.
Define the module's inputs in variables.tf. Good modules document their variables thoroughly.
```hcl
# modules/vpc/variables.tf
variable "name" {
  type = string
}

variable "cidr_block" {
  type    = string
  default = "10.0.0.0/16"
}

variable "public_subnets" {
  type = list(string)
}

variable "availability_zones" {
  type = list(string)
}
```
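The module also needs an outputs.tf so callers can consume what it creates. A minimal sketch matching the resources above:

```hcl
# modules/vpc/outputs.tf
output "vpc_id" {
  value = aws_vpc.main.id
}

output "public_subnet_ids" {
  # The splat expression collects the IDs of all count-created subnets
  value = aws_subnet.public[*].id
}
```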
Using Modules
Call your module from the root configuration and pass in the required variables. The module source can be a local path, Git repository, or the Terraform Registry.
```hcl
# main.tf
module "vpc" {
  source             = "./modules/vpc"
  name               = "production"
  cidr_block         = "10.0.0.0/16"
  public_subnets     = ["10.0.1.0/24", "10.0.2.0/24"]
  availability_zones = ["us-west-2a", "us-west-2b"]
}

# Use module outputs
resource "aws_instance" "web" {
  subnet_id = module.vpc.public_subnet_ids[0]
  # ...
}
```
Module outputs become accessible as module.<name>.<output>, enabling you to wire modules together into larger architectures.
Public Module Registry
The Terraform Registry contains thousands of community-maintained modules. These can save significant development time and incorporate best practices.
```hcl
# Use community modules
module "vpc" {
  source  = "terraform-aws-modules/vpc/aws"
  version = "5.0.0"

  name            = "my-vpc"
  cidr            = "10.0.0.0/16"
  azs             = ["us-west-2a", "us-west-2b"]
  private_subnets = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets  = ["10.0.101.0/24", "10.0.102.0/24"]

  enable_nat_gateway = true
}
```
Always pin module versions in production. Without a version constraint, Terraform will use the latest version, which may introduce breaking changes unexpectedly.
Data Sources
Query existing resources. Data sources let you reference infrastructure that was created outside of Terraform or in a different Terraform configuration.
```hcl
# Get current AWS account info
data "aws_caller_identity" "current" {}

# Get latest Amazon Linux AMI
data "aws_ami" "amazon_linux" {
  most_recent = true
  owners      = ["amazon"]

  filter {
    name   = "name"
    values = ["amzn2-ami-hvm-*-x86_64-gp2"]
  }
}

# Use in resources
resource "aws_instance" "web" {
  ami = data.aws_ami.amazon_linux.id
  # ...
}
```
The AMI data source is particularly useful because it automatically resolves to the latest patched image, saving you from hardcoding AMI IDs. Be aware of the flip side: when Amazon publishes a new AMI, the next plan will propose replacing the instance, so review plans carefully after a new release.
Dynamic Blocks
Generate repeated blocks when the number of items is variable. This is common for security group rules where you might have different rules per environment.
```hcl
variable "ingress_rules" {
  type = list(object({
    port        = number
    cidr_blocks = list(string)
  }))
  default = [
    { port = 80, cidr_blocks = ["0.0.0.0/0"] },
    { port = 443, cidr_blocks = ["0.0.0.0/0"] },
    { port = 22, cidr_blocks = ["10.0.0.0/8"] },
  ]
}

resource "aws_security_group" "web" {
  name = "web-sg"

  dynamic "ingress" {
    for_each = var.ingress_rules
    content {
      from_port   = ingress.value.port
      to_port     = ingress.value.port
      protocol    = "tcp"
      cidr_blocks = ingress.value.cidr_blocks
    }
  }
}
```
Inside the dynamic block, ingress.value gives you access to the current item in the loop. This pattern keeps your code DRY while remaining readable. Use dynamic blocks sparingly though; overuse can make configurations harder to understand.
Environment Management
Workspaces
Workspaces let you manage multiple environments with the same configuration. Each workspace maintains its own state file.
```bash
# Create workspace
terraform workspace new staging

# Switch workspace
terraform workspace select production

# List workspaces
terraform workspace list
```
You can use the workspace name to customize resource configuration based on the environment.
```hcl
# Use workspace in config
locals {
  environment = terraform.workspace
}

resource "aws_instance" "web" {
  instance_type = terraform.workspace == "production" ? "t3.large" : "t3.micro"
  # ...
}
```
Workspaces are simple but have limitations. For significantly different environments, consider the directory-based approach instead, which provides clearer separation of concerns.
Directory-Based Environments
For environments with different configurations, use separate directories. This provides more flexibility and clearer separation.
```text
terraform/
├── modules/
├── environments/
│   ├── dev/
│   │   ├── main.tf
│   │   └── terraform.tfvars
│   ├── staging/
│   └── production/
```
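Each environment directory is a small root module that calls the shared modules and configures its own backend key. A sketch, reusing the bucket and module path from earlier (the CIDR and names here are illustrative):

```hcl
# environments/dev/main.tf
terraform {
  backend "s3" {
    bucket         = "my-terraform-state"
    key            = "dev/terraform.tfstate" # separate state per environment
    region         = "us-east-1"
    encrypt        = true
    dynamodb_table = "terraform-locks"
  }
}

module "vpc" {
  source     = "../../modules/vpc"
  name       = "dev"
  cidr_block = "10.10.0.0/16"
  # ...
}
```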
Best Practices
Project Structure
Organize your Terraform code consistently across projects. This structure works well for most teams and scales as projects grow.
```text
terraform/
├── main.tf       # Main resources
├── variables.tf  # Variable declarations
├── outputs.tf    # Output values
├── versions.tf   # Required versions
├── backend.tf    # State configuration
├── locals.tf     # Local values
└── data.tf       # Data sources
```
Naming Conventions
Use clear, consistent naming for resources and use locals to avoid repetition. Good naming makes your configuration self-documenting.
```hcl
# Consistent naming
resource "aws_instance" "web_primary" {}
resource "aws_instance" "web_secondary" {}

# Use locals for repeated values
locals {
  common_tags = {
    Project     = var.project_name
    Environment = var.environment
    ManagedBy   = "terraform"
  }
}

resource "aws_instance" "web" {
  tags = merge(local.common_tags, {
    Name = "web-server"
  })
}
```
The ManagedBy = "terraform" tag helps identify resources that should not be modified manually. This prevents drift and confusion when troubleshooting infrastructure.
Security
Never commit secrets to version control. Use environment variables or IAM roles instead. Terraform provides multiple options for secure credential handling.
```hcl
# Never commit secrets

# Use environment variables
provider "aws" {
  # Reads AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY
}

# Or use IAM roles
# Terraform assumes the role automatically on EC2/ECS

# Use variables for sensitive values
variable "database_password" {
  type      = string
  sensitive = true
}
```
The sensitive = true attribute prevents Terraform from showing the value in plan or apply output, reducing the risk of accidental exposure in logs or CI/CD output.
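Sensitive values can be supplied without ever touching version control through the TF_VAR_ environment variable convention. A sketch; the Secrets Manager secret ID is illustrative:

```bash
# Terraform maps TF_VAR_<name> environment variables to input variables
export TF_VAR_database_password="$(aws secretsmanager get-secret-value \
  --secret-id prod/db-password --query SecretString --output text)"

terraform apply
```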
Validation
Add validation rules to catch configuration errors early, before they reach production. This shifts errors left and provides better error messages.
```hcl
variable "environment" {
  type = string

  validation {
    condition     = contains(["dev", "staging", "production"], var.environment)
    error_message = "Environment must be dev, staging, or production."
  }
}

variable "instance_type" {
  type = string

  validation {
    condition     = can(regex("^t3\\.", var.instance_type))
    error_message = "Instance type must be from t3 family."
  }
}
```
These validations fail fast during terraform plan, saving you from deploying invalid configurations. Custom error messages help users understand exactly what went wrong.
CI/CD Integration
Automate Terraform in your deployment pipeline. This workflow runs on every PR and only applies changes on merge to main, providing code review for infrastructure changes.
```yaml
# .github/workflows/terraform.yml
name: Terraform

on:
  push:
    branches: [main]
  pull_request:

jobs:
  terraform:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.0

      - name: Terraform Init
        run: terraform init

      - name: Terraform Format
        run: terraform fmt -check

      - name: Terraform Validate
        run: terraform validate

      - name: Terraform Plan
        run: terraform plan -out=tfplan

      - name: Terraform Apply
        if: github.ref == 'refs/heads/main'
        run: terraform apply -auto-approve tfplan
```
The terraform fmt -check step enforces consistent formatting across your team. The -out=tfplan flag saves the plan to apply exactly what was reviewed, preventing race conditions where the infrastructure changes between plan and apply.
Common Patterns
Conditional Resources
Create resources only when certain conditions are met. The count meta-argument with a ternary expression is the standard pattern for this.
```hcl
resource "aws_instance" "bastion" {
  count         = var.enable_bastion ? 1 : 0
  ami           = data.aws_ami.amazon_linux.id
  instance_type = "t3.micro"
}
```
When count = 0, no resource is created. Reference conditional resources as aws_instance.bastion[0] since they become a list. You can also use one() to convert the list to a single value or null.
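For example, an output can use one() so that it yields the IP when the bastion exists and null when it does not (a sketch):

```hcl
output "bastion_ip" {
  # one() returns the single element of a zero- or one-element list, or null
  value = one(aws_instance.bastion[*].public_ip)
}
```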
For Each
Use for_each when you need to create multiple similar resources from a map. Unlike count, for_each uses meaningful keys instead of numeric indices.
```hcl
variable "instances" {
  type = map(object({
    instance_type = string
    subnet_id     = string
  }))
}

resource "aws_instance" "web" {
  for_each = var.instances

  ami           = data.aws_ami.amazon_linux.id
  instance_type = each.value.instance_type
  subnet_id     = each.value.subnet_id

  tags = {
    Name = each.key
  }
}
```
With for_each, removing an item from the middle of your configuration does not cause other resources to be recreated, unlike count where indices would shift. This makes for_each the preferred approach for production resources.
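A caller supplies the map in tfvars; the keys become both the resource addresses and the Name tags. The names and subnet IDs here are placeholders:

```hcl
# terraform.tfvars
instances = {
  api = {
    instance_type = "t3.small"
    subnet_id     = "subnet-aaaa1111" # placeholder ID
  }
  worker = {
    instance_type = "t3.medium"
    subnet_id     = "subnet-bbbb2222" # placeholder ID
  }
}
```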
Conclusion
Terraform brings software engineering practices to infrastructure management. Start with simple configurations, use modules for reusability, manage state remotely, and integrate into CI/CD. The initial learning curve pays dividends in reproducible, version-controlled infrastructure.