Terraform Best Practices - abcxyz/readability GitHub Wiki

Terraform Best Practices

Objective

We want a consistent Terraform style across our code base. This document proposes a common set of standards for writing Terraform within Bets Platform. The list is by no means exhaustive. If you find a style worth documenting, add a comment or suggestion for review.

This document is both a style guide and a best practices guide.

Readability

Bets Platform uses readability groups in GitHub CODEOWNERS files to ensure the appropriate standards and style are followed. Complete the following steps to add the abcxyz/terraform-readability team as CODEOWNERS for *.tf files in the service's repository:

  • Add the abcxyz/readability team as a contributor to the service's repository with WRITE access
    • This is a parent group that contains all readability teams and ensures readability teams can review code as needed
  • Add the following lines to the .github/CODEOWNERS file:
# terraform-readability owns all terraform files
*.tf    @abcxyz/terraform-readability

Google Cloud Organization Structure

TODO: Document this entire section and the use of the infra repository after getting automation and new structure setup that follows go/bets-platform-org-setup.

For now, this is generalized.

Terraform in the infra repository should source a module in the service repository's /terraform folder to create the resources required for the service (see Service Directory Structure). This allows a service's resources to be created in a consistent and repeatable fashion per environment.
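As a rough illustration of this pattern, the infra repository might call the same service module once per environment. This is only a sketch: the repository name my-service, the tag v0.1.0, the project_id input, and both variable names are hypothetical and not taken from any real service.

```hcl
# Hypothetical inputs: one Google Cloud project per environment.
variable "dev_project_id" {
  type        = string
  description = "The project ID for the dev environment."
}

variable "prod_project_id" {
  type        = string
  description = "The project ID for the production environment."
}

# Both environments source the same /terraform folder from the
# (hypothetical) my-service repository, pinned to the same tag.
module "my_service_dev" {
  source = "git::https://github.com/abcxyz/my-service.git//terraform?ref=v0.1.0"

  project_id = var.dev_project_id
}

module "my_service_prod" {
  source = "git::https://github.com/abcxyz/my-service.git//terraform?ref=v0.1.0"

  project_id = var.prod_project_id
}
```

Because both calls share one source, the environments differ only in the inputs passed to the module.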

Directory Naming

Directories should be named the same as the service's GitHub repository. If a directory is needed that does not represent a service in a GitHub repository, it should be named to reflect the context of what will be created. For example, a /resources/production folder structure holds resources for the production folder.

Service Directory Structure

All Terraform should exist in a folder named terraform and should represent the required resources for the service's environment. In general, modules should not need to be created for the service's environment. Prefer the use of our abcxyz/terraform-modules repository for reusable infrastructure. If a new, reusable set of resources is needed and represents a common set of functionality for use across Bets Platform, it should go in the abcxyz/terraform-modules repository. Otherwise, create the resources directly in the service's Terraform configuration.

This provides an opinionated way to set up the infrastructure needed for the service, and allows consumers to reuse it or view it as an example of how to set up their own infrastructure.

Example

/terraform
  main.tf
  outputs.tf
  variables.tf
  terraform.tf

File Structure and Naming

The following should be used to create and name the Terraform configuration files:

  • main.tf - The main file to create resources, locals, data sources, and call modules.

NOTE: For larger projects, additional appropriately named files can supplement the main file by grouping related resources together (e.g. vms.tf, network.tf).

If required, make use of the following files:

  • outputs.tf - Declare all outputs in this file.
  • variables.tf - Declare all inputs in this file.

If required for Terraform configuration requirements:

  • terraform.tf - Declare version requirements for Terraform and/or providers and any backend or provider configuration.
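For reference, a terraform.tf along these lines might look like the sketch below. The version constraints, the bucket name, and the prefix are assumptions chosen for illustration, not Bets Platform requirements.

```hcl
terraform {
  # Assumed minimum Terraform version; adjust to what the service needs.
  required_version = ">= 1.3"

  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 4.45" # assumed constraint
    }
  }

  # Hypothetical remote state location in a Cloud Storage bucket.
  backend "gcs" {
    bucket = "example-terraform-state"
    prefix = "my-service"
  }
}
```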

Other Naming

Resources

  • Resource names should use snake_case (e.g. load_balancer) to match the Terraform resource naming format.
resource "google_project" "test_service_project" {
  project_id = "test-service-${random_id.default.hex}"
  name       = "test-service"
}

Variables/Outputs

  • Variable names should use snake_case (e.g. project_id).
  • Names should be specific and, where applicable, should include a suffix indicating units or sizing (e.g. storage_size_gb).
  • For booleans, default to false and name the variable for its intended use case (e.g. enable_ for opt-in features, disable_ for opt-out features).
variable "ram_size_gb" {
  type        = number
  description = "The service RAM size, in gigabytes."
}

# The logging feature defaults to off and the user needs to enable it
variable "enable_logging" {
  type        = bool
  description = "Enable the use of logging."
  default     = false
}

# The cache feature is enabled by default and the user needs to disable it
variable "disable_cache" {
  type        = bool
  description = "Disable the use of caching."
  default     = false
}

output "service_account_email" {
  description = "Service account email address."
  value       = google_service_account.service_account.email
}

Google Cloud

  • When naming Google Cloud resources, prefer using hyphens (-) as a separator
  • Always review any Google Cloud naming constraints for the resource. Some have length limits and allowed character sets.
  • For resources that must have a unique name, use the random_id resource to ensure uniqueness across Terraform runs.
  • Prefer putting "project", "folder" and "organization" keys first in resource stanzas.
resource "random_id" "default" {
  byte_length = 2
}

resource "google_project" "test_service_project" {
  project_id = "test-service-${random_id.default.hex}"
  name       = "test-service"
}

Attribute Ordering

  • Leading meta-arguments
    • The following arguments should appear first, in this order:
      • for_each
      • count
      • provider
    • An empty newline should follow the leading meta-arguments for readability
  • Module source
    • source
    • An empty newline should follow the source attribute, even when there are no leading meta-arguments
  • Provider specific preference
    • In general prefer top-level identifiers first
    • Google Provider (in this order)
      • organization
      • folder
      • project
  • Ending meta-arguments
    • The following arguments should appear last, in this order:
      • depends_on
      • lifecycle
# with leading meta-arguments
module "some_modules" {
  for_each = toset(["name_1", "name_2"])

  source = "git::https://github.com/owner/repo.git//modules/name?ref=SHA/TAG"

  project_id = "project-id"
  name       = each.value

  # Module blocks do not accept a lifecycle block (Terraform reserves it
  # there for future use), so depends_on is the only ending meta-argument.
  depends_on = [
    google_project_service.services["cloudbuild.googleapis.com"],
  ]
}


# without leading meta-arguments
module "some_module" {
  source = "git::https://github.com/owner/repo.git//modules/name?ref=SHA/TAG"

  project_id = "project-id"
  name       = "name_1"

  depends_on = [
    google_project_service.services["cloudbuild.googleapis.com"],
  ]
}
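The same ordering applies to resource blocks, where the lifecycle block is available. The following is a minimal sketch, assuming a hypothetical project_id variable and illustrative bucket names; it is meant only to show where each group of attributes sits.

```hcl
variable "project_id" {
  type        = string
  description = "The Google Cloud project ID."
}

resource "google_project_service" "services" {
  for_each = toset(["storage.googleapis.com"])

  project = var.project_id
  service = each.value
}

resource "google_storage_bucket" "buckets" {
  # Leading meta-argument first, followed by an empty newline.
  for_each = toset(["logs", "artifacts"])

  # Top-level identifier (project) before other attributes.
  project  = var.project_id
  name     = "${var.project_id}-${each.value}"
  location = "US"

  # Ending meta-arguments last: depends_on, then lifecycle.
  depends_on = [
    google_project_service.services["storage.googleapis.com"],
  ]

  lifecycle {
    prevent_destroy = true
  }
}
```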

Formatting

All Terraform files should be formatted with the default Terraform formatter using the terraform fmt command. The Bets Platform terraform-lint.yml linter should be set up in the service's repository to ensure proper formatting.

Visual Studio Code can be configured to format on save by installing the HashiCorp Terraform extension and adding the following values to the user settings (per the docs):

NOTE: Visual Studio Code may need to be restarted for this to take effect.

  "[terraform]": {
    "editor.defaultFormatter": "hashicorp.terraform",
    "editor.formatOnSave": true,
    "editor.formatOnSaveMode": "file"
  },
  "[terraform-vars]": {
    "editor.defaultFormatter": "hashicorp.terraform",
    "editor.formatOnSave": true,
    "editor.formatOnSaveMode": "file"
  },

Modules

Creating Modules

Common modules should be located in the abcxyz/terraform-modules repository. Each module in this repository should serve a specific, reusable function rather than attempting to handle every possible configuration.

Structure

/<module_name>
  README.md
  main.tf
  variables.tf
  outputs.tf
  terraform.tf

Sourcing Modules

Prefer using Generic Git over HTTPS when sourcing modules. This option is generally easier to read and makes it possible to inject a GitHub personal access token (PAT) to access private repositories if needed.

Additionally, modules should always pin the version using a Git tag or commit SHA instead of tracking the main branch. This locks the module in place and prevents accidentally pulling the latest code, which may break the infrastructure in unexpected ways.

Example

module "github_ci_infra" {
  source = "git::https://github.com/abcxyz/terraform-modules.git//modules/github_ci_infra?ref=SHA/TAG"

  project_id             = google_project.github_metrics_ci.project_id
  name                   = "github-metrics"
  github_repository_name = "github-metrics-aggregator"
}
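To make the ref placeholder concrete, the sketch below pins the same module once by tag and once by commit SHA. The tag v0.1.0, the SHA, the module labels, and the project_id variable are all hypothetical values chosen for illustration.

```hcl
variable "project_id" {
  type        = string
  description = "The Google Cloud project ID."
}

# Pinned to a release tag (hypothetical tag name).
module "github_ci_infra_by_tag" {
  source = "git::https://github.com/abcxyz/terraform-modules.git//modules/github_ci_infra?ref=v0.1.0"

  project_id             = var.project_id
  name                   = "github-metrics"
  github_repository_name = "github-metrics-aggregator"
}

# Pinned to a full commit SHA (hypothetical value).
module "github_ci_infra_by_sha" {
  source = "git::https://github.com/abcxyz/terraform-modules.git//modules/github_ci_infra?ref=0123456789abcdef0123456789abcdef01234567"

  project_id             = var.project_id
  name                   = "github-metrics-sha"
  github_repository_name = "github-metrics-aggregator"
}

# Avoid: ?ref=main follows the branch, so each init may pull different code.
```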

Injecting PAT Token (GitHub Actions)

# using default github actions token
git config --global url."https://x-access-token:${{ github.token }}@github.com".insteadOf "https://github.com"

# using custom PAT token
git config --global url."https://<USERNAME>:${{ secrets.PAT_TOKEN }}@github.com".insteadOf "https://github.com"

Google Provider

General

  • Prefer accepting a project_id as a variable instead of trying to inherit it from the provider.
    • Adding the project_id at the global provider level can have unwanted effects if you miss adding a project_id on a resource and it gets created in the wrong place. This can also be hard to catch when combing through plan output.
# GOOD
variable "project_id" {
  description = "The GCP project ID."
  type        = string
}

resource "google_project_iam_member" "browser" {
  project = var.project_id
  role    = "roles/browser"
  member  = "group:[email protected]"
}
# BAD
provider "google" {
  project = "my-project-id"
}

google-beta

  • The google-beta provider should be used sparingly, and only when a beta feature is required. If you aren't using a beta feature, do not include it in the required_providers configuration and do not reference it in your resources.
  • When required, the provider attribute must be set to google-beta on the resource and should be the topmost attribute.
resource "google_artifact_registry_repository" "image_registry" {
  provider = google-beta

  project       = var.project_id
  location      = "US"
  repository_id = "docker-images"
  description   = "Container Registry for the abcxyz images."
  format        = "DOCKER"

  depends_on = [
    google_project_service.services["artifactregistry.googleapis.com"],
  ]
}
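When a beta feature is needed, the matching required_providers entry might look like the following sketch. The version constraints are assumptions; only the provider source addresses are fixed.

```hcl
terraform {
  required_providers {
    google = {
      source  = "hashicorp/google"
      version = ">= 4.45" # assumed constraint
    }

    # Include this entry only while a resource sets provider = google-beta.
    google-beta = {
      source  = "hashicorp/google-beta"
      version = ">= 4.45" # assumed constraint
    }
  }
}
```

Once the feature reaches general availability, dropping both the provider attribute and this entry keeps the configuration on the stable provider.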

IAM

  • In general, we prefer the use of _member resources for configuring IAM instead of _binding or _policy.
  • _member resources are non-authoritative, meaning they add a member to a given IAM role and manage only that relationship. If other members are added to this role outside of Terraform, Terraform will not detect them as changes or try to reconcile them.
  • _binding and _policy resources are authoritative: _binding defines the complete list of members for a given IAM role, and _policy defines the entire IAM policy. Any change made outside of Terraform is considered "drift" and will be reverted on the next plan/apply cycle, i.e. Terraform becomes the source of truth.
  • Using authoritative resources can trigger a lot of "drift" scenarios and have unwanted side-effects like removing Google Cloud default members for certain roles. Rather than relying on Terraform to enforce the IAM policy, we follow the principle of least-privilege to prevent changes to any IAM policies outside of Terraform.
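The difference can be sketched as follows. The service account, the role, and the project_id variable are hypothetical and serve only to contrast the two resource types.

```hcl
variable "project_id" {
  type        = string
  description = "The Google Cloud project ID."
}

# Hypothetical service account used as the IAM member.
resource "google_service_account" "app" {
  project      = var.project_id
  account_id   = "example-app"
  display_name = "Example application"
}

# Preferred: non-authoritative. Manages only this member/role pair and
# leaves other members of roles/viewer untouched.
resource "google_project_iam_member" "app_viewer" {
  project = var.project_id
  role    = "roles/viewer"
  member  = "serviceAccount:${google_service_account.app.email}"
}

# Avoid: authoritative for roles/viewer. Any member of the role that is
# not listed in members would be removed on the next apply.
# resource "google_project_iam_binding" "viewer" {
#   project = var.project_id
#   role    = "roles/viewer"
#   members = ["serviceAccount:${google_service_account.app.email}"]
# }
```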