Terraform: Cross-Statefile Migration

When refactoring Terraform code, you often need to move resources between state files: splitting a monolithic state into smaller ones, consolidating shared resources, or reorganising by team or service boundary. Terraform’s state mv command supports this, but doing it safely across many workspaces and environments requires a consistent, scripted approach.

The problem

Terraform state is a source of truth. Moving resources incorrectly, or pushing a broken state, can cause Terraform to think resources need to be destroyed and recreated. The risks are:

Losing track of which resources have been moved
Pushing a partial state mid-migration and leaving things inconsistent
No audit trail of what was moved

The approach

The pattern used here works entirely on local copies of the state files. Nothing is pushed to the remote backend until all moves have been validated. The flow for each workspace pair is:

pull source state → pull target state → move resources locally → verify → push both states

Working on local copies means you can safely iterate, inspect, and validate before touching the remote state.

Script structure

Each migration script follows the same pattern:

Configure: define source/target workspaces, environments, and the list of resources to move
Init: run terraform init -reconfigure -upgrade for both source and target directories
Pull: pull both state files locally, keeping a backup copy and a working copy
Move: use terraform state mv with -state and -state-out flags to move resources between the local copies
Verify: count resources before and after to confirm the move was correct
Push: push the modified state files back to the remote backend with -lock=false
Plan: run terraform plan on both workspaces to confirm no unexpected changes

Example script

Here is a generalised version of the pattern:

#!/usr/bin/env bash

# Exit on errors, on unset variables, and on failures inside pipelines
set -euo pipefail

terraform_dir="$HOME/code/terraform"

# Define environments and source/target workspace names
envs=(
  "ew1-dev"
  # "ew1-staging"
  # "ew1-prod"
)

tempdir="$(pwd)/tmp"

target_dir="terraform/shared"
target_project="shared"

for env in "${envs[@]}"; do

  backupdir="$tempdir/backup/$env"
  mkdir -p "$backupdir"

  workingdir="$tempdir/working/$env"
  mkdir -p "$workingdir"

  target_workspace="$target_project-$env"
  source_dir="terraform/my-service"
  source_workspace="my-service-$env"

  # Init and pull target state
  export TF_WORKSPACE="$env"
  terraform -chdir="$terraform_dir/$target_dir" init -reconfigure -upgrade > /dev/null
  terraform -chdir="$terraform_dir/$target_dir" state pull > "$backupdir/target-backup-$target_workspace.tfstate"
  cp "$backupdir/target-backup-$target_workspace.tfstate" "$workingdir/target-$target_workspace.tfstate"

  # Init and pull source state
  terraform -chdir="$terraform_dir/$source_dir" init -reconfigure -upgrade > /dev/null
  terraform -chdir="$terraform_dir/$source_dir" state pull > "$backupdir/source-backup-$source_workspace.tfstate"
  cp "$backupdir/source-backup-$source_workspace.tfstate" "$workingdir/source-$source_workspace.tfstate"

  # Count resources before move (for validation).
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  target_count_before=$(terraform state list -state="$workingdir/target-$target_workspace.tfstate" | grep -cF 'module.my_resource' || true)
  source_count_before=$(terraform state list -state="$workingdir/source-$source_workspace.tfstate" | grep -cF 'module.my_resource' || true)

  if [ "$source_count_before" -eq 0 ]; then
    echo "[ERROR] No resources found in source state" >&2
    exit 1
  fi

  # Get list of resources to move from source
  resources=$(terraform state list -state="$workingdir/source-$source_workspace.tfstate" | grep -F 'module.my_resource')

  # Move each resource from source to target (locally).
  # Addresses are read line by line on fd 3 so that index keys such as ["a b"]
  # stay intact and terraform's stdin is left alone.
  while IFS= read -r resource <&3; do
    echo "[INFO] Moving: $resource"
    terraform state mv \
      -state="$workingdir/source-$source_workspace.tfstate" \
      -state-out="$workingdir/target-$target_workspace.tfstate" \
      "$resource" "$resource"

    # Verify the resource landed in the target state (exact, whole-line match)
    terraform state list -state="$workingdir/target-$target_workspace.tfstate" \
      | grep -xF "$resource" > /dev/null \
      || { echo "[ERROR] Resource not found in target state: $resource" >&2; exit 1; }

    # Remove the .backup files that state mv leaves next to the working copies
    rm -f "$workingdir"/*.backup
  done 3<<< "$resources"

  # Count after and validate totals add up
  target_count_after=$(terraform state list -state="$workingdir/target-$target_workspace.tfstate" | grep -cF 'module.my_resource' || true)
  source_count_after=$(terraform state list -state="$workingdir/source-$source_workspace.tfstate" | grep -cF 'module.my_resource' || true)

  expected_count=$((source_count_before + target_count_before))
  if [ "$target_count_after" -ne "$expected_count" ]; then
    echo "[ERROR] Count mismatch. Expected: $expected_count, Got: $target_count_after" >&2
    exit 1
  fi

  # Nothing matching the pattern should be left behind in the source
  if [ "$source_count_after" -ne 0 ]; then
    echo "[ERROR] $source_count_after matching resources still in source state" >&2
    exit 1
  fi

  # Push both states back to remote
  terraform -chdir="$terraform_dir/$source_dir" state push -lock=false "$workingdir/source-$source_workspace.tfstate"
  terraform -chdir="$terraform_dir/$target_dir" state push -lock=false "$workingdir/target-$target_workspace.tfstate"

  # Plan both to confirm no unexpected changes
  terraform -chdir="$terraform_dir/$source_dir" plan
  terraform -chdir="$terraform_dir/$target_dir" plan
done
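
The "[INFO] Moving" lines in the script only reach the terminal. To keep an audit trail of what was moved, one option is to append each move to a log file as well. This is a minimal sketch: the log_move helper, the tab-separated format, and the log path in the example call are assumptions, not part of the original script.

```shell
#!/usr/bin/env bash

# Hypothetical helper: append one tab-separated line per moved resource.
# Columns: UTC timestamp, environment, source workspace, target workspace, address.
log_move() {
  local logfile="$1" env="$2" source_ws="$3" target_ws="$4" resource="$5"
  printf '%s\t%s\t%s\t%s\t%s\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" "$env" "$source_ws" "$target_ws" "$resource" \
    >> "$logfile"
}

# Example call from inside the move loop (the log path is an assumption):
# log_move "$tempdir/moves.log" "$env" "$source_workspace" "$target_workspace" "$resource"
```

Appending rather than overwriting means repeated runs across environments accumulate in a single file that can be kept alongside the state backups.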

Key details

Always work on local copies

The script pulls both state files before making any changes. The backup copy is never modified; only the working copy is. This means you can always diff against or restore from the backup if something goes wrong.

# backup = untouched original
terraform ... state pull > "$backupdir/source-backup-$source_workspace.tfstate"

# working = the copy we mutate
cp "$backupdir/source-backup-$source_workspace.tfstate" "$workingdir/source-$source_workspace.tfstate"
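
To see what a migration changed, the resource addresses in the backup can be compared with those in the working copy. This is a minimal sketch, assuming both address lists have already been saved to files (for example with terraform state list -state=... > file). The removed_addresses and added_addresses helpers are hypothetical names.

```shell
#!/usr/bin/env bash

# Hypothetical helpers: compare two files of resource addresses,
# one address per line (e.g. saved output of `terraform state list`).
# grep flags: -v invert, -x whole-line match, -F fixed strings, -f patterns from file.
# grep exits 1 when it selects no lines; that case is treated as success here.

# Addresses present in the "before" list ($1) but missing from the "after" list ($2).
removed_addresses() {
  grep -vxF -f "$2" "$1" || [ "$?" -eq 1 ]
}

# Addresses present in the "after" list ($2) but not in the "before" list ($1).
added_addresses() {
  grep -vxF -f "$1" "$2" || [ "$?" -eq 1 ]
}
```

For a source state, removed_addresses should print exactly the resources that were meant to move, and added_addresses should print nothing.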

Use `-state` and `-state-out` flags

terraform state mv normally operates against the configured backend. Using -state and -state-out redirects reads and writes to local files instead:

terraform state mv \
  -state=./working/source.tfstate \
  -state-out=./working/target.tfstate \
  module.my_resource \
  module.my_resource

If the resource address is the same in both state files (i.e. you’re not renaming), pass the same address twice.
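
If the module is named differently in the target configuration, the destination address has to be rewritten before it is passed as the second argument. This is a minimal sketch using shell parameter expansion; the rewrite_prefix helper and the module.shared_resource name are hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: replace a leading address prefix with a new one.
# Include the trailing dot in the prefixes so that "module.my_resource."
# does not also match "module.my_resource_v2.".
rewrite_prefix() {
  local address="$1" old_prefix="$2" new_prefix="$3"
  case "$address" in
    "$old_prefix"*) printf '%s\n' "${new_prefix}${address#"$old_prefix"}" ;;
    *)              printf '%s\n' "$address" ;;  # prefix not present: leave unchanged
  esac
}

# Example use with state mv (target module name is an assumption):
# target_address="$(rewrite_prefix "$resource" 'module.my_resource.' 'module.shared_resource.')"
# terraform state mv -state=... -state-out=... "$resource" "$target_address"
```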

Validate counts before pushing

Before pushing, verify that the number of resources in the target state equals the sum of what was in both states before the migration. This catches partial moves or double-moves:

expected_count=$((source_count_before + target_count_before))
if [ "$target_count_after" -ne "$expected_count" ]; then
  echo "[ERROR] Count mismatch" >&2
  exit 1
fi
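
The same check can be wrapped in a function that also confirms nothing matching the pattern was left behind in the source. This is a sketch only; the validate_counts name and its argument order are assumptions.

```shell
#!/usr/bin/env bash

# Hypothetical helper: validate resource counts after a local move.
# Arguments: source_before target_before source_after target_after
# Returns 0 when the target holds everything and the source holds nothing.
validate_counts() {
  local source_before="$1" target_before="$2" source_after="$3" target_after="$4"
  local expected=$((source_before + target_before))

  if [ "$target_after" -ne "$expected" ]; then
    echo "[ERROR] Target count mismatch. Expected: $expected, Got: $target_after" >&2
    return 1
  fi
  if [ "$source_after" -ne 0 ]; then
    echo "[ERROR] $source_after matching resources still in source state" >&2
    return 1
  fi
}

# Example:
# validate_counts "$source_count_before" "$target_count_before" \
#   "$source_count_after" "$target_count_after" || exit 1
```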

Push with `-lock=false`

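When pushing local state files back, the script uses -lock=false to avoid lock conflicts. This is only safe if nobody else, including CI pipelines, can modify either state during the migration, because disabling locking removes the protection against concurrent writes: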

terraform state push -lock=false "$workingdir/target.tfstate"
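
Because the push is the first step that changes remote state, it can help to rehearse the script without executing it. The sketch below assumes a hypothetical run wrapper and a DRY_RUN environment variable, neither of which is in the original script.

```shell
#!/usr/bin/env bash

# Hypothetical wrapper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "[DRY-RUN] $*"
  else
    "$@"
  fi
}

# Example: with DRY_RUN=1 this prints the push command rather than executing it.
# run terraform state push -lock=false "$workingdir/target.tfstate"
```

Only the commands that touch the remote backend need the wrapper; the local pull, move, and verify steps can still run for real during a rehearsal.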

Run plan after pushing

After pushing both states, run terraform plan on each workspace to confirm Terraform sees no unexpected changes. Some drift is expected (e.g. tag changes), but you should not see any resource destruction.
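
To make this check scriptable, the plan summary can be parsed for the number of resources to destroy. This is a minimal sketch, assuming the plan is run with -no-color and that its summary line looks like "Plan: X to add, Y to change, Z to destroy."; the plan_destroy_count helper is hypothetical.

```shell
#!/usr/bin/env bash

# Hypothetical helper: read plan output on stdin and print the destroy count.
# Prints 0 when there is no "Plan:" summary line (for example, no changes).
plan_destroy_count() {
  local count
  count="$(sed -n 's/.*Plan:.* \([0-9][0-9]*\) to destroy.*/\1/p' | tail -n 1)"
  echo "${count:-0}"
}

# Example (the directory variable is an assumption):
# destroys="$(terraform -chdir="$dir" plan -no-color | plan_destroy_count)"
# [ "$destroys" -eq 0 ] || { echo "[ERROR] plan would destroy $destroys resources" >&2; exit 1; }
```

A replacement counts as both an add and a destroy, so a non-zero result also covers resources that Terraform wants to recreate.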

When to use this pattern

Splitting a large workspace into smaller per-team or per-service workspaces
Consolidating shared resources (e.g. SNS topics, KMS keys) into a shared workspace
Migrating from one backend to another
Reorganising modules across repos without destroying and recreating infrastructure

Gotchas

Data sources: data sources cannot be moved with state mv. They are re-read on the next plan/apply. Remove them from your resource list.
Resource exclusions: some resources (KMS keys, IAM roles) may exist in the source but should stay there. Explicitly exclude them from the move list rather than relying on grep patterns.
Workspace vs directory: if using Terraform Cloud or Terraform workspaces, ensure TF_WORKSPACE is set correctly before running init and state pull, otherwise you may pull the wrong state.
.backup files: terraform state mv creates .backup files alongside your working state files. Clean these up between iterations to avoid confusion:
```
rm -f "$workingdir"/*.backup
```
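
The data source and resource exclusion gotchas can both be handled by filtering the address list before the move loop. This is a minimal sketch: the filter_movable helper and the addresses in exclude_addresses are hypothetical, and the regular expression assumes addresses in the form printed by terraform state list.

```shell
#!/usr/bin/env bash

# Hypothetical exclusion list: exact addresses that must stay in the source state,
# one per line.
exclude_addresses='module.my_resource.aws_kms_key.this
module.my_resource.aws_iam_role.this'

# Hypothetical helper: read addresses on stdin, print the ones that are safe to move.
# Step 1 drops data sources: an address that starts with "data." once any leading
#        module path (module.NAME or module.NAME[KEY]) has been skipped.
# Step 2 drops exact, whole-line matches from the exclusion list.
# grep exits 1 when it selects no lines; that case is treated as success.
filter_movable() {
  { grep -vE '^(module\.[^.[]+(\[[^]]*\])?\.)*data\.' || [ "$?" -eq 1 ]; } \
    | { grep -vxF -e "$exclude_addresses" || [ "$?" -eq 1 ]; }
}

# Example:
# resources="$(terraform state list -state="$state_file" | grep -F 'module.my_resource' | filter_movable)"
```

Matching the exclusions as whole lines keeps the list explicit: an entry only ever removes the one address it names.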