Risk Encyclopedia
Every rule OpStack evaluates is documented here. Search by rule ID, resource type, or keyword to understand exactly what breaks, why it breaks, and what to do about it. Rule IDs in analysis findings link directly to their entry below.
Azure Resource Coverage
47 resource types discovered
OpStack discovers these Azure resource types from your live environment before analysing any Terraform plan.
Virtual Network
azurerm_virtual_network
Networking
Subnet
azurerm_subnet
Networking
Network Security Group
azurerm_network_security_group
Networking
Route Table
azurerm_route_table
Networking
NAT Gateway
azurerm_nat_gateway
Networking
VNet Peering
azurerm_virtual_network_peering
Networking
Network Interface (NIC)
azurerm_network_interface
Networking
Public IP Address
azurerm_public_ip
Networking
Subnet-NSG Association
azurerm_subnet_network_security_group_association
Networking
Subnet-Route Table Association
azurerm_subnet_route_table_association
Networking
Subnet-NAT Association
azurerm_subnet_nat_gateway_association
Networking
NAT-Public IP Association
azurerm_nat_gateway_public_ip_association
Networking
App Service VNet Integration
azurerm_app_service_virtual_network_swift_connection
Networking
Private Endpoint
azurerm_private_endpoint
Private Link
Private DNS Zone
azurerm_private_dns_zone
DNS
Private DNS VNet Link
azurerm_private_dns_zone_virtual_network_link
DNS
Public DNS Zone
azurerm_dns_zone
DNS
Linux VM
azurerm_linux_virtual_machine
Compute
Windows VM
azurerm_windows_virtual_machine
Compute
AKS Cluster
azurerm_kubernetes_cluster
Containers
AKS Node Pool
azurerm_kubernetes_cluster_node_pool
Containers
Azure Container Registry
azurerm_container_registry
Containers
Container App Environment
azurerm_container_app_environment
Containers
Container App
azurerm_container_app
Containers
App Service Plan
azurerm_service_plan
App Hosting
Linux Web App
azurerm_linux_web_app
App Hosting
Linux Function App
azurerm_linux_function_app
App Hosting
Internal Load Balancer
azurerm_lb
Load Balancing
Application Gateway
azurerm_application_gateway
Load Balancing
Azure Bastion
azurerm_bastion_host
Access
Azure SQL Server
azurerm_mssql_server
Data
Azure SQL Database
azurerm_mssql_database
Data
Cosmos DB Account
azurerm_cosmosdb_account
Data
Azure Redis Cache
azurerm_redis_cache
Data
Storage Account
azurerm_storage_account
Storage
Key Vault
azurerm_key_vault
Security
User-Assigned Managed Identity
azurerm_user_assigned_identity
Identity
Service Bus Namespace
azurerm_servicebus_namespace
Messaging
Event Hub Namespace
azurerm_eventhub_namespace
Messaging
API Management
azurerm_api_management
API
Front Door Profile
azurerm_cdn_frontdoor_profile
Traffic
Traffic Manager Profile
azurerm_traffic_manager_profile
Traffic
Log Analytics Workspace
azurerm_log_analytics_workspace
Observability
Application Insights
azurerm_application_insights
Observability
Monitor Metric Alert
azurerm_monitor_metric_alert
Observability
Monitor Action Group
azurerm_monitor_action_group
Observability
Data Factory
azurerm_data_factory
Data Integration
AWS Resource Coverage
29 resource types discovered
OpStack discovers these AWS resource types from your live environment before analysing any Terraform plan.
VPC
aws_vpc
Networking
Subnet
aws_subnet
Networking
Security Group
aws_security_group
Networking
NAT Gateway
aws_nat_gateway
Networking
Internet Gateway
aws_internet_gateway
Networking
Route Table
aws_route_table
Networking
IAM Role
aws_iam_role
IAM
EC2 Instance
aws_instance
Compute
Lambda Function
aws_lambda_function
Compute
EKS Cluster
aws_eks_cluster
Containers
EKS Node Group
aws_eks_node_group
Containers
ECS Cluster
aws_ecs_cluster
Containers
ECS Service
aws_ecs_service
Containers
Application/Network LB
aws_lb
Load Balancing
Target Group
aws_lb_target_group
Load Balancing
CloudFront Distribution
aws_cloudfront_distribution
CDN
RDS Instance
aws_db_instance
Data
ElastiCache Cluster
aws_elasticache_cluster
Data
S3 Bucket
aws_s3_bucket
Storage
KMS Key
aws_kms_key
Security
ACM Certificate
aws_acm_certificate
Security
SQS Queue
aws_sqs_queue
Messaging
SNS Topic
aws_sns_topic
Messaging
API Gateway REST API
aws_api_gateway_rest_api
API
API Gateway HTTP API
aws_apigatewayv2_api
API
Route53 Hosted Zone
aws_route53_zone
DNS
Route53 Record
aws_route53_record
DNS
CloudWatch Metric Alarm
aws_cloudwatch_metric_alarm
Observability
Auto Scaling Policy
aws_autoscaling_policy
Scaling
Azure Risk Rules
41 rules
AZURE_VNET_DELETION_BREAKS_INFRA
CRITICAL
Deleting VNet With Active Infrastructure
Trigger
DELETE or REPLACE of azurerm_virtual_network
Fires when
VNet contains subnets, NICs, VMs, AKS clusters, private endpoints, load balancers, App Services, Container App Environments, or Bastion hosts.
What breaks
All networking for every resource inside the VNet is destroyed. Apply fails with InUseSubnetCannotBeDeleted or VirtualNetworkInUse if dependent resources are not in the same plan.
Why it happens
Azure ARM enforces VNet deletion constraints at the API layer. Any NIC, private endpoint, App Gateway, or service-delegated subnet holds an ARM reference inside the VNet - Azure refuses deletion while those references exist.
Remediation
Include all resources inside the VNet in the same Terraform plan. Never delete a VNet in isolation.
Example scenario
Deleting vnet-prod while aks-cluster, 9 private endpoints, and an Application Gateway are deployed fails at the ARM API level before any resource is touched.
AZURE_SUBNET_DELETION_BREAKS_AKS
CRITICAL
Deleting Subnet Used By AKS Cluster
Trigger
DELETE or REPLACE of azurerm_subnet
Fires when
An AKS cluster has node pools deployed into this subnet.
What breaks
Azure blocks subnet deletion with InUseSubnetCannotBeDeleted. Apply fails immediately.
Why it happens
AKS node pools hold ARM NIC references inside the subnet. Azure enforces deletion constraints at the ARM layer regardless of node pool state.
Remediation
Delete the AKS cluster (or node pool using this subnet) in the same Terraform plan. Terraform handles ordering.
Example scenario
Removing subnet-aks while aks-prod has its default node pool in it fails with InUseSubnetCannotBeDeleted.
AZURE_SUBNET_IN_USE_BY_NIC
CRITICAL
Deleting Subnet With Active NICs (Apply Will Fail)
Trigger
DELETE or REPLACE of azurerm_subnet
Fires when
One or more Network Interfaces have IP configurations placed inside this subnet.
What breaks
Azure blocks subnet deletion with InUseSubnetCannotBeDeleted. Apply fails immediately.
Why it happens
Every NIC deployed in a subnet holds an ARM reference to the subnet. Azure enforces this regardless of VM power state - even a stopped VM's NIC blocks subnet deletion.
Remediation
Delete or detach all NICs from this subnet first. Include them in the same Terraform plan.
Example scenario
Deleting subnet-vm while vm-prod's NIC is still active fails with InUseSubnetCannotBeDeleted.
AZURE_SUBNET_IN_USE_BY_APP_SERVICE
CRITICAL
Deleting Subnet Used By App Service VNet Integration (Apply Will Fail)
Trigger
DELETE or REPLACE of azurerm_subnet
Fires when
A Web App or Function App has Regional VNet Integration pointing to this subnet via azurerm_app_service_virtual_network_swift_connection.
What breaks
Azure may block subnet deletion with InUseSubnetCannotBeDeleted if the Swift connection is not removed first.
Why it happens
The Swift connection resource holds an ARM reference inside the subnet. Terraform's parallel execution may attempt to delete the subnet before the connection is removed.
Remediation
Include azurerm_app_service_virtual_network_swift_connection in the deletion plan before the subnet.
Example scenario
Removing subnet-app while the WebApp Swift connection references it fails unless the Swift connection is deleted first.
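The ordering fix can be sketched in Terraform (resource names here are hypothetical). Because the Swift connection references the subnet's ID, keeping both in the same configuration gives Terraform an implicit dependency, so a destroy removes the connection before the subnet:

```hcl
resource "azurerm_subnet" "app" {
  name                 = "subnet-app"
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.0.2.0/24"]

  # Regional VNet Integration requires a delegated subnet
  delegation {
    name = "appservice-delegation"
    service_delegation {
      name = "Microsoft.Web/serverFarms"
    }
  }
}

resource "azurerm_app_service_virtual_network_swift_connection" "web" {
  app_service_id = azurerm_linux_web_app.web.id
  # Referencing the subnet creates an implicit dependency, so a destroy
  # removes this connection before attempting to delete the subnet.
  subnet_id = azurerm_subnet.app.id
}
```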
AZURE_SUBNET_IN_USE_BY_PRIVATE_ENDPOINT
CRITICAL
Deleting Subnet With Active Private Endpoints (Apply Will Fail)
Trigger
DELETE or REPLACE of azurerm_subnet
Fires when
One or more Private Endpoints are deployed in this subnet.
What breaks
Azure blocks deletion with PrivateEndpointInUseError. For services with public_network_access=Disabled, removing PEs also makes those services completely unreachable.
Why it happens
Private endpoints hold ARM NIC references inside the subnet. In private-link-only architectures, a single PE subnet commonly hosts all private endpoints for the subscription.
Remediation
Include all Private Endpoints in the deletion plan. Verify target services have a fallback access path before removing PEs.
Example scenario
Deleting subnet-private-endpoints while 9 PEs exist fails with PrivateEndpointInUseError.
AZURE_NSG_DELETION_BREAKS_VMS
CRITICAL
Deleting NSG Attached to Active Resources
Trigger
DELETE or REPLACE of azurerm_network_security_group
Fires when
The NSG is attached to a subnet or NIC with active VMs or services.
What breaks
All traffic filtering rules removed immediately. Subnet loses its security boundary. Resources may become exposed within the VNet.
Why it happens
Azure allows NSG deletion even when associated with a subnet or NIC. The association resource remains as an orphaned reference.
Remediation
Remove the NSG association first. Verify the subnet operates correctly under default rules before deleting.
Example scenario
Deleting nsg-web immediately removes all port 443/80 restrictions and lateral movement controls for subnet-frontend.
AZURE_LOADBALANCER_SUBNET_DELETION
HIGH
Deleting Subnet Used By Load Balancer
Trigger
DELETE or REPLACE of azurerm_subnet
Fires when
An Internal Load Balancer or Application Gateway has its frontend IP configuration in this subnet.
What breaks
Azure blocks deletion with InUseSubnetCannotBeDeleted. Traffic routing fails.
Why it happens
The LB or App Gateway frontend IP configuration holds an ARM reference to the subnet.
Remediation
Delete or reconfigure the Load Balancer or Application Gateway before deleting the subnet.
Example scenario
Deleting subnet-edge while internal-lb has its frontend IP in it fails with InUseSubnetCannotBeDeleted.
AZURE_APP_SERVICE_PLAN_HAS_APPS
CRITICAL
Deleting App Service Plan With Hosted Apps (Apply Will Fail)
Trigger
DELETE or REPLACE of azurerm_service_plan or azurerm_app_service_plan
Fires when
One or more Web Apps or Function Apps are hosted on this plan.
What breaks
Azure blocks plan deletion with CannotDeleteHostingPlanWithSites. Apply fails immediately.
Why it happens
Azure enforces that an App Service Plan cannot be deleted while sites are registered to it.
Remediation
Delete all hosted apps in the same Terraform plan. If apps are in a separate workspace, destroy them first.
Example scenario
Deleting service-plan-prod while web-app and func-app are hosted on it fails with CannotDeleteHostingPlanWithSites.
AZURE_CONTAINER_APP_ENV_HAS_APPS
CRITICAL
Deleting Container App Environment With Hosted Apps (Apply Will Fail)
Trigger
DELETE or REPLACE of azurerm_container_app_environment
Fires when
One or more Container Apps are hosted inside this environment.
What breaks
Azure blocks environment deletion until all hosted Container Apps are removed.
Why it happens
Container App Environments have a parent-child relationship with Container Apps enforced at the ARM layer.
Remediation
Delete all Container Apps in the same Terraform plan before the environment.
Example scenario
Deleting cae-prod while api-container-app and worker-container-app are hosted in it fails at apply time.
AZURE_KEYVAULT_DELETION
CRITICAL
Deleting Key Vault With Dependent Resources
Trigger
DELETE or REPLACE of azurerm_key_vault
Fires when
Always fires. Severity downgrades to HIGH if purge protection is enabled.
What breaks
All secrets, keys, and certificates become inaccessible. CMK-encrypted storage data may become unreadable. App Services fail to retrieve secrets on next restart.
Why it happens
Without purge protection, KV deletion is permanent after the 90-day soft-delete window. OpStack detects purge protection from live state and adjusts severity accordingly.
Remediation
Enable purge protection on all production Key Vaults. Rotate all secrets to a new vault before deleting.
Example scenario
Deleting kv-prod makes all app secrets immediately inaccessible. Web apps fail to start because AZURE_CLIENT_SECRET cannot be retrieved.
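A minimal sketch of the remediation, assuming hypothetical names and the azurerm provider: enable purge protection on the vault and add a `prevent_destroy` guard so an accidental deletion fails at plan time rather than at the ARM API:

```hcl
resource "azurerm_key_vault" "prod" {
  name                       = "kv-prod"
  location                   = azurerm_resource_group.main.location
  resource_group_name        = azurerm_resource_group.main.name
  tenant_id                  = data.azurerm_client_config.current.tenant_id
  sku_name                   = "standard"
  soft_delete_retention_days = 90
  purge_protection_enabled   = true # prevents permanent purge inside the retention window

  lifecycle {
    prevent_destroy = true # any plan that would delete the vault fails instead
  }
}
```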
AZURE_SQL_SERVER_DELETION
CRITICAL
Deleting Azure SQL Server
Trigger
DELETE or REPLACE of azurerm_sql_server or azurerm_mssql_server
Fires when
Always fires.
What breaks
Every database on the server permanently deleted. All application connections fail immediately. Private endpoints orphaned. Private DNS sql zone resolves to a dead endpoint.
Why it happens
Azure SQL Server deletion cascades to all hosted databases. Unlike RDS, there is no final-snapshot parameter.
Remediation
Export or restore all databases before deletion. Update private endpoint configurations and DNS records.
Example scenario
Deleting sql-prod while hosting app-db, analytics-db, and audit-db permanently destroys all three databases simultaneously.
AZURE_AKS_CLUSTER_DELETION
HIGH
Deleting AKS Cluster
Trigger
DELETE or REPLACE of azurerm_kubernetes_cluster
Fires when
Always fires.
What breaks
All node pools terminated. All running Kubernetes workloads destroyed. PVs with Delete reclaim policy permanently deleted. Managed identity assignments removed.
Why it happens
AKS cluster deletion removes the entire control plane and all node pools without graceful drain. PVs backed by Azure Disk with Delete reclaim policy are deleted by the Azure CSI driver.
Remediation
Cordon and drain all nodes before deletion. Back up StatefulSet data. Ensure PVs use Retain reclaim policy if data must survive.
Example scenario
Deleting aks-prod terminates all 12 node pools, kills all 847 running pods, and permanently deletes 34 Azure Disk PVs.
AZURE_APP_SERVICE_DELETION
HIGH
Deleting App Service or Function App
Trigger
DELETE or REPLACE of azurerm_linux_web_app, azurerm_windows_web_app, azurerm_linux_function_app, azurerm_windows_function_app, or azurerm_function_app
Fires when
Always fires.
What breaks
Application taken offline immediately. Front Door origins return 502. Traffic Manager endpoints become unhealthy. VNet integration Swift connection removed. App Insights config becomes stale.
Why it happens
App Service deletion is immediate and irreversible - no soft-delete. All app settings, connection strings, and deployment history permanently deleted.
Remediation
Export app settings before deletion. Update DNS records, Front Door origins, and Traffic Manager endpoints.
Example scenario
Deleting api-webapp causes Front Door to return 502, Traffic Manager probes to fail, and App Insights to stop collecting telemetry.
AZURE_NAT_GATEWAY_DELETION
HIGH
Deleting NAT Gateway Removes Outbound Internet Connectivity
Trigger
DELETE or REPLACE of azurerm_nat_gateway
Fires when
One or more subnets have this NAT Gateway associated.
What breaks
All subnets using this NAT Gateway lose outbound internet connectivity. VMs, AKS pods, and App Services cannot reach external endpoints.
Why it happens
Azure allows NAT Gateway deletion even when subnets reference it. The subnet associations remain but the SNAT pool is gone - outbound traffic is silently dropped.
Remediation
Create a replacement NAT Gateway before deleting. Update all subnet associations.
Example scenario
Deleting nat-gw-prod disconnects AKS pods from Docker Hub and blocks Function Apps from calling external APIs.
AZURE_NSG_ASSOCIATION_REMOVAL
HIGH
Removing NSG-Subnet Association Silently Removes Traffic Filtering
Trigger
DELETE or REPLACE of azurerm_subnet_network_security_group_association
Fires when
Always fires.
What breaks
All custom traffic filtering rules removed immediately. Subnet reverts to Azure default rules - custom port restrictions, source IP blocks, and lateral movement controls silently lost. Apply succeeds with no error.
Why it happens
Azure allows NSG association deletion regardless of the subnet's active resources. The NSG itself is not deleted - only the binding. The subnet immediately uses default VNet rules.
Remediation
Ensure an alternative security boundary exists before removing the association. Verify default rules are sufficient for the subnet's workload.
Example scenario
Removing nsg-association-frontend silently removes all port restrictions. All VMs in the subnet become accessible on any port from within the VNet.
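Because this removal succeeds silently, a guard on the association resource is a cheap safety net. A sketch, with hypothetical resource names:

```hcl
resource "azurerm_subnet_network_security_group_association" "frontend" {
  subnet_id                 = azurerm_subnet.frontend.id
  network_security_group_id = azurerm_network_security_group.web.id

  lifecycle {
    # The NSG binding is the security boundary; make its removal an
    # explicit, deliberate act rather than a silent side effect.
    prevent_destroy = true
  }
}
```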
AZURE_ROUTE_TABLE_ASSOCIATION_REMOVAL
HIGH
Removing Route Table Association Silently Removes Custom Routes
Trigger
DELETE or REPLACE of azurerm_subnet_route_table_association
Fires when
Always fires. Severity is HIGH if the route table contains Virtual Appliance or VPN Gateway routes; MEDIUM otherwise.
What breaks
All custom routes removed. Subnet reverts to Azure system routes only. Traffic to NVAs, firewalls, or on-premises destinations silently lost. Apply succeeds with no error.
Why it happens
Azure allows route table association deletion without affecting the subnet's resources. The subnet immediately uses only Azure system routes.
Remediation
Verify the subnet can operate with system routes only before removing. For subnets routing through Azure Firewall, this removal bypasses the firewall entirely.
Example scenario
Removing route-table-association-app removes the default route to Azure Firewall. Outbound traffic bypasses the firewall and routes directly to the internet.
AZURE_NAT_ASSOCIATION_REMOVAL
HIGH
Removing Subnet-NAT Association Silently Removes Egress Connectivity
Trigger
DELETE or REPLACE of azurerm_subnet_nat_gateway_association
Fires when
Always fires.
What breaks
Subnet loses outbound internet connectivity. NAT Gateway still exists and incurs cost but no longer serves this subnet. VMs, AKS pods, and App Services lose egress access. Apply succeeds with no error.
Why it happens
Azure allows subnet-NAT association deletion independently of both resources. The association is the binding - removing it is equivalent to turning off egress for that subnet.
Remediation
If migrating to a different NAT Gateway, create the new association before removing the old one.
Example scenario
Removing nat-association-aks removes egress from subnet-aks. AKS pods can no longer pull images from Docker Hub.
AZURE_NAT_PIP_ASSOCIATION_REMOVAL
HIGH
Removing NAT Gateway-Public IP Association Removes All Egress (Silent)
Trigger
DELETE or REPLACE of azurerm_nat_gateway_public_ip_association
Fires when
Always fires.
What breaks
NAT Gateway loses its entire SNAT pool. ALL subnets using this NAT Gateway simultaneously lose outbound internet connectivity. NAT Gateway and Public IP still exist and still incur cost. Apply succeeds with no error.
Why it happens
The NAT-PIP association is the binding that attaches a public IP to a NAT Gateway. Without a public IP, the NAT Gateway has no SNAT pool and cannot perform outbound translation for any subnet.
Remediation
Attach a replacement public IP to the NAT Gateway before removing this association.
Example scenario
Removing nat-pip-association detaches the public IP from nat-gw-prod. All 3 subnets (vm, app, aks) simultaneously lose internet egress.
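Since the NAT-PIP binding is a single point of failure for all egress, it is worth guarding explicitly. A sketch with hypothetical names, managing the public IP and the association together:

```hcl
resource "azurerm_public_ip" "nat" {
  name                = "pip-nat-prod"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
  allocation_method   = "Static"
  sku                 = "Standard" # NAT Gateway requires a Standard SKU public IP
}

resource "azurerm_nat_gateway_public_ip_association" "prod" {
  nat_gateway_id       = azurerm_nat_gateway.prod.id
  public_ip_address_id = azurerm_public_ip.nat.id

  lifecycle {
    prevent_destroy = true # this binding is the entire SNAT pool
  }
}
```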
AZURE_VNET_INTEGRATION_REMOVAL
HIGH
Removing App Service VNet Integration Breaks Private Network Access
Trigger
DELETE or REPLACE of azurerm_app_service_virtual_network_swift_connection
Fires when
Always fires.
What breaks
App Service loses private network access to the VNet. SQL, Service Bus, Key Vault, and other VNet resources become unreachable via private endpoint. Services with public_network_access=Disabled become completely unreachable. Apply succeeds with no error.
Why it happens
The Swift connection manages the App Service's Regional VNet Integration. Deleting it removes the app from the VNet without deleting the app. The app continues running but loses all private connectivity.
Remediation
If migrating to a different subnet, create the new connection before removing the old one.
Example scenario
Removing swift-connection-webapp causes the web app to fail all connections to SQL, Service Bus, and Key Vault.
AZURE_PRIVATE_ENDPOINT_DELETION
HIGH
Deleting Private Endpoint Breaks Private Service Connectivity
Trigger
DELETE or REPLACE of azurerm_private_endpoint
Fires when
Always fires. Severity escalates to CRITICAL if the target service has public_network_access=Disabled.
What breaks
Private network connectivity to the target PaaS service removed. Resources in the VNet receive DNS resolution failures or connection refused. If public access disabled, service becomes completely unreachable from all clients.
Why it happens
Private endpoints provide the private IP for PaaS services inside a VNet. Deleting the PE removes the network path and the DNS A record. Without the PE and with public access disabled, there is no access path.
Remediation
Verify the target service has an alternative access path before deleting. Update connection strings to use the public FQDN if switching to public access.
Example scenario
Deleting pe-sql with SQL Server public_network_access=Disabled makes the SQL Server completely unreachable. All connections fail with connection timeout.
AZURE_DNS_VNET_LINK_DELETION
HIGH
Removing Private DNS VNet Link Breaks Private Hostname Resolution
Trigger
DELETE or REPLACE of azurerm_private_dns_zone_virtual_network_link
Fires when
Always fires.
What breaks
VNet loses DNS resolution for that private zone. Resources inside the VNet get NXDOMAIN when resolving private hostnames for PE-backed services. PE still exists but is unreachable by hostname. Apply succeeds with no error.
Why it happens
Private DNS VNet links register the VNet with Azure DNS to receive answers from a private zone. Without the link, Azure DNS does not return records from that zone to resources in the VNet.
Remediation
Recreate the VNet link immediately. Recreating the link is low-risk and restores DNS resolution as soon as any cached NXDOMAIN answers expire.
Example scenario
Removing dns-vnet-link-blob causes all VNet resources to get NXDOMAIN for <storage>.blob.core.windows.net, making storage unreachable even though the PE and storage account still exist.
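Recreating the link is a small, self-contained resource. A sketch, assuming hypothetical zone and VNet names:

```hcl
resource "azurerm_private_dns_zone_virtual_network_link" "blob" {
  name                  = "dns-vnet-link-blob"
  resource_group_name   = azurerm_resource_group.main.name
  private_dns_zone_name = azurerm_private_dns_zone.blob.name # e.g. privatelink.blob.core.windows.net
  virtual_network_id    = azurerm_virtual_network.main.id
  registration_enabled  = false # privatelink zones hold PE-managed records, not VM auto-registration
}
```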
AZURE_MANAGED_IDENTITY_DELETION
HIGH
Deleting Managed Identity Breaks Authentication for Assigned Resources
Trigger
DELETE or REPLACE of azurerm_user_assigned_identity
Fires when
One or more Web Apps, Function Apps, AKS clusters, or Container Apps have this identity assigned.
What breaks
All resources with this identity lose their authentication token for Key Vault, Storage, SQL, and other Azure services. Authentication errors surface only when those resources attempt to use the identity. Apply succeeds with no error.
Why it happens
Azure allows User-Assigned Managed Identity deletion even when assigned to resources. The assignment reference remains on those resources pointing to a deleted identity. The failure manifests as 401/AccessDenied when the resource tries to obtain a token.
Remediation
Create a replacement managed identity, assign it to all dependent resources, and grant the same RBAC roles before deleting the old identity.
Example scenario
Deleting managed-identity-prod causes web-app to fail all Key Vault secret retrievals and AKS to receive AccessDenied on ACR image pulls.
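The migration in the remediation can be sketched as follows; all names and the role choice are hypothetical. The replacement identity is created and granted roles before the old identity ever leaves the plan:

```hcl
resource "azurerm_user_assigned_identity" "app_v2" {
  name                = "managed-identity-prod-v2"
  location            = azurerm_resource_group.main.location
  resource_group_name = azurerm_resource_group.main.name
}

# Re-grant the same RBAC roles to the replacement identity first.
resource "azurerm_role_assignment" "kv_secrets" {
  scope                = azurerm_key_vault.prod.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = azurerm_user_assigned_identity.app_v2.principal_id
}

# Then point dependents at the new identity before deleting the old one.
resource "azurerm_linux_web_app" "web" {
  # ... existing configuration ...
  identity {
    type         = "UserAssigned"
    identity_ids = [azurerm_user_assigned_identity.app_v2.id]
  }
}
```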
AZURE_STORAGE_ACCOUNT_DELETION
HIGH
Deleting Azure Storage Account
Trigger
DELETE or REPLACE of azurerm_storage_account
Fires when
Always fires.
What breaks
All blobs, queues, tables, and file shares destroyed. Function Apps using this account for AzureWebJobsStorage fail to start. Private endpoints orphaned. ADF managed private endpoints lose their connection.
Why it happens
Storage account deletion is immediate. Function App runtime state is stored in Azure Storage - cold starts fail without it.
Remediation
Migrate all data before deletion. Update Function App AzureWebJobsStorage connection strings. Remove or update private endpoints and DNS records.
Example scenario
Deleting storage-prod makes all 3 blob containers inaccessible and causes func-payment-processor to fail on next cold start.
AZURE_COSMOSDB_DELETION
HIGH
Deleting Cosmos DB Account
Trigger
DELETE or REPLACE of azurerm_cosmosdb_account
Fires when
Always fires.
What breaks
All databases, containers, and data across all geo-replicated regions permanently destroyed. Private endpoints orphaned. Private DNS zone for Cosmos resolves to dead endpoint.
Why it happens
Cosmos DB deletion cascades to all child databases and containers across all replicated regions.
Remediation
Enable continuous backup before deletion. Remove geo-replication before deleting the primary.
Example scenario
Deleting cosmos-orders destroys 5 years of order history across 3 geo-replicated regions simultaneously.
AZURE_SERVICEBUS_DELETION
HIGH
Deleting Service Bus Namespace
Trigger
DELETE or REPLACE of azurerm_servicebus_namespace
Fires when
Always fires.
What breaks
All queues, topics, and subscriptions permanently destroyed. All in-flight and unprocessed messages permanently lost. Applications receive connection errors immediately.
Why it happens
Service Bus namespace deletion is immediate and permanent. There is no soft-delete at the namespace level. Dead-letter queue messages are also deleted.
Remediation
Drain all queues before deletion. Migrate consumers to a new namespace. Update connection strings in all consuming applications.
Example scenario
Deleting sb-payments destroys order-queue with 2,847 unprocessed messages. payment-processor immediately starts failing with MessagingEntityNotFoundException.
AZURE_APIM_DELETION
HIGH
Deleting API Management Service
Trigger
DELETE or REPLACE of azurerm_api_management
Fires when
Always fires.
What breaks
All published APIs unavailable (404/502). API consumers receive connection errors. OAuth2 token endpoints, developer portal, and subscription keys invalidated.
Why it happens
APIM deletion removes the API gateway permanently. For Developer/Standard/Premium tiers, deletion takes 30-60 minutes.
Remediation
Migrate API consumers to direct backend URLs before deletion. Export API definitions. Update DNS records.
Example scenario
Deleting apim-prod (hosting 47 APIs) immediately returns 404 to all API consumers and invalidates all subscription keys.
AZURE_DNS_ZONE_DELETION
HIGH
Deleting Azure DNS Zone
Trigger
DELETE or REPLACE of azurerm_dns_zone or azurerm_private_dns_zone
Fires when
Always fires.
What breaks
DNS resolution breaks for all hostnames in the zone. For private DNS zones: services with public_network_access=Disabled become completely unreachable from within the VNet.
Why it happens
Private DNS zones are the resolution mechanism for private endpoints. Deleting the zone removes all A records and VNet links. Clients get NXDOMAIN when resolving private service hostnames.
Remediation
Recreate the zone and VNet links immediately. For service-specific zones (privatelink.*), Azure automatically re-registers PE records when the zone exists.
Example scenario
Deleting privatelink.blob.core.windows.net zone causes all blob storage connections via private hostname to fail with NXDOMAIN.
AZURE_PUBLIC_IP_ORPHANED_APPGW
LOW
Application Gateway Deletion Leaves Public IP Orphaned
Trigger
DELETE or REPLACE of azurerm_application_gateway
Fires when
The Application Gateway has a separately-managed Public IP resource as its frontend IP.
What breaks
Public IP remains allocated and incurs cost (~$5/month for Standard SKU) but is no longer attached to anything.
Why it happens
Azure does not automatically delete a Public IP when the resource using it is deleted. The PIP is a separate ARM resource with its own lifecycle.
Remediation
Include the Public IP resource in the same Terraform destruction plan as the Application Gateway.
Example scenario
Deleting appgw-prod leaves pip-appgw-prod allocated and billing at $5/month indefinitely.
AZURE_PUBLIC_IP_ORPHANED_BASTION
LOW
Bastion Host Deletion Leaves Public IP Orphaned
Trigger
DELETE or REPLACE of azurerm_bastion_host
Fires when
The Bastion host has a separately-managed Public IP resource.
What breaks
Dedicated Bastion Public IP remains allocated and incurs cost but is no longer attached.
Why it happens
Azure requires a Standard SKU Public IP for Bastion. That PIP is a separate ARM resource and must be deleted explicitly.
Remediation
Include the Bastion Public IP in the same Terraform destruction plan as the Bastion host.
Example scenario
Deleting bastion-prod leaves pip-bastion-prod orphaned at ~$5/month.
AZURE_PUBLIC_IP_ORPHANED_NIC
LOW
NIC Deletion Leaves Associated Public IP Orphaned
Trigger
DELETE or REPLACE of azurerm_network_interface
Fires when
The NIC has a separately-managed Public IP resource associated with it.
What breaks
Public IP remains allocated and incurs cost but is detached.
Why it happens
When a NIC is deleted, Azure removes the NIC resource but does not cascade-delete any separately-managed Public IP resources associated with it.
Remediation
Include the Public IP in the same Terraform destruction plan as the NIC.
Example scenario
Deleting nic-vm-prod leaves pip-vm-prod orphaned.
AZURE_METRIC_ALERT_ORPHANED
LOW
VM or AKS Deletion Leaves Metric Alert Rules Orphaned
Trigger
DELETE or REPLACE of azurerm_linux_virtual_machine or azurerm_kubernetes_cluster
Fires when
Monitor Metric Alert rules exist that target this resource.
What breaks
Alert rules still exist and incur monitoring cost. They either never fire (resource no longer emits metrics) or fire with a ResourceNotFound evaluation. Action Group notifications sent in wrong context.
Why it happens
Azure allows deletion of the monitored resource without cleaning up alert rules.
Remediation
Delete the orphaned alert rules in the same Terraform plan as the resource being deleted.
Example scenario
Deleting vm-prod leaves cpu-alert-vm-prod and memory-alert-vm-prod orphaned, each billing at ~$0.10/month per metric.
AZURE_DNS_RECORD_ORPHANED
LOW
VM or Load Balancer Deletion Leaves Private DNS A Records Stale
Trigger
DELETE or REPLACE of azurerm_linux_virtual_machine, azurerm_windows_virtual_machine, or azurerm_lb
Fires when
Private DNS A records exist whose name matches the deleted resource.
What breaks
DNS A records point to now-deallocated IPs. Applications using hostname-based connections receive connection refused. IP reuse may cause silent misrouting.
Why it happens
Azure does not automatically remove private DNS records when the resource they reference is deleted.
Remediation
Delete the DNS A records in the same Terraform plan as the resource.
Example scenario
Deleting vm-prod leaves A record vm.lab.internal pointing to the VM's former IP. If that IP is reassigned, internal services silently connect to the wrong host.
AZURE_VM_SKU_DOWNGRADE
MEDIUM
Azure VM Size Downgrade
Trigger
UPDATE of azurerm_linux_virtual_machine or azurerm_windows_virtual_machine with a smaller vm_size
Fires when
The new VM size is a lower tier than the current size.
What breaks
VM is resized, causing a reboot. Available vCPUs and memory reduced. Workloads may experience increased latency, OOM errors, or failure under peak load.
Why it happens
VM resize in Azure requires deallocation and reallocation in most cases, causing a cold reboot.
Remediation
Load test the workload at the target VM size in staging before applying. Schedule the resize during a maintenance window.
Example scenario
Downgrading vm-prod from Standard_D8s_v5 to Standard_D4s_v5 causes a reboot and may cause OOM kills under peak database query load.
AZURE_SQL_BACKUP_RETENTION_REDUCTION
MEDIUM
Azure SQL Backup Retention Period Reduced
Trigger
UPDATE of azurerm_mssql_database with a shorter short_term_retention_policy.retention_days
Fires when
The new retention value is strictly less than the current value.
What breaks
Recovery window shortened. Azure immediately purges backup files older than the new retention period. Point-in-time restore no longer possible beyond the new window.
Why it happens
Azure SQL automated backups are purged immediately when they fall outside the retention window.
Remediation
Evaluate RPO requirements before reducing retention. Consider long-term retention (LTR) policies.
Example scenario
Reducing backup retention from 35 days to 1 day means a corruption event discovered 3 days later cannot be recovered.
AZURE_AKS_NODE_POOL_SCALE_DOWN
MEDIUM
AKS Node Pool Count Reduced
Trigger
UPDATE of azurerm_kubernetes_cluster or azurerm_kubernetes_cluster_node_pool with a lower node_count
Fires when
The new node count is strictly less than the current node count.
What breaks
Kubernetes evicts pods from removed nodes. Workloads may be temporarily unavailable. If remaining nodes lack capacity, pods enter Pending state.
Why it happens
AKS scale-down cordons and drains nodes. If PodDisruptionBudgets are violated or resources are insufficient on remaining nodes, pods cannot reschedule.
Remediation
Check resource utilisation before scaling down. Verify PodDisruptionBudgets allow eviction. Scale down gradually.
Example scenario
Scaling aks-prod from 5 to 2 nodes evicts 28 pods. 6 pods cannot reschedule because remaining nodes are at 95% CPU.
AZURE_MONITOR_ALERT_DELETION
MEDIUM
Deleting Azure Monitor Alert Rule
Trigger
DELETE or REPLACE of azurerm_monitor_metric_alert
Fires when
Always fires.
What breaks
Monitoring coverage for the targeted resource is removed. Incidents go undetected until the alert is recreated.
Why it happens
Azure Monitor alerts are the primary operational alerting mechanism. Deleting an alert rule permanently removes its evaluation logic.
Remediation
Export alert configuration before deletion. Ensure alternative alerting coverage exists.
Example scenario
Deleting vm-cpu-alert means the operations team has no notification when vm-prod CPU exceeds 90%.
AZURE_VNET_INTEGRATION_LIFECYCLE_RISK
MEDIUM
App Service VNet Integration Missing ignore_changes (Perpetual Drift Risk)
Trigger
UPDATE of azurerm_linux_web_app or any function app with virtual_network_subnet_id changing
Fires when
The App Service has an active VNet integration in live state AND virtual_network_subnet_id is changing in this UPDATE.
What breaks
Without lifecycle { ignore_changes = [virtual_network_subnet_id] } on the web app resource, Terraform generates a perpetual diff - an UPDATE is produced on every plan, causing unnecessary App Service restarts on every apply and intermittent VNet integration outages.
Why it happens
When azurerm_app_service_virtual_network_swift_connection manages VNet integration, it writes virtual_network_subnet_id on the web app via a separate ARM API call. Terraform sees this as drift on the web app resource and generates an UPDATE on every plan.
Remediation
Add lifecycle { ignore_changes = [virtual_network_subnet_id] } to the azurerm_linux_web_app or azurerm_linux_function_app resource to let the Swift connection resource own that attribute.
Example scenario
Every CI/CD pipeline run generates an UPDATE for api-webapp touching virtual_network_subnet_id, restarting the app unnecessarily.
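The remediation above can be sketched in full. The `lifecycle` block is the fix the rule describes; the surrounding resource arguments and names are hypothetical:

```hcl
resource "azurerm_linux_web_app" "api" {
  name                = "api-webapp"
  resource_group_name = azurerm_resource_group.main.name
  location            = azurerm_resource_group.main.location
  service_plan_id     = azurerm_service_plan.main.id
  site_config {}

  lifecycle {
    # The Swift connection resource writes this attribute out-of-band;
    # ignoring it here stops the perpetual diff and the restarts on apply.
    ignore_changes = [virtual_network_subnet_id]
  }
}
```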
AZURE_SUBNET_DELEGATION_CONFLICT
CRITICAL
Changing Subnet Delegation With Active Delegated Resources (Apply Will Fail)
Trigger
UPDATE or REPLACE of azurerm_subnet with a changed delegation block
Fires when
The delegation service_name changes (added, removed, or changed) AND active resources using the current delegation are present in the subnet.
What breaks
Azure blocks in-place delegation changes with SubnetDelegationConflict. Apply fails immediately.
Why it happens
Azure subnet delegations are all-or-nothing and enforced at the ARM layer. A subnet with an active Container App Environment cannot have its Microsoft.App/environments delegation changed while the environment is deployed.
Remediation
Delete all resources using the current delegation first. Apply the delegation change. Redeploy resources with the new delegation. There is no in-place migration path.
Example scenario
Changing delegation from Microsoft.App/environments to Microsoft.Web/serverFarms while a Container App Environment is deployed in the subnet fails with SubnetDelegationConflict.
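For reference, the delegation lives as a nested block on the subnet, so a `service_delegation.name` change shows up as a subnet UPDATE in the plan. A sketch with hypothetical names:

```hcl
resource "azurerm_subnet" "apps" {
  name                 = "snet-apps"
  resource_group_name  = azurerm_resource_group.main.name
  virtual_network_name = azurerm_virtual_network.main.name
  address_prefixes     = ["10.0.2.0/23"]

  delegation {
    name = "containerapps"
    service_delegation {
      # Changing this while a Container App Environment is deployed in the
      # subnet fails with SubnetDelegationConflict; delete the environment first.
      name = "Microsoft.App/environments"
    }
  }
}
```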
AZURE_STORAGE_VERSIONING_DISABLED
LOW
Azure Storage Blob Versioning Disabled
Trigger
UPDATE of azurerm_storage_account with blob_properties.versioning_enabled changing from true to false
Fires when
Blob versioning was previously enabled.
What breaks
Future overwrites or deletions of blobs cannot be recovered from previous versions.
Why it happens
Azure Blob versioning maintains previous versions of each blob when overwritten or deleted. Disabling it means any future accidental write or delete is permanent.
Remediation
Ensure alternative data recovery mechanisms exist before disabling versioning.
Example scenario
Disabling versioning on storage-artifacts means the next accidental overwrite of a deployment package is permanent.
AZURE_PRODUCTION_TAG_REMOVAL
LOW
Production Tag Removed From Azure Resource
Trigger
UPDATE of any Azure resource removing a tag with key env/environment and value prod/production/prd
Fires when
The resource had a production environment tag before the change and does not after.
What breaks
Azure Policy rules using environment tags may stop applying. Cost allocation reports misclassify the resource.
Why it happens
Azure Policy enforcement, cost management grouping, and monitoring dashboards use environment tags as their primary resource selection criterion.
Remediation
Ensure the tag change is intentional. Review Azure Policy assignments and cost allocation configurations.
Example scenario
Removing Environment=production from sql-server-prod causes Azure Policy to stop enforcing the deny-public-access rule.
AWS Risk Rules
23 rules
IAM_ROLE_DELETION_BREAKS_EKS
CRITICAL
Deleting IAM Role Used By EKS NodeGroup
Trigger
DELETE or REPLACE of aws_iam_role
Fires when
An EKS node group uses this IAM role as its node instance role.
What breaks
EKS nodes lose IAM permissions. New nodes cannot register with the cluster, pull from ECR, write to CloudWatch, or access other AWS services.
Why it happens
AWS allows IAM role deletion even when EKS node groups reference it. On the next node replacement, nodes attempt to assume the deleted role ARN and receive AccessDenied.
Remediation
Create and attach a replacement IAM role to the node group before deleting the old role.
Example scenario
Deleting eks-node-role allows current nodes to continue but every new node after the next cluster upgrade fails with AccessDenied on AssumeRole.
ECS_ROLE_DELETION
CRITICAL
Deleting IAM Role Used By ECS Service
Trigger
DELETE or REPLACE of aws_iam_role
Fires when
An ECS service uses this IAM role as its task execution role or task role.
What breaks
ECS services fail to launch new tasks. Cannot pull images from ECR, retrieve secrets, write logs, or access other AWS services.
Why it happens
The ECS task execution role is required at task launch time. AWS allows the role to be deleted while tasks reference it - the next launch fails with AccessDenied on AssumeRole.
Remediation
Update the ECS task definition to reference a replacement role before deleting. Force a new deployment after updating.
Example scenario
Deleting ecs-task-role allows existing tasks to continue but the next auto-scaling event fails to launch new tasks.
ACM_CERT_DELETION
CRITICAL
Deleting ACM Certificate In Use By Load Balancer
Trigger
DELETE or REPLACE of aws_acm_certificate
Fires when
An Application Load Balancer HTTPS listener references this certificate.
What breaks
HTTPS listeners on the Load Balancer fail. All TLS clients receive certificate errors or connection failures.
Why it happens
AWS ACM prevents deletion of an in-use certificate with ResourceInUseException. If the Terraform plan also removes the listener, the deletion may succeed and break HTTPS.
Remediation
Provision and validate a replacement certificate first. Update the listener, then delete the old certificate.
Example scenario
Deleting api-cert while alb-prod has an HTTPS listener using it fails with ResourceInUseException unless the listener is also removed.
VPC_DELETION_BREAKS_INFRA
CRITICAL
Deleting VPC With Active Infrastructure
Trigger
DELETE or REPLACE of aws_vpc
Fires when
Always fires.
What breaks
All networking for every resource inside the VPC is destroyed. Subnets, EC2 instances, EKS clusters, RDS databases, and load balancers lose connectivity.
Why it happens
AWS blocks VPC deletion if any resources exist inside it. If those resources are also in the deletion plan, AWS processes them in dependency order.
Remediation
Delete all resources inside the VPC before, or in the same Terraform plan as, the VPC itself so Terraform can order the deletions.
Example scenario
Deleting vpc-prod while eks-cluster, rds-db, and alb-frontend exist fails with DependencyViolation.
KMS_KEY_DELETION
CRITICAL
Deleting KMS Key With Dependent Resources
Trigger
DELETE or REPLACE of aws_kms_key
Fires when
Always fires.
What breaks
All data encrypted with this key becomes permanently inaccessible after the deletion waiting period - RDS snapshots, S3 objects with SSE-KMS, Secrets Manager values, EBS volumes.
Why it happens
AWS KMS schedules key deletion with a 7-30 day waiting period. After deletion the key material is destroyed and all ciphertext is permanently unreadable.
Remediation
Before scheduling deletion, rotate all encrypted resources to a new key. Take final RDS snapshots before the deletion window expires.
Example scenario
Scheduling kms-prod for deletion starts a 7-day countdown. All RDS snapshots and S3 objects encrypted with it become permanently unreadable after 7 days.
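When a key must eventually be retired, the deletion window is the only safety net. A sketch (names hypothetical) showing the maximum window:

```hcl
resource "aws_kms_key" "prod" {
  description         = "prod data key"
  enable_key_rotation = true

  # Maximum waiting period: 30 days to cancel with
  # `aws kms cancel-key-deletion` if the deletion turns out to be a mistake.
  deletion_window_in_days = 30
}
```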
RDS_DELETION_NO_SNAPSHOT
CRITICAL
Deleting RDS Instance Without Final Snapshot
Trigger
DELETE or REPLACE of aws_db_instance
Fires when
Always fires. Severity is CRITICAL if skip_final_snapshot=true, HIGH otherwise.
What breaks
RDS database and all its data permanently destroyed. With skip_final_snapshot=true, no backup is taken and data is immediately unrecoverable.
Why it happens
RDS deletion with skip_final_snapshot=true immediately destroys the database with no recovery option. Automated backups are also deleted with the instance.
Remediation
Always set skip_final_snapshot=false and specify a final_snapshot_identifier in production.
Example scenario
Deleting rds-prod with skip_final_snapshot=true permanently destroys the production database.
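A sketch of the safe production settings (other required `aws_db_instance` arguments omitted; names hypothetical):

```hcl
resource "aws_db_instance" "prod" {
  identifier     = "rds-prod"
  engine         = "postgres"
  instance_class = "db.r5.large"

  # Take a named final snapshot on destroy instead of losing everything.
  skip_final_snapshot       = false
  final_snapshot_identifier = "rds-prod-final"

  # Belt and braces: make destroy plans fail outright.
  deletion_protection = true
  lifecycle {
    prevent_destroy = true
  }
}
```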
SECURITY_GROUP_DELETION_BREAKS_EC2
HIGH
Deleting Security Group Used By EC2
Trigger
DELETE or REPLACE of aws_security_group
Fires when
One or more EC2 instances, Lambda functions, or RDS databases have this security group attached.
What breaks
AWS blocks deletion of a security group in use with DependencyViolation. Apply fails immediately.
Why it happens
AWS enforces security group deletion constraints at the API level.
Remediation
Remove the security group from all associated resources before deleting. Replace it with an alternative first.
Example scenario
Deleting sg-web while ec2-app has it as primary security group fails with DependencyViolation.
SUBNET_DELETION_BREAKS_CLUSTER
CRITICAL
Deleting Subnet Used By EKS Cluster
Trigger
DELETE or REPLACE of aws_subnet
Fires when
An EKS cluster or node group is deployed in this subnet.
What breaks
AWS blocks subnet deletion while EKS resources reference it with DependencyViolation.
Why it happens
AWS enforces subnet deletion constraints - a subnet cannot be removed while any network interfaces exist inside it.
Remediation
Delete the EKS cluster or remove node groups from this subnet before deleting.
Example scenario
Deleting subnet-private while eks-prod has its node group in it fails with DependencyViolation.
LOADBALANCER_SUBNET_DELETION
HIGH
Deleting Subnet Used By Load Balancer
Trigger
DELETE or REPLACE of aws_subnet
Fires when
An Application or Network Load Balancer has nodes in this subnet.
What breaks
The Load Balancer loses AZ coverage. If this is its only AZ, the apply fails; otherwise redundancy is reduced.
Why it happens
AWS ALBs require subnets in at least two Availability Zones. Removing a subnet the ALB is using may succeed if enough subnets remain, but reduces AZ coverage.
Remediation
Add the LB to a replacement subnet before removing it from the target subnet.
Example scenario
Removing subnet-us-east-1a from alb-prod reduces AZ coverage and increases single-AZ failure risk.
TARGETGROUP_INSTANCE_REMOVAL
HIGH
Removing EC2 Instance Used By Target Group
Trigger
DELETE or REPLACE of aws_instance
Fires when
The EC2 instance is registered as a target in a Load Balancer Target Group.
What breaks
If this is the only healthy target, the load balancer returns 502/503 to all requests.
Why it happens
AWS allows EC2 termination regardless of load balancer registration. The health check detects the missing target and removes it from rotation.
Remediation
Deregister the instance from the target group before terminating. Ensure sufficient remaining targets.
Example scenario
Terminating ec2-app-1 while tg-prod has it as its only registered target causes alb-frontend to return 503 to all users.
NAT_GATEWAY_DELETION
HIGH
Deleting NAT Gateway Used By Private Subnets
Trigger
DELETE or REPLACE of aws_nat_gateway
Fires when
One or more route tables have a default route pointing to this NAT Gateway.
What breaks
Private subnets lose outbound internet access. Lambda functions, ECS tasks, and EC2 instances cannot reach external endpoints.
Why it happens
AWS allows NAT Gateway deletion even when route tables reference it. The route table entry becomes a blackhole - traffic is silently dropped.
Remediation
Create a replacement NAT Gateway and update route table entries before deleting.
Example scenario
Deleting nat-prod blackholes the 0.0.0.0/0 route, blocking all Lambda functions and ECS tasks from reaching external APIs.
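The safe ordering can be expressed in one plan: create the replacement gateway and repoint the default route before the old gateway is removed. A sketch with hypothetical names:

```hcl
# Replacement NAT gateway in a public subnet.
resource "aws_nat_gateway" "replacement" {
  allocation_id = aws_eip.nat.id
  subnet_id     = aws_subnet.public.id
}

# Default route for the private subnets now targets the replacement,
# so traffic moves before the old gateway is deleted.
resource "aws_route" "private_default" {
  route_table_id         = aws_route_table.private.id
  destination_cidr_block = "0.0.0.0/0"
  nat_gateway_id         = aws_nat_gateway.replacement.id
}
```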
S3_BUCKET_DELETION
HIGH
Deleting S3 Bucket With Dependent Resources
Trigger
DELETE or REPLACE of aws_s3_bucket
Fires when
Always fires. Severity upgrades to CRITICAL if Lambda functions reference this bucket for deployment packages.
What breaks
All objects permanently destroyed (if force_destroy=true). Lambda functions using this bucket fail on cold starts. Applications get NoSuchBucket errors.
Why it happens
S3 bucket deletion with force_destroy=true immediately deletes all objects. Lambda cold starts re-fetch deployment packages from S3 and fail if the bucket is gone.
Remediation
Migrate all objects to a new bucket. Update Lambda function deployment packages.
Example scenario
Deleting s3-deployments while lambda-api references it causes the Lambda to fail on the next cold start.
ELASTICACHE_DELETION
HIGH
Deleting ElastiCache Cluster
Trigger
DELETE or REPLACE of aws_elasticache_cluster or aws_elasticache_replication_group
Fires when
Always fires.
What breaks
Applications lose their cache layer. Session-based applications log out all active users. Applications fall back to primary database causing increased query load.
Why it happens
ElastiCache deletion is immediate. All cached data is lost - Redis does not persist data unless configured with AOF or RDB snapshots.
Remediation
Provision a replacement cluster before deleting. Plan for user session invalidation during the migration window.
Example scenario
Deleting elasticache-sessions invalidates all 12,000 active user sessions and causes a 10x spike in database connections.
SQS_QUEUE_DELETION
HIGH
Deleting SQS Queue
Trigger
DELETE or REPLACE of aws_sqs_queue
Fires when
Always fires.
What breaks
All in-flight and queued messages permanently lost. Lambda event source mappings stop processing. Consumer applications receive QueueDoesNotExist errors.
Why it happens
SQS queue deletion is permanent and immediate. AWS enforces a 60-second delay before allowing recreation with the same name.
Remediation
Drain the queue before deletion. Update all Lambda event source mappings and consumer applications.
Example scenario
Deleting order-queue while 3,200 unprocessed orders are in-flight permanently loses all those messages.
ROUTE53_RECORD_DELETION
HIGH
Deleting Route53 Record Pointing To Live Resource
Trigger
DELETE or REPLACE of aws_route53_record
Fires when
Always fires.
What breaks
DNS resolution fails for the hostname. All clients receive NXDOMAIN. Traffic stops routing to the target resource within 60 seconds.
Why it happens
Route53 changes propagate globally within 60 seconds.
Remediation
Reduce the record's TTL to 60 seconds at least 24 hours in advance. Only delete records after confirming no traffic resolves through them.
Example scenario
Deleting api.example.com A record causes all API clients to receive NXDOMAIN within 60 seconds.
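Lowering the TTL ahead of time is a one-line change on the record itself. A sketch with hypothetical zone and IP values:

```hcl
resource "aws_route53_record" "api" {
  zone_id = aws_route53_zone.main.zone_id
  name    = "api.example.com"
  type    = "A"
  # Lower the TTL at least 24 hours before any planned deletion so
  # cached answers expire quickly once the record is gone.
  ttl     = 60
  records = ["203.0.113.10"]
}
```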
CLOUDFRONT_ORIGIN_DELETION
HIGH
Deleting CloudFront Origin Resource
Trigger
DELETE or REPLACE of aws_cloudfront_distribution or origin backing resource
Fires when
A CloudFront distribution depends on the deleted resource as its origin.
What breaks
CloudFront returns 502/503 for all requests routing to this origin.
Why it happens
Deleting the backing origin (S3 bucket, ALB, custom origin) makes CloudFront unable to forward requests.
Remediation
Update the CloudFront distribution to use a replacement origin before deleting the original.
Example scenario
Deleting s3-static-assets while cloudfront-prod uses it as its S3 origin causes all static asset requests to return 502.
INSTANCE_TYPE_DOWNGRADE
MEDIUM
EC2 or RDS Instance Type Downgrade
Trigger
UPDATE of aws_instance or aws_db_instance with a smaller instance_type or instance_class
Fires when
The new instance type is a lower tier than the current type.
What breaks
The instance is stopped and restarted. Available CPU and memory are reduced. Workloads may experience increased latency or failures under peak load.
Why it happens
EC2 and RDS instance type changes require a reboot (EC2) or maintenance window restart (RDS).
Remediation
Load test at the target type in staging. Schedule during off-peak hours.
Example scenario
Downgrading rds-prod from db.r5.2xlarge to db.r5.large causes a restart and elevated query latency under peak load.
RDS_BACKUP_RETENTION_REDUCTION
HIGH
RDS Backup Retention Period Reduced
Trigger
UPDATE of aws_db_instance or aws_rds_cluster with a lower backup_retention_period
Fires when
New retention value is lower than current. Severity is HIGH if set to 0 (disables backups), MEDIUM for any reduction.
What breaks
Recovery window shortened. Setting to 0 disables automated backups entirely, making point-in-time recovery impossible.
Why it happens
RDS automated backups are purged immediately when they fall outside the retention window.
Remediation
Evaluate RPO requirements. Consider snapshot-based backups before disabling automated backups.
Example scenario
Reducing backup retention from 35 days to 0 disables point-in-time recovery entirely.
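One way to preserve the old recovery window before reducing retention is a manual snapshot, which is not purged when the automated window shrinks. A sketch with hypothetical names:

```hcl
# Manual snapshots survive retention changes and instance deletion.
resource "aws_db_snapshot" "pre_retention_change" {
  db_instance_identifier = aws_db_instance.prod.identifier
  db_snapshot_identifier = "rds-prod-pre-retention-change"
}
```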
AUTOSCALING_POLICY_REMOVAL
MEDIUM
Auto Scaling Policy Removed
Trigger
DELETE or REPLACE of aws_autoscaling_policy or aws_appautoscaling_policy
Fires when
Always fires.
What breaks
Target resource stops automatically scaling. Under traffic spikes, the resource remains at current capacity.
Why it happens
Removing a scaling policy does not affect current capacity but means the next spike will not trigger scale-out.
Remediation
Verify current capacity is sufficient for expected peak load before removing.
Example scenario
Removing ecs-cpu-scaling causes api-service to stay at 2 tasks during a spike, causing 503 errors.
CLOUDWATCH_ALARM_DELETION
MEDIUM
CloudWatch Alarm Deleted
Trigger
DELETE or REPLACE of aws_cloudwatch_metric_alarm
Fires when
Always fires.
What breaks
Monitoring coverage for the targeted metric is removed. Incidents may go undetected.
Why it happens
CloudWatch alarms are the primary operational alerting mechanism for AWS resources.
Remediation
Export alarm configuration before deletion. Ensure alternative monitoring coverage exists.
Example scenario
Deleting rds-cpu-alarm means the DBA team receives no notification when rds-prod CPU exceeds 90%.
S3_VERSIONING_DISABLED
LOW
S3 Bucket Versioning Disabled
Trigger
UPDATE of aws_s3_bucket or aws_s3_bucket_versioning disabling versioning
Fires when
Versioning was previously enabled.
What breaks
Future overwrites or deletions of objects cannot be recovered from previous versions.
Why it happens
Once enabled, S3 versioning can only be suspended, not fully disabled. While suspended, new writes create a single null version with no history.
Remediation
Consider Glacier lifecycle policies or Object Lock as cost alternatives. Never suspend versioning on buckets containing critical artifacts.
Example scenario
Disabling versioning on s3-deployments means the next accidental overwrite of a Lambda deployment package is permanent.
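With the dedicated versioning resource, the state is explicit in the plan, which makes an accidental suspension easy to spot in review. A sketch with hypothetical names:

```hcl
resource "aws_s3_bucket_versioning" "deployments" {
  bucket = aws_s3_bucket.deployments.id

  versioning_configuration {
    # Once enabled, the only other valid value is "Suspended", which
    # keeps existing versions but stops recording new ones.
    status = "Enabled"
  }
}
```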
PRODUCTION_TAG_REMOVAL
LOW
Production or Compliance Tags Removed
Trigger
UPDATE of any AWS resource removing tags with keys environment/env/team/owner
Fires when
Production or compliance tags were present before the change and absent after.
What breaks
Cost allocation reports misclassify the resource. AWS Config rules and SCPs using tag conditions may stop applying.
Why it happens
AWS tagging is the foundation for cost allocation, compliance enforcement via Config, and resource ownership tracking.
Remediation
Review why the tag is being removed. Update dependent cost reports, Config rules, and SCP conditions.
Example scenario
Removing Environment=production from rds-prod causes Cost Explorer to misclassify it as non-production.
Cross-Cloud Rules
Applied to both clouds
DELETE_WITH_DEPENDENTS
HIGH
Delete Resource With Dependents
Trigger
DELETE or REPLACE of any resource with dependent resources in the live infrastructure graph
Fires when
The resource has downstream dependents in the discovered graph that are not covered by a more specific rule.
What breaks
Dependent infrastructure loses its connection to the deleted resource. Exact impact depends on the specific resource type and its dependents.
Why it happens
This is the catch-all rule that fires when a resource has graph dependents but no dedicated rule exists for it yet. It provides a general blast-radius signal.
Remediation
Examine the impacted resources listed in the finding. Determine whether each dependent can continue operating without the deleted resource.
Example scenario
Deleting a custom Terraform module resource that has downstream graph dependents triggers this catch-all finding.