Governance Policies Cost Scaling - Azure/az-prototype GitHub Wiki
Governance policies for Scaling
Domain: cost
| Name | Description |
|---|---|
| Environment-aware autoscale configuration | Dev/POC uses aggressive scale-down with low maximums; production uses higher minimums with zone-redundant capacity |
| Spot VM cost optimization | Use spot VMs for interruptible workloads in dev/POC to achieve up to 90% cost savings |

| Anti-pattern | Instead |
|---|---|
| Do not deploy compute resources with fixed instance counts | Configure autoscale with appropriate min/max and metrics-based scaling rules |
| Do not use the same scale configuration for dev and production | Use lower minimums, lower maximums, and scale-to-zero where possible in dev |
| Do not scale on a single metric | Use CPU as the primary trigger; add memory, queue depth, or HTTP connections as secondary triggers |
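
The three anti-patterns above can be checked mechanically. The following is a minimal sketch, not part of the policy itself: `ScaleConfig` and `find_violations` are hypothetical names, and the simplified fields stand in for whatever the real Terraform or Bicep resources expose.

```python
from dataclasses import dataclass, field


@dataclass
class ScaleConfig:
    """Hypothetical, simplified view of one environment's scale settings."""
    environment: str                                   # e.g. "dev" or "prod"
    min_instances: int
    max_instances: int
    metrics: list[str] = field(default_factory=list)   # e.g. ["cpu", "memory"]


def find_violations(dev: ScaleConfig, prod: ScaleConfig) -> list[str]:
    """Return one human-readable finding per anti-pattern detected."""
    findings = []
    for cfg in (dev, prod):
        # Anti-pattern: min == max is effectively a fixed instance count.
        if cfg.min_instances == cfg.max_instances:
            findings.append(f"{cfg.environment}: fixed instance count")
        # Anti-pattern: fewer than two distinct metrics drive scaling.
        if len(set(cfg.metrics)) < 2:
            findings.append(f"{cfg.environment}: fewer than two scaling metrics")
    # Anti-pattern: dev and production share the same bounds.
    if (dev.min_instances, dev.max_instances) == (prod.min_instances, prod.max_instances):
        findings.append("dev and prod share the same scale configuration")
    return findings
```

A dev configuration of 0 to 2 instances scaling on CPU and queue depth, paired with a production configuration of 2 to 10 instances scaling on CPU and memory, yields no findings.
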
- Azure Autoscale overview
- Container Apps scaling
- AKS cluster autoscaler
- Cosmos DB autoscale throughput
- SQL elastic pools
| Check | Severity | Description |
|---|---|---|
| WAF-COST-SCALE-001 | Required | Configure App Service autoscale with CPU-based rules — scale out at >70%, scale in at <30%, with cooldown periods |
| WAF-COST-SCALE-002 | Required | Configure Container Apps scaling rules with appropriate min/max replicas and HTTP/custom scaling triggers |
| WAF-COST-SCALE-003 | Required | Configure VMSS autoscale profiles with CPU-based rules and scheduled profiles for predictable workloads |
| WAF-COST-SCALE-004 | Required | Configure database autoscale — Cosmos DB autoscale maxThroughput for production, SQL elastic pools for multi-database workloads |
| WAF-COST-SCALE-005 | Required | Configure AKS cluster autoscaler with appropriate node pool settings — spot nodes for dev, on-demand for production |
WAF-COST-SCALE-001: Configure App Service autoscale with CPU-based rules: scale out at >70%, scale in at <30%, with cooldown periods
Severity: Required
Rationale: Autoscale prevents both over-provisioning (cost waste) and under-provisioning (performance degradation). Cooldown periods prevent flapping, that is, rapid alternation between scale-out and scale-in.
Agents: terraform-agent, bicep-agent, cloud-architect, cost-analyst
- Microsoft.Web/sites
- Microsoft.App/containerApps
- Microsoft.Compute/virtualMachines
- Microsoft.Compute/virtualMachineScaleSets
- Microsoft.DocumentDB/databaseAccounts
- Microsoft.Sql/servers/databases
- Microsoft.ContainerService/managedClusters
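
The rule in WAF-COST-SCALE-001 (scale out above 70% CPU, scale in below 30%, with a cooldown) can be illustrated with a small simulation. This is a sketch of the decision logic only, not how Azure Monitor autoscale is implemented. The 300-second cooldown, the one-instance step, and the names `AutoscaleState` and `next_instance_count` are assumptions, since the policy does not specify them.

```python
from dataclasses import dataclass
from typing import Optional

SCALE_OUT_CPU = 70.0     # percent, from the check description
SCALE_IN_CPU = 30.0      # percent, from the check description
COOLDOWN_SECONDS = 300   # assumed value; the policy does not specify one


@dataclass
class AutoscaleState:
    instances: int
    last_scale_time: Optional[float] = None  # seconds; None means never scaled


def next_instance_count(state: AutoscaleState, avg_cpu: float, now: float,
                        min_instances: int, max_instances: int) -> int:
    """Apply one evaluation of the CPU rules and return the instance count."""
    in_cooldown = (state.last_scale_time is not None
                   and now - state.last_scale_time < COOLDOWN_SECONDS)
    if in_cooldown:
        # Ignore the metric entirely until the cooldown has elapsed.
        return state.instances

    if avg_cpu > SCALE_OUT_CPU:
        target = min(state.instances + 1, max_instances)
    elif avg_cpu < SCALE_IN_CPU:
        target = max(state.instances - 1, min_instances)
    else:
        target = state.instances

    if target != state.instances:
        state.instances = target
        state.last_scale_time = now  # start a new cooldown window
    return state.instances
```

Because the cooldown blocks every action after a change, a CPU spike that persists for one minute triggers a single scale-out rather than one per evaluation, which is the flapping protection the rationale describes.
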
WAF-COST-SCALE-002: Configure Container Apps scaling rules with appropriate min/max replicas and HTTP/custom scaling triggers
Severity: Required
Rationale: Container Apps scaling is per-app; proper configuration prevents idle costs in dev and ensures availability in production
Agents: terraform-agent, bicep-agent, cloud-architect, cost-analyst
- Microsoft.Web/sites
- Microsoft.App/containerApps
- Microsoft.Compute/virtualMachines
- Microsoft.Compute/virtualMachineScaleSets
- Microsoft.DocumentDB/databaseAccounts
- Microsoft.Sql/servers/databases
- Microsoft.ContainerService/managedClusters
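
To show how min/max replicas interact with an HTTP trigger in WAF-COST-SCALE-002, the sketch below derives a replica count from concurrent requests and clamps it to the configured bounds. It approximates the idea of a concurrency-based rule and is not the platform's actual algorithm. The replica bounds in `DEV` and `PROD` and the function name `desired_replicas` are illustrative assumptions.

```python
import math

# Assumed example bounds: dev may scale to zero, production keeps a floor.
DEV = {"min_replicas": 0, "max_replicas": 2}
PROD = {"min_replicas": 2, "max_replicas": 10}


def desired_replicas(concurrent_requests: int, target_per_replica: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Replicas needed to keep each one at or below the concurrency target."""
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    needed = math.ceil(concurrent_requests / target_per_replica)
    # Clamp to the configured bounds for the environment.
    return max(min_replicas, min(needed, max_replicas))
```

With no traffic, `desired_replicas(0, 10, **DEV)` returns 0, so dev incurs no idle replica cost, while `desired_replicas(0, 10, **PROD)` returns 2 to preserve availability.
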
WAF-COST-SCALE-003: Configure VMSS autoscale profiles with CPU-based rules and scheduled profiles for predictable workloads
Severity: Required
Rationale: VMSS without autoscale runs at fixed capacity; autoscale adapts to demand and reduces off-hours costs
Agents: terraform-agent, bicep-agent, cloud-architect, cost-analyst
- Microsoft.Web/sites
- Microsoft.App/containerApps
- Microsoft.Compute/virtualMachines
- Microsoft.Compute/virtualMachineScaleSets
- Microsoft.DocumentDB/databaseAccounts
- Microsoft.Sql/servers/databases
- Microsoft.ContainerService/managedClusters
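
Scheduled profiles, as called for in WAF-COST-SCALE-003, select different instance bounds depending on the time. The following sketch picks between two profiles. The weekday 08:00 to 18:00 window, the instance bounds, and the names `Profile` and `active_profile` are assumptions for illustration, not values from the policy.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Profile:
    name: str
    min_instances: int
    max_instances: int


# Assumed example profiles; CPU-based rules would apply within each one.
BUSINESS_HOURS = Profile("business-hours", min_instances=2, max_instances=10)
OFF_HOURS = Profile("off-hours", min_instances=1, max_instances=3)


def active_profile(when: datetime) -> Profile:
    """Return the profile in effect at the given local time."""
    is_weekday = when.weekday() < 5     # Monday is 0, Sunday is 6
    in_hours = 8 <= when.hour < 18      # assumed business window
    return BUSINESS_HOURS if (is_weekday and in_hours) else OFF_HOURS
```

The off-hours profile lowers both the floor and the ceiling, which is where the off-hours cost reduction mentioned in the rationale comes from.
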
WAF-COST-SCALE-004: Configure database autoscale: Cosmos DB autoscale maxThroughput for production, SQL elastic pools for multi-database workloads
Severity: Required
Rationale: Database scaling directly impacts both cost and performance; autoscale prevents over-provisioning while handling spikes
Agents: terraform-agent, bicep-agent, cloud-architect, cost-analyst
- Microsoft.Web/sites
- Microsoft.App/containerApps
- Microsoft.Compute/virtualMachines
- Microsoft.Compute/virtualMachineScaleSets
- Microsoft.DocumentDB/databaseAccounts
- Microsoft.Sql/servers/databases
- Microsoft.ContainerService/managedClusters
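
For the Cosmos DB part of WAF-COST-SCALE-004, it helps to see how an autoscale maximum bounds what is billed. Cosmos DB autoscale scales between 10% of the configured maximum and the maximum, and each hour is billed at the highest RU/s reached in that hour. The sketch below models only that relationship. The function names are hypothetical, and per-RU/s prices are deliberately left out.

```python
def billed_ru_for_hour(max_throughput: int, peak_ru_in_hour: int) -> int:
    """RU/s billed for one hour under autoscale with the given maximum."""
    floor = max_throughput // 10                 # autoscale never drops below 10%
    return max(floor, min(peak_ru_in_hour, max_throughput))


def billed_ru_hours(max_throughput: int, hourly_peaks: list[int]) -> int:
    """Sum of billed RU/s over a sequence of hourly peak demands."""
    return sum(billed_ru_for_hour(max_throughput, p) for p in hourly_peaks)
```

For a maximum of 4000 RU/s, an idle hour is billed at 400 RU/s and an hour that demands more than the maximum is capped at 4000 RU/s, which shows both the over-provisioning protection and the spike handling the rationale refers to.
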
WAF-COST-SCALE-005: Configure AKS cluster autoscaler with appropriate node pool settings: spot nodes for dev, on-demand for production
Severity: Required
Rationale: AKS cluster autoscaler adjusts node count automatically; spot VMs provide up to 90% savings for interruptible workloads
Agents: terraform-agent, bicep-agent, cloud-architect, cost-analyst
- Microsoft.Web/sites
- Microsoft.App/containerApps
- Microsoft.Compute/virtualMachines
- Microsoft.Compute/virtualMachineScaleSets
- Microsoft.DocumentDB/databaseAccounts
- Microsoft.Sql/servers/databases
- Microsoft.ContainerService/managedClusters
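
The environment split in WAF-COST-SCALE-005 can be expressed as a small lookup. This is a sketch under stated assumptions: the field names only loosely mirror common AKS node pool settings, the node counts are invented examples, and `NodePoolSettings` and `node_pool_for` are hypothetical names. It models a user node pool, since AKS system node pools cannot use spot capacity.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodePoolSettings:
    enable_auto_scaling: bool
    min_count: int
    max_count: int
    priority: str   # "Spot" for interruptible capacity, "Regular" for on-demand


def node_pool_for(environment: str) -> NodePoolSettings:
    """Return illustrative user node pool settings for an environment."""
    if environment in ("dev", "poc"):
        # Spot nodes with a low ceiling; workloads must tolerate eviction.
        return NodePoolSettings(True, min_count=0, max_count=3, priority="Spot")
    if environment == "prod":
        # On-demand nodes with a higher floor for availability.
        return NodePoolSettings(True, min_count=3, max_count=10, priority="Regular")
    raise ValueError(f"unknown environment: {environment!r}")
```

Keeping the mapping in one place makes it harder for a dev pool to inherit production minimums by accident.
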