Operational Governance

January 7, 2025

CloudHealth recommends adding the following Azure Best Practice Policies for Operational Governance.

Step 1 of 6

Identify and Terminate Zombie Virtual Machines

Zombie virtual machines are VMs that are still running but idle, most likely forgotten, and costing you money. Identify running VMs whose daily average CPU utilization stays below 10% for 2 consecutive weeks and whose network I/O is less than 5 MB on 4 or more days. To be more specific, isolate VMs based on their instance type.
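The idle rule above can be sketched in a few lines of Python. This is only an illustration of the thresholds, not how CloudHealth evaluates policies. The `VmMetrics` type, the metric lists, and the choice to count quiet network days inside the same 14-day window are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List

CPU_THRESHOLD_PCT = 10.0   # daily average CPU below this counts as idle
CPU_WINDOW_DAYS = 14       # "2 weeks in a row"
NET_THRESHOLD_MB = 5.0     # daily network I/O below this counts as quiet
NET_MIN_QUIET_DAYS = 4     # "4 or more days"


@dataclass
class VmMetrics:
    """Hypothetical container for per-day metrics, oldest value first."""
    name: str
    daily_avg_cpu_pct: List[float]
    daily_network_mb: List[float]


def is_zombie(vm: VmMetrics) -> bool:
    """Return True when the VM matches the idle rule described above."""
    recent_cpu = vm.daily_avg_cpu_pct[-CPU_WINDOW_DAYS:]
    if len(recent_cpu) < CPU_WINDOW_DAYS:
        return False  # not enough history to judge
    cpu_idle = all(cpu < CPU_THRESHOLD_PCT for cpu in recent_cpu)
    # Assumption: quiet network days are counted within the same window.
    recent_net = vm.daily_network_mb[-CPU_WINDOW_DAYS:]
    quiet_days = sum(1 for mb in recent_net if mb < NET_THRESHOLD_MB)
    return cpu_idle and quiet_days >= NET_MIN_QUIET_DAYS


idle_vm = VmMetrics("dev-batch-01", [3.0] * 14, [1.0] * 14)
busy_vm = VmMetrics("web-01", [3.0] * 13 + [45.0], [200.0] * 14)
zombies = [vm.name for vm in (idle_vm, busy_vm) if is_zombie(vm)]
print(zombies)  # ['dev-batch-01']
```

A VM with less than 14 days of history is never flagged here, which avoids reporting newly created machines.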

Example: F-series VMs (compute optimized) whose maximum CPU has stayed below 10% for the last 14 days are most likely idle and are good candidates for termination.

Sample Zombie VM Identifying Policy: This policy identifies VMs in a compute optimized series (e.g., F-series) that have a low average CPU % and sends a notification.

In addition, by leveraging CloudHealth Perspectives, you can run this policy against specific non-production environments.

Variant: Add different rules that capture other performance metrics such as network traffic.

Step 2 of 6

Identify and Terminate Zombie Disks

When a virtual machine is deleted in Azure, any disks attached to the VM aren't automatically deleted, costing you money.

Example: Identify disks that have been unattached for more than 2 weeks and delete them after confirming that they do not contain critical data.

Sample Zombie Disk Identifying Policy: This policy identifies unattached disks and sends a notification to a user who can review the disk and determine whether to delete it.
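As a rough sketch of the check described in this step, the following Python flags disks that have been unattached for more than 2 weeks. The `Disk` type and its `detached_at` field are hypothetical. How you obtain the time a disk became unattached depends on your own inventory data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

UNATTACHED_THRESHOLD = timedelta(days=14)  # "more than 2 weeks"


@dataclass
class Disk:
    """Hypothetical inventory record for a managed disk."""
    name: str
    attached_vm: Optional[str]       # None when the disk is not attached
    detached_at: Optional[datetime]  # when the disk became unattached


def is_zombie_disk(disk: Disk, now: datetime) -> bool:
    """Return True for disks unattached longer than the threshold."""
    if disk.attached_vm is not None or disk.detached_at is None:
        return False
    return now - disk.detached_at > UNATTACHED_THRESHOLD


now = datetime(2025, 1, 7)
disks = [
    Disk("data-old", None, datetime(2024, 12, 1)),
    Disk("data-recent", None, datetime(2025, 1, 3)),
    Disk("os-web-01", "web-01", None),
]
# The result is a review list only; deletion stays a manual decision.
to_review = [d.name for d in disks if is_zombie_disk(d, now)]
print(to_review)  # ['data-old']
```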

Step 3 of 6

Identify and Delete Old Snapshots

Old snapshots are snapshots that have crossed a certain age threshold. They can become a legal liability.

Example: Identify snapshots that are older than a specified time period.

Sample Old Snapshot Identifying Policy: This policy sends a notification when it identifies potential zombie VM snapshots that are older than 6 months.
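The age check can be illustrated with a short Python sketch. It approximates 6 months as 180 days, which is an assumption for the example. The snapshot names and creation dates are made up.

```python
from datetime import datetime, timedelta
from typing import Dict, List

MAX_SNAPSHOT_AGE = timedelta(days=180)  # assumption: roughly 6 months


def old_snapshots(created: Dict[str, datetime], now: datetime) -> List[str]:
    """Return the names of snapshots older than MAX_SNAPSHOT_AGE, sorted."""
    return sorted(
        name for name, created_at in created.items()
        if now - created_at > MAX_SNAPSHOT_AGE
    )


snapshots = {
    "snap-db-march": datetime(2024, 3, 1),
    "snap-db-november": datetime(2024, 11, 15),
}
flagged = old_snapshots(snapshots, now=datetime(2025, 1, 7))
print(flagged)  # ['snap-db-march']
```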

Step 4 of 6

VM Scheduling (Lights On/Lights Off)

Not all virtual machines are in use 24x7x365, especially those outside of production. These VMs can be shut down on a schedule to reduce cost.

Sample Lights On/Lights Off Policy: This policy turns off the development environment over the weekend.
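A weekend schedule of this kind reduces to a simple decision function. The sketch below is a minimal illustration. The `environment` label and the choice of Saturday and Sunday as the off period are assumptions, and starting or stopping the VMs is left out.

```python
from datetime import datetime

WEEKEND_DAYS = {5, 6}  # datetime.weekday(): Saturday is 5, Sunday is 6


def should_be_running(environment: str, when: datetime) -> bool:
    """Return False for development VMs on weekends, True otherwise."""
    if environment != "development":
        return True  # only the development environment is scheduled
    return when.weekday() not in WEEKEND_DAYS


print(should_be_running("development", datetime(2025, 1, 4, 9)))  # Saturday: False
print(should_be_running("development", datetime(2025, 1, 6, 9)))  # Monday: True
print(should_be_running("production", datetime(2025, 1, 4, 9)))   # True
```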

Step 5 of 6

Locate Unattached IP Addresses

A network interface (NIC) is the interconnection between an Azure Virtual Machine (VM) and the underlying software network. A VM has one or more NICs attached to it depending on the VM size.

You can manage NICs as objects that are decoupled from the VM. When you delete a VM, the NIC object remains unattached and its settings persist, including the Public IP Address that is associated with it, subnets, and Network Security Groups.

Sample Unattached NIC Identifying Policy: This policy sends a notification when unattached IP addresses are detected in your Azure infrastructure. You can use the notifications to determine whether you want to retain the unattached NICs.
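To make the check concrete, here is a small Python sketch that lists NICs with no VM attached, together with any public IP address they still hold. The `Nic` type and the sample records are hypothetical. The addresses come from a range reserved for documentation.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Nic:
    """Hypothetical inventory record for a network interface."""
    name: str
    attached_vm: Optional[str]  # None when the NIC is unattached
    public_ip: Optional[str]    # None when no public IP is associated


def unattached_nics(nics: List[Nic]) -> List[Tuple[str, Optional[str]]]:
    """Return (NIC name, public IP) pairs for NICs without a VM."""
    return [(n.name, n.public_ip) for n in nics if n.attached_vm is None]


inventory = [
    Nic("nic-web-01", "web-01", "203.0.113.10"),
    Nic("nic-old-batch", None, "203.0.113.25"),
    Nic("nic-old-test", None, None),
]
for name, ip in unattached_nics(inventory):
    print(name, ip or "no public IP")
```

Reporting the public IP alongside the NIC name helps the reviewer decide which unattached NICs are worth retaining.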

Variant: Add different conditions that capture other performance metrics such as network traffic.

Step 6 of 6

Identify VMs on Unapproved Operating Systems

The price per server varies depending on the operating system (OS) or license used. Identify virtual machines that are running on an unapproved OS.

Sample Unapproved OS Policy: This policy sends a notification when a VM runs on an unapproved OS.
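The underlying check is an allowlist comparison, sketched below in Python. The approved OS names and the VM inventory are placeholders chosen for the example. Substitute the list your organization actually approves.

```python
from typing import Dict, List

# Hypothetical allowlist; names are compared case-insensitively.
APPROVED_OS = {"ubuntu 22.04", "windows server 2022"}


def unapproved_vms(vm_os: Dict[str, str]) -> List[str]:
    """Return the names of VMs whose OS is not on the allowlist, sorted."""
    return sorted(
        name for name, os_name in vm_os.items()
        if os_name.strip().lower() not in APPROVED_OS
    )


vms = {
    "web-01": "Ubuntu 22.04",
    "legacy-app": "Windows Server 2012 R2",
    "build-agent": "CentOS 7",
}
print(unapproved_vms(vms))  # ['build-agent', 'legacy-app']
```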

Variant: Change the filter to identify VMs running on old generation VM types.