Thursday, April 4, 2019

How to define a CloudWatch Alarm in CloudFormation using Metric Math

It took me quite some time to figure out how to define a CloudWatch Alarm with metric math in CloudFormation, mainly because I could not find a proper example, and certainly not a YAML one.

After some searching I found a JSON example and converted it to YAML for my purposes.


The starting point for my example is an RDS Oracle DB instance defined in the same CloudFormation template, but you can use any resource that exposes metrics:

Oracle:
  Type: "AWS::RDS::DBInstance"
  DeletionPolicy: Retain
  Properties:
    AllocatedStorage: !Ref DBInstanceStorage
    DBInstanceClass: !Ref DBInstanceClass
    ....

I wanted to create an alarm based on the available storage space on the instance, expressed as a percentage. So I used metric math to convert the FreeStorageSpace metric provided by RDS. The metric is reported in bytes, while the RDS allocated storage is given in gibibytes (GiB). So the storage size is converted into bytes, and the free storage is divided by it and multiplied by 100 to get a percentage value.
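As a quick sanity check, the same arithmetic can be reproduced in a shell. This is only a sketch with made-up numbers: an allocated storage of 100 GiB and a FreeStorageSpace reading of 15 GiB are assumptions for illustration, not values from my instance.

```shell
# Hypothetical values, for illustration only
allocated_gib=100            # what DBInstanceStorage would hold (GiB)
free_bytes=16106127360       # a FreeStorageSpace sample (15 GiB in bytes)

# Convert the allocated storage to bytes, then relate free space to it
allocated_bytes=$(( allocated_gib * 1024 * 1024 * 1024 ))
free_percent=$(( free_bytes * 100 / allocated_bytes ))

echo "free storage: ${free_percent}%"
```

With these numbers the result is 15, so an alarm with a threshold of 10 would not fire yet. Note that shell arithmetic is integer-only, while CloudWatch works with floating-point values.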



The Metrics section is the important one. To better understand it, read it from the bottom up.
The second entry (Id: m1) pulls in the RDS FreeStorageSpace metric, averaged over 5-minute periods. "ReturnData: False" makes the data available for the calculation but keeps it from being used as the value the alarm watches.
The first entry (Id: e1) does the actual math and returns the free storage as a percentage. This is the value the alarm evaluates against the threshold.

LowStorageSpace:
  Type: AWS::CloudWatch::Alarm
  Properties:
    AlarmDescription: !Sub 'The FreeStorageSpace of ${Oracle} DB is below X%'
    Metrics:
      - Id: e1
        Expression: !Sub 'm1/(${DBInstanceStorage}*1024*1024*1024)*100'
        Label: free storage percentage
      - Id: m1
        MetricStat:
          Metric:
            Dimensions:
            - Name: DBInstanceIdentifier
              Value: !Ref Oracle
            MetricName: FreeStorageSpace
            Namespace: AWS/RDS
          Period: 300
          Stat: Average
          Unit: Bytes
        ReturnData: False
    Threshold: 10
    ComparisonOperator: LessThanOrEqualToThreshold
    EvaluationPeriods: 5
    AlarmActions:
    - Fn::ImportValue: alarming-topic-alarmNotificationTopicArn
    OKActions:
    - Fn::ImportValue: alarming-topic-alarmNotificationTopicArn
    InsufficientDataActions:
    - Fn::ImportValue: alarming-topic-alarmNotificationTopicArn


An overview of all available metrics provided by AWS services can be found in the docs.

The AWS CLI is also helpful for understanding metrics:

aws cloudwatch list-metrics --namespace "AWS/RDS" --metric-name "FreeStorageSpace"

This shows, for example, the available dimensions.

aws cloudwatch get-metric-statistics --namespace "AWS/RDS" --metric-name "FreeStorageSpace" --start-time 2019-04-04T05:00:00Z --end-time 2019-04-04T10:00:00Z --period 300 --statistics Average

This lets you experiment with exactly the parameters that you also have to specify in the MetricStat section.

Unfortunately, the "Unit:" parameter in the MetricStat section does not let you choose the unit you would like to receive. It acts as a filter, so it has to match the unit of the metric you want to load.
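The metric math expression itself can also be tried from the command line, because "aws cloudwatch get-metric-data" accepts queries with the same Id / Expression / MetricStat structure as the Metrics section of the alarm. The following is a sketch, not something taken from my template: the instance identifier "my-oracle-db", the 100 GiB storage size, and the file name metric-queries.json are placeholders you would have to replace.

```shell
# Write the queries to a file; "my-oracle-db" and the 100 (GiB) in the
# expression are placeholders, not values from the template above.
cat > metric-queries.json <<'EOF'
[
  {
    "Id": "e1",
    "Expression": "m1/(100*1024*1024*1024)*100",
    "Label": "free storage percentage"
  },
  {
    "Id": "m1",
    "MetricStat": {
      "Metric": {
        "Namespace": "AWS/RDS",
        "MetricName": "FreeStorageSpace",
        "Dimensions": [
          { "Name": "DBInstanceIdentifier", "Value": "my-oracle-db" }
        ]
      },
      "Period": 300,
      "Stat": "Average",
      "Unit": "Bytes"
    },
    "ReturnData": false
  }
]
EOF

# Wrapped in a function so that nothing is sent to AWS until you call it.
query_free_storage_percent() {
  aws cloudwatch get-metric-data \
    --metric-data-queries file://metric-queries.json \
    --start-time 2019-04-04T05:00:00Z \
    --end-time 2019-04-04T10:00:00Z
}
```

Calling query_free_storage_percent with configured AWS credentials should return only the e1 series, since m1 is marked with "ReturnData": false.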