Centralizing Health Event Reporting for AWS Organizations at Scale


AWS Health provides ongoing visibility into your resource performance and the availability of your AWS services and accounts. You can use AWS Health events to learn how service and resource changes might affect your applications running on AWS. AWS Health provides relevant and timely information to help you manage events in progress. It also helps you be aware of, and to prepare for, planned activities.

If you have a small number of AWS accounts, you can configure a notification in each one, triggered by AWS Health events, and be prepared. With hundreds of AWS accounts this becomes challenging, and you may miss an important planned activity.

Problem statement

A company has an AWS Organization with several hundred accounts. The company’s support team wants to receive a daily report via email. The report should aggregate events both by event type and by affected AWS account.
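
As a rough illustration of that two-way aggregation, the sketch below groups (event type, account) pairs both ways. The function name, the record layout, and the sample account IDs are assumptions made for this example; they are not the format of the final report.

```python
from collections import defaultdict


def aggregate(records):
    """Group (event_type, account_id) pairs by event type and by account."""
    by_event = defaultdict(set)    # event type -> accounts it affects
    by_account = defaultdict(set)  # account -> event types that affect it
    for event_type, account_id in records:
        by_event[event_type].add(account_id)
        by_account[account_id].add(event_type)
    return by_event, by_account


# Hypothetical sample data, for illustration only.
sample = [
    ("AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED", "111111111111"),
    ("AWS_EC2_INSTANCE_RETIREMENT_SCHEDULED", "222222222222"),
    ("AWS_RDS_MAINTENANCE_SCHEDULED", "111111111111"),
]

by_event, by_account = aggregate(sample)
for event_type in sorted(by_event):
    print(event_type, sorted(by_event[event_type]))
for account_id in sorted(by_account):
    print(account_id, sorted(by_account[account_id]))
```

The first loop answers "which accounts does this event touch", the second answers "what is coming up for this account", which are the two views the support team asked for.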

Proposed solution

By default, you can use AWS Health to view the AWS Health events of a single AWS account. If you use AWS Organizations, you can also view AWS Health events centrally across your organization. This feature provides access to the same information as single-account operations.

  1. A Delegated Administrator will be set up to minimize the use of the Management account.
  2. The Delegated Administrator account will see all health events for the whole AWS organization.
  3. An Amazon EventBridge scheduled rule will invoke the Lambda function daily.
  4. An AWS Lambda function will use the AWS API (through the boto3 Python library) to collect the information and create the report file.
  5. (Optionally) We can store the file in an Amazon S3 bucket.
  6. Amazon Simple Email Service (Amazon SES) will send the email, because the report must be attached as a file.

Prerequisites

1. Before you use the organizational view, you must be part of an organization with all features enabled.

2. Enable organizational view for AWS Health

After you set up AWS Organizations and sign in to the management account, you can enable AWS Health to aggregate all events. These events appear in the AWS Health Dashboard.
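
The same step can also be scripted. The sketch below is one possible approach, not the only one: the helper name is hypothetical, the AWS Health client is passed in as a parameter, and it assumes credentials for the management account and a support plan that allows AWS Health API access.

```python
def ensure_organizational_view(health_client):
    """Enable AWS Health organizational view if it is currently disabled.

    health_client is expected to be a boto3 'health' client created with
    management-account credentials (an assumption of this sketch).
    Returns the status that was reported before any change was made.
    """
    response = health_client.describe_health_service_status_for_organization()
    status = response["healthServiceAccessStatusForOrganization"]
    if status == "DISABLED":
        # Turns on aggregation of events across the organization.
        health_client.enable_health_service_access_for_organization()
    return status
```

A call could look like ensure_organizational_view(boto3.client('health', region_name='us-east-1')). Passing the client in keeps the helper easy to exercise with a stub before running it against a real organization.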

3. Register a Delegated Administrator for organizational view (optional, but highly recommended)

After you enable organizational view for your organization, you can register up to five member accounts in your organization as delegated administrators. To do this, call the RegisterDelegatedAdministrator API operation. After you register the member accounts, they become delegated administrator accounts and can access the AWS Health organizational view from the AWS Health Dashboard.

If the account has a Business, Enterprise On-Ramp, or Enterprise Support plan, then the delegated administrators can use the AWS Health API to access the AWS Health organizational view.

To establish a delegated administrator, from the management account in your organization, call the following AWS Command Line Interface (AWS CLI) command:

aws organizations register-delegated-administrator --account-id ACCOUNT_ID --service-principal health.amazonaws.com
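
For teams that prefer Python over the CLI, the registration might be sketched with boto3 as follows. The helper name is hypothetical and the Organizations client is passed in; it assumes management-account credentials, just like the CLI command.

```python
HEALTH_PRINCIPAL = "health.amazonaws.com"


def register_health_delegated_admin(org_client, account_id):
    """Register account_id as a delegated administrator for AWS Health.

    org_client is expected to be a boto3 'organizations' client (assumption).
    Returns the IDs of the accounts registered for AWS Health afterwards
    (first page of results only).
    """
    org_client.register_delegated_administrator(
        AccountId=account_id,
        ServicePrincipal=HEALTH_PRINCIPAL,
    )
    # Read the registrations back so the caller can confirm the result.
    response = org_client.list_delegated_administrators(
        ServicePrincipal=HEALTH_PRINCIPAL
    )
    return [admin["Id"] for admin in response.get("DelegatedAdministrators", [])]
```

With a real client this would be called as register_health_delegated_admin(boto3.client('organizations'), 'ACCOUNT_ID'), where ACCOUNT_ID is the same placeholder as in the CLI command.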

Then, new tabs appear in the AWS Health Dashboard console.

Lambda code

We already have all the information in one place. We only need to pack it into a file and send it via email daily.

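There is no single API method that returns all the information, so we need to combine several: DescribeEventsForOrganization to list the events, DescribeAffectedAccountsForOrganization to list the accounts affected by each event, DescribeEventDetailsForOrganization to get the details of an event for an account, and DescribeAffectedEntitiesForOrganization to list the affected resources.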
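
The list-style operations among these return their results in pages linked by a nextToken value. A generic helper for draining all pages could look like the sketch below; the helper name and its signature are assumptions for illustration, and it works with any callable that follows the nextToken convention.

```python
def collect_pages(operation, result_key, **params):
    """Call a nextToken-paginated operation until every page has been read.

    operation  -- a callable such as a boto3 client method
    result_key -- the response key that holds the list of items
    params     -- request parameters passed through on every call
    """
    items = []
    while True:
        response = operation(**params)
        items.extend(response.get(result_key, []))
        token = response.get("nextToken")
        if not token:
            return items
        # Ask for the next page on the following iteration.
        params["nextToken"] = token
```

For example, collect_pages(health_client.describe_affected_accounts_for_organization, 'affectedAccounts', eventArn=arn, maxResults=50) would return every affected account for an event, not only the first 50.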

Here is the Python code for the Lambda function (available on GitHub as well):

import boto3
import logging
from datetime import datetime
import os
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.mime.application import MIMEApplication

# Constants
OUTPUT_FILE_PATH = "/tmp/output.txt"
OBJECT_KEY = "output.txt"
SENDER = os.environ.get('email_from')
RECIPIENT = os.environ.get('email_to')
EMAIL_SUBJECT = "AWS Health Events Report"
# If needed to upload report to S3
BUCKET_NAME = os.environ.get('output_bucket')

# Set up logging
logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Global clients
health_client = boto3.client('health', region_name='us-east-1')
s3_client = boto3.client('s3')
ses_client = boto3.client('ses', region_name='us-east-1')

def describe_health_events_for_organization():
    events = []
    next_token = None
    while True:
        params = {
            'filter': {
                'eventStatusCodes': ['open', 'upcoming']
            },
            'maxResults': 50
        }
        if next_token:
            params['nextToken'] = next_token

        try:
            response = health_client.describe_events_for_organization(**params)
        except Exception as e:
            logger.error("Failed to call describe_events_for_organization")
            logger.exception(e)
            raise

        events.extend(response.get('events', []))
        next_token = response.get('nextToken')
        if not next_token:
            break
    return events

def describe_health_events_details_for_organization(item, account_id):
    try:
        response = health_client.describe_event_details_for_organization(
            organizationEventDetailFilters=[
                {
                    'eventArn': item['arn'],
                    'awsAccountId': account_id
                }
            ]
        )
        return response
    except Exception as e:
        logger.error(f"Error getting event details: {e}")
        return {}

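def describe_affected_accounts(item):
    # Follow nextToken so events that affect more than 50 accounts are not truncated.
    accounts = []
    params = {'eventArn': item['arn'], 'maxResults': 50}
    try:
        while True:
            response = health_client.describe_affected_accounts_for_organization(**params)
            accounts.extend(response.get('affectedAccounts', []))
            if not response.get('nextToken'):
                return accounts
            params['nextToken'] = response['nextToken']
    except Exception as e:
        logger.error(f"Error getting affected accounts: {e}")
        return []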

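def describe_affected_entities(item, account_id):
    # Follow nextToken so accounts with more than 50 affected entities are fully listed.
    entities = []
    params = {
        'maxResults': 50,
        'organizationEntityAccountFilters': [
            {
                'eventArn': item['arn'],
                'awsAccountId': account_id,
                'statusCodes': ['IMPAIRED', 'UNIMPAIRED', 'UNKNOWN', 'PENDING']
            }
        ]
    }
    try:
        while True:
            response = health_client.describe_affected_entities_for_organization(**params)
            entities.extend(response.get('entities', []))
            if not response.get('nextToken'):
                return entities
            params['nextToken'] = response['nextToken']
    except Exception as e:
        logger.error(f"Error getting affected entities: {e}")
        return []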

def send_email_with_attachment():
    try:
        msg = MIMEMultipart()
        msg['Subject'] = EMAIL_SUBJECT
        msg['From'] = SENDER
        msg['To'] = RECIPIENT

        body = MIMEText("Please find attached the latest AWS Health Events report.", 'plain')
        msg.attach(body)

        with open(OUTPUT_FILE_PATH, 'rb') as file:
            attachment = MIMEApplication(file.read())
            attachment.add_header('Content-Disposition', 'attachment', filename=os.path.basename(OUTPUT_FILE_PATH))
            msg.attach(attachment)

        response = ses_client.send_raw_email(
            Source=SENDER,
            Destinations=[RECIPIENT],
            RawMessage={
                'Data': msg.as_string()
            }
        )
        logger.info("Email sent successfully.")
    except Exception as e:
        logger.error(f"Failed to send email: {e}")

def lambda_handler(event, context):
    health_events = describe_health_events_for_organization()
    affected_accounts_total = set()

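    # Write mode, not append: /tmp can survive between warm invocations,
    # and appending would repeat previous reports in the new file.
    with open(OUTPUT_FILE_PATH, "w") as output_file: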
        for item in health_events:
            output_file.write(f"\n{item['eventTypeCode']}\n")
            affected_accounts = describe_affected_accounts(item)
            for account in affected_accounts:
                affected_accounts_total.add(account)
                output_file.write(f"{account} - ")

                details = describe_health_events_details_for_organization(item, account)
                try:
                    event_info = details['successfulSet'][0]['event']
                    output_file.write(f"{event_info['startTime']} - ")
                except (IndexError, KeyError):
                    # End the line here so the next account starts on a new line.
                    output_file.write("No details\n")
                    continue

                entities = describe_affected_entities(item, account)
                if not entities:
                    output_file.write(f"no entity - {event_info.get('region', 'N/A')}\n")
                else:
                    for entity in entities:
                        output_file.write(f"{entity['entityValue']}; ")
                    output_file.write(f"- {event_info.get('region', 'N/A')}\n")

        output_file.write("\n------------------\n")

        for account in affected_accounts_total:
            output_file.write(f"\n{account}\n")
            response = health_client.describe_events_for_organization(
                filter={
                    'awsAccountIds': [account],
                    'eventStatusCodes': ['open', 'upcoming']
                },
                maxResults=50
            )
            for event in response.get('events', []):
                entities = describe_affected_entities(event, account)
                time_str = event['startTime'].strftime("%m/%d/%Y, %H:%M:%S")
                region = event.get('region', 'N/A')
                if not entities:
                    output_file.write(f"{event['eventTypeCode']} - {time_str} - {region}\n")
                else:
                    for entity in entities:
                        output_file.write(f"{event['eventTypeCode']} - {time_str} - {entity['entityValue']} - {region}\n")

    # Uploading to S3 is optional: skip it when no bucket is configured.
    if BUCKET_NAME:
        try:
            s3_client.upload_file(OUTPUT_FILE_PATH, BUCKET_NAME, OBJECT_KEY)
            logger.info("Output file successfully uploaded to S3")
        except Exception as e:
            logger.error(f"Failed to upload file to S3: {e}")

    # Send file via email
    send_email_with_attachment()
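
The handler above requests event details once per affected account. DescribeEventDetailsForOrganization accepts up to 10 filters per request, so for events that affect many accounts the calls could be batched. The sketch below shows one way to do that; both function names are hypothetical, the client is passed in, and it assumes account-specific events, where each result carries an awsAccountId.

```python
def chunked(items, size=10):
    """Yield consecutive slices of items, each with at most size elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def event_details_by_account(health_client, event_arn, account_ids):
    """Return {account_id: event} for one event, using batched requests."""
    details = {}
    for batch in chunked(list(account_ids)):
        response = health_client.describe_event_details_for_organization(
            organizationEventDetailFilters=[
                {"eventArn": event_arn, "awsAccountId": account_id}
                for account_id in batch
            ]
        )
        # Accounts that failed land in 'failedSet' and are simply left out here.
        for item in response.get("successfulSet", []):
            details[item.get("awsAccountId")] = item["event"]
    return details
```

For an event that affects 25 accounts, this issues 3 requests instead of 25, which also leaves more of the 300-second Lambda timeout for the rest of the report.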

 

Deploy

The solution is packaged with Terraform using Serverless.tf (available on GitHub as well):

provider "aws" {
  region = "us-east-1"

  # Speed up provider initialization by skipping optional checks
  #skip_metadata_api_check     = true
  skip_region_validation      = true
  skip_credentials_validation = true
}

module "eventbridge" {
  source = "terraform-aws-modules/eventbridge/aws"

  create_bus = false

  rules = {
    crons = {
      description         = "Trigger for a Lambda"
      schedule_expression = "rate(1 day)"
    }
  }

  targets = {
    crons = [
      {
        name  = "lambda-loves-cron"
        arn   = module.lambda_function.lambda_function_arn
        input = jsonencode({"job": "cron-by-rate"})
      }
    ]
  }
}

module "lambda_function" {
  source = "terraform-aws-modules/lambda/aws"

  function_name = "aws-health-regular-check"
  description   = "Daily report about AWS Health events"
  handler       = "lambda_function.lambda_handler"
  runtime       = "python3.13"
  timeout       = 300

  source_path = "./src/"

  environment_variables = {
    email_from      = var.email_from
    email_to        = var.email_to
    output_bucket   = var.output_bucket_name
  }

  create_current_version_allowed_triggers = false
  allowed_triggers = {
    OneRule = {
      principal  = "events.amazonaws.com"
      source_arn = module.eventbridge.eventbridge_rule_arns["crons"]
    }
  }

  attach_policy_statements = true
  policy_statements = {
    aws_health = {
      effect = "Allow",
      actions = [
        "health:DescribeEventsForOrganization",
        "health:DescribeEventDetails",
        "health:DescribeEventDetailsForOrganization",
        "health:DescribeAffectedAccountsForOrganization",
        "health:DescribeAffectedEntitiesForOrganization"
      ],
      resources = ["*"]
    },
    organizations = {
      effect    = "Allow",
      actions   = ["organizations:ListAccounts"],
      resources = ["*"]
    },
    s3 = {
      effect    = "Allow",
      actions   = ["s3:PutObject"],
      resources = ["arn:aws:s3:::${var.output_bucket_name}/*"]
    },
    ses = {
      effect    = "Allow",
      actions   = ["ses:SendRawEmail"],
      resources = [
        aws_ses_email_identity.email_from.arn,
        aws_ses_email_identity.email_to.arn
      ]
    }
  }

  tags = {
    Name = "aws-health-regular-check"
  }
}

### An identity is an email address you use to send email through Amazon SES. Identity verification at the domain level extends to all email addresses under one verified domain identity. To verify ownership of an email address, you must have access to its inbox to open the verification email.

resource "aws_ses_email_identity" "email_from" {
  email = var.email_from
}

resource "aws_ses_email_identity" "email_to" {
  email = var.email_to
}

terraform.tfvars contains the sender and recipient email addresses, which must be real addresses under your control, and the name of the output bucket:

email_from              = "email.from@example.com"
email_to                = "email.to@example.com"
output_bucket_name      = "example-bucket"
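
These values reach the Lambda function as the environment variables email_from, email_to, and output_bucket. The function reads them with os.environ.get, which silently returns None when a variable is missing. A stricter loader could fail early instead, as in the sketch below. The function name and the returned dictionary layout are assumptions for illustration, and it treats the bucket as optional.

```python
import os

# Variables the report cannot work without; the bucket is treated as optional.
REQUIRED_VARIABLES = ("email_from", "email_to")


def load_settings(environ=os.environ):
    """Read the report settings, raising early if a required one is missing."""
    missing = [name for name in REQUIRED_VARIABLES if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {
        "sender": environ["email_from"],
        "recipient": environ["email_to"],
        # An empty or absent value means "do not upload to S3".
        "bucket": environ.get("output_bucket") or None,
    }
```

Failing at load time makes a misconfigured deployment visible in the first log line, instead of surfacing later as an Amazon SES error.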

The last two Terraform resources, “aws_ses_email_identity”, create the Amazon SES identities. Otherwise, you would have to create them manually in the Amazon SES console.

After the Terraform apply, the identities show the “Verification pending” status.

You will then receive a verification email at the addresses you provided for “sender” and “receiver”.

Follow the link in each email and make sure your addresses are verified in SES.
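
The verification status can also be checked from code instead of the console. The sketch below is one way to do it with the Amazon SES GetIdentityVerificationAttributes operation; the function name is hypothetical and the SES client is passed in as a parameter.

```python
def unverified_identities(ses_client, addresses):
    """Return the addresses that Amazon SES does not report as verified.

    ses_client is expected to be a boto3 'ses' client (assumption).
    """
    addresses = list(addresses)
    response = ses_client.get_identity_verification_attributes(
        Identities=addresses
    )
    attributes = response.get("VerificationAttributes", {})
    return [
        address
        for address in addresses
        # Unknown identities have no entry at all and count as unverified.
        if attributes.get(address, {}).get("VerificationStatus") != "Success"
    ]
```

An empty list means both the sender and the recipient are ready. Anything else lists the addresses whose verification link still needs to be followed.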

 

Result

The report arrives by email every day.

If you don’t need Amazon SES for other purposes, sandbox mode is enough, because the solution sends only one email per day.
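
To see how much of the daily sending quota is left, the Amazon SES GetSendQuota operation can be queried. The sketch below is a hypothetical helper, with the SES client passed in; it returns None for accounts with an unlimited quota, which SES reports as -1.

```python
def remaining_daily_quota(ses_client):
    """Return how many more emails can be sent in the current 24-hour window.

    ses_client is expected to be a boto3 'ses' client (assumption).
    Returns None when the account has an unlimited quota.
    """
    quota = ses_client.get_send_quota()
    if quota["Max24HourSend"] < 0:
        return None
    return quota["Max24HourSend"] - quota["SentLast24Hours"]
```

One report per day uses a single message from that quota, so the result mostly matters if the same SES account is later reused for other mail.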

Conclusion

In this post, we looked at centralized, aggregated reporting of AWS Health events for a multi-account AWS organization. Such reports help you plan for and react carefully to scheduled AWS changes (patching, retirements, etc.) and other notifications about AWS services.