Optimizing AWS Costs: Automating EBS Snapshot Cleanup with Lambda
Introduction
Managing cloud costs effectively is paramount for any organization leveraging AWS (Amazon Web Services). One proven strategy to optimize costs involves identifying and removing unused resources. This article will focus on saving storage costs by efficiently detecting and deleting stale EBS (Elastic Block Store) snapshots.
Understanding EBS Snapshots
EBS snapshots are backups for your EBS volumes, which are crucial for data integrity and disaster recovery. However, these snapshots can become orphaned over time when associated instances are terminated or volumes are deleted. Orphaned snapshots continue to occupy storage space and incur unnecessary expenses, impacting your AWS billing.
Automating Cleanup with Lambda
To address this challenge, we will demonstrate how to automate the identification and deletion of stale EBS snapshots using AWS Lambda. Lambda allows you to run code without provisioning or managing servers, making it ideal for cost-effective and scalable automation tasks.
Prerequisites
Before diving into the implementation, ensure the following prerequisites are in place:
AWS Account: You need permissions to manage Lambda functions, EBS snapshots, and EC2 instances.
AWS CLI: Installed and configured on your local machine for AWS service interactions.
Basic IAM Understanding: Ability to create roles and custom policies.
Basic AWS Lambda Knowledge: Understanding of how to create and deploy Lambda functions.
Python Knowledge: Familiarity with Python programming, as our Lambda function will be implemented in Python.
Boto3 Library: Basic knowledge of Boto3, the AWS SDK for Python, which will be used for AWS API interactions.
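Before going further, it can help to confirm which account and identity your local credentials resolve to. The following is a minimal sketch, assuming an STS client created with `boto3.client("sts")`; the helper name `caller_summary` is hypothetical and not part of Boto3.

```python
def caller_summary(sts) -> str:
    """Describe the account and identity behind the current credentials.

    `sts` is assumed to be a Boto3 STS client, e.g. boto3.client("sts").
    """
    identity = sts.get_caller_identity()
    return f"account {identity['Account']} as {identity['Arn']}"
```

Calling `caller_summary(boto3.client("sts"))` from a local Python session should raise an error if no credentials are configured, which is a quick way to catch a missing AWS CLI setup.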
Step 1: Create an EC2 Instance
You have two options for provisioning an EC2 Instance: you can either create it manually through the AWS Management Console or utilize Terraform for automation. The choice is yours based on your preference and operational needs.
If you prefer to proceed with manual creation, you can follow the detailed guide provided in this article: Unlock the Cloud: A Step-by-Step Guide to Launching Your First EC2 Instance on AWS.
Alternatively, if you opt for automation using Terraform, you can refer to the comprehensive instructions outlined in this article: Creating EC2 Instance Using Terraform on AWS.
Once you have completed either method, we can proceed with the next steps.
Step 2: Verify the Volume
To access details about the root volume of your EC2 instance:
Navigate to the EC2 Console.
Find your instance and click on its ID or name.
Scroll down to the Description tab.
Click the root volume link to view size, state, and type details.
Note that this volume was automatically created during instance setup and serves as the root volume for your EC2 instance.
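The same lookup can be sketched with Boto3 instead of the console. This is an illustrative example, assuming an EC2 client created with `boto3.client("ec2")`; the helper names `root_volume_id` and `describe_root_volume` are hypothetical.

```python
from typing import Optional


def root_volume_id(instance: dict) -> Optional[str]:
    """Return the EBS volume ID mapped to the instance's root device, if any.

    `instance` is one entry from a describe_instances response.
    """
    root_device = instance.get("RootDeviceName")
    for mapping in instance.get("BlockDeviceMappings", []):
        if mapping.get("DeviceName") == root_device and "Ebs" in mapping:
            return mapping["Ebs"]["VolumeId"]
    return None


def describe_root_volume(ec2, instance_id: str) -> Optional[dict]:
    """Look up the root volume of one instance and return size, state, and type."""
    reservations = ec2.describe_instances(InstanceIds=[instance_id])["Reservations"]
    instance = reservations[0]["Instances"][0]
    volume_id = root_volume_id(instance)
    if volume_id is None:
        return None
    volume = ec2.describe_volumes(VolumeIds=[volume_id])["Volumes"][0]
    return {
        "VolumeId": volume["VolumeId"],
        "Size": volume["Size"],  # size in GiB
        "State": volume["State"],
        "VolumeType": volume["VolumeType"],
    }
```

The client is passed in as a parameter so the functions can be exercised with a stub object as well as with a real Boto3 client.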
Step 3: Creating Snapshots of the Volume
To verify the absence of snapshots for our instance:
Navigate to the EC2 Dashboard by searching for EC2 in the services menu.
Locate and click on the Snapshots section in the left-hand menu.
This will allow you to confirm that no snapshots are currently available for our instance.
- To create a snapshot of the instance, click on the Create Snapshot button.
- Select the EBS volume from the dropdown menu for which you wish to generate a snapshot. You may also include a description to add further details about the snapshot if needed.
- Click on the Create Snapshot button to initiate the snapshot creation process.
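For readers who prefer scripting this step, the check and the snapshot creation can be sketched with Boto3. This is a minimal sketch under stated assumptions: `ec2` is a client from `boto3.client("ec2")`, and the helper names are hypothetical.

```python
def snapshots_for_volume(ec2, volume_id: str) -> list:
    """List the IDs of snapshots you own that were taken from the given volume."""
    response = ec2.describe_snapshots(
        OwnerIds=["self"],
        Filters=[{"Name": "volume-id", "Values": [volume_id]}],
    )
    return [snap["SnapshotId"] for snap in response["Snapshots"]]


def create_volume_snapshot(ec2, volume_id: str, description: str = "") -> str:
    """Start a snapshot of one EBS volume and return the new snapshot ID."""
    response = ec2.create_snapshot(VolumeId=volume_id, Description=description)
    return response["SnapshotId"]
```

An empty list from `snapshots_for_volume` corresponds to the empty Snapshots page in the console. Note that `create_snapshot` returns as soon as the snapshot is started, so the snapshot may still be in the pending state when the call returns.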
Step 4: AWS Lambda
To leverage AWS Lambda for executing code in response to events without managing servers, our project focuses on automating the identification and removal of outdated EBS snapshots. Here are the steps to create this Lambda function using the AWS Management Console:
A. Sign in to the AWS Management Console:
- Access the AWS Management Console and authenticate using your AWS credentials.
B. Navigate to the Lambda Dashboard:
- Locate and select Lambda from the services menu in the AWS Management Console.
Start Creating a Lambda Function:
- On the Lambda Dashboard, click the Create function button.
C. Choose Authoring Method:
Opt for Author from scratch to initiate the creation of a new Lambda function.
Configure Basic Details:
Specify a descriptive name for your Lambda function.
Choose Python 3.12 as the runtime environment.
Create the Lambda Function:
- Click on Create function to proceed with the creation of your Lambda function.
Edit Configuration:
- Navigate to the Configuration tab of your Lambda function and click Edit.
- Adjust the function timeout setting to 10 seconds (the default is 3 seconds).
This setting determines the maximum duration Lambda allows a function to run before terminating it. Lambda bills for execution time, so keep the function's run time short to control cost.
Save your changes by clicking Save.
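The timeout change can also be applied programmatically. The sketch below assumes a Lambda client created with `boto3.client("lambda")`; `set_function_timeout` is a hypothetical helper name, and the function name you pass is whatever you chose when creating the function.

```python
def set_function_timeout(lambda_client, function_name: str, seconds: int = 10) -> int:
    """Set the function's timeout and return the value Lambda reports back."""
    # Lambda accepts timeouts between 1 and 900 seconds.
    if not 1 <= seconds <= 900:
        raise ValueError("timeout must be between 1 and 900 seconds")
    response = lambda_client.update_function_configuration(
        FunctionName=function_name,
        Timeout=seconds,
    )
    return response["Timeout"]
```

Validating the value locally gives a clearer error message than waiting for the API to reject it.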
Step 5: Setup IAM Role
In our project, the Lambda function plays a crucial role in optimizing AWS costs by identifying and deleting stale EBS snapshots.
To accomplish this, the function requires specific permissions: the ability to describe and delete snapshots, and to describe volumes and instances.
Roles are used to delegate access to AWS resources securely, eliminating the need to share long-term credentials such as access keys.
Follow these steps to configure the necessary permissions:
Navigate to the Lambda function details page and click on the Configuration tab.
Scroll down to the Permissions section and expand it.
Click on the execution role link to open the IAM role configuration in a new tab.
In the newly opened tab, you will be directed to the IAM Console with details of the IAM role associated with your Lambda function:
- Scroll down to the Permissions section of the IAM role details page.
- Click on the Add inline policy button to create a new inline policy.
To configure the policy:
Choose EC2 as the service and filter permissions.
Search for Snapshot and add the DescribeSnapshots and DeleteSnapshot permissions.
Add the DescribeVolumes and DescribeInstances permissions as well.
Under the Resources section:
Select All.
Click the Next button.
Final steps:
Name the policy and click the Create Policy button.
Ensure that the newly created policy is attached to the existing role.
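The inline policy built in the console corresponds to a small JSON document. As an illustration, the sketch below builds that document in Python and attaches it with the IAM `put_role_policy` call. It assumes an IAM client from `boto3.client("iam")`; the helper names, role name, and policy name are placeholders.

```python
import json

# The four EC2 actions the cleanup function needs.
SNAPSHOT_CLEANUP_ACTIONS = [
    "ec2:DescribeSnapshots",
    "ec2:DeleteSnapshot",
    "ec2:DescribeVolumes",
    "ec2:DescribeInstances",
]


def build_cleanup_policy() -> dict:
    """Return the policy document allowing the snapshot cleanup actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": SNAPSHOT_CLEANUP_ACTIONS,
                "Resource": "*",  # matches selecting "All" resources in the console
            }
        ],
    }


def attach_inline_policy(iam, role_name: str, policy_name: str) -> None:
    """Attach the cleanup policy to the Lambda execution role as an inline policy."""
    iam.put_role_policy(
        RoleName=role_name,
        PolicyName=policy_name,
        PolicyDocument=json.dumps(build_cleanup_policy()),
    )
```

Keeping the action list in one constant makes it easy to review exactly what the function is allowed to do.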
Step 6: Writing the Lambda Function
Our Lambda function, powered by Boto3, automates the identification and deletion of stale EBS snapshots. Key features include:
Snapshot Retrieval: Fetching owned EBS snapshots and active EC2 instances.
Stale Snapshot Detection: Identifying unattached snapshots and checking volume-attachment status.
Exception Handling: Ensuring robustness with error management.
Cost Optimization: Efficiently managing resources to minimize storage costs.
import boto3

def lambda_handler(event, context):
    ec2 = boto3.client('ec2')

    # Get all EBS snapshots owned by this account
    response = ec2.describe_snapshots(OwnerIds=['self'])

    # Get all active EC2 instance IDs (collected for reference; the
    # staleness check below relies on volume attachments)
    instances_response = ec2.describe_instances(Filters=[{'Name': 'instance-state-name', 'Values': ['running']}])
    active_instance_ids = set()
    for reservation in instances_response['Reservations']:
        for instance in reservation['Instances']:
            active_instance_ids.add(instance['InstanceId'])

    # Iterate through each snapshot and delete it if it has no source volume,
    # the volume no longer exists, or the volume is not attached to any instance
    for snapshot in response['Snapshots']:
        snapshot_id = snapshot['SnapshotId']
        volume_id = snapshot.get('VolumeId')

        if not volume_id:
            # Delete the snapshot if it's not attached to any volume
            ec2.delete_snapshot(SnapshotId=snapshot_id)
            print(f"Deleted EBS snapshot {snapshot_id} as it was not attached to any volume.")
        else:
            # Check if the volume still exists
            try:
                volume_response = ec2.describe_volumes(VolumeIds=[volume_id])
                if not volume_response['Volumes'][0]['Attachments']:
                    ec2.delete_snapshot(SnapshotId=snapshot_id)
                    print(f"Deleted EBS snapshot {snapshot_id} as it was taken from a volume not attached to any instance.")
            except ec2.exceptions.ClientError as e:
                if e.response['Error']['Code'] == 'InvalidVolume.NotFound':
                    # The volume associated with the snapshot is not found (it might have been deleted)
                    ec2.delete_snapshot(SnapshotId=snapshot_id)
                    print(f"Deleted EBS snapshot {snapshot_id} as its associated volume was not found.")
                else:
                    # Report unexpected errors instead of silently ignoring them
                    print(f"Skipped EBS snapshot {snapshot_id}: {e}")
This script is pivotal in our AWS cost optimization strategy, demonstrating the effectiveness of serverless computing in streamlining operations and reducing expenses.
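Because the function deletes snapshots as soon as it finds them, it can be useful to preview what it would remove. The following is a sketch of a report-only variant, not the function deployed in this article. It applies the same three staleness rules, uses Boto3 paginators in case the results span more than one page, and deletes nothing. The helper names are hypothetical, and `ec2` is assumed to be a client from `boto3.client("ec2")`.

```python
from typing import Optional


def collect_pages(ec2, operation: str, result_key: str, **kwargs) -> list:
    """Gather every record for a describe_* call, following pagination."""
    records = []
    for page in ec2.get_paginator(operation).paginate(**kwargs):
        records.extend(page[result_key])
    return records


def stale_reason(snapshot: dict, volumes_by_id: dict) -> Optional[str]:
    """Return why a snapshot counts as stale, or None if it should be kept.

    `volumes_by_id` maps volume ID to the describe_volumes record of each
    volume that still exists.
    """
    volume_id = snapshot.get("VolumeId")
    if not volume_id:
        return "no source volume recorded"
    volume = volumes_by_id.get(volume_id)
    if volume is None:
        return "source volume no longer exists"
    if not volume.get("Attachments"):
        return "source volume is not attached to any instance"
    return None


def find_stale_snapshots(ec2) -> list:
    """Return (snapshot_id, reason) pairs without deleting anything."""
    snapshots = collect_pages(ec2, "describe_snapshots", "Snapshots", OwnerIds=["self"])
    volumes = collect_pages(ec2, "describe_volumes", "Volumes")
    volumes_by_id = {volume["VolumeId"]: volume for volume in volumes}
    stale = []
    for snapshot in snapshots:
        reason = stale_reason(snapshot, volumes_by_id)
        if reason is not None:
            stale.append((snapshot["SnapshotId"], reason))
    return stale
```

Fetching all volumes once and indexing them by ID avoids one `describe_volumes` call per snapshot, which helps keep the run within a short Lambda timeout.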
Step 7: Testing the Lambda Function
To simulate a real-world scenario, start by terminating the existing EC2 instance. When an instance is terminated, AWS by default also deletes its root EBS volume.
However, any EBS snapshots associated with that volume remain in storage, even though they are no longer needed.
These snapshots, termed ‘stale,’ incur additional storage costs without serving any purpose. Therefore, it’s crucial to regularly identify and remove such stale snapshots to optimize AWS storage costs effectively.
Once the instance is terminated, we can observe whether our Lambda function successfully identifies and removes any associated snapshots.
Follow these steps:
Terminate the EC2 Instance:
- Navigate to the created EC2 instance and terminate it.
Set Up the Lambda Function:
Navigate to the created Lambda function.
Under the Code section, in lambda_function, paste the above Python code for your Lambda function.
Ensure that your code includes the necessary imports (e.g., import boto3) and the lambda_handler function.
- Click on the Deploy button to deploy the code, then use Test to run the function and check its output.
The Lambda function automatically finds and deletes these stale snapshots, helping you manage your AWS expenses more efficiently.
By following these steps, you can test whether the Lambda function effectively identifies and deletes the stale snapshots, thus optimizing your AWS storage costs.
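The test run can also be triggered and checked from a script. This is a minimal sketch, assuming clients created with `boto3.client("lambda")` and `boto3.client("ec2")`; the helper names are hypothetical.

```python
import json


def invoke_cleanup(lambda_client, function_name: str) -> dict:
    """Invoke the function synchronously with an empty event and summarize the result."""
    response = lambda_client.invoke(
        FunctionName=function_name,
        InvocationType="RequestResponse",
        Payload=json.dumps({}).encode("utf-8"),
    )
    return {
        "status_code": response["StatusCode"],
        # FunctionError is only present when the function itself failed.
        "function_error": response.get("FunctionError"),
        "payload": response["Payload"].read().decode("utf-8"),
    }


def owned_snapshot_ids(ec2) -> set:
    """Return the IDs of all snapshots owned by this account."""
    response = ec2.describe_snapshots(OwnerIds=["self"])
    return {snap["SnapshotId"] for snap in response["Snapshots"]}
```

Calling `owned_snapshot_ids` before and after `invoke_cleanup` and taking the set difference shows which snapshots the run removed.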
Conclusion
In this project, we’ve implemented a solution to automate the identification and deletion of stale EBS snapshots, leveraging AWS Lambda and Boto3. By optimizing storage usage, we have reduced costs and improved resource efficiency. This project demonstrates the effectiveness of automation in driving cost optimization within AWS, setting the stage for continued success in our cloud management endeavors.