tl;dr - Enable lifecycle policies across all of your S3 buckets to delete incomplete multipart uploads. This will likely save a considerable amount of money in both personal accounts and enterprise organizations. I saved 30% in S3 storage costs in my personal account by enabling these policies.
Walkthrough of details
I use Amazon Web Services (AWS) S3 extensively within my AWS automation. Costs had been growing with increased use, so I went looking for ways to reduce them across the environment. As a design principle, I typically try to avoid any persistence-related costs and use fully ephemeral or consumption-based services.
Storage is one area where cost growth is unavoidable as the environment grows. As I began digging into granular costs, I realized that over 30% of my S3 storage was incomplete multipart uploads. These are parts of an upload that was started but never completed or aborted; S3 keeps billing for them even though they never show up as objects. I found this when I thought I had deleted everything from my Athena queries bucket and somehow still had nearly 300 GB of storage remaining with no visible objects. Additional details about multipart uploads can be found in this AWS blog.
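If you want to check a single bucket for this hidden storage from code, here is a minimal sketch. It assumes boto3 is installed and that you pass in a client created with `boto3.client("s3")`; the function name `incomplete_upload_bytes` is my own and not part of any AWS library. It is not how I originally found the problem, just one way to confirm it.

```python
def incomplete_upload_bytes(s3_client, bucket):
    """Return (upload_count, total_bytes) for in-progress multipart uploads.

    s3_client is assumed to be a boto3 S3 client. The function name is
    hypothetical; only the paginator names come from boto3.
    """
    upload_count = 0
    total_bytes = 0
    uploads = s3_client.get_paginator("list_multipart_uploads")
    for page in uploads.paginate(Bucket=bucket):
        # "Uploads" is absent from the response when nothing is in progress
        for upload in page.get("Uploads", []):
            upload_count += 1
            parts = s3_client.get_paginator("list_parts")
            for part_page in parts.paginate(
                Bucket=bucket, Key=upload["Key"], UploadId=upload["UploadId"]
            ):
                # Sum the size of every part already stored for this upload
                for part in part_page.get("Parts", []):
                    total_bytes += part["Size"]
    return upload_count, total_bytes
```

Calling `incomplete_upload_bytes(boto3.client("s3"), "my-athena-results")` (the bucket name is a placeholder) returns the number of unfinished uploads and the bytes they hold. On a bucket with many uploads this makes one `list_parts` request per upload, so it can be slow.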
To get more visibility, I enabled Amazon S3 Storage Lens. This is a must-use feature if you use S3 storage. It provides detailed analysis, including cost and security insights, across the environment, and it can also provide visibility across a multi-account organization. The base metrics are free to enable and are sufficient for this use case.
Enabling Storage Lens
To enable Storage Lens:
- Open the S3 service and click on "Dashboards".
- Click "Create dashboard".
- Provide a dashboard name.
- Select the relevant region.
- Set the Status to Enabled.
- The Dashboard Scope section has options to include additional accounts through the use of AWS Organizations.
- Select "Include Regions and buckets".
- Check the boxes for "Include all Regions" and "Include all buckets".
- Select "Free metrics".
- Select "Disable" for Metrics Export.
- Click "Create dashboard".
The dashboard will likely take around 24 hours to fully generate but will be a valuable resource to utilize for the future.
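The same dashboard can probably be created from code as well. The sketch below is my best understanding of the boto3 `s3control` call `put_storage_lens_configuration`, not something I ran for this article: leaving out the include/exclude, data export, and advanced metrics settings should correspond to "all Regions, all buckets, free metrics, no export" in the console steps above. The function names and the dashboard ID are hypothetical.

```python
def build_storage_lens_config(config_id):
    """Build a minimal Storage Lens configuration (assumed free-tier defaults)."""
    return {
        "Id": config_id,
        # An empty BucketLevel requests bucket-level metrics with no
        # advanced (paid) metric groups enabled
        "AccountLevel": {"BucketLevel": {}},
        # Equivalent to setting Status to Enabled in the console
        "IsEnabled": True,
        # No "Include"/"Exclude" keys: assumed to cover all Regions and buckets
        # No "DataExport" key: assumed to leave metrics export disabled
    }


def create_dashboard(s3control_client, account_id, config_id):
    """Create or update the dashboard using a boto3 s3control client."""
    s3control_client.put_storage_lens_configuration(
        ConfigId=config_id,
        AccountId=account_id,
        StorageLensConfiguration=build_storage_lens_config(config_id),
    )
```

A call would look like `create_dashboard(boto3.client("s3control"), "123456789012", "cost-dashboard")`, where the account ID and dashboard name are placeholders. As far as I know, the Region of the client determines the dashboard's home Region.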
Once the dashboard is generated, you can view cost savings opportunities through the following steps:
- Open the S3 service.
- On the left side panel, navigate to "Storage Lens" and click on "Dashboards".
- Click on the newly created dashboard.
- Within the overview tab, there is a snapshot section.
- Click on "Cost efficiency".
- The section you want to look at is "% incomplete MPU bytes".
- If this is a large percentage, you will certainly want to configure bucket lifecycle policies to clean it up. It was for me: the screenshot below was taken after the clean-up, but before the lifecycle policy was in place, the 30% appeared in the Total column.
Creating bucket lifecycle policies
Bucket lifecycle policies can be used for multiple scenarios, but the focus here is the deletion of incomplete multipart uploads. Storage Lens lets you drill down into the buckets with the largest incomplete MPU bytes percentage so you can prioritize which buckets to clean up first.
For organizations, automation and deployment patterns should apply lifecycle policies at bucket creation and enforce them through ongoing governance. Open source tooling such as Cloud Custodian is excellent for this.
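As a rough illustration of the governance side, here is a minimal sketch that reports which buckets in an account have no enabled rule for aborting incomplete multipart uploads. It assumes a boto3 S3 client is passed in; the function names are hypothetical, and this is not a substitute for a tool like Cloud Custodian. It also treats any enabled rule as sufficient, even one scoped to a prefix, which is a simplification.

```python
def has_abort_rule(rules):
    """True if any enabled lifecycle rule aborts incomplete multipart uploads."""
    return any(
        rule.get("Status") == "Enabled" and "AbortIncompleteMultipartUpload" in rule
        for rule in rules
    )


def buckets_missing_abort_rule(s3_client):
    """Return the names of buckets without an enabled abort rule."""
    missing = []
    for bucket in s3_client.list_buckets().get("Buckets", []):
        name = bucket["Name"]
        try:
            response = s3_client.get_bucket_lifecycle_configuration(Bucket=name)
            rules = response.get("Rules", [])
        except Exception as err:
            # boto3 raises a ClientError with this code when the bucket has
            # no lifecycle configuration at all; anything else is re-raised
            code = getattr(err, "response", {}).get("Error", {}).get("Code")
            if code != "NoSuchLifecycleConfiguration":
                raise
            rules = []
        if not has_abort_rule(rules):
            missing.append(name)
    return missing
```

Running `buckets_missing_abort_rule(boto3.client("s3"))` on a schedule gives a simple list of buckets to follow up on.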
To create a lifecycle policy:
- Open the S3 service.
- Click on "Buckets" in the left side pane.
- Click on the target bucket for the policy.
- Select the "Management" tab and click "Create lifecycle rule".
- Provide a lifecycle rule name.
- Select "Apply to all objects in the bucket".
- Check the box to acknowledge the warning.
- Select only the checkbox for "Delete expired object delete markers or incomplete multipart uploads".
- Check the box for "Delete expired object delete markers".
- Check the box for "Delete incomplete multipart uploads".
- Enter the number of days after which incomplete uploads should be deleted. I selected 1 day as I have no need to keep incomplete uploads for any duration.
- Click "Create rule".
- After creating the rule, it took a few days (3-4) for mine to take effect, so don't be concerned if you do not see immediate results.
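The console steps above can also be sketched in code. This is a minimal example, assuming a boto3 S3 client is passed in; the function names and the rule ID are hypothetical. One thing to be careful about: `put_bucket_lifecycle_configuration` replaces the bucket's entire lifecycle configuration, so the sketch merges the new rule into whatever rules you pass as `existing_rules` rather than overwriting them.

```python
def build_cleanup_rule(days=1, rule_id="abort-incomplete-multipart-uploads"):
    """Build a rule mirroring the two checkboxes selected in the console."""
    return {
        "ID": rule_id,  # hypothetical rule name
        "Status": "Enabled",
        # An empty prefix applies the rule to all objects in the bucket
        "Filter": {"Prefix": ""},
        # "Delete expired object delete markers"
        "Expiration": {"ExpiredObjectDeleteMarker": True},
        # "Delete incomplete multipart uploads" after the given number of days
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": days},
    }


def merge_rule(existing_rules, new_rule):
    """Keep existing rules, replacing any rule that has the same ID."""
    kept = [rule for rule in existing_rules if rule.get("ID") != new_rule["ID"]]
    return kept + [new_rule]


def apply_cleanup_rule(s3_client, bucket, existing_rules=(), days=1):
    """Write the merged lifecycle configuration to the bucket."""
    rules = merge_rule(list(existing_rules), build_cleanup_rule(days))
    s3_client.put_bucket_lifecycle_configuration(
        Bucket=bucket,
        LifecycleConfiguration={"Rules": rules},
    )
    return rules
```

For a bucket with no lifecycle rules yet, `apply_cleanup_rule(boto3.client("s3"), "my-bucket")` is enough (the bucket name is a placeholder). For a bucket that already has rules, fetch them with `get_bucket_lifecycle_configuration` first and pass them in as `existing_rules`.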
Once configured, this rule will continually keep the cost optimized across your buckets. Let me know if this helped save you or your company money! Also, be sure to check out my other articles and give me a follow on Twitter @ryanelkins.