Amazon S3 was one of the first services launched by AWS, in March 2006. Among all the words used to describe a disruptive product (scalable, reliable, fast, simple), one has been especially important: cheap. At that time, the cost was 2 orders of magnitude lower than operating your own redundant data storage. Nowadays, S3 continues to be one of the least expensive data storage options, and over the years it has become one of the pillars of any cloud-based application.
This article provides several tips that can help you cut your S3 costs. Let’s start by understanding the pricing model. The S3 bill is made up of 3 parts:
- Storage cost, meaning the number of gigabytes stored per month
- Access costs, meaning the number of API requests (PUT, GET, DELETE, LIST) made to access your data
- Data transfer
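To make the three parts concrete, here is a minimal sketch of a bill estimator. The rates in `S3Rates` are illustrative assumptions, not official prices (they vary by region, storage class and volume tier), and the function and field names are our own.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class S3Rates:
    """Illustrative prices in USD (assumptions); check the current AWS price list."""
    storage_per_gb_month: float = 0.023
    put_list_per_1000: float = 0.005
    get_per_1000: float = 0.0004
    transfer_out_per_gb: float = 0.09


def estimate_monthly_bill(stored_gb, put_list_requests, get_requests,
                          transfer_out_gb, rates=S3Rates()):
    """Split a monthly S3 bill into storage, request and transfer parts."""
    storage = stored_gb * rates.storage_per_gb_month
    requests = (put_list_requests / 1000 * rates.put_list_per_1000
                + get_requests / 1000 * rates.get_per_1000)
    transfer = transfer_out_gb * rates.transfer_out_per_gb
    return {
        "storage": storage,
        "requests": requests,
        "transfer": transfer,
        "total": storage + requests + transfer,
    }


if __name__ == "__main__":
    bill = estimate_monthly_bill(stored_gb=500,
                                 put_list_requests=1_000_000,
                                 get_requests=10_000_000,
                                 transfer_out_gb=100)
    for part, usd in bill.items():
        print(f"{part:>9}: ${usd:.2f}")
```

Even a rough model like this shows which of the three parts dominates your bill, and therefore which of the tips in this article is worth the effort first.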
Compared to other AWS storage products (Elastic Block Store) or databases (DynamoDB), with S3 you are charged only for what you use and not for what you provision. This is very good news for you as a user. Starting with storage, here are some ideas that could help you cut costs:
- Delete files that are no longer necessary or that can be recreated. S3 is integrated with many other AWS services, and most of them store information there that is not always useful. You may also have an app that uploads data that can be recreated. In either case, don’t hesitate: delete it.
- Understand and use the “lifecycle” feature to delete previous versions (if you have versioning activated) and to delete incomplete multipart uploads.
- Become familiar with the S3 storage classes and create rules to move rarely used data into cheaper storage, such as S3 Standard-Infrequent Access (IA) or S3 Glacier. Keep in mind that S3 IA bills every object as if it were at least 128 KB, so storing small objects there is not a good idea.
- Data format is also important. If you use S3 to store files for an analytical solution, it may be better to choose a binary format instead of a human-readable one. For example, a number with 9 digits occupies 9 bytes in a text file, whereas in a binary format it can be stored as a 4-byte integer.
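The lifecycle-related tips above can be sketched as a single configuration. This is a minimal example, assuming you apply it with boto3’s `put_bucket_lifecycle_configuration`; the rule IDs, the day thresholds and the bucket name in the comment are hypothetical values to adapt to your own data.

```python
import json


def build_lifecycle_configuration(ia_after_days=30, glacier_after_days=90,
                                  noncurrent_expire_days=30,
                                  abort_multipart_days=7):
    """Build a lifecycle configuration dict (all thresholds are assumptions)."""
    return {
        "Rules": [
            {
                # Move data that is no longer read often into cheaper classes.
                "ID": "tier-down-cold-data",  # hypothetical rule name
                "Filter": {"Prefix": ""},     # empty prefix: whole bucket
                "Status": "Enabled",
                "Transitions": [
                    {"Days": ia_after_days, "StorageClass": "STANDARD_IA"},
                    {"Days": glacier_after_days, "StorageClass": "GLACIER"},
                ],
            },
            {
                # Delete old versions and abandoned multipart uploads.
                "ID": "cleanup-versions-and-uploads",  # hypothetical rule name
                "Filter": {"Prefix": ""},
                "Status": "Enabled",
                "NoncurrentVersionExpiration": {
                    "NoncurrentDays": noncurrent_expire_days
                },
                "AbortIncompleteMultipartUpload": {
                    "DaysAfterInitiation": abort_multipart_days
                },
            },
        ]
    }


if __name__ == "__main__":
    config = build_lifecycle_configuration()
    print(json.dumps(config, indent=2))
    # Applying it requires boto3 and AWS credentials, for example:
    #   import boto3
    #   boto3.client("s3").put_bucket_lifecycle_configuration(
    #       Bucket="my-example-bucket",  # hypothetical bucket name
    #       LifecycleConfiguration=config,
    #   )
```

Before enabling the transition rule, check how many of your objects are smaller than 128 KB, because for those the move to IA can cost more than it saves.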
On the access side, where you pay per API request, there are a few aspects to consider:
- The price of an API call is not affected by the object size. We have a dedicated article where we show that S3 is not an efficient choice for small files.
- Right now, a PUT call is roughly 10x more expensive than a GET call (DELETE calls, by contrast, are free). What does this mean in practice? It may be better to store multiple files in one bigger archive (and make only one PUT call) and, whenever you need one file, fetch the archive and extract the object that interests you.
- We didn’t mention the price of LIST calls (currently the same as a PUT call) because this operation should be avoided. At the very least, don’t design an application that relies on it: it’s expensive, it’s slow, and its latency grows with the number of objects stored in the bucket.
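The archive idea from the list above can be sketched with the standard library alone. This is a minimal example, assuming ZIP as the archive format; the helper names `bundle` and `extract_one` are our own, and the actual upload and download calls are left out.

```python
import io
import zipfile


def bundle(files: dict[str, bytes]) -> bytes:
    """Pack many small files into one in-memory ZIP, ready for a single PUT."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, payload in files.items():
            archive.writestr(name, payload)
    return buffer.getvalue()


def extract_one(archive_bytes: bytes, name: str) -> bytes:
    """Read a single member back out of a downloaded archive."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return archive.read(name)


if __name__ == "__main__":
    # 1,000 small files become one object, so one PUT request instead of 1,000.
    small_files = {f"event-{i}.json": b'{"id": %d}' % i for i in range(1000)}
    archive_bytes = bundle(small_files)
    print(f"{len(small_files)} files packed into {len(archive_bytes)} bytes")
    print(extract_one(archive_bytes, "event-42.json"))
```

The trade-off is that every read now downloads the whole archive, so this pays off mainly when the files are written often but read rarely, or are usually read together.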
Regarding data transfer, it’s important to consider:
- Try to avoid public objects, because you pay for each access. If someone finds a public object of yours and keeps downloading it, you pay for that.
- Try to avoid frequent cross-region data transfers. If your use case requires this pattern, it may be cheaper to replicate your bucket into the region where your application is deployed instead of making cross-region calls.
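Whether replication is actually cheaper depends on how much you read versus how much you store and change. Here is a rough break-even sketch; the two rates are illustrative assumptions rather than official prices, and the model deliberately ignores request costs.

```python
def monthly_cross_region_read_cost(monthly_reads_gb, inter_region_per_gb=0.02):
    """Cost of reading the data from another region every month (assumed rate)."""
    return monthly_reads_gb * inter_region_per_gb


def monthly_replica_cost(dataset_gb, monthly_changed_gb,
                         storage_per_gb_month=0.023, inter_region_per_gb=0.02):
    """Cost of keeping a replica: extra storage plus copying changed data."""
    return (dataset_gb * storage_per_gb_month
            + monthly_changed_gb * inter_region_per_gb)


def cheaper_to_replicate(dataset_gb, monthly_changed_gb, monthly_reads_gb):
    """True when a local replica costs less than repeated cross-region reads."""
    return (monthly_replica_cost(dataset_gb, monthly_changed_gb)
            < monthly_cross_region_read_cost(monthly_reads_gb))


if __name__ == "__main__":
    # A 100 GB dataset with 10 GB of changes per month, read heavily vs lightly.
    print(cheaper_to_replicate(100, 10, monthly_reads_gb=1000))
    print(cheaper_to_replicate(100, 10, monthly_reads_gb=50))
```

With these assumed rates, a replica only wins when the application reads far more data across regions each month than the dataset itself holds.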
As you can see, there are many ways to cut your S3 costs. At least for us, this is quite challenging, because we always have to balance the savings against the effort needed to achieve them. If you have other ideas on this topic, leave a comment. If you found this article useful, share it.