How to add customization files in Elastic Beanstalk

The AWS Elastic Beanstalk documentation has an entire section about configuring your application with customization files, but for us two aspects are missing: how to actually add those files and how to check their output once they run. Let’s say you use Beanstalk to deploy a Java app that is built with […]

Monitor records import into Redshift

Let’s imagine that we have an application where new entries are constantly loaded into Redshift. Recently we published a blog post presenting some best practices for data imports, and another one describing a way to organize tables if your use case involves frequent data imports. But enough with the bounce, […]

How to handle Redshift non-idempotency problem on data loading

As you probably know, in Redshift most constraints (uniqueness, primary key, foreign key) are informational only; NOT NULL is the exception and is enforced. This means that if you insert the same entry twice into a table that has a primary key defined, the table will contain that entry twice. Now, let’s imagine the following scenario: your application follows the recommendations […]

An efficient approach of organizing tables into Redshift

In a previous post, we presented several ideas to improve data loading into Redshift. Today we are going to discuss an approach for organizing data in Redshift that scales to billions of entries without affecting read performance. To summarize the entire flow and requirements in a few words: entries are copied from S3, several times per hour […]

Tips about loading data into Redshift

The official description of AWS Redshift starts with: “a fast, fully managed, petabyte-scale data warehouse”. Our experience with Redshift confirms this description, with one very important caveat: in order to really see the advantages and the incredible power of this service, we had to put into action some solutions that at first seemed unimportant. […]

Tables and partitions in DynamoDB

In AWS DynamoDB, data are organized into tables. Even though DynamoDB is schemaless, all entries inserted into a table must have the same primary key as the one defined for the table. But why is the primary key so important? Because it is later used to distribute entries. According to the official documentation, a table […]

Several tips for your Beanstalk service

Creating a Beanstalk environment using the AWS console is straightforward, mainly because the flow is intuitive and the steps are clearly listed. But with this approach we must make sure several configuration options are properly set. In this article we’ll list a couple of settings we have seen ignored in many situations. Our […]

Trigger a State Machine execution in response to an S3 event

Recently, we presented some conclusions we found interesting and relevant about Step Functions; one plus for us is its integration with other AWS services, especially AWS CloudWatch and Lambda. On the other hand, at this moment Step Functions doesn’t support triggering an execution in response to an event that occurred in AWS, like […]

Some conclusions about AWS Step Functions

Recently we set up an internal tool that runs several types of workflows to produce reports for upper management. We already use AWS Simple Workflow Service in a similar service, but knowing the overhead it brings in terms of dependency management, testing and initialization, we decided to investigate an alternative offered by AWS: Step […]