How to Reduce AWS Data Transfer Costs: A CFO’s Guide to Cloud Savings

If you’re using AWS, you may have noticed data transfer fees creeping into your expenses. These costs appear in your cost and usage reports, but don’t be fooled: if they’re not monitored, they can quickly add up and become a significant contributor to your AWS bill.

Many organizations face unexpectedly high data transfer charges that can run into lakhs of rupees per year. To get a handle on these costs and reduce them, it’s essential to build a clear picture of where data transfer charges originate and identify which resources are driving them.
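As a starting point, you can break the bill down by usage type to see which transfer categories dominate. The snippet below is a minimal sketch using boto3’s Cost Explorer client; the date range and the substring match on usage-type names are illustrative assumptions, not something taken from the original post.

```python
import boto3

# Cost Explorer client (the API is served from us-east-1)
ce = boto3.client("ce", region_name="us-east-1")

response = ce.get_cost_and_usage(
    TimePeriod={"Start": "2024-01-01", "End": "2024-02-01"},  # example month
    Granularity="MONTHLY",
    Metrics=["UnblendedCost"],
    GroupBy=[{"Type": "DIMENSION", "Key": "USAGE_TYPE"}],
)

# Keep only usage types that look like data transfer and sort by cost
for result in response["ResultsByTime"]:
    transfer = [
        (g["Keys"][0], float(g["Metrics"]["UnblendedCost"]["Amount"]))
        for g in result["Groups"]
        if "DataTransfer" in g["Keys"][0]
    ]
    for usage_type, cost in sorted(transfer, key=lambda x: x[1], reverse=True):
        print(f"{usage_type:<60} ${cost:,.2f}")
```

Running this against your own account quickly shows whether inter-AZ, inter-region, or internet egress is the main driver.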

This blog explores a practical scenario that sheds light on AWS data transfer pricing, highlighting the typical challenges teams encounter and offering actionable strategies to help you optimize your cloud expenditures and effectively manage AWS costs.

Continue reading “How to Reduce AWS Data Transfer Costs: A CFO’s Guide to Cloud Savings”

Stream and Analyze PostgreSQL Data from S3 Using Kafka and ksqlDB: Part 2

Introduction

In Part 1, we set up a real-time data pipeline that streams PostgreSQL changes to Amazon S3 using Kafka Connect. Here’s what we accomplished:

  • Configured PostgreSQL for CDC (using logical decoding/WAL)
  • Deployed Kafka Connect with JDBC Source Connector (to capture PostgreSQL changes)
  • Set up an S3 Sink Connector (to persist data in S3 in Avro/Parquet format; a minimal configuration sketch follows below)
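For reference, here is a minimal sketch of registering such a sink connector through the Kafka Connect REST API. The endpoint URL, connector name, topic, bucket name, and flush size are illustrative assumptions; the exact configuration used in Part 1 may differ.

```python
import json
import requests

# Assumed Kafka Connect REST endpoint; adjust host/port for your deployment
CONNECT_URL = "http://localhost:8083/connectors"

s3_sink_config = {
    "name": "postgres-s3-sink",                      # illustrative connector name
    "config": {
        "connector.class": "io.confluent.connect.s3.S3SinkConnector",
        "tasks.max": "1",
        "topics": "postgres.public.orders",          # example topic from the source connector
        "s3.bucket.name": "my-cdc-bucket",           # assumed bucket name
        "s3.region": "us-east-1",
        "storage.class": "io.confluent.connect.s3.storage.S3Storage",
        "format.class": "io.confluent.connect.s3.format.avro.AvroFormat",
        "flush.size": "1000",
    },
}

resp = requests.post(
    CONNECT_URL,
    headers={"Content-Type": "application/json"},
    data=json.dumps(s3_sink_config),
)
resp.raise_for_status()
print("Connector created:", resp.json()["name"])
```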

In Part 2 of our journey, we dive deeper into the process of streaming data from PostgreSQL to S3 via Kafka. This time, we explore how to set up the connectors, create a sample PostgreSQL table loaded with a large dataset, and leverage ksqlDB for real-time data analysis. Additionally, we’ll cover the steps to configure AWS IAM policies for secure S3 access. Whether you’re building a data pipeline or experimenting with Kafka integrations, this guide will help you navigate the essentials with ease.
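As a preview of the ksqlDB portion, the snippet below shows one way to submit a statement to ksqlDB’s REST API from Python. The server URL, stream name, and column definitions are assumptions made for illustration; the actual schema is defined later in the walkthrough.

```python
import requests

# Assumed ksqlDB server endpoint
KSQLDB_URL = "http://localhost:8088/ksql"

# Illustrative stream over the CDC topic produced by the source connector
statement = """
    CREATE STREAM orders_stream (
        order_id INT,
        customer_id INT,
        amount DOUBLE,
        created_at VARCHAR
    ) WITH (
        KAFKA_TOPIC = 'postgres.public.orders',
        VALUE_FORMAT = 'AVRO'
    );
"""

resp = requests.post(
    KSQLDB_URL,
    headers={"Content-Type": "application/vnd.ksql.v1+json"},
    json={"ksql": statement, "streamsProperties": {}},
)
resp.raise_for_status()
print(resp.json())
```

Once the stream exists, push queries against it give you continuously updating results as new rows land in the topic.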

Continue reading “Stream and Analyze PostgreSQL Data from S3 Using Kafka and ksqlDB: Part 2”

Know How to Access S3 Bucket without IAM Roles and Use Cases

We have all used IAM credentials to access our S3 buckets. But it is neither safe nor recommended practice to keep access keys and secrets stored on a server or hard-code them in a codebase.
Even if we have to use keys, we need a mechanism in place to rotate them frequently (e.g., using HashiCorp Vault). Another widely adopted approach is to attach an IAM role to the EC2 instance or the AWS service accessing the bucket.
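To illustrate the rotation idea, here is a minimal sketch of fetching short-lived AWS credentials from HashiCorp Vault’s AWS secrets engine using the hvac client. The Vault address, token, role name, and bucket are assumptions for illustration only, not the approach this post ultimately recommends.

```python
import boto3
import hvac

# Assumed Vault address, token, and AWS secrets engine role
vault = hvac.Client(url="http://127.0.0.1:8200", token="s.example-token")

# Ask Vault to mint short-lived AWS credentials for the "s3-reader" role
creds = vault.secrets.aws.generate_credentials(name="s3-reader")["data"]

# Use the temporary credentials with boto3; they expire per the role's TTL
s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["access_key"],
    aws_secret_access_key=creds["secret_key"],
    aws_session_token=creds.get("security_token"),
)

for obj in s3.list_objects_v2(Bucket="my-example-bucket").get("Contents", []):
    print(obj["Key"])
```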

But what if we need access to the bucket from an on-premises data center, where we cannot attach an IAM role?

Yes, we could obviously fall back on IAM credentials and secret tokens with a rotation mechanism. But setting up key rotation is itself an overhead if we do not already have one in place. What if we could do away with keys and roles entirely, without making the bucket public?

In this blog, I will attempt to address this problem with an alternative, simpler solution.

Continue reading “Know How to Access S3 Bucket without IAM Roles and Use Cases”