opstreeblog, Author at DEVOPS DONE RIGHT

The closer you think you are, the less you’ll actually see

I hope you have seen the movie Now You See Me, which has a famous line: “The closer you think you are, the less you’ll actually see.” This blog is not about the movie, but about how I got stuck on an issue because I was not looking at things closely enough to spot the cause.

There is a lot happening in today’s DevOps world, and HashiCorp has emerged as a big player in it. Terraform is one of its open-source tools for managing infrastructure as code, and it works well with most cloud providers. But with continuous improvements and enhancements comes the possibility of issues as well. This article is about one such scenario; if you have found yourself in the same trouble, you have reached the right page.

I was learning Terraform and performing a simple task: launching an Ubuntu EC2 instance in the us-east-1 region. For this I needed the AMI ID, which I copied from the AWS console as shown in the screenshot below.

Once I had the AMI ID, I tried to create the instance using Terraform. Below is the code:

provider "aws" {
  region     = "us-east-1"
  access_key = "XXXXXXXXXXXXXXXXXX"
  secret_key = "XXXXXXXXXXXXXXXXXXX"
}

resource "aws_instance" "sandy" {
  ami           = "ami-036ede09922dadc9b"
  instance_type = "t2.micro"
  subnet_id     = "subnet-0bf4261d26b8dc3fc"
}

I was expecting to see the magic of Terraform, but what I got instead was the ugly error below.

Terraform would not spin up the instance. I tried a couple of things, none of which worked, and as you can see the error message did not give much information. Finally, I tried doing the same task through the AWS web console. I searched for the same Ubuntu AMI and selected the image as shown below, keeping everything else at the defaults. This time the instance launched.

That confused me even more: through the console it worked fine, but Terraform said it was not allowed. After a lot of hair pulling I finally found the culprit, which is a perfect example of how overlooking small things can lead to a blunder.

Culprit

While copying the AMI ID from the AWS console, I had copied the 64-bit (ARM) AMI ID. Look carefully at the screenshot below.

But while creating the instance through the console, I was selecting the default configuration, which is 64-bit (x86). Look at the screenshot below.

To confirm this, I tried to launch the VM manually with the 64-bit (ARM) option selected for the AMI.

And here is the culprit: the 64-bit (ARM) AMI supports only the a1 instance type, while my Terraform code asked for t2.micro.

Conclusion

While launching the instance with Terraform, I mistakenly used the 64-bit (ARM) AMI ID. The same image is listed with two AMI IDs, one per architecture, and the difference is easy to miss unless you pay special attention.

So folks, the next time you choose an AMI ID, keep in mind which type of AMI you are selecting. It will save you a lot of time.
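
The kind of check that would have caught this mistake can be sketched in a few lines of Python. This is only an illustration: the family tables are a small hand-picked sample (not AWS's full catalogue), and the function names are made up for this example. The architecture strings "arm64" and "x86_64" are the values EC2 reports for an AMI.

```python
# Illustrative pre-flight check for AMI / instance-type mismatches.
# The family sets are a tiny hand-picked sample, not a complete list.
ARM64_FAMILIES = {"a1"}               # the ARM family discussed in this post
X86_64_FAMILIES = {"t2", "t3", "m5"}  # a few common x86 families


def instance_architecture(instance_type: str) -> str:
    """Return the CPU architecture for an instance type such as 't2.micro'."""
    family = instance_type.split(".")[0]
    if family in ARM64_FAMILIES:
        return "arm64"
    if family in X86_64_FAMILIES:
        return "x86_64"
    raise ValueError(f"unknown instance family: {family}")


def can_launch(ami_architecture: str, instance_type: str) -> bool:
    """True when the AMI architecture matches the instance type's architecture."""
    return ami_architecture == instance_architecture(instance_type)


# The combination from this post: an ARM AMI with an x86 instance type.
print(can_launch("arm64", "t2.micro"))   # False
print(can_launch("arm64", "a1.medium"))  # True
```

In practice you would read the AMI architecture from the console or the EC2 API instead of typing it by hand.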

Migrate your data between various Databases

Database Migration Service

Have you ever thought about migrating your production database from one platform to another, and then dropped the idea because it was too risky or because you could not bear the downtime?

If yes, then please pay attention because this is what we are going to perform
in this article.

A few days back we were trying to migrate our production MySQL RDS from AWS to GCP SQL, and we had to migrate the data accurately, in real time, without downtime, and without the help of a database administrator.

After a bit of research and evaluating a few services, we settled on AWS DMS (Database Migration Service) and found it to be a great service for migrating many different kinds of data.

You can migrate your data to and from the most widely used commercial and open-source databases and database platforms, such as Oracle, Microsoft SQL Server, PostgreSQL, and MongoDB.

The source database remains fully operational during the migration. The service supports homogeneous migrations, such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms.

Let’s discuss some important features of AWS DMS:

Migrates the database securely, quickly, and accurately.
No downtime required; for heterogeneous migrations, schema conversion is handled by the companion AWS Schema Conversion Tool.
Supports various types of databases, such as MySQL, MongoDB, and PostgreSQL.
Migrates existing data and also synchronizes ongoing changes in real time.
Data validation is available to verify the migrated data.
Compatible with a wide range of database platforms, such as RDS, Google SQL, and on-premises servers.
Inexpensive (pricing is based on the compute resources used during the migration process).

This is a typical migration scenario.

Let’s perform step by step migration:

Note: We migrated from AWS RDS to GCP SQL; you can choose the source and destination databases as per your requirement.

Create replication instance:
A replication instance initiates the connection between the source and target databases, transfers the data, and caches any changes that occur on the source database during the initial data load.

Use the fields below to configure the parameters of your new replication instance, including network and security information and encryption details, and select the instance class as per your requirement.

After completing all mandatory fields, click Next, and you will be redirected to the Replication Instance tab.

Grab a coffee quickly while the instance is getting ready.

Hope you are ready with your coffee, because the instance is ready now.
Now we need to create two endpoints, “Source” and “Target”.

2.1 Create Source Endpoint:

After completing all fields, click “Run test”. Make sure your replication instance IP is whitelisted under the security group.

2.2 Create Target Endpoint:

After completing all fields, click “Run test” again. Make sure your replication instance IP is whitelisted under the target DB authorization.

Now the Replication Instance, Source Endpoint, and Target Endpoint are ready.
Finally, we’ll create a “Replication Task” to start replication.

Fill in the fields like this:

Task Name: any name
Replication Instance: the instance we created above
Source Endpoint: the source database
Target Endpoint: the target database
Migration Type: here I chose “Migrate existing data and replicate ongoing changes” because we needed ongoing changes.
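
If you would rather script the task than click through the console, the same fields can be collected in code. The sketch below only builds the arguments; it does not call AWS. The key names follow what I understand to be the keyword arguments of boto3's DMS create_replication_task call, and "full-load-and-cdc" is the API value for migrating existing data plus ongoing changes. The ARNs and the task name are placeholders, and the table mapping simply includes every schema and table.

```python
import json


def build_replication_task_args(task_name, replication_instance_arn,
                                source_endpoint_arn, target_endpoint_arn):
    """Collect the replication task fields from the list above into one dict."""
    # Selection rule that includes every table in every schema ("%" is a wildcard).
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-all",
                "object-locator": {"schema-name": "%", "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": task_name,
        "ReplicationInstanceArn": replication_instance_arn,
        "SourceEndpointArn": source_endpoint_arn,
        "TargetEndpointArn": target_endpoint_arn,
        "MigrationType": "full-load-and-cdc",  # existing data + ongoing changes
        "TableMappings": json.dumps(table_mappings),
    }


# Placeholder ARNs, for illustration only.
args = build_replication_task_args(
    "rds-to-gcp-sql",
    "arn:aws:dms:region:account:rep:EXAMPLE",
    "arn:aws:dms:region:account:endpoint:SOURCE",
    "arn:aws:dms:region:account:endpoint:TARGET",
)
print(args["MigrationType"])
```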

4. Verify the task status now.

Once all the fields are completed, click “Create task” and you will be redirected to the “Tasks” tab.

Check your task status.

Once the task has completed successfully, you can verify the results in the inserts tab and the validation tab.

If the Validation State is “Validated”, the migration has been performed successfully.

Git Inside Out


Git is basically a content-addressable file system: you can insert any kind of data into Git, and Git hands you back a unique key that you can use later to retrieve that content. We will explore #gitinsideout through this blog.

The Git object model has three main types: blobs (for files), trees (for folders), and commits. (Annotated tags are stored as a fourth object type.)

Objects are immutable (they are added but not changed) and every object is identified by its unique SHA-1 hash
A blob is just the contents of a file. By default, every new version of a file gets a new blob, which is a snapshot of the file (not a delta like many other versioning systems).
A tree is a list of references to blobs and trees.
A commit is a reference to a tree, a reference to parent commit(s) and some decoration (message, author).
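
To make the “unique key” idea concrete, here is a small Python sketch of how Git derives the SHA-1 of a blob: it hashes a short header ("blob", the content length in bytes, and a NUL byte) followed by the file content. The function name is made up for this example.

```python
import hashlib


def git_blob_sha1(content: bytes) -> str:
    """Compute the object ID Git assigns to a blob with this content."""
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Same result as: echo hello | git hash-object --stdin
print(git_blob_sha1(b"hello\n"))  # ce013625030ba8dba906f756967f9e9ca394464a
```

Because the key depends only on the content, two files with identical content share one blob, and any change to the content produces a different blob.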

Then there are branches and tags, which are typically just references to commits.

Git stores this data in the .git/objects directory. Initialising a Git repository automatically creates .git/objects/pack and .git/objects/info, with no regular files in them. Once you add and commit some files, the new objects show up under .git/objects/.

OBJECT Blob

A blob stores the content of a file. We can check its content with the command

git cat-file -p <sha>

or git show <sha>, where <sha> is the hash of the blob.

OBJECT Tree

A tree is a simple object that has a bunch of pointers to blobs and other trees – it generally represents the contents of a directory or sub-directory.

We can use git ls-tree to list the content of the given tree object

OBJECT Commit

The “commit” object links a physical state of a tree with a description of how we got there and why.

A commit is defined by its tree, parent(s), author, committer, and commit message.
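
As a rough illustration of that structure, the Python sketch below splits the text that git cat-file -p prints for a commit into those fields. The parser and the sample commit are made up for this example (the hashes and the identity are placeholders), and extra headers such as signatures are simply ignored.

```python
def parse_commit(raw: str) -> dict:
    """Split the text of a commit object into its fields."""
    # Header lines and the commit message are separated by one blank line.
    headers, _, message = raw.partition("\n\n")
    commit = {"tree": None, "parents": [], "author": None,
              "committer": None, "message": message.strip()}
    for line in headers.splitlines():
        key, _, value = line.partition(" ")
        if key == "parent":
            commit["parents"].append(value)  # a merge commit has several
        elif key in ("tree", "author", "committer"):
            commit[key] = value
    return commit


# Placeholder commit text, shaped like the output of `git cat-file -p <sha>`.
SAMPLE = """tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent 1111111111111111111111111111111111111111
parent 2222222222222222222222222222222222222222
author Example User <user@example.com> 1545000000 +0530
committer Example User <user@example.com> 1545000000 +0530

Merge branch 'test'
"""

commit = parse_commit(SAMPLE)
print(len(commit["parents"]), commit["message"])
```

A root commit would have an empty parents list, an ordinary commit one parent, and a merge commit two or more.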

All three objects (blob, tree, commit) are explained in detail below with the help of pictorial diagrams.

We often make changes to our code and push them to SCM. Once, after making multiple changes, I thought it would be great if I could see the details of those changes in the local repository itself instead of going to the remote repository server. That pushed me to explore Git more deeply.

I created a local “remote” repository with the help of a Git bare repository, made some changes, and tracked those changes (type, content, size, etc.).

Below example will help you understand the concept behind it.

Suppose we have cloned a repository named kunal:

Inside the folder where we have cloned the repository, go to the folder kunal then:

cd kunal/.git/

I have added content(hello) to readme.md and made many changes into the same repository as:

adding README.md

updating Readme.md

adding 2 files modifying one

pull request

commit(adding directory).

Go to the refs folder inside .git and take the SHA value for the master head:

This commit object we can explore further with the help of cat-file which will show the type and content of tree and commit object:

Now we can see a tree object inside the tree object. Further, we can see the details for the tree object which in turn contains a blob object as below:

Below is the pictorial representation for the same:

More elaborated representation for the same :

Below are the commands for checking the content, type and size of objects( blob, tree and commit)

kunal@work:/home/git/test/kunal# cat README.md

hello

We can find the details of objects (size, type, content) with the help of git cat-file.

git-cat-file: Provide content, type or size information for repository objects

You can verify the content of a commit object and its type with git cat-file as below:

kunal@work:/home/git/test/kunal/.git # cat logs/refs/heads/master

Checking the content of a blob object(README.md, kunal and sandy)

As we can see, the first entry is the commit that adds the README, so its parent is the null SHA (00000…000) and its own unique SHA-1 is 912a4e85afac3b737797b5a09387a68afad816d6.
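
Each line of logs/refs/heads/master has a fixed shape: the old SHA, the new SHA, the identity with a timestamp, then a tab and the message. A minimal Python sketch for reading one line is shown here; the function name and the identity and timestamp in the sample line are placeholders, and only the SHA-1 comes from this post.

```python
NULL_SHA = "0" * 40  # "no parent": the branch did not exist before this entry


def parse_reflog_line(line: str) -> dict:
    """Split one reflog line into old SHA, new SHA, identity and message."""
    meta, _, message = line.rstrip("\n").partition("\t")
    old_sha, new_sha, identity = meta.split(" ", 2)
    return {
        "old": old_sha,
        "new": new_sha,
        "identity": identity,
        "message": message,
        "is_first_commit": old_sha == NULL_SHA,
    }


# Sample line: identity and timestamp are invented for illustration.
line = (NULL_SHA + " 912a4e85afac3b737797b5a09387a68afad816d6 "
        "Example User <user@example.com> 1545000000 +0530"
        "\tcommit (initial): adding README.md\n")
entry = parse_reflog_line(line)
print(entry["is_first_commit"], entry["new"])
```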

Below are the details that we can fetch from above SHA-1 with the help of git cat-file :

Consider one example of merge:

I created a test branch, made changes, and merged it into master.

Here you can notice that the commit has two parents because it is a merge.

You can further see the content, size, and type of repository #gitobjects like:

Summary

This is a pretty lengthy article, but I’ve tried to make it as transparent and clear as possible. Once you work through it and understand the concepts shown here, you will be able to work with Git more effectively.

This explanation covers the tree data structure and the internal storage of objects. You can check the content (differences/commits) of files through the local .git repository, which stores each object under a unique SHA hash. That covers the basic internal working of Git.
Hopefully this blog helps you understand Git inside out and troubleshoot Git-related issues.

Log Parsing of Windows Servers on Instance Termination

We all know how critical logs are as a part of any system: they give you deep insight into your application, what your system is doing, and what caused an error. Depending on how logging is configured, logs may contain transaction history, timestamps, amounts debited from or credited to a client’s account, and a lot more.

In an enterprise-level application, your system runs on multiple hosts, and managing logs across them can be complicated. Debugging an error across hundreds of log files on hundreds of servers is time consuming and complicated, so it is always better to move the logs to a centralized location.

Lately in my company I faced a situation that I assume is very common in Amazon’s cloud: we had to retain application logs from multiple instances behind an Auto Scaling group. Let’s take an example for better understanding.

Suppose your application is configured to log into the C:\Source\Application\web\logs directory. The application receives variable incoming traffic: sometimes the requests can be handled by 2 servers, and at other times it may take 20 servers to handle them.

When there is a spike in traffic, the EC2 Auto Scaling group (ASG) scales out from 2 servers to many, according to the ASG policy, and the application running on the newly launched EC2 instances also logs into C:\Source\Application\web\logs. But when traffic drops, the ASG triggers a scale-down policy and terminates instances, which also deletes all the log files on the instances that the ASG launched during the high-traffic period.

Faced a similar situation? No worries; I figured out a solution to retain those logs.
In this blog, the goal is to sync the logs from dying instances at the time of their termination. We will do this with AWS services: SSM triggers a PowerShell script on the instance, and the script syncs the logs to an S3 bucket along with enough information to identify the dying instance. For this we need 2 things:

1) Configuring SSM agent to be able to talk to Ec2 Instances

2) Ec2 Instances being able to write to S3 Buckets

For the tutorial we will be using Microsoft Windows Server 2012 R2 Base with the AMI ID: ami-0f7af6e605e2d2db5

A blueprint of the scenario is shown below:

1) Configuring SSM agent to be able to talk to Ec2 Instances

SSM Agent is installed by default on Windows Server 2016 instances and instances created from Windows Server 2003-2012 R2 AMIs published in November 2016 or later. Windows AMIs published before November 2016 use the EC2Config service to process requests and configure instances.

If your instance is a Windows Server 2003-2012 R2 instance created before November 2016, then EC2Config must be upgraded on the existing instances to use the latest version of EC2Config. By using the latest EC2Config installer, you install SSM Agent side-by-side with EC2Config. This side-by-side version of SSM Agent is compatible with your instances created from earlier Windows AMIs and enables you to use SSM features published after November 2016.

This simple script can be used to update EC2Config and then layer it with the latest version of SSM Agent. It also installs the AWS CLI, which is used to push the log archives to S3.

#ScriptBlock

# Create the temporary download directory if it does not exist yet
if(!(Test-Path -Path C:\Tmp )){
    mkdir C:\Tmp
}
cd C:\Tmp
# wget is the Windows PowerShell alias for Invoke-WebRequest
wget https://s3.ap-south-1.amazonaws.com/asg-termination-logs/Ec2Install.exe -OutFile Ec2Config.exe
wget https://s3.amazonaws.com/ec2-downloads-windows/SSMAgent/latest/windows_amd64/AmazonSSMAgentSetup.exe -OutFile ssmagent.exe
wget https://s3.amazonaws.com/aws-cli/AWSCLI64PY3.msi -OutFile awscli.msi
wget https://s3.amazonaws.com/aws-cli/AWSCLISetup.exe -OutFile awscli.exe
# Run each installer silently, pausing so it can finish before the next one starts
Invoke-Command -ScriptBlock {C:\Tmp\Ec2Config.exe /Ec /S /v/qn }
sleep 20
Invoke-Command -ScriptBlock {C:\Tmp\awscli.exe /Ec /S /v/qn }
sleep 20
Invoke-Command -ScriptBlock {C:\Tmp\ssmagent.exe /Ec /S /v/qn }
sleep 10
Restart-Service AmazonSSMAgent
# Leave the directory before deleting it together with the downloaded installers
cd C:\
Remove-Item C:\Tmp -Recurse -Force

An IAM role is required for SSM to communicate with the EC2 instance:

IAM instance role: Verify that the instance is configured with an AWS Identity and Access Management (IAM) role that enables the instance to communicate with the Systems Manager API.

Add instance profile permissions for Systems Manager managed instances to an existing role

Open the IAM console at https://console.aws.amazon.com/iam/.
In the navigation pane, choose Roles, and then choose the existing role you want to associate with an instance profile for Systems Manager operations.
On the Permissions tab, choose Attach policy.
On the Attach policy page, select the check box next to AmazonEC2RoleforSSM, and then choose Attach policy.

Now, Navigate to Roles > and select your role.

That should look like:

2) Ec2 Instances being able to write to S3 Buckets

An IAM Role is Required for Ec2 to be able to write to S3:

IAM instance role: Verify that the instance is configured with an AWS Identity and Access Management (IAM) role that enables the instance to communicate with the S3 API.

Add instance profile permissions for S3 access to an existing role

Open the IAM console at https://console.aws.amazon.com/iam/.
In the navigation pane, choose Roles, and then choose the existing role you want to associate with an instance profile for S3 access.
On the Permissions tab, choose Attach policy.
On the Attach policy page, select the check box next to AmazonS3FullAccess, and then choose Attach policy.

That should look like:

This PowerShell script, saved as C:/Scripts/termination.ps1, picks up log files from:

$SourcePathWeb

and writes the archived logs to:

$DestFileWeb

with an IP and date stamp, so that you can later identify the instance the logs came from.
Make sure you change the S3 bucket name, the --region value, and the source of the log files to match your setup.

#ScriptBlock

$Date=Get-Date -Format yyyy-MM-dd
$InstanceName="TerminationEc2"
# Ask the instance metadata service for this instance's private IP
$LocalIP=(curl http://169.254.169.254/latest/meta-data/local-ipv4 -UseBasicParsing).Content

# Start from a clean working directory for this instance and date
if((Test-Path -Path "C:\Users\Administrator\workdir\$InstanceName-$LocalIP-$Date\$Date" )){
    Remove-Item "C:\Users\Administrator\workdir\$InstanceName-$LocalIP-$Date\$Date" -Force -Recurse
}

New-Item -Path "C:\Users\Administrator\workdir\$InstanceName-$LocalIP-$Date\$Date" -ItemType Directory
$SourcePathWeb="C:\Source\Application\web\logs"
$DestFileWeb="C:\Users\Administrator\workdir\$InstanceName-$LocalIP-$Date\$Date\logs.zip"

# Zip the log directory into logs.zip
Add-Type -Assembly "System.IO.Compression.FileSystem"
[IO.Compression.ZipFile]::CreateFromDirectory($SourcePathWeb, $DestFileWeb)

# Upload the working directory to the S3 bucket
& "C:\Program Files\Amazon\AWSCLI\bin\aws.cmd" s3 cp C:\Users\Administrator\workdir s3://terminationec2 --recursive --exclude "*.ok" --include "*" --region us-east-1
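
For readers who find the naming scheme easier to follow outside PowerShell, here is the archiving step again as a Python sketch. It is not a replacement for the script above: the function name is made up, the IP is passed in instead of being read from the metadata service, and the upload to S3 is left out.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_logs(source_dir, work_dir, instance_name, local_ip, day):
    """Zip source_dir into <work_dir>/<name>-<ip>-<date>/<date>/logs.zip."""
    stamp = day.isoformat()  # yyyy-MM-dd, like Get-Date -Format yyyy-MM-dd
    dest_dir = Path(work_dir) / f"{instance_name}-{local_ip}-{stamp}" / stamp
    # Start from a clean directory, as the PowerShell script does.
    if dest_dir.exists():
        shutil.rmtree(dest_dir)
    dest_dir.mkdir(parents=True)
    # make_archive appends ".zip" to the base name and returns the final path.
    archive = shutil.make_archive(str(dest_dir / "logs"), "zip",
                                  root_dir=str(source_dir))
    return Path(archive)
```

Calling archive_logs(r"C:\Source\Application\web\logs", r"C:\Users\Administrator\workdir", "TerminationEc2", "10.0.0.5", date.today()) would produce a logs.zip under a folder named after the instance, its IP, and the date (the IP here is just an example value).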

If the above settings are in place, running the script manually should produce output indicating success.

Check your S3 bucket to see whether the logs have been synced there. Since the focus of this blog is to trigger the PowerShell script on the instance through SSM, so that it syncs the logs to the S3 bucket, we will now try running the script through SSM > Run Command.

Select one of the instances that has the above script and configuration, and run the command. The output should be pleasing.

The AMI used by the ASG should have the above configuration (this can be achieved by creating an AMI from an EC2 instance that has the above configuration and then adding it to the Launch Configuration of the ASG). The ASG we have here for the tutorial is named after my last name: “group_kaien”.

Now, the last and most important step is configuring CloudWatch > Events > Rules.

Navigate to CloudWatch > Events > Rules and choose Create rule.

This would return the following JSON config:

{
  "source": [
    "aws.autoscaling"
  ],
  "detail-type": [
    "EC2 Instance Terminate Successful",
    "EC2 Instance-terminate Lifecycle Action"
  ],
  "detail": {
    "AutoScalingGroupName": [
      "group_kaien"
    ]
  }
}
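
To see how such a pattern selects events, here is a simplified matcher in Python. It covers only the form used in this rule, where each field lists the values it accepts and nested objects are matched field by field; real CloudWatch Events patterns support more operators than this sketch. The sample event, including the instance ID, is made up.

```python
PATTERN = {
    "source": ["aws.autoscaling"],
    "detail-type": [
        "EC2 Instance Terminate Successful",
        "EC2 Instance-terminate Lifecycle Action",
    ],
    "detail": {"AutoScalingGroupName": ["group_kaien"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """True if every field named in the pattern has an accepted value in the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested pattern: the event field must be an object that matches too.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif actual not in expected:
            return False
    return True


# Made-up event, trimmed to the fields the pattern looks at plus one extra.
event = {
    "source": "aws.autoscaling",
    "detail-type": "EC2 Instance-terminate Lifecycle Action",
    "detail": {
        "AutoScalingGroupName": "group_kaien",
        "EC2InstanceId": "i-0123456789abcdef0",
    },
}
print(matches(PATTERN, event))  # True
```

Fields that the pattern does not mention, such as EC2InstanceId here, are ignored; an event from a different Auto Scaling group would not match.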

On the right side, under Targets, select:

SSM Run Command:

Document: AWS-RunPowerShellScript
Target key: InstanceIds or tag:
Target Values:

Configure parameter

Commands: .\termination.ps1
WorkingDirectory: C:\Scripts
ExecutionTimeout: 3600 is default

This makes sure that when a termination event happens, the PowerShell script runs and syncs the logs to S3. This is what our configuration looks like:

For more on setting up Cloudwatch Events refer :
https://docs.aws.amazon.com/AmazonCloudWatch/latest/events/CWE_GettingStarted.html

Wait for the Auto Scaling policies to run so that new instances are created and terminated with the above configuration. The terminating instances will sync their logs to S3 before they are fully terminated. Here’s the output on S3 for me after a scale-down activity.

Conclusion

With the above, we have learned how to automatically export logs to S3 from a dying instance, with the date stamp set in the termination.ps1 script.
That fulfils the scope of this blog.
Stay tuned for more.

Prometheus Overview and Setup

Overview

Prometheus is an open-source monitoring solution that gathers numerical time-series data. The project was started at SoundCloud by ex-Google employees.

To monitor your services and infrastructure with Prometheus, each service needs to expose an endpoint in the form of a port or URL, for example localhost:9090. The endpoint is an HTTP interface that exposes the metrics.
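
To give a feel for what such an endpoint returns, here is a minimal Python sketch using only the standard library. It renders a couple of made-up gauge values in the Prometheus text format and answers GET /metrics with them. The metric names and values are hypothetical; a real service would normally use an official Prometheus client library instead.

```python
from http.server import BaseHTTPRequestHandler

# Hypothetical metric values; a real service would update these as it runs.
METRICS = {"demo_requests_in_flight": 3, "demo_queue_length": 0}


def render_metrics(metrics: dict) -> str:
    """Render name/value pairs as gauges in the Prometheus text format."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


class MetricsHandler(BaseHTTPRequestHandler):
    """Answers GET /metrics with the current metric values."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


print(render_metrics(METRICS), end="")
```

To actually serve it, you could run HTTPServer(("", 8000), MetricsHandler).serve_forever() from http.server (port 8000 is an arbitrary choice) and point a Prometheus scrape target at that port.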

Some platforms, such as Kubernetes and SkyDNS, are directly instrumented for Prometheus, which means you don’t have to install any kind of exporter to monitor them; Prometheus can scrape them directly.

One of the best things about Prometheus is that it uses a time series database (TSDB), so you can analyze the data with queries and mathematical operations. Prometheus ships with its own embedded TSDB and keeps the monitoring data on local disk volumes.

Pre-requisites

A CentOS 7 or Ubuntu VM
A non-root sudo user, preferably one named prometheus

Installing Prometheus Server

First, create a new directory to store all the files you download in this tutorial and move to it.

mkdir /opt/prometheus-setup
cd /opt/prometheus-setup

Create a user named “prometheus”

useradd prometheus

Use wget to download a build of the Prometheus server and time-series database from GitHub (this tutorial uses v2.0.0).

wget https://github.com/prometheus/prometheus/releases/download/v2.0.0/prometheus-2.0.0.linux-amd64.tar.gz

The Prometheus monitoring system consists of several components, each of which needs to be installed separately.

Use tar to extract prometheus-2.0.0.linux-amd64.tar.gz:

tar -xvzf prometheus-2.0.0.linux-amd64.tar.gz

Place the executable somewhere in your PATH for easy access.

mv prometheus-2.0.0.linux-amd64  prometheus
sudo mv  prometheus/prometheus  /usr/bin/
sudo chown prometheus:prometheus /usr/bin/prometheus
sudo chown -R prometheus:prometheus /opt/prometheus-setup/
sudo mkdir /etc/prometheus
sudo mv prometheus/prometheus.yml /etc/prometheus/
sudo chown -R prometheus:prometheus /etc/prometheus/
prometheus --version

You should see the following message on your screen:

  prometheus,       version 2.0.0 (branch: HEAD, revision: 0a74f98628a0463dddc90528220c94de5032d1a0)
  build user:       root@615b82cb36b6
  build date:       20171108-07:11:59
  go version:       go1.9.2

Create a service for Prometheus

sudo vi /etc/systemd/system/prometheus.service

[Unit]
Description=Prometheus

[Service]
User=prometheus
ExecStart=/usr/bin/prometheus --config.file /etc/prometheus/prometheus.yml --storage.tsdb.path /opt/prometheus-setup/

[Install]
WantedBy=multi-user.target

systemctl daemon-reload

systemctl start prometheus

systemctl enable prometheus

Installing Node Exporter

Prometheus was developed for the purpose of monitoring web services. In order to monitor the metrics of your server, you should install a tool called Node Exporter. Node Exporter, as its name suggests, exports lots of metrics (such as disk I/O statistics, CPU load, memory usage, network statistics, and more) in a format Prometheus understands.

Node Exporter is a binary written in Go. From the /opt/prometheus-setup directory, use wget to download a build of Node Exporter from GitHub (this tutorial uses v0.15.1).

wget https://github.com/prometheus/node_exporter/releases/download/v0.15.1/node_exporter-0.15.1.linux-amd64.tar.gz

You can now use the tar command to extract node_exporter-0.15.1.linux-amd64.tar.gz:

tar -xvzf node_exporter-0.15.1.linux-amd64.tar.gz

mv node_exporter-0.15.1.linux-amd64 node-exporter

Move the binary into a directory on your PATH:

sudo mv node-exporter/node_exporter /usr/bin/

Running Node Exporter as a Service

Create a user named “prometheus” on the machine on which you are going to create node exporter service.

useradd prometheus

To make it easy to start and stop the Node Exporter, let us now convert it into a service. Use vi or any other text editor to create a unit configuration file called node_exporter.service.

sudo vi /etc/systemd/system/node_exporter.service

This file should contain the path of the node_exporter executable, and also specify which user should run the executable. Accordingly, add the following code:

[Unit]
Description=Node Exporter

[Service]
User=prometheus
ExecStart=/usr/bin/node_exporter

[Install]
WantedBy=default.target

Save the file and exit the text editor. Reload systemd so that it reads the configuration file you just created.

sudo systemctl daemon-reload

At this point, Node Exporter is available as a service which can be managed using the systemctl command. Enable it so that it starts automatically at boot time.

sudo systemctl enable node_exporter.service

You can now either reboot your server or use the following command to start the service manually:

sudo systemctl start node_exporter.service

Once it starts, use a browser to view Node Exporter’s web interface, which is available at http://your_server_ip:9100/metrics. You should see a page with a lot of text:
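
That text is easy to read by machine as well. The Python sketch below turns a page in this format into a dictionary of series and values. It is a simplification: it skips comment lines and assumes no trailing timestamps on sample lines, which is how Node Exporter output normally looks. The sample text and its values are made up.

```python
def parse_metrics(text: str) -> dict:
    """Map each series (name plus labels) to its value, skipping comments."""
    samples = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line, or a HELP/TYPE comment
        # The value is the last space-separated field on the line.
        series, _, value = line.rpartition(" ")
        samples[series] = float(value)
    return samples


# Made-up sample in the same shape as the /metrics page.
SAMPLE = """# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.21
# TYPE node_cpu counter
node_cpu{cpu="cpu0",mode="idle"} 1234.5
"""

for series, value in parse_metrics(SAMPLE).items():
    print(series, value)
```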

Starting Prometheus Server with a new node

Before you start Prometheus, you must first edit a configuration file for it called prometheus.yml.

vim /etc/prometheus/prometheus.yml

Copy the following code into the file.

# my global configuration which means it will applicable for all jobs in file
global:
  scrape_interval:     15s # Set the scrape interval to every 15 seconds. Default is every 1 minute. scrape_interval should be provided for scraping data from exporters 
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute. Evaluation interval checks at particular time is there any update on alerting rules or not.

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'. Here we will define our rules file path 
#rule_files:
#  - "node_rules.yml"
#  - "db_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape: In the scrape config we can define our job definitions
scrape_configs:
  # The job name is added as a label `job=` to any timeseries scraped from this config.
  - job_name: 'node-exporter'
    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'. 
    # target are the machine on which exporter are running and exposing data at particular port.
    static_configs:
      - targets: ['localhost:9100']

After adding the configuration to prometheus.yml, restart the service:

systemctl restart prometheus

This creates a scrape_configs section and defines a job called node-exporter. It includes the address of your Node Exporter’s web interface in its array of targets. The scrape_interval is set to 15 seconds so that Prometheus scrapes the metrics once every fifteen seconds. You can name your job anything you want, but the default Node Exporter console templates expect a job called “node”, so use that name if you want to use them.

Use a browser to visit Prometheus’s homepage available at http://your_server_ip:9090. You’ll see the following homepage. Visit http://your_server_ip:9090/consoles/node.html to access the Node Console and click on your server, localhost:9100, to view its metrics.