Speeding up Ansible Execution Part 1

Knowledge of at least one SCM (software configuration management) tool is a must for any DevOps engineer, and Ansible is one of the most popular tools in this category. We are all aware of the ease that Ansible provides, whether for infra provisioning, orchestration or application deployment.
The reason for Ansible's vast popularity is the long list of modules it provides to support any level of automation. Moreover, it gives users the flexibility to create their own modules as per their requirements.
But the purpose of this blog is not to list the features that Ansible provides; it is to show how we can speed up playbook execution. For a beginner, running Ansible is very easy and feels like a big time saver, but as you dive deeper you will find that running playbooks can keep you waiting for a considerable amount of time.
There are a lot of articles available on the internet on how to speed up Ansible execution, so I have decided to sum them up in this blog. With the following methods, we can reduce execution time without compromising the overall performance of Ansible.
Before starting, I request you to make a small change in your Ansible configuration file (ansible.cfg). This change will help you track the total time taken by the playbook execution, and it also lists the time taken by each task.
Just add these lines to your ansible.cfg file under the [defaults] section (on Ansible 2.11 and later the same option is named callbacks_enabled):

[defaults]
callback_whitelist = profile_tasks

Forks

When you run a playbook against many hosts, you may have noticed that it executes on only 5 servers at a time. That is the default value of forks. You can increase this number in the [defaults] section of the ansible.cfg file:
# ansible.cfg
[defaults]
forks = 10

or pass it as a command line argument to ansible-playbook with the -f or --forks option. We can increase or decrease this value as per our requirement.
When using a higher forks value, keep the number of "local_action" or delegated steps limited, because every fork that runs them adds load on the Ansible control server.

Async

In Ansible, each task blocks the playbook by default: the connection stays open until the task is done on each node, which in some cases takes a lot of time. For those particular tasks we can use "async". With poll set to 0, Ansible starts the task and moves on to the next one without waiting for it to finish on each node. With a positive poll value, Ansible still waits for the task, but it checks on it periodically instead of holding the connection open.
To launch a task asynchronously, we specify its maximum runtime (async, in seconds) and how frequently we would like to poll for status (poll, in seconds). If poll is omitted, a default is used; older documentation gives it as 10 seconds, so check the poll_interval setting for your version.
tasks:
  - name: "name of the task"
    command: "command we want to execute"
    async: 40
    poll: 15
The only condition for a fire-and-forget task (poll: 0) is that the subsequent tasks must not depend on it.

Free Strategy 

When running Ansible playbooks, you might have noticed that Ansible runs each task on all nodes before moving on: it will not start the next task until the current one has completed on every node, which in some cases takes a lot of time.
By default, the strategy is set to "linear"; we can set it to "free".

- hosts: "hosts/groups"
  name: "name of the playbook"
  strategy: free

With the free strategy, each host runs through the playbook independently, without waiting for the other nodes to complete each task.

Gather Facts

Fact gathering is enabled by default when executing a playbook, but sometimes we don't need it.
In those cases we can disable it, which helps when scaling Ansible in push mode to very large numbers of systems.

- hosts: "hosts/groups"
  name: "name of the playbook"
  gather_facts: no

Pipelining 

For each task, Ansible performs several SSH operations, which increases the total execution time. Pipelining reduces the number of SSH operations required to execute a module by running many Ansible modules without an actual file transfer. We just have to make this change in the ansible.cfg file:
# ansible.cfg
[ssh_connection]
pipelining = True
Although this can result in a very significant performance improvement, pipelining is disabled by default because it conflicts with the requiretty option, which is enabled by default in the sudoers file on many distros.

Poll Interval

When we run a task, Ansible polls to check whether the task has completed on the host. We can decrease this polling interval in ansible.cfg to improve performance, but a lower value increases CPU usage on the control node, so we need to adjust it carefully. We just have to set this parameter in the [defaults] section of the ansible.cfg file:
internal_poll_interval = 0.001
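
The settings discussed in this post can be collected into one file. Below is a minimal sketch, not an official tool, that writes them out with Python's standard configparser module. The function name build_ansible_cfg and the chosen values are assumptions for illustration; the section names ([defaults] and [ssh_connection]) are where Ansible reads these options.

```python
import configparser
import io


def build_ansible_cfg(forks=10, pipelining=True, internal_poll_interval=0.001):
    """Return ansible.cfg text containing the tuning options from this post."""
    cfg = configparser.ConfigParser()
    # Options read from the [defaults] section.
    cfg["defaults"] = {
        "callback_whitelist": "profile_tasks",
        "forks": str(forks),
        "internal_poll_interval": str(internal_poll_interval),
    }
    # Pipelining is an SSH connection option, so it lives in its own section.
    cfg["ssh_connection"] = {"pipelining": str(pipelining)}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()


# Print the generated file; redirect it to ansible.cfg once you have reviewed it.
print(build_ansible_cfg())
```

Tune the values to your own environment; not every setup needs every option.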

So, these are the various ways to decrease playbook execution time in Ansible. Generally we don't use all of these methods in a single setup; we pick them as per the requirement.
The main motive of this blog is to identify the factors that help in fine-tuning Ansible performance. There are many more factors that serve the same purpose, but here I have mentioned the most important ones.
I hope I have covered all the important aspects; feel free to provide your valuable feedback.
Thanks!

Source:

https://mitogen.networkgenomics.com/ansible_detailed.html

Let's Get Started With Packer

In this blog post, we will see how to get started with Packer. We will cover installation and writing a template for creating an AWS AMI. To get a basic understanding of how Packer works, you can refer to our previous blog "Intro To Packer".
Installation
  1. The official way to install Packer is to download the precompiled binary. At the time of writing, Packer does not provide system packages and there is no plan to make them available:
     $ curl -L -O https://releases.hashicorp.com/packer/1.4.0/packer_1.4.0_linux_amd64.zip
  2. After downloading the archive, unzip it to the location where you want to keep it. If you want it to be usable by all users of the system, do not unzip it in user space:
     $ sudo unzip packer_1.4.0_linux_amd64.zip -d /usr/local/packer
  3. After unzipping the package, the directory should contain a single binary program called packer.
  4. The final step of the installation is to make sure the directory you installed Packer to is on the PATH, so that it can be used from the command line. Open /etc/environment and append the line below to the end of the file:
     export PATH="$PATH:/usr/local/packer"
     Then source the environment file so that the change takes effect:
     $ source /etc/environment
  5. Verify the installation by running the packer command, or simply check its version:
     $ packer --version
     You should see the version of Packer as output.
Once installed, running Packer is as simple as packer build followed by the template file; Packer takes the build file and runs the steps we provide within it. Let's get started with a simple build file.
 
Setting Up Stage
 
                                                                                                                         
As we are building an image for the AWS cloud, there are certain prerequisites which need to be taken care of.
You should have an IAM user who is allowed to create and destroy EC2 instances, create AMIs, create and destroy security groups, and so on. You can find a sample policy in the Packer documentation under "sample minimum IAM user policy for Packer".

After setting up your IAM user for Packer, generate an access key (access key ID and secret access key) and save it.
Having noted the key, you can either use it directly in your template (which is not recommended) or configure it as an environment variable or in the AWS CLI config on the machine where Packer is installed.

I have configured it in the AWS CLI config, so I did not have to define it in the variables section or in the builder section. You can also pass your access keys as variables when running the packer build command.
Here we will be installing the Apache web server in the image. I have named the JSON file httpd.json and used the httpd.sh script in the provisioners section to install httpd.
 
 
Below is the sample httpd.json file
 
{
  "variables": {
    "ami_id": "ami-0a574895390037a62",
    "app_name": "httpd"
  },
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "ap-south-1",
      "vpc_id": "vpc-df95d4b7",
      "subnet_id": "subnet-175b2d7f",
      "source_ami": "{{user `ami_id`}}",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "PACKER-DEMO-{{user `app_name`}}",
      "tags": {
        "Name": "PACKER-DEMO-{{user `app_name`}}",
        "Env": "DEMO"
      }
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "script": "httpd.sh"
    }
  ]
}

 

 
Below is the simple httpd.sh
 

#!/bin/bash

# Install the Apache web server; on Ubuntu the package is named apache2.
sudo apt-get update
sudo apt-get install -y apache2
 

 
First, validate your template by running the command below:
packer validate httpd.json

You should get either a success message or an error indicating the line number.

Now run packer build to build your image:

packer build httpd.json

After a successful build, you will get the AMI ID and a success message as output.
 
==> amazon-ebs: Prevalidating AMI Name: PACKER-DEMO-httpd
   amazon-ebs: Found Image ID: ami-0a574895390037a62
==> amazon-ebs: Creating temporary keypair: packer_5cd559df-84ce-ff8a-fa93-0c4477d988e4
==> amazon-ebs: Creating temporary security group for this instance: packer_5cd559e2-ea81-be15-b94a-c28493c0d3ff
==> amazon-ebs: Authorizing access to port 22 from [0.0.0.0/0] in the temporary security groups…
==> amazon-ebs: Launching a source AWS instance…
==> amazon-ebs: Adding tags to source instance
   amazon-ebs: Adding tag: “Name”: “Packer Builder”
   amazon-ebs: Instance ID: i-06ed051a3435865c4
==> amazon-ebs: Waiting for instance (i-06ed051a3435865c4) to become ready…
==> amazon-ebs: Using ssh communicator to connect: *.*.*.*
==> amazon-ebs: Waiting for SSH to become available…
==> amazon-ebs: Connected to SSH!
==> amazon-ebs: Stopping the source instance…
   amazon-ebs: Stopping instance
==> amazon-ebs: Waiting for the instance to stop…
==> amazon-ebs: Creating AMI PACKER-DEMO-httpd from instance i-06ed051a3435865c4
   amazon-ebs: AMI: ami-0ce41081a3b649374
==> amazon-ebs: Waiting for AMI to become ready…
==> amazon-ebs: Adding tags to AMI (ami——)…
==> amazon-ebs: Tagging snapshot: snap-0ee3ce80ec289ed24
==> amazon-ebs: Creating AMI tags
   amazon-ebs: Adding tag: “Name”: “PACKER-DEMO-httpd”
   amazon-ebs: Adding tag: “Env”: “DEMO”
==> amazon-ebs: Creating snapshot tags
==> amazon-ebs: Terminating the source AWS instance…
==> amazon-ebs: Cleaning up any extra volumes…
==> amazon-ebs: No volumes to clean up, skipping
==> amazon-ebs: Deleting temporary security group…
==> amazon-ebs: Deleting temporary keypair…
Build ‘amazon-ebs’ finished.
==> Builds finished. The artifacts of successful builds are:
--> amazon-ebs: AMIs were created:
ap-south-1: ami——————–

 

 
A few things to keep in mind:

  • Packer does not create an image of an already running instance. Instead, it spins up a temporary instance, creates the image from it, and afterwards destroys all the resources it created for the build.
  • Though Packer makes it easy to create AMIs programmatically, remember to purge older images, because the snapshots behind AMIs are stored in S3 and add to your cost.
  • Though rollback becomes a lot easier with immutable infra, it can become a pain in the neck if you frequently make changes in production.
  • We cannot expect Packer to solve all our problems; its only job is to create an image. You will have to decide when to create an image and what needs to be done or deployed after image creation.
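
To make the point about purging older images concrete, here is a minimal sketch of the selection step only. It assumes you already have a list of image records with "ImageId" and "CreationDate" fields (the shape the EC2 API returns when describing images), and that the dates are ISO 8601 strings in the same format, so they sort correctly as text. The function name, the keep_latest parameter and the image IDs are hypothetical.

```python
def select_amis_to_purge(images, keep_latest=3):
    """Return the IDs of images to purge, keeping the `keep_latest` newest ones."""
    # ISO 8601 timestamps in one format sort chronologically as plain strings.
    ordered = sorted(images, key=lambda img: img["CreationDate"], reverse=True)
    return [img["ImageId"] for img in ordered[keep_latest:]]


# Hypothetical records, for illustration only.
images = [
    {"ImageId": "ami-aaa", "CreationDate": "2019-05-01T10:00:00.000Z"},
    {"ImageId": "ami-bbb", "CreationDate": "2019-05-03T10:00:00.000Z"},
    {"ImageId": "ami-ccc", "CreationDate": "2019-05-02T10:00:00.000Z"},
]
print(select_amis_to_purge(images, keep_latest=1))
```

The sketch only decides which images are old; deregistering them and deleting their snapshots is a separate step that you should review before automating.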
I hope the above setup helps you get started with Packer. Later we will discuss how to use it along with Ansible and Terraform to achieve immutable infra.
I appreciate any suggestions, comments, or questions you run into while implementing it.

Intro to Packer

Packer is an open-source tool developed by HashiCorp to create machine images for multiple platforms such as AWS, GCP, Azure or even VMware. As the name suggests, it packs all your software, packages and configurations while baking your machine images. Packer is perhaps the only tool on the market right now that focuses solely on creating machine images and gives us the ability to automate the image creation process.

In this blog post, we will learn what Packer does and how it does it. Sounds interesting!

 
What is Packer and Machine Images
 
“Packer is an open source tool for creating identical machine images for multiple platforms from a single source configuration. Packer is lightweight, runs on every major operating system, and is highly performant, creating machine images for multiple platforms in parallel.”
https://www.packer.io/intro/
It installs and configures software inside your Packer-made images using shell scripts or SCM tools such as Ansible, Chef or Puppet. You can either include your scripts inline in the JSON template or source them from a file.
 
“A machine image is a single static unit that contains a pre-configured operating system and installed software which is used to quickly create new running machines. Machine image formats change for each platform. Some examples include AMIs for EC2, VMDK/VMX files for VMware, OVF exports for VirtualBox, etc”
https://www.packer.io/intro/ 
 
Why the Heck Should We Consider Learning Packer?

Consider the couple of scenarios below:

Scenario 1

You want to have an immutable infrastructure in place. The key guideline behind an immutable infrastructure is that you never modify a running server. If a change is required, you instead completely replace the server with a new instance that contains the update or change.
The new server instance is created from an origin image, or from an image restored from a previously defined server state. Version-control and tag your images for easy rollback and distribution. The image contains all the application code, runtime dependencies and configuration: in essence, the state needed for the software to run as expected. You will want to minimize the time required to bake everything into your image, which is possible if you maintain proper tags on your previous release images so that they can serve as the golden origin image for the next one. The entire process of baking and using images becomes remarkably easy with Packer.
 
Scenario 2

You have autoscaling in place, so you need new serviceable VMs to come up as soon as possible, but several steps stand between a fresh VM and a serviceable one:
  • OS boot
  • OS configuration
  • SCM with Ansible or Chef
  • Setting up your application
With a pre-baked image in place, the time to scale up your VMs decreases drastically.
 
So How Does Packer Work?

Packer uses a JSON file as a template. It takes the template as input, spins up a temporary VM based on the details provided, does the required configuration and stops the VM. After stopping the VM, it creates the image and saves it under the name and tags provided in the template.

[Figure: JSON file -> Packer engine -> EC2 AMI]
              
                                                       
Basic Concepts of Packer
 
There are two things you need to know to get started with Packer:
  • Templates
  • Commands

Templates

There are four sections in a Packer template:
  • variables (optional): an object of one or more key/value strings that define the user variables contained in the template. If it is not specified, no variables are defined.
  • builders (required): an array of one or more objects that define the builders used to create machine images for this template, and that configure each of those builders.
  • provisioners (optional): an array of one or more objects that define the provisioners used to install and configure software on the machines created by each of the builders.
  • post-processors (optional): an array of one or more objects that define the post-processing steps to take with the built images. If not specified, no post-processing is done.
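
The four sections above can be checked with a few lines of Python before handing a template to Packer. This is only a sketch of the rules stated in the list (builders required, the rest optional) and not a substitute for packer validate; the function name check_template is hypothetical.

```python
import json


def check_template(template):
    """Return a list of problems found in a Packer JSON template (as a dict)."""
    problems = []
    # builders is the only required section and must contain at least one builder.
    builders = template.get("builders")
    if not isinstance(builders, list) or not builders:
        problems.append("'builders' must be a non-empty array")
    # variables is optional, but when present it must be a key/value object.
    if not isinstance(template.get("variables", {}), dict):
        problems.append("'variables' must be an object")
    # provisioners and post-processors are optional arrays.
    for name in ("provisioners", "post-processors"):
        if name in template and not isinstance(template[name], list):
            problems.append(f"'{name}' must be an array")
    return problems


sample = json.loads("""
{
  "builders": [{"type": "amazon-ebs"}],
  "provisioners": [{"type": "shell", "script": "httpd.sh"}]
}
""")
print(check_template(sample))
```

An empty list means the top-level structure looks right; builder-specific settings are still validated only by Packer itself.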

Sub-Commands

Like Unix commands, packer takes sub-commands and options. The three main sub-commands are:
  • build: The packer build command takes a template and runs all the builds within it in order to generate a set of artefacts. The various builds specified within a template are executed in parallel unless otherwise specified, and the artefacts that are created are printed at the end of the build.
  • validate: The packer validate command is used to validate the syntax and configuration of a template. The command returns a zero exit status on success and a non-zero exit status on failure. Additionally, if a template doesn't validate, the error messages are printed.
  • inspect: The packer inspect command takes a template and outputs the various components the template defines. This can help you quickly learn about a template without having to dive into the JSON itself. The command tells you things like what variables a template accepts, the builders it defines, the provisioners it defines and the order they'll run in, and more.

I hope this blog helps you understand the basics of Packer. Having covered the basics, we can now "Get Started With Packer".

ANSIBLE DYNAMIC INVENTORY: IS IT SO HARD?

Wondering what the above diagram is all about? Once you are done with this blog, you will know exactly what it is. Until a month ago, I was of the opinion that Dynamic Inventory is a cool way of managing your AWS infrastructure: you don't have to track your servers, you just apply proper tags and Ansible Dynamic Inventory magically manages the inventory for you. Having said that, I was not really comfortable using dynamic inventory because it was a black box. I tried going through the Python script, which was very cryptic and difficult to understand. If you are of the same opinion, then this blog is worth reading, as I will try to demystify how Dynamic Inventory works and how you can implement your own using a very simple Python script.

You can refer to the article below if you want to implement Dynamic Inventory for your AWS infrastructure.

https://aws.amazon.com/blogs/apn/getting-started-with-ansible-and-dynamic-amazon-ec2-inventory-management/

Now, coming to what dynamic inventory is and how you can create one: you first have to understand what Ansible accepts as an inventory. Ansible expects JSON in the format below. The screenshot shows the bare minimum content required by Ansible. Ansible expects a dictionary of groups (each group having a list of hosts under group>hosts and group variables in the group>vars dictionary), and a _meta dictionary that stores the host variables for all hosts individually (inside a hostvars dictionary).

So as long as you can write a script which generates output in the above JSON format, Ansible won't give you any trouble. So let's start creating our own custom inventory.

I have created a Python script, customdynamicinventory.py, which reads the data from input.csv and generates the JSON described above. For simplicity, I have kept my input.csv as simple as possible. You can find the code here:

https://github.com/SUNIL23891YADAV/dynamicinventory.git

If you want to test it, just clone the code and replace the IP, user and key details in the input.csv file as per your environment. To make sure that our Python script generates output in the standard JSON format expected by Ansible, you can run ./customdynamicinventory.py --list
It will generate the output in standard JSON format, as shown in the screenshot below.
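
For readers who cannot see the screenshots, here is a minimal sketch of what such a script might look like. It is not the code from the repository above: the CSV column names (group, ip, user, key), the function names and the sample addresses are assumptions for illustration. The output structure (groups with hosts and vars, plus _meta.hostvars) is the one described earlier.

```python
import csv
import io
import json


def build_inventory(csv_file):
    """Build the JSON structure Ansible expects from the rows of a CSV file."""
    inventory = {"_meta": {"hostvars": {}}}
    for row in csv.DictReader(csv_file):
        # One entry per group, each with its hosts list and group vars.
        group = inventory.setdefault(row["group"], {"hosts": [], "vars": {}})
        group["hosts"].append(row["ip"])
        # Per-host connection details go under _meta.hostvars.
        inventory["_meta"]["hostvars"][row["ip"]] = {
            "ansible_user": row["user"],
            "ansible_ssh_private_key_file": row["key"],
        }
    return inventory


def respond(args, csv_file):
    """Answer the --list and --host calls that Ansible makes to an inventory script."""
    if args[:1] == ["--list"]:
        return json.dumps(build_inventory(csv_file), indent=2)
    # Host variables are already in _meta, so --host can return an empty dict.
    return json.dumps({})


# Hypothetical sample data standing in for input.csv.
SAMPLE_CSV = (
    "group,ip,user,key\n"
    "web,192.0.2.10,ubuntu,/home/ubuntu/.ssh/demo.pem\n"
    "db,192.0.2.20,ubuntu,/home/ubuntu/.ssh/demo.pem\n"
)
print(respond(["--list"], io.StringIO(SAMPLE_CSV)))
```

In a real inventory script you would pass sys.argv[1:] and an opened input.csv to respond() and make the file executable, so that Ansible can call it with -i.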




If you want to see how a static inventory file would have looked for the above scenario, you can refer to the screenshot below. It would have served the same purpose as the dynamic inventory above.

Now, to make sure your custom inventory is working fine, you can run

ansible all -i customdynamicinventory.py -m ping

It will try to ping all the hosts mentioned in the CSV. Let's check it.

See, it is working; that's how easy it is.

Instead of a static CSV file, we can have a database where all the hosts and their details are updated dynamically. The Ansible dynamic inventory script can then use this database as its inventory source, as long as it returns the JSON structure shown in the first screenshot.
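
As a rough illustration of the database idea, the sketch below reads hosts from a SQLite table using Python's built-in sqlite3 module. The table name hosts, its columns and the sample rows are hypothetical; any database works as long as the script prints the same JSON structure.

```python
import json
import sqlite3


def inventory_from_db(conn):
    """Build the Ansible inventory structure from a `hosts` table."""
    inventory = {"_meta": {"hostvars": {}}}
    rows = conn.execute("SELECT group_name, ip, ssh_user FROM hosts ORDER BY ip")
    for group_name, ip, ssh_user in rows:
        group = inventory.setdefault(group_name, {"hosts": [], "vars": {}})
        group["hosts"].append(ip)
        inventory["_meta"]["hostvars"][ip] = {"ansible_user": ssh_user}
    return inventory


# An in-memory database with sample rows, standing in for a real host database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE hosts (group_name TEXT, ip TEXT, ssh_user TEXT)")
conn.executemany(
    "INSERT INTO hosts VALUES (?, ?, ?)",
    [("web", "192.0.2.10", "ubuntu"), ("db", "192.0.2.20", "ubuntu")],
)
print(json.dumps(inventory_from_db(conn), indent=2))
```

Because the inventory is rebuilt on every run, hosts added to or removed from the table are picked up the next time Ansible calls the script.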

Migrate your data between various Databases

Database Migration Service

 
Have you ever thought about migrating your production database from one platform to another and dropped the idea later because it was too risky and you were not ready to bear the downtime?
If yes, then please pay attention, because this is exactly what we are going to do in this article.
A few days back, we were trying to migrate our production MySQL RDS database from AWS to GCP Cloud SQL, and we had to migrate the data accurately, in real time and without downtime, and that too without the help of any database administrator.

After a bit of research and evaluating a few services, we finally started working with AWS DMS (Database Migration Service) and found it to be a great service for migrating different kinds of data.

You can migrate your data to and from the most widely used commercial and open-source databases and database platforms, such as Oracle, Microsoft SQL Server, PostgreSQL and MongoDB.
The source database remains fully operational during the migration.
The service supports homogeneous migrations, such as Oracle to Oracle, and also heterogeneous migrations between different database platforms.
 

Let's discuss some important features of AWS DMS:

  • Migrates the database securely, quickly and accurately.
  • No downtime required; works as a schema converter as well.
  • Supports various types of databases, such as MySQL, MongoDB and PostgreSQL.
  • Migrates real-time data and also synchronizes ongoing changes.
  • Data validation is available to verify the migrated database.
  • Compatible with a wide range of database platforms, such as RDS, Google Cloud SQL and on-premises databases.
  • Inexpensive (pricing is based on the compute resources used during the migration process).
This is a typical migration scenario.
Let's perform the migration step by step:

Note: We performed the migration from AWS RDS to GCP Cloud SQL; you can choose the source and destination databases as per your requirement.

  1. Create a replication instance:
    A replication instance initiates the connection between the source and target databases, transfers the data, and caches any changes that occur on the source database during the initial data load.
    Use the fields on the creation page to configure the parameters of your new replication instance, including network and security information and encryption details, and select an instance class as per your requirement.

    After completing all mandatory fields, click Next, and you will be redirected to the Replication Instance tab.
    Grab a coffee quickly while the instance is getting ready.

    Hope you are ready with your coffee, because the instance is ready now.


  2. Now we create two endpoints, "Source" and "Target".
    2.1 Create the Source Endpoint:

    Click "Run test" after completing all fields, and make sure your replication instance IP is whitelisted in the security group.
    2.2 Create the Target Endpoint:


    Click "Run test" again after completing all fields, and make sure your replication instance IP is whitelisted under the target DB authorization.
    Now the Replication Instance, Source Endpoint and Target Endpoint are ready.
  3. Finally, we'll create a "Replication Task" to start replication.
    Fill in the fields as follows:
  • Task Name: any name
  • Replication Instance: the instance we created above
  • Source Endpoint: the source database
  • Target Endpoint: the target database
  • Migration Type: here I chose "Migrate existing data and replicate ongoing changes" because we needed the ongoing changes as well.
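
The same task can also be created through the API instead of the console. The sketch below only builds the parameters; it assumes the boto3 DMS client's create_replication_task call, where the console option "Migrate existing data and replicate ongoing changes" corresponds to the migration type full-load-and-cdc. The function name and the ARN placeholders are hypothetical, and the table mapping simply includes every schema and table.

```python
import json


def replication_task_params(task_name, instance_arn, source_arn, target_arn):
    """Build keyword arguments for a full-load-plus-ongoing-changes DMS task."""
    # A single selection rule that includes all tables in all schemas.
    table_mappings = {
        "rules": [
            {
                "rule-type": "selection",
                "rule-id": "1",
                "rule-name": "include-all",
                "object-locator": {"schema-name": "%", "table-name": "%"},
                "rule-action": "include",
            }
        ]
    }
    return {
        "ReplicationTaskIdentifier": task_name,
        "ReplicationInstanceArn": instance_arn,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "MigrationType": "full-load-and-cdc",
        "TableMappings": json.dumps(table_mappings),
    }


# Placeholder ARNs for illustration; use the ones of the resources you created.
params = replication_task_params(
    "demo-task", "<replication-instance-arn>", "<source-endpoint-arn>", "<target-endpoint-arn>"
)
print(json.dumps(params, indent=2))
# With boto3 installed and credentials configured, the call would be:
#   boto3.client("dms").create_replication_task(**params)
```

Narrow the selection rule to specific schemas or tables if you do not want to migrate everything.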
 
  4. Verify the task status.
Once all the fields are completed, click "Create task" and you will be redirected to the "Tasks" tab.
Check your task status there.

Once the task has completed successfully, you can verify the inserts and validation tabs.
If the Validation State is "Validated", the migration has been performed successfully.