Intro to Packer

Packer is an open-source tool developed by HashiCorp to create machine images for multiple platforms like AWS, GCP, Azure, or even VMware. As the name suggests, it packs all your software, packages, and configuration while baking your machine images. Packer is perhaps the only tool on the market right now that focuses solely on creating machine images and gives us the ability to automate the machine-image creation process.

In this blog post, we will learn what Packer does and how it works. Sounds interesting!

 
What Are Packer and Machine Images?
 
“Packer can be used to create identical machine images for multiple platforms from a single source configuration. Packer is lightweight, runs on every major operating system, and is highly performant, creating machine images for multiple platforms in parallel.” 
https://www.packer.io/intro/ 
Packer installs and configures software within your images using configuration management tools such as Ansible, Chef, or Puppet, or plain shell scripts. You can either include your scripts inline in the JSON template itself or source them from a file.
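For instance, a provisioners section might mix an inline command, a script sourced from a file, and an Ansible playbook; this is only an illustrative sketch, and the file paths are placeholders:

```
"provisioners": [
  { "type": "shell", "inline": ["sudo apt-get update -y"] },
  { "type": "shell", "script": "scripts/install-app.sh" },
  { "type": "ansible", "playbook_file": "playbooks/site.yml" }
]
```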
 
“A machine image is a single static unit that contains a pre-configured operating system and installed software which is used to quickly create new running machines. Machine image formats change for each platform. Some examples include AMIs for EC2, VMDK/VMX files for VMware, OVF exports for VirtualBox, etc”
https://www.packer.io/intro/ 
 
Why the Heck Should We Consider Learning Packer?
 
Consider the two scenarios mentioned below:
 
Scenario 1
 
Suppose you want to have immutable infrastructure in place. The key guideline behind immutable infrastructure is that you never modify a running server: if a change is required, you completely replace the server with a new instance that contains the update.
The new server instance is created from an origin image, either built up fresh or restored from a previously defined server state. Version-control and tag your images for easy rollback and distribution. The image contains all the application code, runtime dependencies, and configuration: in essence, the state needed for the software to run as expected. You will want to minimize the time required to bake all your required stuff into an image, which you can achieve by maintaining proper tags on your previous release images and using one of them as the origin (golden) image for the new build. Packer makes the entire process of baking and using images outstandingly easy.
 
Scenario 2
 
If you have autoscaling in place, there must be a requirement to scale up new serviceable VMs as soon as possible, but some steps spoil your expectations of getting serviceable VMs in the least time:
  • OS boot 
  • OS configuration 
  • Configuration management with Ansible or Chef 
  • Setting up your application
With a pre-baked image in place, the time to scale up your VMs decreases drastically.
 
So How Does Packer Work?
 
Packer uses a JSON file as a template: it takes the template as input, rolls up a temporary VM based on the details provided, performs the required configuration, and stops the VM. After stopping the VM, it creates the image and saves it under the name/tag provided in the template.
 
(Workflow: JSON template → Packer engine → machine image, e.g. an EC2 AMI)
Basic Concepts of Packer
 
There are two things you will need to know to get started with Packer:
  • Templates 
  • Commands
 
Templates
 
There are four sections in a Packer template: 
  • Variables (optional): an object of one or more key/value strings that defines the user variables contained in the template. If not specified, no variables are defined. 
  • Builders (required): an array of one or more objects that defines the builders used to create machine images for this template, and configures each of those builders. 
  • Provisioners (optional): an array of one or more objects that defines the provisioners used to install and configure software for the machines created by each of the builders. 
  • Post-processors (optional): an array of one or more objects that defines the post-processing steps to take with the built images. If not specified, no post-processing is done.
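Putting the four sections together, here is a minimal sketch of a Packer template for building an AWS AMI; the region, source AMI ID, and SSH username are placeholder assumptions you would adapt to your environment:

```
{
  "variables": {
    "aws_region": "us-east-1"
  },
  "builders": [
    {
      "type": "amazon-ebs",
      "region": "{{user `aws_region`}}",
      "source_ami": "ami-0123456789abcdef0",
      "instance_type": "t2.micro",
      "ssh_username": "ubuntu",
      "ami_name": "packer-example-{{timestamp}}"
    }
  ],
  "provisioners": [
    {
      "type": "shell",
      "inline": ["sudo apt-get update -y", "sudo apt-get install -y nginx"]
    }
  ],
  "post-processors": [
    { "type": "manifest" }
  ]
}
```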
Sub-Commands
 
Like typical Unix commands, packer takes sub-commands and options. There are three main sub-commands:
  • build: The packer build command takes a template and runs all the builds within it to generate a set of artifacts. The various builds specified within a template are executed in parallel, unless otherwise specified, and the artifacts that are created are output at the end of the build. 
  • validate: The packer validate command is used to validate the syntax and configuration of a template. The command returns a zero exit status on success and a non-zero exit status on failure. Additionally, if a template doesn’t validate, any error messages will be output. 
  • inspect: The packer inspect command takes a template and outputs the various components the template defines. This can help you quickly learn about a template without having to dive into the JSON itself. The command tells you things like what variables a template accepts, the builders it defines, the provisioners it defines and the order they’ll run, and more.
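Assuming the template above is saved as template.json (the file name is illustrative), a typical workflow looks like this:

```
packer validate template.json   # check syntax and configuration
packer inspect template.json    # list variables, builders, and provisioners
packer build template.json      # build the image(s)
```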

Hope this blog helps you understand the basics of Packer. Having covered the basic concepts, we can now “Get Started With Packer”.

Ansible Dynamic Inventory: Is It So Hard?

Until a month ago, I was of the opinion that dynamic inventory is a cool way of managing your AWS infrastructure: you don't have to track your servers, you just apply proper tags, and Ansible dynamic inventory magically manages the inventory for you. Having said that, I was not really comfortable using dynamic inventory, as it was a black box; I tried going through the Python script, but it was very cryptic and difficult to understand. If you are of the same opinion, then this blog is worth reading, as I will try to demystify how things work in dynamic inventory and how you can implement your own using a very simple Python script.

You can refer to the article below if you want to implement dynamic inventory for your AWS infrastructure:

https://aws.amazon.com/blogs/apn/getting-started-with-ansible-and-dynamic-amazon-ec2-inventory-management/

Now, coming to what a dynamic inventory is and how you can create one: first you have to understand what Ansible accepts as an inventory. Ansible expects JSON in the format shown below. It expects a dictionary of groups (each group having a list of hosts under the group's hosts key, and group variables in the group's vars dictionary), and a _meta dictionary that stores host variables for all hosts individually (inside a hostvars dictionary).
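A bare-minimum sketch of that structure is shown below; the group name, host IPs, and variables are purely illustrative:

```
{
  "webservers": {
    "hosts": ["10.0.0.1", "10.0.0.2"],
    "vars": {"http_port": 80}
  },
  "_meta": {
    "hostvars": {
      "10.0.0.1": {"ansible_user": "ubuntu"},
      "10.0.0.2": {"ansible_user": "centos"}
    }
  }
}
```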

As long as you can write a script that generates output in the above JSON format, Ansible won't give you any trouble. So let's start creating our own custom inventory.

I have created a Python script, customdynamicinventory.py, which reads the data from input.csv and generates the JSON described above. For simplicity, I have kept input.csv as simple as possible. You can find the code here:

https://github.com/SUNIL23891YADAV/dynamicinventory.git
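The repository contains the actual script; the sketch below shows the general shape of such a script, assuming input.csv has ip, user, and key columns (the column names are illustrative):

```python
#!/usr/bin/env python
import csv
import json
import sys

def build_inventory(csv_path="input.csv"):
    # One group ("all") plus per-host variables under _meta/hostvars
    inventory = {"all": {"hosts": [], "vars": {}}, "_meta": {"hostvars": {}}}
    with open(csv_path) as f:
        for row in csv.DictReader(f):
            host = row["ip"]
            inventory["all"]["hosts"].append(host)
            inventory["_meta"]["hostvars"][host] = {
                "ansible_user": row["user"],
                "ansible_ssh_private_key_file": row["key"],
            }
    return inventory

if __name__ == "__main__":
    # Ansible calls the script with --list to fetch the whole inventory
    if len(sys.argv) > 1 and sys.argv[1] == "--list":
        print(json.dumps(build_inventory(), indent=2))
    else:
        # --host <hostname> may return {} because everything is in _meta
        print(json.dumps({}))
```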

If you want to test it, just clone the code and replace the IP, user, and key details in the input.csv file as per your environment. To make sure the Python script generates output in the standard JSON format expected by Ansible, you can run ./customdynamicinventory.py --list, and it will print the inventory in the standard JSON format shown above.




If you want to see how a static inventory file would have looked for the above scenario, refer to the example below. It would have served the same purpose as the dynamic inventory above.
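For the hosts from the JSON sketch earlier, a static INI inventory might look like this (values illustrative):

```
[all]
10.0.0.1 ansible_user=ubuntu ansible_ssh_private_key_file=/path/to/key1.pem
10.0.0.2 ansible_user=centos ansible_ssh_private_key_file=/path/to/key2.pem
```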

Now, to make sure your custom inventory is working fine, you can run:

ansible all -i customdynamicinventory.py -m ping

It will try to ping all the hosts mentioned in the CSV; if every host responds to the ping, the inventory is working. That's how easy it is.

Instead of a static CSV file, we could have a database where all the hosts and related details are updated dynamically. The Ansible dynamic inventory script can then use this database as an inventory source, as long as it returns the JSON structure shown in the first example above.

Migrate Your Data Between Various Databases

AWS Database Migration Service

 
Have you ever thought about migrating your production database from one platform to another, and dropped the idea later because it was too risky or because you were not ready to bear the downtime? If yes, then please pay attention, because this is what we are going to perform in this article.
A few days back we were trying to migrate our production MySQL RDS from AWS to GCP Cloud SQL, and we had to migrate the data without downtime, accurately and in real time, and that too without the help of any database administrator.
 
After doing a bit of research and evaluating a few services, we finally started working with AWS DMS (Database Migration Service) and figured out that this is a great service for migrating different kinds of data.
 
You can migrate your data to and from the most widely used commercial and open-source databases and database platforms, such as Oracle, Microsoft SQL Server, PostgreSQL, and MongoDB. The source database remains fully operational during the migration. The service supports homogeneous migrations, such as Oracle to Oracle, as well as heterogeneous migrations between different database platforms.
 

Let’s discuss some important features of AWS DMS:

 
  • Migrates the database securely, quickly, and accurately.
  • No downtime required; works as a schema converter as well.
  • Supports various types of databases, like MySQL, MongoDB, PostgreSQL, etc.
  • Migrates data in real time and synchronizes ongoing changes.
  • Data validation is available to verify the migrated database.
  • Compatible with a wide range of database platforms, like RDS, Google Cloud SQL, on-premises databases, etc.
  • Inexpensive (pricing is based on the compute resources used during the migration process).
This is a typical migration scenario.
Let’s perform the migration step by step:

Note: We’ve performed the migration from AWS RDS to GCP Cloud SQL; you can choose the source and destination databases as per your requirements.

  1. Create a replication instance:
    A replication instance initiates the connection between the source and target databases, transfers the data, and caches any changes that occur on the source database during the initial data load.
    Use the fields below to configure the parameters of your new replication instance, including network and security information and encryption details, and select an instance class as per your requirements.

    After completing all mandatory fields, click Next and you will be redirected to the Replication Instance tab.
    Grab a coffee quickly while the instance is getting ready.

    Hope you are ready with your coffee, because the instance is ready now.
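If you prefer the AWS CLI to the console, a replication instance can be created with something like the following sketch; the identifier, instance class, and storage size are illustrative:

```
aws dms create-replication-instance \
  --replication-instance-identifier my-dms-instance \
  --replication-instance-class dms.t2.medium \
  --allocated-storage 50
```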


  2. Now we need to create two endpoints, “Source” and “Target”.
    2.1 Create the source endpoint: click the “Run test” tab after completing all fields, and make sure your replication instance IP is whitelisted in the source database’s security group.
    2.2 Create the target endpoint: click the “Run test” tab again after completing all fields, and make sure your replication instance IP is whitelisted under the target DB authorization.
    Now we have the replication instance, the source endpoint, and the target endpoint ready; a CLI sketch for the endpoints follows below.
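The endpoints can likewise be created from the CLI; here is a sketch for a MySQL source endpoint, where the identifier, server name, and credentials are placeholders:

```
aws dms create-endpoint \
  --endpoint-identifier mysql-source \
  --endpoint-type source \
  --engine-name mysql \
  --server-name mydb.example.com \
  --port 3306 \
  --username admin \
  --password '<password>'
```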
  3. Finally, we’ll create a “Replication Task” to start the replication (a CLI sketch follows the list). Fill in the fields:
  • Task name: any name
  • Replication instance: the instance we created above
  • Source endpoint: the source database
  • Target endpoint: the target database
  • Migration type: here I chose “Migrate existing data and replicate ongoing changes”, because we needed ongoing changes.
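For reference, the same task sketched via the CLI; the ARNs and the table-mappings file are placeholders you would substitute with your own:

```
aws dms create-replication-task \
  --replication-task-identifier my-migration-task \
  --source-endpoint-arn <source-endpoint-arn> \
  --target-endpoint-arn <target-endpoint-arn> \
  --replication-instance-arn <replication-instance-arn> \
  --migration-type full-load-and-cdc \
  --table-mappings file://table-mappings.json
```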
 
4. Verify the task status.
Once all the fields are completed, click “Create task” and you will be redirected to the Tasks tab. Check your task status.
Once the task has completed, you can verify the inserts and validation tabs. If the validation state is “Validated”, the migration has been performed successfully.

Best Practices of Ansible Role

I have written quite a few Ansible roles in my career, but when I talk about the best practices of writing an Ansible role, half of them were not following those practices.
When I started writing this blog, I had only limited knowledge of Ansible roles and the practices being followed, but reading more about Ansible roles has helped enhance my knowledge.
Without a proper understanding of the architecture of an Ansible role, I was unable to take advantage of all the functionality available when writing one; earlier, I relied on the “command” and “shell” modules for writing roles. Here, in this blog, I’ve discussed the best practices of Ansible roles. Let’s read about these in detail.
 

Continue reading “Best Practices of Ansible Role”

Docker Logging Driver

The docker logs command batch-retrieves logs present at the time of execution and shows the information logged by a running container. The docker service logs command shows the information logged by all containers participating in a service. The information that is logged and the format of the log depend almost entirely on the container’s endpoint command.

These logs are stored at /var/lib/docker/containers/<container-id>/<container-id>-json.log, so it is not easy to consume this file with Filebeat, because the path changes every time a new container comes up with a new container ID.

So, how do we monitor these logs, which end up spread across different files? This is why Docker logging drivers were introduced.

Docker includes multiple logging mechanisms to help you get information from running containers and services. These mechanisms are called logging drivers, and they are configured on the Docker daemon.

To configure the Docker daemon to default to a specific logging driver, set the value of log-driver to the name of the logging driver in the daemon.json file, which is located in /etc/docker/ on Linux hosts or C:\ProgramData\docker\config\ on Windows server hosts.

The default logging driver is json-file. The following example explicitly sets the default logging driver to syslog:

```
{
  "log-driver": "syslog"
}
```

After configuring the log driver in the daemon.json file, you can define the log driver and the destination where you want to send the logs, for example Logstash or Fluentd. You can define it either on the run-time execution command, as “--log-driver=syslog --log-opt syslog-address=udp://logstash:5044”, or, if you are using a docker-compose file, you can define it as:

```
logging:
  driver: fluentd
  options:
    fluentd-address: "192.168.1.1:24224"
    tag: "{{ container_name }}"
```

Once you have configured the log driver, it will send all the Docker logs to the configured destination. Now, if you try to see the logs on the terminal using the docker logs command, you will get the message:

```
Error response from daemon: configured logging driver does not support reading
```

This is because all the logs have been forwarded to the configured destination.

Let me give you an example of how I configured the Fluentd logging driver, parsed the logs into Elasticsearch, and viewed them in Kibana. In this case, I am configuring the logging driver at run time, by installing the logging driver plugin inside Fluentd rather than setting it in daemon.json. So make sure that your containers are created inside the same Docker network where you will be configuring the logging driver.

Step 1: Create a docker network.

```
docker network create docker-net
```

Step 2: Create a container for elasticsearch inside a docker network.

```
docker run -itd --name elasticsearch -p 9200:9200 --network=docker-net elasticsearch:6.4.1
```

Step 3: Create a Fluentd configuration file, fluent.conf, where the forward input and the Elasticsearch output are configured; this file is later copied into the Fluentd Docker image.

fluent.conf

```
<source>
  @type forward
  port 24224
  bind 0.0.0.0
</source>

<match *.*>
  @type copy
  <store>
    @type elasticsearch
    host elasticsearch
    port 9200
    logstash_format true
    logstash_prefix fluentd
    logstash_dateformat %Y%m%d
    include_tag_key true
    type_name access_log
    tag_key app
    flush_interval 1s
    index_name fluentd
    type_name fluentd
  </store>
  <store>
    @type stdout
  </store>
</match>
```

This will also create an index named fluentd; the host is referenced by the name given to the Elasticsearch container above.

Step 4: Build the fluentd image and create a docker container from that.

Dockerfile.fluent

```
FROM fluent/fluentd:latest
COPY fluent.conf /fluentd/etc/
RUN ["gem", "install", "fluent-plugin-elasticsearch", "--no-rdoc", "--no-ri", "--version", "1.9.5"]
```

Here the Elasticsearch plugin is installed and configured inside Fluentd.

Now build the Docker image and create a container:

```
docker build -t fluent -f Dockerfile.fluent .
docker run -itd --name fluentd -p 24224:24224 --network=docker-net fluent
```

Step 5: Now create a container whose logs you want to see in Kibana, configuring the log driver on it at run time. In this example, I am creating an nginx container and configuring the logging driver on it.

```
docker run -itd --name nginx -p 80:80 --network=docker-net --log-driver=fluentd --log-opt fluentd-address=localhost:24224 opstree/nginx:server
```

Step 6: Finally you need to create a docker container for kibana inside the same network.

```
docker run -itd --name kibana -p 5601:5601 --network=docker-net kibana
```

Now you will be able to check the logs of the nginx container in Kibana by creating an index pattern fluentd-*.

Types of logging drivers that can be used:

  • none: No logs are available for the container, and docker logs does not return any output.
  • json-file: The logs are formatted as JSON. The default logging driver for Docker.
  • syslog: Writes logging messages to the syslog facility. The syslog daemon must be running on the host machine.
  • journald: Writes log messages to journald. The journald daemon must be running on the host machine.
  • gelf: Writes log messages to a Graylog Extended Log Format (GELF) endpoint such as Graylog or Logstash.
  • fluentd: Writes log messages to fluentd (forward input). The fluentd daemon must be running on the host machine.
  • awslogs: Writes log messages to Amazon CloudWatch Logs.
  • splunk: Writes log messages to Splunk using the HTTP Event Collector.
  • etwlogs: Writes log messages as Event Tracing for Windows (ETW) events. Only available on Windows platforms.
  • gcplogs: Writes log messages to Google Cloud Platform (GCP) Logging.
  • logentries: Writes log messages to Rapid7 Logentries.