System Monitoring

One of the main tasks of a system administrator is system monitoring, which usually means keeping an eye on things like the RAM and disk space usage of the system. In this blog I’ll talk about my experience as a system admin and how I do it.

System monitoring is usually divided into two parts: continuous system monitoring, and troubleshooting, where the system has crossed a threshold value and you have to figure out the cause and resolve it.

In continuous system monitoring the system is kept under constant watch: is the RAM usage within its defined limit, has the disk space occupancy crossed a predefined threshold, and so on. To achieve this you can use one of the tools available in the market, such as Nagios or OMD. We primarily use these two, but there are other tools available for this purpose as well.
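
To make the idea of a threshold check concrete, here is a minimal Python sketch. It is not how Nagios or OMD implement their checks; the function names and the 80% default limit are assumptions made up for this example.

```python
import shutil


def usage_percent(used, total):
    """Return used/total as a percentage (0 when total is 0)."""
    return 100.0 * used / total if total else 0.0


def crosses_threshold(used, total, limit_percent):
    """True when usage is strictly above the allowed limit."""
    return usage_percent(used, total) > limit_percent


def check_disk(path="/", limit_percent=80.0):
    """Check one mount point; the 80% default is an arbitrary example value."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return crosses_threshold(usage.used, usage.total, limit_percent)
```

A monitoring tool essentially runs checks like check_disk("/") on a schedule and sends a notification when one returns True.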

Continuous system monitoring serves one purpose: it notifies you about any deviation from the expected state of the system. The next step is to troubleshoot the issue and resolve it. As a first step I usually run the top command. top is very powerful: apart from viewing process activity in real time, you can do a lot of other things, for example:

  • If you want to add/remove fields : press f & then you can choose the fields to add/remove
  • If you want to change the ordering of fields : press o & then you can move fields
  • If you want to change the sort field : press F or O
These keys are from the older procps version of top; newer versions handle field order and sorting from the f screen. There are a lot of other options available as well, and pressing h will give you a list of all of them.

You can also read about htop, a more advanced form of top that shows some graphs as well. I haven’t used htop much yet, but I’m planning to 🙂

One thing to note: sometimes you can’t run top at all because of high resource utilization. In that case you can use cat /proc/loadavg to view the load on the system and cat /proc/meminfo to view the current memory state.
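
If you want to read those two files from a script instead of by eye, a minimal Python sketch could look like this. The function names are my own for illustration, and it assumes the usual Linux layout of /proc/loadavg (three load averages first) and /proc/meminfo ("Name: value kB" lines).

```python
def parse_loadavg(text):
    """Return the 1, 5 and 15 minute load averages as floats."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def parse_meminfo(text):
    """Return a dict of field name -> integer value (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if sep and rest.split():
            info[name.strip()] = int(rest.split()[0])
    return info


def snapshot(proc_dir="/proc"):
    """Read both files from a Linux /proc directory."""
    with open(f"{proc_dir}/loadavg") as f:
        load = parse_loadavg(f.read())
    with open(f"{proc_dir}/meminfo") as f:
        mem = parse_meminfo(f.read())
    return load, mem
```

On a Linux box, snapshot() returns the load averages plus a dictionary from which you can pick fields such as MemTotal and MemFree.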

A useful command if top doesn’t work:
ps -eo pmem,pcpu,vsize,pid,cmd | sort -k 1 -nr | head -5
This command will give the top 5 processes by memory usage.
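
The same ranking can be done in Python if you want to feed the result into a script. This is only a sketch: it assumes the column order of the ps command above (pmem, pcpu, vsize, pid, cmd), and the function names are invented for the example.

```python
import subprocess


def top_by_memory(ps_output, count=5):
    """Return (pmem, pid, cmd) tuples for the biggest memory users."""
    rows = []
    for line in ps_output.splitlines():
        parts = line.split(None, 4)  # cmd may itself contain spaces
        if len(parts) < 5:
            continue
        try:
            pmem = float(parts[0])
        except ValueError:
            continue  # skip the header line
        rows.append((pmem, int(parts[3]), parts[4]))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:count]


def run_ps():
    """Run ps with the same columns as the shell pipeline."""
    result = subprocess.run(
        ["ps", "-eo", "pmem,pcpu,vsize,pid,cmd"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Calling top_by_memory(run_ps()) gives the same five processes as the shell pipeline, as Python tuples.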

There are also a couple of other commands that you can use:
free : To view the memory usage of the system
df : To view the disk space usage of each file system
du : To view the disk usage of files and directories

One tip: to give the system more memory to work with you can create swap space, and it is generally recommended to create swap on a dedicated partition. A rule of thumb for sizing it: if your system RAM is below 8 GB, the swap area should be double your RAM; otherwise it should be half of your RAM size.
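
The sizing rule above, written out as a tiny Python function. The function name is made up for the example, and the rule itself is just the rule of thumb from this post, not an official recommendation.

```python
def recommended_swap_gb(ram_gb):
    """Swap size per the rule of thumb: 2x RAM below 8 GB, otherwise RAM / 2."""
    if ram_gb < 8:
        return ram_gb * 2
    return ram_gb / 2


if __name__ == "__main__":
    for ram in (2, 4, 8, 16, 32):
        print(f"{ram} GB RAM -> {recommended_swap_gb(ram)} GB swap")
```

So a 4 GB box would get 8 GB of swap, while a 16 GB box would also get 8 GB.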

Tip : Setting up Git Jenkins integration on a Windows box

If you have ever tried setting up Git as the version control system in a Jenkins installation on a Windows box, you have probably faced an error message saying that the ssh key is not available.

The reason is that when you use Git over the ssh protocol, it tries to use your private key for the Git operations, and it expects to find that key in the .ssh folder in the user’s home directory. To fix this, create a HOME environment variable that points to the home directory containing your .ssh folder, then restart Jenkins. After that it should work fine.
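
A quick way to check the setup before restarting Jenkins is a few lines of Python. This is only a sketch; find_ssh_dir is a name I made up, and it checks nothing more than what is described above: that HOME is set and has a .ssh folder in it.

```python
import os
from pathlib import Path


def find_ssh_dir(env=None):
    """Return the .ssh directory under HOME, or None if HOME is unset or has no .ssh."""
    env = os.environ if env is None else env
    home = env.get("HOME")
    if not home:
        return None
    ssh_dir = Path(home) / ".ssh"
    return ssh_dir if ssh_dir.is_dir() else None


if __name__ == "__main__":
    found = find_ssh_dir()
    print("ssh folder:", found if found else "not found, check the HOME variable")
```

Run it as the same user that runs Jenkins, otherwise the environment it sees may differ.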

Linux Utility to manage login to systems

One of the problems I used to have as a build & release engineer was managing logins to a huge number of boxes from my Linux system. At the scale of 5-10 machines it’s not a big problem, but once you have 100+ boxes it is not humanly possible to remember the IPs of all of them.
The usual approach is to maintain a reference file that maps machine names to IPs and look up the IP of the box in it, but after some time this stops being efficient. Another solution is a DNS server where you store these mappings, so that you can reach the machines by name alone. This is the ideal solution, but what if you don’t have a DNS server? And even then you still have to type the whole ssh command, “ssh user@machine”.

I developed a simple solution for this problem: a utility script, connect.sh. The script takes a machine name as its argument and runs through a series of conditional statements to decide which ssh command to execute for that name.

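#!/bin/bash
# Usage: ./connect.sh machine_name
# Append each machine's IP after "user@" in its branch.
if [ "mc1" == "$1" ]; then
    ssh user@
elif [ "mc2" == "$1" ]; then
    ssh user@
elif [ "mc3" == "$1" ]; then
    ssh user@
# ... one elif branch per remaining machine ...
fi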

This solution worked really well for me, as I no longer have to type the whole ssh command. For machine names I follow the convention environment_application: for example, the machine in the release environment that hosts the catalog application is release_catalog, and similarly dev_catalog, staging_catalog, pt_catalog, so you don’t have to remember machine names either :).
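
Once the list of machines grows, a lookup table is easier to maintain than a long if/elif chain. Here is a minimal Python sketch of that variation; it is not the original connect.sh. The machine names follow the naming convention described above, while the addresses are documentation-only placeholders (192.0.2.x) and the function name is hypothetical.

```python
# Hypothetical inventory: the addresses are placeholders, not real hosts.
MACHINES = {
    "release_catalog": "192.0.2.10",
    "dev_catalog": "192.0.2.11",
    "staging_catalog": "192.0.2.12",
}


def ssh_command(machine, user="user", machines=None):
    """Build the ssh argument list for a machine name."""
    machines = MACHINES if machines is None else machines
    if machine not in machines:
        raise KeyError(f"unknown machine: {machine}")
    return ["ssh", f"{user}@{machines[machine]}"]


if __name__ == "__main__":
    # Print the command for every known machine instead of connecting.
    for name in sorted(MACHINES):
        print(name, "->", " ".join(ssh_command(name)))
```

To actually connect you would pass the returned list to subprocess.call. Adding a machine is then a one-line change to the dictionary.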

Automated DB Updater: First Release

Initial version of the Automated DB Updater (ADU)

With this blog I’m releasing the initial version of a Python utility that provides automated DB updates across various environments for different components.

The code for this utility is hosted on GitHub:
https://github.com/sandy724/ADU

You can clone a read-only copy of the codebase from the URL below:
https://github.com/sandy724/ADU.git

To understand the basic idea behind this utility, go through this blog:
http://sandy4blogs.blogspot.in/2013/07/automated-db-updater.html

How to use this utility
Check out the code into a directory and add the path of that directory to the PYTHONPATH environment variable.
Create a database with a script metadata table, using the DDL given below:

CREATE TABLE `script_metadata` (
  `name` varchar(100) NOT NULL,
  `version` int(11) NOT NULL,
  `executed` tinyint(1) NOT NULL DEFAULT '0',
  `env` varchar(30) NOT NULL,
  `releas` varchar(30) NOT NULL,
  `component` varchar(30) NOT NULL
);
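
To show how a table like this can be used to track executions, here is a small Python sketch. It uses SQLite as a stand-in for MySQL so it runs anywhere, and the two helper functions are my guess at the kind of queries involved; ADU's own queries may well differ.

```python
import sqlite3

# SQLite stand-in for the MySQL table above (same columns, simplified types).
DDL = """
CREATE TABLE script_metadata (
  name TEXT NOT NULL,
  version INTEGER NOT NULL,
  executed INTEGER NOT NULL DEFAULT 0,
  env TEXT NOT NULL,
  releas TEXT NOT NULL,
  component TEXT NOT NULL
)
"""


def mark_executed(conn, name, version, env, release, component):
    """Record that one version of a script ran on an environment."""
    conn.execute(
        "INSERT INTO script_metadata "
        "(name, version, executed, env, releas, component) "
        "VALUES (?, ?, 1, ?, ?, ?)",
        (name, version, env, release, component),
    )


def last_executed_version(conn, name, env, component):
    """Return the highest executed version, or None if it never ran."""
    row = conn.execute(
        "SELECT MAX(version) FROM script_metadata "
        "WHERE name = ? AND env = ? AND component = ? AND executed = 1",
        (name, env, component),
    ).fetchone()
    return row[0]
```

With an in-memory connection (sqlite3.connect(":memory:")) and the DDL executed once, you can call mark_executed after each script run and last_executed_version before the next one.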
Create a database.properties file containing the connection properties of each environment’s database:

[common_db]
dbHost=localhost
dbPort=3306
dbUser=root
dbPwd=root
db=test
 
 
[env1]
dbHost=localhost
dbPort=3306
dbUser=root
dbPwd=root
db=test

Here common_db is the connection to the database that will hold the metadata of the scripts, which is used to monitor them.
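
A file in this format can be read with Python's standard configparser module. The sketch below only illustrates the file layout; whether ADU itself reads the file this way is an assumption, and the function names are invented for the example.

```python
import configparser

# Same layout as the database.properties example above.
SAMPLE = """\
[common_db]
dbHost=localhost
dbPort=3306
dbUser=root
dbPwd=root
db=test

[env1]
dbHost=localhost
dbPort=3306
dbUser=root
dbPwd=root
db=test
"""


def read_properties(text):
    """Parse the properties text into a ConfigParser object."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser


def load_connection(parser, section):
    """Return the connection settings of one section as a plain dict."""
    s = parser[section]
    return {
        "host": s["dbHost"],
        "port": s.getint("dbPort"),
        "user": s["dbUser"],
        "password": s["dbPwd"],
        "database": s["db"],
    }
```

For a real file you would call parser.read("database.properties") instead of read_string, then load_connection(parser, "env1") for the environment you want to update.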

Now execute the Python utility
Copy the client (updateDB.py) to a directory of your choice, and make sure the property configuration file is in the same directory
python updateDB.py -f -r --env

Automated DB Updater

In continuation of my blog series, I’m finally introducing an automated DB updater tool. You can read about the idea in my previous blogs.

The short form of my tool is ADU (Automated DB Updater). Now some details about this tool:

Each application will have a database_script folder at the root level. This folder will contain one folder per release, i.e. release1, release2, release3…

A database release folder will contain

  • Meta file : sql_sequence.txt, this file lists the sequence in which the sql files will be executed; only files mentioned in it will be considered
  • SQL files : each sql file must follow a naming convention like this: __.sql/__.sql

Process of automatic execution of scripts on an environment

  • Input
    • release_name : to figure out the folder from where scripts will be executed
    • environment : Environment on which scripts will be executed
  • Execution
    • sql_sequence.txt will be read line by line, with one sql file name on each line
    • Each sql file will be checked to see whether it has already been executed
    • If the sql file has already been executed then two conditions are verified
      • A new version of the sql file should be available
      • The undo version of the last executed sql file should be present
    • If both hold, the undo file is executed first, then the latest version of the sql file; the execution is recorded so that the file will not be picked up again
  • Validations & Boundary Conditions
    • All the files mentioned in sql_sequence.txt should exist.
    • Undo scripts should be present for all versions of a sql file except the latest one.
    • An undo script will only be executed if the next version of the script is available.
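
The process above can be sketched in a few lines of Python. This is an illustration of the rules as listed, not ADU's actual code. It sidesteps the file naming convention by working on plain data: versions maps each script name to the versions present in the release folder, undo_versions maps it to the versions that have an undo script, and executed maps it to the last executed version. All of these names are hypothetical.

```python
def read_sequence(text):
    """Return the script names listed in sql_sequence.txt, one per line."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def validate(sequence, versions, undo_versions):
    """Return a list of problems; an empty list means the folder is consistent."""
    problems = []
    for name in sequence:
        found = versions.get(name, [])
        if not found:
            problems.append(f"{name}: listed in sql_sequence.txt but missing")
            continue
        latest = max(found)
        for version in found:
            # Every version except the latest needs an undo script.
            if version != latest and version not in undo_versions.get(name, set()):
                problems.append(f"{name}: no undo script for version {version}")
    return problems


def plan(sequence, versions, undo_versions, executed):
    """Return ordered (action, name, version) steps for one environment."""
    steps = []
    for name in sequence:
        latest = max(versions[name])
        done = executed.get(name)
        if done is None:
            steps.append(("run", name, latest))
        elif latest > done and done in undo_versions.get(name, set()):
            steps.append(("undo", name, done))
            steps.append(("run", name, latest))
        # Otherwise: already up to date, or the undo script is missing, so skip.
    return steps
```

For example, if create_user has versions 1 and 2, an undo script for version 1, and version 1 was already executed, plan returns an undo step for version 1 followed by a run step for version 2. Keeping validation separate from planning means a broken release folder can be rejected before anything is executed.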

Very soon I’ll share the GitHub URL of this project, keep waiting 🙂