What is a Bare Git Repository?

While using git you must have come across some questions like:

  • If git is a Distributed Version Control System then where is the remote repo of git stored in our system?
  • Why can’t we directly push into or clone from our local repository?
  • Can I host my own private remote repository just like GitLab or GitHub for my own small project?
  • What if there was no GitLab or GitHub or Bitbucket or… I mean… you got the point, right?

Well… the answer to all these questions is the same: a Bare Git Repository.

LET’S GET STARTED

First things first: what is a Bare Git Repository, and how is it different from a normal Git repository?


Why GitOps is so exciting?

Initially, we had the DevOps framework, in which the Development and Operations teams collaborated to create an agile development ecosystem. Then a new wave came under the name “DevSecOps”, in which security was integrated into the existing DevOps process. Nowadays a new term, “GitOps”, is gaining popularity because of its “Single Source of Truth” nature. It has become popular enough to be a trending topic at KubeCon.


Git Inside Out


Git is basically a content-addressable file system: you retrieve your content through addresses. This simply means that you can insert any kind of data into Git, and Git will hand you back a unique key that you can use later to retrieve that content. We will be learning #gitinsideout through this blog.

The Git object model has three main types: blobs (for files), trees (for folders) and commits.

Objects are immutable (they are added but not changed) and every object is identified by its unique SHA-1 hash.
A blob is just the contents of a file. By default, every new version of a file gets a new blob, which is a snapshot of the file (not a delta like many other versioning systems).
A tree is a list of references to blobs and trees.
A commit is a reference to a tree, a reference to parent commit(s) and some decoration (message, author).

Then there are branches and tags, which are typically just references to commits.

Git stores this data in the .git/objects directory. Initialising a git repository automatically creates .git/objects/pack and .git/objects/info, with no regular files in them. Once you add and commit some files, the new objects show up under the .git/objects/ folder.
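The key-and-content idea can be tried directly with Git's low-level commands. The following is a minimal sketch, assuming git is installed; it works in a throwaway directory created with mktemp, and the variable names are only illustrative.

```shell
#!/bin/sh
set -eu

# Work in a scratch repository so nothing else is touched.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Store the text "hello" as a blob; Git prints the key (the object ID).
blob_id=$(echo 'hello' | git hash-object -w --stdin)
echo "key: $blob_id"

# The object lives under .git/objects/<first 2 characters>/<remaining characters>.
dir=$(echo "$blob_id" | cut -c1-2)
file=$(echo "$blob_id" | cut -c3-)
ls -l ".git/objects/$dir/$file"

# Hand the key back to Git to retrieve the content.
content=$(git cat-file -p "$blob_id")
echo "content: $content"
```

Storing the same content again returns the same key, which is why identical files are stored only once.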

OBJECT Blob

A blob stores the content of a file. We can check that content by passing the blob’s SHA-1 to the command

git cat-file -p <blob-sha>

or git show <blob-sha>

OBJECT Tree

A tree is a simple object that has a bunch of pointers to blobs and other trees – it generally represents the contents of a directory or sub-directory.

We can use git ls-tree to list the contents of a given tree object.
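As a small illustration, the sketch below builds a scratch repository with one file and one sub-directory and lists its trees. The file names (README.md, docs/notes.txt) and the user details are assumptions made up for the example.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

# One file at the top level and one inside a sub-directory.
mkdir docs
echo 'hello' > README.md
echo 'notes' > docs/notes.txt
git add .
git commit -q -m "adding README.md and docs"

# The root tree of the latest commit: a blob for README.md and a tree for docs.
root_listing=$(git ls-tree HEAD)
echo "$root_listing"

# The sub-tree, addressed by path; it in turn points at a blob.
docs_listing=$(git ls-tree HEAD docs/)
echo "$docs_listing"
```

Each output line shows the file mode, the object type, the object ID and the name, so a directory appears as a tree entry and a file as a blob entry.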

OBJECT Commit

The “commit” object links a physical state of a tree with a description of how we got there and why.

A commit is defined by its tree, parent(s), author, committer and comment (the commit message).
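These fields are visible when a commit object is printed with git cat-file. A minimal sketch, using a scratch repository with two commits so that the second one has a parent; the user details are placeholders.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'hello' > README.md
git add README.md
git commit -q -m "adding README.md"

echo 'hello again' > README.md
git commit -q -am "updating README.md"

# Print the raw commit object: tree, parent, author, committer,
# then a blank line followed by the commit message.
commit_body=$(git cat-file -p HEAD)
echo "$commit_body"
```

The first commit of a repository has no parent line at all, which is how Git marks the start of history.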

All three objects (blob, tree, commit) are explained in detail with the help of a pictorial diagram.

Often we make changes to our code and push them to SCM. Once, after making multiple changes, I thought it would be great if I could see the details of those changes in the local repository itself, instead of going to a remote repository server. That pushed me to explore Git more deeply.

So I created a local “remote” repository with the help of a bare git repository, made some changes, and tracked those changes (type, content, size etc.).
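A local setup of this kind might look like the sketch below. It assumes git is installed; the directory names remote.git and kunal and the user details are chosen only for illustration.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# A bare repository has no working tree; by convention its name ends in .git.
git init -q --bare remote.git

# Clone it like any other remote (Git warns that the clone is empty).
git clone -q remote.git kunal
cd kunal
git config user.name "Example User"
git config user.email "user@example.com"

# Commit locally and push the current branch to the bare repository.
echo 'hello' > README.md
git add README.md
git commit -q -m "adding README.md"
git push -q origin HEAD

# Inspect the bare repository from the outside.
is_bare=$(git -C "$work/remote.git" rev-parse --is-bare-repository)
commit_count=$(git -C "$work/remote.git" rev-list --all --count)
echo "bare: $is_bare, commits received: $commit_count"
```

Because the bare repository only holds the object database and refs, pushing into it never conflicts with a checked-out working tree.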

The example below will help you understand the concept behind it.

Suppose we have cloned a repository named kunal:

From the folder where we cloned the repository, go into the .git folder of kunal:

cd kunal/.git/

I added the content “hello” to README.md and then made several more changes to the same repository:

adding README.md

updating Readme.md

adding 2 files modifying one

pull request

commit(adding directory).

Go to the refs folder inside .git and take the SHA value for the master head:
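In a fresh repository, this lookup can be sketched as follows. The example assumes Git's default file-based ref storage, where a branch is a small text file under .git/refs/heads; in a repository whose refs have been packed, or that uses a different ref backend, that file may not exist and git rev-parse is the reliable way to ask.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
# Name the initial branch master regardless of the local default.
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo 'hello' > README.md
git add README.md
git commit -q -m "adding README.md"

# The branch is just a file that contains the SHA of its latest commit.
ref_file_id=$(cat .git/refs/heads/master)
echo "from the ref file: $ref_file_id"

# The same value, resolved by Git itself.
rev_parse_id=$(git rev-parse master)
echo "from rev-parse:    $rev_parse_id"
```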

We can explore this commit object further with cat-file, which shows the type and content of the commit object and of the tree it points to:

Now we can see a tree object inside the tree object. Further, we can see the details for the tree object which in turn contains a blob object as below:

Below is the pictorial representation for the same:

More elaborated representation for the same :

Below are the commands for checking the content, type and size of objects (blob, tree and commit):

kunal@work:/home/git/test/kunal# cat README.md
hello

We can find the details of objects (size, type, content) with the help of git cat-file.

git-cat-file:- Provide content, type or size information for repository objects
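The three kinds of information map to three flags: -t for type, -s for size and -p for pretty-printed content. A minimal sketch on a scratch repository whose README.md contains hello, as in the example above; the user details are placeholders.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo 'hello' > README.md
git add README.md
git commit -q -m "adding README.md"

# Resolve the blob that README.md points to in the latest commit.
blob_id=$(git rev-parse HEAD:README.md)

obj_type=$(git cat-file -t "$blob_id")      # blob
obj_size=$(git cat-file -s "$blob_id")      # 6 (five letters plus a newline)
obj_content=$(git cat-file -p "$blob_id")   # hello
echo "type=$obj_type size=$obj_size content=$obj_content"

# The same flags work for any object, for example the commit itself.
head_type=$(git cat-file -t HEAD)           # commit
echo "HEAD is a $head_type"
```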

You can verify the content of a commit object and its type with git cat-file as below:

kunal@work:/home/git/test/kunal/.git # cat logs/refs/heads/master

Checking the content of the blob objects (README.md, kunal and sandy):

As we can see, the first entry is the one adding the README, so its parent is null (00000…000) and its unique SHA-1 is 912a4e85afac3b737797b5a09387a68afad816d6.

Below are the details that we can fetch from above SHA-1 with the help of git cat-file :

Consider one example of merge:

I created a test branch, made changes on it and merged it into master.

Here you can notice that the commit has two parents, because it is a merge commit.
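The two parents can be reproduced in a scratch repository. This is a sketch under assumed names: the branch test and the files kunal and sandy echo the example above, while the commit messages and user details are invented.

```shell
#!/bin/sh
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example User"
git config user.email "user@example.com"

echo 'hello' > README.md
git add README.md
git commit -q -m "adding README.md"

# Work on a test branch...
git checkout -q -b test
echo 'kunal' > kunal
git add kunal
git commit -q -m "adding kunal on test"

# ...while master also moves on, so the histories diverge.
git checkout -q master
echo 'sandy' > sandy
git add sandy
git commit -q -m "adding sandy on master"

# Merging diverged branches creates a merge commit.
git merge -q -m "merge test into master" test

# A merge commit lists one parent line per merged history.
git cat-file -p HEAD
parent_count=$(git cat-file -p HEAD | grep -c '^parent ')
echo "parents: $parent_count"
```

If master had not moved after the branch was created, Git would fast-forward by default and no merge commit with two parents would appear.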

You can further see the content, size, type of repository #gitobjects like:

Summary

This is a pretty lengthy article, but I’ve tried to make it as transparent and clear as possible. Once you work through it and understand the concepts shown here, you will be able to work with Git more effectively.

This explanation covers the tree data structure and the internal storage of objects. You can check the content (differences/commits) of the files through the local .git repository, which stores each object under a unique SHA hash. That should make the internal working of git clear.
Hopefully, this blog will help you understand git inside out and help you troubleshoot things related to git.

Git-Submodule

Rocket science has always fascinated me, but one thing that totally blows my mind is the concept of modules, a.k.a. modular rockets. The literal definition states: “A modular rocket is a type of multistage rocket which features components that can be interchanged for specific mission requirements.” In simple terms, you can say that the Super Rocket depends upon those submodules to get things done.
The same is true in the software world, where super projects have multiple dependencies on other projects. And when we talk about managing projects, Git can’t be ignored. Moreover, Git has a concept of submodules, which feels slightly inspired by the amazing rocket science of modules.

Hour of Need

As DevOps specialists, we provision infrastructure for our clients, and much of it is common across most of them. We decided to automate it, as DevOps people habitually do. Hence, Opstree Solutions initiated an internal project named OSM, in which we create Ansible roles for different open-source software, with contributions from every member of our organization, so that those roles can be used when provisioning a client’s infrastructure.
This makes the client projects dependent on our OSM, which creates the problem of managing all these dependencies as they get updated over time. Doing that by hand means a lot of copying and pasting, deleting repositories and cloning them again to get the updated version, which is a hair-pulling task and obviously not best practice.
Here comes git-submodule as a modular rocket to take our Super Rocket to its destination.

Let’s Liftoff with Git-Submodules

A submodule is a repository embedded inside another repository. The submodule has its own history; the repository it is embedded in is called a superproject.

In simple terms, a submodule is a git repository inside a superproject’s git repository. It has its own .git folder, which contains everything version control needs for that project: its commits, its remote repository address and so on. It is like an attached repository inside your main repository, whose code can be reused as a “module”.
Let’s look at a practical use case of submodules.
We have a client, let’s call it “Armstrong”, who needs a few of our OSM Ansible roles to provision their infrastructure. Let’s have a look at their git repository below.

$    cd provisioner
$    ls -a
     .  ..  ansible  .git  inventory  jenkins  playbooks  README.md  roles
$    cd roles
$    ls -a
     apache  java   nginx  redis  tomcat
We can see that Armstrong’s provisioner repository (a git repository) depends on five roles, which are available in OSM’s repositories, to provision their infrastructure. So we’ll add osm_java and the others as submodules.

$    cd java
$    git submodule add -b armstrong [email protected]:oosm/osm_java.git osm_java
     Cloning into './provisioner/roles/java/osm_java'...
     remote: Enumerating objects: 23, done.
     remote: Counting objects: 100% (23/23), done.
     remote: Compressing objects: 100% (17/17), done.
     remote: Total 23 (delta 3), reused 0 (delta 0)
     Receiving objects: 100% (23/23), done.
     Resolving deltas: 100% (3/3), done.

With the above command, we add a submodule named osm_java whose URL is [email protected]:oosm/osm_java.git and whose branch is armstrong. The branch is named armstrong because, to keep the configuration for each client’s requirements isolated, we created individual branches in OSM’s repositories named after each client.
Now if we take a look at our superproject provisioner, we can see a file named .gitmodules, which holds the information about the submodules.

$    cd provisioner
$    ls -a
     .  ..  ansible  .git  .gitmodules  inventory  jenkins  playbooks  README.md  roles
$    cat .gitmodules
     [submodule "roles/java/osm_java"]
     path = roles/java/osm_java
     url = [email protected]:oosm/osm_java.git
     branch = armstrong

Here you can clearly see that a submodule osm_java has been attached to the superproject provisioner.
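Since .gitmodules uses the same syntax as Git's own config files, it can also be queried with git config instead of being read by eye. The sketch below writes a copy of the file into a scratch directory; the URL git@example.com:oosm/osm_java.git is a placeholder, not the real address of the OSM repository.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# A stand-in .gitmodules file with one submodule entry (placeholder URL).
cat > .gitmodules <<'EOF'
[submodule "roles/java/osm_java"]
    path = roles/java/osm_java
    url = git@example.com:oosm/osm_java.git
    branch = armstrong
EOF

# Read single values; the key is submodule.<name>.<setting>.
sub_path=$(git config -f .gitmodules --get submodule.roles/java/osm_java.path)
sub_branch=$(git config -f .gitmodules --get submodule.roles/java/osm_java.branch)
echo "path=$sub_path branch=$sub_branch"

# List the path of every submodule recorded in the file.
git config -f .gitmodules --get-regexp '^submodule\..*\.path$'
```

This is handy in scripts that need to loop over all submodules of a superproject.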

What if there was no submodule?

In that case, we would need to clone the repository from OSM, paste it into the provisioner, and then add and commit it there. Phew… that would also have worked.
But what if an update made in osm_java has to be used in the provisioner? We could not easily sync with OSM. We would need to delete osm_java, then clone, copy and paste it into the provisioner again, which sounds clumsy and is not the best way to automate the process.
With osm_java as a submodule, we can easily update this dependency without messing things up.

$    git submodule status
     -d3bf24ff3335d8095e1f6a82b0a0a78a5baa5fda roles/java/osm_java
$    git submodule update --remote
     remote: Enumerating objects: 3, done.
     remote: Counting objects: 100% (3/3), done.
     remote: Total 2 (delta 0), reused 2 (delta 0), pack-reused 0
     Unpacking objects: 100% (2/2), done.
     From [email protected]:oosm/osm_java.git
        0564d78..04ca88b  armstrong     -> origin/armstrong
     Submodule path 'roles/java/osm_java': checked out '04ca88b1561237854f3eb361260c07824c453086'

The update command above successfully updated the submodule: it pulled the latest changes from the armstrong branch of OSM’s origin.
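The whole cycle can be rehearsed locally without any server. The sketch below uses two scratch repositories as stand-ins for osm_java and provisioner; the file name main.yml, the commit messages and the user details are invented. Recent Git versions block local file-based clones inside submodule commands by default, so the sketch allows them explicitly with -c protocol.file.allow=always for its own commands only.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# 1. A stand-in for the OSM role repository, with an armstrong branch.
git init -q osm_java
cd osm_java
git config user.name "Example User"
git config user.email "user@example.com"
echo 'version 1' > main.yml
git add main.yml
git commit -q -m "first version of the role"
git checkout -q -b armstrong

# 2. The superproject, tracking that branch as a submodule.
cd "$work"
git init -q provisioner
cd provisioner
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "initial commit"
git -c protocol.file.allow=always submodule --quiet add -b armstrong \
    "$work/osm_java" roles/java/osm_java
git commit -q -m "add osm_java as a submodule"

# 3. The role gets an update upstream.
cd "$work/osm_java"
echo 'version 2' > main.yml
git commit -q -am "update the role"
upstream_id=$(git rev-parse armstrong)

# 4. Pull that update into the superproject's submodule.
cd "$work/provisioner"
git -c protocol.file.allow=always submodule --quiet update --remote
submodule_id=$(git -C roles/java/osm_java rev-parse HEAD)
echo "upstream:  $upstream_id"
echo "submodule: $submodule_id"
```

After the update, the superproject still records the old commit until the new pointer is staged and committed with git add roles/java/osm_java followed by git commit.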

What have we learned? 

In this blog post, we learned how to use git-submodules to keep our dependent repositories up to date in our super project, without getting our hands dirty with error-prone copy and paste.
Kick out those practices which might ruin the fun, sit back and enjoy the automation.

Referred links:
Image: google.com
Documentation: https://git-scm.com/docs/gitsubmodules


Gitolite

 

Requirement

We need private git repositories for internal use in our project, so we use Gitolite. Our client has a lot of consultants, partners and short-term employees working with their code, so they needed a good way of controlling access to the repos, preferably without giving each of them a Unix user on the server where the repos are hosted.

What is Gitolite?

Gitolite is basically an access layer on top of Git. Users are granted access to repos via a simple config file, and we as admins only need each user’s public SSH key and a username. Gitolite uses these to grant or deny access to our Git repositories, and it is itself managed through a git repository named gitolite-admin.

Installation

We need a public key and a gitolite user with which we will set up Gitolite.

In this case I have used my base machine(Ubuntu) public key so that only my machine can manage Gitolite.

Now we will copy this public key to a virtual machine

$ scp ~/.ssh/gitolite.pub [email protected]:/home/git

 

where vagrant is the user of my virtual machine & its IP is 192.168.0.20

Now we will install Gitolite and create a gitolite user on the remote machine that will host it.

root@git:~# apt-get install gitolite3
 
root@git:~# adduser gitolite
 
Now we need to remove the password of the gitolite user with the command below
 
root@git:~# passwd -d gitolite
 
 
Let’s move & change the ownership of this public key.
root@git:~# mv gitolite.pub /home/gitolite/
root@git:~# chown gitolite:gitolite /home/gitolite/gitolite.pub
 
Become the gitolite user
 
root@git:~# su - gitolite
 
Now set up Gitolite with the public key
 
gitolite@git:~# gitolite setup -pk gitolite.pub
 
Now, to manage the repositories, users and access rights, we will clone the gitolite-admin git repository to our base machine.
 
$ git clone [email protected]:gitolite-admin
$ cd gitolite-admin
$ ls -l
total 8
drwxr-xr-x 2 nitin nitin 4096 Jan 10 17:52 conf/
drwxr-xr-x 2 nitin nitin 4096 Jan  9 13:43 keydir/
 
where “keydir” is the directory in which we store our users’ public keys; each key file must be named after the corresponding username.
 
In the conf directory there is a “gitolite.conf” file, which controls which repositories are available on the system and who has which rights to those repositories.
We just need to add the new repository name and the users who will access it; once the file is pushed, Gitolite creates the repo and grants the permissions accordingly.
Let us explore my gitolite.conf file, in which I have added a new repository called “opstreeblog”
$ cat conf/gitolite.conf

# Group name & members

@admin = nitin
@staff    = jatin james
 
# Gitolite admin repository

repo gitolite-admin
RW+   = gitolite @admin
 
# Read-Write permission to all the users on testing repo

repo testing
RW+    = @all
 
# Read-Write permission to user sandy & the admin group. And Read-Only access to staff group

repo opstreeblog
   RW+   = sandy @admin
   R         = @staff

where ‘@’ denotes a user group, i.e. @staff is a group and jatin and james are its members. These names must match the key file names stored in the keydir directory.
For example, the user “jatin” must have a public key named “jatin.pub”.
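Because a missing key file is easy to overlook, the naming rule can be checked with a few lines of shell before pushing. This is a hypothetical helper, not part of Gitolite: it builds a mock gitolite-admin layout in a scratch directory, reads the user names with a deliberately simple awk parse of gitolite.conf, and reports the ones that have no keydir/<name>.pub. The key for james is left out on purpose.

```shell
#!/bin/sh
set -eu

# Mock gitolite-admin checkout in a scratch directory.
admin=$(mktemp -d)
cd "$admin"
mkdir conf keydir

cat > conf/gitolite.conf <<'EOF'
@admin = nitin
@staff = jatin james

repo gitolite-admin
    RW+ = gitolite @admin

repo opstreeblog
    RW+ = sandy @admin
    R   = @staff
EOF

# Empty placeholder key files; james.pub is intentionally missing.
for u in gitolite nitin jatin sandy; do
  : > "keydir/$u.pub"
done

# Collect every name to the right of "=" that is not a group (@...).
users=$(awk -F= '/=/ && $0 !~ /^[ \t]*#/ {
  n = split($2, names, " ")
  for (i = 1; i <= n; i++) if (names[i] !~ /^@/) print names[i]
}' conf/gitolite.conf | sort -u)

# Report the users that have no matching key file.
missing=""
for u in $users; do
  [ -f "keydir/$u.pub" ] || missing="$missing $u"
done
echo "users without a key:${missing:- none}"
```

The parse ignores more advanced gitolite.conf features, so treat it as a quick sanity check rather than a validator.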

Let’s have a quick test of our setup

$ git commit conf/gitolite.conf -m "added opstreeblog repo"
[master 357bbc8] added "opstreeblog" repo
1 files changed, 9 insertions(+), 1 deletions(-)

nitin@Latitude-3460:~/gitolite-admin$ git push origin master
Counting objects: 7, done.
Delta compression using up to 4 threads.
Compressing objects: 100% (3/3), done.
Writing objects: 100% (4/4), 428 bytes, done.
Total 4 (delta 0), reused 0 (delta 0)
remote: Initialized empty Git repository in /home/gitolite/repositories/opstreeblog.git/
To gitbox:gitolite-admin
   d595439..357bbc8  master -> master
 
I hope that gives you a good overview of how to install and manage Gitolite.