Git Features for Collaboration and Automation

Published: at 03:46 PM

Introduction

In the first two articles of this series, we covered both introductory and advanced Git features. Now, let’s dive into Git’s collaboration and automation tools. These features are particularly useful in team environments or when managing large repositories. We’ll explore Git Hooks, Submodules, Reflog, and other advanced workflows.


TL;DR

You can find a shorter cheat sheet version of this article here.

Git Hooks

Git hooks let you automate tasks when certain events occur in a repository, such as before or after a commit, push, or merge. Hooks are executable scripts (most often shell scripts, though any executable works) located in the .git/hooks directory of your project. You can use hooks to enforce coding standards, run tests, or trigger external services.

Example: Pre-commit Hook

Imagine you want to check for files larger than 1MB and prevent them from going into the commit. This is useful in projects where large files can bloat the repository:

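#!/bin/sh
# .git/hooks/pre-commit
# Prevent committing files larger than 1MB
max_size=1000000  # 1MB in bytes

echo "Checking for large files..."

# Read staged paths line by line so names containing spaces stay intact.
# The loop runs in a pipeline subshell, so its exit status is passed on below.
git diff --cached --name-only --diff-filter=ACM | while IFS= read -r file; do
  if [ -f "$file" ]; then
    file_size=$(wc -c <"$file")
    if [ "$file_size" -gt "$max_size" ]; then
      echo "$file is too large ($file_size bytes). Maximum allowed size is 1MB."
      exit 1
    fi
  fi
done || exit 1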

Make sure to make the hook executable:

chmod +x .git/hooks/pre-commit

This script will automatically check file sizes every time you commit.
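
Hooks placed in .git/hooks are not versioned and are not copied when someone clones the repository, so each collaborator would have to install them by hand. One common way around this is to keep the hooks in a tracked directory and point the core.hooksPath setting at it. The following is a minimal sketch in a throwaway repository; the directory name .githooks and the hook's message are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: share hooks with a team through a tracked directory.
# ".githooks" is a hypothetical directory name; the repository is a throwaway.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Keep the hook in a directory that is committed with the project
mkdir .githooks
cat > .githooks/pre-commit <<'EOF'
#!/bin/sh
echo "pre-commit hook from .githooks ran"
EOF
chmod +x .githooks/pre-commit

# Tell Git to look for hooks in the tracked directory instead of .git/hooks
git config core.hooksPath .githooks

echo "hello" > README.md
git add README.md .githooks
# Capture stderr as well, since Git redirects hook output there
hook_output=$(git commit -q -m "Add README and shared hooks" 2>&1)
echo "$hook_output"
```

Each collaborator still has to run the git config core.hooksPath command once after cloning, because local configuration is not part of the repository either.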

Example: Commit-msg Hook

You can also use a commit-msg hook to enforce a commit message format, for example a minimum length:

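#!/bin/sh
# .git/hooks/commit-msg
# Enforce a commit message length of 10 characters or more

min_length=10
# $1 is the path to the file holding the proposed message.
# Drop comment lines, which Git removes by default before recording the commit.
commit_message=$(grep -v '^#' "$1")

if [ "${#commit_message}" -lt "$min_length" ]; then
  echo "Error: Commit message is too short (minimum $min_length characters)."
  exit 1
fi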

Submodules

Git submodules allow you to include external repositories inside your project. This is useful for managing dependencies or shared libraries that should be developed separately but still included in your main project.

Example: Adding a Submodule

You can add a submodule to your project like this:

git submodule add https://github.com/example/library.git path/to/submodule

This command clones the external repository into path/to/submodule and records it in a .gitmodules file. The parent repository tracks the exact submodule commit that is checked out, but changes made inside the submodule must be committed separately in the submodule’s own repository.

To update a submodule to the latest commit on its remote branch, run:

git submodule update --remote
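
A fresh clone of a project does not populate its submodule directories by default. The sketch below shows the two usual ways to get them: running git submodule update --init --recursive after a plain clone, or cloning with --recurse-submodules. It builds two throwaway local repositories (the names lib, app, and vendor/lib are hypothetical). The protocol.file.allow=always setting is only there because this demo uses local paths as submodule URLs, which recent Git versions refuse by default; it should not be needed for an https URL like the one above.

```shell
#!/bin/sh
# Sketch: populate submodules after cloning, using throwaway local repositories.
set -e
demo=$(mktemp -d)
cd "$demo"
export GIT_AUTHOR_NAME="Demo User" GIT_AUTHOR_EMAIL="demo@example.com"
export GIT_COMMITTER_NAME="Demo User" GIT_COMMITTER_EMAIL="demo@example.com"

# A tiny "library" repository to use as the submodule
git init -q lib
echo "library code" > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m "Add library file"

# The main project, which embeds the library under vendor/lib
git init -q app
git -C app commit -q --allow-empty -m "Initial commit"
git -C app -c protocol.file.allow=always submodule --quiet add "$demo/lib" vendor/lib
git -C app commit -q -m "Add lib as a submodule"

# Option 1: a plain clone leaves vendor/lib empty until it is initialised
git clone -q app plain-clone
git -C plain-clone -c protocol.file.allow=always submodule --quiet update --init --recursive

# Option 2: clone and populate submodules in one step
git -c protocol.file.allow=always clone -q --recurse-submodules app full-clone

# Show which commit each submodule is pinned to
git -C full-clone submodule status
```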

Reflog

Sometimes you might accidentally reset a branch or lose a commit, and Git’s reflog can help recover those lost changes. The reflog records each time HEAD or a branch tip changes in your local repository, making it easier to find commits that are no longer reachable from any branch.

Example: Recovering a Lost Commit

If you mistakenly reset a branch and lost commits, use reflog to find the commit:

git reflog

This will show you where HEAD has pointed recently, newest entry first:

d5f5e7f HEAD@{0}: reset: moving to HEAD^
a72c0e4 HEAD@{1}: commit: Added new feature

To recover the lost commit (note that --hard also discards any uncommitted changes in your working tree):

git reset --hard a72c0e4

Now your branch is restored to the commit that was accidentally reset.
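
To see the recovery end to end without risking real work, the same steps can be tried in a throwaway repository. This is a minimal sketch: the file name and commit messages are made up, and it uses the HEAD@{1} syntax (the position of HEAD one move ago) instead of copying the commit ID by hand.

```shell
#!/bin/sh
# Sketch: lose a commit with reset --hard, then restore it from the reflog.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

echo "v1" > file.txt
git add file.txt
git commit -q -m "Initial commit"
echo "v2" > file.txt
git commit -q -am "Added new feature"
lost=$(git rev-parse HEAD)

# Simulate the accident: drop the latest commit
git reset -q --hard HEAD^

# The reflog still remembers where HEAD was before the reset
git reflog -n 2

# HEAD@{1} is the previous position of HEAD; resolve it to a commit ID
recovered=$(git rev-parse 'HEAD@{1}')

# Restore the branch (this discards any uncommitted changes)
git reset -q --hard "$recovered"
git log --oneline -n 1
```

If you would rather not move the current branch, git branch recovered-work followed by the commit ID creates a new branch at the lost commit instead.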


Bisect

The git bisect command is a powerful debugging tool that helps you quickly find the commit that introduced a bug or issue in your code. It automates the process of binary search within your Git history, allowing you to efficiently narrow down the exact commit that caused the problem.

How git bisect Works

  1. Binary Search: git bisect works by dividing your commit history into two parts: a “good” part where the code works as expected and a “bad” part where the bug is present. It then repeatedly checks the middle commit, asking you if it’s good or bad, cutting the search space in half with each step. This is much faster than manually checking commits one by one.

  2. Iterative Testing: You will mark commits as either “good” (bug-free) or “bad” (buggy) as git bisect guides you through the process. Git continues narrowing the range of suspect commits until it pinpoints the exact commit that introduced the bug.

Basic Workflow

  1. Start Bisect:

    • First, tell Git that you want to begin the bisect process:
    git bisect start
  2. Mark a Bad Commit:

    • Identify the commit where the bug is present, usually the most recent one:
    git bisect bad
  3. Mark a Good Commit:

    • Identify a commit from the past where the bug was not present:
    git bisect good <commit-id>
  4. Iterative Search:

    • Git will now check out a commit in the middle of your good and bad range.
    • You test this version of your code. If the bug is present, mark it as bad:
    git bisect bad
    • If the bug is not present, mark it as good:
    git bisect good
  5. Repeat:

    • Git will continue narrowing down the range of commits by checking the midpoint, and you’ll keep marking commits as either good or bad.
  6. Identify the Culprit:

    • Once the offending commit is found, Git will output the commit details. You can inspect this commit to understand what change introduced the bug.
  7. End Bisect:

    • After you’ve found the commit, terminate the bisect session:
    git bisect reset

Example: Using git bisect

Suppose you know that the most recent commit (HEAD) contains a bug, but you are unsure when it was introduced. You also know that the code was working fine a few commits ago. Here’s how you can use git bisect:

  1. Start the bisect session:

    git bisect start
  2. Mark the most recent commit as bad:

    git bisect bad
  3. Mark an older commit as good (one where the bug wasn’t present):

    git bisect good abc1234
  4. Git will check out a middle commit, and you test it. Let’s say the bug is still present, so you mark it as bad:

    git bisect bad
  5. Git checks out another commit halfway through, and you find the bug is not present, so you mark it as good:

    git bisect good
  6. This process continues until Git identifies the specific commit that introduced the bug.

Automating Bisect

You can automate the testing part of git bisect if your project has a script that can determine whether the bug is present. For example:

git bisect run ./test_script.sh

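Git will automatically run your script on each checked-out commit and mark it as good or bad based on the script’s exit code: 0 means good, 125 means the commit cannot be tested and should be skipped, and any other code from 1 to 127 means bad. Exit codes above 127 abort the bisect session.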
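
Here is a minimal sketch of an automated bisect in a throwaway repository. Everything in it is made up for illustration: the history has eight commits, the "bug" is simply status.txt changing from ok to broken in the sixth commit, and the check script stands in for a real test suite.

```shell
#!/bin/sh
# Sketch: let git bisect run find the commit that introduced a fake bug.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Demo User"
git config user.email "demo@example.com"

# Build 8 commits; from the 6th one on, status.txt says "broken"
i=1
while [ "$i" -le 8 ]; do
  if [ "$i" -ge 6 ]; then echo "broken" > status.txt; else echo "ok" > status.txt; fi
  echo "$i" > counter.txt
  git add status.txt counter.txt
  git commit -q -m "Commit $i"
  if [ "$i" -eq 1 ]; then first_good=$(git rev-parse HEAD); fi
  if [ "$i" -eq 6 ]; then culprit=$(git rev-parse HEAD); fi
  i=$((i + 1))
done

# Hypothetical test script, kept outside the repository:
# exits 0 (good) if status.txt says "ok", 1 (bad) otherwise
check=$(mktemp)
cat > "$check" <<'EOF'
#!/bin/sh
grep -q '^ok$' status.txt
EOF

# Start with HEAD as bad and the first commit as good, then automate the search
git bisect start HEAD "$first_good" >/dev/null
git bisect run sh "$check" >/dev/null

# After the run, refs/bisect/bad points at the first bad commit
found=$(git rev-parse refs/bisect/bad)
git bisect reset >/dev/null 2>&1

echo "First bad commit: $(git log -1 --format=%s "$found")"
```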

When to Use git bisect

Advantages of git bisect


Worktrees

Worktrees allow you to have multiple working directories in the same Git repository. This is particularly useful when you need to work on multiple branches simultaneously without switching between them.

Example: Creating a Worktree

Let’s say you’re working on a feature branch but need to quickly check something on the main branch:

git worktree add ../main-worktree main

This creates a new directory, ../main-worktree, where the main branch is checked out. You can now work on both branches simultaneously.

To remove a worktree:

git worktree remove ../main-worktree
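
A worktree can also create its branch on the fly, which is handy for a quick fix. The sketch below runs in a throwaway repository; the branch name hotfix/typo and the directory ../project-hotfix are hypothetical.

```shell
#!/bin/sh
# Sketch: create a branch in a second worktree, commit there, then clean up.
set -e
base=$(mktemp -d)
cd "$base"
git init -q project
cd project
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "hello" > README.md
git add README.md
git commit -q -m "Initial commit"

# Create a new branch and check it out in a sibling directory in one step
git worktree add -q -b hotfix/typo ../project-hotfix

# Commit in the new worktree without touching the original checkout
echo "fix" >> ../project-hotfix/README.md
git -C ../project-hotfix commit -q -am "Fix typo"

# Both directories share one repository, so the commit is visible from here
git log --oneline -n 1 hotfix/typo
git worktree list

# Remove the extra worktree when done (the branch itself is kept)
git worktree remove ../project-hotfix
```

Note that Git refuses to check out the same branch in two worktrees at once, so each worktree needs its own branch.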

Sparse Checkout

Sparse checkout is a Git feature that allows you to check out only specific parts of a large repository. This is useful when you only need a subset of files from a repository.

Example: Using Sparse Checkout

Let’s say you have a large monorepo hosted at https://github.com/example/monorepo.git, and you only want to work with the src/, include/, and docs/ directories.

First, clone the repository:

git clone --no-checkout https://github.com/example/monorepo.git
cd monorepo

Next, specify the directories or files you want to check out:

git sparse-checkout set src/ include/ docs/

Then run checkout to populate the working tree with the specified directories:

git checkout

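This will check out only the src, include, and docs directories. In cone mode, the default in recent Git versions, files in the repository root are checked out as well.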
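
The same sequence can be tried locally without a remote. This sketch builds a small stand-in monorepo in a temporary directory and then makes a sparse clone of it; the directory names and the choice of keeping only src and docs are assumptions for the demo.

```shell
#!/bin/sh
# Sketch: sparse checkout against a throwaway local "monorepo".
set -e
base=$(mktemp -d)
cd "$base"

# Build a small stand-in repository with four top-level directories
git init -q monorepo
cd monorepo
git config user.name "Demo User"
git config user.email "demo@example.com"
for dir in src include docs tools; do
  mkdir "$dir"
  echo "$dir content" > "$dir/file.txt"
done
git add .
git commit -q -m "Initial layout"
cd ..

# Clone without populating the working tree
git clone -q --no-checkout monorepo sparse-clone
cd sparse-clone

# Restrict the working tree to two directories, then populate it
git sparse-checkout set src docs
git checkout -q

# Only the selected directories appear on disk
ls
git sparse-checkout list
```

Running git sparse-checkout disable later restores the full working tree.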


Git Flow and Branching Strategies

Git branching strategies help teams manage the development process more efficiently. Two common strategies are Git Flow and GitHub Flow.

Git Flow

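In Git Flow, you have several branches: main (or master) holds production releases, develop integrates completed work, and short-lived feature/*, release/*, and hotfix/* branches support them.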

Example: Starting a Feature Branch

git checkout -b feature/new-feature develop

After finishing the feature, you merge it back into the develop branch:

git checkout develop
git merge feature/new-feature
git push origin develop

GitHub Flow

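GitHub Flow is a simpler alternative to Git Flow, involving just two kinds of branches: a main branch that is always deployable, and short-lived feature branches that are merged back through pull requests.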


Conclusion

In this third part of our Git series, we explored essential features for collaboration and automation. From Git hooks to submodules, reflog, and branching strategies, these advanced tools help streamline development in both individual and team environments.

Stay tuned for more on how to further optimize your Git workflows in upcoming articles.