9  Git & GitHub

~8 hours | Version Control | Beginner-Intermediate

Learning Objectives

  • Understand why version control is essential for research
  • Master Git fundamentals: commits, branches, merges
  • Use GitHub for collaboration and sharing
  • Implement best practices for research workflows

How This Connects

Version control complements the reproducibility practices in Module 7: a commit history records exactly which version of the code produced each result. Together, they form the foundation of modern research workflows.

8.1 Why Version Control?

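Ever had files like analysis_v2_final_FINAL_revised.do? Version control replaces this pile of copies with a single file whose full history is recorded: every change is saved with a message, an author, and a timestamp, and any earlier version can be restored.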

8.2 Git Fundamentals

# Initialize a repository
$ git init

# Check status
$ git status

# Stage files
$ git add filename.py
$ git add .                    # Stage all changes in the current directory and below

# Commit changes
$ git commit -m "Add regression analysis"

# View history
$ git log --oneline

# See unstaged changes (use "git diff --staged" for changes already staged)
$ git diff
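
To see how these commands fit together, here is a minimal sketch of one full cycle (init, stage, commit, edit, diff, commit again) in a throwaway directory. It is written as a script, so the `$` prompt is omitted. The file name analysis.py, the commit messages, and the name and email are placeholders, not part of any real project.

```shell
#!/bin/sh
set -eu

# Work in a throwaway directory so nothing else on disk is touched
repo="$(mktemp -d)"
cd "$repo"

git init --quiet
# Placeholder identity, set for this repository only so commits can be created
git config user.name "Example Researcher"
git config user.email "researcher@example.com"

# Create a file, stage it, and record the first commit
echo 'print("baseline regression")' > analysis.py
git add analysis.py
git commit --quiet -m "Add regression analysis"

# Edit the file: git diff now reports the unstaged change
echo 'print("robustness check")' >> analysis.py
git diff --stat

# Stage and commit the edit, then list the history (newest first)
git add analysis.py
git commit --quiet -m "Add robustness check"
git log --oneline
```

Each commit is a snapshot you can return to later, which is why small commits with descriptive messages tend to be easier to work with than one large commit at the end of the day.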

8.3 Introduction to GitHub

# Clone a repository
$ git clone https://github.com/username/repo.git

# Connect local to GitHub
$ git remote add origin https://github.com/username/repo.git

# Push changes to GitHub (add -u on the first push to set the upstream branch)
$ git push origin main

# Pull changes from GitHub
$ git pull origin main
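
The push and pull cycle can be tried without a GitHub account. In the sketch below, a local bare repository stands in for https://github.com/username/repo.git; that substitution is an assumption made only so the script runs offline, and the directory names (remote.git, project, collaborator) are hypothetical. Against GitHub, the commands are the same with the URL in place of the local path.

```shell
#!/bin/sh
set -eu

work="$(mktemp -d)"

# Assumption: a bare repository plays the role of the GitHub remote
git init --quiet --bare "$work/remote.git"

# Local project with one commit
git init --quiet "$work/project"
cd "$work/project"
git config user.name "Example Researcher"
git config user.email "researcher@example.com"
echo "# My project" > README.md
git add README.md
git commit --quiet -m "Add README"
git branch -M main   # make sure the branch is named main

# Connect local to the remote and push; -u records origin/main as the upstream
git remote add origin "$work/remote.git"
git push --quiet -u origin main

# A collaborator clones the remote, then pulls to pick up any newer commits
git clone --quiet --branch main "$work/remote.git" "$work/collaborator"
cd "$work/collaborator"
git pull --quiet origin main
git log --oneline
```

After the first `git push -u`, a plain `git push` or `git pull` on that branch uses the recorded upstream, so the remote and branch names no longer need to be typed.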

8.4 Branching and Merging

# Create and switch to branch
$ git checkout -b feature-new-analysis

# Make commits on branch...
$ git add . && git commit -m "Add new analysis"

# Switch back to main
$ git checkout main

# Merge branch into main
$ git merge feature-new-analysis

# Delete branch after merge
$ git branch -d feature-new-analysis
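
The following sketch runs the same branch and merge sequence end to end in a throwaway repository, so you can see that work on the branch stays off main until the merge. File names and the identity are placeholders. Because main does not change while the branch is in progress, this merge is a simple fast-forward; if both branches had edited the same lines, Git would stop and ask you to resolve the conflict.

```shell
#!/bin/sh
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Researcher"
git config user.email "researcher@example.com"

# Starting point on main
echo "baseline" > analysis.txt
git add analysis.txt
git commit --quiet -m "Add baseline analysis"
git branch -M main

# Create and switch to the branch, then commit there
git checkout --quiet -b feature-new-analysis
echo "new model" > new_analysis.txt
git add new_analysis.txt
git commit --quiet -m "Add new analysis"

# Back on main, the new file is not present yet
git checkout --quiet main
ls

# Merge the branch into main, then delete the merged branch
git merge --quiet feature-new-analysis
git branch -d feature-new-analysis
git log --oneline
```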

8.5 Collaboration Workflows

Pull Requests

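A pull request (PR) is GitHub's way of proposing changes: it asks to merge the commits on one branch into another, so collaborators can review and discuss them first. The workflow: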

  1. Create a branch for your feature
  2. Push branch to GitHub
  3. Open a Pull Request
  4. Request review
  5. Merge when approved
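
Steps 1 and 2 above happen on your machine; steps 3 to 5 happen on GitHub. The sketch below runs the local steps against a bare repository that stands in for the GitHub remote (an assumption, so the script works offline). The GitHub steps appear only as comments, using the optional GitHub CLI (`gh`), which is a separate install and needs authentication; the web interface does the same job. Branch, file, and title names are placeholders.

```shell
#!/bin/sh
set -eu

work="$(mktemp -d)"

# Assumption: a bare repository plays the role of the GitHub remote
git init --quiet --bare "$work/remote.git"
git init --quiet "$work/project"
cd "$work/project"
git config user.name "Example Researcher"
git config user.email "researcher@example.com"
echo "baseline" > analysis.txt
git add analysis.txt
git commit --quiet -m "Add baseline analysis"
git branch -M main
git remote add origin "$work/remote.git"
git push --quiet -u origin main

# 1. Create a branch for your feature and commit to it
git checkout --quiet -b feature-new-analysis
echo "new model" > new_analysis.txt
git add new_analysis.txt
git commit --quiet -m "Add new analysis"

# 2. Push the branch to the remote
git push --quiet -u origin feature-new-analysis

# 3-5. Open the pull request, request review, and merge on GitHub.
# With the GitHub CLI this could look like (not run here):
#   gh pr create --title "Add new analysis" --body "Adds the new model"
#   gh pr merge --merge

# The remote now holds both branches, ready for a pull request
git branch -r
```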

8.6 Best Practices for Research

The .gitignore File

# .gitignore for research projects

# Data files (often too large or sensitive)
data/raw/*.csv
data/raw/*.dta
*.xlsx

# Output (can be regenerated)
output/figures/*
output/tables/*

# Environment, secrets, and cache files
.env
__pycache__/
*.pyc
.Rhistory

# OS files
.DS_Store
Thumbs.db
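
Before committing, it is worth checking that the ignore rules match the files you expect. The sketch below uses a shortened copy of the rules above in a throwaway repository; the file names (survey.csv, fig1.png, budget.xlsx, code/analysis.py) are made up for illustration. `git check-ignore -v` reports which rule matches each path, and `git status --short` lists only what Git would still track.

```shell
#!/bin/sh
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init --quiet

# Shortened copy of the .gitignore rules
cat > .gitignore <<'EOF'
data/raw/*.csv
*.xlsx
output/figures/*
.env
EOF

# Hypothetical project files: four that should be ignored, one that should not
mkdir -p data/raw output/figures code
touch data/raw/survey.csv output/figures/fig1.png budget.xlsx .env code/analysis.py

# Show the rule that causes each path to be ignored
git check-ignore -v data/raw/survey.csv output/figures/fig1.png budget.xlsx .env

# Ignored files are absent from this listing; .gitignore and code/ remain
git status --short
```

Note that .gitignore only affects untracked files. A file that was committed before the rule was added stays tracked until it is removed from the index with `git rm --cached`.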

GitHub Features for Researchers

  • Releases: Tag versions of your code (v1.0, v2.0)
  • Issues: Track bugs and tasks
  • GitHub Pages: Host project websites for free
  • Actions: Automate testing and deployment
  • Zenodo integration: Archive a release on Zenodo and get a citable DOI for your code
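
GitHub Releases are built on Git tags, so the local half of publishing a version is creating a tag. The sketch below creates an annotated tag in a throwaway repository; the tag name v1.0 follows the example above, while the file name and messages are placeholders. Pushing the tag is shown only as a comment because it needs a remote.

```shell
#!/bin/sh
set -eu

repo="$(mktemp -d)"
cd "$repo"
git init --quiet
git config user.name "Example Researcher"
git config user.email "researcher@example.com"

echo "final model" > analysis.txt
git add analysis.txt
git commit --quiet -m "Add analysis used in the paper"

# Annotated tag: stores a message, tagger, and date alongside the commit
git tag -a v1.0 -m "Version used for the submitted paper"

# List tags and confirm which tag describes the current commit
git tag --list
git describe

# To publish the tag to a remote named origin (not run here):
#   git push origin v1.0
```

Once the tag is on GitHub, a release can be created from it in the web interface, which is also the point at which an archive service such as Zenodo can pick it up.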