0 Languages & Platforms
Before diving into coding, let's get oriented. This module introduces the three programming languages I use throughout the course—Python, Stata, and R—and the software environments where you'll write and run your code.
If you're completely new to programming, this can feel overwhelming. Three languages? Multiple software tools? Don't worry. By the end of this module, you'll understand what each tool is for and which ones to focus on first. You don't need to master everything; you need to know where to start.
What You'll Learn
- The differences between Python, Stata, and R—and when to use each
- How to navigate the software environments (IDEs) for each language
- How to set up your computer for coding, or use cloud-based alternatives
The Three Languages
Each language in this course emerged from a different community with different goals. Understanding their origins helps you choose the right one for each task.
| Aspect | Python | Stata | R |
|---|---|---|---|
| Born | 1991 | 1985 | 1993 |
| Origin | General-purpose scripting | Statistical analysis | Statistical computing |
| License | Free, open-source | Commercial (paid) | Free, open-source |
| Strength | ML, versatility, AI tools | Econometrics, panel data | Statistics, visualization |
| Learning curve | Gentle | Gentle for basics | Steeper initially |
| AI assistance | Excellent | Good | Good |
My Recommendation: Start with Python and Stata
One of the most common questions I get is "Which language should I learn first?" Here's my answer as of the beginning of this course (Jan 2026):
- Python is excellent for beginners and is the language that AI tools (ChatGPT, Claude, Copilot) understand best. You may not strictly need it for your economics research (depending on your subfield), but I still encourage everyone to prioritize learning it as an investment: when you ask an LLM to help you code, it will be most fluent in Python. This makes debugging and learning much faster.
- Stata is probably still the dominant language in academic economics. You'll need it to replicate published papers, work as a research assistant, and collaborate with senior researchers. Most replication packages from top journals are in Stata.
R is great for statistical analysis, and is superior if you want advanced visualization (ggplot2) or specific causal inference packages (fixest, rdrobust). Also, it's *free*!
The Software Environments (IDEs)
An IDE (Integrated Development Environment) is the software where you write and run your code. Think of it like a word processor for code—it provides syntax highlighting, error checking, and tools to run your programs.
Each language has a preferred environment, but some IDEs (like VS Code) can handle multiple languages. I've created detailed guides for each:
RStudio
The standard IDE for R. Free, powerful, and designed specifically for statistical analysis.
Read the RStudio Guide →
Stata
Stata has its own built-in IDE. Learn the interface and the essential Do-file Editor.
Read the Stata Guide →
Visual Studio Code
My recommended editor for Python. Free, works with any language, great AI tool integration.
Read the VS Code Guide →
Jupyter & Google Colab
Interactive notebooks for Python. Colab requires no installation: just open it in your browser.
Read the Notebooks Guide →
(Note: part 2 of the course --ProTools ER2-- will cover AI-powered IDEs like "Claude code desktop" and "LM studio". For the moment, however, it is important to learn coding outside of fully-AI-assisted environments. Hold your FOMO! (="Fear Of Missing Out", for boomers))
Quick Start: No Installation Needed
If you want to start coding immediately without installing anything, use these cloud-based options:
| Language | Cloud Option | Link |
|---|---|---|
| Python | Google Colab | colab.research.google.com |
| R | Posit Cloud | posit.cloud |
| Stata | University remote desktop | Check with your IT department |
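If you open a new notebook in Google Colab, a first cell could look like the sketch below. It uses only Python's standard library, so nothing needs to be installed. The variable names and the wage figures are made up for illustration; they are not course data.

```python
# A first cell to try in a notebook: summary statistics with the standard library.
import statistics

# Hypothetical hourly wages (illustrative numbers only).
wages = [12.5, 15.0, 9.75, 22.0, 18.25]

mean_wage = statistics.mean(wages)
median_wage = statistics.median(wages)

print(f"Observations: {len(wages)}")
print(f"Mean wage:    {mean_wage:.2f}")
print(f"Median wage:  {median_wage:.2f}")
```

Paste it into a cell and run it. If three lines of output appear, your environment works and you are ready to go.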
Which Tool for Which Task?
Here are some tips for choosing tools for common research tasks. These suggestions assume you are a PhD candidate in Economics, but even so, take them with a pinch of salt: the "most needed" tools vary across subfields. In the end, you need to figure out what's best for you!
| Task | (Probably) best tool | Why |
|---|---|---|
| Replicating an economics paper | Stata | Most replication packages are in Stata (statement valid as of now; expect things to change) |
| Machine learning / Deep learning | Python | Python has excellent libraries. You can run code in Colab (an online IDE) if you want to use cloud GPUs (however, you'll need to pay ...) |
| Publication-quality visualizations | R | Great visualization capabilities (e.g., ggplot2) |
| Web scraping | Python | Best libraries (BeautifulSoup, Selenium) |
| Panel data econometrics | Stata or R | Purpose-built for this |
| Quick exploratory analysis | Whatever you are most comfortable with (I use Python, often in an online IDE such as Colab) | No setup, immediate results |
| Collaborating with economists | Any language, really | Popularity is probably still Stata >= R >> Python (my guess); expect Python to gain momentum |
| Learning with AI assistance | Python | AI tools are most fluent in Python |
File Formats Reference
As you work with code and data, you'll encounter many different file types. The key distinction to remember is whether a file holds code, data, or output.
Files by Language
Each programming language has its own file formats. Here's what belongs to what:
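| Language | Extension | Type |
|---|---|---|
| Python | .py | CODE |
| Python | .ipynb | CODE |
| Stata | .do | CODE |
| Stata | .dta | DATA |
| Stata | .ado | CODE |
| Stata | .log | OUTPUT |
| R | .R | CODE |
| R | .Rmd | CODE |
| R | .rds | DATA |
| R | .RData | DATA |
| Any | .csv | DATA |
| Any | .xlsx | DATA |
| Any | .json | DATA |
| Any | .txt | DATA |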
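To make the code / data / output distinction concrete, here is a minimal Python sketch that looks up a file's role from its extension, using the file types listed above. The `FILE_ROLES` dictionary, the `file_role` function, and the example filenames are hypothetical names chosen for this illustration.

```python
from pathlib import Path

# Extension -> role, following the file types listed above.
# Keys are lowercase so the lookup is case-insensitive (".R" and ".r" both match).
FILE_ROLES = {
    ".py": "CODE", ".ipynb": "CODE",
    ".do": "CODE", ".ado": "CODE", ".dta": "DATA", ".log": "OUTPUT",
    ".r": "CODE", ".rmd": "CODE", ".rds": "DATA", ".rdata": "DATA",
    ".csv": "DATA", ".xlsx": "DATA", ".json": "DATA", ".txt": "DATA",
}


def file_role(filename: str) -> str:
    """Return CODE, DATA, or OUTPUT for a filename, or UNKNOWN if it is not listed."""
    suffix = Path(filename).suffix.lower()
    return FILE_ROLES.get(suffix, "UNKNOWN")


# Hypothetical project files (names are made up).
for name in ["clean_data.do", "wages.dta", "analysis.R", "results.log", "notes.pdf"]:
    print(f"{name:15} {file_role(name)}")
```

A lookup like this can help when tidying a project folder: code files are what you keep under version control, while data and output files can usually be regenerated by running the code.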
- Sharing data with anyone? Use .csv: it works everywhere
- Working in Stata? Save as .dta to keep your variable labels
- Working in R? Save as .rds to preserve data types
- Always save your code! The .py, .do, or .R file is more important than the data output
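One reason native formats such as .dta and .rds exist is that a .csv file is plain text, so it does not record data types or labels. The sketch below illustrates this with Python's standard `csv` module: a number written to CSV comes back as a string. The dataset is made up, and an in-memory buffer stands in for a file on disk. Libraries such as pandas guess types when loading a CSV, but that is inference at read time, not something stored in the file.

```python
import csv
import io

# Hypothetical two-row dataset with a numeric column (illustrative values only).
rows = [
    {"country": "FR", "gdp_growth": 1.1},
    {"country": "DE", "gdp_growth": 0.2},
]

# Write to an in-memory text buffer instead of a file on disk.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["country", "gdp_growth"])
writer.writeheader()
writer.writerows(rows)

# Rewind and read the same text back.
buffer.seek(0)
loaded = list(csv.DictReader(buffer))

print(type(rows[0]["gdp_growth"]).__name__)    # float before saving
print(type(loaded[0]["gdp_growth"]).__name__)  # str after loading

# Convert explicitly before doing arithmetic on loaded values.
growth_values = [float(row["gdp_growth"]) for row in loaded]
print(growth_values)
```

So .csv remains the safest choice for sharing, but expect to re-declare types (and re-attach any labels) after loading.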
What's Next?
Now that you understand the landscape, here's what to do:
- Choose your starting point. If you're new to programming, I recommend starting with Python in Google Colab (zero setup required). However, the habits you form at the start can be very persistent, so I encourage you to explore different environments early on to find what works best for you. I particularly enjoy working in Visual Studio Code because it provides a robust development environment with excellent debugging capabilities, and you can code in different languages, all in one environment.
- Read the relevant guide. Use the guide cards above to learn your chosen environment.
- Move to Module 1. Once you can run code, you're ready to start learning the basics.
Don't feel like you need to read all the IDE guides now. You can always come back to them when you need a specific tool. The most important thing is to start coding. Look at the bottom right of the page: there is a chatbot assistant specifically trained to answer questions on the course materials. Use it to ask questions as you go along! A log (i.e., a transcript) of the conversations will be stored, and I will use it to improve the course and expand the materials in the direction people need the most. So, by using the chatbot, you'll be contributing to a public good!