Stata Interface Guide
Stata is the most widely used statistical software in academic economics. It's particularly strong for econometrics, panel data analysis, and survey data. If you're going into applied economics research, you will use Stata.
What is Stata?
Unlike R and Python, Stata is both a programming language and a complete software package. When you buy Stata, you get everything: the language, the interface, the documentation, and thousands of built-in statistical commands. This "all-in-one" approach is part of why economists love it—everything just works together.
For the nerds: more on Stata's language layers
Stata's language operates at two levels:
- Interactive commands / do-files: The primary way users interact with Stata. You type commands like `regress y x1 x2` or `summarize income`, and you can save sequences of commands in `.do` files to create reproducible scripts.
- Ado-files: Stata's higher-level programming language for writing new commands. Most of Stata's built-in commands are actually written in ado, and users can write their own. Ado is interpreted (not compiled).
Mata then sits underneath as a compiled, C-like matrix language for performance-critical code.
So when people say "Stata," they're typically referring to both the software application and its scripting/command language. Mata is a separate but integrated language within the Stata ecosystem.
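To make the layers concrete, here is a minimal sketch that computes the same mean twice: once through a small ado-style command and once through a Mata function. The names `hello_mean` and `colmean()` are made up for illustration, and the example assumes the `auto` dataset that ships with Stata.

```stata
* Load an example dataset that ships with Stata
sysuse auto, clear

* --- Ado layer: define a new command (hypothetical name: hello_mean) ---
capture program drop hello_mean
program define hello_mean, rclass
    syntax varname(numeric)                 // accept exactly one numeric variable
    quietly summarize `varlist', meanonly   // leaves the mean in r(mean)
    display as text "Mean of `varlist': " as result r(mean)
    return scalar mean = r(mean)            // hand the result back to the caller
end

hello_mean price

* --- Mata layer: the same calculation as a matrix-language function ---
capture mata: mata drop colmean()           // hypothetical name; dropped so the block can be re-run
mata:
real scalar colmean(real colvector x)
{
    return(mean(x))
}
end

* Pull the price column into Mata and apply the function
mata: colmean(st_data(., "price"))
```

In everyday work you will mostly stay at the command level. Ado and Mata matter once you start writing your own commands.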
Stata requires a license ($125+ for students, more for professionals). However, most universities provide free access through site licenses. Check with your IT department or library. If you're at Sciences Po, Stata is available on campus computers and through remote desktop.
Stata Versions
Stata comes in several versions. The main differences are in memory capacity and processing speed:
| Version | Variables Limit | Best For |
|---|---|---|
| Stata/BE (Basic Edition) | 2,048 variables | Small datasets, learning |
| Stata/SE (Standard Edition) | 32,767 variables | Most research projects |
| Stata/MP (Multiprocessor) | 120,000 variables | Large datasets, parallel processing |
For most coursework and research, Stata/SE is sufficient. Your university likely provides either SE or MP.
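If you are not sure which edition you are running, Stata can tell you. The sketch below uses the `about` command and a few system values (see `help creturn`). The exact output depends on your installation.

```stata
* Print edition, version, and license information
about

* The same facts as system values you can use in code
display "Stata version:          " c(stata_version)
display "SE or MP edition (1/0): " c(SE)
display "MP edition (1/0):       " c(MP)
display "Current variable limit: " c(maxvar)
```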
The Stata Interface
When you open Stata, you'll see a window with five main panels: History, Results, Command, Variables, and Properties.
The Do-file Editor: Where Real Work Happens
The Command window is fine for quick tests, but all serious Stata work should be done in Do-files. A Do-file is a script that contains a sequence of Stata commands. Using Do-files makes your work reproducible, shareable, and easier to debug.
To open the Do-file Editor:
- Go to Window > Do-file Editor > New Do-file Editor
- Or press `Ctrl+9` (Windows) / `Cmd+9` (Mac)
- Or type `doedit` in the Command window
Essential Stata Commands
Here are the commands every Stata user needs to know:
| Command | What it does | Example |
|---|---|---|
| `use` | Load a dataset | `use "data.dta", clear` |
| `describe` | Show variable names and types | `describe` |
| `summarize` | Summary statistics | `summarize income age` |
| `tabulate` | Frequency tables | `tab education` |
| `generate` | Create a new variable | `gen log_income = log(income)` |
| `replace` | Modify existing variable | `replace age = . if age < 0` |
| `regress` | OLS regression | `reg income education age` |
| `help` | Get help on any command | `help regress` |
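To try these commands without your own data, here is a short session you can paste into Stata. It is only a sketch: it swaps in `nlsw88`, an example dataset that ships with Stata, so the variable names (`wage`, `grade`, `age`, `collgrad`) differ from the ones in the table.

```stata
* Load a built-in example dataset (stand-in for your own .dta file)
sysuse nlsw88, clear

* Inspect structure and basic statistics
describe
summarize wage age

* Frequency table for a categorical variable
tabulate collgrad

* Create and modify variables
generate log_wage = log(wage)
replace age = . if age < 0        // no changes expected here; shown for the syntax

* OLS regression of wage on years of schooling and age
regress wage grade age
```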
A Sample Do-file
Here's what a typical Do-file looks like. I recommend always starting with the commands shown below:

    /*******************************************************************************
    * Project: Analysis of Wage Data
    * Author: Your Name
    * Date: January 2026
    * Purpose: Explore determinants of wages
    *******************************************************************************/

    * Clear everything and set up
    clear all
    set more off

    * Set working directory (adjust to your path)
    cd "/Users/yourname/research/wage_project"

    * Start a log file to save all output
    log using "analysis_log.txt", text replace

    * Load the data
    use "wage_data.dta", clear

    * Examine the data
    describe
    summarize

    * Summary statistics for key variables
    summarize wage education experience, detail

    * Run a simple regression
    regress wage education experience i.female

    * Close the log file
    log close

- Always start with `clear all` to ensure a clean environment
- Use `set more off` to prevent Stata from pausing output
- Start a log file to save all your output
- Comment liberally using `*` or `/* */`
- Use relative paths after setting your working directory
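One addition many users make to this setup is `capture log close` at the very top. If an earlier run stopped with an error, its log is still open and `log using` would fail. The sketch below also shows relative paths in action. The file names `preamble_demo.log` and `auto_copy.dta` are placeholders, and it assumes you have already moved into your project folder with `cd`.

```stata
capture log close                 // close a leftover log; ignore the error if none is open
clear all
set more off

* Assumption: the working directory is already your project folder
display "Working directory: `c(pwd)'"

* Relative path: the log lands in the working directory
log using "preamble_demo.log", text replace

* Relative paths work the same way for data files
sysuse auto, clear
save "auto_copy.dta", replace
use "auto_copy.dta", clear
summarize price

log close
```

Because nothing in the script depends on an absolute path except the one `cd` line, a collaborator only needs to change that line to run it.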
Understanding Log Files
A log file is a text record of everything that happens in your Stata session: every command you run and all the output it produces. Log files are essential for reproducibility—they let you (and others) see exactly what you did and what results you got.
Creating a Log File
You start and stop logging with simple commands:

    * Start logging to a text file (overwrites if exists)
    log using "my_analysis.log", text replace

    * ... run your analysis here ...

    * Stop logging
    log close

The `text` option creates a plain text file (readable anywhere). If you leave it out and the file name does not end in `.log`, Stata normally writes a `.smcl` file instead (Stata Markup and Control Language), which preserves formatting but is best viewed in Stata.
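If you end up with a SMCL log and need plain text later, Stata's `translate` command can convert it. This is a minimal sketch: `session_demo` is a placeholder name, and the `auto` dataset stands in for your own analysis.

```stata
capture log close _all

* No text option and a .smcl extension: Stata writes a SMCL log
log using "session_demo.smcl", replace
sysuse auto, clear
summarize mpg
log close

* Convert the SMCL log to plain text (format inferred from the extensions)
translate "session_demo.smcl" "session_demo.log", replace
```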
Sample Log File
Here's what a typical log file looks like after running some basic analysis:

    -------------------------------------------------------------------------------
          name:  <unnamed>
           log:  /Users/researcher/projects/wages/my_analysis.log
      log type:  text
     opened on:  15 Jan 2026, 10:32:15

    . * Load the dataset
    . use "wages.dta", clear

    . * Describe the data
    . describe

    Contains data from wages.dta
     Observations:         1,000
        Variables:             5                  15 Jan 2026 09:15
    -------------------------------------------------------------------------------
    Variable      Storage   Display    Value
        name         type    format    label      Variable label
    -------------------------------------------------------------------------------
    wage            float   %9.0g                 Hourly wage (dollars)
    education       byte    %9.0g                 Years of education
    experience      byte    %9.0g                 Years of experience
    female          byte    %9.0g      sex        1 = Female
    age             byte    %9.0g                 Age in years
    -------------------------------------------------------------------------------
    Sorted by:

    . * Summary statistics
    . summarize wage education experience

        Variable |        Obs        Mean    Std. dev.       Min        Max
    -------------+---------------------------------------------------------
            wage |      1,000       22.47       12.35       5.25      98.50
       education |      1,000       13.24        2.68          8         20
      experience |      1,000       17.82       11.45          0         45

    . * Run regression
    . regress wage education experience female

          Source |       SS           df       MS      Number of obs   =     1,000
    -------------+----------------------------------   F(3, 996)       =    142.56
           Model |   45892.123         3   15297.374   Prob > F        =    0.0000
        Residual |  106834.877       996     107.263   R-squared       =    0.3005
    -------------+----------------------------------   Adj R-squared   =    0.2984
           Total |  152727.000       999     152.880   Root MSE        =    10.357

    ------------------------------------------------------------------------------
            wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
    -------------+----------------------------------------------------------------
       education |     2.4521     0.1423    17.23   0.000       2.1728      2.7314
      experience |     0.3845     0.0412     9.33   0.000       0.3036      0.4654
          female |    -3.2156     0.6534    -4.92   0.000      -4.4982     -1.9330
           _cons |    -8.7234     2.0145    -4.33   0.000     -12.6781     -4.7687
    ------------------------------------------------------------------------------

    . log close
          name:  <unnamed>
           log:  /Users/researcher/projects/wages/my_analysis.log
      log type:  text
     closed on:  15 Jan 2026, 10:32:18
    -------------------------------------------------------------------------------

Key Parts of a Log File
- Header: Shows when and where the log was created
- Commands: Each command you ran appears after a dot (`.`)
- Output: The results of each command appear directly below it
- Footer: Shows when the log was closed

- Use `log using "filename", append` to add to an existing log instead of replacing it
- Name your logs descriptively: `analysis_v2_2026-01-15.log`
- Store logs alongside your do-files for easy reference
- If you forget to close a log, use `log close _all` to close any open logs
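The `append` option and named logs can be combined as in the sketch below. The file name `append_demo.log` and the log name `demo` are placeholders. Naming a log with `name()` is optional, but it lets you close one specific log by name.

```stata
capture log close _all            // start from a clean state

* First session: create (or overwrite) the log
log using "append_demo.log", text replace name(demo)
display "first session"
log close demo

* Second session: add to the same file instead of replacing it
log using "append_demo.log", text append name(demo)
display "second session"
log close demo

* Show the combined log in the Results window
type "append_demo.log"
```

After running this, the file contains both sessions one after the other, each with its own header and footer.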
Essential Keyboard Shortcuts
| Shortcut (Windows) | Shortcut (Mac) | Action |
|---|---|---|
| Ctrl + D | Cmd + Shift + D | Run selected code in Do-file Editor |
| Ctrl + 9 | Cmd + 9 | Open Do-file Editor |
| Ctrl + S | Cmd + S | Save current Do-file |
| Page Up | Page Up | Previous command (in Command window) |
| F1 | F1 | Help for selected command |
Getting Help in Stata
Stata has excellent built-in documentation. To get help on any command:
- `help regress`: Opens help for the `regress` command
- `search panel data`: Searches all documentation for "panel data"
- `findit xtreg`: Searches for user-written commands
Video Tutorials
- Official Stata Video Tutorials — Free tutorials from StataCorp
- Stata Tutorial: Introduction to Stata (YouTube) — Good beginner overview
- StataCorp YouTube Channel — Official channel with many tutorials