Fixed Effects

Will Horne

Panel Data

Panel Data (or longitudinal data) is data in which you observe the same unit over multiple time periods
- A panel survey in which the same individual is resurveyed over multiple rounds
- Test scores for schools over multiple years
- State level unemployment rates
Because the same units appear in the data many times, the standard OLS assumptions are likely to be violated
- Are independent errors plausible in these cases?

Hierarchical Data

Sometimes, our data is nested in some way that creates a hierarchy
- For example, individuals might be nested in cities
- Cities are nested in states
- And states are nested in countries
Again, if we don’t account for this structure, our estimates can be biased and our standard errors misleading.

Fixed Effects

The most popular (but not the only) way to deal with these data structures is with fixed effects models.
- With nested data, random effects models are often good options. Beyond the scope of this class but see posted reading.
Intuition: Fixed effects models allow us to control for the invariant features of these units. This allows us to estimate within unit effects

Example: International Trade Policy

Imagine we want to know whether a visit from Germany’s chancellor increases trade with Germany in the next year
Global trade data is often structured as “stacked” data, with each row being a country-year and each country entering the data many times
Trade policy with Germany might also depend on things like the local culture, a country’s history with Germany, geography, resource endowments and more.
- This would make a messy DAG. How can we deal with this?

Panel Data Structure

   Year Country Geography CurrentPolitics ChancellorVisit TradeWithGermany
1  1990 Finland      1500      -0.4738078       152.72004        107.05227
2  1991 Finland      1500       1.7566339       148.09848         97.35936
3  1992 Finland      1500      -1.5531657       147.80263         98.52434
4  1993 Finland      1500      -1.3853555       150.05642        108.38914
5  1994 Finland      1500      -1.3254298       147.58362        102.70535
6  1995 Finland      1500       0.8584380       148.64469         96.46766
7  1990  France       400      -0.9530696        32.12637         25.61469
8  1991  France       400       0.1494820        39.29465         28.65127
9  1992  France       400      -0.3079278        31.96830         25.85378
10 1993  France       400       2.3832139        41.56196         31.36891
11 1994  France       400      -0.5136004        41.46984         20.30912
12 1995  France       400      -0.7170133        41.17091         34.09697

DAG

Adding a Fixed Effect

If we add a “fixed effect” for a country, we control for everything about that country that does not vary (or varies very slowly) over the time period of the data
For example, controlling for “France” implicitly controls for France’s proximity to Germany, its resource endowment, its political culture and its history with Germany
- This makes identifying effects much easier!

Cleaner DAG

What do fixed effects do?

Essentially, fixed effects control for the unit
- Can be: Country, individual, state, school, year, etc
As with any control, this means we hold the unit fixed and look only at variation within the unit
Key point: We get rid of the “between” variation. Maybe we don’t want to do this!
- Pooled regression retains between variation (but is likely biased)
- Random effects allow for partial pooling
  - We will return to these, but bias concerns remain!

Simple Example

Imagine we tracked two people and looked at how many hours per week they exercised and how many colds they got every year. Data from just two years might look like this:

# A tibble: 4 × 5
# Groups:   Individual [2]
  Individual  Year Exercise MeanExercise WithinExercise
  <chr>      <dbl>    <dbl>        <dbl>          <dbl>
1 Esnaina     2024        5          6             -1  
2 Esnaina     2025        7          6              1  
3 Mariam      2024        4          3.5            0.5
4 Mariam      2025        3          3.5           -0.5
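The two derived columns can be reproduced by hand. The following is a minimal sketch in plain Python, using only the four rows shown in the table; the variable names are illustrative choices, not part of the original analysis.

```python
from collections import defaultdict

# Rows copied from the table above: (individual, year, hours of exercise per week).
rows = [
    ("Esnaina", 2024, 5.0),
    ("Esnaina", 2025, 7.0),
    ("Mariam", 2024, 4.0),
    ("Mariam", 2025, 3.0),
]

# Step 1: collect each person's observations.
by_person = defaultdict(list)
for person, _year, exercise in rows:
    by_person[person].append(exercise)

# Step 2: the individual-specific mean (the MeanExercise column).
mean_exercise = {p: sum(v) / len(v) for p, v in by_person.items()}

# Step 3: the within deviation (the WithinExercise column),
# i.e. each observation minus that person's own mean.
within_exercise = [ex - mean_exercise[p] for p, _year, ex in rows]

print(mean_exercise)    # {'Esnaina': 6.0, 'Mariam': 3.5}
print(within_exercise)  # [-1.0, 1.0, 0.5, -0.5]
```

The within deviations sum to zero for each person, which is why each individual's mean can be treated as that individual's origin.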

Pooled Data

What is the relationship between colds and exercise?

Pooled Regression

Adding Individual Means

Think of the individual means as the origin (0,0) on each individual graph

Just Between Variation

These are the individual-specific factors that we are going to control away. It appears that Mariam has a higher propensity for colds (and exercise)!

Individual Data

Combining and Regressing

Intuition

We removed all of the between variation by subtracting the individual-specific mean from each variable, and then using only the within-individual variation that remains
By using fixed effects, we lose the ability to utilize between group variation
- If we are interested in between variation, we cannot use fixed effects!
By controlling for an individual, we are also controlling for all time-invariant features of that individual
We still need to control for things that vary within the individual
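To make the between/within split concrete, here is a small sketch in plain Python that decomposes the variation in exercise hours from the two-person example (5 and 7 for Esnaina, 4 and 3 for Mariam). The sum-of-squares names are illustrative; only the four exercise values come from the slides.

```python
# Exercise hours from the two-person example, grouped by individual.
exercise = {"Esnaina": [5.0, 7.0], "Mariam": [4.0, 3.0]}

all_values = [x for values in exercise.values() for x in values]
grand_mean = sum(all_values) / len(all_values)

# Total variation: squared distance of every observation from the grand mean.
total_ss = sum((x - grand_mean) ** 2 for x in all_values)

between_ss = 0.0  # variation of individual means around the grand mean
within_ss = 0.0   # variation of observations around their own individual mean
for values in exercise.values():
    person_mean = sum(values) / len(values)
    between_ss += len(values) * (person_mean - grand_mean) ** 2
    within_ss += sum((x - person_mean) ** 2 for x in values)

print(total_ss, between_ss, within_ss)  # 8.75 6.25 2.5
```

Of the 8.75 units of total variation, 6.25 are between individuals and only 2.5 are within. A fixed effects model uses the 2.5 and discards the rest.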

Mechanics

The standard way to write a fixed effects regression is as follows:

\[ Y_{it} = \beta_{i} + \beta_{1}X_{it} + \epsilon_{it} \]

What is different from the standard OLS line?

We now have individual specific intercepts \(\beta_{i}\)

\(X_{it}\) now varies across individuals and time

Our error is generated by both individual and temporal factors. Sometimes the error is decomposed into those pieces, for example a unit component \(u_{i}\) and a time component \(\epsilon_{t}\), plus any idiosyncratic error

Varying Intercepts

Tip

It may not be immediately obvious, but letting the intercept vary by individual is the same as adding a “dummy” (aka binary) variable for each individual.

What does the coefficient of a binary variable tell us?

Equivalently: a control for “India” tells us the change we would expect in Y, given that we are in India rather than not in India.
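A quick way to see what a binary coefficient measures is the simplest case with no other covariates, where it equals the difference in group means. The sketch below is plain Python on made-up numbers (they are not Gapminder values), and `ols_intercept_slope` is a hypothetical helper written for this illustration.

```python
def ols_intercept_slope(x, y):
    """One-regressor OLS: returns (intercept, slope)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope

# Hypothetical outcomes, invented for illustration only.
in_india = [1, 1, 1, 0, 0, 0]  # dummy: 1 = India, 0 = not India
outcome = [60.0, 62.0, 64.0, 45.0, 47.0, 49.0]

intercept, dummy_coef = ols_intercept_slope(in_india, outcome)

# Group means computed directly, for comparison.
india = [y for d, y in zip(in_india, outcome) if d == 1]
other = [y for d, y in zip(in_india, outcome) if d == 0]
mean_india = sum(india) / len(india)
mean_other = sum(other) / len(other)

print(intercept, mean_other)                 # intercept = mean of the 0 group
print(dummy_coef, mean_india - mean_other)   # coefficient = gap between groups
```

Once other covariates enter the model, the same coefficient is read as the gap between groups holding those covariates fixed, which is the shift in intercept.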

Gapminder

Country Specific Intercepts

So, we see that at any level of GDP per Capita, India has a longer life expectancy by ~15 years.

Mechanics

One way to fit fixed effects regression: include a dummy variable for every individual

Or, fit the following regression:

\[ Y_{it} - \bar{Y}_{i} = \beta_{0} + \beta_{1}(X_{it} - \bar{X}_{i}) + \epsilon_{it} \]
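The two routes can be checked against each other numerically. The sketch below is plain Python on a made-up two-unit panel; `solve`, `ols` and `demean` are hypothetical helpers written for this illustration, not functions from any package. It fits the dummy-variable regression through the normal equations and compares its slope with the slope from the demeaned data.

```python
def solve(a, b):
    """Solve the linear system a @ coef = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def ols(columns, y):
    """OLS coefficients from the normal equations (X'X) coef = X'y."""
    xtx = [[sum(a * b for a, b in zip(ci, cj)) for cj in columns] for ci in columns]
    xty = [sum(a * b for a, b in zip(ci, y)) for ci in columns]
    return solve(xtx, xty)

# Hypothetical panel: two units, three periods each (values are made up).
unit = ["A", "A", "A", "B", "B", "B"]
x = [1.0, 2.0, 3.0, 6.0, 7.0, 9.0]
y = [5.0, 5.5, 7.0, 1.0, 2.5, 3.0]

# Route 1: one dummy per unit (so no common intercept) plus x.
d_a = [1.0 if u == "A" else 0.0 for u in unit]
d_b = [1.0 if u == "B" else 0.0 for u in unit]
alpha_a, alpha_b, beta_dummies = ols([d_a, d_b, x], y)

# Route 2: subtract each unit's mean from x and y, then regress.
def demean(values):
    means = {u: sum(v for v, g in zip(values, unit) if g == u) / unit.count(u)
             for u in set(unit)}
    return [v - means[u] for v, u in zip(values, unit)]

x_w, y_w = demean(x), demean(y)
# The demeaned variables have mean zero, so a fitted intercept would be zero
# and the slope reduces to sum(x*y) / sum(x*x).
beta_within = sum(a * b for a, b in zip(x_w, y_w)) / sum(a * a for a in x_w)

print(beta_dummies, beta_within)  # the two slopes agree up to rounding
```

With many units the dummy route means estimating one extra coefficient per unit, which is presumably why demeaning is the more practical of the two.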

Why does this work?
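A short derivation, starting from the fixed effects model \(Y_{it} = \beta_{i} + \beta_{1}X_{it} + \epsilon_{it}\) and assuming nothing beyond it. Average both sides over time within each unit \(i\). Because \(\beta_{i}\) does not vary over time, it is its own average:

\[ \bar{Y}_{i} = \beta_{i} + \beta_{1}\bar{X}_{i} + \bar{\epsilon}_{i} \]

Subtract this from the original equation:

\[ Y_{it} - \bar{Y}_{i} = \beta_{1}(X_{it} - \bar{X}_{i}) + (\epsilon_{it} - \bar{\epsilon}_{i}) \]

The unit-specific intercept \(\beta_{i}\) cancels, along with anything else about the unit that is constant over time, while \(\beta_{1}\) is untouched. The demeaned variables have mean zero by construction, so an intercept \(\beta_{0}\) fitted to them is estimated as exactly zero.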

fixest package

OLS estimation, Dep. Var.: lifeExp
Observations: 1,704
Fixed-effects: country: 142
Standard-errors: Clustered (country) 
               Estimate Std. Error t value  Pr(>|t|)    
log(gdpPercap)  9.76896   0.701507 13.9257 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 5.06038     Adj. R2: 0.832467
                Within R2: 0.410119

Note - there can be problems fitting fixed effects on non-continuous data, but fixest has implemented options for most GLMs that we’ve discussed.

Interpretation

OLS estimation, Dep. Var.: lifeExp
Observations: 1,704
Fixed-effects: country: 142
Standard-errors: Clustered (country) 
               Estimate Std. Error t value  Pr(>|t|)    
log(gdpPercap)  9.76896   0.701507 13.9257 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 5.06038     Adj. R2: 0.832467
                Within R2: 0.410119

How should we interpret this?

In a year where the log of GDP per capita is one unit higher than it typically is for that country, we would expect life expectancy to be about 9.8 years longer than it typically is for that country
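A one-unit change in log GDP per capita is large: it corresponds to GDP per capita multiplying by about 2.72. The sketch below, in plain Python, converts the coefficient into more familiar percentage changes. It assumes the model used the natural log (which is what R's `log()` computes by default); `expected_change` is a hypothetical helper for this illustration.

```python
import math

beta = 9.76896  # coefficient on log(gdpPercap) from the output above

def expected_change(pct_increase):
    """Expected within-country change in life expectancy (years) for a
    given percentage increase in GDP per capita, assuming natural logs."""
    return beta * math.log(1 + pct_increase / 100)

ten_percent = expected_change(10)   # roughly 0.93 years
doubling = expected_change(100)     # roughly 6.77 years

print(round(ten_percent, 2), round(doubling, 2))
```

So a year in which a country's GDP per capita is 10% above its own typical level is associated with life expectancy about 0.93 years above that country's typical level.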

Policy Specific Example

Imagine we wanted to know whether building more houses leads to increases or decreases in home prices in the following quarter
We can imagine various things we would want to control for, like crime rates, quality of education in the city, etc
But, both new builds and housing prices are probably also driven by unobservable, city specific factors.
We can “control away” those city specific factors (as long as they don’t vary over time in our data) by using fixed effects

The Data

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 130710  453759  648997  706368 1051994 1339969

     city popularity    crime edu_quality new_houses    price
1 Atlanta -0.9034835 41.07687    6.915975         29 242180.7
2 Atlanta -0.9034835 49.29670    4.428896         33 257568.8
3 Atlanta -0.9034835 44.48307    4.720051         28 213320.5
4 Atlanta -0.9034835 52.05405    5.361506         20 211581.7
5 Atlanta -0.9034835 41.67420    4.830725         31 267976.9
6 Atlanta -0.9034835 39.43562    4.002756         24 263874.4

Pooled Data

City Level Averages

This is the “between” relationship. We can interpret this as “cities where more houses are built tend to have higher prices”

Within Effects

Naive (Pooled) Model


Call:
lm(formula = price ~ new_houses + crime + edu_quality, data = sim_data)

Residuals:
    Min      1Q  Median      3Q     Max 
-705236 -119945  -17964  118379  671889 

Coefficients:
             Estimate Std. Error t value Pr(>|t|)    
(Intercept) -347899.5    40615.0  -8.566   <2e-16 ***
new_houses    22329.2      402.8  55.437   <2e-16 ***
crime           197.5      506.9   0.390    0.697    
edu_quality    1561.0     5314.1   0.294    0.769    
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 185000 on 1296 degrees of freedom
Multiple R-squared:  0.7039,    Adjusted R-squared:  0.7032 
F-statistic:  1027 on 3 and 1296 DF,  p-value: < 2.2e-16

Fixed Effects Model

OLS estimation, Dep. Var.: price
Observations: 1,300
Fixed-effects: city: 13
Standard-errors: Clustered (city) 
            Estimate Std. Error   t value   Pr(>|t|)    
new_houses  -392.977   160.2970  -2.45156 3.0506e-02 *  
crime       -403.296    38.2012 -10.55717 1.9874e-07 ***
edu_quality 1958.689   663.3114   2.95290 1.2079e-02 *  
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
RMSE: 30,302.4     Adj. R2: 0.991937
                 Within R2: 0.027989

Summing Up

Why the Naive Model Gets it “Wrong”: Popular cities have higher housing prices and more new homes built. When you pool data across all cities, you capture that correlation.

How Fixed Effects Fix the Bias: Fixed effects allow each city to have its own intercept. The regression then looks only at variation within each city, eliminating city-level confounders that are fixed over time. The within-city estimate on new houses flips to negative (increased supply lowers price).

When to use Fixed Effects: Whenever you suspect time-invariant differences across groups (cities, states, firms, etc.) could bias your estimates.
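The sign flip can be reproduced in a toy setting. The sketch below is plain Python on deterministic, made-up data (it is not the `sim_data` used for the output above): a fixed city "popularity" raises both building and prices, while building itself lowers price by 2 per house. `slope` and `demean` are hypothetical helpers for this illustration.

```python
def slope(x, y):
    """One-regressor OLS slope: cov(x, y) / var(x)."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxy = sum((a - x_bar) * (b - y_bar) for a, b in zip(x, y))
    sxx = sum((a - x_bar) ** 2 for a in x)
    return sxy / sxx

# Hypothetical cities with a time-invariant popularity (all numbers made up).
popularity = {"A": 0.0, "B": 1.0, "C": 2.0}
shocks = [-3.0, -1.0, 1.0, 3.0]  # period-to-period swings in building

city, houses, price = [], [], []
for name, pop in popularity.items():
    for shock in shocks:
        h = 20 + 10 * pop + shock    # popular cities build more
        p = 200 + 60 * pop - 2 * h   # popularity raises price; building lowers it
        city.append(name)
        houses.append(h)
        price.append(p)

# Pooled slope: mixes the between-city and within-city relationships.
pooled = slope(houses, price)

def demean(values):
    """Subtract each city's own mean from its observations."""
    out = []
    for v, c in zip(values, city):
        group = [w for w, g in zip(values, city) if g == c]
        out.append(v - sum(group) / len(group))
    return out

# Within slope: uses only deviations from each city's own mean.
within = slope(demean(houses), demean(price))

print(pooled, within)  # pooled is positive; within recovers the true -2
```

The pooled slope is positive only because popularity is left out; once each city is compared with itself, the built-in negative effect of supply reappears.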

Assumptions and Caveats

Each regressor must vary over time for at least some i, and cannot be perfectly collinear with the fixed effects
Fixed Effects cannot solve reverse causality
- A model (FE or pooled) of crime on police spending per capita will typically suggest that more police causes more crime. Why?
  - Other time-series approaches are more appropriate, but beyond the scope of this class
Fixed effects cannot solve time-varying unobserved heterogeneity
Time invariant variables must have time-invariant effects (see Ren and Allison on Canvas)
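The first caveat is easy to see by demeaning a regressor that never changes within a unit. The sketch below, in plain Python, uses the Geography values from the trade panel shown earlier (1500 for Finland, 400 for France), truncated to three rows per country; `demean` is a hypothetical helper for this illustration.

```python
# Geography is constant within each country in the trade panel.
country = ["Finland"] * 3 + ["France"] * 3
geography = [1500.0, 1500.0, 1500.0, 400.0, 400.0, 400.0]

def demean(values, groups):
    """Subtract each group's mean from its observations."""
    out = []
    for v, g in zip(values, groups):
        members = [w for w, h in zip(values, groups) if h == g]
        out.append(v - sum(members) / len(members))
    return out

geo_within = demean(geography, country)
within_ss = sum(v ** 2 for v in geo_within)

print(geo_within)  # [0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
print(within_ss)   # 0.0
```

With zero within variation there is nothing left to estimate a slope from: the cov/var formula would divide by zero, and the variable is perfectly collinear with the country dummies.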

Random Effects?

Random effects allow for “partial pooling”
- Rather than letting each unit-specific mean take on any value, we assume the \(\beta_{i}\) follow a known distribution.
Random effects have some nice features
- Estimates are more precise (smaller standard errors)
- Uses information about all cases to improve estimates of individual intercepts
- Rather than within variation, estimates are a weighted average of within and between variation
  - Is this a good thing? Maybe!
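One standard textbook way to see the "weighted average" is the quasi-demeaned form of the random effects estimator. This is a sketch under the usual assumptions, not something derived in these slides. Here \(\sigma^{2}_{u}\) (variance of the unit intercepts), \(\sigma^{2}_{\epsilon}\) (variance of the idiosyncratic error), \(T\) (number of periods per unit) and the composite error \(v_{it}\) are notation introduced for this illustration:

\[ Y_{it} - \theta\bar{Y}_{i} = \beta_{0}(1 - \theta) + \beta_{1}(X_{it} - \theta\bar{X}_{i}) + (v_{it} - \theta\bar{v}_{i}) \]

\[ \theta = 1 - \sqrt{\frac{\sigma^{2}_{\epsilon}}{\sigma^{2}_{\epsilon} + T\sigma^{2}_{u}}} \]

When \(\theta = 0\) (no unit-level variance) this is the pooled regression. As \(\theta\) approaches 1 (large unit-level variance or many periods) it approaches the fully demeaned fixed effects regression. Anything in between subtracts only part of each unit's mean, so some between variation is retained.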

Random Effects Problems

Random effects only give credible estimates if the individual intercepts are uncorrelated with our treatment and control variables
- This seems unlikely. We would be saying “The unobserved city effect is not related to new housing builds, crime, education, etc.”
In practice, the simple random effects model is almost never used anymore
- Random effects live on in hierarchical/multilevel modelling. Important and useful, but unfortunately beyond the scope…
- McElreath’s Statistical Rethinking (2020) and Gelman and Hill (2006) are both good introductions

Should you cluster your standard errors?

Conventional standard errors assume that the errors are independent across observations, but if data is clustered this may not hold
Conventional wisdom is to cluster your standard errors at the level of your fixed effects. fixest even does this automatically. But…
- Only necessary if you have treatment effect heterogeneity
- or if your treatment assignment is clustered (i.e., a whole classroom gets treated)
See the Abadie et al. paper on Canvas for much more!

Two Way Fixed Effects

Can we get greedy and try to control away unobserved variation from multiple sources, like individuals and the city they live in?
Yes, but…
- Suppose we do use city and individual. Now we are only looking within individual and within city, controlling for all city-specific and individual-specific heterogeneity
- Individuals who don’t move contribute nothing to the city fixed effects, because their city never varies and so its effect is “absorbed” by their individual fixed effect.

Two Way Fixed-Effects with Time

It would be very nice to also be able to adjust for any trends that vary with time
This was a very popular method in papers on development.
Imai and Kim show that generally: “It is impossible to simultaneously and non-parametrically adjust for unit-specific and time-specific unobserved confounders under the two-way fixed effects framework”
- Robert Kubinec has a good blog post on issues with TWFE: https://www.robertkubinec.com/post/fixed_effects/
We will return to these points when we cover Diff-in-Diff