Chapter 7. Basic Methods for Establishing Causal Inference
Chapter 8. Advanced Methods for Establishing Causal Inference
Initial Postings: Read and reflect on the assigned readings for the week. Then post what you thought was the most important concept(s), method(s), term(s), and/or any other thing that you felt was worthy of your understanding in each assigned textbook chapter. Your initial post should be based upon the assigned reading for the week, so the textbook should be a source listed in your reference section and cited within the body of the text. Other sources are not required, but feel free to use them if they aid in your discussion.
Also, provide a graduate-level response to each of the following questions:
- Causal inference is used as a secondary or tertiary tool in root cause analysis. Please explain how causal inference and root cause analysis are used in problem detection. Respond to this discussion board (DB) in the context of your field of employment. For example, if you are in I.T., respond to this DB by explaining the cause of a network failure; or if you are in the food industry, use this DB to explain the cause of a recent decline in customer satisfaction. Please address each component of the discussion board. Also, cite examples according to APA standards.
[Your post must be substantive and demonstrate insight gained from the course material. Postings must be in the student’s own words – do not provide quotes!]
[Your initial post should be at least 450 words and in APA format (including Times New Roman with font size 12 and double spacing). Post the actual body of your paper in the discussion thread, then attach a Word version of the paper for APA review]
Basic Methods for Establishing Causal Inference
Chapter 7
© 2019 McGraw-Hill Education. All rights reserved. Authorized only for instructor use in the classroom. No reproduction or distribution without the prior written consent of McGraw-Hill Education
Learning Objectives
Explain the consequences of key assumptions failing within a causal model
Explain how control variables can improve causal inference from regression analysis
Use control variables in estimating a regression equation
Explain how proxy variables can improve causal inference from regression analysis
Use proxy variables in estimating a regression equation
Explain how functional form choice can affect causal inference from regression analysis
The assumptions to estimate the parameters of a regression equation are:
The data-generating process for an outcome, Y, can be expressed as: Yi = α + β1X1i + … + βKXKi + Ui
{Yi, X1i, …, XKi} is a random sample
E[U] = E[U × X1] = … = E[U × XK] = 0
If these assumptions hold, we can use our regression equation estimates as “good guesses” for the parameters.
Assessing Key Assumptions within a Causal Model
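As a sketch with hypothetical parameter values, the following simulates a data-generating process that satisfies the three assumptions and recovers "good guesses" for the parameters by least squares:

```python
import numpy as np

# Simulate Y = alpha + beta1*X1 + beta2*X2 + U, where U has mean zero
# and is uncorrelated with the X's (all numbers are hypothetical).
rng = np.random.default_rng(0)
n = 5_000
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
u = rng.normal(size=n)                 # mean zero, independent of X's
y = 2.0 + 1.5 * x1 - 0.5 * x2 + u

# Stack a column of ones for the intercept and solve min ||y - Xb||^2
X = np.column_stack([np.ones(n), x1, x2])
b, *_ = np.linalg.lstsq(X, y, rcond=None)
print(b)  # close to (2.0, 1.5, -0.5)
```

Because the assumptions hold in this simulation, the estimates land close to the true parameters; the later slides examine what happens when they fail.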
Assumption 1 states that the determining function is linear in the parameters, and that other factors—in the form of the error term—are additive (they simply add on at the end)
For example:
Total Costs = Fixed Costs + f1Factor1 + … + fJFactorJ
Each FactorJ represents a factor of production and fJ its price
If we have data on Factor1 through FactorK, where K < J:
Total Costsi = α + β1Factor1i + … + βKFactorKi + Ui
Assessing Key Assumptions within a Causal Model
Assumption 2 states that our sample is random
There are many ways to collect a random sample, but all start with first defining the population
For example, we may define the population as all individuals in the United States, and then randomly draw Social Security numbers to build the sample.
When dealing with populations that span multiple periods of time, we treat what was observed for a given period of time as a realization from a broader set of possibilities
Random Sample
A key merit of drawing a random sample is that, on average, it should look like a smaller version of the population from which we are drawing
The information in a random sample should “represent” the population
For any given sample of data, randomness does not guarantee that it represents the population well
Random vs Representative Sample
Random vs Representative Sample
Suppose we draw a random sample of 20 people from all the customers, asking them about their age and their rating of the product
The problem with this sample is that it is not representative of the population with respect to customers over age 40
To avoid situations like this, it is common practice to take measures to collect a representative sample
Age and Rating Data for a New Product
Representative sample: a sample whose distribution approximately matches that of the population for a subset of observed, independent variables
Constructing a representative sample:
Step 1: Choose the independent variables whose distribution you want to be representative
Step 2: Use information about the population to stratify (categorize) each of the chosen variables
Step 3: Use information about the population to pre-set the proportion of the sample that will be selected from each stratum
Step 4: Collect the sample by randomly sampling from each stratum, where the number of random draws from each stratum is set according to the proportions determined in Step 3
Random vs. Representative Sample
We are interested in how ratings depend on age, so age plays the role of the independent variable:
Step 1: With just one independent variable, this step is trivial—we want a representative sample according to age
Step 2: We need to utilize information we have about the population. We know that 30% of the population is over the age of 40. We can stratify the data into two groups: over 40 and 40 and under.
Random vs Representative Sample
Random vs Representative Sample
Step 3: We use our knowledge of the population to determine the proportion of our sample coming from these two strata: 30% should be over 40 and 70% should be 40 and under. If our sample size is N = 1,000, we will have 300 who are over 40 and 700 who are 40 and under
Step 4: We may collect a random sample larger than 1,000 to ensure there are at least 300 who are over 40 and at least 700 who are 40 and under. Then, randomly select 300 from the subgroup who are over 40, and randomly select 700 from the group who are 40 and under
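The four steps above can be sketched in code, assuming a hypothetical array of customer ages as the population; the 30%/70% proportions and N = 1,000 follow the example in these slides:

```python
import numpy as np

# Hypothetical population of customer ages
rng = np.random.default_rng(1)
population_ages = rng.integers(18, 80, size=100_000)

# Step 1: the variable to be representative on is Age
# Step 2: stratify into "over 40" and "40 and under"
over_40 = population_ages[population_ages > 40]
under_41 = population_ages[population_ages <= 40]

# Step 3: pre-set the share of the sample drawn from each stratum
n_over, n_under = 300, 700

# Step 4: draw randomly *within* each stratum
sample = np.concatenate([
    rng.choice(over_40, size=n_over, replace=False),
    rng.choice(under_41, size=n_under, replace=False),
])
print((sample > 40).mean())  # exactly 0.30 by construction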
The concepts of random and representative are not mutually exclusive when it comes to data samples; a sample can be both
However, if we construct a representative sample through stratification, then by construction it is not truly a random sample
Constructing a representative sample ensures that we observe the pertinent range of our independent variables
Construction of a representative sample often ensures that we have substantial variation in the independent variables
Random vs. Representative Sample
Consequences of Nonrandom Samples
The construction of a representative sample generally results in a nonrandom sample
A sample that is nonrandom is also known as a selected sample
There are two fundamental ways in which a sample can be nonrandom or selected. It can be selected according to:
The independent variables (Xs)
The dependent variable (Y)
Selection by independent variable
The regression line for the data set is:
Rating = 40 + 0.5 × Age
Using just data for Age < 30 will simply limit where, along the line, we are observing data; using just these data points will not bias our estimates of the regression line
Assessing Key Assumptions within a Causal Model
Selection by dependent variable
The sample is selected such that it contains only observations where the rating is above 60 (above the green line)
Selecting the sample based on Rating (the dependent variable) may cause problems when estimating the regression equation
Selection on the dependent variable may create a situation where E[Ui] = E[Xi × Ui] = 0 holds for the full population, but E[Ui] ≠ 0 and E[Xi × Ui] ≠ 0 for the selected subset of the population
Assessing Key Assumptions within a Causal Model
Selection by dependent variable
Selecting only data points where Rating is above 60 has two important consequences:
The mean value of the errors is positive for the selected subset
The errors and Age are negatively correlated
Assessing Key Assumptions within a Causal Model
Assumption 3 states that E[U] = E[U × X1] = … = E[U × XK] = 0. This means we assume the errors have a mean of zero and are not correlated with the treatments in the population
Violation of this assumption, meaning there exists correlation between the errors and at least one treatment, is known as an endogeneity problem
The component(s) of the error, Ui, that are correlated with a treatment(s), X, are known as confounding factors
No Correlation Between Errors and Treatment
Three main forms in which endogeneity problems generally materialize:
Omitted variable: any variable contained in the error term of a data-generating process, due to lack of data or simply a decision not to include it
Measurement error: when one or more of the variables in the determining function (typically at least one of the treatments) is measured with error
Simultaneity: can arise when one or more of the treatments is determined at the same time as the outcome; often occurs when some amount of reverse causality is present
No Correlation Between Errors and Treatment
Control variable: any variable included in a regression equation whose purpose is to alleviate an endogeneity problem
A control variable is a confounding factor that is added to the determining function
Control Variables
Yi = α + β1X1i + … + βKXKi + Ui
The variable C is a confounding factor within the data-generating process if:
C affects the outcome, Y
C is correlated with at least one treatment (Xj)
If so, C is a good control, and its inclusion as part of the determining function can help mitigate an endogeneity problem
Criterion for a Good Control
Dummy variable: a dichotomous variable (one that takes on values 0 or 1) that is used to indicate the presence or absence of a given characteristic
Typically utilized in regression equations in lieu of categorical, ordinal, or interval variables
Dummy Variables
Categorical variable: indicates membership in one of a set of two or more mutually exclusive categories that do not have an obvious ordering
Ordinal variable: indicates membership in one of a set of two or more mutually exclusive categories that have an obvious ordering, but the difference in values is not meaningful
Interval variable: indicates membership in one of a set of two or more mutually exclusive categories that have an obvious ordering, and the difference in values is meaningful
Types of Variables
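The consequences of selecting on the dependent variable can be illustrated with a small simulation, using hypothetical data built around the slides' line Rating = 40 + 0.5 × Age:

```python
import numpy as np

# Simulate the full population, then select only Rating > 60
rng = np.random.default_rng(2)
n = 10_000
age = rng.uniform(20, 60, size=n)
u = rng.normal(0, 10, size=n)               # errors, mean zero overall
rating = 40 + 0.5 * age + u

def slope(x, y):
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.lstsq(X, y, rcond=None)[0][1]

b_full = slope(age, rating)                  # close to 0.5

# Selection on the dependent variable
keep = rating > 60
b_sel = slope(age[keep], rating[keep])

print(u[keep].mean())                         # positive, not zero
print(np.corrcoef(age[keep], u[keep])[0, 1])  # negative
print(b_full, b_sel)                          # b_sel pulled toward 0
```

Young customers clear the Rating > 60 bar only with large positive errors, which produces exactly the two consequences above: positive mean errors and a negative error-Age correlation in the selected subset, biasing the estimated slope.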
Suppose we have a data-generating process: Salesi = α + β1Commissioni + β2Locationi + Ui
We cannot regress Sales on Commission and Location, since Location does not take on numerical values
Instead, include the dummy variables created for Location as part of the determining function, rather than the Location variable itself:
Salesi = α + β1Commissioni + β2LosAngelesi + β3Chicagoi + Ui
Base group: the excluded dummy variable among a set of dummy variables representing a categorical, ordinal, or interval variable
Dummy Variables
Selecting Controls
The variables that theory says should affect the outcome should all be included in the regression
All these variables belong as part of the data-generating process
These variables can serve as valuable data sanity checks
A data sanity check for a regression is a comparison between the estimated coefficient for an independent variable in a regression and the value for that coefficient as predicted by theory
When selecting controls:
Identify variables that theoretically should or might affect the outcome
Include variables that theoretically should affect the outcome
For variables that theoretically might affect the outcome, include those that prove to affect the outcome empirically through a hypothesis test
For variables that theoretically might affect the outcome, discard those that prove irrelevant through a hypothesis test
Selecting Controls
Proxy variable: a variable used in a regression equation in order to proxy for a confounding factor, in an attempt to alleviate the endogeneity problem caused by that confounding factor
Proxy Variables
Functional form choice can affect causal inference from regression analysis
Assume the following data-generating process: Salesi = α + βHoursi + Ui
This implies that Sales changes with Hours at a constant rate of β (e.g., if β is 12, then each one-hour increase in Hours increases Sales by 12)
Form of the Determining Function
Functional form choice can affect causal inference from regression analysis
Hours may affect Sales in a non-linear way, such that the first few hours have a large effect, but the effect diminishes as hours become large
A quadratic determining function might then be better than the linear determining function for the causal relationship between Sales and Hours:
Salesi = α + β1Hoursi + β2Hours²i + Ui
Form of the Determining Function
Salesi = α + β1Hoursi + β2Hours²i + Ui
If we set Hours = X1 and Hours² = X2, this looks like a generic multiple regression equation
Form of the Determining Function
Consequences of using the wrong functional form:
It constrains the shape of the relationship between Sales and Hours
If we assume it is linear, the effect of Hours is a constant β. If we assume it is quadratic, the effect is not constant; simple calculus shows it is β1 + 2β2 × Hours
The Weierstrass approximation theorem says that if a function is continuous, it can be approximated as closely as desired with a polynomial function
Form of the Determining Function
Quadratic Relationship Between Y and X
This function clearly cannot be approximated well by a linear or quadratic function; however, there is a polynomial that can get extremely close to this highly irregular function
Example of a Continuous but Highly Irregular Function
Laffer Curve
The Laffer curve is based on the idea that tax revenue will be zero both with a zero tax rate and a 100% tax rate, but is positive for tax rates in between
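The quadratic functional form can be sketched with hypothetical parameter values: treating Hours and Hours² as two regressors recovers the curve, and the implied marginal effect β1 + 2β2 × Hours shrinks as Hours grows:

```python
import numpy as np

# Hypothetical quadratic data-generating process with diminishing
# returns to hours
rng = np.random.default_rng(3)
n = 10_000
hours = rng.uniform(0, 10, size=n)
sales = 5 + 12 * hours - 0.8 * hours**2 + rng.normal(0, 2, size=n)

# Treat Hours and Hours^2 as two regressors in a multiple regression
X = np.column_stack([np.ones(n), hours, hours**2])
a, b1, b2 = np.linalg.lstsq(X, sales, rcond=None)[0]

# Marginal effect of an extra hour: large at first, smaller later
print(b1 + 2 * b2 * 1)   # effect near Hours = 1
print(b1 + 2 * b2 * 7)   # smaller effect near Hours = 7
```

Fitting the linear form instead would force a single constant effect, hiding the diminishing returns the quadratic form captures.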
Interpretations of β for Different Log Functional Forms
Log-log measures elasticity: the percentage change in one variable associated with a one-percent change in another
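A small sketch, assuming a hypothetical constant-elasticity relationship, shows how the slope of a log-log regression estimates the elasticity directly:

```python
import numpy as np

# Hypothetical constant-elasticity demand curve with elasticity -1.5
rng = np.random.default_rng(4)
n = 5_000
price = rng.uniform(1, 10, size=n)
quantity = 100 * price ** -1.5 * np.exp(rng.normal(0, 0.1, size=n))

# Log-log regression: the slope is the elasticity
X = np.column_stack([np.ones(n), np.log(price)])
b = np.linalg.lstsq(X, np.log(quantity), rcond=None)[0]
print(b[1])  # close to -1.5: a 1% price rise -> roughly 1.5% quantity drop
```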
Advanced Methods for Establishing Causal Inference
Chapter 8
Learning Objectives
Explain how instrumental variables can improve causal inference in regression analysis
Execute two-stage least squares regression
Judge which types of variables may be used as instrumental variables
Identify a difference-in-difference regression
Execute regression incorporating fixed effects
Distinguish the dummy variable approach from the within estimator for a fixed-effects regression model
Instrumental variables
In the context of regression analysis, a variable that allows us to isolate the causal effect of a treatment on an outcome due to its correlation with the treatment and its lack of correlation with the error term (it affects the outcome only through the treatment)
Can improve causal inference in regression analysis
Instrumental Variables
A firm is attempting to determine how its sales depend on the price it charges for its product
Beginning with a simple data-generating process:
Salesi = α + β1Pricei + Ui
If local demand depends on local income, then local income is a confounding factor:
Salesi = α + β1Pricei + β2Incomei + Ui
Instrumental Variables: An Example
Including income in the model removes local income as a confounding factor
Does its inclusion ensure that no other confounding factors still exist?
Many possibilities may come to mind, including local competition, market size, and market growth rate
Instrumental Variables: An Example
We may be unable to collect data on all confounding factors or find suitable proxies
Then we are unable to remove the endogeneity problem by including controls and/or proxy variables
A widely used method for measuring causality that can circumvent this problem involves instrumental variables
Instrumental Variables
Suppose we know price differences across some of the stores were solely due to differences in fuel costs
When two locations have different prices, we generally cannot attribute differences in sales to price differences, since these two locations likely differ in local competition
Rather than use all of the variation in price across the stores to measure the effect of price on sales, we focus on the subset of price movements due to variation in fuel costs
Instrumental Variables
When two locations have different prices only because their fuel costs differ, any difference in sales can be attributed to price, since fuel costs don't impact sales per se
Instrumental Variables: An Example
Suppose we have the following data-generating function:
Yi = α + β1X1i + β2X2i + … + βKXKi + Ui
A variable Z is a valid instrument for X1 if Z is both exogenous and relevant:
Exogenous: It has no effect on the outcome variable beyond the combined effects of all variables in the determining function (X1…XK)
Relevant: For the assumed data-generating process, Z is relevant as an instrumental variable if it is correlated with X1 after controlling for X2….XK
Instrumental Variables
Two-stage least squares regression (2SLS) is the process of using two regressions to measure the causal effect of a variable while utilizing an instrumental variable
The first stage of 2SLS determines the subset of variation in Price that can be attributed to changes in fuel costs; we can call the variable that tracks this variation P̂rice (the fitted value of Price)
The second stage determines how Sales change with the movements of P̂rice
This means that if we see Sales correlate with P̂rice, there is reason to interpret this co-movement as the causal effect of Price
Two-Stage Least Squares Regression
For an assumed data-generating process:
Yi = α + β1X1i + β2X2i + … + βKXKi + Ui
Suppose X1 is endogenous and Z is a valid instrument for X1. To execute 2SLS, in the first stage we assume:
X1i = γ + δ1Zi + δ2X2i + … + δKXKi + Vi
Then regress X1 on Z, X2, …, XK and calculate predicted values for X1, defined as:
X̂1 = γ̂ + δ̂1Z + δ̂2X2 + … + δ̂KXK
Two-Stage Least Squares Regression
In the second stage, regress Y on X̂1, X2, …, XK
From the second-stage regression, the estimated coefficient for X̂1 is a consistent estimate for β1 (the causal effect of X1 on Y), and the estimated coefficient on X2 is a consistent estimate for β2
Run two consecutive regressions using the predictions from the first as an independent variable in the second
Statistical software combines this process into a single command
Two-Stage Least Squares Regression
2SLS Estimates for Y Regressed on X̂1, X2, and X3
Summary of 2SLS where we have J endogenous variables and L ≥ J instrumental variables
Yi = α + β1X1i + β2X2i + … + βKXKi + Ui
Suppose X1, …, XJ are endogenous and Z1, …, ZL are valid instruments for X1, …, XJ
Execution of 2SLS proceeds as follows:
Two-Stage Least Squares Regression
Two-Stage Least Squares Regression
Regress X1, …, XJ on Z1, …, ZL, XJ+1, …, XK in J separate regressions
Obtain predicted values X̂1, …, X̂J using the corresponding estimated regression equations in Step 1. This concludes “Stage 1”
Regress Y on X̂1, …, X̂J, XJ+1, …, XK, which yields consistent estimates for α, β1, …, βK. This is “Stage 2”
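The two stages can be sketched with plain least squares, assuming a simulated data-generating process with one endogenous treatment, one unobserved confounder, and one valid instrument (all names and parameter values hypothetical):

```python
import numpy as np

# X1 is endogenous because the unobserved confounder C affects both X1
# and Y; Z (e.g., a cost shifter) is exogenous and relevant.
rng = np.random.default_rng(5)
n = 20_000
z = rng.normal(size=n)                         # instrument
c = rng.normal(size=n)                         # unobserved confounder
x1 = z + c + rng.normal(size=n)                # endogenous treatment
y = 2.0 + 1.5 * x1 + 2.0 * c + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
b_ols = ols(np.column_stack([ones, x1]), y)[1]   # biased upward by C

# Stage 1: regress X1 on Z and form the fitted values X1-hat
g = ols(np.column_stack([ones, z]), x1)
x1_hat = g[0] + g[1] * z

# Stage 2: regress Y on X1-hat; its coefficient is consistent for beta1
b_2sls = ols(np.column_stack([ones, x1_hat]), y)[1]
print(b_ols, b_2sls)   # OLS drifts above 1.5; 2SLS stays near 1.5
```

In practice, as noted below, statistical software runs both stages in a single command and corrects the standard errors, which this manual two-pass sketch does not.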
An instrumental variable must be exogenous and relevant, and if so, we can use 2SLS to get consistent estimates for the parameters of the determining function
Can we assess whether the instrumental variable possesses these two characteristics?
Evaluating Instruments
An instrumental variable is exogenous if it is uncorrelated with unobservables affecting the dependent variable
For a data-generating process Yi = α + β1X1i + … + βKXKi + Ui , an instrumental variable Z must have Corr(Z, U) = 0
To test this, we might regress Y on X1, …, XK and calculate the residuals as: ei = Yi ‒ α̂ ‒ β̂1X1i ‒ … ‒ β̂KXKi
We could then calculate the sample correlation between Z and the residuals, believing this to be an estimate of the correlation between Z and U
Exogeneity
The problem is that the residuals were calculated using a regression with an endogenous variable
Our parameter estimates are not consistent, meaning the sample correlation between Z and the residuals generally is not a consistent estimator for the correlation between Z and U
If the number of instrumental variables is equal to the number of endogenous variables, there is no way to test for exogeneity
If the number of instrumental variables is greater than the number of endogenous variables, there are tests that can be performed to find evidence that at least some instrumental variables are not exogenous, but there is no way to test that all are exogenous
Exogeneity
Testing for relevance is simple and can be added when conducting 2SLS
For a data-generating process: Yi = α + β1X1i + … + βKXKi + Ui where X1 is endogenous, Z is relevant if it is correlated with X1 after controlling for X2, …, XK
We can assess whether this is true by regressing X1 on Z, X2…,XK
Relevance
Regression Output for Price Regressed on Income and Fuel Costs
It is important to establish convincing evidence that an instrumental variable(s) is relevant
Doing so avoids a common criticism of instrumental variables centered on the usage of weak instruments
A weak instrument is an instrumental variable that has little partial correlation with the endogenous variable whose causal effect on an outcome it is meant to measure
Relevance
Regression Results for X1 Regressed on X2, X3, Z1, and Z2
Regression Results for Y Regressed on X̂1, X2, and X3
Classical Applications of Instrumental Variables for Business
Cost variables are popular choices as instrumental variables, particularly in demand estimations
Any variable that affects the costs of producing the good or service (input prices, cost per unit, etc.) can be a valid instrument for Price
Prices charged typically depend on costs
Cost variables are often both relevant and exogenous when used to instrument for Price in a demand equation
Classical Applications of Instrumental Variables for Business
Policy change is another popular choice as an instrumental variable
Local sales tax and/or price regulations can serve as instrumental variables for Price in a demand equation
Labor laws can serve as instrumental variables for wages when seeking to measure the effect of wages on productivity
Policy changes often affect business decisions (making them relevant) but often occur for reasons not related to business outcomes (exogenous)
With panel data we are able to observe the same cross-sectional unit multiple times at different points in time
Difference-in-differences regression
Fixed-effects model
Dummy variable estimation
Within estimation
Panel Data Methods
Consider an individual who owns a large number of liquor stores in the states of Indiana and Michigan
Suppose Indiana state government decides to increase the sales tax on liquor sales by 3%
The owner may want to know the effect of this tax increase on her profit
Difference-in-Differences
To learn the effect of tax increase on the profit, the store owner collects data for two years as shown below:
Difference-in-Differences
To assess the effect of a tax hike on profit, the store owner may assume the following data-generating process:
Profitsit = α + βTaxHikeit + Uit
Profitsit is the profit of store i during Year t, and TaxHikeit equals 1 if the 3% tax hike was in place for store i during Year t and 0 otherwise
We could regress Profits on TaxHike, but it is difficult to argue that TaxHike is not endogenous
TaxHike equals 1 for a specific group of stores at a specific time; this method of administering the treatment may be correlated with unobserved factors affecting Profits
Difference-in-Differences
Control for a cross-sectional group (g = Indiana, Michigan) and for time (t = 2016, 2017)
Assume the following model:
Profitsigt = α + β1Indianag + β2Yeart + β3TaxHikegt + Uigt
The data-generating process can also be written as:
Profitsigt = α + β1Indianag + β2Yeart + β3Indianag × Yeart + Uigt
Difference-in-Differences
β3 is the diff-in-diff for profits in this example
Difference in profits between 2017 and 2016 for Indiana:
(α + β1 + β2 + β3 + Uigt) ‒ (α + β1 + Uigt) = β2 + β3
Difference in profits between 2017 and 2016 for Michigan:
(α + β2 + Uigt) ‒ (α + Uigt) = β2
Take the difference between the change in profits in Indiana and Michigan to get the diff-in-diff:
(β2 + β3) ‒ β2 = β3
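The arithmetic above can be sketched directly from the four group-year means (profit figures hypothetical):

```python
# Hypothetical mean profits per store (in $000s), by state and year;
# Indiana's 3% tax hike takes effect in 2017, Michigan is untreated.
profits = {
    ("Indiana", 2016): 100.0, ("Indiana", 2017): 97.0,
    ("Michigan", 2016): 90.0, ("Michigan", 2017): 92.0,
}

# Temporal change for each group
change_treated = profits[("Indiana", 2017)] - profits[("Indiana", 2016)]
change_untreated = profits[("Michigan", 2017)] - profits[("Michigan", 2016)]

# Diff-in-diff: the estimate of beta3, the effect of the tax hike
diff_in_diff = change_treated - change_untreated
print(diff_in_diff)  # -3.0 - 2.0 = -5.0
```

Comparing Indiana only to itself over time would attribute the whole -3 change to the tax hike; subtracting Michigan's +2 trend isolates the -5 effect of the treatment.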
Difference-in-Differences
Difference-in-Differences for Liquor Profits in Indiana and Michigan
Difference-in-Differences
Difference-in-differences (diff-in-diff) is the difference in the temporal change in the outcome between the treated and untreated groups
Diff-in-diff is highly effective and applies to dichotomous treatments spanning two periods
A fixed-effects model is a data-generating process for panel data that includes controls for cross-sectional groups
The controls for cross-sectional groups are called fixed effects
For a data-generating process to be characterized as a fixed-effects model, it need only have controls for the cross-sectional groups
Can control for time periods by including time trends
Outcomeigt = α + δ2Group2g + … + δGGroupGg + γTimet + βTreatmentgt + Uigt
The Fixed-Effects Model
The Fixed-Effects Model
By controlling for the groups and periods, many possible confounding factors in the data-generating process are eliminated
Can add controls (Xigt’s) beyond the fixed effects and time dummies to help eliminate some of the remaining confounding factors
Two ways of estimating the fixed-effects model include: dummy variable estimation and within estimation
Dummy variable estimation uses regression analysis to estimate all of the parameters in the fixed effects data-generating process
Regress the Outcome on dummy variables for each cross-sectional group (except the base unit), dummy variables for each period (except the base period), and the treatment
The Fixed-Effects Model: Dummy Variable Estimation
Subset of Dummy Variable Estimation Results for Sales Regressed on Tax Rate
The Fixed-Effects Model: Dummy Variable Estimation
Interpreting the table from the previous slide:
Each state coefficient measures the effect on a store’s profits of moving the store from the base state (State 1) to that alternative state, for a given year and tax rate
Each year coefficient measures the effect on a store’s profits of moving the store from the base year (Year 1) to that alternative year, for a given state and tax rate
The coefficient on Tax Rate measures the effect on a store’s profits of changing the Tax Rate, for a given state and year
The Fixed-Effects Model: Within Estimation
Within estimation uses regression analysis of within-group differences in variables to estimate the parameters in the fixed effects data-generating process, except for those corresponding to the fixed effects (and the constant)
Eliminates the need to estimate the coefficient for each fixed effect
The Fixed-Effects Model: Within Estimation
Outcomeigt = α + δ2Group2g + … + δGGroupGg + γTimet + βTreatmentgt + Uigt
We estimate the parameters γ2, …, γT, and β via within estimation:
Determine the cross-sectional groups and calculate the group-level means of the outcome and the treatment for each group g
Create new variables by demeaning: Outcome*igt = Outcomeigt ‒ (group-g mean of Outcome), and Treatment*igt = Treatmentgt ‒ (group-g mean of Treatment)
Regress Outcome* on Treatment* and the period dummy variables
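A sketch comparing the two estimation methods on simulated panel data (hypothetical values; time effects omitted for brevity) shows that the within estimator reproduces the dummy variable estimate of β without estimating a coefficient for each group:

```python
import numpy as np

# Simulated panel: treatment correlated with group fixed effects, so a
# plain regression of y on treat would be biased.
rng = np.random.default_rng(6)
G, T = 50, 4
group = np.repeat(np.arange(G), T)
fixed = rng.normal(size=G)[group]            # group fixed effects
treat = rng.normal(size=G * T) + fixed       # correlated with the FE
y = 1.0 + 2.0 * treat + 3.0 * fixed + rng.normal(0, 0.5, size=G * T)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Dummy variable estimation: one 0/1 dummy per group except the base
D = (group[:, None] == np.arange(1, G)[None, :]).astype(float)
b_dummy = ols(np.column_stack([np.ones(G * T), treat, D]), y)[1]

# Within estimation: demean y and treat by group, then regress
y_star = y - np.bincount(group, y)[group] / T
t_star = treat - np.bincount(group, treat)[group] / T
b_within = ols(t_star[:, None], y_star)[0]
print(b_dummy, b_within)  # identical up to rounding, near 2.0
```

The two estimates coincide because demeaning by group removes exactly the variation the group dummies would absorb; the within route just skips estimating the G ‒ 1 fixed-effect coefficients.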
Comparing Estimation Methods
Dummy variable estimation provides estimates for the fixed effects (the effects of switching groups on the outcome), whereas within estimation does not
For dummy variable estimation R-squared is often misleadingly high, suggesting a very strong fit
For within estimation, R-squared is more indicative that the variation in Treatment is explaining variation in the Outcome
Both estimation models eliminate confounding factors that are fixed across periods for the groups or are fixed across groups over time
Both estimation models could yield inaccurate estimates if there are unobserved factors that vary within a group over time