Economics 421/521 - Econometrics - Winter 2011 - Lecture 8 (HD)

Review of Assumptions for Best Linear Unbiased Estimators

The speaker reviews the assumptions needed to guarantee that estimators are best linear unbiased estimators, known as BLUE. Two of these assumptions are C6 and C7: C6 states that the errors are uncorrelated with each other, and C7 states that the errors are uncorrelated with the X's.

Violations of Assumptions

  • Two different assumptions get violated: correlation among errors and correlation of right-hand side variables with error terms.
  • Correlated errors violate assumption C6, leading to a loss of efficiency but unbiased and consistent estimates.
  • Correlation between right-hand side variables and error terms violates assumption C7, usually resulting in bias/inconsistency problems.

Examples

  • Autocorrelation occurs when errors are correlated over time (violating C6); this is an efficiency problem, while correlation between RHS variables and error terms (violating C7) causes bias and inconsistency.
  • Three examples were used to illustrate these two problems.
  • Model 1: YT = beta1 + beta2*X2T + UT, where UT = Rho*UT-1 + VT. Here the errors are correlated over time, violating assumption C6.
  • Model 2: YT = beta1 + beta2*YT-1 + ET, where ET is serially uncorrelated. This model does not violate assumption C6, but the lagged dependent variable may violate assumption C7.
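The first model can be simulated to see the point concretely: with an exogenous X and AR(1) errors, OLS stays close to the true coefficients even though the errors are autocorrelated. A minimal sketch in Python (the parameter values, seed, and data are illustrative choices, not from the lecture):

```python
# Simulate model 1: y_t = b1 + b2*x_t + u_t with AR(1) errors
# u_t = rho*u_{t-1} + v_t. All parameter values are made up for illustration.
import numpy as np

rng = np.random.default_rng(0)
n, b1, b2, rho = 5000, 1.0, 2.0, 0.8

x = rng.normal(size=n)            # exogenous regressor, uncorrelated with u
v = rng.normal(size=n)            # white-noise innovation v_t
u = np.zeros(n)
for t in range(1, n):             # build the AR(1) error series
    u[t] = rho * u[t - 1] + v[t]
y = b1 + b2 * x + u

# OLS via the normal equations: the coefficient estimates stay close to the
# truth even with autocorrelated errors (unbiased and consistent, just
# inefficient, since only C6 is violated).
X = np.column_stack([np.ones(n), x])
beta_hat = np.linalg.lstsq(X, y, rcond=None)[0]
print(beta_hat)                   # roughly [1.0, 2.0]
```

This matches the lecture's claim about model 1: the coefficient estimates are fine; it is the standard errors that are unreliable.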

Notation

  • The notation used in the transcript includes X2T for a variable at time T, beta for the true parameter values, XT for observed values of the regressors, UT for the error at time T, Rho for the autocorrelation coefficient, and VT for the white-noise error term.

Notation and Unbiasedness

The speaker discusses the notation used in the model and how to ensure unbiasedness.

Notation

  • The speaker introduces a notation for the model being discussed.
  • Substituting repeatedly, YT can be written in terms of beta2 and a sum of past error terms.

Unbiasedness

  • To ensure unbiasedness, the expected value of the error term needs to be zero; if UT is correlated with any X, it won't be.
  • For this expectation to be zero, UT needs to be independent of not just the current X but every X.
  • Since YT serves as the regressor XT+1 in the next period, and YT is correlated with UT, one of the X's is correlated with one of the U's, so UT is not independent of all of the X's.

Correlation between Y and U

The speaker discusses how correlation between Y and U can cause problems in the model.

Correlation Problem

  • If YT is correlated with UT, it introduces randomness that may be correlated with error terms causing problems in the model.
  • When the X's are fixed by an experimenter there are no problems, because a non-random variable cannot be correlated with the error term.

Correlation between Variables and Time

In this section, the speaker explains which correlations cause trouble in the model. The correlation at issue is between the lagged dependent variable and UT, not between X2T and UT. "Contemporaneous" means correlation with the variable at the same time T.

Understanding Correlation

  • YT, which becomes the regressor XT+1 in the next period, is correlated with UT; the current regressor YT-1 is not.
  • When X2T and UT are correlated, there are huge problems.
  • If UT is correlated with X's at other dates but not with the current regressor (YT-1), it causes bias, but the estimator remains consistent, so the problem is not serious.
  • There's no big problem as long as the contemporaneous regressor is not involved.

Contemporaneous Correlation

  • Two cases:
  • UT correlated with the contemporaneous regressor X2T: all kinds of problems.
  • UT correlated with X's at other dates but not the current one: bias, but the estimator remains consistent, so the problem is not serious.
  • A correlation between the error term and yesterday's or tomorrow's regressor, rather than today's, is not a big problem.

Bias Introduction

  • Model 1 was unbiased and consistent but inefficient.
  • In this case, because the correlation is with lagged values rather than with the current one, it introduces bias; however, the bias is usually ignored because it is thought to be small.

Bias and Consistency

The speaker discusses bias in finite samples, noting that it is generally ignored if n is fairly big, because consistency is what matters.

Bias in Large Sample Sizes

  • In any finite sample there may be bias.
  • If n is fairly big, the bias is generally pretty small.
  • Consistency matters more than unbiasedness.

Ignoring the Problem

  • If the model is consistent, we ignore the problem of bias.
  • We make sure n is big enough and then ignore the problem.

Correlated Models

The speaker introduces a third model that is exactly like the second but now has contemporaneous correlation. They explain how YT-1 is constructed and why UT is correlated with it.

Constructing Contemporaneously Correlated Variables

  • The third model has contemporaneous correlation between the regressor and the error.
  • YT-1 is built partly from UT-1.
  • Because the error UT = Rho*UT-1 + VT also contains UT-1, putting the AR(1) error into this model creates contemporaneous correlation between YT-1 and UT.

Problems with Contemporaneously Correlated Variables

  • There is contemporaneous correlation between a right-hand side variable and the error.
  • This makes the estimates biased, inconsistent, and inefficient.

Fixing Inefficient Models

The speaker explains that while ignoring problems works for consistent models with small biases, inefficient models need fixing.

Fixing Inefficient Models

  • Ignoring problems works for consistent models with small biases.
  • Inefficient models need fixing to make them more efficient.

Correlated Variables Outside the Model

The speaker explains that contemporaneous correlation is not the only problem: the lagged dependent variable is also correlated with errors at other dates.

Correlation Outside of the Model

  • YT-1 is also correlated with errors at dates other than T.
  • This creates both contemporaneous and non-contemporaneous correlation problems.

Inefficiency in OLS

The speaker discusses inefficiency in OLS and how it can lead to misleading results.

Inefficiency in OLS

  • OLS reports small standard errors because it does not account for serial correlation in the errors.
  • The errors are highly correlated over time, which makes the OLS estimates inefficient and the reported standard errors misleading.
  • T statistics can be misleading because the inefficiency expresses itself as understated standard errors.
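The understated standard errors can be demonstrated with a small Monte Carlo sketch. Under strong positive autocorrelation, the textbook OLS standard error of the intercept is far below the true sampling spread of the estimate. All parameter values below are illustrative choices:

```python
# Monte Carlo sketch: with serially correlated errors, the standard error
# that OLS reports understates the true sampling variability.
import numpy as np

rng = np.random.default_rng(1)
n, reps, rho = 200, 500, 0.9
b1_hats, reported_se = [], []

for _ in range(reps):
    v = rng.normal(size=n)
    u = np.zeros(n)
    for t in range(1, n):          # AR(1) errors with rho = 0.9
        u[t] = rho * u[t - 1] + v[t]
    y = 1.0 + u                    # intercept-only model for simplicity
    b1 = y.mean()                  # the OLS intercept estimate
    s2 = ((y - b1) ** 2).sum() / (n - 1)
    b1_hats.append(b1)
    reported_se.append(np.sqrt(s2 / n))  # textbook formula, ignores rho

true_spread = np.std(b1_hats)      # actual spread across replications
avg_reported = np.mean(reported_se)
print(avg_reported, true_spread)   # reported se is far below the truth
```

The reported standard errors are several times too small here, which is exactly why the T statistics look deceptively significant.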

Serial Correlation in Errors

The speaker explains how serial correlation in errors can lead to misleading results and why it's important to account for it.

Serial Correlation in Errors

  • Errors are highly correlated over time, making the T statistics misleading.
  • Accounting for serial correlation is important to avoid being misled by low standard errors.

Correlated Errors

In this section, the speaker discusses the example with correlated errors. Not only are the X's correlated with the errors, but the errors themselves are also correlated with each other.

Correlated Errors

  • The problem involves both C6 and C7.
  • The X's are correlated with the errors.
  • The errors themselves are correlated with each other.

Model Bias

In this section, the speaker talks about model bias and how it can be identified. They explain that they care more about concepts than numbers and that identifying bias is more important than identifying specific variables.

Identifying Model Bias

  • The speaker cares more about concepts than numbers.
  • Identifying bias is more important than identifying specific variables.
  • Labels like C6 and C7 are arbitrary; there is no natural numbering.

Durbin Watson Test

In this section, the speaker introduces a test for detecting problems caused by correlation among errors called the Durbin Watson test. They explain that when near zero, it indicates high correlation among errors.

Durbin Watson Test

  • A test for detecting first-order correlation among the errors.
  • A statistic near zero indicates high positive correlation among the errors; a value near two indicates no autocorrelation.
  • It tests only for the first-order term.
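The statistic itself is simple to compute from residuals: DW = sum of squared changes in the residuals divided by the sum of squared residuals, which is roughly 2(1 - rho-hat). A sketch with simulated residuals (values illustrative):

```python
# The Durbin-Watson statistic computed directly from a residual series:
# DW = sum((e_t - e_{t-1})^2) / sum(e_t^2)  ~  2 * (1 - rho_hat).
# Near 0 => strong positive autocorrelation; near 2 => none.
import numpy as np

def durbin_watson(e):
    e = np.asarray(e, dtype=float)
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(2)
white = rng.normal(size=1000)      # uncorrelated residuals
ar1 = np.zeros(1000)
for t in range(1, 1000):           # heavily autocorrelated residuals
    ar1[t] = 0.95 * ar1[t - 1] + rng.normal()

print(durbin_watson(white))        # near 2: no autocorrelation
print(durbin_watson(ar1))          # near 0: strong positive autocorrelation
```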

Breusch-Godfrey LM Test

In this section, the speaker introduces another test, the Breusch-Godfrey LM test. It detects higher-order serial correlation and is used when missing terms or other issues make the Durbin-Watson test unsuitable.

Breusch-Godfrey LM Test

  • Used to detect higher-order serial correlation.
  • Used when there are missing terms or other issues with the Durbin-Watson test.
  • The null hypothesis is Rho1 = Rho2 = ... = RhoP = 0.

Regression with Lags

In this section, the speaker explains how to perform regression with lags and how to calculate the number of observations for this regression.

Steps for Regression with Lags

  • Run the original regression and save the estimated errors.
  • Regress UT hat on all the X's plus a constant, and on the lags of the estimated errors up to order P: UT-1 hat through UT-P hat.
  • In other words, regress the residuals on the X's and on the residuals from the previous P periods.
  • Start the sample late enough (at T = P + 1) so that all the lagged residuals exist.

Calculating Number of Observations

  • The number of observations is T - P, because lags of the errors before U1 do not exist.
  • So the auxiliary regression uses T - P observations.

Computing R-Squared

In this section, the speaker explains how to compute the (n - P) R-squared statistic.

Steps for Computing R-Squared

  • Compute (n - P) times the R squared of the auxiliary regression, just as with the other LM tests.
  • This is an LM statistic distributed chi-square with P degrees of freedom, where P is the number of restrictions, as always.
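The whole procedure can be sketched "by hand" in Python. Everything here is an illustrative simulation (made-up data and parameters), following the steps above: original OLS regression, auxiliary regression of the residuals on the X's and P lagged residuals, then the (n - P) R-squared statistic:

```python
# Breusch-Godfrey LM test done by hand on simulated data with AR(1) errors,
# so the null of no serial correlation should be rejected.
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
n, p = 500, 2                      # sample size and number of lags tested

x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):              # AR(1) errors with rho = 0.6
    u[t] = 0.6 * u[t - 1] + rng.normal()
y = 1.0 + 2.0 * x + u

# Step 1: residuals from the original OLS regression.
X = np.column_stack([np.ones(n), x])
resid = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

# Step 2: auxiliary regression of resid_t on X_t and resid_{t-1..t-p},
# dropping the first p observations so the lags exist.
lags = np.column_stack([resid[p - j - 1 : n - j - 1] for j in range(p)])
Z = np.column_stack([X[p:], lags])
e = resid[p:]
fit = Z @ np.linalg.lstsq(Z, e, rcond=None)[0]
r2 = 1 - np.sum((e - fit) ** 2) / np.sum((e - e.mean()) ** 2)

# Step 3: LM statistic = (n - p) * R^2, chi-square with p df under H0.
lm = (n - p) * r2
p_value = stats.chi2.sf(lm, df=p)
print(lm, p_value)                 # large statistic, tiny p-value here
```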

The Cochrane-Orcutt Procedure

In this section, the speaker explains what the Cochrane-Orcutt procedure is and why it is no longer necessary.

Explanation of the Cochrane-Orcutt Procedure

  • The Cochrane-Orcutt procedure was built for a time when we did not have very powerful computers.
  • Although there is learning and intuition to be gained from working through it, you will never use it.

Why It Is No Longer Necessary

  • The student version of EViews that we use now does maximum likelihood estimation for AR(1) and AR(p) models.
  • That gives a much more efficient estimator, so although there is learning and intuition to be gained from working through Cochrane-Orcutt, you will never use it.

Correcting for Autocorrelation

In this section, the speaker discusses how to correct for autocorrelation in a model.

Durbin Watson Test

  • The Durbin Watson statistic tests for first-order autocorrelation.
  • It is biased when there is a lagged dependent variable on the right-hand side of the equation.
  • Use Durbin's H instead when there is a lagged dependent variable.
  • Durbin's H can be calculated from the program output and is distributed normally.
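Durbin's h can be computed directly from the program output using the standard formula h = (1 - DW/2) * sqrt(n / (1 - n * s2)), where s2 is the estimated variance of the coefficient on the lagged dependent variable. A sketch with illustrative input values:

```python
# Durbin's h from the usual Durbin-Watson output. Under the null of no
# first-order autocorrelation, h is approximately standard normal.
import math

def durbins_h(dw, n, var_lagged_coef):
    rho_hat = 1 - dw / 2               # implied first-order correlation
    inside = n / (1 - n * var_lagged_coef)
    if inside <= 0:                    # h is undefined when n * var >= 1
        raise ValueError("Durbin's h is undefined for these inputs")
    return rho_hat * math.sqrt(inside)

# Illustrative numbers: DW = 1.5, n = 100, variance of the lagged-Y
# coefficient = 0.004 (these are made up, not from the lecture).
h = durbins_h(1.5, 100, 0.004)
print(h)   # compare to standard normal critical values, e.g. 1.96
```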

Correcting for Bias

  • When there is contemporaneous correlation between the lagged dependent variable and the error, the model gives biased estimates.
  • The residuals UT hat then come from a biased, inconsistent estimator, which biases the Durbin-Watson statistic itself.
  • Use Durbin's H instead of the Durbin-Watson statistic to get a valid test.

Breusch-Godfrey Test

  • The Breusch-Godfrey test checks for higher-order serial correlation.
  • Use it when the other tests are inapplicable or inconclusive.

Summary

In this section, the speaker summarizes the three tests discussed earlier and their applications.

Three Tests

  • There are three tests to check for autocorrelation in a model:
  • Durbin Watson test for first-order autocorrelation
  • Durbin's H when there is a lagged dependent variable on the right-hand side of the equation
  • Breusch-Godfrey test for higher-order serial correlation

The speaker emphasizes that these tests are used to detect autocorrelation so that the resulting problems can then be corrected.

Estimating Nonlinear Models with Nonlinear Least Squares

In this section, the speaker explains how to estimate nonlinear models using nonlinear least squares.

Lagging the Model and Multiplying by Rho

  • To estimate a nonlinear model, we first need to lag the model one period and multiply through by Rho.
  • The lagged model multiplied through by Rho is rho*y_(t-1) = rho*beta_1 + rho*beta_2*x_2,(t-1) + rho*u_(t-1).

Transforming the Model

  • We then subtract the lagged, Rho-multiplied model from the original one.
  • This gives y_t - rho*y_(t-1) = beta_1*(1 - rho) + beta_2*(x_2,t - rho*x_2,(t-1)) + e_t, where e_t = u_t - rho*u_(t-1); the transformed model no longer has serially correlated errors.

Nonlinearity of the Model

  • The presence of Rho in multiple places makes this a nonlinear model.
  • Ordinary Least Squares (OLS) cannot be used because it assumes linearity in parameters, which is not satisfied here.

Estimating with Nonlinear Least Squares

  • We can use nonlinear least squares to estimate this model efficiently.
  • EViews estimates this model by nonlinear least squares when AR(1) is added to the end of the regression statement; adding AR(1) through AR(p) terms handles an AR(p) error process.
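The same nonlinear least squares idea can be sketched in Python rather than EViews: estimate (beta1, beta2, rho) jointly by minimizing the quasi-differenced residuals. The data and parameter values below are simulated for illustration:

```python
# Nonlinear least squares for the AR(1) error model: minimize
# e_t = (y_t - rho*y_{t-1}) - beta1*(1 - rho) - beta2*(x_t - rho*x_{t-1})
# over (beta1, beta2, rho) jointly.
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(4)
n, b1, b2, rho = 2000, 1.0, 2.0, 0.7

x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):              # AR(1) errors with rho = 0.7
    u[t] = rho * u[t - 1] + rng.normal()
y = b1 + b2 * x + u

def residuals(theta):
    c1, c2, r = theta              # candidate (beta1, beta2, rho)
    return (y[1:] - r * y[:-1]) - c1 * (1 - r) - c2 * (x[1:] - r * x[:-1])

fit = least_squares(residuals, x0=[0.0, 0.0, 0.0])
print(fit.x)                       # roughly [1.0, 2.0, 0.7]
```

Because rho multiplies the other parameters inside the residual, the problem is nonlinear in the parameters, which is exactly why OLS cannot be used directly.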

Example: Estimating an AR(2) Model with Nonlinear Least Squares

In this section, the speaker provides an example of estimating an AR(2) model using nonlinear least squares.

Transforming the Model

  • We transform the model by quasi-differencing twice: subtract Rho1 times the once-lagged model and Rho2 times the twice-lagged model.
  • This results in e_t = u_t - rho_1*u_(t-1) - rho_2*u_(t-2), which is serially uncorrelated.

Nonlinearity of the Model

  • The presence of Rho in multiple places makes this a nonlinear model.
  • Ordinary Least Squares (OLS) cannot be used because it assumes linearity in parameters, which is not satisfied here.

Estimating with Nonlinear Least Squares

  • We can use nonlinear least squares to estimate this model efficiently.
  • EViews estimates this model by nonlinear least squares when AR(1) and AR(2) terms are added to the end of the regression statement; the same works for any AR(p) process.

Quasi Differencing

The speaker discusses quasi-differencing: transforming the data to isolate an error that is no longer serially correlated, by analogy with the transformation used for heteroscedasticity. They also discuss how this transformation induces nonlinear parameters.

Quasi Difference Model

  • Differencing YT minus Rho*YT-1 is called a quasi-difference (a full difference would subtract all of YT-1).
  • We estimate the model in this quasi-differenced form.
  • The transformation gets rid of the serial correlation.

Transforming Data

  • For the heteroscedasticity problem, the fix was to transform the data so as to isolate an error that was no longer heteroscedastic.
  • There, we divided by the thing causing the problem, the error's standard deviation, to get rid of the heteroscedasticity.
  • Here, analogously, we take the original model and transform it into something else where the error is no longer a problem.

Inducing Nonlinear Parameters

  • By transforming our data using quasi differencing, we induce nonlinear parameters.
  • We fix one problem but introduce another: we get rid of the serial correlation but induce nonlinear parameters.

Estimating Models with Linear Parameters

The speaker explains how they would estimate models with linear parameters if they knew what Rho was. They also explain how they can calculate Y star and X star in a spreadsheet.

Linear Parameters

  • If we know what Rho is, then estimating models with linear parameters becomes easy.
  • We form the variable YT minus Rho*YT-1 and call it YT star.
  • Then we form X2T minus Rho*X2T-1 and call it X2 star.

Calculating Y Star and X Star

  • Calculate Y star by subtracting off Rho times the previous observation.
  • Change the constant from a column of 1's to a column of (1 minus Rho).
  • Then estimate Y star = beta1 star + beta2*X2 star + ET.
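If Rho were known, the transformation above really is just spreadsheet arithmetic followed by OLS. A sketch with simulated data (parameter values and the assumption of a known Rho are illustrative):

```python
# Quasi-differencing with a known rho: form y*_t = y_t - rho*y_{t-1},
# x*_t = x_t - rho*x_{t-1}, and a "constant" column of (1 - rho),
# then run ordinary OLS on the transformed variables.
import numpy as np

rng = np.random.default_rng(5)
n, b1, b2, rho = 2000, 1.0, 2.0, 0.7

x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):              # AR(1) errors with the known rho
    u[t] = rho * u[t - 1] + rng.normal()
y = b1 + b2 * x + u

y_star = y[1:] - rho * y[:-1]      # quasi-differenced dependent variable
x_star = x[1:] - rho * x[:-1]      # quasi-differenced regressor
const = np.full(n - 1, 1 - rho)    # the constant column becomes (1 - rho)

X_star = np.column_stack([const, x_star])
beta_star = np.linalg.lstsq(X_star, y_star, rcond=None)[0]
print(beta_star)                   # roughly [1.0, 2.0]
```

Using the (1 - Rho) column means the estimated intercept is beta1 itself; with an ordinary column of 1's, the intercept would be beta1*(1 - Rho) and would need to be divided by (1 - Rho) afterward.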

Nonlinear Least Squares

The speaker explains how introducing Rho turns estimation into a nonlinear problem, which requires nonlinear least squares. They also discuss how to fix the intercept when looking at the transformed model.

Introducing Rho

  • Once we introduce Rho, estimation turns into a nonlinear problem.
  • Since we don't know what Rho is, we have to estimate it, which turns it into a nonlinear least squares problem.

Fixing Intercept

  • The transformed model has a different intercept but the same slope.
  • You have to adjust the intercept to get back to the original model.
  • Divide the estimated intercept by (1 minus Rho), the natural adjustment, so that it lines up with the true model's intercept.

Serial Correlation

The speaker discusses serial correlation and how to estimate it.

Estimating Rho

  • To estimate Rho, the speaker forms Y star = B1 star + B2*X2 star + ET.
  • This can be estimated easily if we know Rho.
  • However, since we don't know Rho, we need to make this a nonlinear least squares problem.

Test Coming Up

  • There is a test coming up soon.
  • Students can access old tests by clicking on "midterm" in the sidebar.
  • The review will cover everything up through Chapter 12, with only a small section on ARCH models.
  • Students are encouraged to figure out what they don't know over the weekend and ask questions during the next class.