[Spring 2010] Regression etc.

Tue 27 December 2011 by Adrian Brasoveanu

Corpus-based evaluation of centering theory: Poesio et al. (2004), "Centering: A Parametric Theory and Its Instantiations" (discussion led by Bern)

Intro to anaphora resolution algorithms (Hobbs 1978, Lappin & Leass 1994, Centering Theory): anaphora resolution.ppt

Tutorial on regression (see the references in the R scripts for sources):

  1. Warm-up (skewness, histograms, bootstrapping), DATA = MODEL + ERROR and what this means for regression modeling vs. the modeling / analyses done in generative linguistics, the mean as the simplest kind of least squares model, models with two means and associated t-tests as linear regressions with a single dummy variable: CLG-regression-1.r

  2. Overall picture (the generality of regression modeling), intro to ANOVA, model comparison / selection, linear regression with a single continuous predictor, standardized plots, degrees of freedom, linear regression as a least squares model: CLG-regression-2.r

  3. Scatter plot matrices, F-values (recap), R-squared, correlation, the relation between correlation, covariance and standardized variables / plots, inference for simple linear regression, measuring sampling error for the slope based on (1) the spread around the regression line, (2) the spread of the predictor and (3) the size of the sample (we used (1) and (3) to calculate the standard error of the mean; (2) is new): CLG-regression-3.r

  4. Multiple linear regression, graphical comparison of the multiple regression model and the simple linear regression models, more on ANOVA and model selection, removing the intercept, adding interactions, multicollinearity and variable centering, another example of multiple regression with interaction terms, interpreting interactions: CLG-regression-4.r
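
Two points from the outline above can be checked numerically: a two-mean model with its t-test is a linear regression on a single dummy variable (part 1), and the standard error of the slope depends on the spread around the regression line, the spread of the predictor and the sample size (part 3). The following is a minimal sketch in Python, not taken from the CLG-regression R scripts. The data values and the helper name fit_line are made up for illustration, and the t-test shown is the pooled-variance (equal variances) version.

```python
import math
import statistics

def fit_line(x, y):
    """Least squares fit of y = a + b*x (hypothetical helper).

    Returns intercept a, slope b, standard error of b, and the
    residual standard deviation (spread around the regression line).
    """
    n = len(x)
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    rss = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    resid_sd = math.sqrt(rss / (n - 2))   # n - 2 degrees of freedom
    se_b = resid_sd / math.sqrt(sxx)
    return a, b, se_b, resid_sd

# Made-up data: two groups of four observations each.
group0 = [4.1, 5.0, 4.6, 5.3]
group1 = [6.2, 5.8, 7.1, 6.5]
y = group0 + group1
dummy = [0] * len(group0) + [1] * len(group1)  # 0 = group0, 1 = group1

a, b, se_b, resid_sd = fit_line(dummy, y)

# Part 1: the intercept is the mean of the group coded 0, and the slope
# is the difference between the two group means.
m0, m1 = statistics.fmean(group0), statistics.fmean(group1)

# The pooled-variance two-sample t statistic equals slope / SE(slope).
n0, n1 = len(group0), len(group1)
pooled_var = ((n0 - 1) * statistics.variance(group0)
              + (n1 - 1) * statistics.variance(group1)) / (n0 + n1 - 2)
t_pooled = (m1 - m0) / math.sqrt(pooled_var * (1 / n0 + 1 / n1))
t_slope = b / se_b

# Part 3: the same standard error, written in terms of its ingredients:
# (1) spread around the line, (2) spread of the predictor, (3) sample size.
n = len(y)
se_from_parts = resid_sd / (statistics.stdev(dummy) * math.sqrt(n - 1))

print("intercept:", a, "mean of group0:", m0)
print("slope:", b, "difference of means:", m1 - m0)
print("t from slope:", t_slope, "pooled t:", t_pooled)
print("SE(slope):", se_b, "from parts:", se_from_parts)
```

Each pair of printed values should agree up to floating-point rounding. Written this way, the standard error shrinks when the residual spread is smaller, when the predictor is more spread out, or when the sample is larger.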