Given a set of $n$ points $(x_1, y_1), \ldots, (x_n, y_n)$ and a positive integer $m$, which polynomial of degree $m$ is “closest” to the given points?

Definition: The method of least squares finds the coefficients $a_0, a_1, \ldots, a_m$ of $p(x) = a_0 + a_1 x + \cdots + a_m x^m$ minimizing $L = \sum_{i=1}^n (y_i - p(x_i))^2$

Theorem: The line $y = a + bx$ minimizing $L$ has slope $b = \dfrac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{n\sum x_i^2 - (\sum x_i)^2}$ and y-intercept $a = \dfrac{\sum y_i - b\sum x_i}{n} = \bar{y} - b\bar{x}$

This is solved by setting the partial derivatives $\frac{\partial L}{\partial a}$ and $\frac{\partial L}{\partial b}$ to zero and solving the resulting system
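For concreteness, a minimal Python sketch of the closed-form solution; the data values are invented for illustration:

```python
import numpy as np

# Invented sample data, roughly on the line y = 2x
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])
n = len(x)

# Closed-form least squares slope and intercept
b = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
a = (np.sum(y) - b * np.sum(x)) / n

print(f"least squares line: y = {a:.3f} + {b:.3f}x")
```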

Definition: $\hat{y}_i = a + bx_i$ is known as the predicted value of $y_i$
Definition: $y_i - \hat{y}_i$ is the $i$th residual

Residual plots graph the residuals $y_i - \hat{y}_i$ over $x_i$

If a residual plot shows no patterns or trends, the relationship is taken to be linear
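A sketch of computing and plotting residuals; the data is invented, and matplotlib plus numpy's polynomial fit are just convenient choices here:

```python
import numpy as np
import matplotlib.pyplot as plt

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Fit the line; this polyfit returns coefficients as (intercept, slope)
a, b = np.polynomial.polynomial.polyfit(x, y, 1)

y_hat = a + b * x        # predicted values
residuals = y - y_hat    # i-th residual is y_i - y_hat_i

# Residual plot: residuals over x; no visible pattern suggests a linear fit is adequate
plt.scatter(x, residuals)
plt.axhline(0.0, linestyle="--")
plt.xlabel("x")
plt.ylabel("residual")
plt.show()
```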

Our theorem can also work on more complicated relationships, as long as they can be linearized

  • Exponential regression: $y = ae^{bx}$, linearized by $\ln y = \ln a + bx$
  • Logarithmic regression: $y = a + b\ln x$, already linear in $\ln x$
  • Logistic regression: $y = \dfrac{L}{1 + e^{a + bx}}$, linearized by $\ln\left(\dfrac{L - y}{y}\right) = a + bx$

There are many other curvilinear models
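As one example of linearization, a sketch fitting the exponential model $y = ae^{bx}$ by regressing $\ln y$ on $x$; the data is made up for illustration:

```python
import numpy as np

# Invented data roughly following y = 2 * exp(0.5x)
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = np.array([2.1, 3.2, 5.6, 8.9, 14.7])

# Linearize: ln y = ln a + b x, then fit a straight line to (x, ln y)
ln_a, b = np.polynomial.polynomial.polyfit(x, np.log(y), 1)
a = np.exp(ln_a)

print(f"y ~= {a:.3f} * exp({b:.3f} x)")
```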

The Linear Model

It’s more realistic to think of the $y$ values as samples drawn from a random variable $Y$ whose distribution depends on $x$

Definition: Let $f_{Y|x}(y)$ denote the pdf of the random variable $Y$ for a given value $x$. $E[Y \mid x]$ is the regression curve of $Y$ on $x$

Definition: The simple linear model makes four assumptions:

  1. $f_{Y|x}(y)$ is a normal pdf for all $x$
  2. $\sigma_{Y|x}$ is the same for all $x$
  3. Each conditional distribution is independent of the others
  4. $E[Y \mid x]$ is a linear function of $x$: $E[Y \mid x] = \beta_0 + \beta_1 x$

Theorem: Let $(x_1, Y_1), \ldots, (x_n, Y_n)$ be a set of points satisfying the simple linear model, $E[Y \mid x] = \beta_0 + \beta_1 x$. The MLE estimators for $\beta_0$, $\beta_1$, and $\sigma^2$ are given by $\hat{\beta}_1 = \dfrac{n\sum x_i Y_i - (\sum x_i)(\sum Y_i)}{n\sum x_i^2 - (\sum x_i)^2}$, $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1\bar{x}$, and $\hat{\sigma}^2 = \dfrac{1}{n}\sum_{i=1}^n (Y_i - \hat{Y}_i)^2$, where $\hat{Y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$.
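A sketch of these MLE computations on invented data, mainly to make the divisor-of-$n$ convention in $\hat{\sigma}^2$ explicit:

```python
import numpy as np

# Invented data
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.3, 3.8, 6.1, 8.2, 9.9])
n = len(x)

# MLEs of the slope and intercept (identical to the least squares estimates)
beta1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
beta0 = (np.sum(y) - beta1 * np.sum(x)) / n

# MLE of sigma^2: mean squared residual (divisor n, not n - 2, so it is biased)
y_hat = beta0 + beta1 * x
sigma2_mle = np.sum((y - y_hat)**2) / n
```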

Theorem: Let $\hat{\beta}_0$ and $\hat{\beta}_1$ be the MLE estimators for $\beta_0$ and $\beta_1$, respectively.

  • $\hat{\beta}_0$ and $\hat{\beta}_1$ are normally distributed
  • $\hat{\beta}_0$ and $\hat{\beta}_1$ are unbiased, $\operatorname{Var}(\hat{\beta}_1) = \dfrac{\sigma^2}{\sum (x_i - \bar{x})^2}$ and $\operatorname{Var}(\hat{\beta}_0) = \dfrac{\sigma^2 \sum x_i^2}{n\sum (x_i - \bar{x})^2}$

Theorem:

  • $\hat{\beta}_1$, $\bar{Y}$, and $\hat{\sigma}^2$ are mutually independent
  • $\dfrac{n\hat{\sigma}^2}{\sigma^2}$ has a chi square distribution with $n - 2$ degrees of freedom

The unbiased estimator for $\sigma^2$ is $s^2 = \dfrac{n\hat{\sigma}^2}{n - 2} = \dfrac{1}{n-2}\sum_{i=1}^n (Y_i - \hat{Y}_i)^2$

Theorem: $T = \dfrac{\hat{\beta}_1 - \beta_1}{s / \sqrt{\sum (x_i - \bar{x})^2}}$ has a Student $t$ distribution with $n - 2$ degrees of freedom. Let $t = \dfrac{\hat{\beta}_1 - \beta_1'}{s / \sqrt{\sum (x_i - \bar{x})^2}}$; to test $H_0\colon \beta_1 = \beta_1'$ at the $\alpha$ level of significance,

  1. $H_1\colon \beta_1 > \beta_1'$: reject $H_0$ if $t \ge t_{\alpha,\,n-2}$
  2. $H_1\colon \beta_1 < \beta_1'$: reject $H_0$ if $t \le -t_{\alpha,\,n-2}$
  3. $H_1\colon \beta_1 \ne \beta_1'$: reject $H_0$ if $|t| \ge t_{\alpha/2,\,n-2}$

Let $w = t_{\alpha/2,\,n-2} \cdot \dfrac{s}{\sqrt{\sum (x_i - \bar{x})^2}}$;
$(\hat{\beta}_1 - w,\ \hat{\beta}_1 + w)$ is a $100(1 - \alpha)\%$ confidence interval for $\beta_1$

One useful test is checking for $\beta_1 = 0$, which says whether $Y$ changes with $x$ or not
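Putting these pieces together, a sketch of the two-sided test of $H_0\colon \beta_1 = 0$ and the accompanying confidence interval, using scipy for the $t$ quantile (data invented for illustration):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.3, 3.8, 6.1, 8.2, 9.9])
n = len(x)

beta1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
beta0 = (np.sum(y) - beta1 * np.sum(x)) / n
s2 = np.sum((y - (beta0 + beta1 * x))**2) / (n - 2)   # unbiased estimate of sigma^2
se = np.sqrt(s2 / np.sum((x - x.mean())**2))          # standard error of beta1-hat

# Two-sided test of H0: beta1 = 0 at significance level alpha
alpha = 0.05
t_stat = beta1 / se
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
reject = abs(t_stat) >= t_crit

# 100(1 - alpha)% confidence interval for beta1
ci = (beta1 - t_crit * se, beta1 + t_crit * se)
```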

$\beta_0$ is generally less interesting, but also has similar well-defined tests. Let $w = t_{\alpha/2,\,n-2} \cdot s\sqrt{\dfrac{\sum x_i^2}{n\sum (x_i - \bar{x})^2}}$;
$(\hat{\beta}_0 - w,\ \hat{\beta}_0 + w)$ is a $100(1 - \alpha)\%$ confidence interval for $\beta_0$

Likewise, $\left(\dfrac{n\hat{\sigma}^2}{\chi^2_{\alpha/2,\,n-2}},\ \dfrac{n\hat{\sigma}^2}{\chi^2_{1-\alpha/2,\,n-2}}\right)$ is a $100(1 - \alpha)\%$ confidence interval for $\sigma^2$, where $\chi^2_{p,\,n-2}$ cuts off area $p$ in the upper tail
This is almost exactly the same as the regular test for variance, but with one less degree of freedom, since each estimated parameter essentially consumes a degree of freedom.
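A sketch of both intervals, again on invented data; note that everything uses $n - 2$ degrees of freedom:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.3, 3.8, 6.1, 8.2, 9.9])
n = len(x)

beta1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
beta0 = (np.sum(y) - beta1 * np.sum(x)) / n
resid = y - (beta0 + beta1 * x)
sxx = np.sum((x - x.mean())**2)
s2 = np.sum(resid**2) / (n - 2)
alpha = 0.05

# Confidence interval for beta0
t_crit = stats.t.ppf(1 - alpha / 2, df=n - 2)
w = t_crit * np.sqrt(s2 * np.sum(x**2) / (n * sxx))
ci_beta0 = (beta0 - w, beta0 + w)

# Confidence interval for sigma^2; sum(resid**2) equals n * sigma_hat^2
ss = np.sum(resid**2)
ci_sigma2 = (ss / stats.chi2.ppf(1 - alpha / 2, df=n - 2),
             ss / stats.chi2.ppf(alpha / 2, df=n - 2))
```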

Theorem: A $100(1 - \alpha)\%$ confidence interval for $E[Y \mid x]$ is given by $(\hat{y} - w,\ \hat{y} + w)$, where $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ and $w = t_{\alpha/2,\,n-2} \cdot s\sqrt{\dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$.

This theorem is nifty, but we might want an interval for the actual value $Y$, not the expected value $E[Y \mid x]$. We can derive this with the random variable $Y - \hat{Y}$.

Theorem: A $100(1 - \alpha)\%$ prediction interval for $Y$ at the fixed value $x$ is given by $(\hat{y} - w,\ \hat{y} + w)$, where $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ and $w = t_{\alpha/2,\,n-2} \cdot s\sqrt{1 + \dfrac{1}{n} + \dfrac{(x - \bar{x})^2}{\sum (x_i - \bar{x})^2}}$.
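A sketch computing both intervals at a hypothetical value $x_0$; the only difference between them is the extra $1 +$ inside the square root, which makes the prediction interval wider:

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.3, 3.8, 6.1, 8.2, 9.9])
n = len(x)

beta1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
beta0 = (np.sum(y) - beta1 * np.sum(x)) / n
s = np.sqrt(np.sum((y - (beta0 + beta1 * x))**2) / (n - 2))
sxx = np.sum((x - x.mean())**2)

x0 = 2.5                       # hypothetical fixed value of x
y0 = beta0 + beta1 * x0        # point estimate of E[Y | x0]
t_crit = stats.t.ppf(0.975, df=n - 2)

# 95% confidence interval for E[Y | x0]
w_conf = t_crit * s * np.sqrt(1 / n + (x0 - x.mean())**2 / sxx)
ci = (y0 - w_conf, y0 + w_conf)

# 95% prediction interval for Y at x0
w_pred = t_crit * s * np.sqrt(1 + 1 / n + (x0 - x.mean())**2 / sxx)
pi = (y0 - w_pred, y0 + w_pred)
```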

We can also devise a test to check the relationship between two independent regressions, where $H_0\colon \beta_1 = \beta_1^*$ asserts that the two slopes are equal

Theorem: $T = \dfrac{\hat{\beta}_1 - \hat{\beta}_1^*}{s\sqrt{\dfrac{1}{\sum (x_i - \bar{x})^2} + \dfrac{1}{\sum (x_i^* - \bar{x}^*)^2}}}$ has a Student $t$ distribution with $n + m - 4$ degrees of freedom, where $s^2 = \dfrac{(n-2)s_1^2 + (m-2)s_2^2}{n + m - 4}$ pools the variance estimates of the two regressions. Let $t$ be the observed value and define decision rules in the usual way (e.g. reject $H_0$ if $|t| \ge t_{\alpha/2,\,n+m-4}$)
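A sketch of this two-sample slope test with the pooled variance estimate; both data sets are invented for illustration:

```python
import numpy as np
from scipy import stats

def fit(x, y):
    """Return slope, residual sum of squares, and sum of squared deviations of x."""
    n = len(x)
    b1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
    b0 = (np.sum(y) - b1 * np.sum(x)) / n
    ss_resid = np.sum((y - (b0 + b1 * x))**2)
    sxx = np.sum((x - x.mean())**2)
    return b1, ss_resid, sxx

# Two hypothetical independent samples
x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y1 = np.array([2.3, 3.8, 6.1, 8.2, 9.9])
x2 = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y2 = np.array([1.9, 4.2, 5.8, 8.4, 9.6, 12.1])
n, m = len(x1), len(x2)

b1, ss1, sxx1 = fit(x1, y1)
b1_star, ss2, sxx2 = fit(x2, y2)

# Pooled estimate of sigma^2 across both regressions
s2 = (ss1 + ss2) / (n + m - 4)

# Test H0: beta1 = beta1* with n + m - 4 degrees of freedom
t_stat = (b1 - b1_star) / np.sqrt(s2 * (1 / sxx1 + 1 / sxx2))
p_value = 2 * stats.t.sf(abs(t_stat), df=n + m - 4)
```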

Covariance and Correlation

The discussion so far has been concerned with regression data, where the $x$ values are fixed. A more complicated situation is when measurements are of the form $(X, Y)$, where both $X$ and $Y$ are random but correlated variables.

Covariance is an important tool for measuring this kind of relation, but it reflects the units of the individual variables

Definition: The correlation coefficient of $X$ and $Y$ is $\rho(X, Y) = \dfrac{\operatorname{Cov}(X, Y)}{\sigma_X \sigma_Y}$, where $\sigma_X = \sqrt{\operatorname{Var}(X)}$ and $\sigma_Y = \sqrt{\operatorname{Var}(Y)}$

Theorem: For any $X$ and $Y$,

  • $-1 \le \rho(X, Y) \le 1$
  • $|\rho(X, Y)| = 1$ if and only if $Y = aX + b$ for some constants $a$ and $b$

Can we estimate $\rho(X, Y)$? One useful identity is $\rho(X, Y) = \dfrac{E[XY] - E[X]E[Y]}{\sigma_X \sigma_Y}$

We can use this to define the sample correlation coefficient, $r = \dfrac{n\sum x_i y_i - (\sum x_i)(\sum y_i)}{\sqrt{n\sum x_i^2 - (\sum x_i)^2}\sqrt{n\sum y_i^2 - (\sum y_i)^2}}$

This is also known as the Pearson product-moment correlation coefficient (in honor of Karl Pearson)
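A sketch verifying the closed-form expression for $r$ against numpy's built-in np.corrcoef, on invented data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.3, 3.8, 6.1, 8.2, 9.9])
n = len(x)

# Sample correlation coefficient from the closed-form expression
r = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (
    np.sqrt(n * np.sum(x**2) - np.sum(x)**2) *
    np.sqrt(n * np.sum(y**2) - np.sum(y)**2))

# Should agree with numpy's built-in Pearson correlation
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
```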

We can use the square of the sample correlation coefficient and derive $r^2 = 1 - \dfrac{\sum (y_i - \hat{y}_i)^2}{\sum (y_i - \bar{y})^2}$

  1. $\sum (y_i - \bar{y})^2$ represents the total variability in the dependent variable
  2. $\sum (y_i - \hat{y}_i)^2$ represents the variation in the $y_i$'s not accounted for by the linear regression with $x$

Therefore, $r^2$ is the proportion of the total variation in the $y_i$'s that can be attributed to the linear relationship with $x$, sometimes called the coefficient of determination
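A sketch of this decomposition on invented data:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.3, 3.8, 6.1, 8.2, 9.9])
n = len(x)

b1 = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x**2) - np.sum(x)**2)
b0 = (np.sum(y) - b1 * np.sum(x)) / n
y_hat = b0 + b1 * x

ss_total = np.sum((y - y.mean())**2)   # total variability in y
ss_resid = np.sum((y - y_hat)**2)      # variation unexplained by the regression
r_squared = 1 - ss_resid / ss_total    # coefficient of determination
```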

The Bivariate Normal Distribution

After looking at relationships between two random variables, we start to wonder: can we generalize the normal distribution to higher dimensions?

Jumping to the definition…

Definition: $f_{X,Y}(x, y) = \dfrac{1}{2\pi\sigma_X\sigma_Y\sqrt{1 - \rho^2}} \exp\left(-\dfrac{1}{2(1 - \rho^2)}\left[\left(\dfrac{x - \mu_X}{\sigma_X}\right)^2 - 2\rho\left(\dfrac{x - \mu_X}{\sigma_X}\right)\left(\dfrac{y - \mu_Y}{\sigma_Y}\right) + \left(\dfrac{y - \mu_Y}{\sigma_Y}\right)^2\right]\right)$ is a bivariate normal distribution with parameters $\mu_X$, $\mu_Y$, $\sigma_X$, $\sigma_Y$, and $\rho$
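The density can be evaluated numerically; a sketch using scipy's multivariate_normal, building the covariance matrix from the five parameters (the parameter values are arbitrary):

```python
import numpy as np
from scipy.stats import multivariate_normal

# Hypothetical parameters
mu_x, mu_y = 0.0, 1.0
sigma_x, sigma_y = 1.0, 2.0
rho = 0.6

# Covariance matrix: variances on the diagonal, rho * sigma_x * sigma_y off-diagonal
cov = np.array([[sigma_x**2, rho * sigma_x * sigma_y],
                [rho * sigma_x * sigma_y, sigma_y**2]])
dist = multivariate_normal(mean=[mu_x, mu_y], cov=cov)

# Evaluate the joint density at a point
print(dist.pdf([0.5, 1.5]))
```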

Theorem: Suppose $X$ and $Y$ have a bivariate normal distribution,

  1. $X$ is normal with mean $\mu_X$ and variance $\sigma_X^2$, and $Y$ is normal with mean $\mu_Y$ and variance $\sigma_Y^2$
  2. $\rho$ is the correlation coefficient between $X$ and $Y$

Theorem: The maximum likelihood estimators for $\mu_X$, $\mu_Y$, $\sigma_X^2$, $\sigma_Y^2$, and $\rho$ are respectively $\bar{x}$, $\bar{y}$, $\frac{1}{n}\sum (x_i - \bar{x})^2$, $\frac{1}{n}\sum (y_i - \bar{y})^2$, and the sample correlation coefficient $r$

How do we test whether the two variables are independent? Since bivariate normal variables are independent exactly when $\rho = 0$, we test $H_0\colon \rho = 0$

Theorem: Under the null hypothesis $H_0\colon \rho = 0$, $T = \dfrac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$ has a Student $t$ distribution with $n - 2$ degrees of freedom
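A sketch of this test on invented data; scipy's stats.pearsonr performs the same $t$ test, so it serves as a check:

```python
import numpy as np
from scipy import stats

x = np.array([1.2, 2.4, 3.1, 4.8, 5.5, 6.9, 7.2])
y = np.array([2.0, 2.9, 3.8, 5.1, 5.4, 7.2, 6.8])
n = len(x)

r = np.corrcoef(x, y)[0, 1]

# Test H0: rho = 0 using T = r * sqrt(n - 2) / sqrt(1 - r^2)
t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
p_value = 2 * stats.t.sf(abs(t_stat), df=n - 2)

# Cross-check against scipy's built-in version of the same test
r_check, p_check = stats.pearsonr(x, y)
```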