# Chapter 11: T-Statistics, ANOVA and Chi-Square

### Topics covered in this snack-sized chapter:

The T-test looks at the difference in the means of a continuous variable between two groups.
The T distribution is a family of similar probability distributions.
- A specific T distribution depends on a parameter known as the degrees of freedom.

The T statistic allows researchers to use sample data to test hypotheses about an unknown population mean.
- The advantage of the T statistic is that it does not require any knowledge of the population standard deviation.

- The T statistic can be used to test hypotheses about a completely unknown population; both μ and σ are unknown, and the only available information about the population comes from the sample.

All that is required for a hypothesis test with T is a sample and a reasonable hypothesis about the population mean.
There are two general situations where this type of hypothesis test is used:

###### ANOVA (Analysis of Variance)

ANOVA is used to test for an association between a continuous outcome variable and a categorical determining variable.
ANOVA is a statistics option under the means function that tests the difference between the mean outcome scores for two or more categories of the determining variable.

Chi-Square is a statistical test commonly used to compare observed data with data we would expect to obtain according to a specific hypothesis.
Chi-Square is used to look at the statistical significance of an association between a categorical outcome and a categorical determining variable.
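As a rough illustration of testing an association between two categorical variables, the sketch below computes a Chi-Square statistic from a table of observed counts. It assumes the usual test-of-independence convention, in which the expected count for each cell is (row total × column total) / N and the degrees of freedom are (rows − 1)(columns − 1). The function name `chi_square_independence` and the counts are hypothetical and chosen only for illustration.

```python
def chi_square_independence(table):
    """Return (chi2, df) for a table of observed counts (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected

    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, df


# Hypothetical 2 x 2 table: rows = groups, columns = outcome categories
counts = [[30, 10],
          [20, 40]]
chi2, df = chi_square_independence(counts)
print(f"chi-square = {chi2:.3f}, df = {df}")
```

The statistic is then compared with a critical value from a Chi-Square table for the given degrees of freedom; a larger value indicates a larger gap between the observed counts and those expected under no association.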
Whenever a sample is obtained from a population, you expect to find some discrepancy or "sampling error" between the sample mean and the population mean.
The goal for a hypothesis test is to evaluate the significance of the observed discrepancy between a sample mean and the population mean.
The hypothesis test attempts to decide between the following two alternatives:
- Is it reasonable that the discrepancy between M and μ is simply due to sampling error and not the result of a treatment effect?

- Is the discrepancy between M and μ more than would be expected by sampling error alone? That is, is the sample mean significantly different from the population mean?

- How much difference between M and μ is reasonable to expect?

The T-Statistic requires that you use the sample data to compute an estimated standard error of M:

$$s_M = \frac{s}{\sqrt{n}}$$

Where,

s = Sample Standard Deviation,
n = Number of scores in the sample
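The estimated standard error of M is the sample standard deviation divided by the square root of the sample size, s / √n. A minimal sketch of that calculation follows, using only the Python standard library; the function name `estimated_standard_error` and the sample scores are hypothetical.

```python
import math
import statistics


def estimated_standard_error(scores):
    """Return the estimated standard error of the mean, s / sqrt(n)."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed to compute s")
    s = statistics.stdev(scores)  # sample standard deviation (divides by n - 1)
    return s / math.sqrt(n)


# Hypothetical sample: M = 4, SS = 12, so s = 2 and the result is 2 / sqrt(4)
sample = [3, 3, 3, 7]
print(estimated_standard_error(sample))
```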
The one-sample *t* test is used to determine whether the population mean equals a specified value.

The T statistic forms a ratio.
The top of the ratio contains the obtained difference between the sample mean and the hypothesized population mean.
The bottom of the ratio is the standard error which measures how much difference is expected by chance.
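Written out, the ratio described above is t = (M − μ) / s_M, with df = n − 1 degrees of freedom. The sketch below computes it with the standard library only; `one_sample_t`, `mu0` (the hypothesized population mean), and the data are hypothetical names and values used for illustration.

```python
import math
import statistics


def one_sample_t(scores, mu0):
    """Return (t, df) for the null hypothesis that the population mean is mu0."""
    n = len(scores)
    if n < 2:
        raise ValueError("at least two scores are needed")
    m = statistics.mean(scores)                    # sample mean M
    s_m = statistics.stdev(scores) / math.sqrt(n)  # estimated standard error
    if s_m == 0:
        raise ValueError("all scores are identical, so t is undefined")
    return (m - mu0) / s_m, n - 1


# Hypothetical sample with M = 4 and estimated standard error 1
sample = [3, 3, 3, 7]
t_stat, df = one_sample_t(sample, mu0=2)
print(f"t = {t_stat:.3f}, df = {df}")
```

To reach a decision, the obtained t is compared with a critical value from a T distribution table for the same degrees of freedom. If SciPy is available, `scipy.stats.ttest_1samp` reports the same statistic together with a p-value.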
The two-sample *t* test is used to determine whether the means of two populations are equal.
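A two-sample test compares the means of two groups rather than one mean against a fixed value. The sketch below assumes two independent samples with equal population variances (the pooled-variance form), which the text does not specify; the function name `two_sample_t` and the data are hypothetical.

```python
import math
import statistics


def two_sample_t(a, b):
    """Return (t, df) for independent samples a and b using pooled variance."""
    n1, n2 = len(a), len(b)
    if n1 < 2 or n2 < 2:
        raise ValueError("each sample needs at least two scores")
    m1, m2 = statistics.mean(a), statistics.mean(b)
    ss1 = sum((x - m1) ** 2 for x in a)  # sum of squared deviations, sample 1
    ss2 = sum((x - m2) ** 2 for x in b)  # sum of squared deviations, sample 2
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    if pooled_var == 0:
        raise ValueError("no variability within the samples, so t is undefined")
    se = math.sqrt(pooled_var / n1 + pooled_var / n2)  # standard error of M1 - M2
    return (m1 - m2) / se, df


# Hypothetical samples: M1 = 4, M2 = 2, SS = 12 in each
group_a = [3, 3, 3, 7]
group_b = [1, 1, 1, 5]
t_stat, df = two_sample_t(group_a, group_b)
print(f"t = {t_stat:.3f}, df = {df}")
```

As with the one-sample case, the ratio places the obtained mean difference on top and the difference expected by chance (the standard error) on the bottom.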

ANOVA tests for a significant effect of 1 or more factors:
- Each factor may have 2 or more levels.

- Can also test for interactions between factors.

- For just 1 factor with 2 levels, ANOVA is equivalent to the two-sample T-test with pooled variance (F = t²).

ANOVA really looks for differences in means between groups (factors & levels).
Total variability = Variability due to factors + error.
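The partition of total variability can be checked numerically. The sketch below handles the single-factor (one-way) case only, splitting the total sum of squares into a between-groups part (variability due to the factor) and a within-groups part (error), then forming the F ratio of the two mean squares. The function name `one_way_anova`, the dictionary keys, and the data are hypothetical.

```python
import statistics


def one_way_anova(groups):
    """Return sums of squares, degrees of freedom, and F for a one-way design."""
    all_scores = [x for g in groups for x in g]
    grand_mean = statistics.mean(all_scores)
    group_means = [statistics.mean(g) for g in groups]

    ss_total = sum((x - grand_mean) ** 2 for x in all_scores)
    # Variability due to the factor: group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Error: scores around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)

    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    if df_within <= 0 or ss_within == 0:
        raise ValueError("F is undefined without within-group variability")
    f = (ss_between / df_between) / (ss_within / df_within)
    return {"ss_total": ss_total, "ss_between": ss_between,
            "ss_within": ss_within, "df_between": df_between,
            "df_within": df_within, "f": f}


# Hypothetical data: one factor with two levels
result = one_way_anova([[3, 3, 3, 7], [1, 1, 1, 5]])
print(result)
```

In this example `ss_between + ss_within` reproduces `ss_total`. Because the factor has only two levels, the F ratio equals the square of the pooled-variance two-sample t statistic for the same data.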
The Chi-Square statistic is used to measure the deviation of observed frequencies from an expected or theoretical distribution:

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Where,

O = Observed frequency (# of events, etc.).
E = Expected frequency under $H_0$.
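With O and E defined as above, the Chi-Square statistic sums (O − E)² / E over all categories. The sketch below shows that sum for a single categorical variable, assuming the expected frequencies are supplied directly from the null hypothesis; the function name `chi_square_statistic` and the counts are hypothetical.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E across categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for three categories, with equal frequencies expected
observed = [18, 22, 20]
expected = [20, 20, 20]
chi2 = chi_square_statistic(observed, expected)
df = len(observed) - 1  # categories minus one
print(f"chi-square = {chi2:.3f}, df = {df}")
```

If SciPy is available, `scipy.stats.chisquare` computes the same statistic and also returns a p-value.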