# Chapter 5: Distribution Shape

### Topics covered in this snack-sized chapter:

In statistics, the shape of a distribution describes how the values of a probability distribution are spread around its center.
Measures of shape:
- Symmetric or Skewed (asymmetric).

A correlation is a single number that describes the degree of relationship between two variables.
An important measure of the shape of a distribution is skewness.
Skewness is asymmetry in a statistical distribution, in which the curve appears distorted or skewed either to the left or to the right.
##### Positively skewed:

A distribution is said to be positively skewed if the scores tend to cluster towards the lower end of the scale with increasingly fewer scores at the upper end of the scale.
##### Negatively skewed:

With a negatively skewed distribution, most of the scores tend to occur towards the upper end of the scale while increasingly fewer scores occur towards the lower end.
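The Pearson coefficient of skewness is based on the arithmetic mean, mode, median and standard deviation.
Pearson's mode or first skewness coefficient:

$$S_k = \frac{\bar{x} - \text{mode}}{s}$$

Pearson's median or second skewness coefficient:

$$S_k = \frac{3(\bar{x} - \text{median})}{s}$$

Here $\bar{x}$ is the arithmetic mean and $s$ is the standard deviation.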
If $S_k = 0$, then the frequency distribution is symmetrical, as a normal distribution is.
If $S_k > 0$, then the frequency distribution is positively skewed.
If $S_k < 0$, then the frequency distribution is negatively skewed.
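
The sign rules above can be checked on a small data set. The following is a minimal sketch, assuming the standard definitions of Pearson's coefficients, (mean − mode) / s and 3 · (mean − median) / s, and assuming the sample standard deviation for s. The function names and the `scores` list are illustrative, not part of the chapter.

```python
import statistics


def pearson_mode_skewness(data):
    """First coefficient: (mean - mode) / standard deviation."""
    mean = statistics.mean(data)
    mode = statistics.mode(data)   # most frequent value
    sd = statistics.stdev(data)    # sample standard deviation (an assumption)
    return (mean - mode) / sd


def pearson_median_skewness(data):
    """Second coefficient: 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(data)
    median = statistics.median(data)
    sd = statistics.stdev(data)
    return 3 * (mean - median) / sd


# Illustrative scores: most values are low, with one large value on the right.
scores = [1, 2, 2, 2, 3, 3, 4, 5, 9]

for name, coefficient in [
    ("mode-based", pearson_mode_skewness(scores)),
    ("median-based", pearson_median_skewness(scores)),
]:
    if coefficient > 0:
        shape = "positively skewed"
    elif coefficient < 0:
        shape = "negatively skewed"
    else:
        shape = "symmetrical"
    print(f"{name} S_k = {coefficient:.3f}: {shape}")
```

Both coefficients come out positive for these scores because the mean is pulled above the mode and the median by the single large value.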
In a negatively skewed distribution, the mass of the distribution is concentrated on the right of the figure.
It has relatively few low values.

In a symmetric distribution, the mass of the distribution is concentrated at the middle of the figure.

In a positively skewed distribution, the mass of the distribution is concentrated on the left of the figure.
It has relatively few high values.

A box plot is a graphical representation of a distribution by a rectangle (the box) whose ends mark the first and third quartiles, with a line inside the box at the median and whiskers extending to the minimum and maximum values.
It displays the data using the five-number summary: minimum, first quartile, median, third quartile and maximum.
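
The five values that a box plot draws can be computed directly. This is a minimal sketch using Python's `statistics.quantiles`; the helper name `five_number_summary` and the sample data are illustrative. Quartile conventions differ between textbooks and tools, so the `"inclusive"` method used here is an assumption and other methods may give slightly different quartiles.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    # quantiles(n=4) returns the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    return min(data), q1, median, q3, max(data)


values = [1, 2, 3, 4, 5, 6, 7, 8, 9]
minimum, q1, median, q3, maximum = five_number_summary(values)
print("minimum:", minimum)
print("first quartile:", q1)
print("median:", median)
print("third quartile:", q3)
print("maximum:", maximum)
```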

For a negatively skewed distribution, the box plot will look as if the box were shifted to the right.
- The left tail will be longer.

- The median will be closer to the right line of the box in the box plot.

For a symmetric distribution (for example, a normal distribution), the box plot will look symmetric, provided there are few exceptionally large or small values.

For a positively skewed distribution, the box plot will look as if the box were shifted to the left.
- The right tail will be longer.

- The median will be closer to the left line of the box in the box plot.
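
The position of the median inside the box can be turned into a rough check. The sketch below compares the distance from the first quartile to the median with the distance from the median to the third quartile. The function name `box_skew_direction` is hypothetical, the quartile method is an assumption, and this is only a heuristic reading of the box, not a formal skewness test.

```python
import statistics


def box_skew_direction(data):
    """Guess the skew direction from where the median sits inside the box."""
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    lower_half = median - q1   # width of the box to the left of the median
    upper_half = q3 - median   # width of the box to the right of the median
    if upper_half > lower_half:
        return "positive"      # median is closer to the left line of the box
    if upper_half < lower_half:
        return "negative"      # median is closer to the right line of the box
    return "symmetric"


right_tailed = [1, 2, 2, 3, 3, 4, 6, 9, 15]
left_tailed = [16 - value for value in right_tailed]  # mirror image of the data
balanced = [1, 2, 3, 4, 5, 6, 7, 8, 9]

print(box_skew_direction(right_tailed))
print(box_skew_direction(left_tailed))
print(box_skew_direction(balanced))
```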

Correlation coefficient = r
- Measures the strength of the linear relationship between two quantitative variables.

- Ranges between –1 and 1.

- The closer to –1, the stronger the negative linear relationship becomes.

- The closer to 1, the stronger the positive linear relationship becomes.

- The closer to 0, the weaker any linear relationship becomes.
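
The range of r can be seen on small examples. The chapter does not give a formula for r, so the sketch below assumes the usual Pearson product-moment definition: the sum of products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the data are illustrative.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]    # increases exactly in step with x
falling = [10, 8, 6, 4, 2]   # decreases exactly in step with x
noisy = [2, 5, 4, 9, 8]      # generally increases with x, but not exactly

print("rising: ", pearson_r(x, rising))
print("falling:", pearson_r(x, falling))
print("noisy:  ", pearson_r(x, noisy))
```

The exactly linear pairs sit at the two ends of the range, while the noisy pair falls strictly between 0 and 1.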

A negative correlation coefficient indicates that as one variable increases, the other tends to decrease, and vice versa.

A positive correlation coefficient means that as the value of one variable increases, the value of the other variable tends to increase; as one decreases, the other tends to decrease.

A zero correlation coefficient means that there is no linear relationship between the variables; a nonlinear relationship may still exist.
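
A coefficient of zero only rules out a linear relationship. As a minimal sketch, again assuming the usual Pearson formula (the function name `pearson_r` and the data are illustrative), the pairs below follow y = x² exactly, yet r is zero because the rising and falling halves cancel.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sum_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sum_xx = sum((x - mean_x) ** 2 for x in xs)
    sum_yy = sum((y - mean_y) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


x = [-2, -1, 0, 1, 2]
y = [value ** 2 for value in x]   # y depends on x exactly, but not linearly

r = pearson_r(x, y)
print("r =", r)
```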