Normal Distribution · dandruce.co.uk

Mean & standard deviation

μ mean (translation): 0.0

σ std dev (stretch): 1.0

Properties of the normal curve

Symmetric about the mean μ - the left and right halves are mirror images.
The mean, median and mode are all equal, sitting at the centre μ.
The total area under the curve is exactly 1, because it represents total probability.
Bell-shaped: one peak at μ, falling away smoothly on both sides.
The tails get closer to the x-axis but never touch it - every value is theoretically possible.
Because the distribution is continuous, P(X ≤ a) and P(X < a) are identical. The area above a single point is 0, so P(X = a) = 0, and ≤ and < (and ≥ and >) always give the same probability.
Changing μ slides the curve sideways (a translation); changing σ stretches or squashes it. A larger σ gives a wider, flatter curve; a smaller σ gives a taller, narrower one.
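The effect of the two sliders can be checked numerically. The sketch below uses Python's standard `statistics.NormalDist`; the helper name `peak_height` and the parameter values are illustrative assumptions, not part of the original page.

```python
from math import pi, sqrt
from statistics import NormalDist


def peak_height(sigma: float) -> float:
    """Height of the normal curve at x = mu, which is 1 / (sigma * sqrt(2*pi))."""
    return 1 / (sigma * sqrt(2 * pi))


# Illustrative parameter values only; any mu and any sigma > 0 would do.
for mu, sigma in [(0.0, 1.0), (3.0, 1.0), (0.0, 2.0), (0.0, 0.5)]:
    curve = NormalDist(mu, sigma)
    # Symmetry: points equally far either side of mu have the same height.
    left, right = curve.pdf(mu - 1.5), curve.pdf(mu + 1.5)
    print(f"mu={mu}, sigma={sigma}: peak={curve.pdf(mu):.4f}, "
          f"pdf(mu-1.5)={left:.4f}, pdf(mu+1.5)={right:.4f}")
```

Changing μ leaves the peak height alone (the curve only slides), while doubling σ halves the peak height, which is what "wider and flatter" means when the total area has to stay at 1.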

The 68-95-99.7 rule

Tap a band to show or hide it on the graph.

Also called the empirical rule. For any normal distribution, about 68% of values fall within one standard deviation of the mean, about 95% within two, and about 99.7% within three.

These percentages never change, whatever the values of μ and σ, because every normal curve has the same shape once standardised. The coloured bands on the graph show each region, and they move and resize as you drag the sliders.
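A quick way to see that the percentages do not depend on μ or σ is to compute them for a few different curves. This is a minimal sketch using `statistics.NormalDist` from the Python standard library; the function name `within_k_sd` and the example parameters are made up for illustration.

```python
from statistics import NormalDist


def within_k_sd(mu: float, sigma: float, k: float) -> float:
    """P(mu - k*sigma < X < mu + k*sigma) for X ~ N(mu, sigma^2)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(mu + k * sigma) - dist.cdf(mu - k * sigma)


# Three very different curves (illustrative values) give the same three areas.
for mu, sigma in [(0, 1), (175, 10), (-3, 0.5)]:
    areas = [within_k_sd(mu, sigma, k) for k in (1, 2, 3)]
    print(mu, sigma, [f"{a:.4f}" for a in areas])
```

Each row shows the same three areas, which round to 68%, 95% and 99.7%.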

Standardising (the transformation)

Standardising is a transformation of the curve. Subtracting μ translates it so the mean sits at 0, and dividing by σ rescales it (a stretch by a factor of 1/σ) so the standard deviation becomes 1. That turns any X ~ N(μ, σ²) into the standard normal Z ~ N(0, 1):

$$Z = \frac{X - \mu}{\sigma}$$

Rearranging reverses the transformation - stretching the z-values back out by σ and shifting them by μ to recover the original x-values:

$$x = z\sigma + \mu$$

So you can move from x to z and back again: Z counts how many standard deviations above or below the mean a value lies.
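The two formulas translate directly into a pair of functions. The names `standardise` and `unstandardise` are hypothetical, chosen here for illustration, and the numbers are example values.

```python
def standardise(x: float, mu: float, sigma: float) -> float:
    """Convert an x-value to a z-score: z = (x - mu) / sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


def unstandardise(z: float, mu: float, sigma: float) -> float:
    """Convert a z-score back to an x-value: x = z * sigma + mu."""
    return z * sigma + mu


# Example values: a value of 185 on a curve with mean 175 and sd 10.
z = standardise(185, mu=175, sigma=10)
x = unstandardise(z, mu=175, sigma=10)
print(z, x)
```

Here 185 is one standard deviation above the mean, so z is 1.0, and reversing the transformation recovers 185.0.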

Distribution

μ mean

σ std dev

Probability

Result

Using a calculator

A basic (non-graphical) calculator often only gives the lower-tail probability P(X < x). Build the others from it:

P(X < a)	read directly
P(X > a)	`1 - P(X < a)`
P(a < X < b)	`P(X < b) - P(X < a)`
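The three rows above can be sketched in code, with `statistics.NormalDist.cdf` standing in for the calculator's lower-tail function. The function names are illustrative assumptions.

```python
from statistics import NormalDist


def lower_tail(a: float, mu: float, sigma: float) -> float:
    """P(X < a): the one probability a basic calculator gives directly."""
    return NormalDist(mu, sigma).cdf(a)


def upper_tail(a: float, mu: float, sigma: float) -> float:
    """P(X > a) = 1 - P(X < a)."""
    return 1 - lower_tail(a, mu, sigma)


def between(a: float, b: float, mu: float, sigma: float) -> float:
    """P(a < X < b) = P(X < b) - P(X < a)."""
    return lower_tail(b, mu, sigma) - lower_tail(a, mu, sigma)


# Example on the standard normal (mu = 0, sigma = 1).
print(f"{lower_tail(1, 0, 1):.4f} {upper_tail(1, 0, 1):.4f} {between(-1, 1, 0, 1):.4f}")
```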

Casio ClassWiz

Press MENU and choose Distribution.
Select Normal CD (cumulative).
Enter Lower, Upper, then σ and μ.
For P(X < a): Lower = -1×10^99, Upper = a. For P(X > a): Lower = a, Upper = 1×10^99. For between: Lower = a, Upper = b.
Press = to read the probability.
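To see why ±1×10^99 works as a stand-in for infinity, the sketch below mimics the Lower, Upper, σ, μ entry order in Python. It is not the calculator's own routine: `normal_cd` and `BIG` are hypothetical names, and the calculation is done with `statistics.NormalDist`.

```python
from statistics import NormalDist

BIG = 1e99  # stands in for "infinity", as in the Lower/Upper trick


def normal_cd(lower: float, upper: float, sigma: float, mu: float) -> float:
    """Area under N(mu, sigma^2) between lower and upper (same argument order as the steps)."""
    dist = NormalDist(mu, sigma)
    return dist.cdf(upper) - dist.cdf(lower)


# Example values: mean 175, standard deviation 10.
print(f"P(X < 185)       = {normal_cd(-BIG, 185, 10, 175):.4f}")
print(f"P(X > 185)       = {normal_cd(185, BIG, 10, 175):.4f}")
print(f"P(165 < X < 185) = {normal_cd(165, 185, 10, 175):.4f}")
```

The area beyond ±1×10^99 is far too small to affect any displayed digit, so the result matches the true one-sided probability.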

Casio CG100

Open the Statistics (or Run-Matrix) app from the main menu.
Press F5 (DIST) → F1 (NORM) → F2 (Ncd).
Set Data: Variable, then enter Lower, Upper, σ, μ.
Use the same Lower/Upper trick as above for <, > or between.
Press EXE (or Draw) to get the probability and a shaded graph.

Standardise a value

μ mean

σ std dev

x value

Z-score

How standardising works

Any normal distribution can be converted to the standard normal Z ~ N(0, 1) using the Z-score formula, so a single table covers every μ and σ.

Standardise: x → z

$$Z = \frac{X - \mu}{\sigma}$$

Rearranging reverses it, turning a z-score back into an x-value:

Reverse: z → x

$$x = z\sigma + \mu$$

Once standardised, P(X < x) = Φ(z), where Φ (capital "phi") is the area to the left under the standard normal curve - the value your tables or calculator return.
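For readers who want to see where Φ comes from, it can be written in terms of the error function as Φ(z) = ½(1 + erf(z/√2)). The sketch below uses `math.erf` from the Python standard library; the names `phi` and `prob_less_than` are illustrative assumptions.

```python
from math import erf, sqrt


def phi(z: float) -> float:
    """Area to the left of z under the standard normal curve."""
    return 0.5 * (1 + erf(z / sqrt(2)))


def prob_less_than(x: float, mu: float, sigma: float) -> float:
    """P(X < x) for X ~ N(mu, sigma^2): standardise first, then apply phi."""
    z = (x - mu) / sigma
    return phi(z)


print(f"{phi(0):.4f} {phi(1.96):.4f} {prob_less_than(185, 175, 10):.4f}")
```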

Why do we standardise?

Statistical tables list probabilities only for the standard normal $Z \sim N(0,\ 1)$. Standardising lets us read any normal probability from that one table, whatever the original mean and standard deviation.

For example, if heights are $X \sim N(175,\ 100)$, the variance is 100, so $\sigma = \sqrt{100} = 10$ and

$$P(X < 185) = P\!\left(Z < \frac{185 - 175}{10}\right) = P(Z < 1.0)$$

which we look up directly.
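Carrying the lookup through, with the standard table value rounded to four decimal places:

$$P(Z < 1.0) = \Phi(1.0) \approx 0.8413$$

so roughly 84% of heights in this example fall below 185.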

One curve, three scales: read the same point as a real value x, as a z-score, and in terms of μ and σ.

Take many samples from a population and watch what happens to the sample means. Whatever the population shape, their distribution becomes approximately normal once the sample size is large enough.

Population & samples

Shape

n sample size

The Central Limit Theorem

Regardless of the population's shape (provided it has a finite variance), the distribution of sample means becomes approximately normal as the sample size grows. This is the Central Limit Theorem.

$$\bar{X} \overset{\text{approx.}}{\sim} N\!\left(\mu,\ \frac{\sigma^2}{n}\right)\ \text{ for large } n$$

The mean of the sampling distribution equals μ. Its spread - the standard error - is σ/√n, which shrinks as n grows, so larger samples give more precise estimates.
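A small simulation makes this concrete. The sketch below assumes a right-skewed exponential population with mean 1 and standard deviation 1, a sample size of 30 and 5000 repeated samples. All of these choices, and the function name `sample_means`, are illustrative and not taken from the page.

```python
import random
from math import sqrt
from statistics import fmean, stdev


def sample_means(n: int, trials: int, rng: random.Random) -> list[float]:
    """Draw `trials` samples of size n from an exponential population (mean 1, sd 1)
    and return the mean of each sample."""
    return [fmean([rng.expovariate(1.0) for _ in range(n)]) for _ in range(trials)]


rng = random.Random(42)  # fixed seed so the run is repeatable
n = 30
means = sample_means(n, trials=5000, rng=rng)

print(f"mean of sample means: {fmean(means):.3f}  (population mean is 1)")
print(f"sd of sample means:   {stdev(means):.3f}  (sigma/sqrt(n) = {1 / sqrt(n):.3f})")
```

Even though the population is strongly skewed, the sample means centre on μ and their spread is close to σ/√n.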

Standard error vs standard deviation

Standard deviation (σ) - how spread out individual values are in the population.

Standard error (σ/√n) - how spread out the sample means are; it measures how precisely a sample mean estimates μ.

Standard error

$$\mathrm{SE}(\bar{X}) = \frac{\sigma}{\sqrt{n}}$$

Quadrupling the sample size halves the standard error, since σ/√(4n) = (σ/√n)/2 - that is why bigger samples are more reliable.
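The halving can be checked with a one-line function. The name `standard_error` and the value σ = 10 are illustrative assumptions.

```python
from math import sqrt


def standard_error(sigma: float, n: int) -> float:
    """Standard error of the sample mean: sigma / sqrt(n)."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sigma / sqrt(n)


# Each step quadruples n, so each standard error is half the previous one.
for n in (25, 100, 400):
    print(n, standard_error(10, n))
```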

True population parameters

In real life we rarely know these exactly - which is why we sample and use statistical inference.

The population - this is what we sample from

Distribution of sample means - the red curve is the predicted normal