The binomial distribution formula (PMF)
For X ~ B(n, p), the probability of exactly k successes is:
$$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$$
where the binomial coefficient counts the number of ways to choose k successes from n trials:
$$\binom{n}{k} = \frac{n!}{k!\,(n - k)!}$$
p is the probability of success on each trial and (1 - p) is the probability of failure. The term $p^k (1 - p)^{n - k}$ is the probability of one particular arrangement of k successes and n - k failures; the coefficient counts how many such arrangements there are.
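As a rough illustration, the PMF can be evaluated directly with Python's standard library. The function name `binomial_pmf` is a label chosen for this sketch, not something from the text above.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p): number of arrangements times the probability of one arrangement."""
    if not 0 <= k <= n:
        return 0.0  # impossible counts have probability zero
    return comb(n, k) * p**k * (1 - p)**(n - k)

# 3 successes in 10 trials with p = 0.5: 120 arrangements, each with probability 1/1024
print(binomial_pmf(3, 10, 0.5))  # 0.1171875
```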
The cumulative distribution P(X ≤ k)
The cumulative probability P(X ≤ k) is the sum of the individual PMF values from 0 up to and including k:
$$P(X \le k) = \sum_{i=0}^{k} \binom{n}{i} p^i (1 - p)^{n - i}$$
The sum runs over every whole value i from 0 up to and including k, so it expands to P(X = 0) + P(X = 1) + ... + P(X = k). For example, P(X ≤ 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3).
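The same sum can be sketched in a few lines of Python. The helper names `binomial_pmf` and `binomial_cdf` are assumptions made for this example.

```python
from math import comb

def binomial_pmf(i, n, p):
    """P(X = i) for X ~ B(n, p)."""
    if not 0 <= i <= n:
        return 0.0
    return comb(n, i) * p**i * (1 - p)**(n - i)

def binomial_cdf(k, n, p):
    """P(X <= k): add the PMF for every whole value i from 0 up to and including k."""
    return sum(binomial_pmf(i, n, p) for i in range(0, k + 1))

# P(X <= 3) = P(X = 0) + P(X = 1) + P(X = 2) + P(X = 3) for X ~ B(10, 0.5)
print(binomial_cdf(3, 10, 0.5))  # (1 + 10 + 45 + 120) / 1024 = 0.171875
```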
When can I use the binomial distribution?
- Fixed number of trials (n) - you decide n in advance.
- Two outcomes - each trial is a success or a failure.
- Independent trials - one trial does not affect any other.
- Constant probability (p) - p is the same on every trial.
Mean, variance and standard deviation
For X ~ B(n, p):
$$E(X) = np, \qquad \mathrm{Var}(X) = np(1 - p)$$
The standard deviation is the square root of the variance, √(np(1-p)). As n grows the bars cluster more tightly around the mean (relative to n).
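A short sketch of these three quantities in Python, using illustrative values n = 20 and p = 0.3 that do not come from the text. The function name `binomial_summary` is hypothetical.

```python
from math import sqrt

def binomial_summary(n, p):
    """Return (mean, variance, standard deviation) for X ~ B(n, p)."""
    mean = n * p
    variance = n * p * (1 - p)
    return mean, variance, sqrt(variance)

mean, variance, sd = binomial_summary(20, 0.3)
print(f"mean={mean:.3f}, variance={variance:.3f}, sd={sd:.3f}")
# mean=6.000, variance=4.200, sd=2.049
```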
Using a calculator
Calculators give Binomial PD for P(X = k) and Binomial CD for the lower tail P(X ≤ k). Build the others from these:
| Probability | Calculator entry |
| --- | --- |
| P(X = k) | Binomial PD at k |
| P(X ≤ k) | Binomial CD at k |
| P(X < k) | CD at (k - 1) |
| P(X ≥ k) | 1 - CD at (k - 1) |
| P(X > k) | 1 - CD at k |
| P(a ≤ X ≤ b) | CD at b - CD at (a - 1) |
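The table's combinations can be checked with a small Python sketch. Here `binom_cd` stands in for the calculator's Binomial CD function, and the other function names are labels invented for this example.

```python
from math import comb

def binom_cd(k, n, p):
    """Lower tail P(X <= k), the quantity Binomial CD returns."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, min(k, n) + 1))

def prob_less_than(k, n, p):      # P(X < k) = CD at (k - 1)
    return binom_cd(k - 1, n, p)

def prob_at_least(k, n, p):       # P(X >= k) = 1 - CD at (k - 1)
    return 1 - binom_cd(k - 1, n, p)

def prob_greater_than(k, n, p):   # P(X > k) = 1 - CD at k
    return 1 - binom_cd(k, n, p)

def prob_between(a, b, n, p):     # P(a <= X <= b) = CD at b - CD at (a - 1)
    return binom_cd(b, n, p) - binom_cd(a - 1, n, p)

print(prob_at_least(3, 10, 0.5))    # 1 - 56/1024 = 0.9453125
print(prob_between(2, 4, 10, 0.5))  # 386/1024 - 11/1024 = 0.3662109375
```

Note that the lower bound moves down by one in the "at least" and "between" cases, because the term for X = k (or X = a) must stay inside the probability being kept.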
Casio ClassWiz
- Press MENU and choose Distribution.
- Select Binomial PD for P(X = k), or Binomial CD for P(X ≤ k).
- Choose Variable (a single x).
- Enter x (= k), N (= n) and p.
- Press = to read the probability, then combine using the table above.
Casio CG100
- Open the Statistics (or Run-Matrix) app from the main menu.
- Press F5 (DIST) → F5 (BINM) → F1 (Bpd) for P(X = k), or F2 (Bcd) for P(X ≤ k).
- Set Data: Variable, then enter x (= k), Numtrial (= n) and p.
- Use the same combinations as above for <, >, ≥ or between.
- Press EXE to get the probability.
The approximation
The binomial B(n, p) is approximated by a normal distribution with the same mean and variance:
$$X \sim B(n, p) \;\approx\; Y \sim N(np,\ np(1 - p))$$
It works because when n is large the discrete bars of the binomial begin to resemble the continuous bell curve. A common rule of thumb is that the conditions np > 5 and n(1 - p) > 5 should both hold. Breaking that down:
- A large n gives more trials, so the bars become smoother and more bell-shaped.
- A p closer to the middle (near 0.5) keeps the distribution symmetric rather than bunched up at one end.
- Both conditions are really one idea: the mean must sit far enough from both ends (0 and n). np is the distance from the left end and n(1 - p) the distance from the right end - if either is small, the curve is squashed against that end and the approximation is poor.
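The rule of thumb above can be written as a quick check. This is only a sketch: the function name `normal_approx_ok` is hypothetical, and the threshold of 5 is the one quoted in the text (other sources use different cut-offs).

```python
def normal_approx_ok(n, p, threshold=5):
    """True when the mean is far enough from both ends: np > threshold and n(1 - p) > threshold."""
    distance_from_left = n * p          # how far the mean sits from 0
    distance_from_right = n * (1 - p)   # how far the mean sits from n
    return distance_from_left > threshold and distance_from_right > threshold

print(normal_approx_ok(100, 0.5))  # True: both distances are 50
print(normal_approx_ok(20, 0.1))   # False: np = 2, squashed against the left end
```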
Continuity correction
The binomial is discrete (whole numbers only); the normal is continuous. Each binomial bar extends 0.5 either side of its value, so P(X = 5) is the area from 4.5 to 5.5 under the normal. Adjust the boundary by ±0.5:
| Binomial probability | Normal approximation |
| --- | --- |
| P(X = k) | P(k - 0.5 < Y < k + 0.5) |
| P(X ≤ k) | P(Y < k + 0.5) |
| P(X < k) | P(Y < k - 0.5) |
| P(X ≥ k) | P(Y > k - 0.5) |
| P(X > k) | P(Y > k + 0.5) |
where Y ~ N(np, np(1 - p)) is the approximating normal.
Note: the continuity correction may not be required by your exam board - check your specification before applying it.
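To see the correction in action, the sketch below compares an exact binomial lower tail with its continuity-corrected normal approximation, using `statistics.NormalDist` from the Python standard library. The values n = 100, p = 0.5, k = 55 and the function names are illustrative assumptions, not taken from the text.

```python
from math import comb, sqrt
from statistics import NormalDist

def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(0, min(k, n) + 1))

def approx_at_most(k, n, p):
    """P(X <= k) approximated by P(Y < k + 0.5), with Y ~ N(np, np(1 - p))."""
    y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return y.cdf(k + 0.5)

def approx_at_least(k, n, p):
    """P(X >= k) approximated by P(Y > k - 0.5)."""
    y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return 1 - y.cdf(k - 0.5)

n, p, k = 100, 0.5, 55
exact = binomial_cdf(k, n, p)
approx = approx_at_most(k, n, p)
print(f"exact={exact:.4f}, approx={approx:.4f}")
```

Note that `NormalDist` takes the standard deviation, not the variance, so the square root is needed.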
Why does the approximation work?
By the Central Limit Theorem, the sum of many independent trials tends towards a normal distribution. The binomial count X is the total of n separate trials added together, so for large n, X is approximately normal.
The approximation improves as n increases. It is best when p is close to 0.5 (a symmetric distribution) and worst when p is close to 0 or 1 (heavily skewed).
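One way to check these claims numerically is to measure the largest gap between the exact cumulative probabilities and the continuity-corrected normal ones. The sketch below does that for a few illustrative (n, p) pairs chosen for this example. `max_cdf_error` is a hypothetical helper name.

```python
from math import comb, sqrt
from statistics import NormalDist

def max_cdf_error(n, p):
    """Largest |P(X <= k) - P(Y < k + 0.5)| over k = 0..n, with Y ~ N(np, np(1 - p))."""
    y = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    running = 0.0  # exact cumulative probability so far
    worst = 0.0
    for k in range(n + 1):
        running += comb(n, k) * p**k * (1 - p)**(n - k)
        worst = max(worst, abs(running - y.cdf(k + 0.5)))
    return worst

# Same n, different p; then same p, different n
for n, p in [(50, 0.5), (50, 0.05), (10, 0.3), (400, 0.3)]:
    print(f"n={n}, p={p}: max error = {max_cdf_error(n, p):.4f}")
```

With these inputs the error should come out smaller for p = 0.5 than for p = 0.05 at the same n, and smaller for n = 400 than for n = 10 at the same p.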