Probability Rules
As the number of variables in a frequency distribution grows, the enumeration of different
events becomes more complicated. Rather than continuing to represent every situation with a
multidimensional sample space, some basic rules will be useful.
Basic Properties
We first make some basic observations about probabilities.
- Every probability is between zero and one. In other words, if $A$ is an event, then
$0 \leq P(A) \leq 1$.
- The sum of the probabilities of all of the outcomes is one. In other words, if all of
the outcomes in the sample space are denoted by $A_i$, then $\sum_i P(A_i) = 1$.
- Impossible events have probability zero. That is, if event $A$ is impossible, then
$P(A) = 0$. An example of such an event is rolling a 7 on a standard six-sided die.
It should be noted that the converse does not hold in all situations. If, however,
$P(A) = 0$ and the statistical experiment has a finite sample space, then event $A$
is impossible.
- Certain events have probability one. That is, if event $A$ is certain to occur, then
$P(A) = 1$. An example of such an event is rolling less than 7 on a standard die.
Once again, the converse does not hold in all situations, only in those involving
finite sample spaces.
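The basic properties above can be checked directly by enumerating a finite sample space. The following sketch uses a single six-sided die; the variable names are chosen for illustration only.

```python
from fractions import Fraction

# Sample space for one roll of a standard six-sided die,
# with each outcome equally likely.
outcomes = [1, 2, 3, 4, 5, 6]
p = {o: Fraction(1, 6) for o in outcomes}

# Every probability is between zero and one.
assert all(0 <= pr <= 1 for pr in p.values())

# The probabilities of all outcomes sum to one.
assert sum(p.values()) == 1

# Rolling a 7 is impossible; rolling less than 7 is certain.
assert sum(pr for o, pr in p.items() if o == 7) == 0
assert sum(pr for o, pr in p.items() if o < 7) == 1
```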
The Four Probability Rules
Whenever an event is the union of two other events, the Addition Rule will apply.
Specifically, if $A$ and $B$ are events, then we have the following rule.
$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$
In set notation, this can be written as $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
Whenever an event is the complement of another event, the Complementary Rule will
apply. Specifically, if $A$ is an event, then we have the following rule.
$P(\text{not } A) = 1 - P(A)$
In set notation, this is written as $P(\overline{A}) = 1 - P(A)$.
Whenever partial knowledge of an event is available, the Conditional Rule will
apply. Specifically, if event $A$ is already known to have occurred and the probability of
event $B$ is desired, then we have the following rule.
$P(B, \text{ given } A) = \dfrac{P(A \text{ and } B)}{P(A)}$
In set notation, this is written as $P(B|A) = \dfrac{P(A \cap B)}{P(A)}$.
Lastly, whenever an event is the intersection of two other events, the
Multiplication Rule will apply. That is, events $A$ and $B$ need to occur
simultaneously. Therefore, if $A$ and $B$ are events, then we have the following rule.
$P(A \text{ and } B) = P(A) \cdot P(B,\text{ given } A)$
In set notation, this is written as $P(A \cap B) = P(A) \cdot P(B|A)$.
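All four rules can be verified by brute-force counting on a small, equally likely sample space. The sketch below uses a single die with two illustrative events ($A$ = "rolled an even number", $B$ = "rolled more than 3"); the function name `P` is an assumption for readability.

```python
from fractions import Fraction

# Equally likely outcomes of one roll of a six-sided die.
omega = range(1, 7)

def P(event):
    """Probability of an event (given as a predicate) on the die's sample space."""
    return Fraction(sum(1 for o in omega if event(o)), 6)

A = lambda o: o % 2 == 0   # rolled an even number
B = lambda o: o > 3        # rolled more than 3
both = lambda o: A(o) and B(o)
either = lambda o: A(o) or B(o)

# Addition Rule: P(A or B) = P(A) + P(B) - P(A and B)
assert P(either) == P(A) + P(B) - P(both)

# Complementary Rule: P(not A) = 1 - P(A)
assert P(lambda o: not A(o)) == 1 - P(A)

# Conditional Rule: P(B, given A) = P(A and B) / P(A)
P_B_given_A = P(both) / P(A)

# Multiplication Rule: P(A and B) = P(A) * P(B, given A)
assert P(both) == P(A) * P_B_given_A
```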
An Empirical Example
As an example, consider the relative frequencies of blood types in a particular population,
as given in the following table.
|        | Type A | Type B | Type AB | Type O | Totals |
|--------|--------|--------|---------|--------|--------|
| Rh+    | 0.34   | 0.09   | 0.04    | 0.38   | 0.85   |
| Rh-    | 0.06   | 0.02   | 0.01    | 0.06   | 0.15   |
| Totals | 0.40   | 0.11   | 0.05    | 0.44   | 1.00   |
Suppose one individual from the population is randomly selected. We have the following
probabilities.
- The probability that the person is either Type A or Rh positive is:
$P(\text{Type A or Rh+}) = P(\text{Type A}) + P(\text{Rh+}) - P(\text{Type A and Rh+}) =
0.40 + 0.85 - 0.34 = 0.91$.
Notice that since the events were connected by the "or"
condition, we used the Addition Rule. We added the totals from the Type A column and the Rh+
row, and subtracted their intersection to avoid a double count of that cell.
- The probability that the person is not Type A is:
$P(\text{not Type A}) = 1 - P(\text{Type A}) = 1 - 0.40 = 0.60$.
Since the key word
in this problem was "not", we used the Complementary Rule. We subtracted the Type A total
from 100% to get the result.
- Given that the individual selected is known to be Type AB, the probability that they
are also Rh+ is:
$P(\text{Rh+, given Type AB}) = \dfrac{P(\text{Rh+ and Type AB})}{P(\text{Type AB})} =
\dfrac{0.04}{0.05} = 0.80$.
In this problem, we had partial knowledge about the
blood type of the subject, so we used the Conditional Rule. Essentially, this is equivalent
to ignoring all but the Type AB column, and using the Basic Rule from entries in only that
column.
- The probability that the person is both Type AB and Rh+ is clearly $0.04$ from the table.
But the key word "and" is the signal that the Multiplication Rule can be used. If we did use
that rule, we would have:
$P(\text{Type AB and Rh+}) = P(\text{Type AB}) \cdot P(\text{Rh+, given Type AB}) =
0.05 \cdot 0.80 = 0.04$.
In this computation, we used the result of the previous
example.
In the last part of the example above, use of the Multiplication Rule probably seemed rather
silly. In fact, its use was a bit circular, since the Multiplication Rule and the Conditional
Rule are so closely related. And in fact, whenever you have a completed contingency table,
there would be no need to do such a computation. The Multiplication Rule is much more
useful in other contexts.
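The empirical computations above can also be reproduced in code by summing cells of the contingency table. This is only a sketch; the dictionary layout and helper name `P` are assumptions, and the probabilities come straight from the table.

```python
# Joint relative frequencies from the blood-type table above.
table = {
    ("A", "+"): 0.34, ("B", "+"): 0.09, ("AB", "+"): 0.04, ("O", "+"): 0.38,
    ("A", "-"): 0.06, ("B", "-"): 0.02, ("AB", "-"): 0.01, ("O", "-"): 0.06,
}

def P(pred):
    """Total probability of all table cells satisfying a predicate."""
    return sum(p for (blood, rh), p in table.items() if pred(blood, rh))

p_A   = P(lambda t, rh: t == "A")       # column total: 0.40
p_pos = P(lambda t, rh: rh == "+")      # row total: 0.85

# Addition Rule: Type A or Rh+
p_A_or_pos = p_A + p_pos - P(lambda t, rh: t == "A" and rh == "+")
assert round(p_A_or_pos, 2) == 0.91

# Complementary Rule: not Type A
assert round(1 - p_A, 2) == 0.60

# Conditional Rule: Rh+, given Type AB
p_pos_given_AB = P(lambda t, rh: t == "AB" and rh == "+") / P(lambda t, rh: t == "AB")
assert round(p_pos_given_AB, 2) == 0.80
```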
Rolling Dice
Let us now examine the probability rules in the context of the classical example of rolling
dice. Suppose we roll two dice.
- The probability that both dice are 5 is:
$P(\text{both are 5}) = P(\text{first is a 5 and second is a 5}) \vphantom{\dfrac12}$
$=P(\text{first is a 5}) \cdot P(\text{second is a 5, given first is a 5}) =
\dfrac16 \cdot \dfrac16 = \dfrac{1}{36}$.
The word "both" implied two events had to
happen at the same time, the first event and the second event. Then,
because of the key word "and", we used the Multiplication Rule. The first factor resulted
from the Basic Rule on a single die. We also observed that the knowledge of the outcome
of the first die has no effect on the likelihood of any outcome of the second die, so the
second factor was also the Basic Rule on a single die.
- The probability that at least one die is a 5 is:
$P(\text{at least one is a 5}) = P(\text{first is a 5 or second is a 5}) \vphantom{\dfrac12}$
$= P(\text{first is a 5}) + P(\text{second is a 5}) - P(\text{first is a 5 and second is a 5}) =
\dfrac16 + \dfrac16 - \dfrac{1}{36} = \dfrac{11}{36}$.
First, we had to recognize that
the event "at least one" could be fulfilled by one or the other of two separate cases.
The key word "or" then caused us to use the Addition Rule. The first two
terms came from the Basic Rule on a single die, while the third term came from the single
outcome in which both dice are 5.
- The probability that neither die is a 5 is:
$P(\text{neither is a 5}) = 1 - P(\text{at least one is a 5}) = 1 - \dfrac{11}{36} =
\dfrac{25}{36}$.
In this case, we recognized that "neither" was the complement of
"at least one", so we used the Complementary Rule. We had already computed the probability
of at least one five in the previous part.
- Given that at least one of the dice is a 5, the probability that the other is a 5 is:
$P(\text{other is a 5 | at least one is a 5}) =
\dfrac{P(\text{both are 5})}{P(\text{at least one is a 5})} =
\dfrac{ \frac{1}{36} }{ \frac{11}{36} } = \dfrac{1}{11}$.
The partial knowledge
required the use of the Conditional Rule. Both parts of the problem were handled in the
previous examples.
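Since two dice have only 36 equally likely outcomes, all four results above can be confirmed by enumeration. The following sketch does exactly that (the helper name `P` is chosen for illustration).

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

def P(event):
    """Probability of an event (a predicate on a pair of rolls)."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

both_five    = P(lambda r: r[0] == 5 and r[1] == 5)
at_least_one = P(lambda r: r[0] == 5 or r[1] == 5)
neither      = P(lambda r: r[0] != 5 and r[1] != 5)

assert both_five == Fraction(1, 36)
assert at_least_one == Fraction(11, 36)
assert neither == Fraction(25, 36)

# Conditional Rule: the other die is a 5, given at least one is a 5.
assert both_five / at_least_one == Fraction(1, 11)
```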
Selections With and Without Replacement
Suppose we have a bag of ten jellybeans. Four of the jellybeans are red, three are green,
two are yellow, and one is orange. Two jellybeans will be randomly selected. But before we
can do the computations, we must know whether the first jellybean selected will be returned
to the bag before the second jellybean is selected. If it is returned to the bag, the bag
will be restored to its original condition, and the probabilities for the second jellybean
will be identical to the first. If the first jellybean is not returned to the bag, then
the probabilities for the second jellybean will be different than the first.
- If the jellybeans are selected with replacement, then the probability that they
are both green is:
$P(\text{both green}) = P(\text{first green and second green}) \vphantom{\dfrac12}$
$= P(\text{first green}) \cdot P(\text{second green, given first green}) =
\dfrac{3}{10} \cdot \dfrac{3}{10} = \dfrac{9}{100}$.
- If the jellybeans are selected without replacement, then the probability that they
are both green is:
$P(\text{both green}) = P(\text{first green and second green}) \vphantom{\dfrac12}$
$= P(\text{first green}) \cdot P(\text{second green, given first green}) =
\dfrac{3}{10} \cdot \dfrac29 = \dfrac{6}{90}$.
In this example, we see that the bag
of jellybeans had changed before the second jellybean was selected, and we had to take that
fact into account in doing our computation.
- If the jellybeans are selected without replacement, then the probability that the
first is green and the second is red is:
$P(\text{first green and second red}) =
P(\text{first green}) \cdot P(\text{second red, given first green}) =
\dfrac{3}{10} \cdot \dfrac49 = \dfrac{12}{90}$.
- If the jellybeans are selected without replacement, then the probability that one is
green and one is red is:
$P(\text{one green and one red}) \vphantom{\dfrac12}
= P(\text{first green and second red, or first red and second green})$
$= P(\text{first green and second red}) + P(\text{first red and second green})
- P(\text{first green and second red, and first red and second green}) \vphantom{\dfrac12}$
Now we note that no jellybean can be both green and red simultaneously, so the two cases
are mutually exclusive and the subtraction term of the expression is zero. Therefore, we have:
$= P(\text{first green}) \cdot P(\text{second red, given first green}) + \vphantom{\dfrac12}
P(\text{first red}) \cdot P(\text{second green, given first red})$
$= \dfrac{3}{10} \cdot \dfrac49 + \dfrac{4}{10} \cdot \dfrac39 = \dfrac{24}{90}$.
In looking at the examples above, you will probably notice that the "without replacement"
case is much more interesting.
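The "without replacement" probabilities above can be checked by enumerating all ordered pairs of distinct jellybeans. This sketch encodes the bag as a list of colour labels (an assumed representation) and counts the 90 ordered draws directly.

```python
from fractions import Fraction
from itertools import permutations

# The bag: 4 red, 3 green, 2 yellow, 1 orange jellybeans.
bag = ["R"] * 4 + ["G"] * 3 + ["Y"] * 2 + ["O"]

# All 90 ordered draws of two distinct jellybeans (without replacement).
draws = list(permutations(range(10), 2))

def P(event):
    """Probability of an event (a predicate on the two drawn colours)."""
    return Fraction(sum(1 for i, j in draws if event(bag[i], bag[j])), len(draws))

assert P(lambda a, b: a == "G" and b == "G") == Fraction(6, 90)
assert P(lambda a, b: a == "G" and b == "R") == Fraction(12, 90)

# One green and one red, in either order.
assert P(lambda a, b: {a, b} == {"G", "R"}) == Fraction(24, 90)
```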
Statistical Independence
The Conditional Rule required taking into account some partial knowledge, and in so doing,
recomputing the probability of an event. Sometimes, the value changed. In the first example,
the probability of selecting an individual with Rh+ blood was 85%, but once it was known that
the individual had Type AB blood, the probability changed to 80%. Similarly, the probability
of selecting a green jellybean was $\dfrac{3}{10}$, but once a green jellybean was removed
from the bag, the probability of another green jellybean changed to $\dfrac29$.
Sometimes, though, partial knowledge will not change the probabilities. This was certainly
the case when rolling dice. The probability that the second die is a 5 is $\dfrac16$,
whether or not we know the outcome of the first die. In cases where partial knowledge of one
event has no effect on the probability of another event, we say the two events are
statistically independent, or, when the statistical context is clear,
simply independent. Events on the two dice were independent. So were the two selections
from the jellybean bag when the selections were made with replacement. In other
words, if $A$ and $B$ are two events, then they are independent whenever:
$P(B, \text{ given } A) = P(B)$
This condition is equivalent to $P(A, \text{ given } B) = P(A)$. In other
words, it does not matter which event is the source of the partial knowledge. Furthermore,
it is equivalent to $P(A \text{ and } B) = P(A) \cdot P(B)$, which is a huge
simplification of the Multiplication Rule.
The independence of two events is not the same as the events being mutually exclusive. In
fact, mutually exclusive events with nonzero probabilities are never independent. If $A$ and
$B$ are mutually exclusive, then partial knowledge that $A$ has occurred means that the
probability of $B$ becomes zero, that is, $P(B, \text{ given } A) = 0$, which cannot equal
$P(B)$ when $P(B) > 0$.
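Both the independence of the two dice and the non-independence of mutually exclusive events can be confirmed by enumeration. The events chosen below (totals of 2 and 12 as a mutually exclusive pair) are illustrative assumptions.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely outcomes of rolling two dice.
rolls = list(product(range(1, 7), repeat=2))

def P(event):
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))

A = lambda r: r[0] == 5            # first die is a 5
B = lambda r: r[1] == 5            # second die is a 5

# Independence: P(B, given A) = P(B), equivalently P(A and B) = P(A) * P(B).
assert P(lambda r: A(r) and B(r)) / P(A) == P(B)
assert P(lambda r: A(r) and B(r)) == P(A) * P(B)

# Mutually exclusive events with nonzero probabilities are never independent:
# here P(D, given C) = 0 even though P(D) > 0.
C = lambda r: r[0] + r[1] == 2     # total of 2
D = lambda r: r[0] + r[1] == 12    # total of 12
assert P(lambda r: C(r) and D(r)) / P(C) == 0
assert P(D) > 0
```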