Xirius-Probabilityandothers2-STA209amp229.pdf
This document, "Xirius-Probabilityandothers2-STA209amp229.pdf", serves as a comprehensive set of lecture notes or a study guide for students enrolled in STA209 and STA229. It meticulously covers fundamental and advanced concepts in probability theory, laying a strong foundation for statistical analysis. The material progresses logically, starting from the basic definitions of probability and events, moving through the properties and rules of probability, and then delving into the crucial concepts of random variables and their distributions.
The document provides detailed explanations of both discrete and continuous probability distributions, including their respective probability mass/density functions, cumulative distribution functions, expected values, and variances. It also explores multivariate distributions, specifically joint probability distributions, marginal and conditional probabilities, and measures of association like covariance and correlation. Furthermore, it introduces advanced topics such as Moment Generating Functions (MGFs), Chebyshev's Inequality, and briefly touches upon the Law of Large Numbers and the Central Limit Theorem, which are cornerstones of statistical inference.
Overall, this PDF is designed to equip students with a robust understanding of probability theory, enabling them to model random phenomena, calculate probabilities, characterize random variables, and apply various probability distributions to real-world problems. It includes numerous formulas, definitions, and illustrative examples to facilitate learning and comprehension, making it an invaluable resource for anyone studying introductory to intermediate probability and statistics.
MAIN TOPICS AND CONCEPTS
1. Basic Concepts of Probability
This section establishes the foundational concepts of probability theory.
- Experiment: A process that leads to well-defined outcomes.
- Outcome: A single result of an experiment.
- Sample Space ($\Omega$ or $S$): The set of all possible outcomes of an experiment.
- Event: A subset of the sample space.
- Mutually Exclusive Events: Events that cannot occur simultaneously, i.e., $A \cap B = \emptyset$.
- Exhaustive Events: A set of events whose union covers the entire sample space, i.e., $\cup A_i = \Omega$.
- Complementary Event ($A^c$): The event that A does not occur.
- Intersection ($A \cap B$): Both A and B occur.
- Union ($A \cup B$): A occurs, or B occurs, or both occur.
- Classical (A Priori) Probability: Assumes all outcomes are equally likely.
$P(A) = \frac{\text{Number of favorable outcomes}}{\text{Total number of possible outcomes}}$
- Relative Frequency (Empirical/A Posteriori) Probability: Based on observed frequencies from experiments.
$P(A) = \lim_{n \to \infty} \frac{n_A}{n}$, where $n_A$ is the number of times event A occurs in $n$ trials.
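The relative-frequency definition can be illustrated by simulation. A minimal sketch (not from the notes) estimating $P(A)$ for the event "a fair die shows an even number", whose true probability is $3/6 = 0.5$:

```python
import random

random.seed(0)  # reproducible runs

def empirical_probability(event, trials):
    """Estimate P(A) as n_A / n over `trials` simulated die rolls."""
    hits = sum(event(random.randint(1, 6)) for _ in range(trials))
    return hits / trials

# The estimate n_A / n approaches 0.5 as the number of trials grows.
for n in (100, 10_000, 1_000_000):
    print(n, empirical_probability(lambda x: x % 2 == 0, n))
```

As the output shows, the running frequency stabilizes near the classical probability, which is exactly what the limit definition asserts.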
- Axiomatic Probability: A formal mathematical approach based on a set of axioms:
1. For any event A, $P(A) \ge 0$.
2. $P(\Omega) = 1$.
3. If $A_1, A_2, \dots$ are mutually exclusive events, then $P(\cup_{i=1}^\infty A_i) = \sum_{i=1}^\infty P(A_i)$.
Properties of Probability:
- $0 \le P(A) \le 1$ for any event A.
- $P(\emptyset) = 0$.
- $P(A^c) = 1 - P(A)$.
- If $A \subseteq B$, then $P(A) \le P(B)$.
2. Rules of Probability
This section details how probabilities of combined events are calculated.
- Addition Rule:
- For any two events A and B: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
- For mutually exclusive events A and B: $P(A \cup B) = P(A) + P(B)$.
- Conditional Probability: The probability of event A occurring given that event B has already occurred.
$P(A|B) = \frac{P(A \cap B)}{P(B)}$, provided $P(B) > 0$.
- Multiplication Rule:
- For any two events A and B: $P(A \cap B) = P(A|B)P(B) = P(B|A)P(A)$.
- For independent events A and B: $P(A \cap B) = P(A)P(B)$.
- Independence of Events: Two events A and B are independent if the occurrence of one does not affect the probability of the other. Formally, $P(A \cap B) = P(A)P(B)$; equivalently, $P(A|B) = P(A)$ when $P(B) > 0$.
- Bayes' Theorem: Used to update the probability of an event based on new evidence.
$P(A_i|B) = \frac{P(B|A_i)P(A_i)}{\sum_{j=1}^n P(B|A_j)P(A_j)}$
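A worked numeric illustration of Bayes' theorem, using hypothetical diagnostic-test numbers (not from the notes): 1% prevalence, 99% sensitivity, 95% specificity.

```python
# Hypothetical numbers for illustration only.
p_disease = 0.01                 # P(A1): prior probability of disease
p_pos_given_disease = 0.99       # P(B|A1): sensitivity
p_pos_given_healthy = 0.05       # P(B|A2): 1 - specificity

# Denominator: total probability of a positive test, sum_j P(B|Aj)P(Aj)
p_pos = (p_pos_given_disease * p_disease
         + p_pos_given_healthy * (1 - p_disease))

# Bayes' theorem: P(disease | positive test)
posterior = p_pos_given_disease * p_disease / p_pos
print(round(posterior, 4))  # ≈ 0.1667
```

Even with a highly accurate test, the low prior keeps the posterior near 1/6, which is why the denominator (the total probability of the evidence) matters.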
3. Random Variables and Probability Distributions
This section introduces the concept of random variables and how their probabilities are distributed.
- Random Variable (RV): A function that assigns a numerical value to each outcome in the sample space of a random experiment.
- Types of Random Variables:
- Discrete Random Variable: Takes on a finite or countably infinite number of values (e.g., number of heads in coin flips).
- Continuous Random Variable: Takes on any value within a given interval (e.g., height, temperature).
- Probability Distribution: A description of the probabilities associated with the possible values of a random variable.
- Probability Mass Function (PMF): $p(x) = P(X=x)$.
- Properties: $p(x) \ge 0$ for all $x$, and $\sum_x p(x) = 1$.
- Cumulative Distribution Function (CDF): $F(x) = P(X \le x) = \sum_{t \le x} p(t)$.
- Properties: $0 \le F(x) \le 1$, $F(x)$ is non-decreasing, $F(-\infty)=0$, $F(\infty)=1$.
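The discrete PMF and CDF above can be tabulated directly; a minimal sketch for a fair six-sided die, using exact fractions:

```python
from fractions import Fraction

# PMF of a fair six-sided die: p(x) = 1/6 for x in 1..6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

def cdf(x):
    """F(x) = P(X <= x) = sum of p(t) over all t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

assert sum(pmf.values()) == 1   # PMF property: probabilities sum to 1
print(cdf(3))                   # P(X <= 3) = 1/2
print(cdf(0), cdf(6))           # F is 0 below the support and 1 above it
```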
For Continuous Random Variables:
- Probability Density Function (PDF): $f(x)$.
- Properties: $f(x) \ge 0$ for all $x$, and $\int_{-\infty}^{\infty} f(x) dx = 1$.
- $P(a \le X \le b) = \int_a^b f(x) dx$. Note that $P(X=x)=0$ for any single point $x$.
- Cumulative Distribution Function (CDF): $F(x) = P(X \le x) = \int_{-\infty}^x f(t) dt$.
- Properties: $0 \le F(x) \le 1$, $F(x)$ is non-decreasing, $F(-\infty)=0$, $F(\infty)=1$.
- The PDF can be obtained from the CDF by differentiation: $f(x) = \frac{d}{dx}F(x)$.
4. Expectation and Variance
These are key measures used to describe the central tendency and spread of a random variable's distribution.
- Expected Value (Mean, $E(X)$ or $\mu$): The long-run average value of a random variable.
- Discrete RV: $E(X) = \sum_x x p(x)$.
- Continuous RV: $E(X) = \int_{-\infty}^{\infty} x f(x) dx$.
- Expected value of a function of X: $E(g(X)) = \sum_x g(x) p(x)$ (discrete) or $\int_{-\infty}^{\infty} g(x) f(x) dx$ (continuous).
- Properties of Expectation:
- $E(c) = c$ (for a constant $c$).
- $E(cX) = cE(X)$.
- $E(X+Y) = E(X) + E(Y)$.
- $E(aX+b) = aE(X)+b$.
- Variance ($Var(X)$ or $\sigma^2$): A measure of the spread or dispersion of the distribution around its mean.
- $Var(X) = E[(X - E(X))^2] = E(X^2) - [E(X)]^2$.
- Discrete RV: $Var(X) = \sum_x (x - \mu)^2 p(x)$.
- Continuous RV: $Var(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) dx$.
- Properties of Variance:
- $Var(c) = 0$.
- $Var(cX) = c^2 Var(X)$.
- $Var(aX+b) = a^2 Var(X)$.
- If X and Y are independent random variables, $Var(X+Y) = Var(X) + Var(Y)$.
- Standard Deviation ($\sigma_X$): The square root of the variance, $\sigma_X = \sqrt{Var(X)}$. It has the same units as the random variable.
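The expectation and variance formulas, and the property $Var(aX+b) = a^2 Var(X)$, can be checked exactly on a small discrete example (a fair die, computed with fractions):

```python
from fractions import Fraction

# Discrete RV: X = value of a fair six-sided die
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mean = sum(x * p for x, p in pmf.items())               # E(X) = 7/2
var = sum((x - mean) ** 2 * p for x, p in pmf.items())  # Var(X) = 35/12
print(mean, var)

# Property check with a = 2, b = 5: Y = 2X + 5
pmf_y = {2 * x + 5: p for x, p in pmf.items()}
mean_y = sum(y * p for y, p in pmf_y.items())
var_y = sum((y - mean_y) ** 2 * p for y, p in pmf_y.items())
assert mean_y == 2 * mean + 5    # E(aX+b) = aE(X) + b
assert var_y == 4 * var          # Var(aX+b) = a^2 Var(X); b drops out
```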
5. Common Discrete Probability Distributions
This section details several widely used discrete distributions.
- Bernoulli Distribution: $X \sim Bernoulli(p)$
- Models a single trial with two outcomes (success/failure).
- PMF: $p(x) = p^x (1-p)^{1-x}$ for $x \in \{0, 1\}$.
- $E(X) = p$, $Var(X) = p(1-p)$.
- Binomial Distribution: $X \sim B(n, p)$
- Models the number of successes in $n$ independent Bernoulli trials.
- PMF: $p(x) = \binom{n}{x} p^x (1-p)^{n-x}$ for $x \in \{0, 1, \dots, n\}$.
- $E(X) = np$, $Var(X) = np(1-p)$.
- Poisson Distribution: $X \sim P(\lambda)$
- Models the number of events occurring in a fixed interval of time or space, given a constant average rate $\lambda$.
- PMF: $p(x) = \frac{e^{-\lambda} \lambda^x}{x!}$ for $x \in \{0, 1, 2, \dots\}$.
- $E(X) = \lambda$, $Var(X) = \lambda$.
- Geometric Distribution: $X \sim Geo(p)$
- Models the number of Bernoulli trials needed to get the first success.
- PMF: $p(x) = (1-p)^{x-1} p$ for $x \in \{1, 2, 3, \dots\}$.
- $E(X) = \frac{1}{p}$, $Var(X) = \frac{1-p}{p^2}$.
- Hypergeometric Distribution: $X \sim H(N, K, n)$
- Models the number of successes in $n$ draws without replacement from a finite population of size $N$ containing $K$ successes.
- PMF: $p(x) = \frac{\binom{K}{x} \binom{N-K}{n-x}}{\binom{N}{n}}$ for $\max(0, n-(N-K)) \le x \le \min(n, K)$.
- $E(X) = n \frac{K}{N}$, $Var(X) = n \frac{K}{N} \frac{N-K}{N} \frac{N-n}{N-1}$.
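Taking the binomial distribution as a representative example, its PMF, mean, and variance formulas can be verified numerically with `math.comb`:

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ B(n, p): C(n, x) p^x (1-p)^(n-x)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

n, p = 10, 0.3
probs = [binom_pmf(x, n, p) for x in range(n + 1)]

assert abs(sum(probs) - 1) < 1e-12                  # PMF sums to 1
mean = sum(x * q for x, q in enumerate(probs))
assert abs(mean - n * p) < 1e-12                    # E(X) = np
var = sum((x - mean) ** 2 * q for x, q in enumerate(probs))
assert abs(var - n * p * (1 - p)) < 1e-12           # Var(X) = np(1-p)
print(mean, var)
```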
6. Common Continuous Probability Distributions
This section covers important continuous distributions.
- Uniform Distribution: $X \sim U(a, b)$
- Models situations where all values within an interval $[a, b]$ are equally likely.
- PDF: $f(x) = \frac{1}{b-a}$ for $a \le x \le b$, and $0$ otherwise.
- $E(X) = \frac{a+b}{2}$, $Var(X) = \frac{(b-a)^2}{12}$.
- Exponential Distribution: $X \sim Exp(\lambda)$
- Models the time until an event occurs in a Poisson process (e.g., waiting time for the next customer).
- PDF: $f(x) = \lambda e^{-\lambda x}$ for $x \ge 0$, and $0$ otherwise.
- $E(X) = \frac{1}{\lambda}$, $Var(X) = \frac{1}{\lambda^2}$.
- It possesses the memoryless property: $P(X > s+t | X > s) = P(X > t)$.
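The memoryless property follows directly from the exponential survival function $P(X > t) = e^{-\lambda t}$, and can be verified numerically (illustrative values of $\lambda$, $s$, $t$):

```python
from math import exp

lam = 0.5  # rate parameter lambda (illustrative)

def survival(t):
    """P(X > t) for X ~ Exp(lam): e^(-lam * t)."""
    return exp(-lam * t)

s, t = 2.0, 3.0
# Memoryless property: P(X > s+t | X > s) = P(X > s+t) / P(X > s) = P(X > t)
lhs = survival(s + t) / survival(s)
rhs = survival(t)
assert abs(lhs - rhs) < 1e-12
print(lhs, rhs)
```

Algebraically this is just $e^{-\lambda(s+t)} / e^{-\lambda s} = e^{-\lambda t}$: having already waited $s$ units tells you nothing about the remaining wait.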
- Normal (Gaussian) Distribution: $X \sim N(\mu, \sigma^2)$
- The most important distribution in statistics, characterized by its bell-shaped curve.
- PDF: $f(x) = \frac{1}{\sigma \sqrt{2\pi}} e^{-\frac{1}{2} (\frac{x-\mu}{\sigma})^2}$ for $-\infty < x < \infty$.
- $E(X) = \mu$, $Var(X) = \sigma^2$.
- Standard Normal Distribution: A special case where $\mu=0$ and $\sigma=1$. Any normal RV X can be standardized using $Z = \frac{X-\mu}{\sigma}$.
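Standardization reduces any normal probability to a standard-normal CDF evaluation, which can be computed with the error function; a sketch (the `mu=100, sigma=15` values are illustrative):

```python
from math import erf, sqrt

def normal_cdf(x, mu=0.0, sigma=1.0):
    """P(X <= x) for X ~ N(mu, sigma^2), via Z = (x - mu)/sigma and erf."""
    z = (x - mu) / sigma                 # standardize
    return 0.5 * (1 + erf(z / sqrt(2)))  # standard normal CDF at z

# By symmetry, P(X <= mu) = 0.5 for any normal distribution.
assert abs(normal_cdf(100, mu=100, sigma=15) - 0.5) < 1e-12

# The one-sigma rule: P(mu - sigma <= X <= mu + sigma) ≈ 0.6827
p = normal_cdf(1) - normal_cdf(-1)
print(round(p, 4))
```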
7. Joint Probability Distributions
This section extends probability concepts to multiple random variables.
- Joint PMF (Discrete): $p(x, y) = P(X=x, Y=y)$.
- Properties: $p(x,y) \ge 0$, and $\sum_x \sum_y p(x,y) = 1$.
- Joint PDF (Continuous): $f(x, y)$.
- Properties: $f(x,y) \ge 0$, and $\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} f(x,y) dx dy = 1$.
- Marginal Distributions: The probability distribution of a single random variable from a joint distribution.
- Discrete: $p_X(x) = \sum_y p(x,y)$, $p_Y(y) = \sum_x p(x,y)$.
- Continuous: $f_X(x) = \int_{-\infty}^{\infty} f(x,y) dy$, $f_Y(y) = \int_{-\infty}^{\infty} f(x,y) dx$.
- Conditional Distributions: The distribution of one variable given the value of another.
- Discrete: $p(x|y) = \frac{p(x,y)}{p_Y(y)}$, $p(y|x) = \frac{p(x,y)}{p_X(x)}$.
- Continuous: $f(x|y) = \frac{f(x,y)}{f_Y(y)}$, $f(y|x) = \frac{f(x,y)}{f_X(x)}$.
- Independence of Random Variables: Two random variables X and Y are independent if their joint distribution is the product of their marginal distributions.
- Discrete: $p(x,y) = p_X(x)p_Y(y)$ for all $x, y$.
- Continuous: $f(x,y) = f_X(x)f_Y(y)$ for all $x, y$.
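Marginals and the independence criterion can be checked on a small joint PMF table; a sketch with illustrative numbers (not from the notes):

```python
from fractions import Fraction

F = Fraction
# A small joint PMF p(x, y) on x in {0,1}, y in {0,1} (illustrative numbers)
joint = {(0, 0): F(1, 8), (0, 1): F(3, 8),
         (1, 0): F(1, 8), (1, 1): F(3, 8)}

# Marginals: p_X(x) = sum over y of p(x,y); p_Y(y) = sum over x of p(x,y)
p_x = {x: sum(p for (a, _), p in joint.items() if a == x) for x in (0, 1)}
p_y = {y: sum(p for (_, b), p in joint.items() if b == y) for y in (0, 1)}

# Independence: p(x, y) = p_X(x) p_Y(y) must hold for ALL (x, y) pairs
independent = all(joint[(x, y)] == p_x[x] * p_y[y]
                  for x in (0, 1) for y in (0, 1))
print(independent)  # True for this particular table
```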
- Covariance: A measure of the linear relationship between two random variables.
- $Cov(X,Y) = E[(X-E(X))(Y-E(Y))] = E(XY) - E(X)E(Y)$.
- If X and Y are independent, $Cov(X,Y) = 0$. The converse is not always true.
- Correlation Coefficient ($\rho_{XY}$): A standardized measure of the linear relationship, ranging from -1 to 1.
- $\rho_{XY} = \frac{Cov(X,Y)}{\sigma_X \sigma_Y}$, where $\sigma_X$ and $\sigma_Y$ are the standard deviations of X and Y.
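The warning that zero covariance does not imply independence has a standard counterexample, computed exactly here: X uniform on $\{-1, 0, 1\}$ and $Y = X^2$, so Y is fully determined by X yet uncorrelated with it.

```python
from fractions import Fraction

F = Fraction
# X uniform on {-1, 0, 1}; Y = X^2 is completely determined by X.
pmf_x = {-1: F(1, 3), 0: F(1, 3), 1: F(1, 3)}

e_x = sum(x * p for x, p in pmf_x.items())           # E(X) = 0 by symmetry
e_y = sum(x**2 * p for x, p in pmf_x.items())        # E(Y) = E(X^2) = 2/3
e_xy = sum(x * x**2 * p for x, p in pmf_x.items())   # E(XY) = E(X^3) = 0

cov = e_xy - e_x * e_y                               # Cov(X,Y) = E(XY) - E(X)E(Y)
print(cov)  # 0, even though Y is a function of X
```

This makes the direction of the implication concrete: independence forces $Cov(X,Y) = 0$, but the converse fails.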