
Free Study Guide · 2026 Exam Season

AP Statistics Study Guide

Complete AP Statistics study guide for 2026. Covers inference procedures, the normal distribution, sampling, regression, and probability with interactive simulations.


Exam date

Thursday, May 7, 2026

4 units covered · 2 interactive elements · 100% free to use
Unit 6 · 35–40% of exam (Units 6 + 7 combined)

Inference for Proportions and Means

The core of AP Statistics. z-tests, t-tests, confidence intervals, and conditions for inference.

Inference — drawing conclusions about a population from a sample — is the heart of AP Statistics and accounts for the majority of exam points.

The Inference Procedure Template

Every inference problem follows the same four-step structure (memorize this):

  1. State: Define the parameter. State H₀ and Hₐ (or the confidence level).
  2. Plan: Name the test/interval. Check all conditions.
  3. Do: Calculate the test statistic and p-value (or interval).
  4. Conclude: In context, state your conclusion about the parameter.
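The "Do" step of this template can be sketched in code. Below is a minimal Python version of a right-tailed one-proportion z-test; the counts (60 successes in 100 trials, H₀: p = 0.5 vs. Hₐ: p > 0.5) are made up for illustration:

```python
import math

# Hypothetical example: H0: p = 0.5 vs Ha: p > 0.5, with 60 successes in n = 100.
def one_prop_z_test(x, n, p0):
    """Return (z, p_value) for a right-tailed one-proportion z-test."""
    p_hat = x / n                                   # Do: sample proportion
    se = math.sqrt(p0 * (1 - p0) / n)               # standard error under H0
    z = (p_hat - p0) / se                           # test statistic
    p_value = 1 - 0.5 * (1 + math.erf(z / math.sqrt(2)))  # right-tail Normal area
    return z, p_value

z, p = one_prop_z_test(60, 100, 0.5)
print(round(z, 2), round(p, 4))  # z = 2.0, p ≈ 0.0228
```

The "State," "Plan," and "Conclude" steps still have to be written in words and in context; only the arithmetic automates.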

Conditions for Inference

For proportions (one sample): Random sample, Normal (np ≥ 10 and n(1−p) ≥ 10), Independent (population ≥ 10n).

For means (one sample): Random, Normal/Large Sample (population normal, or n ≥ 30, or no strong skew), Independent (population ≥ 10n).
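The two checkable conditions (Normal and Independent) are simple inequalities, as this hypothetical helper shows; the Random condition must be judged from the study design, not computed:

```python
def check_proportion_conditions(n, p0, population_size):
    """Check the Normal and Independent conditions for a one-proportion z-test.
    (Randomness must be judged from how the sample was collected.)"""
    normal_ok = n * p0 >= 10 and n * (1 - p0) >= 10   # Large Counts condition
    independent_ok = population_size >= 10 * n        # 10% condition
    return normal_ok, independent_ok

# n*p0 = 5 < 10, so the Normal condition fails even though 10n = 500 <= 10,000
print(check_proportion_conditions(n=50, p0=0.1, population_size=10_000))
```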

Test Statistics

$$
\begin{array}{ll}
\textbf{One-proportion } z\textbf{-test:} & z = \dfrac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} \\[18pt]
\textbf{One-sample } t\textbf{-test:} & t = \dfrac{\bar{x} - \mu_0}{s/\sqrt{n}}, \quad df = n - 1 \\[18pt]
\textbf{Two-proportion } z\textbf{-test:} & z = \dfrac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}_c(1-\hat{p}_c)\left(\dfrac{1}{n_1}+\dfrac{1}{n_2}\right)}}, \quad \hat{p}_c = \dfrac{x_1+x_2}{n_1+n_2} \\[18pt]
\textbf{Two-sample } t\textbf{-test:} & t = \dfrac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1}+\dfrac{s_2^2}{n_2}}}
\end{array}
$$
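As a sketch, the two-sample statistics above translate directly into code. All counts and summary statistics below are invented for illustration:

```python
import math

def two_prop_z(x1, n1, x2, n2):
    """Two-proportion z statistic with the pooled proportion under H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    p_c = (x1 + x2) / (n1 + n2)                       # pooled proportion
    se = math.sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

def two_sample_t(xbar1, s1, n1, xbar2, s2, n2):
    """Unpooled two-sample t statistic (df comes from technology)."""
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    return (xbar1 - xbar2) / se

print(round(two_prop_z(45, 100, 30, 100), 3))   # ≈ 2.191
```

On the exam your calculator does this arithmetic; the value of writing it out is seeing exactly which quantities each formula consumes.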
Interactive: Normal Distribution Explorer (Desmos widget)

Drag μ (mean) and σ (standard deviation) to reshape the curve. Drag a and b to define a shaded region — the area equals the probability of a value falling in that range. Essential for finding p-values and critical values.

Confidence Intervals

A confidence interval estimates a parameter with a margin of error:

$$\text{statistic} \pm z^* \cdot \text{SE} \qquad \text{where SE is the standard error}$$
$$
\begin{array}{ll}
\textbf{One-proportion CI:} & \hat{p} \pm z^*\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \\[14pt]
\textbf{One-sample } t \textbf{ CI:} & \bar{x} \pm t^*\dfrac{s}{\sqrt{n}}
\end{array}
$$

Common critical values: z* = 1.645 (90%), z* = 1.96 (95%), z* = 2.576 (99%).
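A minimal sketch of the one-proportion interval, using the 95% critical value and made-up counts (60 successes in 100 trials):

```python
import math

def one_prop_ci(x, n, z_star=1.96):
    """One-proportion confidence interval: p-hat plus/minus the margin of error."""
    p_hat = x / n
    me = z_star * math.sqrt(p_hat * (1 - p_hat) / n)   # margin of error
    return p_hat - me, p_hat + me

lo, hi = one_prop_ci(60, 100)            # 95% CI for p
print(round(lo, 3), round(hi, 3))        # ≈ 0.504 0.696
```

Note that the interval uses p̂ in the standard error, while the z-test uses p₀; the exam expects you to know which goes where.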

Interpreting CIs correctly: "We are 95% confident that the true proportion of [context] is between [lower] and [upper]." Do NOT say "there is a 95% chance the true value is in this interval" — it either is or isn't.

P-Values and Significance

The p-value is the probability of observing a test statistic at least as extreme as ours, assuming H₀ is true.

  • p < α: Reject H₀. Statistically significant evidence for Hₐ.
  • p ≥ α: Fail to reject H₀. Not enough evidence for Hₐ.

Never "accept H₀" — you only fail to reject it. This is one of the most commonly penalized errors on FRQs.

Exam tip: FRQ conclusions must be in context and must reference the p-value. A good template: 'Because p = [value] < α = 0.05, we reject H₀. There is convincing evidence that [Hₐ in context].' Lose one point if you forget the context.

Common mistake: Don't say 'accept H₀' — you fail to reject it. Don't say 'the probability that H₀ is true is p' — the p-value assumes H₀ is true, it doesn't measure the probability that H₀ is true.

Key Concepts

  • p-value: Probability of observing a result as extreme as ours, given H₀ is true.
  • Significance level (α): Threshold for rejecting H₀. Commonly 0.05. Set before the test.
  • Type I error: Rejecting H₀ when it is actually true. Probability = α.
  • Type II error: Failing to reject H₀ when it is actually false. Probability = β.
  • Power: Probability of correctly rejecting a false H₀. Power = 1 − β. Increases with larger n or larger effect size.
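The claim that power grows with n can be checked by simulation. This sketch estimates, with all numbers hypothetical, how often a right-tailed test at α = 0.05 rejects H₀: p = 0.5 when the true proportion is 0.6:

```python
import random, math

random.seed(1)

def rejects(n, p_true, p0=0.5, z_crit=1.645):
    """Simulate one sample and report whether the right-tailed z-test rejects H0."""
    x = sum(random.random() < p_true for _ in range(n))      # simulated successes
    z = (x / n - p0) / math.sqrt(p0 * (1 - p0) / n)
    return z > z_crit

for n in (50, 100, 200):
    power = sum(rejects(n, 0.6) for _ in range(2000)) / 2000  # Monte Carlo estimate
    print(n, round(power, 2))   # power grows with n
```

With a true p of 0.6, power climbs from well under 50% at n = 50 to near 90% at n = 200, which is why sample-size planning matters.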
Unit 5 · 10–15% of exam

Sampling Distributions

The Central Limit Theorem, sampling distributions of p̂ and x̄, and standard error.

Sampling Distribution of p̂

When taking random samples of size n from a population with proportion p, the sampling distribution of p̂ is:

$$\mu_{\hat{p}} = p \qquad \sigma_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}$$

Shape is approximately Normal when np ≥ 10 and n(1−p) ≥ 10.
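Both formulas can be verified by brute force: draw many samples of size n from a population with a known p and compare the simulated center and spread to p and √(p(1−p)/n). The values of p, n, and the number of repetitions below are arbitrary:

```python
import random, statistics

random.seed(0)
p, n, reps = 0.3, 100, 5000

# Each entry is one sample's p-hat: the count of successes divided by n.
p_hats = [sum(random.random() < p for _ in range(n)) / n for _ in range(reps)]

print(round(statistics.mean(p_hats), 3))    # near p = 0.3
print(round(statistics.pstdev(p_hats), 3))  # near sqrt(0.3 * 0.7 / 100) ≈ 0.046
```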

Sampling Distribution of x̄

When taking random samples of size n from a population with mean μ and standard deviation σ:

$$\mu_{\bar{x}} = \mu \qquad \sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \quad \text{(standard error of the mean)}$$

The Central Limit Theorem (CLT)

Regardless of the shape of the population distribution, the sampling distribution of x̄ is approximately Normal when n is large (n ≥ 30 is the common rule of thumb).

The CLT is why inference works — it justifies using Normal-based procedures even when we don't know if the population is Normal.
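A quick simulation makes the CLT concrete. The population below is strongly right-skewed (an exponential distribution, chosen only for illustration), yet the sampling distribution of x̄ tightens like σ/√n as n grows:

```python
import random, statistics

random.seed(0)

def sample_means(n, reps=3000):
    """Simulate the sampling distribution of x-bar for samples of size n
    drawn from a right-skewed exponential population (mean 1, sd 1)."""
    return [statistics.mean(random.expovariate(1.0) for _ in range(n))
            for _ in range(reps)]

for n in (2, 30):
    means = sample_means(n)
    print(n, round(statistics.pstdev(means), 2))   # spread shrinks like 1/sqrt(n)
```

Plotting a histogram of `sample_means(30)` would also show the near-Normal shape the CLT promises, despite the skewed population.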

Interactive: Central Limit Theorem — Population vs. Sampling Distribution (Desmos widget)

The wider curve is the population distribution (std dev = σ). The narrower curve is the sampling distribution of x̄ (std dev = σ/√n). Drag n upward and watch the sampling distribution tighten — this is the CLT. Drag σ to change population spread.

Exam tip: The standard error (SE) is σ/√n for means, √(p(1-p)/n) for proportions. It measures how much sample statistics vary from sample to sample. Larger n → smaller SE → more precise estimates. This relationship is fundamental.

Key Concepts

  • Sampling distribution: The distribution of a statistic (like x̄ or p̂) over all possible samples of the same size.
  • Standard error: Standard deviation of a sampling distribution. For means: σ/√n.
  • Central Limit Theorem: For large n, the sampling distribution of x̄ is approximately Normal, regardless of population shape.
  • Unbiased estimator: A statistic whose sampling distribution is centered at the true parameter value. x̄ is unbiased for μ.
Unit 2 · 5–7% of exam

Exploring Two-Variable Data

Scatterplots, correlation, least-squares regression, residuals, and transformations.

Least-Squares Regression Line (LSRL)

The LSRL minimizes the sum of squared residuals.

$$\hat{y} = a + bx \qquad b = r\frac{s_y}{s_x} \qquad a = \bar{y} - b\bar{x}$$

Key fact: the LSRL always passes through (x̄, ȳ). Use this to check calculations.
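The slope and intercept formulas need only five summary statistics, as this sketch shows; the numbers (r = 0.8, s_x = 2, s_y = 5, x̄ = 10, ȳ = 50) are hypothetical:

```python
# LSRL from summary statistics, using b = r * s_y / s_x and a = y-bar - b * x-bar.
r, s_x, s_y, x_bar, y_bar = 0.8, 2.0, 5.0, 10.0, 50.0

b = r * s_y / s_x            # slope
a = y_bar - b * x_bar        # intercept

print(b, a)                    # 2.0 30.0
# Check the key fact: the line passes through (x-bar, y-bar)
print(a + b * x_bar == y_bar)  # True
```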

Interpreting Regression Output

On the AP exam, you'll often read computer output. Know what each quantity means:

  • Coefficient (slope b): Predicted change in y per 1-unit increase in x. "For each additional [x unit], we predict [y] changes by b [y units]."
  • R² (coefficient of determination): Proportion of variability in y explained by the linear relationship with x.
  • Residual: Observed − Predicted = y − ŷ. A residual plot should show no pattern.

Correlation (r)

  • −1 ≤ r ≤ 1. Sign gives direction; |r| gives strength.
  • r measures linear association only.
  • Correlation does not imply causation.
  • r is not resistant to outliers.

Exam tip: Interpreting slope in context always costs points if vague. Include the units of x and y, the word 'predicted,' and the direction. Example: 'For each additional inch of height, the predicted weight increases by 4.7 pounds.'

Key Concepts

  • LSRL: Least-squares regression line. Minimizes sum of squared residuals.
  • R²: Coefficient of determination. Percent of variability in y explained by the linear model.
  • Residual: Actual − Predicted (y − ŷ). Residual plot should show random scatter.
  • Correlation (r): Measures strength and direction of linear association. Not resistant to outliers.
  • Extrapolation: Predicting outside the range of data. Unreliable — models can behave differently beyond observed x values.


Unit 8 · 2–5% of exam

Chi-Square Tests

Goodness-of-fit, homogeneity, and independence tests with expected counts.

Three Chi-Square Tests

| Test | Purpose | Data Structure |
| --- | --- | --- |
| Goodness-of-fit | Does one categorical variable match a claimed distribution? | One sample, one variable |
| Homogeneity | Is the distribution of one variable the same across multiple populations? | Multiple samples, one variable |
| Independence | Are two categorical variables associated in one population? | One sample, two variables |

The Chi-Square Statistic

$$\chi^2 = \sum \frac{(\text{Observed} - \text{Expected})^2}{\text{Expected}}$$

Expected counts for a two-way table: (row total × column total) / table total.

Conditions: Random sample, all expected counts ≥ 5.

Degrees of freedom: For GOF, df = k − 1 (k = number of categories). For two-way tests, df = (rows−1)(cols−1).
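A goodness-of-fit statistic is short enough to compute by hand; this sketch tests a die for fairness with invented counts (p-values then come from a χ² table or technology):

```python
# Chi-square goodness-of-fit, hypothetical data: H0 says the die is fair,
# so each of the 6 faces has expected count n/6.
observed = [12, 18, 14, 16, 20, 10]            # 90 rolls total
n = sum(observed)
expected = [n / 6] * 6                         # all 15, so the >= 5 condition holds

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1                         # GOF: df = k - 1 = 5

print(round(chi_sq, 2), df)                    # 4.67 5
```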

Exam tip: Chi-square tests are always right-tailed — the test statistic is always positive, and larger values give more evidence against H₀. On the FRQ, show your expected counts table and check the condition (all expected ≥ 5).
