Free Educational Material (50 Practice Problems)

AP Statistics

Use our free study resources to cover your bases on introductory statistics, normal distributions, and z-scores.

A parameter is a value describing a population, while a statistic describes a sample.

A population includes all members of a group, while a sample is a subset of the population.

A simple random sample is selected so that every member of the population, and every possible group of the chosen size, has an equal chance of being included.

A bar graph displays categorical data with gaps between bars; a histogram displays numerical data with continuous bins.

A boxplot shows the median, quartiles, and potential outliers of a dataset.
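The quantities a boxplot draws can be computed directly. The sketch below uses a made-up data set and the common 1.5 × IQR rule for flagging potential outliers. The quartile method is an assumption, since textbooks and calculators differ slightly in how they compute quartiles.

```python
from statistics import quantiles

# Hypothetical data set, made up for illustration.
data = sorted([2, 4, 5, 7, 8, 9, 30])

# Quartiles via the "inclusive" method; other methods give slightly different cut points.
q1, q2, q3 = quantiles(data, n=4, method="inclusive")
iqr = q3 - q1

# 1.5 * IQR rule: values beyond these fences are flagged as potential outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [x for x in data if x < low_fence or x > high_fence]

print("five-number summary:", min(data), q1, q2, q3, max(data))
print("potential outliers:", outliers)
```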

Mean is the average, median is the middle value, and mode is the most frequent value in a dataset.
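As a quick illustration, the three measures of center can be computed with Python's standard `statistics` module. The scores below are made up for the example.

```python
from statistics import mean, median, mode

# Hypothetical quiz scores, made up for illustration.
scores = [70, 85, 85, 90, 100]

avg = mean(scores)     # sum of the values divided by the count
mid = median(scores)   # middle value of the sorted data
most = mode(scores)    # most frequent value

print(avg, mid, most)  # 86 85 85
```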

Standard deviation measures the typical distance of data points from the mean; it is the square root of the variance.

Discrete variables have countable values; continuous variables can take any value in a range.

A frequency table lists the counts of each category or interval in a dataset.
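A frequency table for categorical data can be sketched with `collections.Counter`. The survey responses here are hypothetical.

```python
from collections import Counter

# Hypothetical survey responses, made up for illustration.
responses = ["bus", "car", "bus", "walk", "bus", "car"]

freq = Counter(responses)  # maps each category to its count

# Print categories from most to least frequent.
for category, count in freq.most_common():
    print(f"{category:<5} {count}")
```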

Population proportion refers to the entire population; sample proportion refers to the subset measured.

The standard deviation measures the typical distance of scores from the mean. For a set of scores with mean 78 and standard deviation 10, scores typically differ from 78 by about 10 points.

A z-score measures how many standard deviations a value is from the mean; calculated as (X - μ)/σ.
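The formula translates directly into code. This sketch reuses the mean of 78 and standard deviation of 10 mentioned earlier; the score of 93 is a made-up value.

```python
def z_score(x, mu, sigma):
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mu) / sigma

# Hypothetical score of 93 in a distribution with mean 78 and SD 10.
z = z_score(93, 78, 10)
print(z)  # 1.5
```

A score of 93 sits 1.5 standard deviations above the mean.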

The sample mean is computed from sample data; the population mean is computed from the entire population.

Histograms group data into intervals; dotplots show each data point individually.

An outlier is an unusually high or low value; it can significantly skew the mean.

Variance is the average squared deviation of each data point from the mean.

Descriptive statistics summarize data; inferential statistics make predictions or inferences about a population.

For an approximately normal distribution, about 68% of data falls within 1 SD, 95% within 2 SDs, and 99.7% within 3 SDs of the mean (the empirical rule).
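These percentages can be checked against the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `within_k_sd` is made up for the example.

```python
from statistics import NormalDist

std_normal = NormalDist(mu=0, sigma=1)

def within_k_sd(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {within_k_sd(k):.1%}")
```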

Correlation measures the strength of the association between variables; causation means that changes in one variable produce changes in another. Correlation alone does not establish causation.

The population SD divides the sum of squared deviations by N; the sample SD divides by n-1 to correct for the bias that comes from estimating the mean from the same sample.
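The two divisors correspond to `pstdev` (divide by N) and `stdev` (divide by n-1) in Python's `statistics` module. The data set below is made up for illustration.

```python
from statistics import pstdev, stdev

# Hypothetical data set; its mean is 5 and its squared deviations sum to 32.
data = [2, 4, 4, 4, 5, 5, 7, 9]

population_sd = pstdev(data)  # sqrt(32 / 8), divides by N
sample_sd = stdev(data)       # sqrt(32 / 7), divides by n - 1

print(population_sd)          # 2.0
print(round(sample_sd, 3))
```

The sample SD is always slightly larger than the population SD computed from the same numbers, and the gap shrinks as n grows.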

A probability distribution assigns probabilities to all possible outcomes of a random variable.

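Discrete distributions have countable outcomes (possibly infinitely many); continuous distributions can take any value in an interval, so probabilities are assigned to ranges of values rather than to single points.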

Expected value is the long-run average outcome of a random variable.
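For a discrete random variable, the expected value is the sum of each value times its probability. A minimal sketch, assuming a fair six-sided die as the example:

```python
from fractions import Fraction

# Fair six-sided die: each face has probability 1/6 (an illustrative assumption).
distribution = {face: Fraction(1, 6) for face in range(1, 7)}

# E[X] = sum of value * probability over all outcomes.
expected_value = sum(x * p for x, p in distribution.items())
print(expected_value)  # 7/2, i.e. 3.5
```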

A binomial distribution models the number of successes in a fixed number of independent trials with constant probability.
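Binomial probabilities follow the formula P(X = k) = C(n, k) p^k (1 - p)^(n - k). The sketch below implements it with `math.comb`; the coin-flip scenario is an illustrative assumption.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# Illustrative case: exactly 3 heads in 5 fair coin flips.
print(binomial_pmf(3, 5, 0.5))  # 0.3125
```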

A geometric distribution models the number of independent trials, each with the same probability of success, up to and including the first success.
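Under the convention that X counts the trial on which the first success occurs, P(X = k) = (1 - p)^(k - 1) p and the mean is 1/p. A short sketch with a made-up success probability of 0.2:

```python
def geometric_pmf(k, p):
    """P(first success occurs on trial k), for k = 1, 2, 3, ..."""
    return (1 - p) ** (k - 1) * p

p = 0.2  # hypothetical probability of success on each trial

print(round(geometric_pmf(3, p), 3))  # two failures, then a success
print(1 / p)                          # expected number of trials
```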

Probability is the chance that an event occurs, on a scale from 0 to 1; odds are the ratio of the probability that the event occurs to the probability that it does not.
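Converting between the two is a one-line calculation each way. The function names below are made up for the example.

```python
def odds_from_probability(p):
    """Odds in favor of an event with probability p (requires p < 1)."""
    return p / (1 - p)

def probability_from_odds(odds):
    """Probability of an event whose odds in favor are `odds` to 1."""
    return odds / (1 + odds)

print(odds_from_probability(0.75))  # 3.0, i.e. odds of 3 to 1
print(probability_from_odds(3.0))   # 0.75
```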

By the law of large numbers, as the number of observations increases, the sample mean tends to get closer to the population mean.

Conditional probability is the probability of an event given that another event has occurred.
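The definition P(A | B) = P(A and B) / P(B) can be checked by counting outcomes. A minimal sketch, assuming one roll of a fair die with made-up events A and B:

```python
from fractions import Fraction

outcomes = range(1, 7)                    # one roll of a fair die
A = {x for x in outcomes if x % 2 == 0}   # event A: roll is even
B = {x for x in outcomes if x > 3}        # event B: roll is greater than 3

p_a = Fraction(len(A), 6)
p_b = Fraction(len(B), 6)
p_a_and_b = Fraction(len(A & B), 6)

p_a_given_b = p_a_and_b / p_b
print(p_a, p_a_given_b)  # 1/2 2/3
```

Because P(A | B) differs from P(A), knowing that B occurred changes the probability of A.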

Independent events do not affect each other; dependent events influence each other's probabilities.

A random variable is a numerical outcome of a random process.

A sampling distribution is the probability distribution of a statistic over repeated samples.

The CLT states that the sampling distribution of the sample mean is approximately normal for large sample sizes.

Standard error measures the variability of a sample statistic from sample to sample.
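A small simulation can make the sampling distribution, the CLT, and the standard error concrete. The sketch below draws repeated samples from a skewed population and compares the spread of the sample means with σ/√n. The population, sample size, and number of repetitions are all illustrative choices.

```python
import random
from statistics import mean, stdev

random.seed(0)  # fixed seed so the simulation is repeatable

n = 40              # size of each sample
num_samples = 2000  # number of repeated samples

# Skewed population: exponential with mean 1 and SD 1 (an illustrative choice).
sample_means = [
    mean([random.expovariate(1.0) for _ in range(n)])
    for _ in range(num_samples)
]

theoretical_se = 1 / n ** 0.5  # sigma / sqrt(n) with sigma = 1

print("mean of sample means:", round(mean(sample_means), 3))
print("SD of sample means:  ", round(stdev(sample_means), 3))
print("sigma / sqrt(n):     ", round(theoretical_se, 3))
```

Even though the population is strongly skewed, the sample means cluster around the population mean with a spread close to the theoretical standard error.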

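A confidence interval is a range of plausible values for a population parameter, built by a method that captures the true parameter in a stated percentage (the confidence level, such as 95%) of repeated samples. The confidence level describes the method, not the probability that one particular interval contains the parameter.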

Margin of error quantifies the uncertainty in a sample estimate.
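As a rough sketch, a z-interval for a mean is the estimate plus or minus the margin of error, where the margin is a critical value times the standard error. The summary values below are hypothetical, and the population SD is assumed known so that a z critical value applies.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical summary values: sample mean 78, known population SD 10, n = 25.
x_bar, sigma, n = 78, 10, 25
confidence = 0.95

z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96
standard_error = sigma / sqrt(n)
margin = z_star * standard_error

low, high = x_bar - margin, x_bar + margin
print(f"{confidence:.0%} CI: ({low:.2f}, {high:.2f}), margin of error {margin:.2f}")
```

When the population SD is unknown, a t critical value is typically used instead.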

Hypothesis testing evaluates whether there is enough evidence to reject a null hypothesis.

A p-value measures the probability of obtaining a result at least as extreme as the observed, assuming the null hypothesis is true.

Type I error is rejecting a true null hypothesis; Type II error is failing to reject a false null hypothesis.

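Alpha is the significance level: the cutoff a p-value must fall below to reject the null hypothesis, commonly 0.05. It is also the probability of a Type I error when the null hypothesis is true.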

One-tailed tests check for an effect in one direction; two-tailed tests check for an effect in either direction.
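The p-value, alpha, and the one-tailed versus two-tailed distinction fit together as follows. This sketch assumes a z test statistic of 2.0, a made-up value, and computes both p-values from the standard normal distribution.

```python
from statistics import NormalDist

z = 2.0       # hypothetical test statistic
alpha = 0.05  # significance level

std_normal = NormalDist()
p_one_tailed = 1 - std_normal.cdf(z)             # alternative: parameter is greater
p_two_tailed = 2 * (1 - std_normal.cdf(abs(z)))  # alternative: parameter differs

for label, p in (("one-tailed", p_one_tailed), ("two-tailed", p_two_tailed)):
    decision = "reject H0" if p < alpha else "fail to reject H0"
    print(f"{label}: p = {p:.4f}, {decision}")
```

The two-tailed p-value is double the one-tailed value because it counts extreme results in both directions.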

A chi-square test assesses the association between categorical variables.
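The chi-square statistic sums (observed - expected)² / expected over every cell, where each expected count is (row total × column total) / grand total. A minimal sketch with a made-up 2×2 table:

```python
# Hypothetical 2x2 table of observed counts (rows: group, columns: response).
observed = [[30, 10],
            [20, 40]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

chi_square = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if the two variables were independent.
        expected = row_totals[i] * col_totals[j] / total
        chi_square += (obs - expected) ** 2 / expected

df = (len(observed) - 1) * (len(observed[0]) - 1)
print(f"chi-square = {chi_square:.2f}, df = {df}")
```

With 1 degree of freedom the critical value at alpha = 0.05 is about 3.84, so a statistic this large would lead to rejecting independence.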

Linear regression models the relationship between two quantitative variables with a line of best fit.

R² measures the proportion of variability in the response variable explained by the explanatory variable.

The correlation coefficient measures the strength and direction of a linear relationship between two variables.

A residual is the difference between an observed value and the value predicted by the regression line.
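The regression ideas above (line of best fit, r, R², and residuals) can be computed by hand from sums of squares. The sketch below uses a small made-up data set and the least-squares formulas slope = Sxy / Sxx and intercept = ȳ - slope × x̄.

```python
from math import sqrt

# Hypothetical paired data, made up for illustration.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

sxx = sum((xi - x_bar) ** 2 for xi in x)
syy = sum((yi - y_bar) ** 2 for yi in y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

slope = sxy / sxx
intercept = y_bar - slope * x_bar
r = sxy / sqrt(sxx * syy)  # correlation coefficient
r_squared = r ** 2         # proportion of variation in y explained by x

# Residual = observed y minus predicted y.
residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

print(f"y-hat = {intercept:.1f} + {slope:.1f}x, r = {r:.3f}, R^2 = {r_squared:.2f}")
print([round(e, 1) for e in residuals])
```

The residuals from a least-squares line always sum to zero, which is a handy check on the arithmetic.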

The explanatory variable is used to predict or explain changes; the response variable is the outcome being predicted or explained.

A scatterplot displays the relationship between two quantitative variables using points on a graph.

A lurking variable is a variable not included in the analysis that affects both the explanatory and response variables, potentially confounding the relationship between them.

Extrapolation predicts values outside the range of the data, which can be unreliable.

Influential points are data points that greatly affect the slope or position of the regression line.
Vasilios S.

Carnegie Mellon University

AP Statistics Tutor
