## AH Statistics - Analysis Decision Flowchart


Compare one sample with target value(s)

Compare population median to a target value

Distribution of parent population is symmetrical

Create a second sample in which every value equals the target median

Wilcoxon signed-rank test
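As a rough illustration of this branch, the sketch below computes the signed-rank sums for one sample against a target median. Pairing each value with the target is equivalent to working with the differences from the target. The function name `wilcoxon_signed_rank` is hypothetical, zero differences are assumed to be discarded, and tied absolute differences are assumed to share their average rank; the smaller rank sum would then be compared with a critical value from tables.

```python
def wilcoxon_signed_rank(sample, target_median):
    """Return (W_plus, W_minus, n_used) for a one-sample signed-rank test."""
    # Differences from the target; zero differences are discarded (assumption).
    diffs = [x - target_median for x in sample if x != target_median]
    # Order by absolute size so that position in the list gives the rank.
    ordered = sorted(diffs, key=abs)
    ranks = [0.0] * len(ordered)
    i = 0
    while i < len(ordered):
        # Find the run of tied absolute differences starting at position i.
        j = i
        while j + 1 < len(ordered) and abs(ordered[j + 1]) == abs(ordered[i]):
            j += 1
        average_rank = (i + 1 + j + 1) / 2  # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(ordered, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(ordered, ranks) if d < 0)
    return w_plus, w_minus, len(ordered)


# Made-up data: is the population median 13?
print(wilcoxon_signed_rank([12, 15, 9, 20, 17, 11, 14], 13))  # (17.5, 10.5, 7)
```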

Distribution of parent population is NOT symmetrical

No valid process in AH Statistics course

Compare population mean to a target value

Distribution of parent population is normal

Parent population variance NOT known

Estimate population variance from sample variance

Perform hypothesis test

t-test for a single sample mean

Construct confidence interval

Use Student's t-distribution
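A minimal sketch of this branch, assuming the critical value of Student's t-distribution is read from tables and passed in. The function names `one_sample_t` and `t_confidence_interval` are hypothetical; the population variance is estimated by the sample variance with divisor n − 1.

```python
from math import sqrt
from statistics import mean, stdev


def one_sample_t(sample, target_mean):
    """Return (t statistic, degrees of freedom) for H0: mu = target_mean."""
    n = len(sample)
    s = stdev(sample)  # sample standard deviation, divisor n - 1
    t = (mean(sample) - target_mean) / (s / sqrt(n))
    return t, n - 1


def t_confidence_interval(sample, t_critical):
    """Interval for the mean; t_critical comes from tables with n - 1 df."""
    half_width = t_critical * stdev(sample) / sqrt(len(sample))
    centre = mean(sample)
    return centre - half_width, centre + half_width


# Made-up data; 2.776 is the two-tailed 5% critical value for 4 df.
data = [5.1, 4.9, 5.3, 5.0, 5.2]
print(one_sample_t(data, 5.0))
print(t_confidence_interval(data, 2.776))
```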

Parent population variance known

Perform hypothesis test

z-test for a single sample mean

Construct confidence interval

Use z-distribution
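When the population variance is known, the same steps can be sketched with the standard normal distribution, which Python's standard library provides as `statistics.NormalDist`. The function names and the two-sided p-value are illustrative choices, not part of the flowchart.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z(sample, target_mean, population_sd):
    """Return (z statistic, two-sided p-value) for H0: mu = target_mean."""
    standard_error = population_sd / sqrt(len(sample))
    z = (mean(sample) - target_mean) / standard_error
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


def z_confidence_interval(sample, population_sd, level=0.95):
    """Confidence interval for the mean with known population sd."""
    z_critical = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_critical * population_sd / sqrt(len(sample))
    centre = mean(sample)
    return centre - half_width, centre + half_width


# Made-up data with an assumed known population standard deviation of 3.
data = [52, 48, 55, 51, 49, 53, 50, 54, 47, 51]
print(one_sample_z(data, 50, 3))
print(z_confidence_interval(data, 3))
```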

Distribution of parent population is NOT normal

Parent population variance known

Sample size > 20

Use central limit theorem to obtain approximate normal distribution of sample mean

Perform hypothesis test

z-test for a single sample mean

Construct confidence interval

Use z-distribution

Sample size < 20

No valid process in AH Statistics course

Parent population variance NOT known

Sample size > 20

Estimate population variance from sample variance

Use central limit theorem to obtain approximate normal distribution of sample mean

Perform hypothesis test

t-test for a single sample mean

Construct confidence interval

Use Student's t-distribution

Sample size < 20

No valid process in AH Statistics course

Compare population proportion to a target value

Model with binomial distribution

Hypothesis test

Either proceed using the binomial distribution, or approximate to normal if np > 5 and nq > 5

Confidence interval

Approximate to normal if np > 5 and nq > 5

Construct confidence interval using z-distribution
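The proportion branch can be sketched as follows: an exact binomial tail probability for the hypothesis test, and a normal-approximation interval guarded by the np > 5 and nq > 5 check. All function names are hypothetical, and applying the check to the sample proportion for the interval is an assumption made here.

```python
from math import comb, sqrt
from statistics import NormalDist


def binomial_upper_tail(successes, n, p0):
    """Exact P(X >= successes) for X ~ B(n, p0)."""
    return sum(comb(n, k) * p0**k * (1 - p0) ** (n - k)
               for k in range(successes, n + 1))


def normal_approximation_ok(n, p):
    """The flowchart's criterion: np > 5 and nq > 5."""
    return n * p > 5 and n * (1 - p) > 5


def proportion_confidence_interval(successes, n, level=0.95):
    """Normal-approximation interval for a population proportion."""
    p_hat = successes / n
    if not normal_approximation_ok(n, p_hat):
        raise ValueError("np > 5 and nq > 5 not satisfied")
    z_critical = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z_critical * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


print(binomial_upper_tail(8, 10, 0.5))          # exact one-tailed p-value
print(proportion_confidence_interval(12, 40))   # made-up counts
```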

Compare sample to a theoretical distribution

Chi-squared goodness-of-fit test

Distribution parameter known

Check expected frequencies: at least 80% ≥5 and none <1, else combine categories and start again.

Use degrees of freedom = categories - 1

Distribution parameter NOT known

Estimate the unknown parameter from the observed frequencies

Check expected frequencies: at least 80% ≥5 and none <1, else combine categories and start again.

Use degrees of freedom = categories - 2
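The goodness-of-fit steps above could be sketched like this. The names `expected_frequencies_ok` and `chi_squared_goodness_of_fit` are hypothetical; `estimated_parameters` is 0 when the distribution parameter is known and 1 when it has been estimated from the observed frequencies, which reproduces the two degrees-of-freedom rules. The statistic would then be compared with a chi-squared critical value from tables.

```python
def expected_frequencies_ok(expected):
    """At least 80% of expected frequencies >= 5 and none < 1."""
    at_least_five = sum(1 for e in expected if e >= 5)
    # Integer comparison avoids rounding trouble with 0.8 * len(expected).
    return min(expected) >= 1 and 5 * at_least_five >= 4 * len(expected)


def chi_squared_goodness_of_fit(observed, expected, estimated_parameters=0):
    """Return (chi-squared statistic, degrees of freedom)."""
    if not expected_frequencies_ok(expected):
        raise ValueError("combine categories and start again")
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - estimated_parameters
    return statistic, df


# Made-up example: 60 rolls of a die, tested against a uniform distribution.
print(chi_squared_goodness_of_fit([8, 12, 9, 11, 6, 14], [10] * 6))
```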

Compare two samples with each other

Two samples of paired data

Compare difference of population medians

Distribution of parent population of differences is symmetrical

Wilcoxon signed-rank test

Distribution of parent population of differences is NOT symmetrical

No valid process in AH Statistics course

Compare population means

Calculate values of paired differences

Proceed as for a one-sample process, focusing on the distribution of the differences.
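To illustrate the reduction to a one-sample problem, the sketch below forms the paired differences and then tests whether their mean is zero with a one-sample t statistic. It assumes the differences are normally distributed with unknown variance; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_differences(first, second):
    """Differences first - second for data paired by position."""
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    return [a - b for a, b in zip(first, second)]


def paired_t(first, second):
    """Return (t statistic, degrees of freedom) for H0: mean difference = 0."""
    diffs = paired_differences(first, second)
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


# Made-up before/after measurements on the same five subjects.
before = [72, 68, 75, 80, 66]
after = [70, 65, 76, 74, 63]
print(paired_differences(before, after))  # [2, 3, -1, 6, 3]
print(paired_t(before, after))
```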

Two samples of non-paired data

Compare population medians

Distributions of both parent populations have the same shape and spread

Mann-Whitney test
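One way to sketch the Mann-Whitney statistic is by direct pair counting: for each pair of values, one from each sample, score 1 if the first-sample value is larger and 0.5 for a tie. This is the U form of the statistic; a course's tables may instead be based on rank sums, so treat the function `mann_whitney_u` as an illustrative assumption rather than the prescribed layout.

```python
def mann_whitney_u(sample_a, sample_b):
    """Return (U_a, U_b), where U_a counts pairs in which a exceeds b."""
    u_a = sum(1.0 if a > b else 0.5 if a == b else 0.0
              for a in sample_a for b in sample_b)
    # The two U values always add up to the number of pairs.
    u_b = len(sample_a) * len(sample_b) - u_a
    return u_a, u_b


# Made-up data; the smaller U would be compared with a tabulated critical value.
print(mann_whitney_u([7, 9, 12], [1, 4, 6, 8]))  # (11.0, 1.0)
```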

Distributions of both parent populations do NOT have same shape and spread

No valid process in AH Statistics course

Compare population means

Distributions of both parent populations are NOT normal

Parent population variances known

Sample size > 20

Use central limit theorem to obtain approximate normal distribution of sample means

z-test for a difference in population means

Sample size < 20

No valid process in AH Statistics course

Parent population variances NOT known

Sample size > 20

Parent population variances assumed to be equal

Estimate population variance from pooled sample variance

Use central limit theorem to obtain approximate normal distributions of sample means

t-test for a difference in population means

Parent population variances assumed to be NOT equal

Both sample sizes large (i.e. both > 20)

Estimate population variances from sample variances

Use central limit theorem to obtain approximate normal distributions of sample means

z-test for a difference in population means

Either sample size small

No valid process in AH Statistics course

Sample size < 20

No valid process in AH Statistics course

Distributions of both parent populations are normal

Parent population variances both NOT known

Parent population variances assumed to be equal

Estimate population variance from pooled sample variance

t-test for a difference in population means
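A minimal sketch of the pooled-variance step, assuming two independent samples and equal population variances. The function name `pooled_two_sample_t` is hypothetical; the returned degrees of freedom are n₁ + n₂ − 2, and the critical value would come from tables.

```python
from math import sqrt
from statistics import mean, variance


def pooled_two_sample_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for H0: mu_a = mu_b."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    # Pooled estimate of the common variance, weighted by degrees of freedom.
    pooled_variance = ((n_a - 1) * variance(sample_a)
                       + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    t = (mean(sample_a) - mean(sample_b)) / standard_error
    return t, df


# Made-up data.
print(pooled_two_sample_t([10, 12, 14, 16, 18], [9, 11, 13, 15]))
```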

Parent population variances assumed to be NOT equal

Both sample sizes large (i.e. both > 20)

Estimate population variances from sample variances

z-test for a difference in population means

Either sample size small

No valid process in AH Statistics course

Parent population variances both known

z-test for a difference in population means
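With both population variances known, the test statistic can be sketched as below. The function name and the summary-statistic arguments are illustrative assumptions; the null hypothesis is taken to be equal population means.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean_a, mean_b, var_a, var_b, n_a, n_b):
    """Return (z statistic, two-sided p-value) for H0: mu_a = mu_b."""
    standard_error = sqrt(var_a / n_a + var_b / n_b)
    z = (mean_a - mean_b) / standard_error
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Made-up summary statistics: known variances 9 and 16.
print(two_sample_z(14, 12, 9, 16, 5, 4))
```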

Compare population proportions

Model with binomial distribution

Check criteria for approximating binomial with normal

z-test for a difference in population proportions
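The two-proportion test might be sketched as follows. It assumes the usual pooled estimate of the common proportion under the null hypothesis of equal proportions, and it applies the np > 5 and nq > 5 criterion (as stated for the one-sample case) to each sample using that pooled estimate; both choices are assumptions, as is the function name.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(successes_a, n_a, successes_b, n_b):
    """Return (z statistic, two-sided p-value) for H0: p_a = p_b."""
    pooled = (successes_a + successes_b) / (n_a + n_b)
    # Criterion for approximating each binomial with a normal distribution.
    for n in (n_a, n_b):
        if not (n * pooled > 5 and n * (1 - pooled) > 5):
            raise ValueError("normal approximation criteria not met")
    standard_error = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (successes_a / n_a - successes_b / n_b) / standard_error
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_two_sided


# Made-up counts: 45 of 100 versus 30 of 100.
print(two_proportion_z(45, 100, 30, 100))
```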

Establish a relationship between two variables

Establish if an association exists between two categorical variables, using frequencies

Chi-squared test for association in a contingency table.

Check expected frequencies: at least 80% ≥5 and none <1, else combine rows or columns and start again.

Use degrees of freedom = (rows − 1) × (columns − 1).
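The contingency-table calculation can be sketched as below: expected frequencies are row total × column total ÷ grand total, and the statistic sums (O − E)²/E over all cells. The function name is hypothetical, no continuity correction is applied, and the expected-frequency check is left to the caller, who can inspect the returned table.

```python
def chi_squared_association(table):
    """Return (statistic, degrees of freedom, expected table) for a
    contingency table given as a list of rows of observed frequencies."""
    row_totals = [sum(row) for row in table]
    column_totals = [sum(column) for column in zip(*table)]
    grand_total = sum(row_totals)
    expected = [[r * c / grand_total for c in column_totals]
                for r in row_totals]
    statistic = sum((o - e) ** 2 / e
                    for observed_row, expected_row in zip(table, expected)
                    for o, e in zip(observed_row, expected_row))
    df = (len(row_totals) - 1) * (len(column_totals) - 1)
    return statistic, df, expected


# Made-up 2 x 2 table.
print(chi_squared_association([[20, 30], [30, 20]]))
# (4.0, 1, [[25.0, 25.0], [25.0, 25.0]])
```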

Compare paired continuous variables

Linear correlation and regression analysis

View a scatterplot to establish if there is a plausible linear correlation

Conduct hypothesis test on correlation coefficient, ρ

Calculate least squares regression equation

Conduct hypothesis test on slope, β (if required)

Construct a residual plot for analysis

Construct prediction/confidence intervals
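The regression steps above can be sketched with the usual sums of squares. The function names are hypothetical. The test statistic shown for the correlation coefficient, t = r√(n − 2)/√(1 − r²) with n − 2 degrees of freedom, is the standard one for H₀: ρ = 0; the residuals are returned so that they can be plotted against x.

```python
from math import sqrt
from statistics import mean


def least_squares(x, y):
    """Return (slope, intercept, r, residuals) for the line y = a + b x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    r = s_xy / sqrt(s_xx * s_yy)  # product moment correlation coefficient
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    return slope, intercept, r, residuals


def correlation_t_statistic(r, n):
    """Return (t statistic, degrees of freedom) for H0: rho = 0."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2), n - 2


# Made-up data.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
slope, intercept, r, residuals = least_squares(x, y)
print(slope, intercept, r)
print(correlation_t_statistic(r, len(x)))
```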