AP Statistics Ch 9: Sampling Distributions
Things to know before starting this chapter...
- if an event can't be repeated, then statistical inference can't be made about it; data analysis, however, can be done on any data
 
Part 1: Sampling Distributions
- vocab and symbols
 - parameter: a number that describes the population; usually unknown in statistics, since finding it would require looking at the entire population
 - statistic: a number that describes a sample; we don't need to know the parameter to calculate it, and we can use it to estimate the parameter
 - mean of the population is symbolized by the Greek letter mu; it is a parameter, and since it is usually unknown, we estimate it with the mean of a sample
 - mean of a sample is symbolized by x-bar (an x with a line over the top); it is a statistic
 - sampling variability: the fact that statistics will vary with different samples
 - population proportion symbolized with p
 - sample proportion symbolized with p-hat
 - sampling distribution: the distribution of values a statistic takes in all possible same-sized samples from the population
 - forms an ideal pattern
 - more accurate than a distribution of the statistic from only a limited number of trials
 - how to describe sampling distributions
 - can be described like any other distributions
 - describe shape, center, spread, and outliers
 - how a sampling distribution looks depends on how the random sampling was done; bad sampling = bad (inaccurate) results
 - as always, more repetitions/individuals = more accuracy >>> predictable pattern and behavior >>> the sampling distribution takes a more definite shape; few repetitions/individuals = very inaccurate
 - bias of statistic
 - using sampling distributions makes us able to tell if conclusion trustworthy based on the event’s usual sampling distribution
 - when we say “bias”, we are talking about bias as that of a statistic, not that of a sampling method
 - unbiased if mean of sampling distribution = true value of parameter
 - statistic often called unbiased estimator of parameter; will sometimes be above or below true value, but is still centered at true value (no systematic tendency to make these slight errors)
 - using the statistic as an unbiased estimator means we can say p is around p-hat and mu is around x-bar
 - sample size does not affect bias: the sampling distribution stays centered at the true value no matter what n is
 - as long as sample distribution is centered at true value (mean) of population, then it is considered unbiased
 - high bias = data's center and data points not on or near the true value of the parameter; low bias = data's center and points on or near the true value
 - variability of statistic
 - less variability in large samples than in small samples
 - spread does not depend on size of population as long as population is at least 10 times as large as sample
 - only depends on sampling design and size
 - high variability = data points are all over the place; low variability = data points are close together
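To make bias and variability concrete, here is a small simulation sketch (not from the class; the population with mean 50 and SD 10 is made up). It draws many same-sized SRSs and shows that the sample mean is centered at the true mean either way, but is less variable for larger samples:

```python
import random
import statistics

random.seed(42)

# Hypothetical population: 10,000 values with a known mean
population = [random.gauss(50, 10) for _ in range(10_000)]
true_mean = statistics.mean(population)

def sampling_distribution(n, trials=2_000):
    """Draw many same-sized SRSs and record each sample's mean (the statistic)."""
    return [statistics.mean(random.sample(population, n)) for _ in range(trials)]

small = sampling_distribution(n=5)
large = sampling_distribution(n=50)

# Unbiased: both sampling distributions are centered near the true mean...
print(round(true_mean, 1), round(statistics.mean(small), 1), round(statistics.mean(large), 1))
# ...but the statistic from larger samples has less variability
print(statistics.stdev(large) < statistics.stdev(small))  # True
```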
 
Part 2: Sample proportions
- usually appears in questions that involve categorical variables
 - when you express a proportion as a statistic, express it as a decimal
 - p-hat = number of successes in sample / size of sample = X / n
 - the mean of the sampling distribution of p-hat is always equal to p; p-hat is an unbiased estimator of p
 - how well p-hat estimates p depends on sampling distribution of p-hat
 - describing sampling distribution of p-hat
 - X and p-hat vary from sample to sample, so they are considered random variables
 - mean: the same as p
 - standard deviation: sqrt(p(1 - p) / n)
 - you can't use this formula if the sample is a large part of the population; only use it when the population is at least 10x as large as the sample
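The formulas above can be sketched in code; the numbers here (p = 0.35, n = 100, population of 50,000) are made up for illustration:

```python
import math

def phat_sd(p, n, population_size):
    """Standard deviation of p-hat: sqrt(p(1-p)/n).
    Only valid when the population is at least 10x the sample (10% condition)."""
    if population_size < 10 * n:
        raise ValueError("10% condition not met: population must be at least 10n")
    return math.sqrt(p * (1 - p) / n)

# Hypothetical example: p = 0.35, SRS of n = 100 from a population of 50,000
sd = phat_sd(0.35, 100, 50_000)
print(round(sd, 4))  # 0.0477
```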
 - normal approximation and p-hat
 - the sampling distribution of p-hat is approximately Normal, and the larger the sample is, the more accurate the Normal approximation
 - the Normal approximation is most accurate when p is close to 0.5, and least accurate when p is close to 0 or 1
 - only use the Normal approximation if np is greater than or equal to 10 and n(1 - p) is greater than or equal to 10
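The two conditions above are easy to check in code; a one-function sketch (the example values are made up):

```python
def normal_ok(p, n):
    """Normal approximation for p-hat is reasonable when np >= 10 and n(1-p) >= 10."""
    return n * p >= 10 and n * (1 - p) >= 10

print(normal_ok(0.5, 40))    # True  (np = 20, n(1-p) = 20)
print(normal_ok(0.05, 100))  # False (np = 5 < 10)
```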
 - how likely is getting a sample in which p-hat is close to p or any other value?
 - we are most likely going to work with a normal curve; if the normal curve works... work with z-scores and table A
 - first, find the value(s) of p-hat you are interested in, and then find their z-scores
 - using Table A, find the correct percentages; if you are looking for the percentage between two values, subtract the Table A percentages to get the area between them
 - for more information about working with z-scores, table A, and normal curves, look back to Ch 2 notes
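Instead of looking areas up in Table A, Python's `statistics.NormalDist` gives the same standard Normal areas; a sketch of the steps above with made-up numbers (p = 0.6, n = 150):

```python
import math
from statistics import NormalDist

# Hypothetical question: p = 0.6, n = 150; how likely is p-hat between 0.55 and 0.65?
p, n = 0.6, 150
mean = p
sd = math.sqrt(p * (1 - p) / n)  # assumes the 10% and Normal conditions hold

# z-scores for the endpoints, then the Table A areas via the Normal CDF
z_lo = (0.55 - mean) / sd
z_hi = (0.65 - mean) / sd
prob = NormalDist().cdf(z_hi) - NormalDist().cdf(z_lo)  # subtract to get the area between
print(round(prob, 3))  # 0.789
```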
 
Part 3: Sample Means
- very common
 - distributions of means are less variable and more Normal than distributions of individual observations
 - sampling distribution of means (x-bar): the distribution of the means of all possible same-sized samples; the samples still belong to the population you are interested in
 - if x-bar is the mean of an SRS of size n from a large population with mean mu and standard deviation sigma...
 - mean of the distribution of x-bar = mean of the population (mu)
 - standard deviation = (standard deviation of population) / sqrt(sample size)
 - = sigma / sqrt(n)
 - no matter what shape, size, etc. population distribution is...
 - like p-hat, x-bar is an unbiased estimator, this time, of mu
 - only use the standard deviation equation when the population is at least 10 times as large as the sample
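A quick sketch of the sigma / sqrt(n) rule (the numbers sigma = 12, n = 36 are made up for illustration):

```python
import math

def xbar_sd(sigma, n):
    """Standard deviation of the sampling distribution of x-bar: sigma / sqrt(n).
    Only valid when the population is at least 10x the sample size."""
    return sigma / math.sqrt(n)

# Hypothetical example: population sigma = 12, SRS of n = 36
print(xbar_sd(12, 36))  # 2.0
```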
 - shape of distribution of x-bar
 - depends on shape of population distribution
 - is exactly normal if the population distribution is exactly normal
 - central limit theorem
 - no matter what shape the population distribution has, if it has mean mu and standard deviation sigma, then as the sample size n gets larger, the sampling distribution of x-bar gets closer to a Normal distribution; this is summarized as N(mu, sigma / sqrt(n))
 - the less the population distribution looks like a Normal distribution, the larger the n needs to be to make distribution of x-bar look like a Normal distribution
 - mean doesn’t change as sample gets larger, but the standard deviation does get smaller and the curve does get more Normal
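The central limit theorem can be watched in a simulation sketch (illustrative only): start from a strongly right-skewed population (exponential with mean 1, nothing like a Normal curve) and compare sample means for small vs. large n:

```python
import random
import statistics

random.seed(1)

# Strongly right-skewed population: exponential with mean 1
def sample_mean(n):
    return statistics.mean(random.expovariate(1.0) for _ in range(n))

means_small = [sample_mean(2) for _ in range(3_000)]
means_large = [sample_mean(60) for _ in range(3_000)]

# The center stays near the population mean (1) for both sample sizes...
print(round(statistics.mean(means_small), 2), round(statistics.mean(means_large), 2))
# ...but the spread shrinks like sigma / sqrt(n), and the shape grows more Normal
print(statistics.stdev(means_large) < statistics.stdev(means_small))  # True
```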
 