Tuesday, January 28, 2014

Yay, finally! Chapter 9! I hope this website has been helpful so far. I know making it has been helpful for me for finals, and it will probably be helpful for the AP Exam as well. However, I want it to be helpful for you as well. If you have any questions, leave a comment and I will try to answer to the best of my ability. It has been a while since I said this, so I will say it again. I do not own any of the pictures (unless otherwise stated) you see here. I also do not own any of the information found in the websites posted in these notes. The pictures are ones that I thought would be helpful for showing you what is going on, and the websites are those I found helpful for understanding the material. I am posting them here because I hope you find them helpful as well. Happy studying!

AP Statistics Ch 9: Sampling Distributions

Things to know before starting this chapter...
  • if an event can’t be repeated, then statistical inference can’t be made about it; data analysis can be done on any data, though

Part 1: Sampling Distributions

  • vocab and symbols
    • parameter: number that describes a population; usually unknown, since finding it would require looking at the entire population
    • statistic: number that describes a sample; we don’t need to know the parameter to calculate it, and we can use it to estimate the parameter
    • mean of population symbolized by Greek letter mu; this is a parameter; usually unknown, so we estimate it with the mean of a sample
    • mean of sample symbolized by x-bar (x with a line over the top); this is a statistic
    • sampling variability: the fact that the value of a statistic will vary from sample to sample
    • population proportion symbolized with p
    • sample proportion symbolized with p-hat
    • sampling distribution: distribution of the values a statistic takes in all possible samples of the same size from the same population
      • shows the ideal pattern that would emerge if we could look at every possible sample
      • more accurate than the distribution of the statistic from only a limited number of samples
  • how to describe sampling distributions
    • can be described like any other distributions
    • describe shape, center, spread, and outliers
    • how a distribution built from actual samples looks depends on how the random sampling is done; bad sampling = bad (inaccurate) results
    • as always, more repetitions/individuals = more accurate >>> predictable pattern and behavior >>> sampling distributions have a more definite shape; few repetitions/individuals = very inaccurate
  • bias of statistic
    • using sampling distributions lets us tell if a conclusion is trustworthy, based on how the statistic usually behaves
    • when we say “bias”, we mean the bias of a statistic, not that of a sampling method
    • unbiased if mean of sampling distribution = true value of parameter
      • statistic is then called an unbiased estimator of the parameter; will sometimes be above or below true value, but is still centered at true value (no systematic tendency to overestimate or underestimate)
    • using the statistic as an unbiased estimator means we can say p is around p-hat and mu is around x-bar
    • sample size does not affect bias: the sampling distribution of p-hat is centered at p no matter how large the sample is
    • as long as sampling distribution is centered at true value of the parameter, the statistic is considered unbiased
    • high bias = center of sampling distribution is not on or near the true value of parameter, low bias = center is on or near the true value of parameter
  • variability of statistic
    • less variability in large samples than in small samples
    • spread does not depend on size of population as long as population is at least 10 times as large as sample
    • only depends on sampling design and sample size
    • high variability = values of the statistic are all over the place, low variability = values are close together
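
To see bias and variability in action, here is a small simulation sketch. It is only an approximation: a true sampling distribution uses every possible sample, while this draws 2000 random samples. The population proportion of 0.3, the sample sizes, and the function name simulate_p_hat are all made up for illustration.

```python
import random
import statistics

def simulate_p_hat(p, n, num_samples, seed=0):
    """Draw num_samples samples of size n; return the p-hat from each one."""
    rng = random.Random(seed)
    p_hats = []
    for _ in range(num_samples):
        # Each individual is a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        p_hats.append(successes / n)
    return p_hats

TRUE_P = 0.3  # assumed population proportion, chosen only for illustration

small = simulate_p_hat(TRUE_P, n=25, num_samples=2000)
large = simulate_p_hat(TRUE_P, n=400, num_samples=2000, seed=1)

for label, p_hats in (("n = 25", small), ("n = 400", large)):
    print(label,
          "mean of p-hat:", round(statistics.mean(p_hats), 3),
          "SD of p-hat:", round(statistics.stdev(p_hats), 3))
```

Both means should land close to 0.3 (low bias at either sample size), while the spread for n = 400 should be much smaller than for n = 25 (lower variability with larger samples).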

Part 2: Sample proportions

  • usually appears in questions that involve categorical variables
  • when you express a sample proportion, express it as a decimal
  • p-hat = number of successes in sample / size of sample = X / n
    • p-hat is not always equal to p, but it is an unbiased estimator of p (its sampling distribution is centered at p)
  • how well p-hat estimates p depends on sampling distribution of p-hat
  • describing sampling distribution of p-hat
    • X and p-hat vary from sample to sample, so both are random variables
    • mean: equal to p
    • standard deviation: sqrt(p(1-p)/n)
      • you can’t use this formula if the sample is a large part of the population; only use it when population is at least 10x as large as sample
  • normal approximation and p-hat
    • sampling distribution of p-hat is approximately Normal, and the larger the sample is, the more accurate the Normal approximation is
    • for a fixed n, Normal approximation is most accurate when p is close to 0.5, and least accurate when p is close to 0 or 1
    • only use Normal approximation if np is at least 10 and n(1-p) is at least 10
  • how likely is getting a sample in which p-hat is close to p or any other value?
    • we are most likely going to work with a Normal curve; if the Normal approximation conditions are met, work with z-scores and table A
    • first, identify the value(s) of p-hat you are interested in, and then find their z-scores: z = (p-hat - p) / (standard deviation of p-hat)
    • using table A, find the corresponding percentages; if you are looking for the percentage between two values, subtract the smaller table percentage from the larger one
    • for more information about working with z-scores, table A, and normal curves, look back to Ch 2 notes
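
The steps above can be sketched in code, with Python's statistics.NormalDist standing in for table A. The function name prob_p_hat_between and the example numbers (p = 0.5, n = 100) are made up for illustration, and the sketch assumes the population is at least 10 times as large as the sample.

```python
from math import sqrt
from statistics import NormalDist

def prob_p_hat_between(p, n, low, high):
    """Approximate P(low <= p-hat <= high) using the Normal approximation."""
    # Only use the Normal approximation when np >= 10 and n(1-p) >= 10.
    if n * p < 10 or n * (1 - p) < 10:
        raise ValueError("need np >= 10 and n(1-p) >= 10")
    sd = sqrt(p * (1 - p) / n)      # standard deviation of p-hat
    z_low = (low - p) / sd          # z-score of the lower value
    z_high = (high - p) / sd        # z-score of the upper value
    standard_normal = NormalDist()  # mean 0, SD 1 (plays the role of table A)
    # Subtract the two cumulative areas to get the area in between.
    return standard_normal.cdf(z_high) - standard_normal.cdf(z_low)

# Hypothetical example: p = 0.5, n = 100, so SD = 0.05 and the z-scores are -1 and 1.
prob = prob_p_hat_between(p=0.5, n=100, low=0.45, high=0.55)
print(round(prob, 4))  # about 0.6827, matching the 68% part of the 68-95-99.7 rule
```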

Part 3: Sample Means

  • very common
  • distributions of means are less variable and more Normal than distributions of individual observations
  • sampling distribution of means (x-bar): distribution of the values of x-bar from all possible samples of the same size from the population you are interested in
  • if x-bar is the mean of an SRS of size n from a large population that has mean mu and standard deviation sigma...
    • mean of distribution of x-bar = mean of population (mu)
    • standard deviation of distribution of x-bar = (standard deviation of population) / (sqrt(size of sample))
      • = sigma / sqrt(n)
  • no matter what shape, size, etc. population distribution is...
    • like p-hat, x-bar is an unbiased estimator, this time, of mu
    • only use standard deviation equation when population is at least 10 times as large as sample
  • shape of distribution of x-bar
    • depends on shape of population distribution
    • is exactly normal if the population distribution is exactly normal
  • central limit theorem
    • no matter what shape the population distribution has, as long as it has a finite standard deviation sigma, the sampling distribution of x-bar gets closer to a Normal distribution as the sample size n gets larger; this can be summarized as approximately N(mu, sigma / sqrt(n))
    • the less the population distribution looks like a Normal distribution, the larger the n needs to be to make distribution of x-bar look like a Normal distribution
    • mean doesn’t change as sample gets larger, but the standard deviation does get smaller and the curve does get more Normal
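
Here is a rough simulation sketch of the central limit theorem. The population is my own choice for illustration: an exponential distribution (strongly right-skewed) with mu = 1 and sigma = 1. The sample sizes, the 3000 repetitions, and the helper names sample_means and skewness are also assumptions, not anything from the textbook.

```python
import random
import statistics
from math import sqrt

MU = 1.0     # mean of the assumed exponential population
SIGMA = 1.0  # standard deviation of the assumed exponential population

def sample_means(n, num_samples, rng):
    """Return num_samples values of x-bar, each from a random sample of size n."""
    return [
        sum(rng.expovariate(1.0) for _ in range(n)) / n
        for _ in range(num_samples)
    ]

def skewness(values):
    """Average cubed z-score; values near 0 suggest a symmetric shape."""
    m = statistics.mean(values)
    s = statistics.pstdev(values)
    return sum(((v - m) / s) ** 3 for v in values) / len(values)

rng = random.Random(42)
results = {}
for n in (2, 10, 50):
    means = sample_means(n, 3000, rng)
    results[n] = (statistics.mean(means), statistics.stdev(means), skewness(means))
    center, spread, skew = results[n]
    print(f"n = {n:2d}: mean of x-bar = {center:.3f}, "
          f"SD of x-bar = {spread:.3f} (sigma/sqrt(n) = {SIGMA / sqrt(n):.3f}), "
          f"skewness = {skew:.2f}")
```

The mean of x-bar should stay near mu for every n, the SD should track sigma / sqrt(n), and the skewness should shrink toward 0 as n grows, which is the curve getting more Normal.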

Websites I found helpful

