1. Foundations for Inference & Introduction to JASP
- Statistical inference is primarily concerned with understanding and quantifying the uncertainty of parameter estimates.
- A confidence interval is a range of plausible values within which the true population parameter is likely to lie.
Foundations for Inference [1, 3, 4]
- Point Estimate: A single value that best approximates the population parameter.
Uncertainty in point estimates
- Sampling Error:
  - The natural variability that we expect between a random sample's estimate and the population parameter, and from one random sample to the next.
  - It results from the randomness of the sampling process.
  - It is quantified by the standard error, the most commonly used measure of this uncertainty.
- Bias:
  - A systematic tendency to under- or overestimate the population parameter, so the point estimate is consistently too high or too low.
  - It is especially important to guard against during the data collection phase.
Sampling Distribution
- If we take many samples of the same size from the same population, the point estimates will vary from sample to sample.
- If we plot the distribution of point estimates, we get the sampling distribution.
- Formally, it is the distribution of point estimates based on samples of a fixed size from a certain population.
- When the observations are independent and the sample is large enough, it resembles a normal distribution (bell-shaped, symmetric) centered at the true population parameter.
- The mean of the sampling distribution is the population parameter (for an unbiased estimate).
- The standard deviation of the sampling distribution is the standard error.
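The idea of a sampling distribution can be checked with a small simulation. The sketch below is only an illustration under assumed values: the population proportion `p_true`, the sample size `n`, and the helper name `simulate_sample_proportions` are all hypothetical and do not come from the course material. It draws many samples of the same size, records each sample proportion, and compares the spread of those estimates with the proportion formula \(\sqrt{p(1-p)/n}\).

```python
import random
import statistics


def simulate_sample_proportions(p, n, num_samples, seed=0):
    """Draw num_samples random samples of size n and return each sample proportion."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    proportions = []
    for _ in range(num_samples):
        # Each observation is a "success" with probability p.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    return proportions


# Hypothetical population proportion and sample size (assumptions for illustration).
p_true = 0.3
n = 200

estimates = simulate_sample_proportions(p_true, n, num_samples=5000)

center = statistics.mean(estimates)    # should sit close to p_true
spread = statistics.stdev(estimates)   # empirical standard error
theoretical_se = (p_true * (1 - p_true) / n) ** 0.5

print(f"mean of point estimates:      {center:.4f}")
print(f"std. dev. of point estimates: {spread:.4f}")
print(f"theoretical standard error:   {theoretical_se:.4f}")
```

With these assumed values, the mean of the estimates should land near `p_true` and the two spread figures should roughly agree, which is what "the standard deviation of the sampling distribution is the standard error" means in practice.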
Central Limit Theorem
- When observations are independent and the sample size is sufficiently large, the sampling distribution of the sample proportion \(\hat{p}\) is approximately normal, with a mean equal to the population proportion, \(\mu_{\hat{p}} = p\), and a standard error computed as \(SE_{\hat{p}} = \sqrt{\frac{p(1-p)}{n}}\).
- For the CLT to hold, two conditions must be met:
  - The independence of observations.
  - The success-failure condition (at least 10 expected successes and 10 expected failures):
    - \(np \geq 10\).
    - \(n(1-p) \geq 10\).
- The problem then becomes a normal distribution problem:
  - The mean of the sampling distribution is the population proportion \(p\).
  - The standard error is the standard deviation of the sampling distribution.
  - We find the probability that \(\hat{p}\) falls in a given range by computing z-scores and looking them up in the normal distribution table.
  - \(z_{1} = \frac{\hat{p}_{min} - p}{SE_{\hat{p}}}\), \(z_{2} = \frac{\hat{p}_{max} - p}{SE_{\hat{p}}}\).
  - It is hard to find \(z_{1}\) and \(z_{2}\) directly, as doing so would require drawing many more samples.
- Instead, we use a confidence interval to estimate the range of values where the true population parameter is likely to lie.
  - With a 95% confidence level, \(z_{1} = -1.96\) and \(z_{2} = 1.96\).
  - With a 99% confidence level, \(z_{1} = -2.58\) and \(z_{2} = 2.58\).
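The steps above (check the conditions, compute the standard error, convert the bounds to z-scores, read off the normal probability) can be sketched in Python. This is a minimal illustration, not part of the course material: the function names and the values `p = 0.5` and `n = 400` are assumptions, the success-failure check uses the "at least 10" form, and the standard library's `statistics.NormalDist` stands in for the printed normal table.

```python
from math import sqrt
from statistics import NormalDist


def success_failure_ok(p, n, threshold=10):
    """Success-failure condition: at least `threshold` expected successes and failures."""
    return n * p >= threshold and n * (1 - p) >= threshold


def standard_error(p, n):
    """Standard error of a sample proportion, sqrt(p(1-p)/n)."""
    return sqrt(p * (1 - p) / n)


def prob_p_hat_between(p, n, p_min, p_max):
    """Normal-model probability that the sample proportion falls in [p_min, p_max]."""
    if not success_failure_ok(p, n):
        raise ValueError("success-failure condition not met; the normal model is unreliable")
    se = standard_error(p, n)
    z1 = (p_min - p) / se  # z-score of the lower bound
    z2 = (p_max - p) / se  # z-score of the upper bound
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(z2) - std_normal.cdf(z1)


# Hypothetical population proportion and sample size (assumptions for illustration).
p, n = 0.5, 400
se = standard_error(p, n)

# Probability that p-hat lands within 1.96 standard errors of p.
prob = prob_p_hat_between(p, n, p - 1.96 * se, p + 1.96 * se)
print(f"P(p - 1.96 SE <= p_hat <= p + 1.96 SE) = {prob:.3f}")

# Recover the z-values for 95% and 99% from the normal distribution itself.
z_95 = NormalDist().inv_cdf(0.975)
z_99 = NormalDist().inv_cdf(0.995)
print(f"z for 95%: {z_95:.3f}, z for 99%: {z_99:.3f}")
```

The inverse-CDF calls show where the values 1.96 and 2.58 come from: they are the points that leave 2.5% and 0.5% of the standard normal distribution in each tail.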
The Plug-in Principle
- If we have a point estimate from a sample, and we have confirmed that the Central Limit Theorem holds so that the sampling distribution is approximately normal, we can use the plug-in principle: plug the sample point estimate (the sample statistic, e.g. \(\hat{p}\)) in place of the unknown population parameter (e.g. \(p\)), for instance when computing the standard error.
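As a rough sketch of the plug-in principle for a proportion, the snippet below substitutes \(\hat{p}\) for \(p\) both in the success-failure check and in the standard error formula. The function name `plug_in_standard_error` and the counts (440 successes out of 1000) are hypothetical values chosen for illustration only.

```python
from math import sqrt


def plug_in_standard_error(successes, n, threshold=10):
    """Estimate the standard error of a proportion using p-hat in place of p."""
    p_hat = successes / n
    # Plug-in version of the success-failure condition:
    # n * p_hat = successes and n * (1 - p_hat) = failures.
    failures = n - successes
    if successes < threshold or failures < threshold:
        raise ValueError("too few successes or failures for the normal approximation")
    se = sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, se


# Hypothetical survey result (an assumption, not data from the notes).
p_hat, se = plug_in_standard_error(successes=440, n=1000)
print(f"p_hat = {p_hat:.3f}, estimated SE = {se:.4f}")
```

Because the true \(p\) is unknown in practice, this estimated standard error is what actually gets used when a confidence interval is built from a single sample.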
Confidence Intervals
- The problem is that the point estimate (from a sample) may not exactly equal the population parameter.
- Instead of providing a single point estimate, we provide a range of values where the true population parameter is likely to be.
- Constructing a 95% confidence interval:
  - In a normal distribution, 95% of the observations fall within 1.96 standard deviations of the mean (distribution center).
  - Thus, if a point estimate can be modeled using a normal distribution, we can construct a plausible range with 95% confidence as \([\hat{p} - 1.96\times{SE}, \hat{p} + 1.96\times{SE}]\).
- Interpreting the confidence level:
  - We are 95% confident that the true population parameter lies within the interval.
  - For example, if [0.45, 0.55] is the 95% confidence interval for the proportion of people supporting solar panels, we can say:
    - We are 95% confident that the actual percentage of the public supporting solar panels is between 45% and 55%.
- Common confidence levels:
  - 90% confidence level: \(z = 1.645\) and the interval is \(\hat{p} \pm 1.645\times{SE}\).
  - 95% confidence level: \(z = 1.96\) and the interval is \(\hat{p} \pm 1.96\times{SE}\).
  - 99% confidence level: \(z = 2.58\) and the interval is \(\hat{p} \pm 2.58\times{SE}\).
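- Confidence intervals say nothing about individual observations.
- Confidence intervals say nothing about future samples.
- The confidence level is NOT the probability that the true parameter lies within a particular computed interval. It describes how often intervals built this way capture the parameter across repeated samples.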
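The construction and the interpretation above can both be illustrated with a short simulation. This is a sketch under stated assumptions: `proportion_ci` and `coverage` are hypothetical helper names, the sample size of 384 is an assumed value chosen only so that the interval comes out close to the solar panel example, and the simulated population proportion of 0.5 is likewise made up.

```python
import random
from math import sqrt
from statistics import NormalDist


def proportion_ci(p_hat, n, level=0.95):
    """Normal-approximation confidence interval: p_hat +/- z* x SE."""
    z_star = NormalDist().inv_cdf(0.5 + level / 2)  # e.g. about 1.96 for level=0.95
    se = sqrt(p_hat * (1 - p_hat) / n)              # plug-in standard error
    return p_hat - z_star * se, p_hat + z_star * se


def coverage(p_true, n, level, num_samples, seed=0):
    """Fraction of simulated samples whose interval contains p_true."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_samples):
        successes = sum(1 for _ in range(n) if rng.random() < p_true)
        low, high = proportion_ci(successes / n, n, level)
        if low <= p_true <= high:
            hits += 1
    return hits / num_samples


# Hypothetical sample: p_hat = 0.5 from an assumed n = 384 respondents.
low, high = proportion_ci(p_hat=0.5, n=384, level=0.95)
print(f"95% CI: [{low:.3f}, {high:.3f}]")

# Repeat the sampling many times and count how often the interval captures p_true.
observed = coverage(p_true=0.5, n=400, level=0.95, num_samples=2000)
print(f"share of intervals containing the true proportion: {observed:.3f}")
```

The second printout is the point of the exercise: across many repeated samples, roughly 95% of the intervals contain the true proportion, while any single interval either contains it or does not.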
Introduction to JASP [2, 5]
Descriptive Statistics
- Descriptive statistics and related plots are a succinct way of describing and summarising data, but they do not test any hypotheses. They include:
  - Measures of central tendency.
  - Measures of dispersion.
  - Percentile values.
  - Measures of distribution.
  - Descriptive plots.
- Central Tendency:
  - The tendency for variable values to cluster around a central value.
  - It is measured by the mean, median, and mode.
  - Mean: M or \(\bar{x}\). The average: the sum of all values divided by the number of values. It is sensitive to outliers.
  - Median: Mdn. The middle value when all values are ordered. It is less sensitive to outliers.
  - Mode: The most frequent value.
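JASP reports these measures through its Descriptives menu, but the same quantities can be computed by hand to see how they behave. The sketch below uses Python's standard `statistics` module on a made-up list of scores (the data are an assumption for illustration) and then adds one extreme value to show how the mean reacts compared with the median and mode.

```python
import statistics

# Hypothetical scores (made-up data for illustration).
scores = [2, 3, 3, 4, 5, 5, 5, 6, 7]

mean_score = statistics.mean(scores)      # sum of values / number of values
median_score = statistics.median(scores)  # middle value of the ordered data
mode_score = statistics.mode(scores)      # most frequent value

print(f"mean = {mean_score:.2f}, median = {median_score}, mode = {mode_score}")

# Add a single extreme value and recompute.
with_outlier = scores + [100]

mean_out = statistics.mean(with_outlier)
median_out = statistics.median(with_outlier)
mode_out = statistics.mode(with_outlier)

print(f"with outlier: mean = {mean_out:.2f}, median = {median_out}, mode = {mode_out}")
```

The single outlier pulls the mean far away from the bulk of the data, while the median and mode stay where they were, which is why the median is often preferred for skewed data.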
References
1. Diez, D. M., Barr, C. D., & Çetinkaya-Rundel, M. (2019). OpenIntro statistics - Fourth edition. Open Textbook Library. https://www.biostat.jhsph.edu/~iruczins/teaching/books/2019.openintro.statistics.pdf Read Chapter 5 - Foundations for Inference, pages 168-205: Section 5.1 - Point estimates and sampling variability; Section 5.2 - Confidence intervals for a proportion. Solve the following practice exercises as homework: Practice Exercises – Unit 1 https://my.uopeople.edu/pluginfile.php/1897551/mod_book/chapter/531355/Practice%20Excercises%20-%20%20Unit%201_Final.pdf
2. Goss-Sampson, M. A. (2022). Statistical analysis in JASP: A guide for students (5th ed., JASP v0.16.1 2022). https://jasp-stats.org/wp-content/uploads/2022/04/Statistical-Analysis-in-JASP-A-Students-Guide-v16.pdf Read pages 2-31.
3. OpenIntroOrg. (2019a, September 2). Foundations for inference: Point estimates [Video]. YouTube. https://youtu.be/oLW_uzkPZGA
4. OpenIntroOrg. (2019b, September 6). Intro to confidence intervals via proportions [Video]. YouTube. https://youtu.be/A6_W8qY8zJo
5. JASP Statistics. (2022, October 5). Introduction to JASP [Video]. YouTube. https://youtu.be/APRaBFC2lEQ