Statistical Inference

2025-04-10

Statistical Inference

  • Statistical Inference

  • Hypothesis Testing

  • Decision Making

  • Power Analysis

  • Confidence Intervals

  • Linear Regression Inference in R

  • Linear Regression Example

  • Logistic Regression Inference in R

  • Logistic Regression Example

What is Statistical Inference?

  • Drawing conclusions about a population based on a sample
  • Population = entire group
  • Sample = subset

Two Main Types of Inference

  1. Estimation
  2. Hypothesis Testing

Estimation

  • Point Estimate: Single best guess (e.g., \(\hat \beta_1\))
  • Interval Estimate: Range likely to contain the true value

Hypothesis Testing

  • \(H_0\): No effect or difference
  • \(H_1\): Some effect or difference
  • We use sample data to decide whether to reject \(H_0\)

Key Concepts and Tools

  • Sampling Distribution
  • Central Limit Theorem
  • Standard Error

p-values

  • Probability of observing data at least as extreme as ours, assuming \(H_0\) is true

Misinterpretation of p-values is common. Emphasize: a low p-value means the data are unusual under \(H_0\), not that \(H_0\) has a low probability of being true.
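
As a quick illustration, a two-sided p-value can be computed directly from a test statistic in R (a minimal sketch; the value of z below is made up):

Code
# Hypothetical observed test statistic on the standard normal scale
z <- 2.1

# Two-sided p-value: probability mass in both tails beyond |z|
p_value <- 2 * pnorm(-abs(z))
p_value  # about 0.036, so the data are unusual under H0 at alpha = 0.05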

Confidence Intervals

  • A range where we expect the true value to fall

Hypothesis Testing

Hypothesis Tests

Hypothesis tests are used to assess whether a claim is supported by data. This is done by collecting data and specifying a null hypothesis and an alternative hypothesis.

Null Hypothesis \(H_0\)

The null hypothesis is the claim that is initially assumed to be true. It typically states that the parameter is equal to the hypothesized value.

Alternative Hypothesis \(H_a\)

The alternative hypothesis contradicts the null hypothesis.

Example of Null and Alternative Hypothesis

We want to see if \(\beta\) is different from \(\beta^*\)

Null Hypothesis Alternative Hypothesis
\(H_0: \beta=\beta^*\) \(H_a: \beta\ne\beta^*\)
\(H_0: \beta\le\beta^*\) \(H_a: \beta>\beta^*\)
\(H_0: \beta\ge\beta^*\) \(H_a: \beta<\beta^*\)

One-Sided vs Two-Sided Hypothesis Tests

Notice that there are three types of null and alternative hypotheses. The first type (\(H_a:\beta\ne\beta^*\)) is considered a two-sided hypothesis because the rejection region is split across both tails of the distribution. The remaining two are considered one-sided because the rejection region is located in only one tail.

Null Hypothesis Alternative Hypothesis Side
\(H_0: \beta=\beta^*\) \(H_a: \beta\ne\beta^*\) Two-Sided
\(H_0: \beta\le\beta^*\) \(H_a: \beta>\beta^*\) One-Sided
\(H_0: \beta\ge\beta^*\) \(H_a: \beta<\beta^*\) One-Sided

Hypothesis Testing Steps

  1. State \(H_0\) and \(H_a\)
  2. Choose a significance level \(\alpha\)
  3. Compute a confidence interval/p-value
  4. Make a decision (see the sketch below)
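
A minimal sketch of these steps in R, using the built-in sleep data and a two-sample t-test (both chosen purely for illustration):

Code
# 1. State H0: no difference in mean extra sleep vs Ha: some difference
# 2. Choose a significance level
alpha <- 0.05

# 3. Compute the p-value and confidence interval
tt <- t.test(extra ~ group, data = sleep)
tt$p.value
tt$conf.int

# 4. Make a decision
if (tt$p.value < alpha) "Reject H0" else "Fail to reject H0"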

Rejection Region

Code
library(ggplot2)

alpha <- 0.05

# Critical values for a two-tailed test
z_critical <- qnorm(1 - alpha / 2)

# Create data for the standard normal curve
x <- seq(-4, 4, length = 1000)
y <- dnorm(x)

df <- data.frame(x = x, y = y)

# Shade the rejection regions in both tails
ggplot(df, aes(x = x, y = y)) +
  geom_line(color = "deepskyblue", linewidth = 1) +
  geom_area(data = subset(df, x <= -z_critical), aes(y = y), fill = "firebrick", alpha = 0.5) +
  geom_area(data = subset(df, x >= z_critical), aes(y = y), fill = "firebrick", alpha = 0.5) +
  geom_vline(xintercept = c(-z_critical, z_critical), linetype = "dashed", color = "black") +
  theme_bw()

Decision Making

A hypothesis test forces you to make one of two decisions: Reject \(H_0\) OR Fail to Reject \(H_0\)

Reject \(H_0\): The effect seen is not due to random chance; an underlying process is contributing to the effect.

Fail to Reject \(H_0\): The effect seen is consistent with random chance. Random sampling, not an underlying process, could explain the observed effect.

Decision Making: P-Value

The p-value approach is one of the most common ways to report significant results. The p-value is easy to interpret: it is the probability of observing our test statistic, or something more extreme, given that the null hypothesis is true.

If \(p < \alpha\), then you reject \(H_0\); otherwise, you will fail to reject \(H_0\).

Decision Making: Confidence Interval Approach

The confidence interval approach can evaluate a two-sided hypothesis test, where the alternative hypothesis is \(\beta\ne\beta^*\). Constructing the interval (for example, via bootstrapping) yields a lower and an upper bound, denoted \((LB, UB)\).

If \(\beta^*\) is in \((LB, UB)\), then you fail to reject \(H_0\). If \(\beta^*\) is not in \((LB,UB)\), then you reject \(H_0\).
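
A minimal sketch of this decision rule in R, using a simple regression on the built-in mtcars data and \(\beta^* = 0\) (both chosen for illustration):

Code
m <- lm(mpg ~ wt, data = mtcars)

# (LB, UB) for the slope on wt
ci <- confint(m, "wt", level = 0.95)

beta_star <- 0  # hypothesized value under H0
if (beta_star < ci[1] || beta_star > ci[2]) "Reject H0" else "Fail to reject H0"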

Significance Level \(\alpha\)

The significance level \(\alpha\) is the probability of rejecting the null hypothesis given that it is true.

In other words, \(\alpha\) is the error rate that a researcher controls.

Typically, we want this error rate to be small; \(\alpha = 0.05\) is a common choice.

Power Analysis

What is Statistical Power?

  • Statistical Power is the probability of correctly rejecting a false null hypothesis.
  • In other words, it’s the chance of detecting a real effect when it exists.

Why Power Matters

  • Low power → high risk of Type II Error (false negatives)
  • High power → better chance of finding true effects
  • Common threshold: 80% power

Errors in Inference

Term Definition Interpretation
Type I Reject \(H_0\) when it is true False positive
Type II Fail to reject \(H_0\) when it is false False negative
Power \(1 - P(\text{Type II})\) Detecting a true effect

Type I Error (False Positive)

  • Rejecting \(H_0\) when it is actually true
  • Probability = \(\alpha\) (significance level)

Type II Error (False Negative)

  • Failing to reject \(H_0\) when it is actually false
  • Probability = \(\beta\)
  • Power = \(1 - \beta\)

Balancing Errors

  • Lowering \(\alpha\) reduces Type I errors, but increases risk of Type II errors.
  • To reduce both:
    • Increase sample size
    • Use more appropriate statistical tests

What Affects Power?

  1. Effect Size
    • Bigger effects are easier to detect
  2. Sample Size (\(n\))
    • Larger samples reduce standard error
  3. Significance Level (\(\alpha\))
    • Higher \(\alpha\) increases power (but riskier!)
  4. Variability
    • Less noise in the data = better power (see the sketch below)
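
These factors can be explored with base R's power.t.test() (a minimal sketch; the effect size, standard deviation, and sample sizes are made up):

Code
# Power of a two-sample t-test at n = 30 per group
power.t.test(n = 30, delta = 0.5, sd = 1, sig.level = 0.05)$power

# Larger sample size -> higher power
power.t.test(n = 100, delta = 0.5, sd = 1, sig.level = 0.05)$power

# Sample size per group needed to reach 80% power
power.t.test(delta = 0.5, sd = 1, sig.level = 0.05, power = 0.80)$n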

Boosting Power

  • Power = Probability of rejecting \(H_0\) when it’s false
  • Helps avoid Type II Errors
  • Driven by:
    • Sample size
    • Effect size
    • \(\alpha\)
    • Variability
  • Aim for 80% or higher

Confidence Intervals

  • A confidence interval gives a range of plausible values for a population parameter.
  • It reflects uncertainty in point estimates from sample data.

Interpretation

“We are 95% confident that the true mean lies between A and B.”

  • This does not mean there’s a 95% chance the mean is in that interval.
  • It means: if we repeated the sampling process many times, 95% of the intervals would contain the true value.

Factors Affecting CI Width

  • Sample size (\(n\)): larger \(n\) → narrower CI
  • Standard deviation (\(s\) or \(\sigma\)): more variability → wider CI
  • Confidence level: higher confidence → wider CI (see the sketch below)
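
A minimal sketch of these effects in R, using randomly generated data (the sample sizes, mean, and standard deviation are made up):

Code
set.seed(42)
x_small <- rnorm(20,  mean = 5, sd = 2)
x_large <- rnorm(500, mean = 5, sd = 2)

# Larger n -> narrower interval
t.test(x_small, conf.level = 0.95)$conf.int
t.test(x_large, conf.level = 0.95)$conf.int

# Higher confidence level -> wider interval
t.test(x_small, conf.level = 0.99)$conf.int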

Linear Regression Inference in R

Conducting HT of \(\beta_j\)

Code
xlm <- lm(Y ~ X, data = DATA)
summary(xlm)
  • xlm: name of the stored model
  • Y: Name of the outcome variable in DATA
  • X: Name of the Predictor Variable(s) in DATA
  • DATA: Name of the data set

Example

Code
library(palmerpenguins)  # provides the penguins data set

m1 <- lm(body_mass_g ~ species + flipper_length_mm, penguins)
summary(m1)
#> 
#> Call:
#> lm(formula = body_mass_g ~ species + flipper_length_mm, data = penguins)
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -927.70 -254.82  -23.92  241.16 1191.68 
#> 
#> Coefficients:
#>                    Estimate Std. Error t value Pr(>|t|)    
#> (Intercept)       -4031.477    584.151  -6.901 2.55e-11 ***
#> speciesChinstrap   -206.510     57.731  -3.577 0.000398 ***
#> speciesGentoo       266.810     95.264   2.801 0.005392 ** 
#> flipper_length_mm    40.705      3.071  13.255  < 2e-16 ***
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> Residual standard error: 375.5 on 338 degrees of freedom
#>   (2 observations deleted due to missingness)
#> Multiple R-squared:  0.7826, Adjusted R-squared:  0.7807 
#> F-statistic: 405.7 on 3 and 338 DF,  p-value: < 2.2e-16

Confidence Interval

Code
confint(xlm, level = LEVEL)
  • xlm: Name of the model saved in R
  • LEVEL: A number between 0 and 1 to specify confidence level

Example

Code
confint(m1, level = 0.90)
#>                           5 %        95 %
#> (Intercept)       -4994.96108 -3067.99270
#> speciesChinstrap   -301.72956  -111.29068
#> speciesGentoo       109.68404   423.93517
#> flipper_length_mm    35.64014    45.77066

Linear Regression Example

Wage Data Example

The Wage data set contains data on 3,000 male workers in the Mid-Atlantic region. We are interested in whether the predictor variable age has a significant effect on the outcome wage, adjusting for marital status (maritl), race (race), and education level (education).
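
A sketch of the corresponding model fit, assuming the Wage data from the ISLR package:

Code
library(ISLR)  # provides the Wage data set

wage_fit <- lm(wage ~ age + maritl + race + education, data = Wage)
summary(wage_fit)
confint(wage_fit, level = 0.95)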

Red Wine Data

The Wine Quality data set contains information on red and white wines from northern Portugal. We are interested in seeing whether the density of the red wine (predictor variable) affects its quality (outcome variable), adjusting for alcohol, p_h, residual_sugar, and fixed_acidity.

Code
url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/wine-quality/winequality-red.csv"
wine <- read_delim(url, delim = ";")
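
A sketch of the corresponding model fit. The raw column names contain spaces and capitals (e.g., pH, residual sugar), so janitor::clean_names() is assumed here to produce the snake_case names used above:

Code
library(janitor)

# "pH" -> "p_h", "residual sugar" -> "residual_sugar", etc.
wine <- clean_names(wine)

wine_fit <- lm(quality ~ density + alcohol + p_h + residual_sugar + fixed_acidity,
               data = wine)
summary(wine_fit)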

Logistic Regression Inference in R

Conducting HT of \(\beta_j\)

Code
xlm <- glm(Y ~ X, data = DATA, family = binomial())
summary(xlm)
  • xlm: name of the stored model
  • Y: Name of the outcome variable in DATA
  • X: Name of the Predictor Variable(s) in DATA
  • DATA: Name of the data set

Example

Code
library(survival)  # bladder1 comes from the survival package

m1 <- glm(death ~ recur + number + size, bladder1, family = binomial())
summary(m1)
#> 
#> Call:
#> glm(formula = death ~ recur + number + size, family = binomial(), 
#>     data = bladder1)
#> 
#> Coefficients:
#>               Estimate Std. Error z value Pr(>|z|)    
#> (Intercept) -0.8525259  0.4462559  -1.910 0.056082 .  
#> recur       -0.3897480  0.1062848  -3.667 0.000245 ***
#> number       0.0008451  0.1124503   0.008 0.994004    
#> size        -0.2240419  0.1626749  -1.377 0.168439    
#> ---
#> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#> 
#> (Dispersion parameter for binomial family taken to be 1)
#> 
#>     Null deviance: 189.38  on 293  degrees of freedom
#> Residual deviance: 166.43  on 290  degrees of freedom
#> AIC: 174.43
#> 
#> Number of Fisher Scoring iterations: 6

Confidence Interval

Code
confint(xlm, level = LEVEL)
  • xlm: Name of the model saved in R
  • LEVEL: A number between 0 and 1 to specify confidence level

Example

Code
confint(m1, level = 0.95)
#>                  2.5 %      97.5 %
#> (Intercept) -1.7353779  0.02529523
#> recur       -0.6217831 -0.20078281
#> number      -0.2421738  0.20731479
#> size        -0.5880581  0.06061498

Confidence Interval for Odds Ratio

Exponentiating the bounds converts the interval from the log-odds scale to the odds-ratio scale.

Code
exp(confint(m1, level = 0.95))
#>                 2.5 %    97.5 %
#> (Intercept) 0.1763335 1.0256179
#> recur       0.5369861 0.8180901
#> number      0.7849197 1.2303698
#> size        0.5554048 1.0624898

Logistic Regression Example

Breast Cancer Data

The Breast Cancer data set contains information derived from diagnostic images of breast masses for individuals from Wisconsin. We are interested in whether breast cancer diagnosis (outcome variable; Benign or Malignant) is affected by tumor radius, adjusting for texture, perimeter, and smoothness.

Code
url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/breast-cancer-wisconsin/wdbc.data"
bc <- read.csv(url, header = FALSE)

# Add column names
colnames(bc) <- c("id", "diagnosis", paste0("V", 3:32))

# Convert diagnosis to factor
bc$diagnosis <- factor(bc$diagnosis, levels = c("B", "M"), labels = c("Benign", "Malignant"))
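
A sketch of the corresponding model fit. The WDBC file ships without column names; the renaming below assumes the documented feature order, in which the first five mean features are radius, texture, perimeter, area, and smoothness:

Code
# Name the first five feature columns (assumes the documented WDBC ordering)
names(bc)[3:7] <- c("radius", "texture", "perimeter", "area", "smoothness")

bc_fit <- glm(diagnosis ~ radius + texture + perimeter + smoothness,
              data = bc, family = binomial())
summary(bc_fit)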

Bank Note Classification

The Bank Note data set contains information for authenticating bank notes based on images. We are interested in seeing whether class (outcome variable; Genuine or Forged) is associated with image skewness (predictor variable), adjusting for variance and entropy.

Code
url <- "https://archive.ics.uci.edu/ml/machine-learning-databases/00267/data_banknote_authentication.txt"
bank <- read.csv(url, header = FALSE)

colnames(bank) <- c("variance", "skewness", "curtosis", "entropy", "class")
bank$class <- factor(bank$class, levels = c(0, 1), labels = c("Genuine", "Forged"))
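
A sketch of the corresponding model fit, following the glm() template from earlier:

Code
bank_fit <- glm(class ~ skewness + variance + entropy,
                data = bank, family = binomial())
summary(bank_fit)

# Confidence intervals on the odds-ratio scale
exp(confint(bank_fit, level = 0.95))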