2025-04-08
Motivation
Hypothesis Testing
Example 1
Example 2
The ncbirths data set provides information on births along with several different predictors. The outcome of interest will be premie (“full term” or “premie”), which records whether the birth was full term or premature.
Model the relationship between the outcome premie and habit (“smoker” and “nonsmoker”).
Model the relationship between the outcome premie and visits (the number of hospital visits).
Use a logistic regression to model the outcome premie and habit.
#> [1] -0.01344105
#> [1] 0.9866489
The odds of a mother who smokes having a premature infant are 0.99 times the odds for a mother who does not smoke (about 1.3% lower).
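The two printed values are a slope on the log-odds scale and its exponential. A minimal sketch of how numbers of this kind could be produced with base R's `glm()` follows. It assumes a small made-up data frame (`births_toy`, with hypothetical counts) in place of ncbirths so that it runs without extra packages, so its output will not match the values printed above.

```r
# Hypothetical toy data standing in for ncbirths (the counts are made up).
births_toy <- data.frame(
  habit = factor(
    rep(c("nonsmoker", "smoker"), times = c(800, 200)),
    levels = c("nonsmoker", "smoker")
  ),
  premie = factor(
    c(rep(c("full term", "premie"), times = c(680, 120)),  # nonsmokers
      rep(c("full term", "premie"), times = c(172, 28))),  # smokers
    levels = c("full term", "premie")
  )
)

# Logistic regression: with a factor outcome, glm() models the
# probability of the second level ("premie").
fit_habit <- glm(premie ~ habit, data = births_toy, family = binomial)

# Slope on the log-odds scale, then the odds ratio.
b_habit <- unname(coef(fit_habit)["habitsmoker"])
or_habit <- exp(b_habit)
b_habit
or_habit

# Check: with a single binary predictor, exp(slope) equals the odds
# ratio computed directly from the 2x2 table.
tab <- table(births_toy$habit, births_toy$premie)
odds <- tab[, "premie"] / tab[, "full term"]
unname(odds["smoker"] / odds["nonsmoker"])
```

With the real data, the same call would presumably take `data = ncbirths`, which is an assumption about how the slide output was generated.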
Use a logistic regression to model the outcome premie and visits.
#> [1] -0.1063097
#> [1] 0.8991461
As the number of hospital visits increases by 1, the odds of having a premature infant are multiplied by a factor of 0.90 (about a 10% decrease).
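The arithmetic behind both interpretations can be checked directly from the printed coefficients. The sketch below only re-uses the two slopes shown in the output; the helper `pct_change()` is a hypothetical name introduced for illustration.

```r
# Coefficients as printed in the output above (log-odds scale).
b_habit  <- -0.01344105
b_visits <- -0.1063097

# Exponentiating a slope gives an odds ratio.
or_habit  <- exp(b_habit)
or_visits <- exp(b_visits)

# Percent change in the odds for a one-unit increase in the predictor.
pct_change <- function(odds_ratio) 100 * (odds_ratio - 1)

round(c(habit = or_habit, visits = or_visits), 2)
round(c(habit = pct_change(or_habit), visits = pct_change(or_visits)), 1)

# Odds ratios multiply: 3 additional visits scale the odds by or_visits^3.
or_visits^3
```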
Does smoking have a protective effect on having a premature infant?
Does the number of hospital visits have a protective effect?
What is real and what is random?
Hypothesis tests are used to test whether claims are valid or not. This is done by collecting data and setting up the null and alternative hypotheses.
The null hypothesis is the claim that is initially believed to be true. It typically states that the parameter is equal to the hypothesized value.
The alternative hypothesis contradicts the null hypothesis.
We want to see if \(\beta\) is different from the hypothesized value \(\beta^*\).
Null Hypothesis | Alternative Hypothesis |
---|---|
\(H_0: \beta=\beta^*\) | \(H_a: \beta\ne\beta^*\) |
\(H_0: \beta\le\beta^*\) | \(H_a: \beta>\beta^*\) |
\(H_0: \beta\ge\beta^*\) | \(H_a: \beta<\beta^*\) |
Notice that there are 3 types of null and alternative hypotheses. The first type (\(H_a:\beta\ne\beta^*\)) is considered a 2-sided hypothesis because the rejection region is split across both tails of the distribution. The remaining two hypotheses are considered 1-sided because the rejection region is located in only one tail of the distribution.
Null Hypothesis | Alternative Hypothesis | Side |
---|---|---|
\(H_0: \beta=\beta^*\) | \(H_a: \beta\ne\beta^*\) | 2-Sided |
\(H_0: \beta\le\beta^*\) | \(H_a: \beta>\beta^*\) | 1-Sided |
\(H_0: \beta\ge\beta^*\) | \(H_a: \beta<\beta^*\) | 1-Sided |
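The rejection regions for the three cases can be sketched numerically. This example assumes a test statistic that is approximately standard normal under \(H_0\), a cutoff of 0.05, and a made-up observed statistic `z`; none of these come from the slides.

```r
alpha <- 0.05  # assumed cutoff for illustration

# Critical values under a standard normal reference distribution.
crit_two_sided <- qnorm(1 - alpha / 2)  # 2-sided: reject if |z| > this
crit_upper     <- qnorm(1 - alpha)      # H_a: beta > beta*, reject if z > this
crit_lower     <- qnorm(alpha)          # H_a: beta < beta*, reject if z < this

# Hypothetical observed test statistic.
z <- 1.8

reject <- c(
  two_sided = abs(z) > crit_two_sided,
  greater   = z > crit_upper,
  less      = z < crit_lower
)
reject
```

The 2-sided test splits the rejection area between both tails, so it needs a larger cutoff than the 1-sided test in the same direction.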
Hypothesis Testing will force you to make a decision: Reject \(H_0\) OR Fail to Reject \(H_0\)
Reject \(H_0\): The effect seen is unlikely to be due to random chance alone; there is evidence of a process contributing to the effect.
Fail to Reject \(H_0\): The effect seen is consistent with random chance. Random sampling alone could explain the effect displayed, so there is not enough evidence of an underlying process.
The p-value approach is one of the most common methods to report significant results. The p-value is easy to interpret because it gives the probability of observing our test statistic, or something more extreme, given that the null hypothesis is true.
If \(p < \alpha\), then you reject \(H_0\); otherwise, you will fail to reject \(H_0\).
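As a sketch of this decision rule, the example below fits a logistic regression to made-up data (the data frame `toy` and its counts are hypothetical) and compares the p-value reported by `summary()` with an assumed \(\alpha = 0.05\). The reported p-value is for the 2-sided Wald test of \(H_0: \beta = 0\).

```r
# Hypothetical toy data (made-up counts): x is a 0/1 predictor, y a 0/1 outcome.
toy <- data.frame(
  x = rep(c(0, 1), times = c(800, 200)),
  y = c(rep(c(0, 1), times = c(680, 120)),
        rep(c(0, 1), times = c(172, 28)))
)

fit <- glm(y ~ x, data = toy, family = binomial)

# p-value for the slope, taken from the coefficient table.
p_value <- summary(fit)$coefficients["x", "Pr(>|z|)"]

# Decision rule: reject H0 only when p < alpha.
alpha <- 0.05
decision <- if (p_value < alpha) "Reject H0" else "Fail to reject H0"

p_value
decision
```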
The confidence interval approach can evaluate a hypothesis test where the alternative hypothesis is \(\beta\ne\beta^*\). The bootstrapping approach will result in a lower and upper bound denoted as: \((LB, UB)\).
If \(\beta^*\) is in \((LB, UB)\), then you fail to reject \(H_0\). If \(\beta^*\) is not in \((LB,UB)\), then you reject \(H_0\).
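One way to obtain \((LB, UB)\) is a percentile bootstrap: resample the rows with replacement, refit the model, and take the middle 95% of the resampled slopes. The sketch below assumes that method, a made-up data frame `toy`, 500 resamples, and \(\beta^* = 0\); the slides do not say which bootstrap interval is intended.

```r
set.seed(2025)  # arbitrary seed so the resampling is reproducible

# Hypothetical toy data (made-up counts): x is a 0/1 predictor, y a 0/1 outcome.
toy <- data.frame(
  x = rep(c(0, 1), times = c(800, 200)),
  y = c(rep(c(0, 1), times = c(680, 120)),
        rep(c(0, 1), times = c(172, 28)))
)

# Refit the model on one resampled data set and return the slope.
boot_slope <- function(data) {
  idx <- sample(nrow(data), replace = TRUE)
  coef(glm(y ~ x, data = data[idx, ], family = binomial))[["x"]]
}

B <- 500
slopes <- replicate(B, boot_slope(toy))

# Percentile interval: the 2.5% and 97.5% quantiles of the bootstrap slopes.
ci <- quantile(slopes, probs = c(0.025, 0.975))
LB <- ci[[1]]
UB <- ci[[2]]

# Fail to reject H0 when the hypothesized value lies inside (LB, UB).
beta_star <- 0
decision <- if (beta_star > LB && beta_star < UB) {
  "Fail to reject H0"
} else {
  "Reject H0"
}

c(LB = LB, UB = UB)
decision
```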
The significance level \(\alpha\) is the probability that you will reject the null hypothesis given that it is true.
In other words, \(\alpha\) is the error rate that a researcher controls.
Typically, we want this error rate to be small (\(\alpha = 0.05\)).
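A small simulation can illustrate what this error rate means. The sketch below repeatedly generates data in which the outcome is unrelated to the predictor (so \(H_0: \beta = 0\) is true by construction), tests the slope each time, and records how often \(H_0\) is rejected. The sample size, probabilities, and number of repetitions are arbitrary choices made for illustration.

```r
set.seed(1)  # arbitrary seed for reproducibility

alpha  <- 0.05
n_sims <- 1000  # number of simulated data sets
n      <- 200   # observations per data set

# Simulate one data set under H0 and return the p-value for the slope.
one_p_value <- function() {
  x <- rbinom(n, 1, 0.5)
  y <- rbinom(n, 1, 0.3)  # generated independently of x, so beta = 0
  fit <- glm(y ~ x, family = binomial)
  summary(fit)$coefficients["x", "Pr(>|z|)"]
}

p_values <- replicate(n_sims, one_p_value())

# Share of simulations in which a true H0 was rejected.
type1_rate <- mean(p_values < alpha)
type1_rate
```

The rejection rate should land close to \(\alpha\), although it will not match exactly because the test is approximate and the simulation is finite.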
Set up the Null and Alternative Hypothesis.
Compute p-value or confidence interval
Make a decision
Interpret the results
Use a logistic regression to model the outcome premie and habit.