2025-04-01
Sampling Distribution
Simulating Unicorns
Central Limit Theorem
Common Sampling Distributions
Sampling Distributions for Regression Models
A sampling distribution is the idea that the statistics you generate (slopes and intercepts) have their own data-generating process.
In other words, the numerical values you obtain from the lm and glm functions would be different if we had a different data set.
Some values will be more common than others. Because of this, the statistics have their own data-generating process, just as the outcome of interest has its own data-generating process.
Distribution of a statistic over repeated samples
Different Samples yield different statistics
The Standard Error (SE) is the standard deviation of a statistic itself.
SE tells us how much a statistic varies from sample to sample. Smaller SE = more precision.
\[ Y_i = \beta_0 + \beta_1 X_i + \varepsilon_i \]
\[ \varepsilon_i \sim DGP \]
The randomness effect is a sampling phenomenon: every time you sample a population, you will get a different sample.
Getting different samples means you will get different statistics.
These statistics have a distribution of their own.
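As a quick illustration, here is a minimal sketch using R's built-in mtcars data (an assumption for illustration, not the course data): fitting the same model to two different random samples gives two different slope estimates.

```r
# Fit the same model on two different random samples and compare the slopes.
set.seed(101)
sample_1 <- mtcars[sample(nrow(mtcars), 20), ]
sample_2 <- mtcars[sample(nrow(mtcars), 20), ]

coef(lm(mpg ~ wt, data = sample_1))["wt"]  # one slope estimate
coef(lm(mpg ~ wt, data = sample_2))["wt"]  # a different slope estimate
```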
Simulating Unicorns
To better understand the variation in statistics, let's simulate a data set of unicorn characteristics to visualize that variation.
We will simulate the data set using the unicorns function; we only need to specify how many unicorns to simulate.
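Assuming the course's unicorns() helper is loaded, the call might look like the sketch below (the sample size of 1,000 is just an illustrative choice), giving the column names that follow.

```r
# Simulate unicorn characteristics; we only specify how many unicorns we want.
uni <- unicorns(1000)  # sample size chosen for illustration
names(uni)
```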
#> [1] "Unicorn_ID" "Age" "Gender"
#> [4] "Color" "Type_of_Unicorn" "Type_of_Horn"
#> [7] "Horn_Length" "Horn_Strength" "Weight"
#> [10] "Health_Score" "Personality_Score" "Magical_Score"
#> [13] "Elusiveness_Score" "Gentleness_Score" "Nature_Score"
We will only look at Magical_Score and Nature_Score.
\[ Magical = 3423 + 8 \times Nature + \varepsilon \]
\[ \varepsilon \sim N(0, 3.24) \]
We fit the model with lm, repeating the process many times and extracting a slope each time (see the sketch below). The pieces we need to specify:

N: number of times to repeat a process
CODE: what is to be repeated
MODEL: a model that can be used to extract components
INDEX: which component do you want to use
  0: intercept
  1: first slope
  2: second slope
  ...
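A minimal sketch of that repeated-sampling loop, using replicate() in place of whatever helper the course materials provide (the per-sample size of 1,000 unicorns is an assumption), is shown here; printing the vector of slopes gives output like the following.

```r
# Repeat the simulate-then-fit process N times and keep the Nature_Score slope.
# A sketch only: the course's own simulation helper and arguments may differ.
N <- 1000

slopes <- replicate(N, {
  uni <- unicorns(1000)                               # a fresh simulated sample
  fit <- lm(Magical_Score ~ Nature_Score, data = uni) # refit the model
  coef(fit)[["Nature_Score"]]                         # extract the first slope
})

slopes   # one slope estimate per simulated data set
```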
#> [1] 7.997632 7.997991 7.998313 8.001240 7.995315 8.001904 7.998877 8.000884
#> [9] 8.004513 7.994004 7.998884 7.999972 8.002184 8.007035 7.990211 7.993707
#> (output truncated: 1,000 slope estimates in total, all clustered tightly around the true slope of 8)
Central Limit Theorem
The Central Limit Theorem (CLT) is a fundamental concept in probability and statistics. It states that the distribution of the sum (or average) of a large number of independent, identically distributed (i.i.d.) random variables will be approximately normal, regardless of the underlying distribution of those individual variables.
Simulating 500 samples of size 10 from a normal distribution with mean 5 and standard deviation of 2.
Simulating 500 samples of size 30 from a normal distribution with mean 5 and standard deviation of 2.
Simulating 500 samples of size 50 from a normal distribution with mean 5 and standard deviation of 2.
Simulating 500 samples of size 100 from a normal distribution with mean 5 and standard deviation of 2.
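A minimal sketch of one of these simulations (the size-30 case; the other cases only change n):

```r
# Draw 500 samples of size n from N(mean = 5, sd = 2) and keep each sample mean.
set.seed(42)
n_reps <- 500
n <- 30

sample_means <- replicate(n_reps, mean(rnorm(n, mean = 5, sd = 2)))

mean(sample_means)  # close to 5
sd(sample_means)    # close to 2 / sqrt(30), the standard error of the mean
hist(sample_means, main = "500 sample means, n = 30")
```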
Common Sampling Distributions
When the data are said to come from a normal distribution (a normal DGP), the sample mean and standard deviation have special properties, regardless of sample size.
Mean \[ \bar X = \frac{1}{n}\sum ^n_{i=1} X_i \]
Standard Deviation \[ s = \sqrt{\frac{1}{n-1}\sum ^n_{i=1} (X_i - \bar X)^2} \]
A data sample of size \(n\) is generated from: \[ X_i \sim N(\mu, \sigma) \]
\[ \bar X \sim N(\mu, \sigma/\sqrt{n}) \]
\[ Z = \frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \]
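For example, with the values used in the simulations above (\(\mu = 5\), \(\sigma = 2\), \(n = 100\)):
\[ \bar X \sim N\left(5,\ 2/\sqrt{100}\right) = N(5,\ 0.2) \]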
A data sample of size \(n\) is generated from: \[ X_i \sim N(\mu, \sigma) \]
\[ (n-1)s^2/\sigma^2 \sim \chi^2(n-1) \]
\[ \frac{\bar X - \mu}{\sigma/\sqrt{n}} \;\rightarrow\; \frac{\bar X - \mu}{s/\sqrt{n}} \sim t(n-1) \]
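A minimal simulation sketch of that last result: standardizing the sample mean with the known \(\sigma\) gives (approximately) standard normal values, while plugging in \(s\) gives heavier-tailed values that follow a \(t(n-1)\) distribution.

```r
# Standardize 5,000 simulated sample means with sigma (Z) and with s (t).
set.seed(7)
n <- 10; mu <- 5; sigma <- 2

stats <- replicate(5000, {
  x <- rnorm(n, mean = mu, sd = sigma)
  c(z = (mean(x) - mu) / (sigma / sqrt(n)),
    t = (mean(x) - mu) / (sd(x) / sqrt(n)))
})

# The t-statistics have heavier tails, matching t(n - 1) rather than N(0, 1).
quantile(stats["z", ], c(0.025, 0.975))  # near qnorm(c(0.025, 0.975))
quantile(stats["t", ], c(0.025, 0.975))  # near qt(c(0.025, 0.975), df = n - 1)
```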
Sampling Distributions for Regression Models
The estimates of regression coefficients (slopes) have a distribution!
Depending on our outcome (and therefore whether we fit the model with lm or glm), we will have two different distributions to work with: the t (for lm) or the standard normal (for glm).
\[ \frac{\hat\beta_j-\beta_j}{\mathrm{se}(\hat\beta_j)} \sim t_{n-p^\prime} \]
\[ \frac{\hat\beta_j}{\mathrm{se}(\hat\beta_j)} \sim t_{n-p^\prime} \]
\[ \frac{\hat\beta_j - \beta_j}{\mathrm{se}(\hat\beta_j)} \sim N(0,1) \]
\[ \frac{\hat\beta_j}{\mathrm{se}(\hat\beta_j)} \sim N(0,1) \]
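These are exactly the t values (and the confidence intervals built from them) that lm reports; a minimal sketch with R's built-in mtcars data (an assumption, not the course data):

```r
# Each coefficient's t value is estimate / standard error, compared against a
# t distribution with n - p' degrees of freedom.
fit <- lm(mpg ~ wt + hp, data = mtcars)

summary(fit)$coefficients  # Estimate, Std. Error, t value, Pr(>|t|)
confint(fit)               # intervals built from the t sampling distribution
```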