class: bg-main1 hide-slide-number split-70
count: false

.column[.right.vmiddle.content[
.font3[Parametric: .amber[Mean] in Two Groups]
]]

.bg-main4.column[.vmiddle.content[
.amber[Aly Lamuri]

Indonesia Medical Education and Research Institute
]]

---

name: overview
layout: true
class: bg-main4 middle split-30 hide-slide-number

.column[.vmiddle.right.content[
.amber.font3[Overview]
]]

---

template: overview
count: false

.bg-main1.column[.vmiddle.content[
- .amber[Mean difference]
- One sample T-Test
- Unpaired T-Test
- Paired T-Test
- Effect size
]]

---

layout: false
class: bg-main3 center middle

# Concept recall

.font2[
`$$\frac{x - \bar{x}}{s} \sim N(0, 1) \tag{1}$$`

`$$\bar{X} \xrightarrow{d} N\Big(\mu, \frac{\sigma}{\sqrt{n}}\Big) \tag{2}$$`
]

???
- Scaling and centering data following a normal distribution results in a standardized normal distribution (Z-distribution)
- Central limit theorem: the sample mean from any distribution will approximately follow a normal distribution

--

.font2[
With known `\(\mu\)` and `\(\sigma\)`, we can make a direct comparison
]

???
Ideal: by knowing the parameters `\(\mu\)` and `\(\sigma\)`, we can directly compare our sample mean to its corresponding population

--

.font2[
But...
]

--

.font2[
What if we don't know `\(\mu\)`?
]

???
Solution: use the sample statistic `\(\bar{x}\)` as an estimate

---

layout: true
class: bg-main3

# Mean difference

---

.font2[
- Simply `\(\bar{x} - \mu_0\)`
- But errors are bound to occur in our samples
- What do we do?
- An adjustment using the .amber[standard error]
]

???
- One-sample mean difference test: compare the mean from our data to a previously reported mean in the population

---

count: false

`\begin{align}
SE &= \frac{\sigma}{\sqrt{n}} \tag{Standard Error} \\
\\
z &= \frac{\bar{x} - \mu_0}{SE} \\
&= \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} \tag{One-sample Test}
\end{align}`

???
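As a presenter note, the one-sample Z test above can be sketched directly in R; the values below (`xbar`, `mu0`, `sigma`, `n`) are illustrative numbers, not real data:

```r
# Illustrative one-sample Z test; all values are made up for demonstration
xbar  <- 75    # sample mean
mu0   <- 70    # hypothesized population mean
sigma <- 10    # known population standard deviation
n     <- 18    # sample size

se <- sigma / sqrt(n)     # standard error
z  <- (xbar - mu0) / se   # one-sample Z statistic
p  <- 2 * {1 - pnorm(z)}  # two-tailed p-value
c("z" = z, "p" = p)       # z ~ 2.12, p ~ 0.034
```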
- `\(\mu_0\)` is the expected (hypothesized) mean, often acquired from previous publications
- `\(\sigma\)` is the **known** standard deviation in the **population**

--

<br>
<br>

.font2[How do we get the p-value?]

--

(.amber[Hint:] the Z statistic follows the Z-distribution)

???
To get the p-value, we evaluate the acquired `\(z\)` statistic as a quantile of the Z-distribution

---

layout: true
class: bg-main3

# Example, please?

.font2[
In a population of third-year electrical engineering students, we know the .amber[average final score] of a particular course is .pink[70]. To measure students' comprehension, UKRIDA has established a standardized examination with a .amber[standard deviation] of .pink[10]. We are interested in whether students registered in this year's course have a different average: 18 students scored an average of 75 on the final exam.
]

---

.font2[
`\begin{align}
H_0 &: \bar{x} = \mu_0 \\
H_a &: \bar{x} \neq \mu_0
\end{align}`
]

---

count: false

.font2[
`\begin{align}
SE &= \frac{10}{\sqrt{18}} = 2.36 \\
z &= \frac{75 - 70}{2.36} = 2.12
\end{align}`
]

---

layout: false
class: bg-main3
count: false

# Where is it located in the Z-distribution?

<img src="index_files/figure-html/z.dist1-1.png" width="100%" />

---

count: false
class: bg-main3

# Where is it relative to the significance value?

<img src="index_files/figure-html/z.dist2-1.png" width="100%" />

---

count: false
class: bg-main3

# What is the p-value?

.font2[
- Recall the hypothesis `\(H_a: \bar{x} \neq \mu_0\)`
- We need to conduct a two-tailed test to determine the p-value
- First we find the cumulative probability of `\(z\)`, which satisfies:
`$$P(Z \leqslant 2.12\ |\ \mu,\sigma): Z \sim N(0, 1)$$`
]

--

.font2[
- Subtract `\(P(Z \leqslant z)\)` from 1
- Multiply by 2
]

--

```r
2 * {1 - pnorm(2.12, 0, 1)}
```

```
## [1] 0.034
```

---

class: bg-main3

# What do we learn?
.font2[
- Z-test requires the data to follow a .amber[normal distribution]
- We do not need to know the parameter `\(\mu \to\)` we can hypothesize its value
- But we need the parameter `\(\sigma\)` to correctly compute `\(z\)`
]

---

count: false
class: bg-main3

# Any hidden message?

.font2[
- Yes!
- Z-test assumes normality
- So we need to perform a goodness-of-fit or normality test
]

--

.font2[
What if we do not know `\(\sigma\)`?
]

--

.font2[We are unable to use the Z-distribution]

--

.font2[Solution: use Student's T-distribution]

---

layout: true
class: bg-main3

# Student's T-distribution

---

`\begin{align}
Let\ & X \sim t_\nu \tag{Notation} \\
P(X=x) &= \frac{\Gamma \big( \frac{\nu+1}{2} \big)}{\sqrt{\nu \pi}\ \Gamma \big( \frac{\nu}{2} \big)} \bigg( 1 + \frac{x^2}{\nu} \bigg)^{-\frac{\nu+1}{2}} \tag{PDF} \\
\nu &= n - 1 \\
\\
Let\ & T \sim t_\nu \\
T &= Z \sqrt{\frac{\nu}{V}} \tag{Relationship}
\end{align}`

--

.font2[
- `\(T:\)` T-distribution with `\(\nu\)` degrees of freedom
- `\(Z:\)` A standardized normal distribution
- `\(V:\)` A `\(\chi^2\)` distribution with `\(\nu\)` degrees of freedom
]

???
- Importance: used in parametric mean difference tests between two groups
- Student's T-distribution depends on only one parameter: the `\(\nu\)` degrees of freedom
- The degrees of freedom equal the number of subjects minus 1

---

count: false

<img src="index_files/figure-html/t.dist-1.png" width="100%" />

---

template: overview

.bg-main1.column[.vmiddle.content[
- Mean difference
- .amber[One sample T-Test]
- Unpaired T-Test
- Paired T-Test
- Effect size
]]

---

layout: false
class: bg-main3

# One sample T-Test

.font2[
- Similar to the one sample Z-test
- Applied to data with unknown `\(\sigma\)`
- Uses `\(s\)` as an estimate of the population parameter
`$$t = \frac{\bar{x}-\mu}{s / \sqrt{n}}$$`
]

---

layout: true
class: bg-main3

# Example, please?

```r
set.seed(1)
x <- rnorm(20, mean=120, sd=20)
```

---

```r
summary(x)
```

```
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 75.7 112.3 127.2 123.8 135.2 151.9
```

```r
sd(x)
```

```
## [1] 18.3
```

---

count: false

.font2[
- We let `\(x \sim N(120, 20)\)`
- But our `\(\bar{x}\)` is 123.81
- With an `\(s\)` of 18.265
- And a `\(\nu\)` of 19
- Does our sample differ from the population?
]

---

count: false

## Formulate the hypothesis

.font2[
`\begin{align}
H_0 &: \bar{x} = 120 \\
H_a &: \bar{x} \neq 120
\end{align}`
]

---

count: false

## Determine the t statistic

.font2[
`$$t=\frac{\bar{x}-\mu}{s / \sqrt{n}}$$`
]

```r
t <- {{mean(x) - 120} / {sd(x) / sqrt(20)}} %T>% print()
```

```
## [1] 0.933
```

---

layout: false
count: false
class: bg-main3

## Locate the `\(t\)` statistic in the T-distribution

<img src="index_files/figure-html/plt.one.sample1-1.png" width="100%" />

---

count: false
class: bg-main3

## P-value in a one-tailed test

<img src="index_files/figure-html/plt.one.sample2-1.png" width="100%" />

```r
1 - pt(t, df=19)
```

```
## [1] 0.181
```

---

count: false
class: bg-main3

## P-value in a two-tailed test

<img src="index_files/figure-html/plt.one.sample3-1.png" width="100%" />

```r
2 * {1 - pt(t, df=19)}
```

```
## [1] 0.363
```

---

count: false
class: bg-main3

# How is the result in `R`?
```r
t.test(x, mu=120)
```

```
## 
##  One Sample t-test
## 
## data:  x
## t = 0.9, df = 19, p-value = 0.4
## alternative hypothesis: true mean is not equal to 120
## 95 percent confidence interval:
##  115 132
## sample estimates:
## mean of x 
##       124
```

--

## Conclusion

- Fail to reject `\(H_0\)`
- Insufficient evidence that the mean differs from `\(\mu_0 = 120\)`

---

template: overview

.bg-main1.column[.vmiddle.content[
- Mean difference
- One sample T-Test
- .amber[Unpaired T-Test]
- Paired T-Test
- Effect size
]]

---

class: bg-main3

# Unpaired T-Test

.font2[
- A T-Test conducted on two sample groups
- Aims to determine the significance of the mean difference
- Types: .amber[Student's] and .amber[Welch's] T-Test
- Assumes .amber[normality] *or* a large sample size
- Also robust to non-normal data, .pink[*unless*] the data are highly skewed or contain outliers
]

???
- Problem with robustness: we often use simulation with specified parameters
- But we don't know such parameters in real-world data
- It's more acceptable when we don't violate the normality assumption

--

.font2[
`\begin{align}
H_0 &: \bar{x}_1 - \bar{x}_2 = d \\
H_a &: \bar{x}_1 - \bar{x}_2 \neq d \\
d &= \mu_1 - \mu_2 = 0
\end{align}`
]

???
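As a presenter note, the pooled-variance computation can be sketched by hand on simulated data (`x1` and `x2` below are hypothetical samples); it should reproduce `t.test(..., var.equal=TRUE)` exactly:

```r
# Student's T-Test by hand on simulated (hypothetical) samples
set.seed(1)
x1 <- rnorm(15, mean=10, sd=2)
x2 <- rnorm(15, mean=12, sd=2)
n1 <- length(x1); n2 <- length(x2)

nu <- n1 + n2 - 2                                        # degrees of freedom
sp <- sqrt({(n1-1)*var(x1) + (n2-1)*var(x2)} / nu)       # pooled SD
t  <- {mean(x1) - mean(x2)} / {sp * sqrt(1/n1 + 1/n2)}   # t statistic

# Agrees with the built-in pooled-variance test
all.equal(t, unname(t.test(x1, x2, var.equal=TRUE)$statistic))  # TRUE
```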
In a unique case, we may find `\(d \neq 0\)`

---

class: bg-main3

# Student's T-Test

.font2[
- A T-Test with a pooled variance
- .amber[Requires:] equal variance in the two samples (tested with Levene's test)
- Uses the pooled variance `\(\to\)` computes both the statistic and the degrees of freedom
]

--

`\begin{align}
t &= \frac{\bar{x}_1 - \bar{x}_2 - d}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} \tag{Statistics} \\
s_p &= \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{\nu}} \tag{Pooled variance} \\
\nu &= n_1 + n_2 - 2 \tag{Degrees of freedom}
\end{align}`

--

.font2[Often, our data violate the equal variance assumption]

.font2[.amber[Solution:] Welch's T-Test]

---

class: bg-main3

# Welch's T-Test

`\begin{align}
t &= \frac{\bar{x}_1 - \bar{x}_2 - d}
{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}} \tag{Statistics} \\
\nu &= \frac{(n_1-1)(n_2-1)}{(n_2-1)C^2 + (1-C)^2(n_1-1)} \tag{Degrees of freedom} \\
\\
C &= \frac{\frac{s_1^2}{n_1}}{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}
\end{align}`

--

.font2[
For the record:

- Still assumes .amber[normality]!
- Non-normally distributed data are still acceptable .pink[if] the sample size is .amber[large enough]
- How large is large enough? Re-check previous discussions on the .amber[CLT]
]

---

layout: true
class: bg-main3

# Example, please?

---

.font2[
Suppose we are collecting data on body height. Our population of interest is students registered in UKRIDA, where we categorize sex as female and male. We acquire normally distributed data from both sexes, where:

- `\(female \sim N(155, 16)\)`
- `\(male \sim N(170, 8)\)`

We have a sample of 25 females and 30 males, and would like to conduct a hypothesis test on the mean difference.
]

---

count: false

```r
set.seed(5)
tbl <- data.frame(
  "height" = c(rnorm(30, 170, 8), rnorm(25, 155, 16)),
  "sex"    = c(rep("male", 30), rep("female", 25))
)

tapply(tbl$height, tbl$sex, summary)
```

```
## $female
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 123 146 154 158 173 190
## 
## $male
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## 152 165 168 170 177 184
```

```r
tapply(tbl$height, tbl$sex, sd)
```

```
## female   male 
##  17.98   7.93
```

---

count: false

<img src="index_files/figure-html/two.sample.t2-1.png" width="100%" />

???
Does it follow the normal distribution?

---

count: false

```r
tapply(tbl$height, tbl$sex, shapiro.test)
```

```
## $female
## 
##  Shapiro-Wilk normality test
## 
## data:  X[[i]]
## W = 1, p-value = 0.7
## 
## 
## $male
## 
##  Shapiro-Wilk normality test
## 
## data:  X[[i]]
## W = 1, p-value = 0.2
```

???
Yes, each group follows a normal distribution

---

```r
car::leveneTest(tbl$height ~ tbl$sex)
```

```
## Levene's Test for Homogeneity of Variance (center = median)
##       Df F value  Pr(>F)    
## group  1    14.3 0.00039 ***
##       53                    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
```

--

.font2[Levene's test suggests heterogeneous variance (hint: the significant p-value)]

---

```r
t.test(height ~ sex, data=tbl, var.equal=FALSE)
```

```
## 
##  Welch Two Sample t-test
## 
## data:  height by sex
## t = -3, df = 32, p-value = 0.004
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -19.87  -4.07
## sample estimates:
## mean in group female   mean in group male 
##                  158                  170
```

.font2[Perform Welch's T-Test since the sampled variances are not equal]

---

```r
t.test(height ~ sex, data=tbl, var.equal=TRUE)
```

```
## 
##  Two Sample t-test
## 
## data:  height by sex
## t = -3, df = 53, p-value = 0.002
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -19.27  -4.67
## sample estimates:
## mean in group female   mean in group male 
##                  158                  170
```

.font2[Student's T-Test, shown to demonstrate type-I error inflation (hint: look at the p-value)]

???
- The p-value is *much lower* compared to Welch's T-Test
- A low p-value does not mean a bad outcome per se, but we need to be cautious
- Especially when required assumptions are violated
- Always be wary of statistical error!
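As a further presenter note, the Welch statistic and its degrees of freedom can be recomputed by hand using the Welch–Satterthwaite form, then checked against R's default `t.test()`; the samples below are simulated for illustration:

```r
# Welch's T-Test by hand on simulated (hypothetical) samples
set.seed(2)
x1 <- rnorm(25, mean=155, sd=16)   # higher-variance group
x2 <- rnorm(30, mean=170, sd=8)    # lower-variance group
v1 <- var(x1) / length(x1)         # s1^2 / n1
v2 <- var(x2) / length(x2)         # s2^2 / n2

t  <- {mean(x1) - mean(x2)} / sqrt(v1 + v2)                      # Welch statistic
nu <- {v1 + v2}^2 /                                              # Welch-Satterthwaite
  {v1^2 / (length(x1)-1) + v2^2 / (length(x2)-1)}                # degrees of freedom

wt <- t.test(x1, x2)               # Welch is t.test()'s default (var.equal=FALSE)
all.equal(c(t, nu), unname(c(wt$statistic, wt$parameter)))       # TRUE
```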
---

<img src="index_files/figure-html/two.sample.t7-1.png" width="100%" />

???
- Visualizing your findings is important
- Both figures give similar information, but conveyed in different fashions

---

template: overview

.bg-main1.column[.vmiddle.content[
- Mean difference
- One sample T-Test
- Unpaired T-Test
- .amber[Paired T-Test]
- Effect size
]]

---

layout: false
class: bg-main3

# Paired T-Test

.font2[
- The previous analyses implicitly assume .pink[independence]
- Some observations may violate such an assumption
- Cases:
  - Differences between multiple measurements
  - Probability events where each instance influences another
]

---

layout: true
class: bg-main3

# Measuring mean differences

---

.font2[
- The two-sample T-Test does not account for differences within subjects
- In fact, applying it to paired data violates its independence assumption
- To measure differences within the same subjects, we need to reduce the complexity
]

---

count: false

.font2[
- Suppose `\(\mu_1\)` and `\(\mu_2\)` represent measurements at `\(t_1\)` and `\(t_2\)`
- Both measures represent the same subject (a within comparison)
- We can calculate the difference between both samples:
`$$\mu_d = \mu_1 - \mu_2$$`
]

---

count: false

.font2[
- Now we only need to take into account one sample of differences
- We hypothesize:
`\begin{align}
H_0 &: \mu_d = 0 \\
H_a &: \mu_d \neq 0
\end{align}`
]

--

.font2[
- Sounds familiar?
- Because it is!
- We can apply a .amber[one-sample T-Test] to `\(\mu_d\)`
]

---

layout: true
class: bg-main3

# Example, please?

---

.font2[
In the current investigation, we are looking for the effect of a certain antihypertensive drug. First, we measure the baseline blood pressure, then prescribe the drug to all subjects. Then, we re-measure the blood pressure after one month. Each subject has a unique identifier, so we can specify mean differences within paired samples.
Suppose we have the following scenario in 30 sampled subjects:

- `\(X_1 \sim N(140, 12)\)`
- `\(X_2 \sim N(133, 17)\)`
]

---

count: false

.font2.center[
Set our hypotheses:

`\begin{align}
H_0 &: \bar{x}_d = 0 \\
H_a &: \bar{x}_d \neq 0
\end{align}`
]

---

count: false

```r
set.seed(1)
tbl <- data.frame(
  "bp"   = c(rnorm(30, 140, 12), rnorm(30, 133, 17)),
  "time" = c(rep("Before", 30), rep("After", 30)) %>%
    factor(levels=c("Before", "After"))
)

# Compute the within-subject differences
md <- with(tbl, bp[time=="Before"] - bp[time=="After"])

# Calculate the t statistic (note the braces: `%T>%` binds more tightly than `/`)
t <- {mean(md) / {sd(md) / sqrt(30)}} %T>% print()
```

```
## [1] 1.84
```

```r
# Obtain the p-value for a two-sided test
2 * {1 - pt(t, df=29)}
```

```
## [1] 0.076
```

---

count: false

```r
# Comparison to the built-in one-sample T-Test
t.test(md, mu=0)
```

```
## 
##  One Sample t-test
## 
## data:  md
## t = 2, df = 29, p-value = 0.08
## alternative hypothesis: true mean is not equal to 0
## 95 percent confidence interval:
##  -0.64 12.10
## sample estimates:
## mean of x 
##      5.73
```

---

count: false

```r
# Comparison to the built-in paired T-Test
t.test(bp ~ time, data=tbl, paired=TRUE)
```

```
## 
##  Paired t-test
## 
## data:  bp by time
## t = 2, df = 29, p-value = 0.08
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.64 12.10
## sample estimates:
## mean of the differences 
##                    5.73
```

---

layout: false
class: bg-main3

# Choosing an .amber[appropriate] statistical test

.font2[
- Both the T-Test and Z-Test assume normality
]

--

.font2[
- One-sample test:
  - Known `\(\sigma \to\)` use the Z-Test
  - Unknown `\(\sigma \to\)` use the one-sample T-Test
]

--

.font2[
- Two-sample test `\(\to\)` do Levene's test first
  - Equal variance: Student's method (pooled variance)
  - Unequal variance: Welch's method
]

--

.font2[
- Paired T-Test: basically a one-sample T-Test on sampled differences
]

---

template: overview

.bg-main1.column[.vmiddle.content[
- Mean difference
- One sample T-Test
- Unpaired
T-Test
- Paired T-Test
- .amber[Effect size]
]]

---

layout: false
class: bg-main3

# Effect size

.font2[
- Concept recall: statistical power
- Related topics: sample size and significance level
]

`\begin{align}
d &= \frac{\bar{x}_1 - \bar{x}_2}{s_p} \tag{Cohen's D} \\
s_p &= \sqrt{\frac{(s_1^2 + s_2^2)}{2}} \tag{Pooled SD}
\end{align}`

---

layout: true
class: bg-main3

# Example, please?

---

```r
set.seed(1)
tbl <- data.frame(
  "bp"   = c(rnorm(30, 140, 12), rnorm(30, 133, 17)),
  "time" = c(rep("Before", 30), rep("After", 30)) %>%
    factor(levels=c("Before", "After"))
)

# Compute the within-subject differences
md <- with(tbl, bp[time=="Before"] - bp[time=="After"])

# Calculate the t statistic (note the braces: `%T>%` binds more tightly than `/`)
t <- {mean(md) / {sd(md) / sqrt(30)}} %T>% print()
```

```
## [1] 1.84
```

```r
# Obtain the p-value for a two-sided test
2 * {1 - pt(t, df=29)}
```

```
## [1] 0.076
```

---

count: false

```r
# Calculate the pooled standard deviation (`add` is a magrittr alias for `+`)
sp <- sqrt({with(tbl, tapply(bp, time, var, simplify=FALSE)) %>%
  {do.call(add, .)}
} / 2) %T>% print()
```

```
## [1] 12.4
```

```r
# Measure Cohen's d (`subtract` is a magrittr alias for `-`)
{with(tbl, tapply(bp, time, mean, simplify=FALSE)) %>%
  {do.call(subtract, .)}
} / sp
```

```
## [1] 0.464
```

---

count: false

```r
# Calculate the effect size using the `psych` package
d <- psych::cohen.d(tbl ~ time) %T>% print()
```

```
## Call: psych::cohen.d(x = tbl ~ time)
## Cohen d statistic of difference between two means
##    lower effect upper
## bp -0.99  -0.47  0.05
## 
## Multivariate (Mahalanobis) distance between groups
## [1] 0.47
## r equivalent of difference between two means
##    bp 
## -0.23
```

---

```r
# Power analysis using the previous information
pwr::pwr.t.test(n=30, d=d$cohen.d[[2]], sig.level=0.05, type="paired")
```

```
## 
##      Paired t test power calculation 
## 
##               n = 30
##               d = 0.472
##       sig.level = 0.05
##           power = 0.704
##     alternative = two.sided
## 
## NOTE: n is number of *pairs*
```

---

layout: false
count: false
class: bg-main1 center middle font5 hide-slide-number

.amber[Query?]