count: false class: bg-main1 hide-slide-number split-70 .column[.right.vmiddle.content[ .font3[.amber[Differences] Between Two Groups] ]] .bg-main4.column[.vmiddle.content[ .amber[Aly Lamuri] Indonesia Medical Education and Research Institute ]] --- count: false class: bg-main3 # .amber[Recap] .font2[ - Test of .amber[proportional difference] - Exact test - Approximation - Test of .amber[mean difference] - One sample - Two samples - Multiple samples ] ??? - Differences between Fisher's exact and Pearson's `\(\chi^2\)` - One-sample mean difference: Z-Test, T-Test - Two-sample mean difference: unpaired and paired T-Test - Multiple-sample mean difference: one-way, factorial, repeated measures ANOVA --- name: overview layout: true class: bg-main4 middle split-30 hide-slide-number .column[.vmiddle.right.content[ .amber.font3[Overview] ]] --- template: overview count: false .bg-main1.column[.vmiddle.content[ - .amber[Non-parametric test] - One-sample test - Two-sample test - Paired test ]] --- layout: false class: bg-main3 # Non-parametric test .font2[ - Parametric tests assume normality - Small sample size `\(\to\)` hard to assess normality - Severe skewness `\(\to\)` impairs parametric tests - What defines a parametric test anyway? ] ??? - In a parametric test, we estimate the population parameter using the data we have - Example: in a T-Test, we hypothesize that both groups come from the same population, thus having roughly equal means -- .font2[ .amber[Solution:] Use a non-parametric test ] ??? - In a non-parametric test we do not assume a particular parameter - We only measure whether our samples have a roughly similar presentation --- count: false class: bg-main3 split-two # Skewness .column[.vmiddle.content[ <img src="index_files/figure-html/plt.data-1.png" width="100%" /> ]] .column[.vmiddle.content[ <img src="index_files/figure-html/plt.data.skew-1.png" width="100%" /> ]] --- class: bg-main3 # When should we use a non-parametric test? 
.font2[ - Small sample size - Data is not .pink[asymptotically] normal - The presence of extreme outliers or severe skewness - We cannot ascertain the .pink[parameter] of its population ] ??? - Using a non-parametric test on large data is unwieldy - To a certain degree, the parametric test is robust to a non-normal distribution - Outliers and skewness make it hard to estimate the mean difference, even with a large sample size -- ## Hypotheses .font2[ - `\(H_0\)`: The sampled groups come from the .amber[same] population - `\(H_1\)`: The sampled groups come from .amber[different] populations ] ??? Notice the difference in hypothesis declaration between parametric and non-parametric tests --- class: bg-main3 # But how do we measure population-based .amber[difference]? -- .font2[.amber[Hint:] We use the central tendency] -- .font2[However, we cannot rely on the mean] -- .font2[So, we use the .amber[median] instead] -- ## .amber[Hypotheses] .font2[ - `\(H_0:\ M_1 = M_2\)` - `\(H_1:\ M_1 \neq M_2\)` ] --- template: overview count: false .bg-main1.column[.vmiddle.content[ - Non-parametric test - .amber[One-sample test] - Two-sample test - Paired test ]] --- class: bg-main3 # One-sample test .font2[ - One-sample sign test - One-sample Wilcoxon signed rank test ] ??? - Similar to the parametric case, we have one group of observations - We would like to know whether our group deviates from the hypothesized median - The one-sample Wilcoxon test is analogous to the one-sample T-Test --- class: bg-main3 # One-sample sign test .font2[ - .amber[Does not] assume normality or a symmetric distribution - Usable with skewed data - Follows a .amber[binomial] distribution ] ??? - A special case of the binomial test with p=0.5 - Why p=0.5? 
- Because the chance of an observation falling above (or below) the hypothesized median `\(M_0\)` is 0.5, since the median is the midpoint --- count: false class: bg-main3 # Procedure .font2[ - Find the residual between each observation and the hypothesized median - Omit all 0 - Disregard the magnitude, take only its .amber[sign] - Calculate the frequency of .amber[positive] and .amber[negative] signs - Let `\(B_s\)` be the resultant `\(\to B_s \sim B(n, 0.5)\)` ] ??? - Because we only have two outcomes of interest - Whether the observation has a positive or negative sign - With all instances being independent, we have a Bernoulli trial - Then we model our probability using the binomial distribution --- layout: true class: bg-main3 # Example, please? --- count: false ```r # Generate skewed data using a Chi-squared distribution set.seed(1) x <- rchisq(10, 4) %T>% print() ``` ``` ## [1] 1.66 7.14 6.93 4.10 7.77 5.08 4.58 2.30 1.36 1.67 ``` .font2[ - Here we have `\(X \sim \chi^2(4)\)` - Let `\(H_0\)` be `\(M = 5\)` - And we are interested in conducting a two-tailed test ] -- ```r # Set M and find the residual (difference) M <- 5 diff <- {x - M} # Make a data frame tbl <- data.frame(x=x, abs.diff=abs(diff), sign=sign(diff)) ``` --- count: false .bg-white.content[ <br>
<br> ] --- count: false ```r # Perform a binomial test res <- lapply(c(-1, 1), function(sign) { binom.test(sum(tbl$sign==sign), nrow(tbl), 0.5) %>% broom::tidy() }) # Two-tailed test on sign=-1 knitr::kable(res[[1]]) %>% kable_minimal() ``` <table class=" lightable-minimal" style='font-family: "Trebuchet MS", verdana, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> <th style="text-align:right;"> parameter </th> <th style="text-align:right;"> conf.low </th> <th style="text-align:right;"> conf.high </th> <th style="text-align:left;"> method </th> <th style="text-align:left;"> alternative </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0.6 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 0.754 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 0.262 </td> <td style="text-align:right;"> 0.878 </td> <td style="text-align:left;"> Exact binomial test </td> <td style="text-align:left;"> two.sided </td> </tr> </tbody> </table> ```r # Two-tailed test on sign=1 knitr::kable(res[[2]]) %>% kable_minimal() ``` <table class=" lightable-minimal" style='font-family: "Trebuchet MS", verdana, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> <th style="text-align:right;"> parameter </th> <th style="text-align:right;"> conf.low </th> <th style="text-align:right;"> conf.high </th> <th style="text-align:left;"> method </th> <th style="text-align:left;"> alternative </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0.4 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 0.754 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 0.122 </td> <td 
style="text-align:right;"> 0.738 </td> <td style="text-align:left;"> Exact binomial test </td> <td style="text-align:left;"> two.sided </td> </tr> </tbody> </table> --- count: false layout: false class: bg-main3 # Caveats .font2[ - Only considers the sign of the difference - Neglects the magnitude - What if our data has severe skewness? ] --- layout: false class: bg-main3 # One-sample Wilcoxon signed rank test .font2[ - Does not assume normality - But still assumes a .amber[symmetric] distribution - What distribution is symmetric but not normal? ] ??? - Hint: uniform distribution - Other examples: Cauchy distribution, generalized normal distribution, etc. - This test is not good for skewed data --- count: false class: bg-main3 # Procedure .font2[ - Similar to performing a sign test - However, we assign ranks based on the computed differences - The statistic is the sum of ranks within each sign - Take the .amber[minimum value] between both statistics ] ??? - Also referred to as the signed rank sum test --- layout: true count: false class: bg-main3 # Example, please? --- ```r # Generate skewed data using a Chi-squared distribution set.seed(1) x <- rchisq(10, 4) %T>% print() ``` ``` ## [1] 1.66 7.14 6.93 4.10 7.77 5.08 4.58 2.30 1.36 1.67 ``` .font2[ - Assigning ranks preserves the magnitude - Let `\(H_0: M = 5\)` ] ```r # Add a rank column to the data frame tbl$ranked <- rank(tbl$abs.diff) ``` ??? - We will re-use the same data - We keep the hypotheses to replicate our previous analysis --- count: false .bg-white.content[ <br>
<br> ] --- count: false ```r # Calculate the statistic W <- tapply(tbl$ranked, tbl$sign, sum) %>% min() %T>% print() ``` ``` ## [1] 17 ``` ```r # Find the p-value for a two-tailed test psignrank(W, nrow(tbl)) * 2 ``` ``` ## [1] 0.322 ``` ```r # Built-in test wilcox.test(x, mu=5) ``` ``` ## ## Wilcoxon signed rank exact test ## ## data: x ## V = 17, p-value = 0.3 ## alternative hypothesis: true location is not equal to 5 ``` --- template: overview count: false .bg-main1.column[.vmiddle.content[ - Non-parametric test - One-sample test - .amber[Two-sample test] - Paired test ]] --- layout: false class: bg-main3 # Two-sample test .font2[ - Mann-Whitney U test - Does not assume normality - Can handle skewed data - Assumes i.i.d. observations ] ??? - Also referred to as the unpaired two-sample Wilcoxon test - Does not imply mean difference - Applicable to a small dataset --- class: bg-main3 # Procedure .font2[ - Pool all data elements from both groups - Sort from smallest to largest - Assign a rank to each value - Compare the rank sums of both groups ] ??? - This is the concept of *sum of ranks* - Less statistical power than parametric tests - Still assumes i.i.d. --- layout: true class: bg-main3 # Example, please? --- ```r # We will use x as the first group x ``` ``` ## [1] 1.66 7.14 6.93 4.10 7.77 5.08 4.58 2.30 1.36 1.67 ``` ```r # Assign x+4 as the second group, make a data frame tbl <- data.frame( obs=c(x, x+4), group=rep(c("1", "2"), each=length(x)) %>% factor() ) %T>% str() ``` ``` ## 'data.frame': 20 obs. of 2 variables: ## $ obs : num 1.66 7.14 6.93 4.1 7.77 ... ## $ group: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ... 
``` --- count: false ```r # Goodness of fit test to determine the distribution tapply(tbl$obs, tbl$group, ks.test, pnorm) %>% lapply(broom::tidy) %>% lapply(data.frame) %>% {do.call(rbind, .)} %>% kable() %>% kable_minimal() ``` <table class=" lightable-minimal" style='font-family: "Trebuchet MS", verdana, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> <th style="text-align:left;"> method </th> <th style="text-align:left;"> alternative </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0.913 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:left;"> One-sample Kolmogorov-Smirnov test </td> <td style="text-align:left;"> two-sided </td> </tr> <tr> <td style="text-align:right;"> 1.000 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:left;"> One-sample Kolmogorov-Smirnov test </td> <td style="text-align:left;"> two-sided </td> </tr> </tbody> </table> -- .amber.font2[A random quiz has appeared!] -- .font2[Why do we use Kolmogorov-Smirnov test instead of a normality test?] ??? Is there another option? --- count: false <img src="index_files/figure-html/plt.tbl-1.png" width="100%" /> --- count: false ```r wilcox.test(obs ~ group, data=tbl, conf.int=TRUE) ``` ``` ## ## Wilcoxon rank sum exact test ## ## data: obs by group ## W = 12, p-value = 0.003 ## alternative hypothesis: true location shift is not equal to 0 ## 95 percent confidence interval: ## -6.74 -1.26 ## sample estimates: ## difference in location ## -4 ``` ```r rstatix::wilcox_effsize(obs ~ group, data=tbl) ``` ``` ## # A tibble: 1 x 7 ## .y. 
group1 group2 effsize n1 n2 magnitude ## * <chr> <chr> <chr> <dbl> <int> <int> <ord> ## 1 obs 1 2 0.642 10 10 large ``` --- template: overview count: false .bg-main1.column[.vmiddle.content[ - Non-parametric test - One-sample test - Two-sample test - .amber[Paired test] ]] --- layout: false class: bg-main3 # Paired test .font2[ - Both groups are not independent - Paired Wilcoxon test - Does not assume normality - Does not imply mean difference ] ??? - Symmetric data is a plus though - Does not imply mean difference - Applicable to a small dataset --- class: bg-main3 # Procedure .font2[ - Akin to the one-sample Wilcoxon test - Measure the difference between paired data points - Remove zeros (if any), then recompute `\(n\)` - Assign ranks to the absolute differences - Calculate the statistic based on rank and sign ] --- layout: true class: bg-main3 # Example, please? --- ```r # We will use the ChickWeight dataset str(ChickWeight) ``` ``` ## Classes 'nfnGroupedData', 'nfGroupedData', 'groupedData' and 'data.frame': 578 obs. of 4 variables: ## $ weight: num 42 51 59 64 76 93 106 125 149 171 ... ## $ Time : num 0 2 4 6 8 10 12 14 16 18 ... ## $ Chick : Ord.factor w/ 50 levels "18"<"16"<"15"<..: 15 15 15 15 15 15 15 15 15 15 ... ## $ Diet : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ... ## - attr(*, "formula")=Class 'formula' language weight ~ Time | Chick ## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv> ## - attr(*, "outer")=Class 'formula' language ~Diet ## .. 
..- attr(*, ".Environment")=<environment: R_EmptyEnv> ## - attr(*, "labels")=List of 2 ## ..$ x: chr "Time" ## ..$ y: chr "Body weight" ## - attr(*, "units")=List of 2 ## ..$ x: chr "(days)" ## ..$ y: chr "(gm)" ``` --- count: false ```r # Assess normality tapply(ChickWeight$weight, ChickWeight$Time, shapiro.test) %>% lapply(broom::tidy) %>% lapply(data.frame) %>% {do.call(rbind, .)} %>% kable() %>% kable_minimal() ``` <table class=" lightable-minimal" style='font-family: "Trebuchet MS", verdana, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> <th style="text-align:left;"> method </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> 0 </td> <td style="text-align:right;"> 0.890 </td> <td style="text-align:right;"> 0.000 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 2 </td> <td style="text-align:right;"> 0.873 </td> <td style="text-align:right;"> 0.000 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 4 </td> <td style="text-align:right;"> 0.973 </td> <td style="text-align:right;"> 0.315 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 6 </td> <td style="text-align:right;"> 0.982 </td> <td style="text-align:right;"> 0.648 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 8 </td> <td style="text-align:right;"> 0.980 </td> <td style="text-align:right;"> 0.577 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 10 </td> <td style="text-align:right;"> 0.981 </td> <td style="text-align:right;"> 0.616 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 12 
</td> <td style="text-align:right;"> 0.983 </td> <td style="text-align:right;"> 0.686 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 14 </td> <td style="text-align:right;"> 0.973 </td> <td style="text-align:right;"> 0.325 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 16 </td> <td style="text-align:right;"> 0.986 </td> <td style="text-align:right;"> 0.830 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 18 </td> <td style="text-align:right;"> 0.991 </td> <td style="text-align:right;"> 0.975 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 20 </td> <td style="text-align:right;"> 0.991 </td> <td style="text-align:right;"> 0.968 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 21 </td> <td style="text-align:right;"> 0.986 </td> <td style="text-align:right;"> 0.869 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> </tbody> </table> --- count: false ```r # Subset the dataset to exclude normally distributed data tbl <- subset(ChickWeight, subset={ChickWeight$Time %in% c(0, 2)}) # Make Time as a factor tbl$Time %<>% factor(levels=c(0, 2)) ``` --- count: false ```r # Perform a paired Wilcoxon test wilcox.test(weight ~ Time, data=tbl, paired=TRUE, conf.int=TRUE) ``` ``` ## ## Wilcoxon signed rank test with continuity correction ## ## data: weight by Time ## V = 8, p-value = 1e-09 ## alternative hypothesis: true location shift is not equal to 0 ## 95 percent confidence interval: ## -9.0 -7.5 ## sample estimates: ## (pseudo)median ## -8.5 ``` ```r rstatix::wilcox_effsize(weight ~ Time, data=tbl, paired=TRUE) ``` ``` ## # A tibble: 1 x 7 ## .y. 
group1 group2 effsize n1 n2 magnitude ## * <chr> <chr> <chr> <dbl> <int> <int> <ord> ## 1 weight 0 2 0.862 50 50 large ``` --- count: false <img src="index_files/figure-html/paired.wilcox6-1.png" width="100%" /> --- layout: false count: false class: bg-main1 middle hide-slide-number font5 center .amber[Query?]
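---

count: false
layout: false
class: bg-main3

# Appendix: sign test by hand

The sign-test procedure described earlier can be reproduced without `binom.test()`, making each step explicit. This is a minimal sketch assuming the same simulated data as the slides (`rchisq(10, 4)` with seed 1) and the same hypothesized median `\(M = 5\)`.

```r
# A minimal sketch of the one-sample sign test, assuming the same
# simulated data and hypothesized median used in the earlier slides
set.seed(1)
x <- rchisq(10, 4)   # skewed sample, X ~ chi-squared(4)
M <- 5               # hypothesized median under H0

d <- x - M           # residuals against the hypothesized median
d <- d[d != 0]       # omit exact zeros before counting signs
n <- length(d)
n_pos <- sum(d > 0)  # frequency of positive signs

# Under H0, n_pos ~ B(n, 0.5); two-tailed exact p-value taken as
# twice the smaller tail probability, capped at 1
p <- min(1, 2 * min(pbinom(n_pos, n, 0.5), 1 - pbinom(n_pos - 1, n, 0.5)))

# Cross-check against the built-in exact binomial test
c(manual = p, builtin = binom.test(n_pos, n, 0.5)$p.value)
```

???

- Because p=0.5 makes the binomial distribution symmetric, doubling the smaller tail matches the two-sided p-value reported by `binom.test()`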