count: false class: bg-main1 hide-slide-number split-70 .column[.right.vmiddle.content[ .font3[.amber[Differences] Between Two Groups] ]] .bg-main4.column[.vmiddle.content[ .amber[Aly Lamuri] Indonesia Medical Education and Research Institute ]] --- count: false class: bg-main3 # .amber[Recap] .font2[ - Test of .amber[proportional difference] - Exact test - Approximation - Test of .amber[mean difference] - One sample - Two samples - Multiple samples ] ??? - Differences between Fisher's exact and Pearson's `\(\chi^2\)` - One-sample mean difference: Z-Test, T-Test - Two-sample mean difference: unpaired and paired T-Test - Multiple-sample mean difference: one-way, factorial, repeated measures ANOVA --- name: overview layout: true class: bg-main4 middle split-30 hide-slide-number .column[.vmiddle.right.content[ .amber.font3[Overview] ]] --- template: overview count: false .bg-main1.column[.vmiddle.content[ - .amber[Non-parametric test] - One-sample test - Two-sample test - Paired test ]] --- layout: false class: bg-main3 # Non-parametric test .font2[ - Parametric tests assume normality - Small sample size `\(\to\)` hard to assess normality - Severe skewness `\(\to\)` impairs parametric tests - What defines a parametric test anyway? ] ??? - In a parametric test, we estimate the population parameter using the data we have - Example: in a T-Test, we hypothesize that both groups come from the same population, thus having roughly equal means -- .font2[ .amber[Solution:] Use a non-parametric test ] ??? - In a non-parametric test we do not assume a particular parameter - We only measure whether our samples have a roughly similar presentation --- count: false class: bg-main3 split-two # Skewness .column[.vmiddle.content[ <img src="index_files/figure-html/plt.data-1.png" width="100%" /> ]] .column[.vmiddle.content[ <img src="index_files/figure-html/plt.data.skew-1.png" width="100%" /> ]] --- class: bg-main3 # When should we use a non-parametric test? 
.font2[ - Small sample size - Data is not .pink[asymptotically] normal - The presence of extreme outliers or severe skewness - We cannot ascertain the .pink[parameter] of its population ] ??? - Using a non-parametric test on large data is unwieldy - To a certain degree, the parametric test is robust to a non-normal distribution - Outliers and skewness make it hard to estimate the mean difference, even with a large sample size -- ## Hypotheses .font2[ - `\(H_0\)`: The sampled groups come from the .amber[same] population - `\(H_1\)`: The sampled groups come from .amber[different] populations ] ??? Notice the difference in hypothesis declaration between parametric and non-parametric tests --- class: bg-main3 # But how do we measure population-based .amber[difference]? -- .font2[.amber[Hint:] We use the central tendency] -- .font2[However, we cannot rely on the mean] -- .font2[So, we use the .amber[median] instead] -- ## .amber[Hypotheses] .font2[ - `\(H_0:\ M_1 = M_2\)` - `\(H_1:\ M_1 \neq M_2\)` ] --- template: overview count: false .bg-main1.column[.vmiddle.content[ - Non-parametric test - .amber[One-sample test] - Two-sample test - Paired test ]] --- class: bg-main3 # One-sample test .font2[ - One-sample sign test - One-sample Wilcoxon signed rank test ] ??? - Similar to the parametric case, we have one group of observations - We would like to know whether our group deviates from the hypothesized median - The one-sample Wilcoxon test is analogous to the one-sample T-Test --- class: bg-main3 # One-sample sign test .font2[ - .amber[Does not] assume normality or a symmetric distribution - Usable with skewed data - Follows a .amber[binomial] distribution ] ??? - A special case of the binomial test with p=0.5 - Why p=0.5? 
- Because the chance of an observation falling above (or below) the hypothesized median `\(M_0\)` is 0.5, since the median is the midpoint --- count: false class: bg-main3 # Procedure .font2[ - Find the residual between each observation and the hypothesized median - Omit all 0 - Disregard the magnitude, take only its .amber[sign] - Calculate the frequency of .amber[positive] and .amber[negative] signs - Let `\(B_s\)` be the resultant `\(\to B_s \sim B(n, 0.5)\)` ] ??? - Because we only have two outcomes of interest - Whether the observation has a positive or negative sign - With all instances being independent, we have a Bernoulli trial - Then we model our probability using the binomial distribution --- layout: true class: bg-main3 # Example, please? --- count: false ```r # Generate skewed data using a Chi-squared distribution set.seed(1) x <- rchisq(10, 4) %T>% print() ``` ``` ## [1] 1.66 7.14 6.93 4.10 7.77 5.08 4.58 2.30 1.36 1.67 ``` .font2[ - Here we have `\(X \sim \chi^2(4)\)` - Let `\(H_0\)` be `\(M = 5\)` - And we are interested in conducting a two-tailed test ] -- ```r # Set M and find the residual (difference) M <- 5 diff <- {x - M} # Make a data frame tbl <- data.frame(x=x, abs.diff=abs(diff), sign=sign(diff)) ``` --- count: false .bg-white.content[ <br>
<br> ] --- count: false ```r # Perform a binomial test res <- lapply(c(-1, 1), function(sign) { binom.test(sum(tbl$sign==sign), nrow(tbl), 0.5) %>% broom::tidy() }) # Two-tailed test on sign=-1 knitr::kable(res[[1]]) %>% kable_minimal() ``` <table class=" lightable-minimal" style='font-family: "Trebuchet MS", verdana, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> <th style="text-align:right;"> parameter </th> <th style="text-align:right;"> conf.low </th> <th style="text-align:right;"> conf.high </th> <th style="text-align:left;"> method </th> <th style="text-align:left;"> alternative </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0.6 </td> <td style="text-align:right;"> 6 </td> <td style="text-align:right;"> 0.754 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 0.262 </td> <td style="text-align:right;"> 0.878 </td> <td style="text-align:left;"> Exact binomial test </td> <td style="text-align:left;"> two.sided </td> </tr> </tbody> </table> ```r # Two-tailed test on sign=1 knitr::kable(res[[2]]) %>% kable_minimal() ``` <table class=" lightable-minimal" style='font-family: "Trebuchet MS", verdana, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:right;"> estimate </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> <th style="text-align:right;"> parameter </th> <th style="text-align:right;"> conf.low </th> <th style="text-align:right;"> conf.high </th> <th style="text-align:left;"> method </th> <th style="text-align:left;"> alternative </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0.4 </td> <td style="text-align:right;"> 4 </td> <td style="text-align:right;"> 0.754 </td> <td style="text-align:right;"> 10 </td> <td style="text-align:right;"> 0.122 </td> <td 
style="text-align:right;"> 0.738 </td> <td style="text-align:left;"> Exact binomial test </td> <td style="text-align:left;"> two.sided </td> </tr> </tbody> </table> --- count: false layout: false class: bg-main3 # Caveats .font2[ - Only considers the sign of the difference - Neglects the magnitude - What if our data has severe skewness? ] --- layout: false class: bg-main3 # One-sample Wilcoxon signed rank test .font2[ - Does not assume normality - But still assumes a .amber[symmetric] distribution - What distribution is symmetric but not normal? ] ??? - Hint: uniform distribution - Other examples: Cauchy distribution, generalized normal distribution, etc. - This test is not good for skewed data --- count: false class: bg-main3 # Procedure .font2[ - Similar to performing a sign test - However, we assign ranks based on the computed differences - The statistic is the sum of ranks within each sign - Take the .amber[minimum value] between both statistics ] ??? - Also referred to as the signed rank sum test --- layout: true count: false class: bg-main3 # Example, please? --- ```r # Generate skewed data using a Chi-squared distribution set.seed(1) x <- rchisq(10, 4) %T>% print() ``` ``` ## [1] 1.66 7.14 6.93 4.10 7.77 5.08 4.58 2.30 1.36 1.67 ``` .font2[ - Assigning ranks preserves the magnitude - Let `\(H_0: M = 5\)` ] ```r # Add a rank column to the data frame tbl$ranked <- rank(tbl$abs.diff) ``` ??? - We will re-use the same data - We keep the hypotheses to replicate our previous analysis --- count: false .bg-white.content[ <br>
<br> ] --- count: false ```r # Calculate the statistic W <- tapply(tbl$ranked, tbl$sign, sum) %>% min() %T>% print() ``` ``` ## [1] 17 ``` ```r # Find the p-value for a two-tailed test psignrank(W, nrow(tbl)) * 2 ``` ``` ## [1] 0.322 ``` ```r # Built-in test wilcox.test(x, mu=5) ``` ``` ## ## Wilcoxon signed rank exact test ## ## data: x ## V = 17, p-value = 0.3 ## alternative hypothesis: true location is not equal to 5 ``` --- template: overview count: false .bg-main1.column[.vmiddle.content[ - Non-parametric test - One-sample test - .amber[Two-sample test] - Paired test ]] --- layout: false class: bg-main3 # Two-sample test .font2[ - Mann-Whitney U test - Does not assume normality - Can handle skewed data - Assumes i.i.d. observations ] ??? - Also referred to as the unpaired two-sample Wilcoxon test - Does not imply mean difference - Applicable to a small dataset --- class: bg-main3 # Procedure .font2[ - Pool all data elements from both groups - Sort from smallest to largest - Assign a rank to each value - Compare the rank sums of both groups ] ??? - This is the concept of *sum of ranks* - Less statistical power than parametric tests - Still assumes i.i.d. --- layout: true class: bg-main3 # Example, please? --- ```r # We will use x as the first group x ``` ``` ## [1] 1.66 7.14 6.93 4.10 7.77 5.08 4.58 2.30 1.36 1.67 ``` ```r # Assign x+4 as the second group, make a data frame tbl <- data.frame( obs=c(x, x+4), group=rep(c("1", "2"), each=length(x)) %>% factor() ) %T>% str() ``` ``` ## 'data.frame': 20 obs. of 2 variables: ## $ obs : num 1.66 7.14 6.93 4.1 7.77 ... ## $ group: Factor w/ 2 levels "1","2": 1 1 1 1 1 1 1 1 1 1 ... 
``` --- count: false ```r # Goodness of fit test to determine the distribution tapply(tbl$obs, tbl$group, ks.test, pnorm) %>% lapply(broom::tidy) %>% lapply(data.frame) %>% {do.call(rbind, .)} %>% kable() %>% kable_minimal() ``` <table class=" lightable-minimal" style='font-family: "Trebuchet MS", verdana, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> <th style="text-align:left;"> method </th> <th style="text-align:left;"> alternative </th> </tr> </thead> <tbody> <tr> <td style="text-align:right;"> 0.913 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:left;"> One-sample Kolmogorov-Smirnov test </td> <td style="text-align:left;"> two-sided </td> </tr> <tr> <td style="text-align:right;"> 1.000 </td> <td style="text-align:right;"> 0 </td> <td style="text-align:left;"> One-sample Kolmogorov-Smirnov test </td> <td style="text-align:left;"> two-sided </td> </tr> </tbody> </table> -- .amber.font2[A random quiz has appeared!] -- .font2[Why do we use Kolmogorov-Smirnov test instead of a normality test?] ??? Is there another option? --- count: false <img src="index_files/figure-html/plt.tbl-1.png" width="100%" /> --- count: false ```r wilcox.test(obs ~ group, data=tbl, conf.int=TRUE) ``` ``` ## ## Wilcoxon rank sum exact test ## ## data: obs by group ## W = 12, p-value = 0.003 ## alternative hypothesis: true location shift is not equal to 0 ## 95 percent confidence interval: ## -6.74 -1.26 ## sample estimates: ## difference in location ## -4 ``` ```r rstatix::wilcox_effsize(obs ~ group, data=tbl) ``` ``` ## # A tibble: 1 x 7 ## .y. 
group1 group2 effsize n1 n2 magnitude ## * <chr> <chr> <chr> <dbl> <int> <int> <ord> ## 1 obs 1 2 0.642 10 10 large ``` --- template: overview count: false .bg-main1.column[.vmiddle.content[ - Non-parametric test - One-sample test - Two-sample test - .amber[Paired test] ]] --- layout: false class: bg-main3 # Paired test .font2[ - Both groups are not independent - Paired Wilcoxon test - Does not assume normality - Does not imply mean difference ] ??? - Symmetric data is a plus though - Does not imply mean difference - Applicable to a small dataset --- class: bg-main3 # Procedure .font2[ - Akin to the one-sample Wilcoxon test - Measure the difference between paired data points - Remove zeros (if any), then recompute `\(n\)` - Assign ranks to the absolute differences - Calculate the statistic based on rank and sign ] --- layout: true class: bg-main3 # Example, please? --- ```r # We will use the ChickWeight dataset str(ChickWeight) ``` ``` ## Classes 'nfnGroupedData', 'nfGroupedData', 'groupedData' and 'data.frame': 578 obs. of 4 variables: ## $ weight: num 42 51 59 64 76 93 106 125 149 171 ... ## $ Time : num 0 2 4 6 8 10 12 14 16 18 ... ## $ Chick : Ord.factor w/ 50 levels "18"<"16"<"15"<..: 15 15 15 15 15 15 15 15 15 15 ... ## $ Diet : Factor w/ 4 levels "1","2","3","4": 1 1 1 1 1 1 1 1 1 1 ... ## - attr(*, "formula")=Class 'formula' language weight ~ Time | Chick ## .. ..- attr(*, ".Environment")=<environment: R_EmptyEnv> ## - attr(*, "outer")=Class 'formula' language ~Diet ## .. 
..- attr(*, ".Environment")=<environment: R_EmptyEnv> ## - attr(*, "labels")=List of 2 ## ..$ x: chr "Time" ## ..$ y: chr "Body weight" ## - attr(*, "units")=List of 2 ## ..$ x: chr "(days)" ## ..$ y: chr "(gm)" ``` --- count: false ```r # Assess normality tapply(ChickWeight$weight, ChickWeight$Time, shapiro.test) %>% lapply(broom::tidy) %>% lapply(data.frame) %>% {do.call(rbind, .)} %>% kable() %>% kable_minimal() ``` <table class=" lightable-minimal" style='font-family: "Trebuchet MS", verdana, sans-serif; margin-left: auto; margin-right: auto;'> <thead> <tr> <th style="text-align:left;"> </th> <th style="text-align:right;"> statistic </th> <th style="text-align:right;"> p.value </th> <th style="text-align:left;"> method </th> </tr> </thead> <tbody> <tr> <td style="text-align:left;"> 0 </td> <td style="text-align:right;"> 0.890 </td> <td style="text-align:right;"> 0.000 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 2 </td> <td style="text-align:right;"> 0.873 </td> <td style="text-align:right;"> 0.000 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 4 </td> <td style="text-align:right;"> 0.973 </td> <td style="text-align:right;"> 0.315 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 6 </td> <td style="text-align:right;"> 0.982 </td> <td style="text-align:right;"> 0.648 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 8 </td> <td style="text-align:right;"> 0.980 </td> <td style="text-align:right;"> 0.577 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 10 </td> <td style="text-align:right;"> 0.981 </td> <td style="text-align:right;"> 0.616 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 12 
</td> <td style="text-align:right;"> 0.983 </td> <td style="text-align:right;"> 0.686 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 14 </td> <td style="text-align:right;"> 0.973 </td> <td style="text-align:right;"> 0.325 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 16 </td> <td style="text-align:right;"> 0.986 </td> <td style="text-align:right;"> 0.830 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 18 </td> <td style="text-align:right;"> 0.991 </td> <td style="text-align:right;"> 0.975 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 20 </td> <td style="text-align:right;"> 0.991 </td> <td style="text-align:right;"> 0.968 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> <tr> <td style="text-align:left;"> 21 </td> <td style="text-align:right;"> 0.986 </td> <td style="text-align:right;"> 0.869 </td> <td style="text-align:left;"> Shapiro-Wilk normality test </td> </tr> </tbody> </table> --- count: false ```r # Subset the dataset to exclude normally distributed data tbl <- subset(ChickWeight, subset={ChickWeight$Time %in% c(0, 2)}) # Make Time as a factor tbl$Time %<>% factor(levels=c(0, 2)) ``` --- count: false ```r # Perform a paired Wilcoxon test wilcox.test(weight ~ Time, data=tbl, paired=TRUE, conf.int=TRUE) ``` ``` ## ## Wilcoxon signed rank test with continuity correction ## ## data: weight by Time ## V = 8, p-value = 1e-09 ## alternative hypothesis: true location shift is not equal to 0 ## 95 percent confidence interval: ## -9.0 -7.5 ## sample estimates: ## (pseudo)median ## -8.5 ``` ```r rstatix::wilcox_effsize(weight ~ Time, data=tbl, paired=TRUE) ``` ``` ## # A tibble: 1 x 7 ## .y. 
group1 group2 effsize n1 n2 magnitude ## * <chr> <chr> <chr> <dbl> <int> <int> <ord> ## 1 weight 0 2 0.862 50 50 large ``` --- count: false <img src="index_files/figure-html/paired.wilcox6-1.png" width="100%" /> --- layout: false count: false class: bg-main1 middle hide-slide-number font5 center .amber[Query?]
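---

count: false
layout: false
class: bg-main3

# Appendix: sign test by hand

The sign-test procedure described earlier can be reproduced without `binom.test()`, making each step explicit. This is a minimal sketch assuming the same simulated data as the slides (`rchisq(10, 4)` with seed 1) and the same hypothesized median `\(M = 5\)`.

```r
# A minimal sketch of the one-sample sign test, assuming the same
# simulated data and hypothesized median used in the earlier slides
set.seed(1)
x <- rchisq(10, 4)   # skewed sample, X ~ chi-squared(4)
M <- 5               # hypothesized median under H0

d <- x - M           # residuals against the hypothesized median
d <- d[d != 0]       # omit exact zeros before counting signs
n <- length(d)
n_pos <- sum(d > 0)  # frequency of positive signs

# Under H0, n_pos ~ B(n, 0.5); two-tailed exact p-value taken as
# twice the smaller tail probability, capped at 1
p <- min(1, 2 * min(pbinom(n_pos, n, 0.5), 1 - pbinom(n_pos - 1, n, 0.5)))

# Cross-check against the built-in exact binomial test
c(manual = p, builtin = binom.test(n_pos, n, 0.5)$p.value)
```

???

- Because p=0.5 makes the binomial distribution symmetric, doubling the smaller tail matches the two-sided p-value reported by `binom.test()`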