About me
Aly Lamuri
FKUI 2011
Newcastle 2018
Academic writer
Research assistant
A Gentle Introduction to Biostatistics
Aly Lamuri
Indonesia Medical Education and Research Institute

Outline
Understand the basics of statistics
HBU?
Population: all observable subjects inhabiting a certain location
Parameter: a quantitative summary of a population
X: Data element
N: Number of elements
P: Proportion
M: Median
μ: Average
σ: Standard deviation
σ²: Variance
ρ: Correlation coefficient
Sample: a subset of an observable population
Statistic: a quantitative summary of a sample
x: Data element
n: Number of elements
p: Proportion
m: Median
x̄: Average
s: Standard deviation
s²: Variance
r: Correlation coefficient
| Statistic | Meaning | Parameter |
|---|---|---|
| `x` | Data element | `X` |
| `n` | Number of elements | `N` |
| `p` | Proportion | `P` |
| `m` | Median | `M` |
| `x̄` | Average | `μ` |
| `s` | Standard deviation | `σ` |
| `s²` | Variance | `σ²` |
| `r` | Correlation coefficient | `ρ` |
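The distinction between a parameter and a statistic can be sketched with a small simulation. The example below assumes a hypothetical population of 10,000 simulated heights; the names `population` and `sampled` and all numeric settings are illustrative and not taken from the slides. It computes the parameter μ on the whole population and the statistic x̄ on a sample of 10.

```r
# Hypothetical population: 10,000 simulated heights (illustrative values only)
set.seed(42)
population <- rnorm(10000, mean = 160, sd = 10)

# Parameters describe the whole population
N  <- length(population)
mu <- mean(population)

# Statistics describe a sample drawn from that population
sampled <- sample(population, size = 10)
n       <- length(sampled)
x.bar   <- mean(sampled)

cat("Parameter mu   :", round(mu, 1), "\n")
cat("Statistic x-bar:", round(x.bar, 1), "\n")
```

In practice the population is rarely fully observed, so the statistic serves as an estimate of the parameter.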
set.seed(1)
X <- rnorm(10, mean=160, sd=10)
print(X)
## [1] 153.7 161.8 151.6 176.0 163.3 151.8 164.9 167.4 165.8 156.9
print(X[7])
## [1] 164.9
length(X)
## [1] 10
sum(X > 165) / length(X)
## [1] 0.3
$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$$
sum(X) / length(X)
## [1] 161.3
mean(X)
## [1] 161.3
Problem: not all data are distributed evenly
Solution: use another measure → the median
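The motivation for the median can be sketched by adding one extreme value to the data. The outlier of 1600 below is a hypothetical data-entry error introduced only for illustration, and `X.out` is an illustrative name.

```r
# Same data as in the slides
set.seed(1)
X <- rnorm(10, mean = 160, sd = 10)

# Hypothetical outlier: one value recorded as 1600 (illustration only)
X.out <- c(X, 1600)

# The mean shifts substantially, while the median barely moves
cat("Mean   without / with outlier:", mean(X), mean(X.out), "\n")
cat("Median without / with outlier:", median(X), median(X.out), "\n")
```

A single extreme value pulls the mean towards it, whereas the median depends only on the middle of the sorted data.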

With the values sorted in ascending order:
$$m = \begin{cases} x_{\frac{n+1}{2}} & : 2 \nmid n \\ \frac{1}{2}\left(x_{\frac{n}{2}} + x_{\frac{n}{2}+1}\right) & : 2 \mid n \end{cases}$$
A quick demo:
sort(X)
## [1] 151.6 151.8 153.7 156.9 161.8 163.3 164.9 165.8 167.4 176.0
median(X)
## [1] 162.6
mean(X)
## [1] 161.3
Take our data as an example:
set.seed(1)
X <- rnorm(10, mean=160, sd=10)
print(X)
## [1] 153.7 161.8 151.6 176.0 163.3 151.8 164.9 167.4 165.8 156.9
d <- X - mean(X)
print(d, digits=2)
## [1] -7.59 0.51 -9.68 14.63 1.97 -9.53 3.55 6.06 4.44 -4.38
Hard to find a general property from the raw deviations! → Potential solution?
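One reason the raw deviations are hard to summarise is that the positive and negative values cancel out: deviations from the mean always sum to zero. A minimal check on the same data, where `total` is an illustrative name:

```r
# Same data as in the slides
set.seed(1)
X <- rnorm(10, mean = 160, sd = 10)

# Deviations from the mean
d <- X - mean(X)

# Their sum is zero, up to floating-point rounding
total <- sum(d)
print(abs(total) < 1e-9)
```

Averaging the raw deviations is therefore uninformative, so the sign has to be removed first.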
We can take the absolute value of each deviation and compute the mean:
$$\bar{d} = \frac{1}{N}\sum_{i=1}^{N} |X_i - \mu|$$
A quick demo:
d <- abs(X - mean(X))
print(d, digits=2)
## [1] 7.59 0.51 9.68 14.63 1.97 9.53 3.55 6.06 4.44 4.38
d.bar <- mean(d)
print(d.bar, digits=2)
## [1] 6.2
Now it is easier to report the findings as $\bar{x} \pm \bar{d}$, or numerically as 161.32 ± 6.23 → Yet this practice is uncommon.
Another alternative is to take the root mean square of the deviations, which defines the standard deviation:
$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(X_i - \mu)^2}$$
A quick demo:
std.dev <- sqrt(sum((X - mean(X))^2) / length(X))
print(std.dev)
## [1] 7.4
When estimating from a sample, we need to adjust the estimate by applying Bessel's correction. Simply put, we divide the sum of squared deviations by n−1 instead of N.
$$s = \sqrt{\frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2}$$
A quick demo:
std.dev <- sqrt(sum((X - mean(X))^2) / (length(X) - 1))
print(std.dev)
## [1] 7.8
sd(X) # Built-in function to calculate standard deviation
## [1] 7.8
Bessel's correction is applied to correct the bias in estimating the population variance.
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
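The bias that Bessel's correction addresses can be sketched with a simulation. The settings below (10,000 repeated samples of size 10 from a normal distribution with variance 100) and the helper `var.n` are assumptions made for illustration, not part of the slides.

```r
set.seed(123)
n    <- 10     # sample size
reps <- 10000  # number of repeated samples (illustrative choice)

# Hypothetical helper: variance dividing by n, i.e. without Bessel's correction
var.n <- function(x) sum((x - mean(x))^2) / length(x)

# Draw repeated samples from a population with true variance 10^2 = 100
samples <- replicate(reps, rnorm(n, mean = 160, sd = 10), simplify = FALSE)

biased    <- mean(sapply(samples, var.n))  # divides by n
corrected <- mean(sapply(samples, var))    # var() divides by n - 1

cat("True variance                  : 100\n")
cat("Mean estimate dividing by n    :", round(biased, 1), "\n")
cat("Mean estimate dividing by n - 1:", round(corrected, 1), "\n")
```

On average, dividing by n underestimates the true variance by a factor of (n−1)/n, while dividing by n−1 lands close to it.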
Our data:
sort(X)
## [1] 152 152 154 157 162 163 165 166 167 176
quantile(X, probs=seq(0, 1, 1/5))
##   0%  20%  40%  60%  80% 100%
##  152  153  160  164  166  176
quantile(X, probs=seq(0, 1, 1/4))
##   0%  25%  50%  75% 100%
##  152  155  163  166  176
Conclusion

Query?





The chance of:
Overall probability of consecutive independent occurrences: 1 in 12 million → Rare!
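The overall figure follows from the multiplication rule: for independent events, the joint probability is the product of the individual probabilities. The sketch below uses hypothetical chances chosen only to illustrate the rule; they are not the values behind the figure quoted above.

```r
# Hypothetical, illustrative chances of three independent events
p.events <- c(1/10, 1/20, 1/50)

# Multiplication rule: P(A and B and C) = P(A) * P(B) * P(C)
p.joint <- prod(p.events)

cat("Joint probability: 1 in", format(round(1 / p.joint), big.mark = ","), "\n")
```

Each additional independent event multiplies the joint probability by a number below 1, so long chains of coincidences quickly become rare.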


Example: 8 out of 10 dentists recommend Colg*te
Slide and short note: http://bit.ly/biostatistik-ukrida
