Mathematics for Machine Learning
Aly Lamuri
Department of Medical Physics
Overview
Objects in math depend on the branch in which they are defined; in linear algebra:
Spaces:
αx+βy=c

When a straight line is rotated about one endpoint, it traces a circle whose radius equals the length of that line
Where's the fifth one?
If two lines are drawn which intersect a third in such a way that the sum of the inner angles on one side is less than two right angles, then the two lines inevitably must intersect each other on that side if extended far enough
Also known as the parallel postulate



Classes:
Let $X = (x_1, x_2, \ldots, x_n)$, $A = (\alpha_1, \alpha_2, \ldots, \alpha_m) : \forall A, X \in \mathbb{R}$, then
$$\exists f(x) = \sum_{1}^{j} \sum_{1}^{i} \alpha_{ij} \cdot x_i = \Pi$$
$$\begin{aligned} 4x_1 + 4x_2 &= 5 \\ 2x_1 - 4x_2 &= 1 \\ \hline 6x_1 &= 6 \end{aligned} \quad (+)$$
$$\therefore x_1 = 1, \quad x_2 = 0.25$$
$$\begin{bmatrix} 4 & 4 \\ 2 & -4 \end{bmatrix} \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \end{bmatrix}$$
Let $A := \begin{bmatrix} 4 & 4 \\ 2 & -4 \end{bmatrix}$
$$A \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 5 \\ 1 \end{bmatrix}$$
$$A^{-1} \cdot A \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = A^{-1} \begin{bmatrix} 5 \\ 1 \end{bmatrix}$$
Recall that $A \cdot A^{-1} = I$. Start by placing the original matrix and the identity matrix side by side:
$$\left[\begin{array}{cc|cc} 4 & 4 & 1 & 0 \\ 2 & -4 & 0 & 1 \end{array}\right]$$
Add the second row to the first:
$$\left[\begin{array}{cc|cc} 6 & 0 & 1 & 1 \\ 2 & -4 & 0 & 1 \end{array}\right]$$
Divide the first row by 6:
$$\left[\begin{array}{cc|cc} 1 & 0 & \frac{1}{6} & \frac{1}{6} \\ 2 & -4 & 0 & 1 \end{array}\right]$$
Divide the second row by 2:
$$\left[\begin{array}{cc|cc} 1 & 0 & \frac{1}{6} & \frac{1}{6} \\ 1 & -2 & 0 & \frac{1}{2} \end{array}\right]$$
Subtract the first row from the second:
$$\left[\begin{array}{cc|cc} 1 & 0 & \frac{1}{6} & \frac{1}{6} \\ 0 & -2 & -\frac{1}{6} & \frac{1}{3} \end{array}\right]$$
Divide the second row by $-2$:
$$\left[\begin{array}{cc|cc} 1 & 0 & \frac{1}{6} & \frac{1}{6} \\ 0 & 1 & \frac{1}{12} & -\frac{1}{6} \end{array}\right]$$
Now you have $I$ on the left and $A^{-1}$ on the right :)
This is one of the simplest methods to obtain $A^{-1}$ for $m \times m$ matrices
$$A^{-1} \cdot A \cdot \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = A^{-1} \begin{bmatrix} 5 \\ 1 \end{bmatrix}$$
$$\begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} \frac{1}{6} & \frac{1}{6} \\ \frac{1}{12} & -\frac{1}{6} \end{bmatrix} \cdot \begin{bmatrix} 5 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ \frac{1}{4} \end{bmatrix}$$
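A minimal NumPy sketch of the same computation; the manual Gauss-Jordan steps above are essentially what `np.linalg.inv` does internally:

```python
import numpy as np

# Coefficient matrix and right-hand side from the worked example above
A = np.array([[4.0, 4.0],
              [2.0, -4.0]])
b = np.array([5.0, 1.0])

A_inv = np.linalg.inv(A)   # [[1/6, 1/6], [1/12, -1/6]], matching Gauss-Jordan
x = A_inv @ b              # [1.0, 0.25]

# In practice solve() is preferred over forming the inverse explicitly
assert np.allclose(np.linalg.solve(A, b), x)
print(x)  # [1.   0.25]
```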
d:X×X→[0,∞)
Let $x, y, z \in X$:
$$d(x,y) = 0 \iff x = y$$
$$d(x,y) > 0 \iff x \neq y$$
$$d(x,y) = d(y,x)$$
$$d(x,y) \leqslant d(x,z) + d(z,y)$$
$$D = \sqrt{\sum_{i=1}^{n} (p_i - q_i)^2}$$
where $n$ is the number of dimensions
$$D = \sum_{i=1}^{n} |p_i - q_i|$$
$$D = \left(\sum_{i=1}^{n} |p_i - q_i|^p\right)^{\frac{1}{p}}$$
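A small Python sketch of the three distances above; Euclidean and Manhattan are just the Minkowski distance with $p = 2$ and $p = 1$:

```python
import numpy as np

def minkowski(p_vec, q_vec, p=2):
    """Minkowski distance: p=1 gives Manhattan, p=2 gives Euclidean."""
    return float(np.sum(np.abs(p_vec - q_vec) ** p) ** (1 / p))

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 6.0, 3.0])

print(minkowski(a, b, p=2))  # 5.0 -> Euclidean distance
print(minkowski(a, b, p=1))  # 7.0 -> Manhattan distance
```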

Example: Hilbert space, $L^p$ space
Imagine an inner product as a dot product without the dimensionality constraint
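To make "a dot product without the dimensionality constraint" concrete, here is a rough numeric sketch of the $L^2$ inner product $\langle f, g \rangle = \int f(x)\,g(x)\,dx$ of two functions (a hypothetical example, not from the slides):

```python
import numpy as np

# Grid approximation of <f, g> = integral of f(x) * g(x) over [0, 1]
x = np.linspace(0.0, 1.0, 100_001)
f = np.sin(np.pi * x)
g = x

dx = x[1] - x[0]
inner = np.sum(f * g) * dx   # Riemann sum; analytic value is 1/pi
print(inner, 1 / np.pi)      # both ~0.3183
```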
Overview
Summary:

A generalization of partial derivatives used to find extrema

$$\nabla f = \begin{bmatrix} \frac{\partial f}{\partial x_1} \\ \vdots \\ \frac{\partial f}{\partial x_n} \end{bmatrix}$$
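A minimal sketch of using $\nabla f$ to locate an extremum by gradient descent, on a hypothetical quadratic $f(x) = x_1^2 + x_2^2$ whose minimum is at the origin:

```python
import numpy as np

def grad_f(x):
    # Gradient of f(x) = x1^2 + x2^2: each partial derivative is 2 * x_i
    return 2 * x

x = np.array([3.0, -2.0])   # arbitrary starting point
lr = 0.1                    # step size

for _ in range(100):
    x = x - lr * grad_f(x)  # step against the gradient

print(x)  # ~[0, 0]: the extremum (here, a minimum)
```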
Overview
For independent events: $P(A|B) = P(A)$, hence $P(A \cap B) = P(A) \cdot P(B)$
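A quick simulation of the product rule above, with hypothetical events (two fair coin flips) standing in for $A$ and $B$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
A = rng.integers(0, 2, n).astype(bool)  # event A: first fair coin lands heads
B = rng.integers(0, 2, n).astype(bool)  # event B: second fair coin lands heads

print((A & B).mean())        # P(A and B), ~0.25
print(A.mean() * B.mean())   # P(A) * P(B), ~0.25
```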
Population: All observable subjects inhabiting a certain location
Parameter: Quantitative summary of a population
X: Data element
N: Number of elements
p: Proportion
M: Median
μ: Average
σ: Standard deviation
σ²: Variance
ρ: Correlation coefficient
Sample: A subset of an observable population
Statistic: Quantitative summary of a sample
x: Data element
n: Number of elements
p̂: Proportion
m: Median
x̄: Average
s: Standard deviation
s²: Variance
r: Correlation coefficient
$$P(E = e) = \frac{E}{S}$$
Independent vs Identical? → I.I.D
Considering I.I.D, can we do a better probability estimation?
In math, please?
$$P(E = e) = f(e) > 0 : E \in S \tag{1}$$
$$\sum_{e \in S} f(e) = 1 \tag{2}$$
$$P(E \in A) = \sum_{e \in A} f(e) : A \subset S \tag{3}$$
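A small check of properties (1)-(3) for a hypothetical PMF, a fair six-sided die:

```python
# PMF of a fair six-sided die: f(e) = 1/6 for every outcome in S
S = [1, 2, 3, 4, 5, 6]
f = {e: 1 / 6 for e in S}

assert all(f[e] > 0 for e in S)            # property (1): f(e) > 0
assert abs(sum(f.values()) - 1.0) < 1e-12  # property (2): sums to 1

A = {2, 4, 6}                              # event: an even roll
print(sum(f[e] for e in A))                # property (3): P(E in A) = 0.5
```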
Similarity: All describe discrete random variables based on Bernoulli trials
Bernoulli trials:

Key question: How many events will occur in n trials, given a certain probability?

Key question: How many failures before getting an event?

Key question: What is the chance of having n events given a rate λ?
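The three key questions above map to the binomial, geometric, and Poisson distributions; a sketch using scipy.stats with hypothetical parameters:

```python
from scipy import stats

# Binomial: number of events across n trials with probability p
print(stats.binom.pmf(3, n=10, p=0.5))   # P(exactly 3 events in 10 trials)

# Geometric: number of trials until the first event
print(stats.geom.pmf(4, p=0.5))          # P(first event on trial 4)

# Poisson: number of events in an interval given rate lambda
print(stats.poisson.pmf(2, mu=3))        # P(exactly 2 events when rate = 3)
```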
Similarity: All describe continuous random variables





$$\bar{X} \xrightarrow{d} N\!\left(\mu, \frac{\sigma}{\sqrt{n}}\right) \quad \text{as } n \to \infty$$
$\xrightarrow{d}$ denotes convergence in distribution of random variables
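A quick simulation of the statement above: sample means of a decidedly non-normal variable (exponential, chosen as a hypothetical example) behave like $N(\mu, \sigma/\sqrt{n})$:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 100  # size of each sample

# 10,000 sample means of an exponential variable (mu = 1, sigma = 1)
means = rng.exponential(scale=1.0, size=(10_000, n)).mean(axis=1)

print(means.mean())  # ~1.0 = mu
print(means.std())   # ~0.1 = sigma / sqrt(n)
```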
$$y = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \epsilon$$
$$g(y) = \beta_0 + \sum_{i=1}^{n} \beta_i x_i + \epsilon$$
Link functions g: logit (also termed logistic regression) and probit, log (log-linear regression), identity, sqrt
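A minimal sketch for the identity link: recovering β₀ and β₁ by ordinary least squares on simulated data (the coefficients 2.0 and 0.5 are hypothetical). Logit or log links would instead require a maximum-likelihood fit, e.g. a GLM routine:

```python
import numpy as np

rng = np.random.default_rng(7)
x = rng.uniform(0, 10, size=200)
y = 2.0 + 0.5 * x + rng.normal(0, 1, size=200)  # y = b0 + b1*x + eps

# Identity link: ordinary least squares on the design matrix [1, x]
X = np.column_stack([np.ones_like(x), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
print(beta)  # ~[2.0, 0.5], recovering beta_0 and beta_1
```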