Math for Machine Learning

Stochastic Processes (Random Planar Maps)

Once you decide to get serious about Data Science, the first few months (if not the first year) can seem pretty difficult, at times maybe even hopeless, especially if you do not have the necessary academic background or it has been a while since you were in an academic setting. In my judgment, you should not let that stop you. There are many resources, online and offline, to help you fill in your gaps.

Below are some of the key areas you should master to go further with data science/machine learning:


Calculus

  • Functions
  • Continuity
  • Differentiability
  • Integration (single and multi-variables)
  • Optimization
  • Convexity/Concavity

Linear Algebra

  • Vectors
  • Matrices
  • Eigenvalues and Eigenvectors
  • Singular Value Decomposition
  • Least Squares Estimation and Matrix Algebra


Probability and Statistics

  • Basic probability
  • Sample spaces
  • Conditional probabilities and independence
  • Random variables
  • Moments
  • Distributions
  • Chi-Squared
  • F-Test
  • T-Test
  • Bayes’ Theorem
  • Marginalization
  • Bayesian Inference
  • Likelihood
  • Estimation
  • Regression
  • Analysis of Variance

Stochastic Processes and Dynamical Systems

  • Dirichlet Processes
  • Gaussian Processes for Machine Learning

What else do you think is necessary?


The notion of a function is that of something which provides exactly one output for each given input.


Think of two sets, D and R, together with a rule that assigns a unique element of R to each and every element of D. This rule is called a function and is denoted by a letter such as f. Given x ∈ D, f(x) is the name of the element of R that results from applying f to x. D is called the domain of f. To make explicit that D is the domain of f, the notation D(f) may be used. The set R is sometimes called the range of f; nowadays it is more commonly known as the codomain. The set of all elements of R of the form f(x) for some x ∈ D is therefore a subset of R. This is sometimes referred to as the image of f. When this set equals R, the function f is said to be onto, or surjective. If f(x) ≠ f(y) whenever x ≠ y, the function is called one-to-one, or injective.

It is customary to write f : D → R to denote the situation just described, where f is a function defined on a domain D with values in a codomain R.
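To make the definitions concrete, here is a minimal Python sketch for finite sets. It is only an illustration under stated assumptions: the helper names image, is_surjective, and is_injective are hypothetical (they do not come from any library), and the example sets D and R and the two functions are made up for this purpose.

```python
def image(f, domain):
    """Return the set of all values f(x) for x in the domain."""
    return {f(x) for x in domain}


def is_surjective(f, domain, codomain):
    """f is onto when its image equals the whole codomain."""
    return image(f, domain) == set(codomain)


def is_injective(f, domain):
    """f is one-to-one when distinct inputs give distinct outputs."""
    xs = set(domain)  # drop repeated inputs, if any
    return len(image(f, xs)) == len(xs)


# Illustrative (assumed) domain and codomain.
D = {-2, -1, 0, 1, 2}
R = {0, 1, 2, 3, 4}


def square(x):
    return x * x


def shift(x):
    return x + 2


print(sorted(image(square, D)))     # [0, 1, 4]
print(is_surjective(square, D, R))  # False: 2 and 3 are never reached
print(is_injective(square, D))      # False: square(-1) == square(1)

print(sorted(image(shift, D)))      # [0, 1, 2, 3, 4]
print(is_surjective(shift, D, R))   # True
print(is_injective(shift, D))       # True
```

Here square maps D into R but is neither onto nor one-to-one, while shift is both. Note that the answer to "is it onto?" depends on the chosen codomain R, not just on the rule itself.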