Machine Learning


Welcome to the Machine Learning Track!

Description:
# Welcome to the Machine Learning Track!
This track introduces the broad range of concepts applied in the day-to-day work of a Data Analyst. It covers each concept with a learning-by-doing methodology for skill-building, providing many exercises and milestone labs for practicing what was previously learned. The objective of this track is to develop the data analysis skills needed to collect, manipulate, and present data for easy consumption by business users.
## Data Life Cycle and the Role of a Data Analyst


Linear Algebra - Basics

Description:
# Linear Algebra - Basics
## Introduction to Linear Algebra
Linear algebra is a branch of mathematics that deals with equations of straight lines. A line is made up of many points, and a point in two-dimensional (2D) space is represented by a pair of coordinates (x, y).
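
As a quick illustration (a sketch, not part of the track's own material), a 2D point can be held as a coordinate pair and a line written as y = m·x + c, so membership on the line is a simple check:

```python
# A 2D point is a pair of coordinates (x, y).
def on_line(point, m, c, tol=1e-9):
    """Return True if the point satisfies the line equation y = m*x + c."""
    x, y = point
    return abs(y - (m * x + c)) < tol

# The line y = 2x + 1 passes through (2, 5) but not through (3, 5).
print(on_line((2.0, 5.0), m=2.0, c=1.0))  # True
print(on_line((3.0, 5.0), m=2.0, c=1.0))  # False
```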


Probability

Discrete Distributions or Probability Mass Functions (PMFs)

Description:
# Discrete Distributions or Probability Mass Functions (PMFs)
A probability distribution is a function that gives the probability of occurrence of every possible outcome of an experiment. Consider the experiment of rolling a die. If the random variable X denotes the outcome of the roll, then the probability distribution of X takes the value $\frac{1}{6}$ for each $X \in \{1, 2, 3, 4, 5, 6\}$.
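
The die-roll PMF above can be written directly in Python (a minimal sketch using only the standard library):

```python
from fractions import Fraction

# PMF of a fair six-sided die: each of the six outcomes has probability 1/6.
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

# A valid PMF assigns non-negative mass that sums to exactly 1.
assert sum(pmf.values()) == 1
print(pmf[3])  # 1/6
```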


Continuous Distributions

Introduction to Linear Regression

Description:
# Introduction to Linear Regression
Linear regression is a supervised learning algorithm. Given a single feature, it fits the line that best predicts the dependent variable; when many features are involved, it fits a hyperplane that minimizes the error between the predicted values and the ground truth. Given an input vector $X = (X_1, X_2, \ldots, X_n)$ that we want to use to predict the output $y$, the regression equation is: $$y=\beta_0 + \sum_{i=1}^n X_i\beta_i$$
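
The regression equation can be evaluated directly; a minimal sketch (the intercept and coefficients below are made up for illustration, not fitted values):

```python
def predict(x, beta0, beta):
    """Evaluate y = beta0 + sum_i x_i * beta_i for one input vector x."""
    assert len(x) == len(beta)
    return beta0 + sum(xi * bi for xi, bi in zip(x, beta))

# Hypothetical model with intercept 1.0 and two coefficients.
y = predict([2.0, 3.0], beta0=1.0, beta=[0.5, -1.0])
print(y)  # 1.0 + 2*0.5 + 3*(-1.0) = -1.0
```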


Advanced Linear Regression

Description:
# Advanced Linear Regression
## Improving the Fit - Cross Validation
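
Cross validation splits the data into k folds, training on k−1 folds and validating on the held-out fold. A stdlib-only sketch of the index splitting (in practice scikit-learn's KFold does this, vectorized and shuffled):

```python
def k_fold_indices(n_samples, k):
    """Yield (train_idx, val_idx) pairs for k-fold cross validation."""
    # Distribute samples as evenly as possible across the k folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        val = list(range(start, start + size))
        train = [i for i in range(n_samples) if i < start or i >= start + size]
        yield train, val
        start += size

for train, val in k_fold_indices(6, 3):
    print(val)  # [0, 1], then [2, 3], then [4, 5]
```

Each sample appears in exactly one validation fold, so every point is used for validation exactly once across the k runs.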


Logistic Regression

Description:
# Logistic Regression
## Introduction to Classification
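
Classification with logistic regression rests on the sigmoid (logistic) function, which squashes a linear score into a probability in (0, 1); a minimal sketch:

```python
import math

def sigmoid(z):
    """Logistic function: maps any real-valued score into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# A score of 0 means maximal uncertainty; large positive scores approach 1.
print(sigmoid(0.0))            # 0.5
print(round(sigmoid(4.0), 3))  # 0.982
```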


Logistic Regression: Model Building and Implementation

Description:
# Logistic Regression: Model Building and Implementation
## Titanic Survivors - Data Selection & Preparation
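
Data preparation for a model like this typically means encoding categorical fields and filling missing values. A stdlib-only sketch on made-up Titanic-style rows (the lab itself presumably works on the real Titanic CSV, e.g. with pandas):

```python
from statistics import median

# Hypothetical subset of Titanic-style records; None marks a missing age.
rows = [
    {"sex": "male", "age": 22.0, "survived": 0},
    {"sex": "female", "age": None, "survived": 1},
    {"sex": "female", "age": 26.0, "survived": 1},
]

# Encode 'sex' as 0/1 and impute missing ages with the median known age.
known_ages = [r["age"] for r in rows if r["age"] is not None]
age_median = median(known_ages)
for r in rows:
    r["sex"] = 1 if r["sex"] == "female" else 0
    if r["age"] is None:
        r["age"] = age_median

print(rows[1])  # {'sex': 1, 'age': 24.0, 'survived': 1}
```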


Support Vector Machines (SVMs)

Description:
# Support Vector Machines (SVMs)
## Introduction
Support Vector Machines are classifiers that separate a dataset by introducing an optimal hyperplane between the multi-dimensional data points. A hyperplane is the generalization of a two-dimensional plane to higher dimensions. If the dataset is two-dimensional, a line is fit that provides the best classification of the dataset. By "best classification" we do not necessarily mean a hyperplane that classifies every point in the training dataset perfectly, but one that satisfies the criterion of lying farthest from the points of either class. You can see from the figure below that a hyperplane classifies the dataset as shown.
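
Once a separating hyperplane w·x + b = 0 has been found, classification is just the sign of the decision function. A sketch with made-up weights (a fitted SVM would learn w and b from data):

```python
def decision(x, w, b):
    """Decision function w . x + b for one point x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

def classify(x, w, b):
    """Predict +1 or -1 depending on which side of the hyperplane x lies."""
    return 1 if decision(x, w, b) >= 0 else -1

# Hypothetical 2D hyperplane: x + y - 3 = 0.
w, b = (1.0, 1.0), -3.0
print(classify((2.0, 2.0), w, b))  # 1  (above the line)
print(classify((0.5, 0.5), w, b))  # -1 (below the line)
```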


Advanced k-means

Description:
# Advanced k-means
## Let Us Prepare the 'Normal' Data
To start with clustering, consider datasets that follow the Gaussian 'normal' distribution with low variance. We can synthesize such a dataset using scikit-learn's make_blobs function. The centers of these Gaussian blobs need to be specified; in two dimensions, we specify the centers, the standard deviation, and the number of samples (2000). Here is the Gaussian normal distribution function:
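
The blob synthesis can be sketched with the standard library alone (scikit-learn's make_blobs does the same job, vectorized); the centers and standard deviation below are illustrative:

```python
import random

def make_blobs(centers, cluster_std, n_samples, seed=0):
    """Draw n_samples 2D points from Gaussian blobs around the given centers."""
    rng = random.Random(seed)
    points = []
    for _ in range(n_samples):
        cx, cy = rng.choice(centers)                 # pick a blob at random
        points.append((rng.gauss(cx, cluster_std),   # sample each coordinate
                       rng.gauss(cy, cluster_std)))  # around that center
    return points

# Two well-separated blobs with low variance, 2000 samples in total.
data = make_blobs(centers=[(0.0, 0.0), (5.0, 5.0)], cluster_std=0.5,
                  n_samples=2000)
print(len(data))  # 2000
```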


Maximum Likelihood Estimation (MLE)

Description:
# Maximum Likelihood Estimation (MLE)
* A model hypothesizes a relation between the unknown parameter(s) and the observed data.
* The goal of a statistical analysis is to estimate the unknown parameter(s) in the hypothesized model.
* The likelihood function is a popular and widely used method for estimating unknown parameters.
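
A concrete example: for coin flips modeled as Bernoulli(p), the likelihood is maximized at p̂ = heads/flips, the sample proportion. A sketch that checks this numerically (the grid search is purely illustrative; in practice the maximizer is found in closed form or by an optimizer):

```python
import math

def bernoulli_log_likelihood(p, heads, n):
    """Log-likelihood of observing `heads` successes in n Bernoulli(p) trials."""
    return heads * math.log(p) + (n - heads) * math.log(1 - p)

heads, n = 7, 10
# Grid-search p over (0, 1) for the maximizer of the likelihood.
grid = [i / 1000 for i in range(1, 1000)]
p_hat = max(grid, key=lambda p: bernoulli_log_likelihood(p, heads, n))
print(p_hat)  # 0.7 -- matches the closed-form MLE heads/n
```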


Gaussian Mixture Models (GMMs)

Description:
# Gaussian Mixture Models (GMMs)
## The Three Archers - Not So 'Normal' Data
In an archery training session, three archers shoot at the same target. Assume the archers shoot identical arrows, and at the end of the session they need to tally their scores. What would be the best estimate of each archer's score? Assume the inner yellow ring has the highest score, and the score decreases as the ring your arrow lands in moves away from the center.
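
In GMM terms, each archer is one Gaussian component, and each arrow hole gets a responsibility: the probability that it came from each archer. A 1D sketch of that computation with made-up component parameters (this is the E-step of the EM algorithm):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and std sigma at x."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def responsibilities(x, components):
    """P(component k | x) for equally weighted Gaussian components."""
    densities = [gaussian_pdf(x, mu, sigma) for mu, sigma in components]
    total = sum(densities)
    return [d / total for d in densities]

# Three hypothetical archers aiming at -2, 0 and 2 with the same spread.
archers = [(-2.0, 0.5), (0.0, 0.5), (2.0, 0.5)]
r = responsibilities(1.9, archers)
print(max(range(3), key=lambda k: r[k]))  # 2 -- the arrow most likely came from the third archer
```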


Linearly Inseparable Datasets

Description:
# Linearly Inseparable Datasets
## The Non-Convex Regions


DBScan

Description:
# DBScan
DBSCAN has the ability to capture densely packed data points. It is similar to k-nearest neighbors in that it reasons about each point's neighborhood, but its neighborhood parameters are tunable.
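
A toy sketch of the density idea behind DBSCAN: a point is a core point if at least min_pts points lie within distance eps of it (the full algorithm then grows clusters outward from core points and labels unreachable points as noise):

```python
def neighbors(points, i, eps):
    """Indices of points within distance eps of points[i] (including itself)."""
    px, py = points[i]
    return [j for j, (qx, qy) in enumerate(points)
            if (px - qx) ** 2 + (py - qy) ** 2 <= eps ** 2]

def core_points(points, eps, min_pts):
    """Points whose eps-neighborhood holds at least min_pts points."""
    return [i for i in range(len(points))
            if len(neighbors(points, i, eps)) >= min_pts]

# A dense clump near the origin plus one far-away outlier.
pts = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
print(core_points(pts, eps=0.3, min_pts=4))  # [0, 1, 2, 3]; (5, 5) is noise
```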


Spectral Clustering

Description:
# Spectral Clustering
Spectral clustering works by transforming the data into a subspace prior to clustering. This is especially useful when the data is high-dimensional, as it saves us the effort of running PCA or another dimensionality reduction ourselves before clustering. Spectral clustering determines an affinity matrix between the data points: the data is represented as a graph, and the affinity matrix is computed over it. For the affinity function, we can use the RBF kernel or nearest neighbors.
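
The RBF affinity between two points is exp(−γ‖xᵢ − xⱼ‖²). A sketch building the full affinity matrix (the value of γ here is illustrative):

```python
import math

def rbf_affinity(points, gamma=1.0):
    """Affinity matrix A with A[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(points)
    return [[math.exp(-gamma * sum((a - b) ** 2
                                   for a, b in zip(points[i], points[j])))
             for j in range(n)] for i in range(n)]

pts = [(0.0, 0.0), (0.0, 1.0), (3.0, 3.0)]
A = rbf_affinity(pts)
print(A[0][0])            # 1.0 -- every point is maximally similar to itself
print(A[0][1] > A[0][2])  # True -- nearby points get higher affinity
```

The matrix is symmetric with ones on the diagonal, which is exactly the weighted-graph view spectral clustering operates on.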


Agglomerative Clustering

Description:
# Agglomerative Clustering
## Algorithm


Jensen's Inequality that Guarantees Convergence of the EM Algorithm

Description:
# Jensen's Inequality that Guarantees Convergence of the EM Algorithm
Jensen's Inequality states that for a strictly convex function $g$ and a random variable $X$, $$E[g(X)] \geq g(E[X])$$
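
A quick numerical check of the inequality with the convex function g(x) = x² and a fair six-sided die (for this g, the gap E[X²] − (E[X])² is exactly the variance of X):

```python
# Fair six-sided die: outcomes 1..6, each with probability 1/6.
outcomes = range(1, 7)
E_X = sum(x / 6 for x in outcomes)          # E[X]
E_gX = sum(x ** 2 / 6 for x in outcomes)    # E[g(X)] with g(x) = x^2

print(round(E_X, 10))    # 3.5
print(E_gX >= E_X ** 2)  # True: E[g(X)] >= g(E[X])
```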
