Data Scientist II
# Intro to Python
Python is a simple, easy-to-learn programming language whose syntax resembles pseudocode. It supports the features commonly found in object-oriented languages. Scientists and mathematicians have used Python since its inception, which is why it is popular for analytical tasks. Python is also known for its brevity.

## Why is Python a Favorite of Data Scientists?

# NumPy
## Arrays and Lists
Before introducing the NumPy library, we need to be familiar with a widely used data structure called an array. An array is a collection of homogeneous variables, meaning variables of the same data type. An array can therefore be a collection of integers (int data type), a collection of fractional or decimal values (float data type), or a collection of characters (char data type), which is also referred to as a string.
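The difference between a list and a homogeneous array can be sketched with Python's standard-library `array` module, used here as a stand-in before NumPy is introduced. The variable names and the helper `try_append_float` are illustrative assumptions, not part of the course material.

```python
from array import array

# A Python list may mix data types freely.
mixed = [1, 2.5, "three"]

# A typed array holds values of a single data type.
ints = array("i", [1, 2, 3])       # 'i' = signed integer
floats = array("d", [0.5, 1.25])   # 'd' = double-precision float

# A string behaves like a sequence of characters.
chars = "abc"


def try_append_float(int_array):
    """Return True if a float can be appended to an integer array (hypothetical helper)."""
    try:
        int_array.append(1.5)
        return True
    except TypeError:
        # The array rejects values that do not match its data type.
        return False


accepted = try_append_float(ints)
print(ints.typecode, floats.typecode, list(chars), accepted)
```

The list accepts any mix of values, while the integer array refuses the float, which is the homogeneity property described above.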

# Dataframes
## Dataframe Basics
### The Pandas Library

# Linear Algebra - Basics
# Introduction to Linear Algebra
Linear algebra is a branch of mathematics that deals with linear equations, the simplest of which describe straight lines. A line is made up of points. A point in 2-dimensional (2D) space is represented using two coordinates (x, y).
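As a minimal sketch of these ideas, a 2D point can be stored as an (x, y) pair, and the line through two points can be written as y = m*x + c. The function names `line_through` and `on_line` and the sample points are assumptions chosen for illustration.

```python
# Two points in 2D space, each written as (x, y).
p1 = (1.0, 3.0)
p2 = (3.0, 7.0)


def line_through(a, b):
    """Return slope m and intercept c of the non-vertical line y = m*x + c through a and b."""
    m = (b[1] - a[1]) / (b[0] - a[0])
    c = a[1] - m * a[0]
    return m, c


def on_line(point, m, c, tol=1e-9):
    """Check whether a point satisfies y = m*x + c within a small tolerance."""
    return abs(point[1] - (m * point[0] + c)) <= tol


m, c = line_through(p1, p2)
print(m, c, on_line((2.0, 5.0), m, c), on_line((2.0, 6.0), m, c))
```

Every point that satisfies the equation lies on the line, which is the sense in which a line is made up of points.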

# Data Science workflow
## Origins of Data Science
### History

# Machine Learning
## Introduction to Machine Learning
### Machine Learning

# Data Visualization
## Data Visualization
### Data Visualization

# Introduction to Linear Regression
Linear regression is a supervised learning algorithm. Given a single feature, a line is fit that best predicts the dependent variable. When many features are involved, a hyperplane is fit that minimizes the error between the predicted values and the ground truth. Given an input vector X = (X1, X2, ..., Xn) that we want to use to predict the output y, the regression equation is given by:
$$y=\beta_0 + \sum_{i=1}^n X_i\beta_i$$
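The regression equation can be sketched in plain Python as follows. The function `predict` evaluates the equation directly; `fit_simple` fits the single-feature case by ordinary least squares, which is an assumption here, since the text only says that the error is minimized. The function names and the toy data are illustrative and not taken from the course.

```python
def predict(x, beta0, betas):
    """Evaluate y = beta0 + sum_i x_i * beta_i for one input vector x."""
    return beta0 + sum(xi * bi for xi, bi in zip(x, betas))


def fit_simple(xs, ys):
    """Fit one feature by ordinary least squares (assumed error criterion).

    Returns (beta0, beta1) for the line y = beta0 + beta1 * x.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    beta1 = sxy / sxx
    beta0 = mean_y - beta1 * mean_x
    return beta0, beta1


# Toy data generated from y = 1 + 2x, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

beta0, beta1 = fit_simple(xs, ys)
y_new = predict([4.0], beta0, [beta1])
print(beta0, beta1, y_new)
```

With more than one feature, the same `predict` function applies unchanged, while fitting the coefficients would normally be delegated to a numerical library.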

# Logistic Regression
## Logistic Regression
### Introduction to Classification

# Logistic Regression: Model Building and Implementation
## Titanic Survivors - Data Selection & Preparation

# Support Vector Machines (SVMs)
## Introduction
Support Vector Machines are classifiers that separate a dataset by introducing an optimal hyperplane between the multi-dimensional data points. A hyperplane generalizes a two-dimensional plane to higher dimensions. If the dataset is two-dimensional, the hyperplane is a line, fit so that it provides the best classification of the dataset. Here, "best classification" does not necessarily mean that every point in the training dataset is classified perfectly; rather, the hyperplane satisfies a criterion that places it as far as possible from the points on either side. The accompanying figure shows a hyperplane classifying a dataset.
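The role of the hyperplane can be sketched in code for the two-dimensional case. The weights `w`, the offset `b`, and the sample points below are hypothetical values chosen for illustration; a real SVM would learn `w` and `b` from training data, which this sketch does not attempt.

```python
import math

# Hypothetical separating line x + y - 3 = 0 (not learned from data).
w = (1.0, 1.0)
b = -3.0


def decision(p):
    """Signed value of w . p + b; its sign tells which side of the line p is on."""
    return w[0] * p[0] + w[1] * p[1] + b


def classify(p):
    """Assign class +1 or -1 depending on the side of the line."""
    return 1 if decision(p) >= 0 else -1


def distance(p):
    """Perpendicular distance from p to the line."""
    return abs(decision(p)) / math.hypot(w[0], w[1])


# Illustrative points paired with their class labels.
points = [((0.0, 0.0), -1), ((1.0, 1.0), -1), ((3.0, 3.0), 1), ((2.0, 2.0), 1)]

predictions = [classify(p) for p, _ in points]
margin = min(distance(p) for p, _ in points)
print(predictions, margin)
```

The smallest distance from any point to the line is the margin. Among all lines that separate the classes, the SVM criterion prefers the one for which this margin is largest.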

# Unsupervised Learning
## Unsupervised Learning - K-Means
### Introduction to Unsupervised Learning

# Decision Trees
A decision tree is a computational model that contains a set of if-then-else decisions to classify data. It is similar to a program flow diagram. For example, suppose a bank is deciding whether to open a new branch and wants to weigh the factors that can affect profitability, such as local income levels, the number of existing banks, and location. A decision tree helps to make such decisions based on existing data. Decision trees are used for classification and prediction. Let us use the Titanic example to classify who is likely to survive using decision trees. This will be a binary classifier, but multi-class classifiers can also be implemented. An advantage of decision trees is that they are interpretable.
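The if-then-else structure can be sketched as a small hand-written tree. The rules, thresholds, and field names (`sex`, `pclass`, `age`) below are hypothetical and chosen only to show the shape of a tree; they are not learned from the Titanic data, and a real decision tree would derive its splits from the training set.

```python
def predict_survival(passenger):
    """Return 1 (survives) or 0 (does not survive) using hypothetical rules."""
    # Root decision: split on sex.
    if passenger["sex"] == "female":
        # Second-level decision: split on passenger class.
        if passenger["pclass"] <= 2:
            return 1
        return 0
    # Second-level decision for the other branch: split on age.
    if passenger["age"] < 10:
        return 1
    return 0


# Illustrative passengers, not real records.
example_a = {"sex": "female", "pclass": 1, "age": 30}
example_b = {"sex": "male", "pclass": 3, "age": 40}

print(predict_survival(example_a), predict_survival(example_b))
```

Each prediction can be traced back to the exact sequence of conditions that produced it, which is what makes the model interpretable.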

# Natural Language Processing (NLP)
## Natural Language Processing (NLP)
### Introduction