Machine Learning basics

Core Algorithms

  • Decision Trees: Link
    • Entropy or information gain is used to choose the first split, and so on.
  • Random Forest (decision trees + bagging + random feature subsets) Link.
    • Keeps bias low by growing each individual tree deep (each tree overfits on its own).
    • Reduces variance by combining results from multiple decorrelated trees (bagging).
  • Linear Regression - Elements of Statistical Learning (pages 43-49) and ISLR.
    • Regression analysis playlist Link.
    • Multicollinearity Link.
      • Check pairwise correlations and remove highly correlated features manually.
      • Auxiliary regression (X1 ~ X2 + X3), and compute VIF = 1/(1 - R²); see the VIF sketch after this list.
    • R-squared vs. adjusted R-squared (coefficient of determination).
    • Coefficient calculation.
    • Standard error
    • t-statistic depends on degrees of freedom (approaches the z-statistic for n > 30).
    • F-statistic for testing the significance of more than one variable at a time (similar in spirit to R-squared).
    • Dummy variable trap: leads to a near-singular design matrix.
  • Logistic Regression Link
    • What is the significance of log odds? Link
    • The distribution of odds ratios is right-skewed; taking the log makes it approximately normal. Link
  • Bagging vs Boosting and Stacking Link Link2
  • Gradient Boosting Link Link2
    • An error-reducing strategy that combines the idea of boosting with any weak learner; see the gradient-boosting sketch after this list.
    • e.g., linear regression or a decision tree as the base learner.
  • Bayes Theorem Link
    • P(A|B) = P(B|A) · P(A) / P(B)
  • Naive Bayes Link
  • K-NN Link
    • KNN works by computing the distance between a query point and every example in the data, selecting the K closest examples, and then taking a majority vote over their labels (classification) or averaging them (regression); see the K-NN sketch after this list.
    • A common rule of thumb is k = sqrt(N).
  • K-means Link
    • Pick k centers at random.
    • Assign each data point to its nearest center.
    • Recompute each cluster center as the mean of its assigned points.
    • Move the centers and iterate until the assignments stop changing.
    • Elbow method: choose k where the sum of squared distances between points and their centers stops dropping sharply as k increases; see the k-means sketch after this list.
  • SVM Link Link2
  • PCA and SVD computation from the data matrix (see the PCA/SVD sketch after this list). Link Link
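
A minimal VIF sketch, assuming statsmodels is available; the toy data and column names are made up for illustration. Each feature is regressed on the others (the auxiliary regression) and VIF = 1/(1 - R²).

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Toy data: x3 is almost a linear combination of x1 and x2 (multicollinearity).
rng = np.random.default_rng(0)
X = pd.DataFrame({"x1": rng.normal(size=200), "x2": rng.normal(size=200)})
X["x3"] = X["x1"] + X["x2"] + rng.normal(scale=0.1, size=200)

def vif(df, col):
    """Auxiliary regression of one column on the rest: VIF = 1 / (1 - R^2)."""
    y = df[col]
    others = sm.add_constant(df.drop(columns=[col]))
    r2 = sm.OLS(y, others).fit().rsquared
    return 1.0 / (1.0 - r2)

for col in X.columns:
    print(col, round(vif(X, col), 2))  # x1, x2, x3 should all show inflated VIFs
```

A VIF well above roughly 5-10 is the usual signal that a feature is largely explained by the others.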
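
A bare-bones gradient-boosting sketch for squared-error regression, assuming shallow scikit-learn trees as the weak learner; the learning rate, number of rounds, and synthetic data are illustrative. Each round fits a new tree to the current residuals (the negative gradient of the squared error).

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(300, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.1, size=300)

lr, n_rounds = 0.1, 100
pred = np.full_like(y, y.mean())            # start from the mean prediction
trees = []
for _ in range(n_rounds):
    residual = y - pred                     # negative gradient of squared error
    tree = DecisionTreeRegressor(max_depth=2).fit(X, residual)
    trees.append(tree)
    pred += lr * tree.predict(X)            # take a small step toward the residuals

print("train MSE:", np.mean((y - pred) ** 2))
```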
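
A minimal K-NN classifier in NumPy that follows the description above (Euclidean distance, majority vote); the tiny training arrays are made up.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, query, k=3):
    """Distances from the query to every training point, keep the k nearest, majority vote."""
    dists = np.linalg.norm(X_train - query, axis=1)
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.2, 0.1])))  # -> 0
```

For regression, replace the vote with np.mean(y_train[nearest]).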
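
A short NumPy sketch of the k-means loop above (random centers, assign, recompute, repeat); the synthetic blobs and the range of k are illustrative. The final sum of squared errors per k is what the elbow method plots.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]      # k points at random
    for _ in range(n_iter):
        # Assign each point to its nearest center.
        labels = np.argmin(np.linalg.norm(X[:, None] - centers[None], axis=2), axis=1)
        # Recompute each center as the mean of its assigned points (keep old center if empty).
        new_centers = np.array([X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                                for j in range(k)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    sse = ((X - centers[labels]) ** 2).sum()                # for the elbow plot
    return labels, centers, sse

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(size=(100, 2)) + c for c in ([0, 0], [5, 5], [0, 5])])
for k in range(1, 7):
    print(k, round(kmeans(X, k)[2], 1))  # look for the "elbow" where SSE stops dropping sharply
```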
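
A small sketch of computing PCA directly from the centered data matrix via NumPy's SVD; the random data is only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))

Xc = X - X.mean(axis=0)                      # center each column
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

components = Vt                              # principal directions (one per row)
explained_var = S ** 2 / (len(X) - 1)        # eigenvalues of the covariance matrix
scores = Xc @ Vt.T                           # data projected onto the PCs (equals U * S)

print(explained_var / explained_var.sum())   # explained-variance ratio per component
```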

Important ML Concepts

  • Bias variance Tradeoff Link Link2
    • Random Forest Link
    • Individual trees have low bias; their variance is reduced by bagging.
    • Linear regression is high bias, low variance (a less complex model).
    • A full neural network is low bias, high variance (a more complex model).
    • High bias means underfitting the training data.
    • High variance means large differences in error across datasets (e.g., train vs. test); see the sketch after this list.
    • Why variance reduces, and the role of correlation between learners Link
  • Correlation vs Causation Link
  • Pearson Correlation
  • L1 and L2 Regularizer Link Link2 Link-Viz
  • Why cross entropy vs MSE Link
  • Dummy Variable Trap Link
  • My MLE Notes Link
  • Log-likelihood Link Link2
    • StatQuest MLE Link
    • Probability vs Likelihood Link
    • Negative log likelihood and Cross Entropy Link
  • Least Square Error Link
  • Information Theory Link
  • Parametric and non-Parametric Link
  • ML FAQ Link
  • MLE vs MAP Link
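
A quick sketch of the tradeoff described in the bias-variance item above, using scikit-learn on illustrative synthetic data: a linear model (high bias) vs. a single unpruned decision tree (high variance), compared by train/test error.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X[:, 0]) + rng.normal(scale=0.3, size=400)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for name, model in [("linear (high bias)", LinearRegression()),
                    ("unpruned tree (high variance)", DecisionTreeRegressor())]:
    model.fit(X_tr, y_tr)
    print(name,
          "| train MSE:", round(mean_squared_error(y_tr, model.predict(X_tr)), 3),
          "| test MSE:", round(mean_squared_error(y_te, model.predict(X_te)), 3))
# The linear model underfits both sets (high bias); the unpruned tree fits the training
# set almost perfectly but does noticeably worse on test data (high variance).
```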

Evaluation

  • Precision, Recall, Accuracy, F1, Confusion Matrix Link (see the sketch below)
  • AUC ROC Link Link
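
A minimal sketch computing these metrics with scikit-learn; the label vectors are made up for illustration.

```python
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, confusion_matrix)

y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("confusion matrix -> tn, fp, fn, tp:", tn, fp, fn, tp)
print("precision:", precision_score(y_true, y_pred))   # tp / (tp + fp)
print("recall:   ", recall_score(y_true, y_pred))       # tp / (tp + fn)
print("accuracy: ", accuracy_score(y_true, y_pred))     # (tp + tn) / total
print("f1:       ", f1_score(y_true, y_pred))           # harmonic mean of precision and recall
```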

Machine Learning Extras

Type of Loss link

Pairwise losses link

Decision Trees and Cross Entropy Link

ISOMAP Link

Compressed Sensing Link

SMOTE Link

K-NN Link

Stratified Sampling Link

Categorical data Link

Covariate Shift in ML, or why train and test data should come from a similar distribution Link

Kalman Filter Link