PyTorch is a powerful, flexible, and popular deep learning framework. The learning curve can be steep if you do not have much deep learning background. So what is a good way to learn it? The official PyTorch website has some great tutorials at: https://pytorch.org/tutorials/ IMHO those materials are tuned to an intermediate level • Read More »

Debian 10 ships an NVIDIA GPU driver, CUDA 9, etc., which is a good thing for deep learning (Keras or PyTorch). But if you let it run for some time (without any interruption), it hangs after power management kicks in. The workaround is to turn off power management by doing: sudo systemctl mask • Read More »

The normal use case is to discover interesting relations between variables in large databases, e.g.: i(t) 1 ABDE 2 BCE 3 ABDE 4 ABCE 5 ABCDE 6 BCD The above are some transactions; let A, B, C … be products people bought. We want to find which sets are frequent, e.g.: BC is • Read More »
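Support counting over the six transactions above can be sketched in Python. This is a brute-force count for illustration only (a real Apriori implementation would prune candidates using the downward-closure property); the `min_support` threshold of 4 is an assumed value:

```python
from itertools import combinations

# The six transactions from the example; items are product labels.
transactions = [
    {"A", "B", "D", "E"},
    {"B", "C", "E"},
    {"A", "B", "D", "E"},
    {"A", "B", "C", "E"},
    {"A", "B", "C", "D", "E"},
    {"B", "C", "D"},
]

def frequent_itemsets(transactions, min_support, max_size=3):
    """Brute-force support counting: try every candidate itemset
    up to max_size items and keep the ones meeting min_support."""
    items = sorted(set().union(*transactions))
    frequent = {}
    for k in range(1, max_size + 1):
        for candidate in combinations(items, k):
            support = sum(1 for t in transactions if set(candidate) <= t)
            if support >= min_support:
                frequent[candidate] = support
    return frequent

result = frequent_itemsets(transactions, min_support=4)
print(result[("B", "C")])  # 4: BC appears in transactions 2, 4, 5, 6
```

With `min_support=4`, BC qualifies as a frequent set, matching the example in the text.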

K-means clustering is an unsupervised machine learning algorithm. Wikipedia has a great demo of how it works: Demonstration of the standard algorithm 1. k initial “means” (in this case k=3) are randomly generated within the data domain (shown in color). 2. k clusters are created by associating every observation with the nearest mean. The • Read More »
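The steps above (initialize k means, assign each point to the nearest mean, recompute the means) can be sketched in pure Python; the toy 2-D points below are made up for illustration:

```python
import random

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(pts):
    """Mean of a non-empty list of points, coordinate-wise."""
    n = len(pts)
    return tuple(sum(c) / n for c in zip(*pts))

def kmeans(points, k, iters=10, seed=0):
    """Standard (Lloyd's) algorithm, following the numbered steps."""
    rng = random.Random(seed)
    means = rng.sample(points, k)  # step 1: k initial means
    for _ in range(iters):
        # step 2: assign every observation to the nearest mean
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist2(p, means[i]))
            clusters[nearest].append(p)
        # step 3: recompute each mean as the centroid of its cluster
        means = [centroid(c) if c else means[i] for i, c in enumerate(clusters)]
    return means, clusters

# Two well-separated blobs; k=2 should recover one mean per blob.
pts = [(0.0, 0.0), (0.1, 0.2), (-0.1, 0.1),
       (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
means, clusters = kmeans(pts, k=2)
```

Note that the standard algorithm only converges to a local optimum, which is why the initial means matter.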

(1) Maximizing the margin SVM is very easy to understand on a graph: we just need to find the separating hyperplane that maximizes the margin. See the graph below: (2) How to calculate/find the max margin Assuming the hard-margin case for simplicity of the math, the separating hyperplane can be expressed as: w*x - b = 0 where • Read More »
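Using the plane w*x - b = 0 from above, the standard derivation of the margin width goes as follows (a sketch; labels $y_i \in \{-1, +1\}$ are assumed):

```latex
\begin{aligned}
w \cdot x - b &= \pm 1 \quad \text{(the two margin boundaries)} \\
\text{margin} &= \frac{2}{\lVert w \rVert} \\
\max \text{margin} \;\equiv\; \min_{w,b} \tfrac{1}{2}\lVert w \rVert^2
\quad &\text{s.t.}\quad y_i (w \cdot x_i - b) \ge 1 .
\end{aligned}
```

So maximizing the margin becomes a quadratic program with linear constraints.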

The big picture is: a quadratic programming problem can be reduced to a linear programming problem. Here is how: (1) KKT conditions For any nonlinear program: max f(x), s.t. g(x) <= 0, it has been proved that it must satisfy the Karush–Kuhn–Tucker (KKT) conditions, provided that some regularity conditions hold. How is that proved? It is • Read More »
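For the stated problem (max f(x) s.t. g(x) <= 0), the KKT conditions are the standard four; a sketch for a single inequality constraint with multiplier $\mu$:

```latex
\begin{aligned}
&\text{Stationarity:} && \nabla f(x^*) = \mu \, \nabla g(x^*) \\
&\text{Primal feasibility:} && g(x^*) \le 0 \\
&\text{Dual feasibility:} && \mu \ge 0 \\
&\text{Complementary slackness:} && \mu \, g(x^*) = 0
\end{aligned}
```

Complementary slackness is the key line: either the constraint is inactive ($g(x^*) < 0$, so $\mu = 0$) or it is tight ($g(x^*) = 0$).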

Why study linear programming (LP)? LP has a lot of use cases; one of them is the SVM (support vector machine). The SVM’s Lagrangian dual gives a lower bound on the SVM objective; this Lagrangian dual can be solved by quadratic programming. The KKT conditions of this quadratic programming can be solved by • Read More »

In logistic regression, we simply assume the probability of x being classified as 1 is: P( y = 1 | x ) = 1 / ( 1 + exp ( -w^T x) ) = h_w(x), where w is the parameter vector that we need to learn and optimize from the training set. This is • Read More »
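The model h_w(x) above, plus learning w by batch gradient ascent on the log-likelihood, can be sketched in pure Python (the toy data and learning rate are made up for illustration):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict(w, x):
    """h_w(x) = P(y = 1 | x) = 1 / (1 + exp(-w^T x))."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)))

def train(X, y, lr=0.5, epochs=200):
    """Batch gradient ascent on the log-likelihood:
    d/dw_j = sum_i (y_i - h_w(x_i)) * x_ij."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        for j in range(len(w)):
            grad = sum((yi - predict(w, xi)) * xi[j]
                       for xi, yi in zip(X, y))
            w[j] += lr * grad / len(X)
    return w

# Toy data: label is 1 when the first feature is positive
# (the constant 1.0 is a bias feature).
X = [(-2.0, 1.0), (-1.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
y = [0, 0, 1, 1]
w = train(X, y)
print(predict(w, (3.0, 1.0)))  # should be close to 1
```

The gradient form `(y_i - h_w(x_i)) * x_ij` is exactly what falls out of differentiating the log-likelihood of this model.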

Bayes’ theorem: P(A | B) = P(B | A) P(A) / P(B), where A and B are events and P(B) ≠ 0. P(A) and P(B) are the probabilities of observing A and B without regard to each other. P(A | B), a conditional probability, is the probability of observing event A given that B is true. P(B | A) is the probability of observing event B given that A • Read More »
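The formula is a one-liner in code. The numbers below are a hypothetical diagnostic-test example (99% sensitivity, 1% prevalence, 5% false-positive rate), chosen only to illustrate the computation:

```python
def bayes(p_b_given_a, p_a, p_b):
    """P(A | B) = P(B | A) * P(A) / P(B), requires P(B) != 0."""
    return p_b_given_a * p_a / p_b

# A = has disease, B = test is positive.
# P(B) by total probability: 0.99*0.01 + 0.05*0.99 = 0.0594
posterior = bayes(0.99, 0.01, 0.0594)
print(round(posterior, 3))  # 0.167
```

Even with a 99%-sensitive test, the posterior P(A | B) is only about 17% because the disease is rare, a classic consequence of the theorem.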

A decision tree works much like an if statement in a programming language. In the AI/ML world, the problem usually looks like this: given a training set with features [( f1, f2, …), …] and known categories/labels [c1, …], how can we learn from this training set/data and build a decision tree, so that for any new data point we can predict which • Read More »
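That if-like structure can be written out directly. The features (`outlook`, `humidity`), the threshold, and the labels below are all hypothetical, standing in for a tree a learning algorithm would induce from the training set:

```python
def predict(sample):
    """A tiny hand-written decision tree: each internal node is
    just an if on one feature, each leaf returns a label."""
    if sample["outlook"] == "sunny":
        if sample["humidity"] > 70:
            return "stay in"
        return "play"
    if sample["outlook"] == "rainy":
        return "stay in"
    return "play"  # anything else, e.g. overcast

print(predict({"outlook": "sunny", "humidity": 60}))  # play
```

Learning a tree means choosing, at each node, which feature and threshold to split on (typically by information gain or Gini impurity) instead of writing the ifs by hand.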