In this post we describe attention mechanisms, motivating them with geometric intuition. The…

In this post we describe the Nyström method for finding the eigenvalues and eigenfunctions of…

In this post we describe the use of momentum to speed up gradient descent. We…

Popular papers often have code on GitHub, but the authors are super busy writing new…

In this post we describe how to do gradient descent with constraints. We first describe…

In this post we describe the high-level idea behind gradient descent for convex optimization. Much…

An important class of machine learning models is decision trees: you can use them for…

In this post we describe several methods for visualizing time series data. Time series visualization…

In this post we describe basic visualization of missing data patterns in R with VIM…

In this post we describe stationary and non-stationary time series. We first ask why we…