slides Show Galton peas dataset as motivational example Show that weighted version removes the heteroscedasticity.
Vector Databases
Vector databases allow you to store a collection of vectors, for example embeddings of documents, and then to send queries to the database, e.g. you have a vector and you want to see which vectors i...
Model Performance
Main references: Classification metric - blog posts Model Performance Accuracy Accuracy is the most common metric for classification problems. It is the ratio of correctly predicted observati...
Multi-label classification
Multi-label classification Main references: [1] Comprehensive comparative study of multi-label classification methods Problem description A clear description of this problem is given in [1]: ...
Lasso
Lasso Main references: Statistical learning with sparsity - by Hastie, Tibshirani and Wainwright Problem statement The objective of Lasso is: [\begin{align} \min_{\beta \in \mathbb{R}^p} |y...
Factor Models
Factor Models Main references: Multivariate analysis. Probability and mathematical statistics Model description The idea of a factor model is to represent each observed variable as a linear ...
MCMC
Markov Chain Monte Carlo (MCMC) Monte Carlo Monte Carlo methods are algorithms that solve problems through random sampling. For example, stochastic integration, where integrals that are difficult ...
Reproducing Kernel Hilbert Space - RKHS
In this post I give the main results on Reproducing Kernel Hilbert Spaces (RKHS), taken from the introduction chapter of this book. For later sections I used these slides which I thought were very cl...
Linear Regression
In this post I will describe the setup of linear regression and the main results. I will be looking at the fixed design setting, which is where we assume that the covariates $x_i^0$ are fixed, and the...
Biplots
Often when we do principal component analysis we plot the principal components. Biplots are similar but in addition to the principal components we plot the low rank approximations of the variables.
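As a rough illustration of the idea, the sketch below computes the two sets of coordinates a biplot draws: the principal component scores for the observations and the low-rank coordinates for the variables. It assumes the common SVD-based construction; the function name `biplot_coordinates`, the scaling parameter `alpha`, and the random data are hypothetical choices made for this example, not taken from the post.

```python
import numpy as np

def biplot_coordinates(X, k=2, alpha=1.0):
    """Return (scores, loadings) for a rank-k biplot of X.

    Hypothetical helper: alpha controls how the singular values are
    split between observations and variables (alpha=1 gives the usual
    principal component scores).
    """
    Xc = X - X.mean(axis=0)                      # centre each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :k] * s[:k] ** alpha           # one row per observation
    loadings = Vt[:k].T * s[:k] ** (1 - alpha)   # one row per variable
    return scores, loadings

# Made-up data: 50 observations of 4 variables.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))
scores, loadings = biplot_coordinates(X, k=2)

# scores @ loadings.T is the rank-2 approximation of the centred data.
X_rank2 = scores @ loadings.T
```

Plotting `scores` as points and the rows of `loadings` as arrows on the same axes gives the biplot; the product `scores @ loadings.T` is the low-rank approximation whatever value of `alpha` is used.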