Covariance Matrix

In my first machine learning class, we had to learn about the variance-covariance matrix in order to understand the theory behind PCA (Principal Component Analysis). I was concurrently taking a basic theoretical probability and statistics course, so even the idea of variance was still vague to me. Despite repeated attempts to understand covariance, I still had trouble fully capturing the intuition behind the covariance between two random variables. Even now, applying the mathematical properties of covariance, and verifying that I have used them correctly, takes intensive googling. [Read More]
theory 

My First Post

This is the first blog post of my life! I will be using this blog to post about anything that I want to share in statistics. For starters, I will run a linear regression with the iris dataset.

    names(iris)
    ## [1] "Sepal.Length" "Sepal.Width" "Petal.Length" "Petal.Width" "Species"

Let’s predict Sepal.Length with Petal.Length and Petal.Width.

    # separate into training and testing sets
    set.seed(1234)
    train_ind <- sample(nrow(iris), floor(0.8 * nrow(iris)))
    iris_train <- iris[train_ind, ]
    iris_test <- iris[-train_ind, ]

    # run linear regression
    iris_lm <- lm(Sepal.
[Read More]
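The excerpt is cut off at the lm() call, so the rest of the fit is not shown here. As a rough sketch only, one way the regression could be completed is below. The formula is an assumption based on the sentence about predicting Sepal.Length with Petal.Length and Petal.Width, and the names pred and rmse are hypothetical, added just to check the fit on the held-out rows.

```r
# Same 80/20 split as in the post
set.seed(1234)
train_ind <- sample(nrow(iris), floor(0.8 * nrow(iris)))
iris_train <- iris[train_ind, ]
iris_test <- iris[-train_ind, ]

# Assumed formula: the post's own lm() call is truncated, so this is a guess
# based on the stated goal (Sepal.Length from Petal.Length and Petal.Width)
iris_lm <- lm(Sepal.Length ~ Petal.Length + Petal.Width, data = iris_train)

# Hypothetical evaluation step: predict on the test rows and compute RMSE
pred <- predict(iris_lm, newdata = iris_test)
rmse <- sqrt(mean((iris_test$Sepal.Length - pred)^2))
rmse
```

Calling summary(iris_lm) afterwards would show the coefficient estimates and R-squared on the training rows, while rmse gives a sense of how far off the predictions are, in centimeters, on rows the model has not seen.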