Study Group

The SML 109 Study Group is now complete. Over 13 weeks in 2017, we worked through Harvard’s CS109 Data Science Course as a group and finished by presenting 10 data science projects at an event titled “Judgement Day” at Amazon. A couple of videos from the event are available on YouTube.

SML CS109 Student Discoveries Archive

The best student & TA discoveries related to the CS109 course:

Week 11 Oct 11th:

Michael’s discovery of essential cheat sheets for machine learning engineers

Week 9 Sep 27th:

Best picks from the discussion on DS/ML/AI primer material:

Really Good: AI primer playbook

Data science for beginners series on YouTube:

TensorFlow playground (this is awesome!)


Week 8 Sep 20th:

Khalido’s excellent CS109 course notes, written as he worked through the material:

Python podcast; in particular, this episode goes into scikit-learn:


Week 7 Sep 13th:

Extra data science courses that go into a bit more depth:

Best Python pandas resources:

Week 6 Sep 6th:

Excellent presentations by our tutors Nikzad & Gordon explaining SVD & PCA:



What are kernels in machine learning and SVMs, and why do we need them?:

Week 5 Aug 30:

What is a t-test?:


Bootstrap definition:


Simplified Explanation of Linear Regression:


“The Curse of Dimensionality”:


Intuitive explanation of the relationship between PCA & SVD:
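
As a companion to the "Bootstrap definition" item in this week's list, here is a minimal sketch of a percentile bootstrap for the mean, using only the Python standard library. The function name `bootstrap_mean_ci`, its parameters, and the sample values are illustrative assumptions, not taken from the course material or the linked resources.

```python
import random
import statistics


def bootstrap_mean_ci(sample, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of `sample`.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(sample)
    means = []
    for _ in range(n_resamples):
        # Resample with replacement, same size as the original sample.
        resample = rng.choices(sample, k=n)
        means.append(statistics.fmean(resample))
    means.sort()
    # Take the empirical alpha/2 and 1 - alpha/2 quantiles of the resampled means.
    lower = means[int((alpha / 2) * n_resamples)]
    upper = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper


# Made-up measurements, purely for demonstration.
data = [2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7, 3.9, 3.1]
low, high = bootstrap_mean_ci(data)
print(f"sample mean = {statistics.fmean(data):.2f}, "
      f"95% CI = ({low:.2f}, {high:.2f})")
```

The idea is that repeatedly resampling the observed data with replacement approximates the sampling distribution of the statistic, so no formula for the standard error is needed. The percentile interval is only one of several bootstrap interval constructions.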


Week 4 Aug 23:

Python/Pandas/NumPy/Matplotlib Material:

Python Data Science Handbook:

Extra Data Visualisation Tools:

Gapminder-type plots (visualising data changing over time) with Plotly:

Extra Statistics Material:

Bootstrapping Jupyter notebook explanation by Dima Galat:

Khan Academy Stats Course:

Box & Whisker plots:

Bias, Variance & Bias-Variance trade-off:

P-Value explanation: