Study Group

The SML 109 Study Group is now complete. Over 13 weeks in 2017, we worked through Harvard’s CS109 Data Science course as a group and then presented 10 data science projects at an event titled “Judgement Day” at Amazon. A couple of videos from the event are available on YouTube.

Feel free to check this page for updates on future Study Group Sessions!



SML CS109 Student Discoveries Archive

The best student & TA discoveries related to the CS109 course:

Week 11 Oct 11th:

Michael’s discovery of essential cheat sheets for machine learning engineers

Week 9 Sep 27th:

Best picks from the discussion on DS/ML/AI primer material:

Really Good: AI primer playbook

Data science for beginners series on YouTube:

TensorFlow Playground (this is awesome!)


Week 8 Sep 20th:

Khalido’s excellent CS109 course notes, written as he worked through the material:

Python Podcast; in particular, this episode goes into sklearn:


Week 7 Sep 13th:

Extra data science courses to go a bit more in depth:

Best Python pandas resources:

Week 6 Sep 6th:

Excellent presentations by our tutors Nikzad & Gordon explaining SVD & PCA:



What are kernels in machine learning and SVMs, and why do we need them?:

Week 5 Aug 30th:

What is a t-test?:


Bootstrap definition:


Simplified Explanation of Linear Regression:


“The Curse of Dimensionality”:


Intuitive explanation of the relationship between PCA & SVD:


Week 4 Aug 23rd:

Python/Pandas/NumPy/Matplotlib Material:

Python Data Science Handbook:

Extra Data Visualisation Tools:

Gapminder-type plots (visualising data changing over time) with Plotly:

Extra Statistics Material:

Bootstrapping Jupyter notebook explanation by Dima Galat:

Khan Academy Stats Course:

Box & Whisker plots:

Box and Whisker Plots.pdf

Bias, Variance & Bias-Variance trade off:

P-Value explanation: