TOOLS
I. Python
II. R
III. MATLAB
RESOURCES BY MODULE
Module 1: Making sense of unstructured data
Instructor: Stefanie Jegelka
Case Study 1: PCA: Computing Eigenfaces
- Case Study Package: instructors.zip
Case Study 2: Spectral Clustering of News Stories
Instructor: Tamara Broderick
Case Study 1: Genetic Codes
Case Study 2: Finding Themes in Project Descriptions
Recommended Reading
Module 2: Regression and Prediction
Instructor: Victor Chernozhukov
Instructor Slides
- Regression 1.1 - Introduction
- Regression 1.2 - Linear Regression for Prediction
- Regression 1.3 - Assessment of Prediction Quality
- Regression 1.4 - Case Study: Predicting Wages
- Regression 1.5 - The Inference Problem for Linear Regression
- Regression 1.6 - Case Study: Gender Wage Gap
- Regression 1.7 - Other Types of Regression
- Regression 2.1 - Modern Linear Regression for High-Dimensional Data
- Regression 2.2 - High-Dimensional Sparse Models and Lasso
- Regression 2.3 - Inference with Lasso
- Regression 2.4 - Case Study: Do poor countries grow faster than rich countries?
- Regression 2.5 - Other Penalized Regression Methods. Cross-Validation
- Regression 3.1 - Modern Nonlinear Regression. Trees, Random Forests, and Boosted Trees
- Regression 3.2 - Neural Networks
- Regression 3.3 - Assessment of Prediction Quality. Aggregation of Predictors. Case Study
- Regression 3.4 - Inference using Modern Nonlinear Regression Methods. Case Study
- Regression 4.0 - Causality: When It Can and Cannot be Established
- R-Start-up Instructions
Case Study 1: Predicting Wages 1
- Readme document: Module 2 - Predicting Wages 1
- Case Study 1 Package: Module2_CS1_Predicting-wages-1.zip
Case Study 2: Gender Wage Gap
- Readme document: Module 2 - Gender Gap
- Case Study 2 Package: Module2_CS2_gender-wage-gap.zip
Case Study 3: Do poor countries grow faster than rich countries?
- Readme document: Module 2: Rich Countries vs. Poor Countries
- Case Study 3 Package: Module2_CS3_poor-vs-rich-countries.zip
Case Study 4: Predicting Wages 2
- Readme document: Predicting Wages 2
- Case Study 4 Package: Module2_CS1_Predicting-wages-2.zip
Case Study 5: The Effect of Gun Ownership on Homicide Rates
- Readme document: Effect of Gun Ownership
- Case Study 5 Package: Module2_CS1_homicide-rates.zip
Recommended Reading
Module 3.1: Anomaly Detection and Hypothesis Testing
Instructors: David Gamarnik & Jonathan Kelner
- Case Study: Challenger
- Challenger-data.csv
Recommended Reading
Module 3.2: Deep Learning
Instructor: Ankur Moitra
- Case Study: Decision boundary of a deep neural network
Recommended Reading
Module 4: Recommendation Systems
Instructors: Devavrat Shah & Philippe Rigollet
- Case Study 1: Recommending Movies
- Case Study 2: Recommending New Songs to Users Based on Their Listening Habits
- Case Study 3: Making New Product Recommendations
Recommended Reading
Module 5: Networks and Graphical Models
Instructors: Guy Bresler & Caroline Uhler
Case Study 1: Identifying New Genes That Cause Autism
Case Study 2.1: Kalman Filtering: Tracking the 2D Position of an Object Moving with Constant Velocity
Case Study 2.2: Kalman Filtering: Tracking the 3D Position of an Object Falling Due to Gravity
Recommended Reading
Module 6: Predictive Modeling for Temporal Data
Instructor: Kalyan Veeramachaneni
Case Study 6.1:
Case Study 6.2:
DISCLAIMER:
These optional case study tutorials require some prior knowledge of and experience with the programming language you choose for reproducing the case study results. Generally, participants with 6 months of experience using “R” or “Python” should be able to complete these exercises. MIT is not responsible for errors in these tutorials or in external, publicly available data sets, code, and implementation libraries. As these tutorials are not required (except the two peer-reviewed case studies), we do not offer formal support from the faculty or TAs. However, should you have questions or need assistance, we recommend using the Discussion Forums to pose questions and exchange ideas with your fellow participants. Please note that any links to external, publicly available websites, data sets, code, and implementation libraries are provided as a courtesy to the student. They should not be construed as an endorsement of the content or views of the linked materials.
Student-added resources
Ask your TAs to add interesting resources to this space (you can do so from the "Ask your TAs" section of the discussion forum).
MODULE 2
- TensorFlow Neural Network Simulation shown in class. FKG
- Homepage of Professor Chernozhukov, with a great collection of papers covering many of the topics in the class. FKG
- Analysis of p-value histograms. Interesting analysis of how to interpret different p-value histograms.
MODULE 3
- Inspecting Algorithms for Bias, MIT Technology Review magazine (discussion of the risk of bias and metrics for evaluating classifier systems used for criminal justice decisions)