Big Data in Education: Reading List

Week 1: Prediction

  • Baker, R., Siemens, G. (in press) Educational data mining and learning analytics. To appear in Sawyer, K. (Ed.) Cambridge Handbook of the Learning Sciences: 2nd Edition.
  • Siemens, G. (2013). Learning analytics: The emergence of a discipline. American Behavioral Scientist, 57 (10), 1380-1400.
  • Ferguson, R. (2012). Learning analytics: drivers, developments and challenges. International Journal of Technology Enhanced Learning (IJTEL), 4(5/6), 304-317.
  • Witten, I.H., Frank, E. (2011) Data Mining: Practical Machine Learning Tools and Techniques. Ch. 4.6, 6.1, 6.2, 6.4
  • Witten, I.H., Frank, E. (2011) Data Mining: Practical Machine Learning Tools and Techniques. Sections 4.6, 6.5.

Week 2: Diagnostic Metrics and Cross-Validation


Week 3: Feature Engineering and Behavior Detection

  • D'Mello, S. K., Picard, R. W., and Graesser, A. C. (2007) Towards an Affect-Sensitive AutoTutor. Special issue on Intelligent Educational Systems – IEEE Intelligent Systems, 22(4), 53-61.
  • Sao Pedro, M., Baker, R.S.J.d., Gobert, J. (2012) Improving Construct Validity Yields Better Models of Systematic Inquiry, Even with Less Information. Proceedings of the 20th International Conference on User Modeling, Adaptation and Personalization (UMAP 2012), 249-260.
  • Liu, H., Yu, L. (2005) Toward integrating feature selection algorithms for classification and clustering. IEEE Transactions on Knowledge and Data Engineering, 17 (4), 491-502.

Week 4: Knowledge Inference and Knowledge Structures

  • Corbett, A.T., Anderson, J.R. (1995) Knowledge Tracing: Modeling the Acquisition of Procedural Knowledge. User Modeling and User-Adapted Interaction, 4, 253-278.
  • Pavlik, P.I., Cen, H., Koedinger, K.R. (2009) Performance Factors Analysis -- A New Alternative to Knowledge Tracing. Proceedings of the International Conference on Artificial Intelligence and Education.
  • Baker, Frank B. (2001) The Basics of Item Response Theory. Chapters 1,2.
  • Barnes, T. (2005) The Q-matrix Method: Mining Student Response Data for Knowledge. Proceedings of the Workshop on Educational Data Mining at the Annual Meeting of the American Association for Artificial Intelligence.
  • Desmarais, M.C., Meshkinfam, P., Gagnon, M. (2006) Learned Student Models with Item to Item Knowledge Structures. User Modeling and User-Adapted Interaction, 16, 5, 403-434.

Week 5: Relationship Mining

  • Arroyo, I., Woolf, B. (2005) Inferring learning and attitudes from a Bayesian Network of log file data. Proceedings of the 12th International Conference on Artificial Intelligence in Education, 33-40.
  • Rau, M. A., & Scheines, R. (2012) Searching for Variables and Models to Investigate Mediators of Learning from Multiple Representations. Proceedings of the 5th International Conference on Educational Data Mining, 110-117.
  • Witten, I.H., Frank, E. (2011) Data Mining: Practical Machine Learning Tools and Techniques. Ch. 4.5
  • Merceron, A., Yacef, K. (2008) Interestingness Measures for Association Rules in Educational Data. Proceedings of the 1st International Conference on Educational Data Mining, 57-66.
  • Srikant, R., Agrawal, R. (1996) Mining Sequential Patterns: Generalizations and Performance Improvements. Research Report: IBM Research Division. San Jose, CA: IBM.
  • Perera, D., Kay, J., Koprinska, I., Yacef, K., Zaiane, O. (2009) Clustering and Sequential Pattern Mining of Online Collaborative Learning Data. IEEE Transactions on Knowledge and Data Engineering, 21, 759-772.
  • Haythornthwaite, C. (2001) Exploring Multiplexity: Social Network Structures in a Computer-Supported Distance Learning Class. The Information Society: An International Journal, 17 (3), 211-226

Week 6: Visualization

  • Corbett, A.T., Anderson, J.R. (1995) Knowledge Tracing: Modeling the Acquisition of Procedural Knowledge. User Modeling and User-Adapted Interaction, 4, 253-278.
  • Baker, R.S.J.d., Hershkovitz, A., Rossi, L.M., Goldstein, A.B., Gowda, S.M. (in press) Predicting Robust Learning With the Visual Form of the Moment-by-Moment Learning Curve. To appear in the Journal of the Learning Sciences.
  • Hershkovitz, A., Nachmias, R. (2008) Developing a Log-Based Motivation Measuring Tool. Proceedings of the First International Conference on Educational Data Mining, 226-233.
  • Pardos, Z. A., Heffernan, N. T. (2010) Navigating the parameter space of Bayesian Knowledge Tracing models: Visualizations of the convergence of the Expectation Maximization algorithm. Proceedings of the 3rd International Conference on Educational Data Mining.

Week 7: Clustering and Factor Analysis

  • Witten, I.H., Frank, E. (2011) Data Mining: Practical Machine Learning Tools and Techniques. Ch. 4.8, 6.6
  • Amershi, S., Conati, C. (2009) Combining Unsupervised and Supervised Classification to Build User Models for Exploratory Learning Environments. Journal of Educational Data Mining, 1 (1), 18-71.
  • Alpaydin, E. (2004) Introduction to Machine Learning. pp. 116-120.

Week 8: Discovery with Models

  • Hershkovitz, A., Baker, R.S.J.d., Gobert, J., Wixon, M., Sao Pedro, M. (2013) Discovery with Models: A Case Study on Carelessness in Computer-based Science Inquiry. American Behavioral Scientist, 57 (10), 1479-1498.
  • Aleven, V., McLaren, B., Roll, I., & Koedinger, K. (2006). Toward meta-cognitive tutoring: A model of help seeking with a Cognitive Tutor. International Journal of Artificial Intelligence in Education, 16(2), 101-128.
  • Kinnebrew, J. S., Biswas, G., & Sulcer, B. (2010). Modeling and measuring self-regulated learning in teachable agent environments. Journal of e-Learning and Knowledge Society, 7(2), 19-35.