
EDx Data Science for Construction, Architecture and Engineering

Launch 3 Syllabus 


Instructor: Dr. Clayton Miller

Assistants (alphabetical order): Ananya Joshi, Charlene Tan, Chun Fu, James Zhan, Mahmoud Abdelrahman, Matias Quintana, Miguel Martin, and Vanessa Neo.

This course introduces data science skills targeting applications in the design, construction, and operations of buildings. You will learn practical coding within this context with an emphasis on basic Python programming and the Pandas library.



The building industry is exploding with data sources that impact the energy performance of the built environment and the health and well-being of occupants. Spreadsheets just don’t cut it anymore as the sole analytics tool for professionals in this field. Mainstream data science courses might provide skills such as programming and statistics; however, they lack the applied context of buildings, which is the most important part for beginners.

This course focuses on the development of data science skills for professionals specifically in the built environment sector. It targets architects, engineers, construction and facilities managers with little or no previous programming experience. An introduction to data science skills is given in the context of the building life cycle phases. Participants will use large, open data sets from the design, construction, and operations of buildings to learn and practice data science techniques.

Essentially, this course is designed to add new tools and skills to supplement spreadsheets. Major technical topics include data loading, processing, visualization, and basic machine learning using the Python programming language, the Pandas data analytics library, the scikit-learn machine learning library, and the web-based Colaboratory environment. In addition, the course will provide numerous learning paths for various built environment-related tasks to facilitate further growth.

 The following table outlines the Sections of this course:

Grading and Evaluation Policy

For Verified Track users, this course is graded through two quizzes from each section.

To earn the Verified Certificate, a participant needs a grade of 75% or higher. There are 10 quizzes in total, and the lowest two scores are dropped from the final score. Each quiz is worth 10 points: 7 from multiple-choice questions and 3 from the exercises. You have two attempts at each multiple-choice question, with a minimum 5-minute break between attempts on the same question. The fill-in-the-blank exercise question can be attempted multiple times.
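As a rough illustration of the drop-lowest-two rule, the short Python sketch below computes a final percentage from ten quiz scores. It assumes the final grade is the share of available points earned on the eight remaining quizzes; the syllabus does not spell out the exact formula, and the function name and scores are hypothetical.

```python
def final_grade(quiz_scores, points_per_quiz=10, dropped=2):
    """Return the final percentage after dropping the lowest quiz scores.

    Assumption: the grade is the points earned on the kept quizzes
    divided by the points available on those quizzes.
    """
    if len(quiz_scores) <= dropped:
        raise ValueError("need more quizzes than the number dropped")
    # Sort ascending and discard the lowest `dropped` scores
    kept = sorted(quiz_scores)[dropped:]
    return 100.0 * sum(kept) / (points_per_quiz * len(kept))


# Hypothetical scores for the 10 quizzes (out of 10 points each)
scores = [10, 8, 7, 9, 4, 10, 6, 8, 9, 3]
grade = final_grade(scores)
passed = grade >= 75.0  # 75% threshold for the Verified Certificate
print(f"Final grade: {grade:.2f}% (passed: {passed})")
```

Here the two lowest scores (3 and 4) are dropped, so the grade is based on the remaining eight quizzes.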

ECTS Credit Equivalent

Upon completion of the Verified Track version of this course, a participant will have completed approximately 25-30 hours of online video instruction and hands-on exercise with support from myself and a teaching assistant team. This course is equivalent to 1 ECTS credit upon completion of the Verified Certificate track that includes 10 evaluation quizzes that test the understanding of the participant. 

The Verified Certificate from EDx with name and dates must be included with this syllabus to show full participation in the course.

Section 1: Introduction to Course and Python Fundamentals 

In this introduction, an overview of key Python concepts is covered as well as the motivating factors for building industry professionals to learn to code. The NZEB at the NUS School of Design and Environment is introduced as an example of a building that uses various data science-related technologies in its design, construction, and operations.

Section 2: Introduction to the Pandas Data Analytics Library and Design Phase Application Example

The foundational functions of Pandas are demonstrated in the context of the integrated design process through the processing of data from parametric EnergyPlus models. Further future learning path examples are introduced for the Design Phase including building information modelling (BIM) using Revit or Rhino, spatial analytics, and building performance modelling Python libraries.
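To give a flavour of the kind of Pandas operations this section refers to, here is a minimal sketch that builds a small table of parametric simulation results, adds a derived column, and picks the best-performing run. The column names and values are invented for illustration and are not taken from the course's EnergyPlus data.

```python
import pandas as pd

# Hypothetical results from three parametric simulation runs (values invented)
runs = pd.DataFrame(
    {
        "window_to_wall_ratio": [0.2, 0.4, 0.6],
        "cooling_kwh": [1200.0, 1500.0, 1900.0],
        "lighting_kwh": [800.0, 650.0, 600.0],
    }
)

# Derive a new column from two existing ones
runs["total_kwh"] = runs["cooling_kwh"] + runs["lighting_kwh"]

# Select the row with the lowest total energy use
best = runs.loc[runs["total_kwh"].idxmin()]

print(runs)
print("Window-to-wall ratio with lowest total:", best["window_to_wall_ratio"])
```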

Section 3: Pandas Analysis of Time-Series Data from IoT and Construction Phase Application Example

Time-series analysis Pandas functions are demonstrated in the Construction Phase through the analysis of hourly IoT data from electrical energy meters. Further future learning path examples are introduced for the Construction Phase including project management, building management system (BMS) data analysis, and digital construction such as robotic fabrication.
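A typical time-series step with hourly meter data is resampling to a coarser interval. The sketch below assumes a made-up hourly series of electricity readings (not the course's data set) and shows a daily total and an average hour-of-day profile.

```python
import pandas as pd

# 48 hourly readings over two days; the values are made up
index = pd.date_range("2021-01-01", periods=48, freq="h")
meter = pd.Series([1.0] * 24 + [2.0] * 24, index=index, name="kwh")

# Aggregate hourly readings into daily totals
daily = meter.resample("D").sum()

# Average consumption for each hour of the day across all days
profile = meter.groupby(meter.index.hour).mean()

print(daily)
print(profile.head())
```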


Section 4: Statistics and Visualization Basics and Operations Phase Application Example

Various statistical aggregations and visualization techniques using Pandas and the Seaborn library are demonstrated on Operations Phase occupant comfort data from the ASHRAE Thermal Comfort Database II. Further future learning path examples are introduced for the Operations Phase including energy auditing, IoT analysis, and occupant detection and reinforcement learning.
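As an example of a statistical aggregation, the sketch below groups a few invented comfort survey records by building type and averages each group. The column names are hypothetical and do not correspond to the actual field names in the ASHRAE Thermal Comfort Database II; plotting with Seaborn is left out to keep the example short.

```python
import pandas as pd

# Invented comfort survey records; column names are illustrative only
comfort = pd.DataFrame(
    {
        "building_type": ["office", "office", "classroom", "classroom"],
        "air_temp_c": [23.0, 25.0, 26.0, 28.0],
        "thermal_sensation": [-1.0, 0.0, 1.0, 2.0],
    }
)

# Mean air temperature and thermal sensation vote per building type
summary = comfort.groupby("building_type")[["air_temp_c", "thermal_sensation"]].mean()

print(summary)
```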


Section 5: Introduction to Machine Learning for the Built Environment

This concluding section gives an overview of the motivations and opportunities for the use of prediction in the built environment. Prediction, classification, and clustering using the scikit-learn library are demonstrated on electrical meter and occupant comfort data. The course concludes with suggestions for more in-depth Python, Data Science, and Statistics courses on EDx.