College of Education

Theory-based Computational Analysis of Classroom Audiovisual Data


This proposal was submitted in response to EHR Core Research (ECR) program announcement NSF 19-508. The ECR program of fundamental research in STEM education provides funding in critical research areas that are essential, broad and enduring. EHR seeks proposals that will help synthesize, build and/or expand research foundations in the following focal areas: STEM learning, STEM learning environments, STEM workforce development, and broadening participation in STEM. The ECR program is distinguished by its emphasis on the accumulation of robust evidence to inform efforts to (a) understand, (b) build theory to explain, and (c) suggest interventions (and innovations) to address persistent challenges in STEM interest, education, learning, and participation.


This EHR Core Research project is conducting methodological research on the computational analysis of video data focused on the social and spatial dimensions of STEM learning in classrooms. Video data are complex. They involve visual, acoustic, spatial, and temporal features that can be reduced in several ways. To date, analysis of video data of STEM classrooms has not been able to leverage computational power to take advantage of their richness. However, recent advancements in data science, coupled with existing speech analytics methods, make it possible to computationally identify important features from video in ways that preserve complexity and nuance. These advancements will improve research replicability. The methods developed through this project will facilitate use of sophisticated computational analysis with video data by more researchers. Application of these new methods will help increase the scale and generalizability of video research and lead to the building of new theory.


This research project builds on state-of-the-art computer vision and speech analytics methods tested on video data collected in STEM classrooms. It does so within a computational grounded theory methodological framework, which combines the interpretive power of grounded analytical approaches with the processing power of computational methods. Specifically, two types of computational analysis procedures will be produced: (a) extracting meaningful features from video and audio data of STEM classrooms, and (b) conducting exploratory pattern identification using these extracted features. To develop these procedures, existing large-scale video datasets of STEM classrooms will be used to test and refine increasingly sophisticated analyses. These analyses will also demonstrate how the methods can be applied to investigate the social and spatial dimensions of STEM classrooms. The project focuses on integrating these methods to improve their power, and because it draws on existing large-scale datasets, the methods can be tested on realistic data. The datasets are extensive enough to support the investigation of a wide range of research questions, including high-inference questions about students' participation in disciplinary practices. Finally, by pairing computational and grounded analytical methods, the project is developing methods with the potential to enhance and test the construct validity of the patterns found in the data.
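
As a rough illustration of the two procedure types named above, the sketch below first reduces toy audio and video data to per-window features and then groups the windows into two exploratory clusters. It is not the project's method: the feature choices (audio RMS energy and mean frame-to-frame pixel change), the function names, the toy data, and the simple two-cluster k-means step are all assumptions made for this example, and real analyses would rely on far richer computer vision and speech features.

```python
import math

def rms(samples):
    """Root-mean-square energy of a list of audio samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

def motion(frames):
    """Mean absolute pixel change between consecutive frames."""
    diffs = [abs(a - b)
             for prev, cur in zip(frames, frames[1:])
             for a, b in zip(prev, cur)]
    return sum(diffs) / len(diffs) if diffs else 0.0

def extract_features(audio, frames, n_windows):
    """Step (a): one (audio energy, motion) pair per time window."""
    a_step = len(audio) // n_windows
    f_step = len(frames) // n_windows
    return [(rms(audio[i * a_step:(i + 1) * a_step]),
             motion(frames[i * f_step:(i + 1) * f_step]))
            for i in range(n_windows)]

def two_clusters(points, iters=10):
    """Step (b): a tiny 2-means clustering with a deterministic start.

    Centers start at the first and last points (an assumption that keeps
    the example reproducible); features are used unscaled.
    """
    centers = [points[0], points[-1]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each window to its nearest center.
        labels = [min((0, 1), key=lambda c: math.dist(p, centers[c]))
                  for p in points]
        # Move each center to the mean of its members, if it has any.
        for c in (0, 1):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels

# Hypothetical toy data: two quiet, still windows, then two loud, active ones.
audio = [0.01, -0.01, 0.02, -0.02, 0.5, -0.6, 0.7, -0.5]
frames = [[0.1, 0.1]] * 4 + [[0.1, 0.1], [0.9, 0.2], [0.2, 0.8], [0.9, 0.1]]

features = extract_features(audio, frames, n_windows=4)
labels = two_clusters(features)
print(labels)
```

In a computational grounded theory workflow, clusters like these would be a starting point for interpretation: a researcher would return to the underlying video segments to judge whether the grouping reflects a meaningful classroom pattern.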