Analyzing learning with speech analytics and computer vision methods: Technologies, principles, and ethics
Elizabeth Dyer, Cynthia D’Angelo, Nigel Bosch, Christina (Stina) Krist, Joshua Rosenberg
This half-day workshop focuses on video and audio data collection methods that allow researchers to effectively use emerging computer-focused analytical methods (e.g., speech analytics and computer vision techniques) in combination with human-focused analysis (e.g., qualitative analysis). Video and audio recordings are an increasingly common data source for examining the complexities and nuances of learning in situ. To date, analyses of video and audio data of learning have been unable to fully leverage computational methods that take advantage of this richness, especially its visuospatial and acoustic features (as opposed to textual extractions such as transcripts).
Recent advances in computer vision, coupled with existing speech analytics methods, make it feasible to identify features of video that are theoretically and practically important for examining learning. Additionally, these computational methods for video and audio data are likely to be most powerful when integrated with human-conducted analysis and decision-making, as in the computational grounded theory methodological framework. However, these new computational methods require different technical specifications for video and audio data than human-focused analysis does, many of which must be decided and set before recording occurs.
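As one concrete illustration of such specifications, the following is a minimal sketch (not part of the workshop materials) of checking whether a classroom recording meets hypothetical minimum requirements for computational analysis, using Python with OpenCV and the soundfile library. The file names and threshold values are illustrative assumptions only.

# A minimal, illustrative check of recording specifications (hypothetical thresholds).
import cv2
import soundfile as sf

MIN_FPS = 25             # assumed minimum frame rate for movement/gesture analysis
MIN_HEIGHT = 720         # assumed minimum vertical resolution for computer vision
MIN_SAMPLE_RATE = 16000  # assumed minimum audio sample rate for speech analytics

def check_video(path):
    cap = cv2.VideoCapture(path)
    fps = cap.get(cv2.CAP_PROP_FPS)
    width = cap.get(cv2.CAP_PROP_FRAME_WIDTH)
    height = cap.get(cv2.CAP_PROP_FRAME_HEIGHT)
    cap.release()
    return {"fps": fps, "resolution": (int(width), int(height)),
            "fps_ok": fps >= MIN_FPS, "resolution_ok": height >= MIN_HEIGHT}

def check_audio(path):
    info = sf.info(path)
    return {"sample_rate": info.samplerate, "channels": info.channels,
            "sample_rate_ok": info.samplerate >= MIN_SAMPLE_RATE}

print(check_video("classroom_camera1.mp4"))  # hypothetical file name
print(check_audio("classroom_mic1.wav"))     # hypothetical file name

Running checks like these on pilot recordings can surface specification problems before a full round of data collection begins, which is one reason these decisions need to be made before recording occurs.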
In this workshop, we will share new principles for collecting audio and video data so that they can be used with innovative computational methods. Specifically, participants will:
Would you like to join our ICLS workshop? Please register via the ICLS website.
Advancing Computational Grounded Theory for Audiovisual Data from Mathematics Classrooms
Cynthia D’Angelo, Elizabeth Dyer, Christina Krist, Josh Rosenberg, & Nigel Bosch
This poster will discuss early findings from a project that is developing theory-based approaches for combining computational methods and qualitative grounded theory to analyze video data from middle school mathematics classrooms. These early findings concern the feasibility of using out-of-the-box implementations of video and audio processing algorithms, focusing on methods to capture instances of collaboration and student–teacher interactions.
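As a hedged example of what such an out-of-the-box approach might look like (a stand-in sketch, not the project's actual pipeline), OpenCV's built-in HOG person detector can be run over sampled frames of classroom video to locate people, a rough first step toward reasoning about proximity and interaction. The video path and sampling interval below are hypothetical.

# Sketch only: out-of-the-box person detection on sampled classroom video frames.
import cv2

hog = cv2.HOGDescriptor()
hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

def detect_people(video_path, every_n_frames=30):
    """Return (frame_index, bounding_boxes) for every Nth frame."""
    cap = cv2.VideoCapture(video_path)
    results = []
    frame_idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        if frame_idx % every_n_frames == 0:
            boxes, _weights = hog.detectMultiScale(frame, winStride=(8, 8))
            results.append((frame_idx, boxes))
        frame_idx += 1
    cap.release()
    return results

detections = detect_people("classroom_camera1.mp4")  # hypothetical file name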
Winesberry, R., Meadows, M., & Dyer, E. B. (2020, January). An observational analysis of participation structure within a mathematics classroom. Poster presented at the 14th Annual Tennessee STEM Education Research Conference, Cookeville, TN.
Mathematics teaching that provides students with opportunities to make sense of mathematical ideas through problem solving requires more active participation structures for students. In this study, we examine the participation structure in one teacher's classroom at different points during a single school year.
Dyer, E. B., Rosenberg, J. M., Bosch, N., Krist, C., & D'Angelo, C. (2020, September). Better together? Initial findings and implications from combining qualitative coding and computational methods to analyze classroom audiovisual data. Presentation at the AERA Satellite Conference on Educational Data Science, Stanford, CA.
Studies of classroom teaching and learning increasingly rely on video data. Conducting analyses that attend to the auditory and visual dimensions of these data sources poses several research challenges, especially at scale. This poster presents early findings from an ongoing National Science Foundation-funded research project exploring new methodologies for analyzing such data sources in ways that are rigorous, reproducible, meaningful, interpretable, and scalable. The project uses a computational grounded theory (CGT; Nelson, 2017) methodological approach, combining machine learning and grounded theory to analyze classroom video data of middle school mathematics classrooms. Our dataset includes 106 videotaped mathematics lessons from 10 teachers of grades 8–12 across two school years. CGT involves three steps: exploratory pattern detection, interpretation and refinement of themes, and thematic pattern confirmation. For step 1, we will use audio and video processing software (e.g., Praat, OpenPose) to extract information from the audiovisual data. We will interpret the resulting patterns in light of theoretically derived questions about teaching and learning mathematics (step 2) and confirm them using supervised machine learning methods (step 3). This poster will report early findings concerning the feasibility of using out-of-the-box software to address questions about collaboration and student–teacher interactions. We will also highlight implications for the application of CGT by other educational researchers. This project and early-stage work also have implications for the burgeoning field of data science education, particularly regarding how quantitative and computational methods may be used to answer, in novel and more efficient ways, the types of questions that are typically addressed using qualitative methods.
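To make the CGT steps more concrete, the sketch below illustrates what the first step (exploratory feature extraction) and third step (supervised confirmation) could look like in Python. It uses librosa as a stand-in for Praat-style acoustic feature extraction and scikit-learn for the supervised check; the file name, window size, features, and placeholder codes are illustrative assumptions rather than the project's actual pipeline.

# Illustrative sketch of CGT steps 1 and 3 under stated assumptions.
import numpy as np
import librosa
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

def window_features(audio_path, window_s=10.0):
    """Step 1 (exploratory pattern detection): simple per-window acoustic features."""
    y, sr = librosa.load(audio_path, sr=None, mono=True)
    hop = int(window_s * sr)
    feats = []
    for start in range(0, len(y) - hop, hop):
        w = y[start:start + hop]
        rms = librosa.feature.rms(y=w).mean()                    # loudness proxy
        zcr = librosa.feature.zero_crossing_rate(w).mean()       # rough voicing proxy
        centroid = librosa.feature.spectral_centroid(y=w, sr=sr).mean()
        feats.append([rms, zcr, centroid])
    return np.array(feats)

# Step 3 (thematic pattern confirmation): predict human-assigned codes from features.
X = window_features("lesson_audio.wav")          # hypothetical recording
y_codes = np.random.randint(0, 2, size=len(X))   # placeholder; in practice, researcher-assigned qualitative codes
clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y_codes, cv=5).mean())

In the actual project, the extracted features would be interpreted against theory in step 2 before any supervised confirmation, and the codes would come from qualitative analysis rather than the placeholder labels shown here.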