About Me
Greetings!
I’m Khang Dang, currently working my way through a Ph.D. in Information Systems at the New Jersey Institute of Technology (NJIT). My academic journey started with a solid grounding in Civil Engineering, where I earned a combined Master’s and Bachelor’s degree from Vietnam National University and the National Institute of Applied Sciences of Lyon. My early work focused on making buildings energy-efficient and comfortable for the people in them, which laid the foundation for the human-centered approach I take today.
After that, I spent some time at Clemson University, working on a project funded by the National Science Foundation (NSF) and the National Institute for Occupational Safety and Health (NIOSH). There, I developed an audio-based deep-learning model to detect collision hazards on construction sites. The goal was to keep workers safe with a model that could be integrated into wearable hearing protection, letting critical sounds come through while blocking out unwanted noise.
Fast forward to now: my research centers on making technology more accessible for everyone. Whether it's developing virtual reality experiences for blind and low-vision users or enhancing video accessibility for the d/Deaf and hard-of-hearing communities, I'm all about using tech to improve lives. My work is grounded in a human-centered approach, always involving the very people who will benefit from these innovations.
I’m passionate about pushing the boundaries of what’s possible with AI and XR, with the goal of creating solutions that are as inclusive as they are cutting-edge. It’s all about ensuring that everyone, regardless of their abilities, can enjoy the full benefits of today’s technology.
Publications
- Virtual reality (VR), inherently reliant on spatial interaction, poses significant accessibility barriers for individuals who are blind or have low vision (BLV). Traditional audio descriptions (AD) typically provide a verbal explanation of visual elements in 2D or flat video media, facilitating access for BLV audiences but failing to convey the complex spatial information essential in VR. This shortfall is especially pronounced in musical performances, where understanding the spatial arrangement of the stage setup and movements of performers is crucial. To overcome these limitations, we have developed two AD approaches—Spatial AD for a dance performance and View-dependent AD for an instrumental performance—within VR-based 360° environments. Spatial AD employs spatial audio technology to align descriptions with corresponding visuals, dynamically adjusting to follow the visuals, such as the movements of performers in the dance performance. Meanwhile, View-dependent AD adapts descriptions based on the orientation of the VR headset, activating when particular visuals enter the central view of the camera, ensuring that the description aligns with where the user's attention is directed within the VR environment. These methods are designed as enhancements to traditional AD, aiming to improve spatial orientation and immersive experiences for BLV audiences. This demonstration showcases the potential of these AD approaches to improve interaction and engagement, furthering the development of inclusive virtual environments.
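For the curious, here is a minimal, hypothetical sketch of the view-dependent idea in Python: a description is triggered only when its target enters the viewer's central field of view. The vector math, the 30° threshold, and all names are illustrative assumptions, not the code behind the actual demo.

```python
import numpy as np

# Hypothetical sketch of a view-dependent trigger: play a description only
# when its target falls within the central region of the viewer's field of view.
CENTRAL_FOV_DEG = 30.0  # assumed half-angle for the "central view"

def should_play_description(head_position, head_forward, target_position,
                            central_fov_deg=CENTRAL_FOV_DEG):
    """Return True if the target lies within the central viewing cone."""
    to_target = np.asarray(target_position, dtype=float) - np.asarray(head_position, dtype=float)
    to_target /= np.linalg.norm(to_target)
    forward = np.asarray(head_forward, dtype=float)
    forward /= np.linalg.norm(forward)
    # Angle between where the viewer is looking and the direction to the target.
    angle_deg = np.degrees(np.arccos(np.clip(np.dot(forward, to_target), -1.0, 1.0)))
    return angle_deg <= central_fov_deg

# Example: a performer slightly to the right of where the viewer is looking.
print(should_play_description(head_position=[0, 1.6, 0],
                              head_forward=[0, 0, 1],
                              target_position=[0.5, 1.6, 3.0]))  # True (~9.5 degrees off-axis)
```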
- Our research focuses on making the musical performance experience in virtual reality (VR) settings non-visually accessible to Blind and Low Vision (BLV) individuals by designing a conceptual framework for omnidirectional audio descriptions (AD). We address BLV users' prevalent challenges in accessing effective AD during VR musical performances. Employing a two-phase interview methodology, we first collected qualitative data about BLV AD users' experiences and then gathered insights from BLV professionals who specialize in AD. This approach ensures that the developed solutions are both user-centric and practically feasible. The study devises strategies for three design concepts of omnidirectional AD (Spatial AD, View-dependent AD, and Explorative AD) tailored to different types of musical performances, which vary in their visual and auditory components. Each design concept offers unique benefits; collectively, they enhance accessibility and enjoyment for BLV audiences by addressing specific user needs. Key insights highlight the crucial role of flexibility and user control in AD implementation. Based on these insights, we propose a comprehensive conceptual framework to enhance musical experiences for BLV users within VR environments.
- High-quality closed captioning of both speech and non-speech elements (e.g., music, sound effects, manner of speaking, and speaker identification) is essential for the accessibility of video content, especially for d/Deaf and hard-of-hearing individuals. While many regions have regulations mandating captioning for television and movies, a regulatory gap remains for the vast amount of web-based video content, including the staggering 500+ hours uploaded to YouTube every minute. Advances in automatic speech recognition have bolstered the presence of captions on YouTube. However, the technology has notable limitations, including the omission of many non-speech elements, which are often crucial for understanding content narratives. This paper examines the contemporary and historical state of non-speech information (NSI) captioning on YouTube through the creation and exploratory analysis of a dataset of over 715k videos. We identify factors that influence NSI caption practices and suggest avenues for future research to enhance the accessibility of online video content.
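As a rough illustration of what detecting non-speech information can look like, here is a small Python sketch that pulls bracketed or parenthesized cues (e.g., [Applause], (upbeat music)) out of caption text. The regex and sample captions are my own simplified assumptions, not the dataset pipeline used in the paper.

```python
import re

# Illustrative sketch (not the paper's actual pipeline): extract bracketed or
# parenthesized cues such as [Music], (applause), or [laughs] from caption text.
NSI_PATTERN = re.compile(r"[\[\(]([^\]\)]+)[\]\)]")

def extract_nsi_cues(caption_text):
    """Return the non-speech cues found in a block of caption text."""
    return [cue.strip().lower() for cue in NSI_PATTERN.findall(caption_text)]

sample_captions = """
[Applause]
So today we're going to talk about accessibility.
(upbeat music)
>> SPEAKER 2: Thanks for having me. [laughs]
"""

print(extract_nsi_cues(sample_captions))
# ['applause', 'upbeat music', 'laughs']
```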
- Blind and low-vision (BLV) individuals often face challenges in attending and appreciating musical performances (e.g., concerts, musicals, opera) due to limited mobility and limited access to visual information. However, the emergence of Virtual Reality (VR) based musical performance as a common medium of music access opens up opportunities to mitigate these challenges and enhance musical experiences by investigating non-visual VR accessibility. This study aims to 1) gain an in-depth understanding of the experiences of BLV individuals, including their preferences, challenges, and needs in listening to and accessing various modes (audio, video, and on-site experiences) of music and musical performances, and 2) explore the opportunities that VR can create for making immersive musical experiences accessible to BLV people. Using a mixed-methods approach, we conducted an online survey and a semi-structured interview study with 102 and 25 BLV participants, respectively. Our findings suggest design opportunities for making the VR space non-visually accessible for BLV individuals, enabling them to participate equally in the VR world and to further access immersive musical performances created by VR technology. Our research contributes to the growing body of knowledge on accessibility in virtual environments, particularly in the context of music listening and appreciation for BLV individuals.
- Safety-critical sounds at job sites play an essential role in construction safety, but workers' hearing capability often declines due to the use of hearing protection and the complicated nature of construction noise. Thus, preserving or augmenting the auditory situational awareness (ASA) of construction workers has become a critical need. To enable further advances in this area, it is necessary to synthesize state-of-the-art auditory signal processing techniques and their implications for ASA and to identify future research needs. This paper presents a critical review of recent publications on acoustic signal processing techniques and identifies research gaps that merit further investigation to fully support construction workers' ASA of hazardous situations. The results of the content analysis show that research on ASA in the context of construction safety is still at an early stage, with few adequate AI-based sound-sensing methods available. Little research has been undertaken to help individual construction workers recognize important signals that may be blocked or mixed with complex ambient noise. Further research on ASA technology is needed to support detecting and separating important acoustic safety cues from complex ambient sounds. More work is also needed to incorporate contextual information into sound-based hazard detection and to investigate the human factors affecting collaboration between workers and AI assistants in sensing the safety cues of hazards.
- Collisions between workers and operating vehicles are the leading source of fatal incidents in the construction industry. One of the most prevalent factors causing contact hazards is the decline in construction workers' auditory situational awareness due to hearing loss and the complicated nature of construction noise. Thus, a computational technique that can augment a worker's audible sense could significantly improve safety performance. Since construction machines often generate distinct sound patterns while operating on construction sites, audio signal processing could be an innovative way to achieve this goal. Unfortunately, the current body of knowledge on automated surveillance in construction still lacks such advanced methods. This paper presents a newly developed auditory surveillance framework using convolutional neural networks (CNNs) that detects collision hazards by processing acoustic signals on construction sites. The study makes two primary contributions: (1) a new labeled dataset of normal and abnormal sound events related to collision hazards on construction sites, and (2) a novel audio-based machine learning model for automated detection of collision hazards. The model was trained with different network architectures, and its performance was evaluated using various measures, including accuracy, recall, precision, and the combined F-measure. The research is expected to help increase the auditory situational awareness of construction workers and consequently enhance construction safety.
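To give a flavor of the kind of model described above, here is a minimal PyTorch sketch of a small CNN that classifies audio clips, represented as log-mel spectrograms, as normal or abnormal (hazard-related) sound events. The architecture, input shape, and class count are illustrative assumptions rather than the exact network reported in the paper.

```python
import torch
import torch.nn as nn

# Hypothetical sketch: a small CNN over log-mel spectrograms for
# "normal" vs. "abnormal" (collision-hazard-related) sound events.
class HazardSoundCNN(nn.Module):
    def __init__(self, n_classes=2):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # pool to a fixed-size embedding
        )
        self.classifier = nn.Linear(64, n_classes)

    def forward(self, x):  # x: (batch, 1, n_mels, n_frames)
        return self.classifier(self.features(x).flatten(1))

# Example: a batch of 4 clips, each a 64 mel-band x 128 frame spectrogram.
model = HazardSoundCNN()
dummy_batch = torch.randn(4, 1, 64, 128)
logits = model(dummy_batch)
print(logits.shape)  # torch.Size([4, 2])
```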
Peer Review Contributions
I enjoy being part of the peer review process. It’s been a rewarding experience to support the academic community and stay connected with the latest research. Here’s a snapshot of where I’ve lent a hand:
Annual Conference of the Cognitive Science Society (CogSci) – 5 reviews
ACM Symposium on Virtual Reality Software and Technology (VRST) – 4 reviews
ACM Conference on Human Factors in Computing Systems (CHI) – 3 reviews
ACM Conference on Interactive Media Experiences (IMX) – 3 reviews
ACM Conference on Designing Interactive Systems (DIS) – 2 reviews
ACM SIGCHI Interaction Design and Children Conference (IDC) – 2 reviews
ACM Symposium on Spatial User Interaction (SUI) – 2 reviews
IEEE International Symposium on Mixed and Augmented Reality (ISMAR) – 2 reviews
ACM Conference on Conversational User Interfaces (CUI) – 1 review
ACM Conference on Mobile Human-Computer Interaction (MobileHCI) – 1 review
In addition to reviewing, I’ve also served as an Area Chair:
ACM Conference on Conversational User Interfaces (CUI) – 5 records