Imagine if doctors could one day diagnose throat cancer, Alzheimer’s disease, depression or any other illness based on a patient’s voice. To make this a reality, Washington University School of Medicine in St. Louis is joining the National Institutes of Health (NIH) Bridge2AI program, an estimated $130 million initiative aimed at transforming the use of artificial intelligence (AI) in biomedical and behavioral science.
One of the first projects involves building a database of diverse human voices and using AI and machine-learning tools to train computers to recognize diseases based on characteristics of the human voice. This effort – dubbed Voice as a Biomarker of Health – will bring together researchers from 12 institutions across North America, including Washington University, to build an ethically sourced database that protects patient privacy.
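The idea of extracting acoustic characteristics from a recording and training a model to associate them with a diagnosis can be sketched in miniature. The following is a hypothetical toy illustration, not the project's actual methodology: the two features (signal energy and zero-crossing rate) and the nearest-centroid classifier are simple stand-ins for the far richer acoustic features and models a clinical system would use, and the "voices" here are synthetic waveforms.

```python
import numpy as np

def extract_features(waveform):
    """Toy acoustic features: signal energy and zero-crossing rate.
    (Real systems use richer features such as jitter, shimmer, and MFCCs.)"""
    energy = float(np.mean(waveform ** 2))
    zcr = float(np.mean(np.abs(np.diff(np.sign(waveform))) > 0))
    return np.array([energy, zcr])

def train_centroids(features, labels):
    """Nearest-centroid 'model': average feature vector per label."""
    return {lbl: np.mean([f for f, l in zip(features, labels) if l == lbl], axis=0)
            for lbl in set(labels)}

def predict(model, feats):
    """Assign the label whose centroid is closest to the feature vector."""
    return min(model, key=lambda lbl: np.linalg.norm(model[lbl] - feats))

# Synthetic training "voices": a clean 120 Hz tone vs. a much noisier signal
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 16000)
healthy = [np.sin(2 * np.pi * 120 * t) + 0.01 * rng.standard_normal(t.size) for _ in range(5)]
disordered = [np.sin(2 * np.pi * 120 * t) + 0.5 * rng.standard_normal(t.size) for _ in range(5)]

X = [extract_features(w) for w in healthy + disordered]
y = ["healthy"] * 5 + ["disordered"] * 5
model = train_centroids(X, y)

# Classify a new noisy recording the model has never seen
test_wave = np.sin(2 * np.pi * 120 * t) + 0.45 * rng.standard_normal(t.size)
pred = predict(model, extract_features(test_wave))
print(pred)
```

The noisier signal has higher energy and more zero crossings, so its feature vector lands nearer the "disordered" centroid; the hard part the project tackles is doing this reliably, at scale, on real clinical data.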
“There is evidence that well-designed computer models can predict who has dementia or cancer based on, for example, voice recordings, which would then complement additional diagnostic methods,” said Philip R.O. Payne, PhD, the Janet and Bernard Becker Professor, chief data scientist and director of the Institute for Informatics. “We will also spearhead new efforts in training and workforce development in the field of AI and its applications in biomedicine. As part of that, this project will help define a whole new way of creating this type of complex dataset and sharing it — in an ethical way that protects privacy — with a wide range of scientists.”
Payne leads the project at Washington University and works with researchers across North America, including at the University of South Florida in Tampa and Weill Cornell Medicine in New York, who are leading the overall project.
In addition to building this unique data set, Washington University will co-lead a skills and workforce development core for the national project. The core, co-led with Oregon Health & Science University, will focus on training investigators — including scientists from academia, industry, government, and even citizen scientists — from across the country to access the voice data and use it for research purposes. According to Payne, each researcher who wants to learn how to use the dataset will be given a customized training plan, with much of the learning delivered in a virtual format and then supported by face-to-face mentoring.
“Often citizen scientists, or people we would include in this category, are patients who themselves have the disease, or specialists in private practice who help patients with specific conditions — for example, people who stutter and the speech therapists who work with them,” Payne said. “We are developing outreach efforts to connect with people in the community to participate in this research and also to help us gather a rich and diverse dataset of human voices. This is critical to building an ethical and representative dataset that minimizes potential bias.”
Based on the existing literature and ongoing research, the research team identified five disease categories where voice changes have been associated with disease and where early detection urgently needs to be improved. The data collected for this project will focus on the following disease categories:
- Voice disorders (laryngeal cancer, vocal fold paralysis, benign laryngeal lesions).
- Neurological and neurodegenerative diseases (Alzheimer’s disease, Parkinson’s disease, stroke, amyotrophic lateral sclerosis).
- Mood and psychiatric disorders (depression, schizophrenia, bipolar disorder).
- Respiratory disorders (pneumonia, chronic obstructive pulmonary disease, heart failure).
- Pediatric voice and language disorders (speech and language delays, autism).
Although preliminary work with voice data has been promising, limitations in integrating voice as a biomarker into clinical practice have been linked to small data sets, ethical concerns around data ownership and privacy, and bias and lack of diversity in the data. To address these limitations, the Voice as a Biomarker of Health project is creating a large, high-quality, multi-institutional, and diverse voice database linked to identity-protected biomarkers from other data such as demographics, medical imaging, and genomics. Federated learning, a novel AI framework that allows machine learning models to be trained on data without the data ever leaving its source, will be used across the participating research centers, with French-American AI biotech startup Owkin demonstrating that this cross-center AI research can be done while respecting the privacy and security of sensitive voice data.
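Federated averaging, the core idea behind frameworks of this kind, can be illustrated with a small numerical sketch. This is a generic toy simulation, not Owkin's software or the project's actual setup: three simulated "sites" each fit a simple linear model on data that never leaves them, and only the resulting model weights are averaged by a central server.

```python
import numpy as np

def local_step(w, X, y, lr=0.1, epochs=50):
    """One site refines the global model on its private data.
    Only the updated weights leave the site, never the raw data."""
    w = w.copy()
    for _ in range(epochs):
        grad = 2 * X.T @ (X @ w - y) / len(y)  # gradient of mean squared error
        w -= lr * grad
    return w

rng = np.random.default_rng(1)
true_w = np.array([2.0, -1.0])  # the pattern all sites' data share

# Each simulated hospital holds its own private dataset
sites = []
for _ in range(3):
    X = rng.standard_normal((100, 2))
    y = X @ true_w + 0.01 * rng.standard_normal(100)
    sites.append((X, y))

w = np.zeros(2)  # global model, shared with every site
for _ in range(10):  # federated averaging rounds
    updates = [local_step(w, X, y) for X, y in sites]
    w = np.mean(updates, axis=0)  # server averages weights only

print(np.round(w, 2))  # recovers the shared pattern without pooling raw data
```

The server never sees any site's recordings or labels, only weight vectors, which is what makes the approach attractive for sensitive clinical data (though real deployments add further protections, since weights themselves can leak information).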
“Using speech to diagnose disease becomes particularly interesting when you think about the proliferation of virtual care and telemedicine during the pandemic,” Payne said. “Doctors have become more accustomed to seeing people from a distance, even though they cannot physically examine patients. But what if there were an AI algorithm during a virtual visit that could use the patient’s voice to recognize high blood pressure, for example? We’re not there yet, but this could potentially make future telemedicine more useful, with higher quality, better safety, and improved health outcomes, especially for people living far from healthcare providers.”
Supported by AI experts, bioethicists and social scientists, the project aims to transform the fundamental understanding of disease and bring a new method of diagnosing and treating illness into the clinical setting. Because human voice recordings are inexpensive, easy to store, and readily available, diagnosing diseases by voice using AI could be a transformative step in precision medicine and healthcare accessibility, Payne added.