DOI: 10.1055/a-2089-5758
Artificial intelligence in acoustic signals for the determination of voice quality
Abstract
Objective measurement and quantification of voice quality is a problem that has always preoccupied voice research. Recently, artificial intelligence (AI) methods have been increasingly used for this purpose. This article discusses the difficulties of determining voice quality and shows which problems can occur with objective and subjective measurement methods. As a more recent approach, the use of AI and its possibilities and limitations are discussed.
- AI is a powerful tool. However, it is limited by the data it works with and by the lack of clarity in how the problem is defined.
- The unambiguous, well-defined determination of voice quality is a problem that even AI cannot solve completely and unambiguously.
Publication History
Article published online:
05 September 2023
© 2023. Thieme. All rights reserved.
Georg Thieme Verlag KG
Rüdigerstraße 14, 70469 Stuttgart, Germany