Gurugubelli Krishna received his doctorate in Electronics and Communication Engineering (ECE). His research work was supervised by Dr. Anil Kumar V. Here’s a summary of Gurugubelli Krishna’s thesis, Extraction of Information from Speech for the Detection and Assessment of Dysarthria, as explained by him:
Dysarthria is a motor speech disorder resulting from weakness in neuromuscular execution during speech production, caused by conditions such as brain tumors, brain injury, stroke, cerebral palsy and facial paralysis. The abnormalities in resonance, articulation, respiration, phonation, and prosody associated with dysarthria lead to poor speech intelligibility. The symptoms of poor speech quality and reduced intelligibility can be used to identify dysarthria. The degree to which a listener understands a dysarthric individual’s speech is referred to as the speech intelligibility of that speaker. Dysarthric speech detection and intelligibility assessment are very important steps in the clinical diagnosis of dysarthria. Subjective intelligibility assessment methods may be influenced by the listener’s familiarity with the patient, by contextual and suprasegmental factors, and by semantic/syntactic features. Moreover, subjective methods are costly and time-consuming. Objective intelligibility assessment methods, on the other hand, are economical, repeatable, and reliable, and can assist in remote monitoring of patient rehabilitation. There is growing evidence that clinicians are becoming more receptive to objective intelligibility assessment systems, in which speech intelligibility is assessed by a trained acoustic model.
An objective assessment method such as an acoustic-to-articulatory representation of dysarthric speech can effectively uncover dysarthria-specific features that are useful for speech assessment. Additionally, detecting articulatory changes from speech is useful for the assessment of dysarthria. Hence, the extraction of features from the speech signal is considered one of the most important steps in developing clinical tools for the automatic detection and assessment of dysarthria. In the literature, few attempts have been made at automatically detecting dysarthria, assessing the intelligibility of dysarthric speech (which is associated with the severity of dysarthria), and detecting the type of dysarthria. To characterize the irregular changes in articulatory events that occur due to abnormalities in speech production, an instantaneous spectral representation of speech is important. Additionally, including excitation source features improves the performance of dysarthria assessment systems. The main objectives of this thesis are dysarthric speech detection (DSD) and dysarthric speech intelligibility assessment (DSIA).
The issues addressed in this thesis are summarized as follows:
- Acoustic analysis of rhotic approximants has been done, and the relation between duration of rhotic approximant sounds and dysarthria severity has been investigated.
- Based on the idea of the single frequency filtering technique, this thesis proposes the single frequency filter bank (SFFB) to estimate the instantaneous changes of the vocal tract system during speech production. Furthermore, perceptual-enhanced single frequency cepstral coefficients (PE-SFCC) have been proposed for DSD and DSIA.
- This thesis investigated the importance of analytic phase information in the perception of speech intelligibility. Further, the importance of group-delay features and analytic phase features for the automatic detection and assessment of dysarthria has been investigated.
- The knowledge of epoch locations in continuous speech has been investigated for the detection and assessment of dysarthric speech. A zero-phase zero frequency resonator (ZP-ZFR), which provides a stable implementation of zero frequency filtering (ZP-ZFF), has been proposed for epoch extraction from continuous speech.
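To make the single frequency filtering idea in the list above more concrete, here is a minimal sketch. It assumes the commonly described formulation: the signal is differenced, the frequency of interest is shifted to half the sampling rate, and a single-pole filter with its pole at z = -r is applied. It is not the SFFB or PE-SFCC proposed in the thesis. The function names `sff_envelope` and `sff_bank`, the pole radius `r = 0.995`, and the test tone are illustrative assumptions. Only the Python standard library is used, to keep the example self-contained.

```python
import cmath
import math


def sff_envelope(signal, freq_hz, fs_hz, r=0.995):
    """Per-sample envelope of `signal` at a single frequency.

    Assumed formulation (illustrative, not taken from the thesis):
    difference the signal, shift freq_hz to fs/2, then filter with
    a single pole at z = -r.
    """
    # Shifted normalized frequency: the component at freq_hz lands on pi.
    w_shift = math.pi - 2.0 * math.pi * freq_hz / fs_hz
    envelope = []
    s_prev = 0.0
    y_prev = 0j
    for n, s in enumerate(signal):
        x = s - s_prev                     # first difference of the signal
        s_prev = s
        x_shifted = x * cmath.exp(1j * w_shift * n)  # frequency shift
        y = x_shifted - r * y_prev         # y[n] = x[n] - r * y[n-1]
        y_prev = y
        envelope.append(abs(y))            # envelope is the magnitude
    return envelope


def sff_bank(signal, freqs_hz, fs_hz, r=0.995):
    """Apply sff_envelope at each frequency; returns one envelope per frequency."""
    return [sff_envelope(signal, f, fs_hz, r) for f in freqs_hz]


# Toy check: a 1 kHz tone should give the largest envelope at 1 kHz.
fs = 8000
tone = [math.cos(2 * math.pi * 1000 * n / fs) for n in range(4000)]
freqs = [500, 1000, 2000]
envs = sff_bank(tone, freqs, fs)
# Average over the second half, after the filter transient has decayed.
steady = [sum(e[2000:]) / len(e[2000:]) for e in envs]
strongest = freqs[steady.index(max(steady))]
print(strongest)
```

Because the envelope is computed at every sample rather than per analysis frame, a bank of such filters gives a spectral estimate with high temporal resolution, which is the property the summary describes as capturing instantaneous changes of the vocal tract system.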
Keywords: Analytic phase, Dysarthria, Detection and assessment of dysarthria, Epoch locations, Instantaneous frequency, Intelligibility, Rhotic approximant, Single frequency filter bank, Strength of the excitation, Zero-phase, Zero frequency filtering.