Abstract: Current state-of-the-art (SOTA) automatic speech recognition (ASR) systems cannot transcribe overlapping speech audio streams separately. Consequently, when these ASR ...
Abstract: End-to-end speech-to-text translation (E2E ST) has recently attracted increasing interest as it attempts to address the problems of data scarcity and modeling burden. Several ...