AVFormer: Injecting vision into frozen speech models for zero-shot AV-ASR
Posted by Arsha Nagrani and Paul Hongsuck Seo, Research Scientists, Google Research

Automatic speech recognition (ASR) is a well-established technology that is widely adopted for various applications such as conference...