Could this Oxford-developed software help solve the tricky problem of lip-reading? | University of Oxford
Even the best human lip-readers struggle with accuracy.

Stuart Gillespie

Researchers in Oxford's Department of Computer Science are developing software to tackle the tricky business of lip-reading. With even the best human lip-readers limited in their ability to accurately recognise speech, artificial intelligence and machine learning could hold the key to cracking this problem.

While the new software, known as LipNet, is still in the early stages of development, it has shown great potential, achieving 93% accuracy on an existing lip-reading dataset.

Yannis Assael, a DPhil candidate in Oxford's Department of Computer Science who worked on the project, said: 'LipNet aims to help those who are hard of hearing. Combined with a state-of-the-art speech model, it has the potential to revolutionise speech recognition.

'The implications for the future could be quite significant. As we said in our LipNet paper's introduction, machine lip-readers have enormous practical potential, with applications in improved hearing aids, silent dictation in public spaces, covert conversations, speech recognition in noisy environments, biometric identification, and silent-movie processing.

'So far, we have compared the 93% performance of LipNet against human lip-reading experts – 52% – using the largest publicly available sentence-level lip-reading dataset, called the GRID corpus. LipNet also outperforms the best previous automatic lip-reading system, which achieved 80% on this dataset. And that system, unlike LipNet, predicts only words, not complete sentences.'

Brendan Shillingford, another Computer Science doctoral candidate who worked on the project, added: 'The GRID corpus has a fixed grammar and limited vocabulary, which is why LipNet performs so well there. Nonetheless, there are no signs that LipNet wouldn't perform well when trained on larger amounts of more varied data, and this is what we are working on now.

'In the future, we hope to test LipNet in a real-world setting, and we believe that extending our results to a large dataset with greater variation will be an important step in this direction.'