Technology designed by scientists at Oxford University and Leeds University can learn British Sign Language (BSL) signs from overnight TV broadcasts by matching subtitled words to the hand movements of an on-screen interpreter.
The work is a crucial step towards a system that can automatically recognise BSL signs and translate them into words.
A major challenge in recognising signs is tracking the signer’s hands as they move through the broadcast. This is no mean feat: the hands can get lost in the background, blur or cross, and the arms can assume a vast number of configurations.
The system tackles this problem by overlaying a model of the upper body onto the video frames of the signer. It searches for probable configurations, finds the large number of frames in which these can be correctly identified, and then ‘fills in the gaps’ to infer how the hands get from one position to another.
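The ‘filling in the gaps’ step can be pictured with a small sketch. The article does not say how the researchers infer the missing positions, so the code below simply assumes straight-line (linear) interpolation between the nearest frames where a hand was confidently located. The function name `fill_gaps` and the representation of a track as a list of (x, y) points are illustrative assumptions, not part of the published system.

```python
from typing import List, Optional, Tuple

Point = Tuple[float, float]


def fill_gaps(track: List[Optional[Point]]) -> List[Point]:
    """Fill missing hand positions (None) in a per-frame track.

    Assumption: linear interpolation between the nearest confident
    detections; the real system may use a very different method.
    """
    known = [i for i, p in enumerate(track) if p is not None]
    if not known:
        raise ValueError("no confident detections to interpolate from")

    filled = list(track)

    # Frames before the first or after the last confident detection
    # just copy the nearest known position.
    for i in range(known[0]):
        filled[i] = track[known[0]]
    for i in range(known[-1] + 1, len(track)):
        filled[i] = track[known[-1]]

    # Frames between two confident detections are placed on the
    # straight line joining them.
    for a, b in zip(known, known[1:]):
        (xa, ya), (xb, yb) = track[a], track[b]
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            filled[i] = (xa + t * (xb - xa), ya + t * (yb - ya))

    return filled


if __name__ == "__main__":
    # The hand is found in frames 0 and 2 but lost in frame 1.
    print(fill_gaps([(0.0, 0.0), None, (2.0, 4.0)]))
```

A straight line is the simplest possible guess at the path of the hand; it is used here only to make the idea concrete.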
Another big challenge is matching a target word that appears in a subtitle to the corresponding sign. This is particularly difficult because words and signs are often separated in time, and because a word can be signed in many different ways, so the expected sign may not appear at all.
To overcome this problem the system compares a small number of sequences in which the target word appears in the subtitles with a large number of sequences in which it does not.
Within this footage it then finds the short window of 7 to 13 frames that appears often in the ‘target word’ sequences and infrequently in the ‘no target word’ ones. This enables it to learn to match over 100 target words to signs automatically.
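The idea of picking the window that is common in ‘target word’ footage and rare elsewhere can be sketched as follows. This is a heavily simplified illustration, not the researchers’ method: it assumes each frame has already been reduced to a discrete string label (a hypothetical stand-in for hand shape and position), matches windows exactly, and scores each candidate as the fraction of ‘target word’ sequences containing it minus the fraction of ‘no target word’ sequences containing it. The names `contains` and `best_window` are made up for this example.

```python
from typing import List, Optional, Sequence, Tuple

Window = Tuple[str, ...]


def contains(seq: Sequence[str], window: Window) -> bool:
    """True if `window` occurs as a contiguous run of frames in `seq`."""
    n = len(window)
    return any(tuple(seq[i:i + n]) == window for i in range(len(seq) - n + 1))


def best_window(
    positives: List[Sequence[str]],
    negatives: List[Sequence[str]],
    min_len: int = 7,
    max_len: int = 13,
) -> Tuple[Optional[Window], float]:
    """Return the window that best separates positives from negatives.

    Score = share of positive sequences containing the window
            minus share of negative sequences containing it.
    Returns (None, -inf) if no positive sequence is long enough.
    """
    # Every window of an allowed length taken from a positive sequence
    # is a candidate for the sign.
    candidates = set()
    for seq in positives:
        for n in range(min_len, max_len + 1):
            for i in range(len(seq) - n + 1):
                candidates.add(tuple(seq[i:i + n]))

    best, best_score = None, float("-inf")
    for window in sorted(candidates):  # sorted for a repeatable result
        pos = sum(contains(s, window) for s in positives) / len(positives)
        neg = (
            sum(contains(s, window) for s in negatives) / len(negatives)
            if negatives
            else 0.0
        )
        if pos - neg > best_score:
            best, best_score = window, pos - neg
    return best, best_score


if __name__ == "__main__":
    # Toy data with 3-frame windows; "ABC" plays the role of the sign.
    positives = [list("xxABCyy"), list("zABCzzz"), list("qqqqqqq")]
    negatives = [list("xxyyzz"), list("zzzqqq"), list("ABxxC")]
    print(best_window(positives, negatives, min_len=3, max_len=3))
```

Note that the toy data includes a ‘target word’ sequence in which the sign never appears, mirroring the point above that a subtitled word is not always signed.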
‘This is the first time that a computer system has been able to learn signs on its own and on this scale in this way – with just the information available in the broadcast’s subtitle information and video frames and without the need for humans to give it annotated examples of what each sign looks like,’ said Andrew Zisserman of Oxford University’s Department of Engineering Science, who led the work with Patrick Buehler at Oxford and Mark Everingham of Leeds University’s School of Computing.
Mark Everingham said: ‘It demonstrates the sort of very tough problems which advanced image recognition technology is starting to be able to solve. These technologies have the potential to revolutionise the automated searching, classifying and analysis of moving and still images.’
This research was supported by the Engineering and Physical Sciences Research Council, Microsoft and the Royal Academy of Engineering.