Finding the meaning behind the words we use is something humans are so good at that it often seems simple.
But for computers, understanding the emotions embedded in text is a very difficult task.
I spoke to Stephen Pulman of Oxford University's Computing Laboratory about his research, which is helping computers to see what we mean:
OxSciBlog: Why do computers find it hard to understand the meaning behind words?
Stephen Pulman: There are several reasons. Words can be ambiguous: for example, 'cap' can mean 'beat', as in 'I can cap that', or 'limit', as in 'the government will cap student numbers'. If I hear 'the Council is to cap parking charges' I know - or at least hope - that 'cap' means 'limit' in this context. Computers have to be told this kind of thing explicitly, because they don't know anything about the world. Linguistically, the sentence could also mean that the Council were trying to beat some other Council's charges.
Another reason is that frequently we make inferences from words. If I want to know when the new Vice-Chancellor started work in Oxford, and I find a press release saying 'Professor Andrew Hamilton will be ceremonially installed as Oxford's 271st Vice-Chancellor today' then - at least, when I know which date 'today' referred to: another potentially difficult problem - I know when the new Vice-Chancellor (officially) started work.
OSB: How does your approach help them assess the emotional meaning of text?
SP: We have a very large list of words annotated for the emotional meaning they carry, and we also take account of the grammatical context in which these words occur, so that the effects of negation and other meaning-changing constructs are captured. A word like 'progress' is generally perceived as positive, but not when it is in a context like 'fail to progress' or 'little progress'.
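The idea of a sentiment lexicon whose scores are modified by surrounding context can be sketched in a few lines of Python. This is a deliberately simplified, hypothetical illustration, not Pulman's actual system: the word lists and the single polarity-flipping rule are made up for the example.

```python
# Illustrative sentiment lexicon: word -> polarity score.
LEXICON = {"progress": 1, "improve": 1, "decline": -1}

# Words that reverse the polarity of the sentiment word they modify.
SHIFTERS = {"fail", "no", "not", "little"}

def score(tokens):
    """Score a tokenised phrase, flipping polarity after a shifter."""
    total = 0
    flip = False
    for tok in tokens:
        polarity = LEXICON.get(tok, 0)
        if flip and polarity != 0:
            polarity = -polarity  # e.g. 'little progress' becomes negative
            flip = False
        total += polarity
        if tok in SHIFTERS:
            flip = True
    return total

print(score(["progress"]))                # 1
print(score(["fail", "to", "progress"]))  # -1
print(score(["little", "progress"]))      # -1
```

A real system would use a parser to find which word a shifter actually modifies, rather than this simple left-to-right flag.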
OSB: What are the advantages of your approach over existing systems?
SP: By taking account of grammatical context, we can determine emotional attitudes towards the entities and relations mentioned in a text, rather than just characterising the text overall as positive or negative. For example, a movie review might be enthusiastic about the production, but critical of the plot, or praise one actor and criticise another.
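The difference between whole-text and entity-level sentiment can be made concrete with a small sketch. Again this is a hypothetical toy, assuming the text has already been split into clauses and each clause linked to the entity it talks about; the lexicon entries are invented for the example.

```python
# Illustrative lexicon for the movie-review example.
LEXICON = {"enthusiastic": 1, "praise": 1, "critical": -1, "criticise": -1}

def entity_sentiment(clauses):
    """Given (entity, tokens) pairs, return a per-entity sentiment score
    instead of a single score for the whole text."""
    scores = {}
    for entity, tokens in clauses:
        scores[entity] = scores.get(entity, 0) + sum(
            LEXICON.get(tok, 0) for tok in tokens
        )
    return scores

review = [
    ("production", ["enthusiastic"]),  # positive about the production
    ("plot", ["critical"]),            # negative about the plot
]
print(entity_sentiment(review))  # {'production': 1, 'plot': -1}
```

A whole-document score for this review would be 0, hiding exactly the mixed verdict that the per-entity breakdown reveals.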
OSB: How might systems using your approach be useful to firms and government agencies?
SP: Many large companies are concerned about their reputation, and the reception that their products receive in the marketplace. By analysing news reports, blogs, or postings on social media like Facebook or Twitter, companies can get almost instant feedback about this. Government agencies can follow the attitudes of dissident or terrorist groups by using systems like ours to track mentions of people or places in intercepted emails and texts or on web sites, particularly when combining our technology with automated translation.
There is more about this work in an article in The Economist.
Stephen Pulman is Professor of Computational Linguistics at Oxford's Computing Laboratory.