The ambiguities that make NLP challenging

Natural language processing (NLP) deals with the interaction between computers and human natural language. This field, which merges interests from artificial intelligence and linguistics, shares the challenges common to any attempt to teach machines intelligent behaviour. Here we look at how human language is riddled with ambiguities, which can make NLP difficult.

Ambiguity on a syntactic level

When a sentence has multiple possible parse trees, it can be interpreted in more than one way. Sometimes, even with contextual information, it is hard for the machine to work out which meaning the speaker intended.

I saw the girl with the glasses

Does that mean I used the glasses to see the girl? Or did I see the girl who was wearing glasses?

Syntactic ambiguity commonly arises from phrase attachment, as in the sentence above, and from conjunctions.
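To make the idea of "multiple parse trees" concrete, here is a minimal sketch that counts the parses of the example sentence using a CKY-style chart. The grammar and lexicon are hypothetical and hand-written for this one sentence only; a real parser would use a much larger, usually probabilistic, grammar.

```python
from collections import defaultdict

# Toy grammar in Chomsky normal form (an assumption made for illustration).
# Each rule is (parent, left child, right child).
BINARY_RULES = [
    ("S", "NP", "VP"),
    ("VP", "V", "NP"),
    ("VP", "VP", "PP"),   # PP attaches to the verb: "saw ... with the glasses"
    ("NP", "Det", "N"),
    ("NP", "NP", "PP"),   # PP attaches to the noun: "the girl with the glasses"
    ("PP", "P", "NP"),
]

# Toy lexicon: word -> possible categories (also an assumption).
LEXICON = {
    "I": ["NP"],
    "saw": ["V"],
    "the": ["Det"],
    "girl": ["N"],
    "glasses": ["N"],
    "with": ["P"],
}


def count_parses(words, root="S"):
    """Count how many parse trees the toy grammar assigns to the words."""
    n = len(words)
    chart = defaultdict(int)  # (start, end, label) -> number of trees

    # Fill in the single-word spans from the lexicon.
    for i, word in enumerate(words):
        for label in LEXICON.get(word, []):
            chart[(i, i + 1, label)] += 1

    # Combine smaller spans into larger ones.
    for span in range(2, n + 1):
        for start in range(0, n - span + 1):
            end = start + span
            for split in range(start + 1, end):
                for parent, left, right in BINARY_RULES:
                    chart[(start, end, parent)] += (
                        chart[(start, split, left)] * chart[(split, end, right)]
                    )

    return chart[(0, n, root)]


if __name__ == "__main__":
    sentence = "I saw the girl with the glasses".split()
    print(count_parses(sentence))
```

Under this toy grammar the sentence gets two trees, one for each attachment of "with the glasses". The grammar alone gives no reason to prefer either.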

Ambiguity on an acoustic level

In speech recognition, some speech sounds share similar acoustic features, so the machine may mistake one for another. This is even more likely once you take into account coarticulation effects, where neighbouring speech sounds merge and blend into one another because speakers aim for less ‘effortful’ speech.

Ambiguity on a semantic level

Words can have more than one meaning. Researchers have different ways of estimating the most probable meaning of a word from the surrounding words in the phrase. Still, it can be tricky when two candidate meanings are about equally probable in that environment.
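As a rough illustration of using surrounding words to pick a meaning, the sketch below scores each sense of a word by how many of its "signature" words appear in the context. This is a simplified overlap count in the spirit of the Lesk algorithm, not a probability model, and the senses and signature words for "bank" are made up for the example.

```python
# Hypothetical sense inventory: word -> {sense name: signature words}.
SENSE_SIGNATURES = {
    "bank": {
        "financial institution": {"money", "loan", "deposit", "account", "cash"},
        "river edge": {"river", "water", "shore", "fishing", "mud"},
    }
}


def score_senses(word, context_words):
    """Return {sense: number of signature words found in the context}."""
    context = {w.lower() for w in context_words}
    return {
        sense: len(signature & context)
        for sense, signature in SENSE_SIGNATURES[word].items()
    }


def pick_sense(word, context_words):
    """Return the best-scoring sense, or None when the top score is tied."""
    scores = score_senses(word, context_words)
    best = max(scores.values())
    winners = [sense for sense, score in scores.items() if score == best]
    return winners[0] if len(winners) == 1 else None


if __name__ == "__main__":
    clear = "she went to the bank to deposit cash".split()
    unclear = "he sat near the bank with his money and watched the river".split()
    print(pick_sense("bank", clear))
    print(pick_sense("bank", unclear))
```

The second sentence contains evidence for both senses, so the scores tie and the function declines to choose. That is the tricky case described above.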

Ambiguity on a discourse level 

John said to his brother that he was bothered

Is John bothered? Or did he say that his brother was bothered?

We as humans often need clarification, especially when it comes to pronouns. At the discourse level, a pronoun is ambiguous when it has more than one possible referent. Cognitively, the more complex a discourse is and the more information it contains (including pronouns and characters), the more effort it takes to hold it in memory and keep following the story. For the machine, resolving pronouns is a tricky challenge that even we sometimes struggle with.
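The pronoun problem can be sketched as a filtering step: list the entities mentioned before the pronoun and keep those that agree with it in gender and number. The `Mention` class, the feature labels, and the hand-annotated mentions below are all assumptions for illustration; a real system would have to extract and annotate mentions automatically.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mention:
    """A hand-annotated entity mention (hypothetical structure)."""
    text: str
    gender: str   # "male" or "female" in this toy example
    number: str   # "singular" or "plural"


# Toy pronoun features: pronoun -> (gender or None for any, number).
PRONOUN_FEATURES = {
    "he": ("male", "singular"),
    "she": ("female", "singular"),
    "they": (None, "plural"),
}


def candidate_referents(pronoun, preceding_mentions):
    """Return the texts of mentions that agree with the pronoun."""
    gender, number = PRONOUN_FEATURES[pronoun.lower()]
    return [
        m.text
        for m in preceding_mentions
        if m.number == number and (gender is None or m.gender == gender)
    ]


if __name__ == "__main__":
    # "John said to his brother that he was bothered"
    mentions = [
        Mention("John", "male", "singular"),
        Mention("his brother", "male", "singular"),
    ]
    print(candidate_referents("he", mentions))
```

For the example sentence both "John" and "his brother" survive the filter, so agreement alone cannot resolve "he". Replacing "his brother" with "his sister" would leave only one candidate.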


This week’s topic was inspired by the NLP lectures of Michael Collins at Columbia University.




