Date: 2017-06-07

In the early days, the field involved writing out symbolic rules of grammar: subject, noun, clauses, predicates with a verb, perhaps followed by a noun phrase, perhaps followed by a prepositional phrase, and so forth. People worked for years trying to replicate grammar and lexicons. It worked fine in very small contexts, but never really extended to understanding meaning.

Then, in the ’90s, the first revolution came when masses of language became available online in digital form. That was when people started to explore statistical methods of analyzing all that data, building probabilistic models of which words are likely to appear together to create meaning and sentiment.
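To make the idea of a probabilistic model of word co-occurrence concrete, here is a minimal sketch of a bigram model, one of the simplest statistical approaches of that kind. The toy corpus and the function name `bigram_probabilities` are assumptions made up for illustration; real systems of that era were trained on far larger collections of text and used smoothing to handle unseen word pairs.

```python
from collections import Counter, defaultdict

# Toy corpus (hypothetical, for illustration only).
corpus = [
    "hey how is it going",
    "how is it going today",
    "good morning how are you",
]

def bigram_probabilities(sentences):
    """Estimate P(next_word | word) from raw bigram counts (no smoothing)."""
    pair_counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        # Count each adjacent pair (current word, following word).
        for current, following in zip(words, words[1:]):
            pair_counts[current][following] += 1

    probabilities = {}
    for current, followers in pair_counts.items():
        total = sum(followers.values())
        probabilities[current] = {w: c / total for w, c in followers.items()}
    return probabilities

probs = bigram_probabilities(corpus)
print(probs["how"])  # how likely each word is to follow "how" in this corpus
```

In this tiny corpus, "is" follows "how" in two of three cases and "are" in one, so the model prefers "how is" simply because it has seen it more often, not because of any grammatical rule.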

Grammatical rules have very little to do with communication, or with what it means to sound natural. You could walk up to someone and say, “Good morning, how are you this morning?” It’s perfectly correct, but no one says that. They say, “Hey, how’s it going?” Most language is made up of these softer decisions about how people actually use it.

In natural language processing, you need the computer to understand the world. How do you do that? Well, a very good way is through the enormous amount that has already been written about the world. Every day, writers across the planet are writing about our world and how it all works. We create computational models that assign mathematical values to words and groups of words and use them to successfully read text and derive meaning.
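As a rough illustration of assigning mathematical values to words, the sketch below represents each word by counts of the other words it appears alongside, then compares words with cosine similarity. This is a deliberately simple stand-in, not the specific models the speaker has in mind; the sentences and the helper names `context_vectors` and `cosine` are assumptions for the example.

```python
import math
from collections import Counter

# Toy sentences (hypothetical, for illustration only).
sentences = [
    "the cat drinks milk",
    "the dog drinks water",
    "the car needs fuel",
]

def context_vectors(sentences):
    """Map each word to a count vector of the words sharing a sentence with it."""
    vectors = {}
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            context = words[:i] + words[i + 1:]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

vectors = context_vectors(sentences)
print(cosine(vectors["cat"], vectors["dog"]))  # words used in similar contexts
print(cosine(vectors["cat"], vectors["car"]))  # words used in less similar contexts
```

Here "cat" and "dog" score as more similar than "cat" and "car" because they share more surrounding words, which is the basic intuition behind deriving meaning from how words are used in text.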