Understand Natural Language Processing in 5 questions

Posted by Charlotte Foglia

One of the many exciting futurisms predicted by the original Star Trek TV series was the notion that people could have conversations with computers. “Computer, show me a map of all Klingon bases in the galaxy,” Captain Kirk would say to some unseen microphone on the bridge of the Enterprise. In response, the computer would—as if by magic—show the requested map on a large screen.

It seemed so incredibly far-fetched when the series came out in 1966. At that point, human beings could only interact with computers using teletypes and, in some cases still, paper punch cards. What Star Trek viewers could not have known, however, was that even by 1966 computer scientists were already a decade into ambitious projects aimed at getting computers to understand human language. Progress was slow at first, but the last few years have seen remarkable breakthroughs in natural language processing (NLP).

Today, we take it for granted that we can ask Siri to show us a map, as if we were Captain Kirk, but it has taken half a century to reach the point where computers are capable of speech recognition. As NLP’s sophistication and potential increase, it’s worth taking a moment to understand how computers achieved the powers of natural language understanding and natural language generation. These concepts are likely to dominate the IT field in the coming years.

What is Natural Language Processing?

Simply put, natural language processing refers to any human-to-machine interaction in which the computer is able to understand or generate human-like language. This is in contrast to what computers were previously limited to, which was some form of machine language. Simple as the end result may appear, the actual process of getting a computer to perform NLP represents an extremely complex synergy of different scientific and technical disciplines.

NLP brings together computer science, linguistics, machine learning and artificial intelligence (AI). Depending on how the NLP software is designed, the process may also involve statistics and data analytics. These elements, working in concert, make NLP capable of “hearing” or reading natural human speech and accurately parsing the words so the computer can take the action expected by the human user.

How does it work?

There is no single way that NLP software functions. In general, however, almost all NLP tools can distinguish syntactic and semantic rules in addition to recognizing many different words. This is not an easy thing to do.

Consider the following written enterprise search query: An employee wants to know if the company has declared December 26 a holiday. She types “Do we get Boxing Day off?” That’s a question that most human beings can easily understand, even if they have to figure out that December 26 is sometimes called “Boxing Day.”

An NLP program is going to have to deconstruct this question on syntactic and semantic levels in order to enable the enterprise search engine to return a result.

First, it has to recognize that “Do we” introduces a question and that “we” means the employees of the company. Then, it has to parse the words “get” and “off” as referring to getting time off from work, versus, say, getting off a plane. Finally, it needs to understand what Boxing Day is, to the point where the search engine can actually respond to the real query, which is “Is December 26 a holiday?”
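
The three steps above can be illustrated with a minimal sketch. It assumes hand-written lookup tables (`DATE_ALIASES` and `QUESTION_STARTERS` are hypothetical names invented for this example) and a toy rule for “get … off”; a production NLP tool would rely on far richer linguistic resources.

```python
import re

# Hypothetical lookup tables for illustration only.
DATE_ALIASES = {"boxing day": "December 26"}
QUESTION_STARTERS = ("do", "does", "is", "are", "can", "will")


def interpret(query):
    """Turn a casual question into a small structured search request."""
    text = query.strip().rstrip("?").lower()
    words = text.split()
    # Step 1: a leading auxiliary verb signals a yes/no question.
    is_question = bool(words) and words[0] in QUESTION_STARTERS
    # Step 3: resolve a named day to a calendar date, if we know it.
    date = next((d for alias, d in DATE_ALIASES.items() if alias in text), None)
    # Step 2: read "get ... off" as time off only when a known date is present,
    # so "get off the plane" is not mistaken for a holiday question.
    has_get_off = re.search(r"\bget\b.*\boff\b", text) is not None
    topic = "holiday" if has_get_off and date else None
    return {"is_question": is_question, "topic": topic, "date": date}


print(interpret("Do we get Boxing Day off?"))
```

With the result in hand, the search engine could look up “holiday” and “December 26” instead of the literal words the employee typed.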

Performing these tasks is a pretty gigantic challenge. Human language is fantastically complex, with English being arguably one of the most difficult of them all. NLP transforms data into something that a computer can interpret by starting with what is known as “data pre-processing.”

In this stage, the NLP tool analyzes syntax and semantics to understand the grammatical structure of the text. It identifies how the words relate to one another in the specific context at hand.

Data pre-processing may utilize tokenization, which breaks text down into semantic units for analysis. The process then tags different parts of speech, e.g. “we” is a pronoun, “do” is a verb and so forth. It could then apply techniques called “stemming” and “lemmatization,” which reduce words to their root forms. The NLP tool might also filter out words like “a” and “the” that don’t carry any unique information.
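
As a rough illustration of these pre-processing steps, the sketch below tokenizes a query, filters stop words and applies a very crude suffix-stripping stemmer. The stop-word list and suffix list are assumptions made up for this example, and part-of-speech tagging and lemmatization are left out because they need a lexicon or a trained model.

```python
import re

# Tiny illustrative lists; real tools ship much larger, curated ones.
STOP_WORDS = {"a", "an", "the", "do", "we", "is", "to", "of"}
SUFFIXES = ("ing", "ed", "es", "s")  # crude stemmer, not the Porter algorithm


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9]+(?:'[a-z]+)?", text.lower())


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, drop stop words, then stem what remains."""
    return [stem(token) for token in tokenize(text) if token not in STOP_WORDS]


print(preprocess("Do we get Boxing Day off?"))  # ['get', 'box', 'day', 'off']
```

Note that the stemmer reduces “Boxing” to “box,” which shows why a tool would want to recognize a name like “Boxing Day” before stemming it.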

The data pre-processing step generates a clean dataset for the actual linguistic analysis. The NLP tool has an algorithm that then interprets the dataset. The algorithm can take one of several forms. With a rule-based approach, the NLP tool uses grammatical rules created by expert linguists.

Alternatively, and this is increasingly common, NLP makes use of machine learning algorithms. These models are based on statistical methods that “train” the NLP to get better and better at understanding human language. Going further, the NLP tool might take advantage of deep learning, sometimes called deep structured learning, which is based on artificial neural networks.
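
To make the idea of “training” concrete, here is a minimal sketch of one classic statistical method, a naive Bayes text classifier, written with the Python standard library only. The training sentences and the labels `time_off` and `it_support` are invented for this example, and the technique is just one of many that an NLP tool might use.

```python
import math
from collections import Counter, defaultdict

# Toy, made-up training data: (text, label) pairs.
TRAINING = [
    ("do we get friday off", "time_off"),
    ("is monday a company holiday", "time_off"),
    ("how many vacation days do i have", "time_off"),
    ("reset my laptop password", "it_support"),
    ("my laptop will not start", "it_support"),
    ("cannot log in to email", "it_support"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocab = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocab


def classify(text, model):
    """Return the label with the highest log-probability for the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        n_words = sum(word_counts[label].values())
        score = math.log(label_counts[label] / total)
        for word in text.lower().split():
            # Add-one (Laplace) smoothing so unseen words do not zero out a label.
            score += math.log((word_counts[label][word] + 1) / (n_words + len(vocab)))
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAINING)
print(classify("do we get monday off", model))  # time_off
```

Adding more labeled examples changes the word counts, which is the sense in which such a model gets “better and better” with data.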

What is an NLP application?

NLP applications put NLP to work in specific use cases, such as intelligent search. The technology has many uses, especially in the business world where people need help from computers in dealing with large volumes of unstructured text data.

For example, a company might benefit from understanding its customers’ opinions of the brand. Whatever insights exist, however, are hidden within millions of social media messages. No human being is going to read them all, but an NLP tool that is tuned for “sentiment analysis” could get the job done.
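
A very simple form of sentiment analysis can be sketched with a word lexicon. The word lists and sample messages below are assumptions invented for illustration; commercial tools typically use trained models rather than fixed lists, and this sketch ignores negation and sarcasm.

```python
import re
from collections import Counter

# Hypothetical mini-lexicon; real lexicons contain thousands of scored words.
POSITIVE = {"love", "great", "excellent", "happy", "recommend"}
NEGATIVE = {"hate", "terrible", "broken", "awful", "refund"}


def sentiment(message):
    """Label a message by comparing counts of positive and negative words."""
    words = re.findall(r"[a-z']+", message.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


messages = [
    "Love the new phone, great battery",
    "Screen arrived broken, I want a refund",
    "Ordered it on Tuesday",
]
# Summarize brand sentiment across all messages.
summary = Counter(sentiment(m) for m in messages)
print(summary)
```

Run over millions of messages, the same kind of tally would give the company an overall picture that no human reader could assemble by hand.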

Other notable NLP applications include:

  • Virtual Assistants and Chatbots: these familiar bits of software use NLP to answer questions and provide online help, among many other use cases. They are often configured to learn from the “conversations” they have.
  • Market Research: NLP is able to help marketers learn about their customers by analyzing human language contained in unstructured data such as chat threads and online comments. This process uses text classification, another NLP application. Market research could also require text extraction, wherein the NLP tool looks for specific words, such as a product name, and extracts the relevant text for use in customer analysis. The software may be able to infer purchase intent, among other capabilities.
  • Speech Recognition: NLP tools can recognize spoken language, as virtual assistants like Amazon’s Alexa and Apple’s Siri do. The technology can also be put to work transcribing recordings and voice messages.
  • Urgency Detection: an NLP tool can be trained to spot urgent issues in a stream of natural language. For example, if a company receives 100,000 support emails a day, it can use NLP urgency detection to find customers who need help right away by spotting phrases like “I’m locked out of my car” or “I am about to go into the hospital.”
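
The urgency detection described in the list above can be approximated with a few phrase patterns. This is a deliberately simplified stand-in for a trained model: the patterns in `URGENT_PATTERNS` are assumptions chosen for this example, based on the sample phrases in the text.

```python
import re

# Hypothetical phrase patterns; a trained model would learn such cues from data.
URGENT_PATTERNS = [
    re.compile(r"\blocked out\b"),
    re.compile(r"\bhospital\b"),
    re.compile(r"\b(right away|immediately|urgent(ly)?|asap)\b"),
]


def is_urgent(email_body):
    """Return True if any urgency pattern appears in the email."""
    text = email_body.lower()
    return any(pattern.search(text) for pattern in URGENT_PATTERNS)


def triage(emails):
    """Put urgent emails first, keeping the original order within each group."""
    # sorted() is stable, and False sorts before True.
    return sorted(emails, key=lambda email: not is_urgent(email))


inbox = [
    "Where is my invoice?",
    "I'm locked out of my car",
    "Thanks for the help",
]
print(triage(inbox))
```

A support team could then work through the sorted list from the top, reaching the customer who is locked out before the routine invoice question.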

What are the benefits of NLP?

As the NLP applications suggest, the technology can deliver an extensive array of benefits to businesses. With NLP, machines get smarter and more able to interact with human beings.

In enterprise search, for example, NLP-enabled search platforms make it easier for employees to find information and documents. They don’t have to know exactly what they’re searching for. They can write a more general query and the platform will use its human language skills to discover whatever it is the user needs. This saves time and, ultimately, money.

Government and public sector organizations can similarly benefit from NLP. By making it possible for people to get information and services without having to wait for a person, NLP can potentially improve people’s lives. It also helps keep costs down, a constant concern in the public sector.

What are the challenges of NLP?

NLP is an impressive technology, but it’s still quite early in its lifecycle. In 2066, people will probably be amazed at how primitive 2020’s state of the art looks to them.

Many challenges exist. These include making speech recognition better along with achieving a more consistent and accurate understanding of language. This problem is partly due to some current limitations of AI. There is more to intelligence than just language, after all.

For instance, a computer may not understand the meaning behind a statement like, “My wife is angry at me because I didn’t eat her mother’s dessert.” There’s a lot of human culture embedded in the language. The technology to identify such nuances has not been invented so far.

NLP is a foundational aspect of intelligent search solutions used in the enterprise. By understanding natural language, an enterprise search platform is better able to give users the information they seek. The platform can achieve this goal by connecting NLP with data in all systems, formats, locations or languages. The technology has come a long way in recent years, with many more advances sure to come in the near future. The bridge of the starship Enterprise awaits.