July 10, 2020

Natural language generation. What it is and why it matters.

Mike Khanna

Natural language generation lets computers create meaningful sentences that humans understand. This is a key part of embedding AI in business processes.

Over recent years, natural language processing (NLP) has grown from an obscure research topic to a central aspect of AI. NLP is a catch-all term relating to teaching computers to communicate in human language. At its heart lie two key concepts. Natural language understanding (NLU) is about understanding human vocabulary and grammar to extract the meaning in sentences and longer texts. Natural language generation (NLG) is the flipside of the coin. NLG lets computers generate meaningful sentences that can be understood by humans. So, what’s so great about that? In this blog, I will explain why NLG is essential if you are embedding AI into your business processes.

Why it pays if computers can compose text

Many business processes rely on generating large numbers of documents, for instance when you sign up a new customer or provider, hire a new employee, or produce your annual report to shareholders. In all these cases, you need to process large amounts of data and then use it to generate complex documents. This is something NLG is perfect for.

Similarly, in industrial settings, you might want to generate status reports on the fly based on current sensor data, or create an interactive dashboard showing details of your end-to-end manufacturing process. NLG is a core element in such systems. But where did NLG come from in the first place?

A brief history of NLP

Before we look in detail at NLP, I want to give you a brief history lesson. Many people assume NLP is a new concept or that it was purely science fiction (think Star Trek’s talking computers). However, NLP is one of the oldest branches of theoretical computer science. Back in the 1950s, Alan Turing, often dubbed the father of modern computing, became interested in artificial intelligence. At the time, the most powerful computers were the Manchester Mark 1 and EDSAC. Neither of these had anything like the computing power needed for even simple AI. So, Turing’s work was entirely theoretical.

One of the key questions he addressed was how to tell if a computer is exhibiting human intelligence. His simple solution was a modification of a popular party game called the imitation game. In Turing’s game, a human and a computer would hold a conversation. Another human would then judge which of the two conversationalists was the computer. If the judge couldn’t reliably tell, the computer had won the game. Nowadays, we use a modified version of his game called the Turing Test to validate chatbots.

But, you might ask, why is this significant to NLG? The answer is that this was the first time someone had suggested a computer could conduct natural language conversations. In turn, this inspired the creation of a new field of computer science looking at natural language processing.

Natural language generation is a core part of NLP

The main elements of NLP

Over the past decade, we have become rather blasé about natural language processing. Thanks to Siri, Alexa, and the like, the idea of humans speaking to computers no longer feels magical or special. But this wouldn’t be possible without NLP. So, let’s take a look at the key elements that allow us to talk to Alexa.

Natural language understanding

All two-way verbal communication consists of two key parts: understanding and responding. Natural language understanding (NLU) is the term for teaching a computer to understand human language. NLU takes the text, converts it to structured data, and tries to extract the meaning from it. It achieves this using its knowledge of vocabulary, grammar, and the context of the sentence.

NLU is quite a challenging problem, especially when dealing with English. This is because English is an extremely rich language but with quite weak grammar. There are always several ways to say anything in English. For instance, if you want to tell your spouse that dinner is ready you could say: “Dinner’s ready”, “the food’s on the table”, “come and eat”, “dinnertime”, etc.
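To make the "many phrasings, one meaning" problem concrete, here is a deliberately naive Python sketch. It maps each of the example phrasings to a single structured result by looking for cue words. The cue list, the `understand` function, and the `dinner_ready` label are all hypothetical names chosen for illustration; real NLU systems learn these mappings from data instead of relying on a hand-written word list.

```python
import re

# Hypothetical cue words for one intent; a real system would learn these from data.
DINNER_CUES = {"dinner", "dinnertime", "food", "eat"}

def understand(utterance: str) -> dict:
    """Convert free text into a tiny piece of structured data (an intent label)."""
    # Lowercase and keep only alphabetic tokens, so "Dinner's" yields "dinner".
    tokens = set(re.findall(r"[a-z]+", utterance.lower()))
    intent = "dinner_ready" if tokens & DINNER_CUES else "unknown"
    return {"intent": intent, "text": utterance}

for phrase in ["Dinner's ready", "the food's on the table", "come and eat", "dinnertime"]:
    print(understand(phrase)["intent"])
```

A word list like this breaks down quickly (it would also match "I don't want to eat"), which is one reason NLU moved to machine learning.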

Over time, we have taught computers to understand natural language using techniques like machine learning. In particular, NLU relies on reinforcement learning and human assistance. Nowadays, the best systems are able to understand even complex examples of natural language.

Natural language generation

NLG, or natural language generation, is the flipside of the coin. NLG is about getting a computer to generate natural language text from structured data. This allows the computer to talk back to you.

NLG has developed over a number of years. In that time it has grown from simple fill-the-gap systems to modern AI-based approaches. Let’s look at the evolution in a bit more detail.

Template approaches

The early forms of NLG all relied on templates. Originally, these were fill-the-gap templates where the computer completed the missing entries by looking up the relevant data in, for example, a database. For instance, you might have a system that sends payment reminders. “Dear {NAME}, Your next payment is due. Please pay us {AMOUNT} by {DATE}.”
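A fill-the-gap template like the payment reminder can be sketched in a few lines of Python. This is a minimal illustration, not any particular product's implementation: the `fill_reminder` function and the sample customer record are made up, and the record stands in for a row fetched from a database.

```python
# Reminder template; the placeholder names follow the example in the text.
REMINDER = "Dear {NAME}, Your next payment is due. Please pay us {AMOUNT} by {DATE}."

def fill_reminder(record: dict) -> str:
    """Fill the gaps in the template from one database-style record."""
    return REMINDER.format(**record)

# Hypothetical record standing in for a database row.
customer = {"NAME": "Ms. Lee", "AMOUNT": "$120.00", "DATE": "August 1"}
print(fill_reminder(customer))
```

The weakness of this approach is visible straight away: the sentence structure never changes, so every reminder reads identically apart from the filled-in values.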

Over time, these systems became more intelligent, using scripts and rules to create more complex templates. These allowed businesses to automatically generate more complex documents, such as simple contracts. The most recent evolution of this approach sees computers using grammatical rules to ensure the text makes perfect sense. Thus, instead of using a specific test to ensure word agreement, the computer understands the actual rules. For example, the computer understands the difference between “you bought two items” and “you bought an item”.
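The word-agreement example can be illustrated with a small rule-based sketch in Python. The rules below are rough approximations written for this post (real systems rely on a lexicon and a fuller grammar), and all function names are hypothetical.

```python
# Number words for small counts; larger counts fall back to digits.
NUMBER_WORDS = {2: "two", 3: "three", 4: "four", 5: "five"}

def indefinite_article(noun: str) -> str:
    """Pick 'a' or 'an' using a simple first-letter rule (an approximation)."""
    return "an" if noun[0].lower() in "aeiou" else "a"

def pluralize(noun: str) -> str:
    """Very rough English plural rule; irregular nouns are not handled."""
    if noun.endswith(("s", "x", "ch", "sh")):
        return noun + "es"
    return noun + "s"

def purchase_sentence(count: int, noun: str) -> str:
    """Apply number agreement instead of hard-coding one sentence per case."""
    if count == 1:
        return f"You bought {indefinite_article(noun)} {noun}."
    quantity = NUMBER_WORDS.get(count, str(count))
    return f"You bought {quantity} {pluralize(noun)}."

print(purchase_sentence(1, "item"))  # You bought an item.
print(purchase_sentence(2, "item"))  # You bought two items.
```

Because the agreement logic lives in general rules, the same function works for nouns the template author never anticipated, within the limits of the simplified rules.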

Machine learning NLG approaches

The advent of machine learning allowed NLG to advance in leaps and bounds. Early NLG approaches often used Markov chain models to predict the correct next word in a sentence. However, more modern NLG systems tend to rely on more complex machine learning approaches. These include recurrent neural networks such as LSTM models, and newer architectures like the Transformer. These enabled new forms of NLG.
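To show what "predicting the next word" means in a Markov chain model, here is a minimal first-order sketch in Python. The tiny corpus, the function names, and the fixed random seed are all assumptions made for illustration; a practical model would be trained on far more text.

```python
import random
from collections import defaultdict

def build_chain(text: str) -> dict:
    """Record, for every word, the words that followed it in the training text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain: dict, start: str, max_words: int = 8, seed: int = 0) -> str:
    """Generate text by repeatedly sampling a follower of the current word."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same text
    word = start
    output = [word]
    for _ in range(max_words - 1):
        followers = chain.get(word)
        if not followers:
            break  # no known continuation for this word
        word = rng.choice(followers)
        output.append(word)
    return " ".join(output)

# Toy training text, made up for this example.
corpus = "your order is ready . your payment is due . your order is late ."
chain = build_chain(corpus)
print(generate(chain, "your"))
```

The model only ever looks one word back, so it can produce locally plausible word pairs without any sense of the overall sentence. That limitation is what the neural approaches address.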

Dynamic sentence generation

Dynamic sentence generation involves constructing sentences from scratch. In other words, it does away with human-created templates altogether. Instead, it uses a representation of the desired meaning of the sentence. It then uses its knowledge of vocabulary, grammar, and usage to build the sentence word by word. Importantly, this approach allows the NLG system to exhibit human-like behaviors, like simplifying by using references within the sentence. For instance, “You will need to sign for your order when it is delivered.”
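The use of references can be illustrated with a heavily simplified Python sketch. It starts from a small meaning representation (two clauses that mention the same entity) and replaces the second mention with a pronoun. The dictionary layout and the `realize` function are hypothetical, and the sketch covers only this one behavior; it does not build sentences word by word the way a full system would.

```python
def realize(main: dict, condition: dict) -> str:
    """Build 'MAIN when CONDITION', replacing a repeated entity with a pronoun."""
    mentioned = set()

    def refer(entity: str) -> str:
        # The first mention uses the full noun phrase; later mentions use "it".
        if entity in mentioned:
            return "it"
        mentioned.add(entity)
        return entity

    main_clause = f"{main['subject']} {main['verb']} {refer(main['object'])}"
    cond_clause = f"{refer(condition['subject'])} {condition['verb']}"
    sentence = f"{main_clause} when {cond_clause}."
    return sentence[0].upper() + sentence[1:]

# Hypothetical meaning representation for the example sentence in the text.
meaning_main = {"subject": "you", "verb": "will need to sign for", "object": "your order"}
meaning_cond = {"subject": "your order", "verb": "is delivered"}
print(realize(meaning_main, meaning_cond))
```

Without the reference step, the output would repeat "your order" in both clauses, which is grammatical but reads less naturally.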

Dynamic document generation

The very latest NLG systems are capable of dynamic document generation. Here, the system generates complete documents that have a good structure and narrative. This requires a three-stage process:

Document planning, which involves planning the overall message and structure for the document.

Micro-planning, where the machine starts to add the bullet points for each part of the document.

Realization, where the computer converts the bullet points into actual sentences and coherent paragraphs.
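The three stages above can be sketched as a small Python pipeline that turns sensor readings into a short status report. Everything here is an assumption made for illustration: the function names, the shape of the readings, and the wording of the sentences. A production system would use far richer planning and grammar at each stage.

```python
def plan_document(readings: dict) -> list:
    """Document planning: decide which sections the report needs, and in what order."""
    plan = ["summary"]
    if any(r["value"] > r["limit"] for r in readings.values()):
        plan.append("alerts")
    plan.append("closing")
    return plan

def micro_plan(plan: list, readings: dict) -> dict:
    """Micro-planning: attach bullet-point facts to each planned section."""
    bullets = {}
    for section in plan:
        if section == "summary":
            bullets[section] = [("count", len(readings))]
        elif section == "alerts":
            bullets[section] = [
                ("over", name, r["value"], r["limit"])
                for name, r in readings.items()
                if r["value"] > r["limit"]
            ]
        else:
            bullets[section] = [("end",)]
    return bullets

def realize(bullets: dict) -> str:
    """Realization: convert each bullet point into an actual sentence."""
    sentences = []
    for facts in bullets.values():
        for fact in facts:
            if fact[0] == "count":
                noun = "sensor" if fact[1] == 1 else "sensors"
                sentences.append(f"This report covers {fact[1]} {noun}.")
            elif fact[0] == "over":
                _, name, value, limit = fact
                sentences.append(f"The {name} reading is {value}, above its limit of {limit}.")
            else:
                sentences.append("End of report.")
    return " ".join(sentences)

# Hypothetical sensor data.
readings = {
    "temperature": {"value": 82, "limit": 75},
    "pressure": {"value": 30, "limit": 40},
}
print(realize(micro_plan(plan_document(readings), readings)))
```

Keeping the stages separate means the decision about what to say (planning) can change without touching how it is said (realization).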

NLG in the real world

Sonasoft Saibre provides a good example of NLG being used in the real world. Saibre is a suite of AI bots that are designed to streamline customer and employee support systems. One key part of Saibre is the ability to automatically generate relevant responses to customer tickets. This involves using NLU to understand the request, then parsing support documentation before using NLG to create a suitable response. And if you want to try doing NLG for yourself, try playing with SimpleNLG, an open-source Java API.
