Modern marketers often have to reconcile long-standing marketing strategies with changing technologies that become more and more complex. For search engine marketing this now means understanding how natural language processing might change the landscape.

It is impossible to predict how Google will change its algorithm, but search engine natural language processing (NLP) stands to be one of the most important evolutions in search. So what is search engine NLP? How do you use NLP for marketing on Google? What does it mean to focus on natural language, and how can marketers learn from Google’s goal of better understanding language?

In 2019 Google announced that it had taken another major step toward understanding language by implementing a process for better understanding words within the context of search queries. More specifically, it had added a complex NLP process built on Bidirectional Encoder Representations from Transformers, or BERT.

This was the most recent event in a string of updates over the years that have hinted at Google’s long-standing goal of better language processing.

The Google BERT update meant that Google could use the rest of a search query to better understand the specific meaning of each word in a search phrase. It’s significant because it greatly changes the way search engines can handle language, and it could play a major role in how to use NLP for marketing and SEO.

For example, the queries “frequent flyer programs” and “custom flyer printing” both contain the word “flyer,” but each uses a different definition of the word. As humans we can look at these phrases and understand the difference based on context: one refers to airline awards programs, and the other refers to promotional paper printouts. It’s intuitive.

Until recently, it was very difficult for a search engine to determine the “intent” behind a query typed into something like Google. Google could not easily tell which definition of “flyer” was intended, so the best way to handle such a query was to search for instances of the complete phrase within the content of pages stored in the search index.

All this meant that Google was really only trying to match results for a search query. In short, Google searches were best at returning results that matched the structure or text of a search, but not necessarily the intended meaning.

 

Some History on Search Engine NLP

So what does this all mean? And more importantly what can businesses gain from learning how to use NLP for marketing?

Understanding search engine NLP will be important for websites looking to utilize SEO, especially as Google’s algorithm continues to become more sophisticated. It’s possible that machine learning based AI can help Google train its algorithm by scanning a body of text and using each word in that text to help understand the definition of every other word in the text.

Future SEO could hinge on this.

Discussion about search engine NLP has been around for a long time. SEOs and marketers have talked about the concept of latent semantic indexing (LSI) since at least 2005, although Google has never confirmed that it uses it. The idea behind LSI is loosely similar to the idea behind the modern BERT model: search engines can “learn” which words are related, which phrases are closely related, and which words are synonyms merely by seeing how they appear together.

In 2013 Google introduced the Hummingbird algorithm, an overhaul of its core algorithm and its first official step into search engine NLP. Many SEOs also read it as a sign that something very much like LSI was actually being used.

Hummingbird was a huge step toward natural language processing, and it meant that NLP for search engines and NLP marketing were now at the forefront of SEO best practices. The update sought to downrank sites that were stuffing content with keywords while also better ranking sites with complex content that was previously difficult for Google to understand. One of its main focuses was understanding “conversational” language and more complex phrases.

When Hummingbird was rolled out to the web it impacted close to 90 percent of all searches.

Conversational language was a key concept here and fundamental to how Google was approaching search engine NLP. Around that same time Google introduced voice search, in which spoken input is converted into a text query, as well as the Google Knowledge Graph, which could provide specific info about nouns and entities (like celebrities, landmarks, locations, etc.).

These updates meant that for the first time Google’s search engine NLP could begin to grasp synonyms and homonyms. It meant that marketers could expect ever more granular search results and that they could tailor their content to an ever more precise audience.

It also meant that exploiting search engine NLP for marketing involved trying to figure out exactly how to utilize new SERP features like knowledge graphs and rich results. Google insisted then, just like they do now, that best SEO results come from producing content that is best tailored to user needs, and not any secret NLP marketing strategy.

In 2015 Google introduced the now famous RankBrain algorithm update, a major step into search engine NLP and the use of artificial intelligence. This meant that NLP marketers had to understand how RankBrain changed the way results were delivered to searchers in order to take advantage of its functionality.

Here’s how RankBrain works. Google describes RankBrain as the algorithm’s ability to use AI to guess at the meaning of some search phrases and to filter search results accordingly. This was particularly good for “never-before-seen” search queries.

When this early NLP precursor was announced by Google in a Bloomberg article they explained:

RankBrain uses artificial intelligence to embed vast amounts of written language into mathematical entities — called vectors — that the computer can understand. If RankBrain sees a word or phrase it isn’t familiar with, the machine can make a guess as to what words or phrases might have a similar meaning and filter the result accordingly, making it more effective at handling never-before-seen search queries.
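
To make the idea of words as vectors more concrete, here is a minimal Python sketch. The three-dimensional vectors and the most_similar helper are invented for illustration; real systems learn much larger vectors from huge amounts of text, and nothing here reflects how RankBrain is actually implemented. It only shows how a program can "guess" which known word is closest in meaning by comparing vectors.

```python
import math

# Toy, hand-written vectors (hypothetical values, not learned from data).
VECTORS = {
    "flyer":    [0.9, 0.1, 0.3],
    "leaflet":  [0.8, 0.2, 0.4],
    "brochure": [0.7, 0.3, 0.5],
    "airline":  [0.1, 0.9, 0.2],
}

def cosine_similarity(a, b):
    """Return the cosine of the angle between two vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(word, vectors=VECTORS):
    """Return the other word whose vector points in the most similar direction."""
    target = vectors[word]
    scores = {
        other: cosine_similarity(target, vec)
        for other, vec in vectors.items()
        if other != word
    }
    return max(scores, key=scores.get)

print(most_similar("flyer"))  # leaflet
```

With these made-up numbers, "leaflet" sits closest to "flyer" while "airline" sits far away, which is the kind of relationship a search engine could use to handle a phrase it has not seen before.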

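It is important to note that RankBrain is better understood as an interpretation layer than as a conventional ranking factor that marketers can optimize for, even though the same Bloomberg article reported that Google counted it among its most important signals. It functions as the part of the algorithm that’s concerned with which URLs are best to deliver to the SERP for a given query. In simple terms, RankBrain uses machine learning to garner context for search keywords and to provide the best results when it isn’t sure what a query means.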

RankBrain is Google’s way of utilizing a new system of search engine NLP to better serve its users. The ultimate goal is to make sure that never-before-seen searches and unique long-tail search terms don’t come up with nothing.

When Google rolled out the BERT update in late 2019, its function was similar: it was not a ranking algorithm but a results algorithm. It didn’t replace RankBrain; it functioned alongside it. It was intended to give Google a better grasp of language by greatly expanding the technologies behind how to understand word context.

 

The Latest Steps in Search Engine NLP

The most recent addition to Google’s NLP search engine algorithm crown is the BERT jewel. BERT has taken the search giant’s use of AI to the next level with a search results algorithm that can deduce the meaning of each individual word in a body of text.

The update was based on the concept of “transformers,” models that process words in relation to all the other words in a sentence, rather than one-by-one in order.

By analyzing individual words in the body of a text in relation to every other word in the same body of text, the algorithm can gain a more complete picture of the text than by simply analyzing each word one-by-one.
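
As a rough illustration of relating every word to every other word, the Python sketch below computes simplified attention weights for a three-word query. The word vectors are made up, and the sketch compares the raw vectors directly, whereas real transformer models first pass them through learned projections. Treat it as a picture of the idea, not of Google’s implementation.

```python
import math

SENTENCE = ["frequent", "flyer", "programs"]

# Hypothetical word vectors, chosen by hand for the example.
TOY_VECTORS = [
    [1.0, 0.2, 0.0],  # frequent
    [0.6, 0.9, 0.1],  # flyer
    [0.8, 0.4, 0.3],  # programs
]

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_weights(vectors):
    """For each word, compute how much weight it gives to every word in the sentence."""
    dim = len(vectors[0])
    weights = []
    for query in vectors:
        scores = [
            sum(q * k for q, k in zip(query, key)) / math.sqrt(dim)
            for key in vectors
        ]
        weights.append(softmax(scores))
    return weights

def contextualize(vectors):
    """Rebuild each word vector as a weighted mix of all word vectors."""
    dim = len(vectors[0])
    return [
        [sum(w * vec[d] for w, vec in zip(row, vectors)) for d in range(dim)]
        for row in attention_weights(vectors)
    ]

for word, row in zip(SENTENCE, attention_weights(TOY_VECTORS)):
    print(word, [round(w, 2) for w in row])
```

Each printed row shows how strongly one word "looks at" every word in the query, including the words that come after it, which is what separates this approach from reading strictly left to right.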

Google’s use of transfer learning means that a model is first pretrained on a data-rich task and then fine-tuned on other tasks before being incorporated into search algorithms. With BERT, Google is able to train its model using vast amounts of text from the web. The key difference from other pretraining approaches like Semi-supervised Sequence Learning, Generative Pre-Training, ELMo, and ULMFiT is that BERT is deeply bidirectional. This means that the model goes beyond processing text from beginning to end: it uses the words on both sides of each word at the same time.

The model learns by masking words and using the other words in the text to “predict” the missing word. Instead of simply going one-by-one, in order, to predict the next word, it hides a portion of the words in a passage and uses the context on both sides to predict each hidden word.
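
The difference between predicting from the left only and predicting from both sides can be sketched with a toy example. The Python below is not BERT: it uses simple counts over a tiny invented corpus instead of a neural network, and the predict_masked function is a hypothetical helper. It only shows why the words after a gap can change the best guess.

```python
from collections import Counter

# A tiny, invented corpus standing in for "vast amounts of text".
CORPUS = [
    "frequent shopper programs offer discounts",
    "frequent shopper cards save money",
    "frequent flyer miles can expire",
    "custom flyer printing for events",
]

def predict_masked(tokens, index, use_right_context=True, corpus=CORPUS):
    """Guess the word at `index` from its neighbours, using counts from the corpus."""
    left = tokens[index - 1] if index > 0 else None
    right = tokens[index + 1] if index + 1 < len(tokens) else None
    counts = Counter()
    for sentence in corpus:
        words = sentence.split()
        for i, word in enumerate(words):
            prev_word = words[i - 1] if i > 0 else None
            next_word = words[i + 1] if i + 1 < len(words) else None
            if prev_word != left:
                continue  # left neighbour must always match
            if use_right_context and next_word != right:
                continue  # right neighbour must match in bidirectional mode
            counts[word] += 1
    return counts.most_common(1)[0][0] if counts else None

query = "frequent [MASK] miles".split()
print(predict_masked(query, 1, use_right_context=False))  # shopper
print(predict_masked(query, 1, use_right_context=True))   # flyer
```

Reading only the word before the gap, the most common continuation of "frequent" in this corpus is "shopper". Once the word after the gap is taken into account, "flyer" becomes the only candidate that fits.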

With this machine learning system, the BERT model becomes a tool for search engine NLP by taking what it learns during training and applying it to new, complex search queries entered into Google search. Like RankBrain, Google uses the BERT search NLP program just for returning results, not for rankings.

BERT is also able to work across multiple languages, meaning that NLP marketing in the future could involve a more globalized approach to search engines, with search results in Google extending beyond just the language of the searcher. Marketers that are able to construct their content for a global world of searchers may be able to see fine-tuned traffic trickle in from search terms that are more granular than ever before.

And now Google is moving even further into the world of NLP for search engines.

Google is building a “Text-to-Text Transfer Transformer,” or T5, a framework for processing language that is trained on a giant body of crawled text so that it learns to understand context. This body of text is called a “corpus,” and Google’s is named the “Colossal Clean Crawled Corpus.”

The basis of “text-to-text” here means that the input for the T5 language processing model is text, and the output is text as well. Google has trained the T5 model of NLP for answering questions directly, by only using its own pre-trained knowledge, and without referring to a text source. This means that in the future, search engine NLP could possibly help give Google searchers answers directly from Google itself, without having to direct searchers to websites from the search results.

This sort of natural language processing technology could also improve Google’s ability to return rich-snippets and knowledge graphs. Recently Google has hinted at the necessity of using neural networks to parse other sorts of data beyond text, specifically tables.

The BERT search engine NLP process could help Google handle number data or data stored in tables. Specifically, BERT’s strength is in 1) helping Google understand what the query is actually for and 2) in encoding what the table data consists of so that it knows what to look for.

Currently the T5 model is not a part of Google’s search algorithm. There’s also no indication to suggest that the BERT model is being used by Google to help supply data results to searchers. But these two recent developments in language processing suggest that Google is continuing to fine-tune its ability to give searchers better results.

For search-engine NLP Google is continuing to evolve the accuracy of its search results by giving searchers better answers to more complex data queries and more complex language-based questions. It also means that processes like BERT can help Google deliver results across languages, and thus across the globe.

 

How to Use NLP for Marketing

When Google’s VP of search Pandu Nayak announced the BERT update in a blog post in 2019, he talked about how it would affect users on the other end, and of course marketers.

Google has always been reticent about exactly how its search rankings work, meaning that it’s impossible for marketers and outsiders to ever know what future SEO will be like. For SEO marketers and content marketers this may mean having greater faith in Google to bring searchers to your site. It may mean SEO strategy that veers closer to content marketing, CRO, and UX optimization. The future of search optimization may rely less on keywords than ever before, and more on clear, concise, and well structured content that’s designed for humans.

One example from Nayak is the search phrase “2019 brazil traveler to usa need a visa.” Google’s search engine NLP would now add greater significance to the word “to” in this phrase, whereas before 2019 Google would ignore “stop words,” or prepositions like “for,” “to,” and “of.”

In this example “to” clearly implies a searcher traveling from Brazil to the US, and not the other way around.
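
A short Python sketch shows why dropping stop words loses the direction of travel. The stop word list and the two helper functions are assumptions made for this example, and the code is a simplified picture of keyword matching, not a description of how Google processes queries.

```python
# A small illustrative stop word list (an assumption, not Google's list).
STOP_WORDS = {"to", "for", "of", "a", "the"}

def keyword_view(query):
    """Drop stop words and ignore word order, as a simple keyword matcher might."""
    return frozenset(w for w in query.lower().split() if w not in STOP_WORDS)

def ordered_view(query):
    """Keep every word, in order, so that "to" still separates origin and destination."""
    return tuple(query.lower().split())

brazil_to_usa = "2019 brazil traveler to usa need a visa"
usa_to_brazil = "2019 usa traveler to brazil need a visa"

print(keyword_view(brazil_to_usa) == keyword_view(usa_to_brazil))  # True
print(ordered_view(brazil_to_usa) == ordered_view(usa_to_brazil))  # False
```

Seen as a bag of keywords, the two queries are identical even though they ask opposite questions. Only a view that keeps "to" and the word order can tell them apart.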


Because prepositions like this now play a role in search results, marketers will have to consider how their content’s phrasing can affect results. Traditional stop words and prepositions will now play a larger role in page meta title tags, H-tags, on-page titles, and other areas of the site.

Of course, if you are designing your site for humans (which you should be!) then most likely you won’t need to do anything differently. If your content is designed for accuracy and better UX, then you should be set up to use search engine NLP for marketing.

Google’s advice to SEOs about its intermittent core algorithm updates is always the same: seek to create good content for humans.

When the BERT search engine NLP model was rolled out, Google’s Danny Sullivan insisted that there was no way to optimize for it. He also made it clear that there was nothing for marketers to be “rethinking,” suggesting that traditional SEO best practices hadn’t really changed.

John Mueller, the Senior Webmaster Trends Analyst who is often the public voice for Google, says much the same thing: “The text on the page is something that you can influence. Our recommendation there is essentially to write naturally.”

He reiterates that for the purpose of search engine NLP modeling, BERT is only focused on better search results and is not designed to affect page rankings. For those wondering how to use NLP for marketing, the secret lies in earnest content with reader experience in mind.

“So, if anything, there’s anything that you can do to kind of optimize for BERT, it’s essentially to make sure that your pages have natural text on them…” – John Mueller

 

Best Practices on Using NLP for Marketing

With Google becoming an ever more NLP-based search engine, marketers may have to think less and less about keyword-driven strategies, and more about user-driven strategies.

Though keyword optimization, on-page SEO optimization, and natural backlink growth strategies are still important for SEO, things might be changing. Business owners and webmasters may have to consider NLP marketing based growth strategies that hinge more on UX and user-friendly content.

But as Google has said before, that’s not much different from how we approach NLP for marketing already. Best SEO practice is to steer content toward user intent and to create content that best meets user needs.

Websites should focus on adding EAT-friendly content for Google SEO, and they should focus on content that is not only accurate but helps drive site visitors toward their intended goal.

Marketers can also stick to best practices with H-tags, page formatting, site-structure, and content visibility to ensure that NLP based search engines are able to source data to SERPs effectively.

 

Using H-Tags

H-tags (like H1s, H2s, H3s, etc.) are not only a ranking factor for search engines (although their effect is minimal); they can also help search engines understand the structure of your on-page content and parse it.

It’s possible that proper use of H-tags can help your site appear in more rich-results snippets on Google, which can help with CTR. Search engines using NLP are able to match a query with the text placed in an H-tag, take the content that appears after that heading, and place this content on the SERP.

Though adding a bunch of semantic HTML to your site, like H-tags, isn’t recommended just for its own sake, using them properly can help NLP models from search engines better present data on your site.

They can also help users better find the information they are looking for and help them to understand the structuring of your on-page content. Use H-tags with listed items, questions (like FAQ pages) or with site content where it can be helpful to indicate a hierarchy of information.
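
One way to review a page’s heading hierarchy is to extract its outline. The Python sketch below uses the standard library’s html.parser for that; the HeadingOutline class, the skipped_levels helper, and the sample HTML are all hypothetical names and content made up for this example, and the "skipped level" check is just one reasonable convention, not a Google rule.

```python
from html.parser import HTMLParser

class HeadingOutline(HTMLParser):
    """Collect (level, text) pairs for every h1-h6 element in a page."""
    HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}

    def __init__(self):
        super().__init__()
        self.outline = []     # list of (level, heading text)
        self._current = None  # level of the heading currently being read
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in self.HEADING_TAGS:
            self._current = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._current is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag in self.HEADING_TAGS and self._current is not None:
            self.outline.append((self._current, "".join(self._buffer).strip()))
            self._current = None

def skipped_levels(outline):
    """Return headings that jump more than one level deeper than the previous heading."""
    problems = []
    previous = 0
    for level, text in outline:
        if level > previous + 1:
            problems.append((level, text))
        previous = level
    return problems

SAMPLE_HTML = """
<h1>Custom Flyer Printing</h1>
<h2>What paper sizes are available?</h2>
<p>We print A4, A5 and DL flyers.</p>
<h4>Bulk discounts</h4>
"""

parser = HeadingOutline()
parser.feed(SAMPLE_HTML)
print(parser.outline)
print(skipped_levels(parser.outline))  # [(4, 'Bulk discounts')]
```

In the sample, the question is phrased as an H2 and answered in the paragraph directly beneath it, while the jump from H2 to H4 is flagged as a possible break in the hierarchy.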

 

Make Sure Your Content is Visible

Making sure that your site’s content is visible to search engines, and that it can be indexed, is one of the most basic first steps in SEO. For sites concerned about search engine NLP marketing, your content will need to be available to Googlebot if it’s going to be displayed to searchers.

Make sure that JavaScript or Flash content on your site isn’t hiding important content. Googlebot has become much more sophisticated in rendering JavaScript content – which means that although JavaScript used to be a big problem, it’s now rarely an issue. But if you have JavaScript that contains links, content, or nav-bar elements that are hidden to Google it can hurt your rankings in the search index and prevent natural language based search engine algorithms from properly understanding your content.
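
A quick way to spot content that exists only after JavaScript runs is to check whether key phrases appear in the raw, unrendered HTML. The Python sketch below does this with the standard library; the VisibleText class, the missing_phrases helper, and the sample page are hypothetical and made up for illustration. Because Googlebot can render JavaScript, a missing phrase here is a prompt to investigate, not proof that Google cannot see the content.

```python
from html.parser import HTMLParser

class VisibleText(HTMLParser):
    """Collect text from raw HTML, ignoring anything inside script or style tags."""
    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())

def missing_phrases(raw_html, phrases):
    """Return the phrases that do not appear in the page's unrendered text."""
    parser = VisibleText()
    parser.feed(raw_html)
    text = " ".join(parser.chunks).lower()
    return [p for p in phrases if p.lower() not in text]

RAW_HTML = """
<html><body>
<h1>Custom Flyer Printing</h1>
<div id="pricing"></div>
<script>document.getElementById("pricing").innerHTML = "Bulk discounts from 500 flyers";</script>
</body></html>
"""

print(missing_phrases(RAW_HTML, ["Custom Flyer Printing", "Bulk discounts"]))
# ['Bulk discounts']
```

Here the heading is present in the raw HTML, but the pricing text is only injected by a script, so it is reported as missing from the unrendered page.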

 

Focus on EAT for Humans

With an emphasis on user focused content, modern SEO and NLP marketing will mean paying attention to best practices already outlined by Google.

Google’s NLP focused algorithm updates have all emphasized good content as the only way to protect against ranking losses. Their advice is always the same: sites hurt by algorithm updates aren’t necessarily doing anything wrong, but best practice is to strive to create the best possible experience.

For marketing on sites that offer EAT content you’ll want to remember what it stands for: Expertise, Authoritativeness, Trustworthiness. This comes from Google’s Search Quality Evaluator Guidelines, which define what EAT content is and how it places a focus on the authority and quality of a site’s content.

Strive to demonstrate to site visitors your site’s authority in your industry with content that is well researched, provides information that meets visitor expectations, and is inherently trustworthy.

Content on your site should demonstrate the following, where MC is the guidelines’ shorthand for a page’s main content:

  • The expertise of the creator of the MC.
  • The authoritativeness of the creator of the MC, the MC itself, and the website.
  • The trustworthiness of the creator of the MC, the MC itself and the website.

These guidelines emphasize the authority and expertise of the content itself. If your content is detailed enough and designed to meet your target audience’s needs and answer their questions, then it will be better suited to appear in search results on Google. Many marketers will do well to ensure useful contact pages with up-to-date contact info, along with informative about pages that demonstrate the expertise of the business.

 

Learn More About Search Engine NLP and Marketing

Contact us for more information on search optimization services and how to improve your site for the future of SEO. Get a free website consultation and learn about how Radd can help you with full service SEM.