Unwrapping Adthena’s Machine Learning (and how it powers the Whole Market View)

Posted by Pat Hong


As a new starter at Adthena, one of the things I’ve had to get my head around is the machine learning and data which power the search product. If you are discovering Adthena for the first time, you may share some of the questions that I’ve been tackling…

For starters:

What exactly is meant by “machine learned data”, and how is it better than regular search data?

To further complicate matters, many machine learning projects you may hear about cite not only machine learning, but also artificial intelligence, neural networks, supervised and unsupervised learning, and so on. Without further clarification, these can perhaps be glossed over as buzzwords.

Real-world applications of machine learning are incredibly varied. Google’s PlaNet neural network, for example, is able to predict where a photo was taken significantly more accurately than humans can, by analysing just pixel data. The Elon Musk-backed venture OpenAI, on the other hand, is training computers to outperform the world’s best competitive gamers by playing machines against themselves for the equivalent of several lifetimes of experience.

Adthena’s machine learning is similarly unique (and patented), and is concerned primarily with the processing of search advertising data. This post will explore some of the fundamentals of Adthena’s machine learning, and will unwrap (as much as possible) the years of development and data science that bring us to where we are today.

What is Adthena’s machine learning?

Adthena’s machine learning comprises four main stages, which process, interpret, and analyse the big data of the world wide web. Its purpose is to turn the massive amount of unstructured data that is present, and constantly being generated, on the world wide web into a structured data model that can be useful to advertisers.

Why is this useful? A structured data model enables the interrogation, research, and manipulation of data, and ultimately it enables the powerful grouping, segmentation, and filtering that is crucial to the management of successful ad campaigns.

Everything starts with the domain…

The starting point for all Adthena’s data and insight lies in the information present in any website or domain. That’s a concept that is very prevalent in the platform (where you can simply plug in a domain and let the platform immediately populate dashboards and identify search opportunities).

Adthena’s algorithms crawl domain data (which includes all the human-readable text as well as other useful information contained in page-level metadata), and apply natural language processing, which brings focus to the most relevant search terms for each domain.
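To make the idea of pulling search terms out of page text more concrete, here is a minimal sketch. It is not Adthena’s crawler or NLP: it simply counts two-word phrases in a piece of text and ignores phrases containing common filler words. The function name `candidate_terms`, the stopword list, and the sample page text are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical stopword list; a real pipeline would use a far larger one.
STOPWORDS = {"the", "a", "an", "and", "or", "for", "of", "to", "in", "our", "with", "on"}

def candidate_terms(page_text, top_n=3):
    """Return the most frequent two-word phrases that contain no stopwords."""
    words = re.findall(r"[a-z0-9]+", page_text.lower())
    # Simplification: phrases are allowed to span sentence boundaries.
    bigrams = [
        f"{first} {second}"
        for first, second in zip(words, words[1:])
        if first not in STOPWORDS and second not in STOPWORDS
    ]
    return [term for term, _ in Counter(bigrams).most_common(top_n)]

# Invented page text standing in for the crawled content of a domain.
page = (
    "Buy cricket nets for your club. Our cricket nets and cricket bats "
    "ship free. Cricket nets in stock."
)
print(candidate_terms(page))
```

Frequency counting is only a stand-in here; it shows the shape of the step (text in, ranked candidate search terms out) rather than how relevance is really judged.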

To further refine this data, in order to extract the more valuable insights, a number of processes are applied which form the foundation of Adthena’s machine learning.

Supervised and unsupervised learning

The Adthena technical documentation has these notes on which elements make up Adthena’s machine learning, which fall under these two wider disciplines:

Unsupervised Learning: dense vector embeddings of search terms. This essentially allows similar words to be ‘close’ to each other, e.g. ‘king’ and ‘queen’.

  • NLP, Neural Networks (1): search term categorisation. This supervised learning algorithm maps search terms to one of up to 8000* categories, e.g: “cricket nets” = Sports & Fitness → Sporting Goods → Cricket Equipment
  • NLP, Neural Networks (2): finding Whole Market search terms that are meaningful to clients.

Supervised Learning: using search term categorisation within the regression model to predict CTR/CPC from the indexed data.

Online Learning: we use online learning to handle large amounts of indexed and search term categorisation data.

Let’s expand on these points:

“Dense vector embeddings of search terms”

Vector embedding enhances search terms (such as the ones crawled from relevant domains), and adds additional dimensions to enrich the keyword dataset. Think of this as supplementing each search term with a score for its proximity to every other search term.

For example, consider a hypothetical keyword list containing only five words: “King, Queen, Man, Woman, and Princess”.

In normal usage, the word ‘king’ will naturally attain a high score for its relationship with the word ‘man’. Further, ‘king’ will also score highly with ‘queen’ (and vice versa), and will score slightly lower for its relationship with the keyword ‘princess’. You end up with something akin to the model below:

[Figure: word2vec illustration of the relationships between these words. Source linked in the original post.]

When you apply that same concept to ALL the linguistic data we have collected from domain-level crawls, it creates a foundation for a data model which can infer a much greater level of insight than keywords alone. Each search term has now been supercharged with thousands of additional dimensions defining relationships with other search terms.

And with multiple domains themselves containing thousands of keywords, that is, effectively, enough to form an incomprehensibly huge matrix of relationship quality between various search terms.

This is where the machine learning becomes really valuable as it enables the processing of these huge datasets.
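For readers who like to see the numbers, here is a small sketch of what ‘closeness’ between embedded terms can mean. The three-dimensional vectors are invented for this example (real embeddings are learned from data and have far more dimensions), and cosine similarity is one common way to score proximity, not necessarily the measure Adthena uses.

```python
import math

# Toy vectors, invented for illustration only.
# Hypothetical axes: (royalty, masculinity, youth).
EMBEDDINGS = {
    "king":     (0.9, 0.9, 0.2),
    "queen":    (0.9, 0.1, 0.2),
    "man":      (0.1, 0.9, 0.3),
    "woman":    (0.1, 0.1, 0.3),
    "princess": (0.8, 0.1, 0.8),
}

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def neighbours(term):
    """Rank every other term by similarity to `term`, closest first."""
    scores = {
        other: cosine_similarity(EMBEDDINGS[term], vector)
        for other, vector in EMBEDDINGS.items()
        if other != term
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

for other, score in neighbours("king"):
    print(f"king vs {other}: {score:.2f}")
```

With these made-up vectors, ‘king’ sits close to both ‘queen’ and ‘man’, a little further from ‘princess’, and furthest from ‘woman’, which mirrors the example above.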

“Natural Language Processing (NLP), Neural Networks (1)”

Search term vectors are a powerful potential resource, but further processing is needed to extract value. Potentially, there are significant insights into search behaviour, search intent, and valuable competitive search intelligence — but we’re not able to unlock any of that yet without further processing.

To make the data more useful for search marketers, Adthena uses supervised machine-learning algorithms to map relevant search terms to Google’s (recently expanded) hierarchy of eight thousand AdWords categories. As the documentation states, the algorithm takes a search term, for example “cricket nets”, and pairs it with the category: Sports & Fitness → Sporting Goods → Cricket Equipment.

We call this a neural network because it constantly adapts, and is able to process a constant input of data, whilst continuing to improve and learn as the dataset grows (and additional domains within a market are indexed). This ensures the search term categorisation is always up to date and relevant.
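As a rough illustration of supervised categorisation, the sketch below learns from a handful of labelled search terms and then assigns a category to a new one. A simple word-vote classifier stands in for the neural network, so this shows the idea of learning from labelled examples rather than Adthena’s actual model. Only the “cricket nets” pairing comes from the documentation; the other examples, the insurance category label, and the function names are hypothetical.

```python
from collections import Counter, defaultdict

CRICKET = "Sports & Fitness → Sporting Goods → Cricket Equipment"
INSURANCE = "Finance → Insurance → Vehicle Insurance"  # hypothetical label

# Labelled examples: (search term, category).
TRAINING = [
    ("cricket nets", CRICKET),
    ("cricket bat grips", CRICKET),
    ("car insurance quote", INSURANCE),
    ("cheap van insurance", INSURANCE),
]

def train(examples):
    """Count how often each word appears under each category."""
    token_counts = defaultdict(Counter)
    for term, category in examples:
        for token in term.lower().split():
            token_counts[token][category] += 1
    return token_counts

def categorise(model, term):
    """Let each known word vote for its categories; None if no word is known."""
    votes = Counter()
    for token in term.lower().split():
        votes.update(model.get(token, {}))
    return votes.most_common(1)[0][0] if votes else None

model = train(TRAINING)
print(categorise(model, "junior cricket nets"))
```

The ‘supervised’ part is the labelled training list: the algorithm is shown the right answers for some terms and generalises to terms it has not seen.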

“Natural Language Processing (NLP), Neural Networks (2)”

There is an additional layer which identifies the search terms that are most useful to search marketers. This part of the process refines the data model by identifying valuable search terms in key search verticals and niches. Essentially, this zeroes in on the search terms and keywords that are most useful for ad campaigns. (This is incredibly sophisticated and ML driven, but to provide a working context, the algorithm may, for example, identify search terms containing modifiers that indicate a level of purchase intent, e.g. “buy iPhone 8” or “Gibson Les Paul Standard review”.)

It will also identify all the search terms that are relevant to key search advertising verticals and niches, such as finance, insurance, technology, fashion, retail, or travel.
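The purchase-intent example above can be sketched in a few lines. To be clear, the real process is ML driven; this is a deliberately simple rule-based stand-in, and the modifier list and function name are assumptions made up for the example.

```python
# Hypothetical list of modifiers that hint at purchase or research intent.
INTENT_MODIFIERS = {"buy", "review", "price", "cheap", "deal"}

def intent_modifiers(search_term):
    """Return the intent modifiers found in a search term, in order."""
    tokens = search_term.lower().split()
    return [token for token in tokens if token in INTENT_MODIFIERS]

for term in ["buy iPhone 8", "Gibson Les Paul Standard review", "iPhone 8"]:
    print(term, "->", intent_modifiers(term))
```

A fixed list like this only catches modifiers someone has thought of in advance, which is exactly the limitation a learned model is meant to overcome.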

The data model continuously introduces data returned from the SERP for all valuable search terms (i.e. the ads themselves). These results encompass both paid and organic search, and are further segmented by device (mobile/desktop) and geography, to provide an accurate reflection of the ads users are served when making search queries.

“Using the regression model to predict CTR/CPC from the indexed data”

To obtain CTR/CPC data (Google do not make this data public), we use a regression model to make a detailed estimate of the CTR and CPC of individual search terms.

A regression model learns from a dataset of search terms with known CPC and CTR values, and then uses what it has learned to estimate the unknown values for similar search terms. Because we have a detailed data model of relationship quality between ‘close’ keywords, modifiers, and search terms, we can estimate CPC and CTR values to a high level of accuracy.
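Here is a minimal sketch of that idea: a linear regression that learns CPC from a few search terms with known values, then estimates CPC for a combination it has not seen. Each term is described by its category and any modifier, echoing the use of search term categorisation within the regression model. All figures, feature names, and function names are invented, and Adthena’s real model is certainly richer than this. The weights are updated one example at a time, which is also the basic mechanism behind online learning.

```python
from collections import defaultdict

# Invented training data: (features of a search term, known CPC).
KNOWN = [
    ({"cat:insurance"}, 4.0),
    ({"cat:insurance", "mod:cheap"}, 3.0),
    ({"cat:cricket"}, 0.5),
    ({"cat:cricket", "mod:buy"}, 0.8),
]

def predict(weights, features):
    """Linear model: a bias plus one weight per active feature."""
    return weights["bias"] + sum(weights[f] for f in features)

def fit(examples, learning_rate=0.1, epochs=500):
    """Fit the weights by stochastic gradient descent on squared error."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for features, target in examples:
            error = predict(weights, features) - target
            # Nudge the bias and every active feature against the error.
            weights["bias"] -= learning_rate * error
            for f in features:
                weights[f] -= learning_rate * error
    return weights

weights = fit(KNOWN)
estimate = predict(weights, {"cat:insurance", "mod:buy"})
print(f"Estimated CPC for an unseen insurance term: {estimate:.2f}")
```

The unseen term borrows what the model learned about the insurance category and the ‘buy’ modifier separately, which is the sense in which known values inform estimates for similar terms.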

But how is machine learned data better?

Adthena’s machine learning takes unique, proprietary domain level data, and uses machine learning to further refine and extract value from it. Other competitive intelligence solutions skip this step — typically they index SERP results which puts the focus solely on known quantities.

This is why competitor products often depend on seed lists of keywords. The problem is that this kind of data ultimately offers less competitive intelligence, as advertisers will only gain insights relating to their known keyword universe.

Adthena’s machine learning ensures the scope of data is much wider, encapsulating industry-wide search terms for both paid and organic search, and isn’t just a snapshot of the SERP. It forms a detailed, complete picture of the entire competitive search landscape and offers the most advanced competitive intelligence available for search marketers. This is what we call the Whole Market View.

As a final note, as more and more domains are indexed, the data model is continually refined, improving CPC and CTR estimates.

Hope you enjoyed this post! And if (like me) you would like to learn more about Adthena, our data or our competitive intelligence platform, contact us.

About the author

Pat Hong
Pat is the Digital Content Strategist at Adthena. He works on Adthena's content projects, covering adtech news, trends, and insights. He studied Film and Television at the University of West London.