Leverage natural language processing to unpack user intent on a whole new level
Nik says: “In 2024, we have a remarkable, uncharted ability to unpack user intent at a level we have never been able to see before. My tip is to prioritise natural language processing so that you can do just that.
With the evolution of search engines and their algorithms, along with the increasing sophistication of AI-driven models, we’re seeing SGE placing even greater emphasis on understanding the context and intent behind user queries. We can reverse engineer these mechanisms to create higher-quality, more valuable content, understand our pages far better, and create internal linking methodologies that outstrip anything that we’ve been able to do previously.”
In a post-SGE AI-influenced world, is SEO changing forever?
“I don’t think it’s completely changing forever; it’s our approach that needs to change. With the rise of AI tools at our disposal, we can leverage this to empower ourselves. We can delve into new depths with code and start to answer questions that were previously difficult to answer.
For example, if natural language processing helps search engines understand, interpret, and generate language, and handle tasks like machine translation, sentiment analysis, and entity recognition, we can leverage that to perform a suite of new tasks. With GPT-4 in ChatGPT and its Code Interpreter, in particular, we can give custom instructions. We now have access to a wealth of information that we’ve never had before. Using things like Hugging Face, where Google themselves have published models, we have access to even greater semantic understanding.
We can look at different frameworks, look at text embeddings, and capture more of the semantic content of web pages. We now have a whole new level of access and capability.”
How do you leverage natural language processing techniques for information retrieval?
“An SEO can leverage this by using a suite of different natural language processing libraries – libraries like spaCy, TensorFlow, and Hugging Face. We can utilise Bard to create amazing visualisations. We can carry out text pre-processing; we can crawl a whole piece of content, prepare the text of those web pages for processing, and strip it down to text embeddings.
We can create different mathematical representations for this text, at either the page level, the sentence level, or even the query level. We can strip them down, tokenize these things, remove any stop words, and convert these so that we can use this now-clean data to do many different things.
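A minimal sketch of what this pre-processing might look like, using only Python's standard library. The stop-word list, function names, and sample text are illustrative assumptions, not taken from the interview; in practice a library such as spaCy would handle tokenization and stop words far more thoroughly, and an embedding model would replace the simple word counts shown here.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; NLP libraries ship fuller ones).
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "for", "is", "on", "with"}

def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def clean_tokens(text):
    """Tokenize the text and drop stop words."""
    return [t for t in tokenize(text) if t not in STOP_WORDS]

def bag_of_words(text):
    """Return a term -> count mapping, a very simple numeric representation."""
    return Counter(clean_tokens(text))

page_text = "The best running shoes for trail running in 2024"
print(clean_tokens(page_text))
print(bag_of_words(page_text).most_common(1))
```

The same functions work on a whole page, a single sentence, or a search query, which is what makes the resulting representations comparable with one another.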
We can use this to unpack the actual intent of the page. If we have a whole corpus of text, we can see what that actually looks like for a search engine and how it would be perceived. We can extract things like entities and attach semantic meanings to those. We can utilise different pages and map them to each other to create nice internal links – or even 301 redirects to map an old page to a new page. We are now able to do all of this and more by calculating things like similarity matrices or by looking at how different pages relate to one another.
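One way the page-to-page mapping could be sketched is with TF-IDF weights and cosine similarity, here written with the standard library only. This is an illustration under stated assumptions rather than the method used at Dejan: the URLs and page text are invented, `best_matches` is a hypothetical helper, and old and new URLs are assumed to be distinct. A production version would more likely use text embeddings from a trained model.

```python
import math
import re
from collections import Counter

def tokens(text):
    return re.findall(r"[a-z0-9]+", text.lower())

def tfidf_vectors(docs):
    """docs: {url: text}. Returns {url: {term: tf-idf weight}}."""
    counts = {url: Counter(tokens(text)) for url, text in docs.items()}
    n = len(docs)
    doc_freq = Counter(term for c in counts.values() for term in c)
    # Smoothed IDF so that terms appearing on every page keep a small positive weight.
    idf = {term: math.log((1 + n) / (1 + df)) + 1 for term, df in doc_freq.items()}
    return {url: {t: tf * idf[t] for t, tf in c.items()} for url, c in counts.items()}

def cosine(a, b):
    """Cosine similarity between two sparse term -> weight vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def best_matches(old_pages, new_pages):
    """For each old URL, pick the most similar new URL (a redirect candidate)."""
    vecs = tfidf_vectors({**old_pages, **new_pages})
    matches = {}
    for old in old_pages:
        score, new = max((cosine(vecs[old], vecs[new]), new) for new in new_pages)
        matches[old] = (new, round(score, 3))
    return matches

# Invented example pages.
old_pages = {
    "/old/trail-shoes": "trail running shoes for muddy trails",
    "/old/marathon-tips": "marathon training tips and pacing plans",
}
new_pages = {
    "/shoes/trail": "best trail running shoes and muddy trail grip",
    "/training/marathon": "marathon training plans pacing and race tips",
}
print(best_matches(old_pages, new_pages))
```

The scores are candidates for human review, not decisions: a low best score suggests that no new page is a good redirect target for that old URL.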
This is something that I’ve only ever dreamed of being able to do. In the past, I would have needed to rely on people smarter than me in computer science and data science but now, because of AI, we can use these types of libraries, write Python, process text, and unpack all of this at a rate that is just unprecedented.
Get in there. Try and ask some really intuitive questions. You will be able to answer questions in a way that was never possible before.”
How do you use this process to score what you’re currently doing and compare it to your competition? What software do you use?
“TensorFlow is pretty awesome. TensorFlow, along with a whole variety of tools for visualisation tasks, allows you to map things out using data from the Search Console API. We can take the Search Console API, pull all of the queries, look at the JSON response, and unlock that to find similarities within the data. This allows us to see the deviations and correlations in the data (maybe over a period of time; based on last year, or based on this year), and now we can start to map whether or not there are any strong correlations or any patterns.
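As a rough illustration of the year-on-year comparison, the sketch below computes a Pearson correlation between two weekly click series for the same query cluster. The figures are invented, and the series are assumed to have already been exported from Search Console and aligned week by week.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("series must have the same length, with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat series has no meaningful correlation
    return cov / math.sqrt(var_x * var_y)

# Invented weekly clicks for one query cluster, same calendar weeks in each year.
clicks_last_year = [120, 130, 128, 300, 520, 210]
clicks_this_year = [140, 150, 149, 340, 600, 250]
print(round(pearson(clicks_last_year, clicks_this_year), 3))
```

A value close to 1 suggests the cluster follows a similar seasonal shape in both years; it says nothing about why, so it is a prompt for investigation rather than proof of a pattern.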
Let’s talk about an event as an example. Maybe an event from last year was starting to generate some interest and we’ve seen more impressions, more clicks, go towards this cluster of queries. We are now able to detect that and predict, based on what we’ve seen from the historical data, that we might see the same kind of patterns being generated around the same time this year. This is the level of detail we can pull out. It is really easy to do as well, by building deviations from traffic baselines and identifying strong correlations from Search Console data. We can ask and build custom Python scripts that will do this for us.
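A minimal sketch of "deviations from traffic baselines", assuming daily click counts have already been pulled from Search Console. The trailing seven-day window, the z-score threshold of 3, and the function name `flag_deviations` are illustrative choices, not values from the interview.

```python
from statistics import mean, stdev

def flag_deviations(daily_clicks, baseline_days=7, threshold=3.0):
    """Return (index, z-score) for days that deviate strongly from the trailing baseline."""
    flagged = []
    for i in range(baseline_days, len(daily_clicks)):
        window = daily_clicks[i - baseline_days:i]
        mu, sigma = mean(window), stdev(window)
        if sigma == 0:
            continue  # a perfectly flat baseline gives no scale to measure against
        z = (daily_clicks[i] - mu) / sigma
        if abs(z) >= threshold:
            flagged.append((i, round(z, 1)))
    return flagged

# Invented daily clicks for one query cluster, with a spike on day index 9.
clicks = [100, 104, 98, 101, 99, 103, 100, 102, 97, 250, 101]
print(flag_deviations(clicks))
```

Comparing the flagged dates with the same dates a year earlier is one simple way to check whether a spike is a recurring event or a one-off.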
If you haven’t done this before, this is something that’s now been made a lot easier and a lot more accessible. That’s great because a lot more is going to come from this wider accessibility.”
How does this practically impact your SEO strategy?
“By having a really good grasp of the core concepts, you can now perform tasks at a rate that is a lot faster, and hopefully a lot more intuitive. Now, you’re able to graphically show a visualisation of what you’re trying to express to a client or a team, and you are able to build certain models to help extrapolate what you’re trying to say.
That makes it so much easier to scale your work. It doesn’t necessarily take away from the core aspect of what you do on a day-to-day basis as an SEO, but the time it takes and the amount of data that you can merge and create, utilising the same processes that a search engine would, is something that is completely unprecedented.”
Should every SEO be researching this and finding out more about it, or is it only a certain type of SEO that can and should be doing this?
“I think everyone can, which is really amazing. Electronic music completely revolutionised the music world and levelled the playing field. It allowed a lot more people to get in there and start creating music without a formal musical background and training. Now we’re seeing this in the tech world. More people are able to create custom scripts, create ways to talk to and integrate with different APIs, and build models that replicate, showcase, and transform the way that search engines work.
Say, for example, I want to use the Knowledge Graph API to pull in a whole bunch of information and attributes about specific entities. I can do that, and I am able to link in my Google Search Console data and get a deeper level of extraction – just based on pulling two APIs together to build something unique. This is something that we can do quite easily now. There is now a platform for putting these methodologies out there and it has given people the opportunity to go and explore.
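The sketch below shows one possible shape for joining the two data sources. It assumes the Knowledge Graph Search API endpoint and response fields (`itemListElement`, `result`, `@type`) and the Search Console row shape (`keys`, `clicks`) as I understand the public documentation; these should be checked against the current API references before use. The helper names and sample data are hypothetical, and no network call is made here.

```python
from urllib.parse import urlencode

KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"

def kg_search_url(query, api_key, limit=1):
    """Build a Knowledge Graph Search API request URL for one query."""
    return KG_ENDPOINT + "?" + urlencode({"query": query, "key": api_key, "limit": limit})

def top_entity(kg_response):
    """Pull the first entity's name and types out of a parsed JSON response."""
    items = kg_response.get("itemListElement", [])
    if not items:
        return None
    result = items[0].get("result", {})
    return {"name": result.get("name"), "types": result.get("@type", [])}

def enrich_queries(gsc_rows, kg_responses):
    """Attach the top entity (if any) to each Search Console query row."""
    enriched = []
    for row in gsc_rows:
        query = row["keys"][0]
        entity = top_entity(kg_responses.get(query, {}))
        enriched.append({"query": query, "clicks": row["clicks"], "entity": entity})
    return enriched

# Hypothetical sample data standing in for real, already-fetched API responses.
gsc_rows = [{"keys": ["taylor swift tour"], "clicks": 42, "impressions": 900}]
kg_responses = {
    "taylor swift tour": {
        "itemListElement": [
            {"result": {"name": "Taylor Swift", "@type": ["Person", "Thing"]}, "resultScore": 1200.5}
        ]
    }
}
print(enrich_queries(gsc_rows, kg_responses))
```

Keeping the fetching separate from the joining makes it easy to cache responses and re-run the analysis without spending API quota.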
My biggest tip is to go play. In previous years, I’ve said, ‘Let’s look at crawlability before we even worry about whether it’s going to rank or not. If it can’t crawl, if it can’t index, and it can’t render, then it’s probably not going to rank and it’s probably not going to be effective.’ That was my recommendation last year. We have learned so many great things from the past – from what I and other people have been able to recommend – but this year is the year for play. That’s a really empowering way that we can look at all the things that are now available to us.
Instead of saying, ‘Maybe see what comes out of this and be a passenger’, I say, ‘Get in the driver’s seat.’ Let’s start exploring our options. Let’s test a few things and build a few things. It has given us a wide-reaching array of opportunities. Starting with natural language processing, at the very core of how search intent works, is really bread-and-butter stuff. The way for an SEO to begin to learn about what they need to do to make a page effective is by understanding user intent.
Now, we can utilise a whole variety of different libraries that help do that work for us and give us really great clues to follow. We can combine these clues with our own lateral thinking, and say, ‘This is what it’s presenting to me. Does this match what I think this page should be about?’ If it’s a ‘no’, then maybe you need to tweak things. If it’s a ‘yes’, how can you build something from this understanding of what the page represents?
You can now look at not just one or two, but maybe 2,000, 20,000, or 200,000 pages. You can start to build it into a model and do things like pull out specific anchor text from the page and build internal links that link up these pages. This is something that we are now able to do, and I think that’s just awesome.”
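One simple way to surface anchor-text opportunities at scale is to search each page's body text for phrases that describe a target page. The sketch below assumes the page text and target phrases are already available (for example, from a crawl and from keyword research); the URLs, phrases, and the function name `suggest_internal_links` are invented for illustration.

```python
import re

def suggest_internal_links(pages, targets):
    """
    pages:   {url: body text}
    targets: {url: [phrases that describe that page]}
    Returns a list of (source_url, anchor_text, target_url) suggestions.
    """
    suggestions = []
    for target_url, phrases in targets.items():
        for phrase in phrases:
            pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
            for source_url, text in pages.items():
                if source_url == target_url:
                    continue  # a page should not link to itself
                match = pattern.search(text)
                if match:
                    # Keep the text as it appears on the page, so it can be used as the anchor.
                    suggestions.append((source_url, match.group(0), target_url))
    return suggestions

# Invented example pages.
pages = {
    "/blog/trail-guide": "Our guide covers Trail Running Shoes and hydration packs.",
    "/shoes/trail": "Shop trail running shoes.",
}
targets = {"/shoes/trail": ["trail running shoes"]}
print(suggest_internal_links(pages, targets))
```

Exact phrase matching is deliberately conservative; a similarity-based approach would find looser matches but needs more review before links are added.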
Where should an SEO go and play, to begin with, and what are the first few steps of the game?
“I would say that the first step of the game is looking at a particular task that you want to do. Do you want to unpack the sentiment? Do you want to take a whole corpus of text and truly understand it?
For one of my first tasks, which I really loved, I built a crawler and tried to mimic the way that Screaming Frog (one of my favourite tools) works. Hats off to Screaming Frog. It is actually incredible what they do and how they’ve been able to build this tool. There’s a reason why it’s so awesome and it’s really, really hard to replicate.
I wanted to be able to crawl a whole bunch of pages, extract as much value out of that as possible, and start to build a way that I could use this as an array. I wanted to be able to say, ‘How are we pulling all of this for all of these pages?’ I wanted to break down each of these pages into their own mathematical representations and see what that mapping looks like.”
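As a starting point for a crawler like the one described, the sketch below extracts the visible text and outgoing links from a single HTML document using Python's built-in `html.parser`. It is not how Screaming Frog works and leaves out fetching, robots.txt handling, rendering, and queueing; the class name, helper name, and sample HTML are invented for illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class PageExtractor(HTMLParser):
    """Collect absolute link URLs and visible text from one HTML document."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []
        self.text_parts = []
        self._skip_depth = 0  # greater than 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(urljoin(self.base_url, href))

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.text_parts.append(data.strip())

def extract(html, base_url):
    """Parse one page and return its URL, outgoing links, and visible text."""
    parser = PageExtractor(base_url)
    parser.feed(html)
    parser.close()
    return {"url": base_url, "links": parser.links, "text": " ".join(parser.text_parts)}

# Invented sample page.
html = (
    "<html><head><title>Trail shoes</title><style>p{color:red}</style></head>"
    '<body><p>Grippy shoes for mud.</p><a href="/shoes/road">Road shoes</a></body></html>'
)
print(extract(html, "https://example.com/shoes/trail"))
```

Running `extract` over every fetched page yields an array of records whose `text` field can then be tokenized or embedded, and whose `links` field feeds the next round of the crawl.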
If an SEO is struggling for time, what should they stop doing right now so they can spend more time doing what you suggest in 2024?
“Stop getting so distracted with the number of changes that are happening. Try to go back to more fundamental work and understand more about how a search engine works, how things are built, why you would crawl a page, why pages would be picked up, and why some pages would be chosen over others.
Go back, understand the core aspects of SEO, and ask ‘Why?’ Asking yourself ‘Why?’ is such a valuable question because if you don’t understand these things then you can get off the track really quickly.”
Nik Ranger is Senior SEO Consultant at Dejan, and you can find her over at DejanMarketing.com.