Use data wisely to improve your processes and prevent future damage
Marco says: “Use SEO data to improve processes and prevent future damage.”
What SEO data are we talking about here?
“Mostly Search Console data, analytics, and crawl data from a crawler like Screaming Frog. If possible, you should be integrating non-SEO data as well.
Non-SEO data is any data that is not purely SEO: customer data, sales data, etc. It could be content information, like the author of a given article, which can still provide a lot of value for SEO.”
Do you bring this into data management software, like BigQuery, alongside your Google Search Console data?
“Yes, but it mostly depends on the project I'm dealing with.
It's a great idea to have the data you care about or want to preserve in a storage system like BigQuery, where you can join the tables and combine this data. It's not always possible, but it's something I recommend if you have the opportunity. BigQuery is not expensive in most cases, especially if you're a big company.”
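As an illustration of the kind of join Marco describes, here is a minimal Python sketch using the google-cloud-bigquery client. The project, dataset, table, and column names are hypothetical placeholders, not a standard schema:

```python
# A minimal sketch: joining Search Console data with sales data in
# BigQuery. All project/dataset/table/column names are hypothetical.
from google.cloud import bigquery

client = bigquery.Client(project="my-project")  # placeholder project ID

query = """
SELECT
  gsc.url,
  SUM(gsc.clicks) AS clicks,
  SUM(sales.revenue) AS revenue
FROM `my-project.seo.gsc_page_data` AS gsc
JOIN `my-project.business.sales_by_url` AS sales
  ON gsc.url = sales.url
GROUP BY gsc.url
ORDER BY revenue DESC
"""

# Run the query and pull the combined SEO + non-SEO view into pandas.
df = client.query(query).to_dataframe()
print(df.head())
```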
What is your process for acquiring data?
“My process is more of a framework for analytics as a whole, and it's based on standard practices. It's not completely new. It's the same idea that you would find in any analytics book; I just adapted it to SEO.
First, you have to gather the requirements: what the client wants or what the project is about. The idea is to use data for SEO but it's not only SEO. It's also about how to use data in general, so you can apply the same reasoning in other industries. Once you get the requirements (what they want, what the best metrics are, what the goals are, what their intentions are, their capacity, etc.), you can move on to gathering your data. Try to find the proper data to analyse the problem.
Then, you clean the data. Remove what you don't care about, what creates noise, and what is not relevant to your analysis.”
How do you know what you shouldn't care about?
“In most cases, you know from experience. If you have a WordPress website, tags, categories, or filler pages like author pages are not relevant for a content audit. If you want to know which pages you have to prune, it will never be those pages, even if they don't get organic traffic. It wouldn't make any sense, so you can ignore them. If the goal of your analysis is to find the best articles or the worst articles, you wouldn't care about those pages because they are outside of your scope.
In other cases, it's just noise. If you have archive pages, you don't need them in your analysis, so you can just filter them out. The same applies to a lot of boilerplate pages, depending on the CMS you're using and your use case.”
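This cleaning step is only a few lines of pandas. A hedged sketch follows; the file name and URL patterns are illustrative, and the right noise filters depend on your CMS and use case:

```python
# A minimal sketch of the cleaning step, assuming a page-level
# Search Console export with 'url' and 'clicks' columns.
import pandas as pd

df = pd.read_csv("gsc_pages.csv")  # hypothetical export

# Typical WordPress boilerplate: tag, category, author, and archive pages.
noise_patterns = ["/tag/", "/category/", "/author/", "/page/"]
is_noise = df["url"].str.contains("|".join(noise_patterns))

clean = df[~is_noise].copy()  # keep only pages in scope for the audit
print(f"Filtered out {is_noise.sum()} boilerplate pages; {len(clean)} remain")
```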
What's the next step in your process after that?
“Then, you're ready for analysis – which essentially means breaking things down and making them simple. You start with the complex and you make it simple.
Your analysis could involve describing data: finding the best pages of your website in terms of clicks, checking the unique query count, finding which pages are decaying, grouping pages, or creating labels for pages – for example, splitting pages into best performance and worst performance to analyse them and give actionable advice.
You are taking something raw, the data, and you are turning it into something that you or your client can use.”
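As a rough illustration of that labelling idea, a few lines of pandas can describe the data and split pages into performance buckets. The threshold and column names below are arbitrary examples, not Marco's method:

```python
# A minimal sketch: describe click data and label pages into simple
# best/worst performance buckets. The threshold choice is illustrative.
import pandas as pd

df = pd.read_csv("gsc_pages_clean.csv")  # hypothetical: url, clicks

print(df["clicks"].describe())  # describing the data

q75 = df["clicks"].quantile(0.75)  # top quartile as "best", say
df["performance"] = "worst"
df.loc[df["clicks"] >= q75, "performance"] = "best"

print(df.groupby("performance")["clicks"].agg(["count", "sum"]))
```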
Is this something that AI can help with?
“I mostly use AI for writing code, where it's very good. For the rest, I don't think it's really necessary. Most of the stuff we are talking about is straightforward once you do it.
“The cool thing about analysis is that it requires a little bit of abstraction, and machines, especially the latest AI tools, aren't very good at this because you still need some mathematical or statistical knowledge. There isn't a lot of valid training data online for the AI to learn from. I've never got trustworthy results using AI to interpret this kind of mathematical stuff. I would only advise using AI for tactical or operational-level stuff. For strategy, abstracting, or analysing a phenomenon, do it yourself.”
What's the next step in your process?
“At this point, ideally, you would have some insights – and you usually will. Insights are essentially bits of truth: something you share with a client or with stakeholders that is actionable. Each insight should be tied to an action, something practical.
If your website got 100k clicks in August, so what? If it doesn't lead to anything actionable, then this is pure reporting. There is nothing to be done about it. However, if I tell you that 20% of your pages have zero clicks, you can implement a series of actions to improve those pages.”
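Computing that kind of insight is only a few lines once the data is clean. A minimal sketch, reusing the hypothetical page-level DataFrame from earlier:

```python
# A minimal sketch: turn raw page data into an actionable insight
# (the share of pages with zero clicks, plus the list to act on).
import pandas as pd

df = pd.read_csv("gsc_pages_clean.csv")  # hypothetical: url, clicks

zero = df[df["clicks"] == 0]
share = len(zero) / len(df) * 100
print(f"{share:.0f}% of pages have zero clicks ({len(zero)} pages)")

# The actionable part: the concrete list of pages to improve or prune.
zero["url"].to_csv("zero_click_pages.csv", index=False)
```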
How do you prioritise your insights?
“I only deal with B2C content websites, so it's very easy. In other cases, like e-commerce or SaaS, it can get a little trickier.
In my case, I usually start with articles belonging to a given cluster, if I have access to this information. If I can label articles within clusters, I can recommend prioritising certain articles belonging to a specific cluster (like ‘Fridges’, for example) because I know that this cluster brings in more money.
Otherwise, if it's a very large website and I know there are specific individual pages that are decaying or following some kind of pattern, I can detect 5-10 pages and identify those. I judge each case: I either go by cluster and group articles to see what matters the most, or I go by profit, if I have access to information like ad data, affiliate data, etc.”
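A hedged sketch of that prioritisation in pandas, assuming articles already carry a cluster label and a revenue figure from ad or affiliate data (all names are hypothetical):

```python
# A minimal sketch: rank clusters by revenue, then surface the weakest
# articles inside the top cluster as update candidates.
import pandas as pd

df = pd.read_csv("articles.csv")  # hypothetical: url, cluster, clicks, revenue

by_cluster = df.groupby("cluster")["revenue"].sum().sort_values(ascending=False)
top_cluster = by_cluster.index[0]  # e.g. the 'Fridges' cluster

candidates = (
    df[df["cluster"] == top_cluster]
    .sort_values("clicks")  # weakest performers first
    .head(10)
)
print(candidates[["url", "clicks", "revenue"]])
```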
Are you defining processes for your clients to follow as a result of the insights that you find?
“Yes, but also the other way around. Everyone has processes. Even if you don't write them down, you have a process for everything. Most of them can be improved unless you have hit a certain threshold.
I help content websites to improve their processes, where possible. Most of them don't use data, especially if they are big. You would think they all would. Adventurous big players do, but all the other big-but-not-so-big players often don’t. Even medium-sized websites don't really do it. This is a great opportunity to improve something.
Even something simple, like using data to show anomalies, identify which articles are decaying, or recognise seasonality. Maybe, during a certain period of the year, you find a strange association and you want to investigate. You might be talking about a certain product in your articles and you discover that it's popular between May and July. You don't know why so you want to investigate. To do this, first of all, you have to use your data to start investigating.
This is something that many people aren't doing at this level. They stop at Google Sheets, using the Search Console interface, or using the software without using the APIs.”
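Going beyond the interface usually means the Search Console API. A minimal sketch in Python with google-api-python-client; the property URL and credentials file are placeholders, and authentication setup varies:

```python
# A minimal sketch of querying the Search Console API directly.
# The site URL and credentials file are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

creds = service_account.Credentials.from_service_account_file(
    "service-account.json",  # placeholder credentials file
    scopes=["https://www.googleapis.com/auth/webmasters.readonly"],
)
service = build("searchconsole", "v1", credentials=creds)

response = service.searchanalytics().query(
    siteUrl="https://example.com/",  # placeholder property
    body={
        "startDate": "2024-01-01",
        "endDate": "2024-03-31",
        "dimensions": ["page"],
        "rowLimit": 25000,  # far beyond the 1,000-row UI export limit
    },
).execute()

rows = response.get("rows", [])
print(f"Fetched {len(rows)} pages")
```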
What is a decaying article?
“These are mostly articles that are losing traffic rather than rankings. Traffic is the main cause for concern because, at the end of the day, you care about what gets to that page. If you lose traffic, one reason could be that you have also lost rankings.
Decay is a natural fact; everything decays in nature. There is not one thing in nature that doesn't decay or doesn't change over time, and content is the same. The difference is that, in the case of content, we have more control over it. We can measure a lot of things and it’s quite easy to understand when content will decay.
If you know your niche or the type of articles you are working with, then you will already know. If you're doing walkthroughs or news content, that will decay faster compared to an evergreen informational article like ‘Types of Stone’.”
How do you prevent future damage?
“When you have a process, you also have some risks. This relates to the other steps in my process. The final steps are communication (communicating insights), execution (doing it), and prevention. Prevention, which is the last step, means avoiding damage in the future. Of course, you can't fully avoid ranking losses and you can't fully predict the future – it's impossible – but I'll give you a very practical example.
Let's say you have a blog that contains evergreen articles, but you don't have any process for updating pages. You write articles, push articles, and you are doing well. If, one day, a new competitor steals some of your traffic, it is going to take time to find which pages you need to update. You have so many of them, you have to identify the how and the why, you have to prioritise, and you have to know the process. If you tell your writers or editors to update a page, you have to give them instructions or they will get it wrong.
That’s why prevention is crucial. If you already know what to do, it's easier and you don't waste time. It's like war. If you already have a plan, it's easier to attack the enemy. If you don't, you're wasting precious time and you're wasting momentum. This is essentially what I am referring to when I talk about preventing or limiting future damage.
If you know your pages will eventually decay, or that some pages in a cluster or topic are more competitive, prepare some ideas in advance. Create a simple process that can be implemented quickly to update those pages before it's too late. This is planning in advance – and you don't even need any tools. Preventing future damage is mostly common sense. This sounds obvious because it is, but it's not so easy to implement in practice, and many people don't. Many people publish and they forget to update, or they update randomly. Then, a competitor will use this to their advantage.”
Do you have a favourite process or piece of software to alert you when something isn't performing as effectively as you want it to?
“If you want something custom, there are scripts or models that will tell you when something is going wrong. They can identify when clicks for certain pages show a downward trend, and send you an email. It is possible, but it's not something you will use very much unless the project allows for it. You have to consider that, usually, you also have to maintain these solutions.
If the project is small and you can do something manually once per month, then do it manually. If it is more complex, then you can have systems in place where you get notified via something like Slack when something is happening.
There are methods that I prefer for measuring these problems. To measure content decay, I created a very simple script with a friend of mine, Andrea D'Agostino, who is a data scientist. It's not complex. For every page of a website, we check the clicks and fit a straight line to the clicks on that page over time. Then, we take the line and identify the slope, which is a number that represents the rate of change along the line. Whether the number is positive or negative tells you whether there is growth or decay. If it is close to zero, nothing really important has happened or nothing has changed.
This is one of the simplest methods you can use to give every page a score. If you want it to be more accurate, you can add some weights and you can make it more complex, but the idea is more or less the same.
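A minimal sketch of that slope idea follows. This is an illustration, not Marco and Andrea's actual script, and it assumes a DataFrame of daily clicks per page:

```python
# A minimal sketch of scoring content decay by the slope of a straight
# line fitted to each page's daily clicks. Not the original script.
import numpy as np
import pandas as pd

df = pd.read_csv("gsc_daily.csv", parse_dates=["date"])  # url, date, clicks

def click_slope(group: pd.DataFrame) -> float:
    """Fit a line to clicks over time and return its slope."""
    x = np.arange(len(group))  # days as 0, 1, 2, ...
    slope, _intercept = np.polyfit(x, group["clicks"].to_numpy(), 1)
    return slope

scores = df.sort_values("date").groupby("url").apply(click_slope)

# Negative slope = decay, positive = growth, near zero = no real change.
print(scores.sort_values().head(10))  # the fastest-decaying pages
```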
You can also use a method known as anomaly detection. Anomaly detection refers to a series of algorithms that are tasked with finding anomalies in your data. Why is one page or one day abnormal? You can use this to find out which of your pages, or your days, show abnormal patterns which you can then investigate.
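Marco doesn't name a specific algorithm, so as one simple, hedged example, a z-score over daily clicks flags abnormal days; libraries like scikit-learn offer more robust options such as IsolationForest:

```python
# A minimal sketch of anomaly detection via a z-score on daily clicks.
# One of many possible approaches; the file name is a placeholder.
import pandas as pd

daily = pd.read_csv("gsc_daily_totals.csv", parse_dates=["date"])  # date, clicks

mean, std = daily["clicks"].mean(), daily["clicks"].std()
daily["zscore"] = (daily["clicks"] - mean) / std

# Days more than three standard deviations from the mean are flagged
# as abnormal and worth investigating.
anomalies = daily[daily["zscore"].abs() > 3]
print(anomalies[["date", "clicks", "zscore"]])
```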
For putting the actual system in place, once you have the script or model running, you can just set up some notifications for Slack or any other platform or tool. What’s important here is defining these problems and using the right metrics and data.”
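Wiring up the notification itself can be as simple as a Slack incoming webhook. A minimal sketch, where the webhook URL is a placeholder you generate in Slack:

```python
# A minimal sketch of the notification step via a Slack incoming
# webhook. The webhook URL is a placeholder.
import requests

WEBHOOK_URL = "https://hooks.slack.com/services/XXX/YYY/ZZZ"  # placeholder

def notify(message: str) -> None:
    """Post an alert message to a Slack channel."""
    requests.post(WEBHOOK_URL, json={"text": message}, timeout=10)

# Called from the decay or anomaly script when something trips:
notify("3 pages show a strong negative click slope this week.")
```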
If an SEO is struggling for time, what should they stop doing right now so they can spend more time doing what you suggest in 2024?
“Stop over-analysing. In many cases, we over-analyse things we don't have control over.
For example, you can't control Google updates. Okay, there is a core update, so what? At the end of the day, if your processes are stable and you know what you're doing, you shouldn't alter your strategy – update or no update. It's something you should account for, but you can't control it.
Also, stop obsessing over small details like meta descriptions. They might be ranking factors, but you should find what moves the needle. Focus on optimisation outside of an SEO context. Focus on optimising your processes, your time, and the time of others.
Optimise your processes without becoming an automaton. We are humans, so not everything should be automated. Focusing on scaling and improving processes can be very valuable because, at the end of the day, most of SEO is boring and repetitive. Understanding how to scale and how to put the proper systems in place requires a lot of time.”
Marco Giordano is an SEO specialist, a data and web analyst, and a consultant, and you can find him over at SEOtistics.com.