Make better use of your data by preparing a database
Olesia says: “Prepare a database for your brand, company, product, service, or whatever you are offering. You will be able to generate whatever you want out of it – be it a website, social media posts, a profile on a website, a Web 3.0 board, video, image, etc.
It is a good time to be doing this because there is still an overwhelming amount of information available that is not heavily regulated. There aren’t too many prohibitions and it's still affordable to put this data online. You have access to Common Crawl and lots of other data sets, which will help you to understand your market positioning, your voice, and how to better serve your customers.
As SEOs, we still look at search results and those guidelines, but younger people no longer want to search through 10 blue links. They want their answer straight away; they don't want useless SEO junk. Now we have a real opportunity, not only to understand our users better but also to serve them in a better way – through a knowledge base and a dictionary.
In a ‘dictionary’, you will have the keywords from your industry, niche terms, synonyms, and other variations. A knowledge base consists of your company's knowledge in your industry or area.
You will be able to search for the most common questions customers ask and put those into FAQs. In one of the companies where I work, we have a sales and customer support channel where the data is published in a special format and template. Templates are hugely important when you collect your data sets because, when you have a definite template for everything that you do, it's much easier to work with that information, extract meaningful data, train and fine-tune the existing models on that data, and use it accordingly.
It can be easier if you organise your knowledge base into JSON files, as long as the information load is not too big. I use fields like title, text, image, video, and metadata. I also keep databases for brand information. Those include values, voice, tone, and brand guidelines because, when you generate posts or data, you want them to fit within your brand. For your services and products, you would include customer personas, experts, testimonials, and analytics – what performs better or what posts your company prefers.”
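As a rough illustration of the structure described above, the following Python sketch stores one knowledge-base entry with the fields title, text, image, video, and metadata, and keeps a small 'dictionary' that maps an industry keyword to its synonyms. The sample values and the helper names `validate_entry` and `canonical_term` are assumptions made for illustration, not Olesia's actual setup.

```python
import json

# Fields named in the interview; treating all of them as required is an assumption.
TEMPLATE_FIELDS = ("title", "text", "image", "video", "metadata")


def validate_entry(entry):
    """Return the template fields that are missing from a knowledge-base entry."""
    return [field for field in TEMPLATE_FIELDS if field not in entry]


def canonical_term(word, dictionary):
    """Map a synonym or variation to its main keyword, or None if it is unknown."""
    needle = word.strip().lower()
    for keyword, variations in dictionary.items():
        if needle == keyword or needle in variations:
            return keyword
    return None


# Hypothetical sample data.
entry = {
    "title": "How to choose a standing desk",
    "text": "Measure your elbow height while standing...",
    "image": "images/standing-desk.jpg",
    "video": None,
    "metadata": {"persona": "remote worker", "tags": ["desks", "ergonomics"]},
}
dictionary = {"standing desk": ["sit-stand desk", "height-adjustable desk"]}

# Round-trip through JSON to confirm the entry can be stored in a JSON file as-is.
restored = json.loads(json.dumps(entry))
print(validate_entry(restored))
print(canonical_term("Sit-Stand Desk", dictionary))
```

Checking every entry against one fixed list of fields is what makes the template useful: anything that does not fit is caught before it reaches the content you generate from it.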
How do you create these kinds of databases and how do you organise them?
“You can do this with Python or any other tool. At a recent Google Cloud Next event, they were showing how to do this with the data in Google Cloud. You could even use AI, and you might not even need code to extract and work with this information.
You will usually organise this information in a hierarchy, using a tagging system so that the data is connected. Format it consistently, use templates for each data type, maintain clear naming conventions, and store all the supporting documentation. LLMs (large language models) don't usually know where they are getting their information from. You should know where you are getting your information from so that you can link to it whenever you need.”
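A minimal sketch of the tagging and source-tracking idea might look like the following. The entry ids, the "type-slug" naming convention, the `source` field, and both function names are hypothetical, chosen only to show how tags connect entries and how each entry can point back to where its information came from.

```python
from collections import defaultdict

# Hypothetical entries; the "<type>-<slug>" id convention is an assumption.
entries = [
    {"id": "product-oak-desk", "type": "product", "tags": ["desks", "oak"],
     "source": "internal/product-catalogue"},
    {"id": "faq-desk-height", "type": "faq", "tags": ["desks", "ergonomics"],
     "source": "support-channel/template-faq"},
]


def build_tag_index(items):
    """Map each tag to the ids of the entries that carry it."""
    index = defaultdict(list)
    for item in items:
        for tag in item["tags"]:
            index[tag].append(item["id"])
    return dict(index)


def sources_for_tag(tag, items):
    """Return the source of every entry with the given tag, for attribution."""
    return [item["source"] for item in items if tag in item["tags"]]


tag_index = build_tag_index(entries)
print(tag_index["desks"])
print(sources_for_tag("ergonomics", entries))
```

Keeping the source next to each entry means that anything generated from the data can be traced back and linked, which is the gap in LLM output that the answer points out.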
Would this database be accessible to search engines as well?
“Of course. We don't know how search engines will develop. There are so many different search engines in places like YouTube and TikTok, and we now have generative search within Google itself as well.
We want to be adaptable and serve our clients wherever they are. Maybe they won't access our website at all, but we need to supply them with information about our products, services, and companies so that they can buy from us eventually, wherever they are looking.
Clients are not specifically looking for us. They are looking to solve their pain or problem. We need to integrate our brand information into the search so that they come to us with their problem, and we can successfully help them to solve it.”
Should a database be kept on your website?
“I would suggest not keeping it on your website because it should be a trade secret. You should not give anyone access to your database. Instead, you would use it to generate data in the way that you want.
That's why you have brand blogs and information blogs. You find out from the sales or customer team what questions people in your industry usually have that your product or service might help to solve. Then, you publish that information so that it's available to your users in the form that they want it. It's already personalised because you should already have information for every persona out there based on their interests, what they want, and the problems they face.”
How does the information for a database get collated?
“It depends on your resources and what you are planning to do with it. Most companies simply don't have that much good information in their blog posts. They’re rewritten from competitors or from somewhere on the internet, so it's unlikely that they need to collect that data at all. It's useless. They might want to generate something better.
The information that they need to collect for their databases is basically the products or services that they offer and the problems that these products or services can solve. It’s an SEO’s task to organise and prepare that information so the customer can find it on the internet.”
How often should the data be updated?
“One of the main problems with a knowledge base is keeping it up-to-date. Whenever you have a new product or a new service, or you discontinue a product or service, you should update the database.
It's much easier to maintain when you have a good tagging or labelling system. You don’t need to have a third person sitting there, manually feeding products into the database. The database should be organised. The SEO can say, ‘I need this information to be stored in this way so that we can work with it.’ Then, you can decide how to do that.”
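One way such an automated update could work is sketched below, assuming the current product list can be exported as a simple mapping of product id to name. The function name `sync_products`, the `status` field, and the sample products are illustrative assumptions rather than a description of any specific system.

```python
def sync_products(database, catalogue):
    """Add new products and flag discontinued ones; returns an updated copy."""
    updated = {pid: dict(record) for pid, record in database.items()}

    # Products in the current catalogue are added or confirmed as active.
    for pid, name in catalogue.items():
        if pid not in updated:
            updated[pid] = {"name": name, "status": "active"}
        else:
            updated[pid]["status"] = "active"

    # Products no longer in the catalogue are flagged instead of deleted.
    for pid, record in updated.items():
        if pid not in catalogue:
            record["status"] = "discontinued"

    return updated


# Hypothetical data: p2 has been dropped and p3 is new.
database = {
    "p1": {"name": "Oak desk", "status": "active"},
    "p2": {"name": "Pine desk", "status": "active"},
}
catalogue = {"p1": "Oak desk", "p3": "Walnut desk"}

result = sync_products(database, catalogue)
for pid in sorted(result):
    print(pid, result[pid]["status"])
```

Flagging discontinued products instead of deleting them is a design choice made here so that older content referring to them can still be found and updated.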
How does an SEO justify spending a lot of time preparing a database instead of on more conventional SEO tasks?
“It doesn’t require too much time. Preparing and maintaining a database should be automated, and it's definitely not an SEO who manually inserts all the fields. SEOs use the data for their tasks – be that content, building backlinks, or analytics.
You may even connect a chatbot to this data so that you can understand where you are, where you are heading, how many people are attracted to a page, how it's performing, etc.
It's impossible for an SEO to keep track of all this data, especially in a very big company. It will depend on the company’s SEO strategy – how often you publish, which questions you answer in your blog posts, etc. For products, it's often automated. When a new product appears, you have a template for how you put it on the website, how you optimise it, how you interlink it with other products, and which category it’s placed in.
It’s not something that an SEO will do on a daily basis. Your job is to check and verify that everything is correct, implemented and works fine.”
What software do you prefer to use?
“I use Python, but I am very hopeful about the Google Cloud services that have recently been introduced. I've already applied to DoiT AI and other things because I want to see what they're offering.
I hope these tools will make the job even faster by removing the need to write the code yourself. I want to see it first, though. Hopefully, something like that will be emerging from Google, Amazon, or somewhere else.”
What's the most practical use case for a database like this?
“For me, I primarily use it for generating content and supporting content within other channels – like social media channels, press releases, and other off-page spaces.
You will keep all of that consistent by using the same database to generate it. You will have a consistent message across all channels, and they will all support your SEO. You will have social signals for your content as well. You can fill up every channel and know that you are doing it right. If you always have a template for all of your content, it's much easier to make it, check it, and verify it within the company.
We are transitioning into the age of Web 3.0, and we don't know how everything is going to look. Google is pushing the blockchain, which could be a problem for us. Whatever happens to search next, having this information available in a database will help you to be prepared. You can be flexible and take on whatever comes. If there are websites in the future, you can build a website from it. If you need a social media profile with lots of posts, you can build that. If you need to build a Web 3.0 board, you can do that.”
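To make the idea of one database feeding every channel more concrete, here is a small Python sketch that renders the same record through a different template per channel using the standard library's `string.Template`. The record, the channel names, and the template wording are all hypothetical.

```python
from string import Template

# Hypothetical record pulled from the database.
record = {
    "title": "Oak standing desk",
    "text": "A height-adjustable desk made from solid oak.",
    "url": "https://example.com/oak-desk",
}

# One template per channel; the wording of each is an assumption.
CHANNEL_TEMPLATES = {
    "social": Template("$title: $text $url"),
    "press_release": Template(
        "FOR IMMEDIATE RELEASE\n$title\n\n$text\nMore information: $url"
    ),
    "web_page": Template("<h1>$title</h1>\n<p>$text</p>"),
}


def render_all(data, templates):
    """Render one record through every channel template."""
    return {channel: tpl.substitute(data) for channel, tpl in templates.items()}


posts = render_all(record, CHANNEL_TEMPLATES)
for channel, body in posts.items():
    print(f"--- {channel} ---")
    print(body)
```

Because every channel draws on the same record, a correction made once in the database carries through to the social post, the press release, and the web page the next time they are generated.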
Are any brands doing a particularly good job of this at the moment?
“I don't think so, because it's quite new. A lot of people say they are creating these data sets and generating information out of them but, from what I've seen, most are not very successful. In the news industry, some companies are managing to generate news posts already. However, I have not yet seen big companies doing this well.
We are mostly collecting data in batches, so we have a very small portion of what we publish. So far, I have mostly created learning centers for different companies out of this content, and it's performing well in search right now.”
If an SEO is struggling for time, what should they stop doing right now so they can spend more time doing what you suggest in 2024?
“As SEOs, we are given so much overwhelming information – just look at how many conferences we have and all the people saying, ‘Try this!’ or ‘Try that!’ You don’t have time to listen to everything, and then someone tells you to learn Python, Looker Studio, or whatever else.
Of course, you do need to learn these things, but when you try to grab all of them at the same time, you end up doing nothing. You won’t have enough return on investment for your time and resources. Stop running after every thread in SEO and stop trying to generate everything and write about everything. Concentrate on something small and start doing what you do.
The emergence and development of AI will help us to fill in the gaps where we don't have enough knowledge. Maybe you won't need to learn Python because AI will be able to generate the code for you so that you can just input it and have the same result as those who learned it. Think twice before investing your time into something because time is becoming more valuable.
I have a checklist now, for when I am trying to decide what is worth my time. The first question I ask myself is, ‘Is it bringing any results or is it just keeping me active?’ Then I ask, ‘If I do this, what is the expected result? Do I really need that result? Are there any other priorities?’ Those are the basic questions I try to ask myself.
You want to listen to Google Cloud Next, then you want to look into the new ChatGPT integration, and then there's Claude from Anthropic. You just cannot try all those things at once. You have to be focused on the results that you are bringing. It’s nice to try everything, but these tools are not always bringing much to the table just yet.”
Olesia Korobka is an SEO entrepreneur at Fajela, and you can find her over at OlesiaKorobka.com.