Guide · Chapter 03 · 11 min read

Keyword research

By Evgeni Asenov.

The short answer

Keyword research is the practice of finding the queries people actually search for, judging what those queries mean, and deciding which are worth writing for. The job is three steps: collect ideas from seeds and tools, classify each by intent, prioritize by difficulty and business fit. AI search splits one intent across many phrasings, raising the cost of picking wrong.

Keyword research is demand mapping, not guessing

Keyword research is the practice of taking a guess about what your audience searches for and turning it into a list of queries with measurable demand. The work has three stages: collect ideas from seeds and tools, classify each idea by what the searcher actually wants, and prioritize the survivors by difficulty and business fit. Skip the stages and you ship a launch page that targets a phrase nobody types.

Builders skip this step more than any other because keyword tools feel marketer-coded. The cost of skipping is well documented: Ahrefs found that 90.63 percent of pages get no organic traffic from Google, and most of those pages target nothing in particular. A founder writes a launch post titled “Why we built X” and waits for traffic that never arrives, because nobody searches that phrase.

The fix is upstream of writing. Keyword research turns guesswork into a sorted list of queries you know exist, and it gives every page a job before the page is built. In the crawl, index, rank pipeline from chapter 1, the ranker can only choose from what it indexed, and what gets indexed is what you wrote about. Picking the query first is how you stop writing for an empty room.

Seed keywords are the only honest starting point

A seed keyword is a short phrase your audience uses for the problem you solve. Five to fifteen seeds is the right starting bench, written down without a tool open, in the words your users actually say. The seeds are not the final list; they are the input that everything else expands from.

The reason to start without a tool is that tools rank by search volume, not by relevance to your product. A volume-first list will push you toward generic terms that compete with everyone, and away from the narrow ones your real users type. The seeds anchor the research to the product first, the database second.

  1. Write down the product nouns

    The names of what you sell, as a customer would describe them, not the marketing pitch. A running shoe store has seeds like running shoes, trail shoes, marathon trainers, not the brand of any single model.
  2. Write down the problem verbs

    What the customer is trying to do, in plain words. Find shoes for flat feet, train for a first marathon, fix shin splints, pick a road shoe versus a trail shoe. Each verb becomes a query family later.
  3. Write down the competitor brand terms

    Hoka, Nike, Brooks, Asics, Saucony. Branded queries are where commercial intent concentrates, and people compare your category by typing two brand names side by side.
  4. Write down the customer language

    Open your site search logs, your support inbox, the questions your sales team answers on every call. The line on your homepage you keep rewriting is a seed. So is the phrase a customer used in the last review you read.
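The four sources above can be merged into one seed bench with a few lines of Python. This is a minimal sketch, not a prescribed tool: the function name `build_seed_bench` and the sample phrases are hypothetical, and the range check simply mirrors the five-to-fifteen bench suggested earlier.

```python
def build_seed_bench(*sources: list[str]) -> list[str]:
    """Merge seed lists, normalizing case and whitespace and dropping duplicates."""
    seen: set[str] = set()
    bench: list[str] = []
    for source in sources:
        for phrase in source:
            # Lowercase and collapse repeated whitespace so near-duplicates match.
            key = " ".join(phrase.lower().split())
            if key and key not in seen:
                seen.add(key)
                bench.append(key)
    return bench


# Hypothetical inputs, one list per source.
product_nouns = ["running shoes", "trail shoes", "marathon trainers"]
problem_verbs = ["find shoes for flat feet", "fix shin splints"]
competitor_brands = ["Hoka", "Brooks", "Asics"]
customer_language = ["Running  Shoes", "shoes for overpronation"]  # made-up log entries

seeds = build_seed_bench(product_nouns, problem_verbs, competitor_brands, customer_language)

# The chapter suggests a bench of five to fifteen seeds.
in_range = 5 <= len(seeds) <= 15
```

Order is preserved, so the product nouns stay at the top of the bench and the duplicate from the site search log is dropped.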

The seeds expand fast once they exist. A site search log alone usually surfaces fifty raw phrases, and competitor brand terms add another twenty. The first diagram below sketches how a single seed fans out into questions, modifiers, and adjacent jobs.

One seed becomes a family of queries once tools and SERPs expand it

Keyword tools turn one seed into a thousand candidates

A keyword tool is software that takes a seed and returns the related queries, with search volume and difficulty attached. Manual brainstorming caps at around fifty phrases. A defensible content plan needs five hundred or more, which is what tools deliver in a single export.

The category splits into four tiers. The first is free signals like Google Autocomplete, People Also Ask, and the related searches at the bottom of the SERP. The second is freemium tools like Mangools and Ubersuggest, with a few free lookups per day and modest databases. The third is enterprise dashboards like Ahrefs and Semrush, paid only, with a 26 billion keyword database across 142 countries on the Semrush side. The fourth is pay-per-request APIs like DataForSEO, which sit underneath most of the dashboards you pay for and sell the same raw data for fractions of a cent per call to anyone willing to wire up a script.

| | Free signals | Paid tools |
| --- | --- | --- |
| Source | Google Autocomplete, People Also Ask, the related searches under the SERP | Curated databases with search volume and difficulty already calculated |
| Cost | Zero, you pay in time scrolling the SERP | 0 to 50 USD a month for freemium, 100 to 500 USD a month for enterprise dashboards |
| Best for | Validating a single seed before you commit to writing for it | Building a 500-keyword universe for a content plan in one afternoon |

The trap is reaching for the enterprise tool before you have any seeds. A database that returns ten thousand related phrases is useful only if you can throw out the nine thousand five hundred that do not fit. Without seeds and without intent, the list is noise.

Search intent is the metric that decides whether a keyword pays

Search intent is what the searcher actually wants to do when they type the query. The same query string can carry different intent depending on who is typing it, and the SERP usually reflects that ambiguity instead of resolving it.

Take “running shoes”. From a first-time buyer the query is informational, what to look for, how to size, what stability even means. From a marathoner whose pair just wore out, the same two words are transactional, which model to reorder and where to get it shipped fastest. Google cannot tell the two users apart from the query alone, so the SERP mixes listicles, product cards, brand pages, and a definition snippet on a single page, hedging across both jobs.

A keyword with high volume and the wrong dominant intent for your page will never convert, no matter how well written the page is. The four-way classification is the floor, and reading the SERP is the most reliable signal of which job Google currently thinks the query is doing.

The four categories cover almost every query. Informational queries want a definition or explanation, like “what are stability running shoes”. Navigational queries want a specific destination, like “hoka outlet near me”. Commercial queries are comparing options before buying, like “best running shoes for flat feet”. Transactional queries are ready to act, like “hoka clifton 9 price”.

The wording is a weak signal, the SERP is a strong one. If the top ten results for your target query are all listicles, the intent is commercial regardless of how the phrase reads. If they are all official docs pages, the intent is informational. The ranking algorithm has already done the classification work for you, by promoting whichever format users keep clicking.
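Reading the SERP can be reduced to a majority vote over result formats. The sketch below is illustrative only: the listicle and docs mappings come from the paragraph above, while the other entries in `FORMAT_TO_INTENT`, the format labels themselves, and the more-than-half threshold are assumptions you would tune by hand.

```python
from collections import Counter

# Result format -> intent. Only "listicle" and "docs" come from the text;
# the remaining rows are assumed for illustration.
FORMAT_TO_INTENT = {
    "listicle": "commercial",
    "docs": "informational",
    "definition": "informational",
    "product_page": "transactional",
    "brand_homepage": "navigational",
}


def intent_from_serp(result_formats: list[str]) -> str:
    """Return the dominant intent of a SERP, or "mixed" when no intent has a majority."""
    if not result_formats:
        return "unknown"
    votes = Counter(FORMAT_TO_INTENT.get(fmt, "unknown") for fmt in result_formats)
    intent, count = votes.most_common(1)[0]
    # Require more than half of the results to agree (assumed threshold).
    return intent if count > len(result_formats) / 2 else "mixed"


all_listicles = intent_from_serp(["listicle"] * 10)
hedged_serp = intent_from_serp(
    ["listicle"] * 4 + ["product_page"] * 3 + ["brand_homepage"] * 2 + ["definition"]
)
```

A SERP like the “running shoes” example, which hedges across several formats, comes back as `"mixed"` here, which is a cue to read the results by hand before assigning a page format.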

The SERP format is the most honest signal of intent

Keyword Difficulty is a useful lie you should still respect

Keyword Difficulty, often abbreviated KD, is a 0 to 100 score that estimates how hard it is to rank for a keyword. The score is mostly a function of the backlink profiles of the pages currently in the top ten. KD is a useful lie because every tool computes it differently, and the absolute number is unreliable across vendors.

The lie is verifiable. Ahrefs Keywords Explorer derives KD from the number of referring domains linking to the top-ranking pages, scaled to 0 to 100. Mangools KWFinder blends backlinks with domain authority and SERP characteristics. A keyword that scores 35 in one tool may score 52 in the other, and neither is wrong. The two tools are estimating different things.

0 to 100: the standard KD scale across major tools
Backlinks: the dominant input behind almost every KD score
1 in 10: rank cutoff every KD score is trying to predict

KD is best used as a coarse filter. Anything above 70 is a fight for established sites with hundreds of referring domains. Anything below 30 is reachable for a new site within a quarter, if the page is good and the topic fits. The middle band is where most of the real planning happens, and where matching intent and depth matters more than the score itself.
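As a coarse filter, the bands above amount to two comparisons. A minimal sketch, assuming KD scores from a single tool; the function name `kd_band` and the band labels are made up for illustration.

```python
def kd_band(kd: int) -> str:
    """Place a 0 to 100 KD score into one of the three coarse bands."""
    if not 0 <= kd <= 100:
        raise ValueError("KD is expected on a 0 to 100 scale")
    if kd > 70:
        return "established sites only"
    if kd < 30:
        return "reachable for a new site"
    # 30 to 70 inclusive: intent match and depth matter more than the score.
    return "middle band"


bands = {kd: kd_band(kd) for kd in (22, 58, 82)}
```

Because vendors compute KD differently, the cutoffs only make sense when every score passed in comes from the same tool.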

Long-tail keywords are where builders win

A long-tail keyword is a query of three or more words with lower search volume, lower difficulty, and higher conversion intent than the head term it descends from. “Running shoes” is the head term, “best running shoes for flat feet beginners” is the long tail. For a new site, the math only works on the long tail, because the head is owned by domains with a decade of backlinks.

The pattern is consistent across case studies. A Webflow keyword research case shows long-tail phrases pulling double-digit conversion rates while head terms barely move the needle. HubSpot recommends targeting keywords where your domain authority is within ten points of the average top-ten result, which for a six-month-old site almost always means the long tail.

| Keyword | The numbers | Your odds on a new site |
| --- | --- | --- |
| Head term: 'running shoes' | Volume: 246,000 a month. KD: 82 | Near zero within 12 months |
| Mid-tail: 'trail running shoes' | Volume: 14,800 a month. KD: 58 | Possible with strong content and links |
| Long-tail: 'best trail running shoes for flat feet' | Volume: 320 a month. KD: 22 | High if the page matches intent |

The volume on a single long-tail term looks tiny. Twenty long-tail pages each pulling 320 a month is 6,400 a month, with conversion rates four or five times the head term, and a fraction of the writing effort per page. The total beats the head term in revenue terms for almost every product site under a year old.
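The arithmetic in the paragraph above is easy to check. This sketch uses only the figures already quoted (twenty pages, 320 searches a month, a four-to-five-times conversion advantage, 246,000 for the head term); the variable names are illustrative, and treating every long-tail page as pulling the same volume is a simplifying assumption.

```python
def total_monthly_volume(pages: int, volume_per_page: int) -> int:
    """Combined monthly search volume across a set of similar long-tail pages."""
    return pages * volume_per_page


long_tail_total = total_monthly_volume(pages=20, volume_per_page=320)

# Weight the total by the "four or five times" conversion advantage to express
# it in head-term-equivalent searches.
weighted_low = long_tail_total * 4
weighted_high = long_tail_total * 5

# For scale: the raw long-tail total as a share of the head term's volume.
head_volume = 246_000
share_of_head = long_tail_total / head_volume
```

Even after weighting, the long-tail total is a fraction of the head term's raw volume. The comparison tips toward the long tail because a new site's odds of ranking for the head term are near zero, so that volume is not actually reachable.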

The 2026 caveat is intent. A lot of long-tail queries are informational, “what is overpronation”, “how often should I replace running shoes”, “do I need stability shoes”, and informational is exactly the intent AI Overviews and chatbots resolve on the SERP itself. A page that ranks first for “what is overpronation” may still see most of its clicks intercepted by the AI summary stacked above the organic block.

The long tail still belongs to builders in 2026. The bet has just moved further down the funnel, toward the commercial and transactional phrasings the model has to send a user out to read.

The head term wins one query, the long tail wins the total area under the curve

AI search splits one intent across many phrasings

AI answer engines added a synthesis layer on top of retrieval, and that layer rewrites the user’s prompt into several search queries before it fetches anything. A single user intent fans out into a family of phrasings the model generates internally. The job of keyword research changed: you are no longer targeting one phrase, you are trying to be the source for a whole family of phrasings.

The traffic mix moved with the platforms. HubSpot reports search behavior splits 88 percent Google, 31 percent social, 12 percent AI chatbots in 2025, with overlap because users hop between surfaces. The 12 percent figure is still small, but the queries that land in chatbots concentrate in commercial and transactional intent, which is the intent that converts.

A user asks ChatGPT “what are the best running shoes for a flat-footed beginner training for a first marathon”. The model rewrites the prompt into three or four query variants (“stability running shoes for flat feet”, “best marathon shoes for beginners”, “running shoes for overpronation”), retrieves results for each, then synthesizes one answer. As covered in the fourth (AI) step from chapter 1, being cited means being the source the model returns to across the family of rewrites, not just one of them.
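One way to make "the source the model returns to across the family" concrete is to count how many rewrites each URL shows up in. The sketch below assumes you have already collected search results per rewritten query by some means; the function name `citation_coverage`, the URLs, and the result lists are all hypothetical.

```python
from collections import Counter


def citation_coverage(results_by_rewrite: dict[str, list[str]]) -> dict[str, float]:
    """Share of rewritten queries in which each URL was retrieved at least once."""
    appearances = Counter()
    for urls in results_by_rewrite.values():
        appearances.update(set(urls))  # count a URL once per rewrite
    total = len(results_by_rewrite)
    return {url: count / total for url, count in appearances.items()}


# Hypothetical retrieval results for the three rewrites from the example.
results = {
    "stability running shoes for flat feet": [
        "https://example.com/flat-feet-guide",
        "https://example.org/stability-review",
    ],
    "best marathon shoes for beginners": [
        "https://example.com/flat-feet-guide",
        "https://example.net/marathon-picks",
    ],
    "running shoes for overpronation": [
        "https://example.com/flat-feet-guide",
        "https://example.org/stability-review",
    ],
}

coverage = citation_coverage(results)
```

In this made-up data, the page with a coverage of 1.0 is retrieved for every rewrite, which is the position the paragraph above describes.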

One user prompt fans out into many queries before synthesis

Prioritization is the chapter, the rest is collection

Prioritization is the step that turns a five-hundred-keyword spreadsheet into a ten-page content plan. Collection is mechanical, prioritization is the judgment that decides whether the next quarter ships traffic or noise. Cut the bottom 95 percent, target 10 to 20 keywords this quarter, and let the rest wait.

The triage uses three axes on a fixed scale: volume, difficulty, and business fit. Volume is the search volume number from any one tool, used as a relative ranking inside that tool. Difficulty is the KD score, also from one tool. Business fit is the most subjective and the most important, and Ahrefs documents it as Business Potential on a 0 to 3 scale, where 3 is “our product is the answer” and 0 is “no obvious tie to revenue”.

  1. Score every keyword on Business Potential

    Zero to three. Three means your product is the literal answer to the query. Two means the query is in your category. One means your audience reads it but does not buy. Zero means cut.
  2. Cut everything scored zero or one

    Even at low difficulty, a keyword that does not connect to revenue is a tax on the content calendar. A page that ranks for nothing your buyer searches is a page that pays nothing back.
  3. Sort the survivors by KD ascending

    Within the keepers, low KD comes first. A new site cannot fight for KD 70 keywords in the first quarter, no matter how perfect the business fit. Start where the SERP is winnable.
  4. Pick the top 10 to 20 for this quarter

    Anything above 20 is wishful thinking for a small team. Each keyword needs a dedicated page, and a quarter has roughly 12 weeks of writing capacity at one page per week.
  5. Tag each keyword with its intent and prompt

    Add the four-way intent label and the LLM prompt that wraps the keyword. The pair becomes the brief for the page, format from intent, copy from prompt, target query from keyword.
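The first four steps above fit in a short function. This is a sketch under stated assumptions rather than a finished pipeline: the `Keyword` fields and the `triage` name are illustrative, the Business Potential scores and the volume and KD for “what is overpronation” are invented, and breaking KD ties by higher volume is an added assumption the steps do not specify.

```python
from dataclasses import dataclass


@dataclass
class Keyword:
    phrase: str
    volume: int              # monthly searches, from one tool
    kd: int                  # 0 to 100, from the same tool
    business_potential: int  # 0 to 3
    intent: str              # four-way intent label


def triage(keywords: list[Keyword], limit: int = 20) -> list[Keyword]:
    """Cut low business fit, sort by KD ascending, keep the top `limit`."""
    keepers = [k for k in keywords if k.business_potential >= 2]
    # Low KD first; ties broken by higher volume (assumption).
    keepers.sort(key=lambda k: (k.kd, -k.volume))
    return keepers[:limit]


candidates = [
    Keyword("running shoes", 246_000, 82, 2, "commercial"),
    Keyword("trail running shoes", 14_800, 58, 3, "commercial"),
    Keyword("best trail running shoes for flat feet", 320, 22, 3, "commercial"),
    Keyword("what is overpronation", 1_000, 10, 1, "informational"),  # invented numbers
]

plan = triage(candidates, limit=2)
plan_phrases = [k.phrase for k in plan]
```

With a limit of two, the low-KD informational query is cut on business fit before difficulty is ever considered, and the head term falls off the bottom of the list.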

The output of triage is a list short enough to fit on a single screen. Ten to twenty keywords, each tagged with intent, KD, business fit, and the prompt that wraps it. From the SEO basics work in chapter 2, each keyword now needs its own page, with a single canonical URL, internal links from sibling pages, and a place in the site’s topical cluster.

Keyword research did not change shape in 2026. The same three moves still carry the practice: collection, classification, triage.

What changed is the cost of getting it wrong. A page targeting the wrong phrasing used to fail once, on Google. Now it fails twice, on Google and on the prompt the model rewrites before it retrieves anything. The seeds matter more, the SERP read matters more, the triage matters more, because every misfire compounds across two surfaces instead of one.

Same method, thinner margin.
