Thursday, 18 December 2025

AI's potential impact on web search, traffic and trademarks

Functionality powered by Artificial Intelligence (AI) is now a fully integrated component of a wide range of Internet and information technology systems, and the world of online search is no exception. In addition to the availability of a range of standalone generative AI ('gen-AI') tools and applications (such as ChatGPT), many major search engines now also provide an 'AI overview' response produced by gen-AI (displayed above the main ('organic') search results, in the case of Google). This evolution is having major impacts on consumers' search and browsing behaviours, and consequently the patterns and volumes of traffic being driven to brand-owner or other trusted websites are changing.

The potential implications of the increasing degree of adoption of AI in online search technologies are, in some cases, relatively predictable. The most likely scenario is that users will increasingly obtain the information they need from the AI overviews alone, without clicking through to the sites from which the information is sourced, thereby leading to a decrease in volumes of traffic to these sites. However, these patterns may vary depending on the nature of the searches being carried out. For example, general questions may be more likely to generate satisfactory AI responses, whereas explicit searches for content on specific trusted sites are likely to continue to generate a higher click-through rate to these sites, regardless of whether or not the response links are served up by gen-AI. Overall, the evolving trends are likely to drive a desire by brand owners to maximise their chances of being referenced or promoted in AI overviews (Figure 1). This has led to the new concept of 'Generative Engine Optimisation' ('GEO'), analogous to the older concept of Search Engine Optimisation (SEO) - that is, the process of ensuring, through suitable website configuration and/or the inclusion of appropriate content, that a website is findable, and highly ranked, by search engines.

Figure 1: Example of a search-engine response produced by generative AI (in this case, Google's 'AI Mode'), which references a number of specific brands

The consequences of many of these trends are already being observed; several studies[1,2,3,4] from early 2025 report huge decreases in 'classic'-search-generated click-through rates to websites, in response to the appearance of AI overviews, which generally occupy the most prominent areas of the search-engine results pages (SERPs). In one such study, the CEO of Expedia is quoted as stating that the company is "partnering with AI search companies to ensure our brands show up well across customer queries". This type of approach can be advantageous whether or not users click through to the promoted sites from the AI summaries, given the potential for a 'long-form' overview to provide a more detailed synopsis of a brand's 'value proposition' than a simple search abstract, and thereby deliver a higher customer conversion rate[5].

In some respects, we might expect GEO - which, at a fundamental level, must involve brand owners actively ensuring that the underlying LLMs are trained on brand-favourable material - to work in similar ways to SEO. Indeed, one key element of successful GEO is a requirement for organisations to ensure that the brand is widely and favourably cited on the Internet, in locations which are popular and highly linked from other popular sites[6]. However, another part of this picture involves an understanding of the mix of sources from which AI tools primarily draw their data; a number of recent analyses[7,8] suggest that platforms heavily featuring user-generated content (such as Reddit, Wikipedia and YouTube) are highly favoured. Overall, an effective holistic programme of GEO can be a complex proposition - involving efforts to ensure that the right types of content are presented in the most appropriate formats, and in the right locations[9] - resulting in the emergence of a number of providers explicitly offering services to assist with the process (Figure 2).

Figure 2: Example of a website of a GEO service provider

The nature of GEO techniques also means there is potential for the system to be manipulated - as is also possible with SEO - such as via the practice of artificially 'seeding' the Internet with positive brand content. There are also suggestions that the current generations of AI tools are relatively poor at excluding low-quality sources, potentially making them more open to exploitation. One approach which fraudsters could therefore employ might involve the creation of a network of false, but professional-looking, sites promoting an infringing brand. Attacks of this type have already been observed, having been employed for cryptocurrency, banking, and travel scams[10].

The trends described above are likely to require brand-protection programmes to include a wider suite of mediating components, such as functionality to monitor LLMs explicitly. Growth in the adoption of AI search may also increase the importance of other 'classic' brand-protection initiatives; one example might be a greater requirement to ensure that legacy official domains (which may still be cited by AI tools) are retained in brand owners' official portfolios, and there may also be stronger requirements to build a robust defensive domain portfolio and implement comprehensive proactive monitoring and enforcement measures[11].

The developing world of GEO - particularly when considered alongside the emergence of commercial advertisement-placement offerings by AI search providers - might also result in a fundamental shift in the nature of search-related LLMs, towards a scenario where they are generating what are essentially 'paid results', rather than 'organic results'. Any evolution of trends along these lines may necessitate the introduction of new trade regulations, by which search tools may be required to explicitly distinguish between the different types of result, similar to the rules introduced for 'classic' search results in the 1990s and 2000s. There is also a possibility that the commercialisation of AI search outputs might ultimately drive a general loss of trust by Internet users in these types of service.

As a final point, it is worth noting that industry commentators are also starting to suggest other possible implications of the growth of AI. For example, one recent study notes that product selection by consumers is increasingly being made on the basis of AI-generated personal recommendations, rather than being driven by brand recognition or specific brand searches. This trend is beginning to result in the emergence on marketplaces of products with unintelligible, nonsense brand names, which are purely optimised for algorithmic popularity based on user insights. The emergence of such 'brands' can fundamentally undermine the function of a trademark as a (memorable) indicator of origin, and may also require a restatement of concepts such as the roles of conceptual, phonetic and visual similarity. The study therefore concludes that AI assistants and algorithm-driven e-commerce marketplaces may lead to a fundamental change in the nature of brand recognition, and could have the potential to reshape trademark law as we know it[12,13].

References

[1] https://www.barrons.com/articles/ai-google-search-internet-economy-932092ef

[2] https://www.forbes.com/sites/torconstantino/2025/04/14/the-60-problem---how-ai-search-is-draining-your-traffic/

[3] https://startups.co.uk/news/ai-search-hospitality/

[4] https://technologymagazine.com/articles/how-googles-new-ai-mode-could-devastate-web-traffic-seo

[5] https://www.semrush.com/blog/ai-search-seo-traffic-study/

[6] https://ahrefs.com/blog/ai-overview-brand-correlation/

[7] https://www.semrush.com/blog/ai-mode-comparison-study/

[8] https://www.visualcapitalist.com/ranked-the-most-cited-websites-by-ai-models/

[9] Additional detail on GEO techniques used in practice can be found in a set of notes published as a follow-up to this article, at: https://www.linkedin.com/pulse/ai-web-search-part-2-optimising-world-david-barnett-g7eme/

[10] https://secureblitz.com/dark-side-of-llms/

[11] https://iptwins.com/2025/10/02/combatting-fake-information-in-ai-search-results-how-new-search-technologies-are-exploited/

[12] E. Bonadio and A. Rohatgi (2025). 'Trademarks and Artificial Intelligence: Some Preliminary Considerations', SSRN, http://dx.doi.org/10.2139/ssrn.5364772. (Available at: https://ssrn.com/abstract=5364772)

[13] https://www.worldtrademarkreview.com/article/ai-could-make-traditional-brand-signifiers-obsolete-new-study-warns

This article was first published on 18 December 2025 at:

https://www.iamstobbs.com/insights/ais-potential-impact-on-web-search-traffic-and-trademarks

Thursday, 27 November 2025

Br'AI've New World: Experimenting with the use of generative AI for brand protection analysis tasks

EXECUTIVE SUMMARY

Generative artificial intelligence ('gen-AI') tools offer the potential to assist with, and add efficiencies to, a number of analysis tasks, especially where these tasks meet certain criteria. These criteria typically include: (i) being computationally or analytically intensive (i.e. where there are large amounts of data to consider) and/or highly repeatable; (ii) of a type such that there may be subjective elements to the analysis (e.g. where the data to be analysed can be presented in a variety of different ways); and (iii) of a type where the analysis would be hard to achieve manually, but the output is easy to verify for accuracy.

Our latest study explores the application of gen-AI in a range of general areas related to the analysis of data pertaining to brand monitoring services. The experiments make use of a proprietary business workflow automation system (the AI 'tool'), incorporating AI and natural language capabilities to generate agentic functionality, with which interaction is possible in a 'chatbot' style.

The study considers two broad areas of analysis. The first area is related to sets of tasks required to carry out 'clustering' analysis (i.e. the establishment of links between related findings, on the basis of shared associated characteristics, which is advantageous for identifying priority targets associated with high-volume or serial infringers, demonstrating bad-faith activity by bad actors, allowing efficient bulk takedowns, and providing data for entity investigations). The specific tasks considered in this area fall into the categories of generalised 'scraping' functionality (i.e. the extraction of data points from arbitrary webpages) and the parsing (i.e. interpretation) of free-form data, such as that provided in domain name whois records. The second area of analysis relates to the high-level categorisation of results (i.e. potentially infringing webpages or other content) identified through brand monitoring services.

The analysis shows that gen-AI does have significant potential in assisting with these broad task areas, although the specific use-cases must be carefully selected to ensure that the tools in question are a good fit to the tasks and the potential for undesirable outputs such as 'hallucinations' can be minimised. The specific AI tool used in the analysis is particularly applicable, as it incorporates functionality to save the necessary prompts as pre-defined 'tasks' (i.e. the creation of agentic systems), which should provide a basis for greater repeatability and efficiency going forward.

This article was first published on 27 November 2025 at:

https://www.iamstobbs.com/insights/braive-new-world-experimenting-with-the-use-of-generative-ai-for-brand-protection-analysis-tasks

* * * * *

WHITE PAPER

Introduction

Generative artificial intelligence ('gen-AI') technology has undoubtedly been one of the hottest topics of the last couple of years, and the Internet is awash with discussions regarding potential use-cases for gen-AI, as well as concerns over its ubiquity and the risks this presents. Whilst many of the cited uses for generative AI are (arguably) questionable at best, there is certainly potential for thoughtful and measured application of AI capabilities to make possible - and build efficiencies into - critical tasks which may not be easily achievable using 'classic' or manual approaches.

In considering appropriate use-cases, it is crucial to have a good understanding of what gen-AI is, and what it fundamentally can and cannot do. Firstly, gen-AI systems do not (and cannot) think or understand (at least, not in any useful conventional senses of the words), but merely take a probabilistic approach to generating a response to a prompt they have been given, based on the set of content (the large language model, or LLM) on which they have been trained. As such, they are engineered to have (only) the appearance of an entity which is (truly) intelligent and, by extension (at least for tools built in the way which is currently most common), do not have any real scope for creativity (in the sense of generating responses which are wholly distinct from their training material). Furthermore, many such tools are (unless explicitly instructed otherwise) usually poor at giving an honest response when asked to perform a task which is beyond their capabilities, instead (by design) generally just giving a response which is plausible, even if a correct response is not available or possible - essentially, the concept of 'hallucinations'.

That said, there are a number of areas where gen-AI applications are valuable, specifically where the task is (some or all of): (i) computationally or analytically intensive (i.e. where there are large amounts of data to consider) and/or highly repeatable; (ii) of a type such that there may be subjective elements to the analysis (as opposed to objective or deterministic tasks, where arguably a classic analytical approach (i.e. algorithmic, script- or code-based) is more appropriate); and (iii) of a type such that the output would be hard to achieve manually, but (perhaps most crucially) is easy to verify for accuracy. This type of appreciation is key to making a successful judgment of the types of task for which an application of gen-AI systems will be appropriate - typically, repeatable tasks in areas such as data analysis, pattern analysis, and extrapolation.

In this paper, I consider a number of such tasks which are applicable to the general area of analysis in brand monitoring programmes. As a general point, it is worth noting that many brand protection service providers have been citing the use of AI technologies for a number of years. In many cases, the associated functionality may be little more than deterministic (i.e. script- or algorithm-based) analysis or categorisation processes which, whilst technically applications of artificial intelligence (in that they involve systems making automated 'decisions'), would not be classed as AI according to the most common current interpretations of the term[1]. However, many providers are beginning to implement 'true' AI capabilities into their technologies. Examples across the set of product offerings from the Stobbs group of companies include computer vision capabilities (for logo and image analysis, and optical character recognition (OCR)), the ability to interrogate portals and databases using natural language, and predictive threat categorisation of potential infringements based on enforcement history. 

In contrast, this paper focuses on a distinct set of general brand-protection analysis task areas, and comprises an exploration of the extent to which elements of these tasks might be automatable through a gen-AI approach, using simple sets of prompts. The experiments utilise a proprietary business workflow automation system (the AI 'tool'), incorporating AI and natural-language capabilities to generate agentic functionality, with which interaction is possible in a 'chatbot' style. The tool makes use of API connections to multiple LLM providers, configured in such a way that no data from prompts or other inputs can be used for model training (so that data can be submitted and analysed on a secure and confidential basis), and the tool can also be utilised with local LLMs.

From this series of tests, the capabilities of gen-AI tools of this nature can be explored by considering the nature and accuracy of the responses.

Task types to be tested

'Clustering' of linked results from a large database of findings

This general problem, involving the establishment of links between related findings on the basis of shared associated characteristics, has a range of benefits (and is also the subject of a previous article[2] suggesting it as a candidate for AI applications). The benefits of clustering include the identification of priority targets associated with high-volume or serial infringers, the demonstration of bad-faith activity, the potential for efficient bulk takedowns, and the provision of data for building a fuller picture of an underlying entity (i.e. investigations). The ability to cluster is itself dependent on a number of 'sub-tasks', such as the ability to extract and analyse the relevant pieces of information and characteristics from potentially rich datasets of a range of types. Some examples of such tasks are discussed below.

Generalised 'scraping' functionality

'Scraping' refers to the process of extracting key pieces of information from a webpage, and is most usually applied to e-commerce marketplace listing pages, targeting features such as seller names, prices and quantities (typically used for the prioritisation of results, and for carrying out return-on-investment ('ROI')-style analyses). In the traditional approach, this process requires the building of site-specific scraper scripts, which are reliant on the format of the pages (and thus the locations on the page of the relevant pieces of information) being known in advance and fixed. Gen-AI offers the potential both to extract information from arbitrary webpages, and to extract it from rich data types (such as where a piece of contact information appears as a watermark in a product image).
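By way of illustration, a gen-AI scraping step of this kind might look something like the following minimal sketch. This is not the proprietary tool used in this study; it assumes the OpenAI Python client, and the model name, prompt wording and output keys are all illustrative:

    import json
    from openai import OpenAI

    client = OpenAI()  # assumes an OPENAI_API_KEY environment variable

    def extract_listing_features(page_html: str) -> dict:
        """Prompt the model to pull key features out of arbitrary listing HTML."""
        prompt = (
            "From the following e-commerce listing HTML, extract the seller name, "
            "product price and location of product origin. Reply with JSON only, "
            "using the keys: seller, price, origin.\n\n" + page_html
        )
        response = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative model choice
            messages=[{"role": "user", "content": prompt}],
        )
        # A production version would need to guard against non-JSON replies here
        return json.loads(response.choices[0].message.content)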

Tests: scraping of key features from e-commerce listings

(i) In an initial test relating to a potentially infringing product listing on Alibaba, the tool was able to correctly extract certain features (such as seller name and location of product origin), and also provide a high-level description of the product, together with an assessment of the potential for infringement (Figure 1), an area which is also relevant to the issue of categorisation of brand monitoring findings (see section below). 

Figure 1: Gen-AI assessment of a potentially infringing product on Alibaba

(ii) In a second test, the tool was provided with screenshots of two pages from e-commerce marketplaces (product listings / seller profiles), each of which shared a brand name (displayed within an image) in common (Figure 2), and the tool was able to correctly extract the name and establish its presence in both images (Figure 3), a key prerequisite for the ability to cluster. Note that the provision of the content as a screenshot may be necessary in cases where the tool is unable to access the material live via the provision of a URL (due, for example, to a requirement to be logged into the platform in question in order for the content to be visible), but does also guarantee that the tool is truly extracting the information from within the image itself, rather than relying on textual meta-data which may also be present on the page.

Figure 2: Examples of marketplace pages featuring content sharing a brand name in common

Figure 3: Gen-AI analysis of the content shown in Figure 2

(iii) In a third test, the tool was able to provide a product description from just a screenshot of a product image from an e-commerce listing on social media, and also to correctly extract the code from the image watermark (Figures 4 and 5). In a similar test, a product price was also successfully extracted from a product image watermark.

Figure 4: Image of a product listed for sale on social media

Figure 5: Gen-AI analysis of the content shown in Figure 4

Parsing of rich datasets featuring information in a range of different formats

Tests: analysis of domain name whois information

For this set of tests, the tool was provided with a spreadsheet containing just a list of domain names (pertaining to a fashion brand) and the series of raw whois records (comprising free-form data, providing ownership and other registration and configuration information) pertaining to these domains.

(i) The tool was able to successfully parse this raw dataset and extract key pieces of information for each domain (Figure 6), which can serve as the basis for further clustering, and (through the extraction of contact details pertaining to the relevant registrars) can assist with the analysis required for the subsequent sending of enforcement (takedown) notices, if appropriate. 

Figure 6: (Anonymised) output from the gen-AI analysis of a series of raw domain whois records
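As an indication of how a parsing step of this kind might be framed, the following minimal sketch (again assuming an OpenAI-style client, with illustrative model and field names, rather than the proprietary tool used in the study) prompts a model to convert one free-form whois record into fixed fields:

    import json
    from openai import OpenAI

    client = OpenAI()

    def parse_whois(domain: str, raw_record: str) -> dict:
        """Convert a free-form whois record into a fixed set of fields."""
        prompt = (
            f"Parse the raw whois record below for the domain {domain}. Reply "
            "with JSON only, using the keys: registrar, registrar_abuse_email, "
            "registrant_name, registrant_email, creation_date. Use null for "
            "any value not present in the record.\n\n" + raw_record
        )
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative
            messages=[{"role": "user", "content": prompt}],
        )
        return json.loads(reply.choices[0].message.content)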

(ii) As a follow-up task, the tool was also able to cross-reference the identified registrar contact e-mail addresses across the (external) official websites of the registrars in question, and extract additional / alternative contact addresses where present, in addition to providing a list of links from which the information was sourced (Figure 7).

Figure 7: Gen-AI summary of analysis of additional registrar contact e-mail addresses

(iii) Based on the dataset produced in part (i), the tool was also able to establish the existence of clusters of linked domains, based on the presence of details in common between the results, and automatically visualise this information in a range of different ways (Figures 8 and 9; a sketch of a simple deterministic version of this clustering step follows the figures). In some cases (e.g. simply the use of the same privacy-protection service provider), the characteristics used are likely not to be distinctive enough to assert the presence of a meaningful link between the results, but there was also some success in prompting the tool to make a distinction between genuine 'real-world' details and less diagnostic pieces of information, such as the names of privacy services or other fields indicating that the record had been redacted.

Figure 8: Gen-AI overview of the registration timeline for the domains

Figure 9: (Anonymised) gen-AI summary of the presence of domain 'clusters' within the dataset
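As referenced in part (iii) above, once the whois fields have been parsed, the clustering step itself can be sketched deterministically; in the minimal sketch below, the list of non-diagnostic values is illustrative only:

    from collections import defaultdict

    # Values too generic to indicate a real-world link (illustrative examples)
    NON_DIAGNOSTIC = {None, "", "REDACTED FOR PRIVACY", "Privacy service"}

    def cluster_domains(records: dict) -> dict:
        """Group domains sharing any diagnostic contact detail.
        `records` maps each domain name to its parsed whois fields."""
        clusters = defaultdict(list)
        for domain, fields in records.items():
            for key in ("registrant_name", "registrant_email"):
                value = fields.get(key)
                if value not in NON_DIAGNOSTIC:
                    clusters[(key, value)].append(domain)
        # Retain only details shared by two or more domains
        return {k: v for k, v in clusters.items() if len(v) > 1}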

Categorisation of brand monitoring results

The categorisation (into high-level content- or potential-infringement types) of brand-related webpages - as identified through a programme of monitoring - is a key component of many brand-protection technologies (with the aim of aiding review and prioritisation). In many cases, this categorisation is achieved through a 'hard-coded', rules-based analysis using pre-defined keywords, but there is clear potential for the application of an AI- or machine-learning-style approach to provide greater flexibility and efficiency.

Based on a simple initial prompt, the tool is able to offer a suitable framework and methodology for this type of categorisation process (Figure 10).

Figure 10: Gen-AI proposal for the categorisation process for brand monitoring results

Following a request for a more focused set of categories based on 'general Internet content' specifically (including a range of arbitrary types of brand references, which may or may not be 'correct' / 'legitimate', and where the nature and context of the way in which the brand is referenced is likely to be the factor of greater interest), the tool is again able to provide meaningful suggestions (Figure 11).

Figure 11: (Anonymised) gen-AI proposal for categories for 'general Internet content' results (for a specified financial services brand)

Based on a less granular suggested framework, the tool was successfully able to review a provided list of brand-related URLs and provide a categorisation summary in a tabular form, including a high-level content description and any other relevant notes (such as comments on assessed legitimacy or threat-level) in each case (Figure 12).

Figure 12: (Anonymised) gen-AI-based categorisation and description of a set of brand-related webpages pertaining to a financial services brand
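A categorisation prompt along these lines might be sketched as follows; the category set, column headings and model name are illustrative, and the client again stands in for the proprietary tool used in the study:

    from openai import OpenAI

    client = OpenAI()

    CATEGORIES = "official, fan content, resale, phishing, unrelated"  # illustrative

    def categorise_urls(brand: str, urls: list) -> str:
        """Request a one-row-per-URL categorisation table for a brand."""
        prompt = (
            f"For the brand '{brand}', assign each URL below to one of the "
            f"following categories: {CATEGORIES}. Reply as a table with the "
            "columns: URL | Category | Description | Notes (e.g. assessed "
            "legitimacy or threat level).\n\n" + "\n".join(urls)
        )
        reply = client.chat.completions.create(
            model="gpt-4o-mini",  # illustrative
            messages=[{"role": "user", "content": prompt}],
        )
        return reply.choices[0].message.content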

In a similar vein, some success was also noted in suggestions for categorisations based just on features of a set of brand-related domain names (and with no account taken of the content of any associated websites) (Figure 13). 

Figure 13: Gen-AI proposal for the categorisation process (by potential threat level) for domain names

Application of a gen-AI approach for this sort of task might be most appropriate in cases where (say) the domain monitoring has been configured to search for brand variants (such as typos), rather than just domains containing the exact spelling of the brand. In such cases, it may otherwise be more complex to configure a deterministic, script-based (i.e. 'classic') basis for analysing and ranking the domains[3], due to the potentially wide range of different ways in which the brand name (or the approximations to it) can be presented.
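For contrast, a very simple 'classic' scorer for near-exact matches might look like the sketch below, which uses Python's difflib to score the best-matching window of the second-level label against the brand; the complexity alluded to above arises when the range of variant presentations grows beyond what simple window-matching of this kind can capture:

    import difflib

    def brand_similarity(domain: str, brand: str) -> float:
        """Score how closely any window of the second-level label
        approximates the brand (1.0 = exact match somewhere in the label)."""
        label = domain.lower().split(".")[0]
        window = len(brand)
        return max(
            difflib.SequenceMatcher(None, label[i:i + window], brand).ratio()
            for i in range(max(1, len(label) - window + 1))
        )

    # e.g. brand_similarity("examp1e-store.com", "example") scores highly for
    # the embedded near-match, whereas unrelated labels score low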

Conclusions

Overall, the results from these initial tests are encouraging in terms of the potential for gen-AI to provide meaningful assistance with certain standard brand-protection analysis tasks (particularly the specific examples presented in this paper), to augment existing capabilities. Other tests have, however, shown varying degrees of success in terms of the accuracy and/or completeness of results (and the potential for 'hallucinatory' outputs). In some cases, these issues may be due to the required information being inaccessible to the tool, but in other cases it may be possible to make significant improvements through continued refinement of the prompts used. As optimal prompts are identified, the functionality of the tool utilised in this study allows the instructions for any given analysis type to be saved as pre-defined 'tasks' (i.e. agentic systems), which should provide a basis for greater repeatability and efficiency going forward.

References

[1] 'Patterns in Brand Monitoring' (D.N. Barnett, Business Expert Press, 2025), Chapter 14: 'New developments'

[2] https://circleid.com/posts/braive-new-world-part-1-brand-protection-clustering-as-a-candidate-task-for-the-application-of-ai-capabilities

[3] https://www.iamstobbs.com/insights/exploring-a-domain-scoring-system-with-tricky-brands

This article was first published as a white paper on 27 November 2025 at:

https://www.iamstobbs.com/uploads/general/BrAIve-New-World-Gen-AI-in-BP-analysis-e-book.pdf

Friday, 21 November 2025

AI and web search (Part 2) - Optimising for web search in a world of AI

Notes from the B2B Marketing Live UK event, 19-20 November 2025, ExCeL London

(original version 21-Nov-2025; updated 27-Nov-2025)

Following on from our recent work[1] considering the general subject of the potential impacts of artificial intelligence (AI) on web search, web traffic and trademarks, these notes from the recent B2B Marketing Live event present additional detail on current opinions relating to Generative Engine Optimisation ('GEO'); that is, the general suite of techniques which can be employed by brand owners with the aim of trying to ensure that their brand and website content are referenced as strongly as possible in results from AI-powered search.

Key points

The search landscape is fundamentally changing

The appearance of AI overviews (AIO) in search results has recently seen huge growth; AIO are now used by 1.5B users per month in over 200 countries, with anywhere between 18% and 47%[2] (estimates vary) of searches generating AIO, and increases of around 116% in the prevalence of AIO since May 2025[3]. Furthermore, AIO typically cover 70% of the SERP (search-engine results page) screen space. Complex, decision-related queries (such as those associated with B2B business) may also be more likely to generate AIO.

Accompanying these trends, although brand owners typically find that website impressions (i.e. appearances in organic search results) are up, there have been significant decreases in click-through rates (CTR) to websites - down by between 32% and 35% for the highest-ranked results[4,5], with the CTR for the top-ranked (organic) search-engine result having fallen from around 28% to 20%.

In addition, 'standalone' AI tools are also seeing large increases in usage, with many users increasingly not using 'classic' search tools at all, but instead relying on obtaining their answers 'in platform'. The most popular such tool, ChatGPT, now receives 5.24B visits per month. Furthermore, it is estimated that 58% of all consumers now rely on AI for product and service recommendations, and 89% of B2B buyers use gen-AI as part of their purchase decision-making process. Search engines are also beginning to introduce embedded functionality such as Google's 'AI Mode', analogous to a gen-AI tool (though with the Google example currently only seeing 1-2% adoption).

Marketing success can no longer be judged (just) by click-through rates (CTR)

The evolutions in the search landscape are driving an increased acceptance by brand owners of 'zero-click' behaviour and reduced website traffic. Conversely, however, the extent of citation by AI tools is becoming an ever-more important metric, and is itself associated with factors such as brand visibility, authority, customer engagement, and trust.

Also of key importance is the fact that inbound brand referrals from generative AI ('gen-AI') tools or AIO tend to be more likely to convert to successful sales than click-throughs from classic search (due to an implicit trust by users of AI sources), with brands typically seeing sales grow more quickly than their observed increases in AI-driven traffic[6]. Overall, brands cited in AIO receive 35% more organic clicks and 91% more paid clicks than those not cited. Furthermore, customers asking complex (i.e. later decision-stage) business queries via LLMs are also likely to be associated with higher conversion rates, because of their inherently greater degree of 'pre-qualification'; overall, it is noted that 61% of purchasers are influenced by AI responses. These observations are leading to the emergence of a dual objective for brands: (i) increasing AI traffic (i.e. GEO) and (ii) monetising more of this traffic (the easier of the two problems, given the greater inherent conversion rate of AI traffic).

Discoverability by humans and AI is key

It is becoming increasingly clear that brand and website content needs to be optimised for discoverability by both human searchers and AI systems (agents, LLMs, etc.). Whereas human users traditionally favour website clarity, usability and a clear value proposition, AI systems tend to favour a different set of characteristics when selecting the websites they use most frequently as trusted sources. These might typically include:

  • Data being presented in a structured format - including mark-up, schemas, APIs, etc., plus an emphasis on plain-text content (and reduced usage of features such as Javascript rendering), use of semantic HTML, XML site-maps, technical signposting for crawlers (e.g. allowing access in the robots.txt file), etc.
  • Indicators of trust - e.g. author mark-up, source references, authoritative in-links, etc. Trust and credibility for brands can be boosted through initiatives such as blogging, providing guest posts for external websites, and publishing content in other formats (such as on YouTube).
  • Sites incorporating extensive volumes of original information - including comprehensive FAQ sections, etc. There are also some suggestions that "best" or "top" lists ('listicles') are favoured by LLMs in some cases. Brands can also increase the likelihood of being referenced in AI responses by including 'actionable IP' (brand-specific) content on their websites, particularly where this addresses key questions asked by buyers in their decision-making process and provides sales-qualification information. The production of specific, textual content to address specific requirements for landing pages from pay-per-click (PPC) ads may also be beneficial.

Increasingly, brands are utilising segmented websites, incorporating both human-focused areas with extensive brand-heavy, visual and interactive ('conversion optimised') elements, and (often human-invisible) AI-favourable content, which is text-heavy and detail-, explanation- and answer-oriented. Traditional marketing techniques, such as offering downloadable information sheets (especially if 'gated' with a requirement for users to submit contact details) are becoming less popular, with users instead tending increasingly to source answers from (ungated) gen-AI tools.

Understanding how AI systems process queries is crucial - and highlights a continuing importance of classic SEO

When an AI tool is presented with a prompt or query, it typically splits the query into smaller sub-queries for individual processing ('query fan-out'). Frequently, however, these tools will utilise classic search engines to source responses to these sub-queries (e.g. Gemini uses Google, ChatGPT uses Bing), although some make use of proprietary bots. It is also noteworthy that Google remains the market leader for organic search by a significant margin (with AI engines still seeing a market share of less than 1%).

As such, traditional search-engine optimisation (SEO) techniques are still of relevance, and a high volume of branded web mentions is still found to be the factor with the highest degree of correlation with the extent of AI references[7]. Overall, "good digital marketing" - i.e. the aim of achieving an extensive online range of structured, contextual brand references which address EEAT (experience / expertise / authoritativeness / trustworthiness) criteria - still sits at the intersection of SEO and GEO.

Overall, however, having a brand presence across a wide range of channels and content types is key to being cited in AI responses. Key areas appear to be 'rich content' channels such as YouTube (particularly if titles, descriptions, etc. have been optimised for LLM readability) and user-generated-content or community channels, such as review sites. These types of insights can be drawn by analysing which sites are most frequently cited in AI responses, and this analysis consistently reveals that platforms such as Reddit are favoured by LLMs. A diversification in platform focus for branded content is also particularly important in an era where many (particularly younger) users are becoming increasingly focused on 'social search' - i.e. utilising the native search functions in platforms such as TikTok, Pinterest, Reddit and Instagram in order to source answers to queries. Around one-quarter of users discover brands for the first time on social platforms, and many users utilise search across a range of platform types before making a final purchase decision. The aim for brand owners is therefore to optimise their own content across the same set(s) of platforms as are being used by their customer base, an initiative which is doubly beneficial given the increasing frequency with which classic engines such as Google are themselves also returning results from these types of platform. A key idea is the concept of the 'Day One' list, reflecting the fact that the most effective brand awareness for users is that generated by the results from the initial sets of searches carried out by them.

A key associated measurement factor when considering the likelihood of citation in AI responses is 'share of model'. One recent analysis[8,9] suggests that the top 50 most highly-cited domains account for 48% of AI citations, but with the remainder of citations making reference to a large dataset of other sources. These sit within a range of website categories (including technology media, product vendor websites, educational resources, and consultancy providers), with the specific sources utilised being dependent on query type (e.g. commercial vs. informational vs. transactional intent), highlighting the importance of niche expertise (as well as more classic indicators of 'authority') in increasing the attractiveness of web content to AI tools.

It may also be informative for brands to track their presence and sentiment in LLMs by posing direct queries to the associated tools ('AI model reporting'), which can include specific questions relating to areas known to influence customer preference, such as price, customer service, product features, ease of use, etc. A useful follow-up to this type of analysis can also be for brands to post their own mediative content online, to address any issues identified.

However, branded content must be authentic and non-generic in order to build credibility (both with human users and with AI crawlers). Google also increasingly rewards material with 'personal' and 'expert' content. As such, online placement by brands of authored content can be beneficial, but should ideally also be accompanied by positive references from brand advocates, influencers and employees - noting that the set of employees of a company typically together have a following around twelve times the size of the company's official profile page.

AI responses themselves are becoming increasingly commercialisable

Some AI tools (including AIO / Google AI Mode) are starting to offer the option for paid-ad placement within them - in many cases, this is currently only available in the US, but is likely to expand in scope. Initial use-cases are likely to be focused towards e-commerce, as certain categories of products and services (such as pharmaceuticals and gambling) may be considered 'restricted verticals'. Increasingly, we are also likely to see more flexibility in targeting options for these ads (such as Google's 'Broad Match', 'AI Max' and 'Performance Max'), relating to the ways in which they are to be served up in response to particular query types (rather than being keyword-based). 

Care must, however, be taken by brand owners with AI ad placement, as brand references in AI responses are less amenable to control over the context in which the brand is mentioned, which could be problematic if (for example) there are regulatory requirements regarding the way in which the brand must be promoted, or concerns about being referenced alongside competitors.

Additionally, some AI providers may be reluctant to offer paid boosting of brands, due to implications regarding trust, especially that of paying customers. OpenAI's Sam Altman recently stated that "ads on a Google search are dependent on Google doing badly; if [Google] were giving you the best answer, there'd be no reason ever to buy an ad above it"[10].

AI can also be leveraged to itself optimise brand content

Marketing success in the world of GEO is dependent on having brand content structured and presented in appropriate ways, with one estimate claiming that 90% of branded content will be synthetic by 2026. A final point to note is that AI can itself assist with many of these areas, including:

  • Segmenting content by relevance to specific audience demographics - e.g. tailoring to local language (noting that 76% of B2B buyers prefer to purchase in their native language) or adapting websites or paid-ads to local audiences - though these types of initiative invariably also require an element of human QA. Increasingly, users expect content to be highly personalised, rather than being tailored to broader market segments.
  • Offering capabilities for AI agents to carry out highly personalised tasks (such as brand audits, or ROI or pricing calculators).

  • Optimising paid-media (pay-per click links / sponsored ads), through prediction of queries and keywordless targeting.

References

Event presentations

  • 'Five tactics for driving more leads in an AI-powered global search landscape', C. McKenna and S. Oakford, Oban International
  • 'AI traffic is money traffic', J. Kelleher, SpotDev
  • 'Navigating the era of AI search', B. Wood, Hallam (hallam[.]agency)
  • 'The impact of AI on B2B marketing and five expectations for 2026', G. Stolton, ROAST
  • 'The future of organic search and SEO', J. Powley, Blue Array SEO
  • 'Decoding the future: AI and the impact on B2B marketing', A. Moon, FutureEdge Academy

Other

[1] https://www.iamstobbs.com/insights/ais-potential-impact-on-web-search-traffic-and-trademarks

[2] https://www.bcg.com/x/the-multiplier/the-future-of-discoverability

[3] https://ahrefs.com/blog/ai-overview-growth/

[4] https://ahrefs.com/blog/ai-overviews-reduce-clicks/

[5] https://ahrefs.com/blog/the-great-decoupling/

[6] N. Patel, NP Marketing (see e.g. https://www.instagram.com/p/DQz_jczEjlI/)

[7] https://www.semrush.com/blog/ai-mentions/

[8] https://wellows.com/insights/chatgpt-citations-report/

[9] N. Khan, Wellows, pers. comm. (https://www.linkedin.com/in/khan-neelam/)

[10] https://www.techinasia.com/sam-altmans-lesson-google-trust-ads

This article was first published on 21 November 2025 at:

https://www.linkedin.com/pulse/ai-web-search-part-2-optimising-world-david-barnett-g7eme/

Thursday, 20 November 2025

XARF-way To (Enforcement) Paradise

The process of enforcement is a key element of many brand-protection programmes. Often it involves the submission of some sort of complaint, report, or notice to a platform or service provider requesting the removal or takedown of infringing content. However, this process can be extremely time-consuming, particularly in cases where there are large numbers of enforcements to be processed, or if the specifics of each case require significant amounts of customisation of the individual notices in question. 

As part of a drive towards greater efficiency, it has often been found to be beneficial to investigate approaches such as automation or bulk actions, often in conjunction with efforts to identify top prioritised targets. For 'standard' takedowns involving the submission of a notice to a registrar or hosting provider, it is often possible to make use of a fixed type of letter template, in which only case-specific details need to be varied between each particular instance. This type of enforcement process is highly amenable to automation - using some sort of scripting approach, to action what is essentially a 'mail-merge' process - which can greatly aid with efficiency.
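A minimal sketch of such a mail-merge step, using only the Python standard library (the template wording, CSV columns and file name are illustrative):

    import csv
    from string import Template

    NOTICE = Template(
        "Dear $provider,\n\nWe write on behalf of $brand regarding infringing "
        "content hosted at $url, and request its removal under your abuse policy.\n"
    )

    # targets.csv columns (illustrative): provider, brand, url, abuse_email
    with open("targets.csv", newline="") as f:
        for row in csv.DictReader(f):
            notice_text = NOTICE.substitute(row)  # one customised notice per case
            print(row["abuse_email"], "\n", notice_text)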

Increasingly, however, many Internet service providers (ISPs) are refusing to accept letter-based complaints and are instead directing brand owners or their representatives towards webform-based abuse-reporting systems. Whilst superficially this may appear efficient, the completion of these webforms can actually be a more time-consuming endeavour. Many such reporting systems require complaints to be submitted one by one, or include components such as a need to complete CAPTCHA codes, which can frustrate enforcement efforts.

However, one promising development which may aid the process of complaint submission and the protection of IP rights is an increasing level of support for 'XARF' ('eXtended Abuse Reporting Format'), an alternative abuse-reporting framework comprising a standardised format for submitting abuse reports to ISPs[1]. It allows the relevant pieces of information to be packaged in a strictly structured document format known as JSON (JavaScript Object Notation), which can be communicated via e-mail or other routes (such as an API, or Application Programming Interface). XARF documents are particularly amenable to automated production, such that the protocol also lends itself to more efficient enforcement workflows. Furthermore, certain ISPs explicitly reference XARF as a supported and/or preferred communication type (Figure 1).

Figure 1: Example of a snippet from a response to a standard takedown notice type, from an ISP referencing XARF as a supported communication type for submitting abuse reports

A number of code libraries exist for generating reports in XARF format and creating JSON documents, and these resources frequently include sample templates showing the fields required for abuse reports of a range of types (e.g. trademark infringement, copyright infringement, phishing, malware, child-abuse content, etc.)[2,3,4], which greatly aids with the generation of the required content. 
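As a rough indication of the shape of such a report, the sketch below builds a trademark-infringement report as a Python dictionary and serialises it to JSON. The field names are modelled loosely on the sample templates in the XARF repository[2,3] and should be checked against the published schema; all values shown are placeholders:

    import json
    from datetime import datetime, timezone

    report = {
        "Version": "3",  # illustrative; see the XARF schema for current values
        "ReporterInfo": {
            "ReporterOrg": "Example Brand Protection Ltd",      # placeholder
            "ReporterOrgEmail": "enforcement@example.com",      # placeholder
        },
        "Report": {
            "ReportClass": "Content",
            "ReportType": "Trademark",
            "Date": datetime.now(timezone.utc).isoformat(),
            "SourceUrl": "https://infringing.example/listing",  # placeholder
            "TrademarkOwner": "Example Brand Ltd",              # placeholder
        },
    }

    print(json.dumps(report, indent=2))  # body ready for e-mail or API submission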

As such, it can be a relatively simple matter to produce the required content using a scripted automation approach which allows the reports to be created in bulk. These can be combined with bulk-e-mailer scripts to allow the bulk submission of XARF-format takedown notices to ISPs, utilising just an input document containing the details of the targets and the case-specific pieces of information to be varied between the individual reports (Figure 2).

Figure 2: (Redacted) example of an e-mail-based enforcement notice in XARF format, generated using an automated script
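The bulk-submission step might then be sketched as follows, again using only the Python standard library; the SMTP host, sender address, file names and CSV columns are placeholders, and each XARF JSON file is assumed to have been pre-generated along the lines of the previous sketch:

    import csv
    import pathlib
    import smtplib
    from email.message import EmailMessage

    # targets.csv columns (illustrative): report_file, abuse_email
    with open("targets.csv", newline="") as f:
        with smtplib.SMTP("smtp.example.com") as smtp:  # placeholder host
            for row in csv.DictReader(f):
                msg = EmailMessage()
                msg["From"] = "enforcement@example.com"  # placeholder sender
                msg["To"] = row["abuse_email"]
                msg["Subject"] = "XARF abuse report: trademark infringement"
                msg.set_content(pathlib.Path(row["report_file"]).read_text())
                smtp.send_message(msg)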

This approach is already being tested using live client services, and is generating responses from ISPs which are essentially identical to those produced by 'classic', letter-based submissions. Whilst this alternative approach is not likely to be applicable in all cases - particularly where highly-bespoke, complex notices are required - it does offer the potential for increased automation and efficiency in the brand-protection process, in cases where repeatable styles of notice for standard infringement types would normally be required.

References

[1] https://abusix.com/xarf-abuse-reporting-standard/

[2] https://github.com/abusix/xarf/blob/master/samples/positive/3/trademark_sample.json

[3] https://github.com/abusix/xarf/blob/master/schemas/3/trademark.schema.json

[4] https://www.w3schools.com/python/python_json.asp

This article was first published on 20 November 2025 at:

https://www.iamstobbs.com/insights/xarf-way-to-enforcement-paradise

Thursday, 23 October 2025

Playing with a simple revisitor script for monitoring changes to website content

Introduction

A key part of the analysis workflow in brand monitoring services is often the maintenance of a 'watchlist' of sites. This requirement arises most frequently in services comprising domain monitoring, which detect newly-registered names containing a brand name of interest, but which may not yet feature significant or infringing content.

In these cases, enforcement action may not immediately be possible or appropriate, but there might be a concern that higher-threat content may appear in the future. There is often therefore a need to monitor the domains for changes to their content and provide an alert when a significant change is identified. At that point, a decision can then be made regarding appropriate follow-up action. Requirements for 'revisitor' functionality along these lines can also arise in other brand-protection contexts, such as when enforcement action has already been taken against an infringing target (such as a website or marketplace listing), and the targeted page is then tracked to verify compliance with the takedown action.

There exist a number of automated tools which track content in this way, but key components of a highly effective version include the ability to analyse an appropriate set of different characteristics of the websites in question, and options to set the sensitivity appropriately - it is not generally desirable (for example) for an alert to be generated every time any change to website content is identified, since many websites incorporate dynamic features which differ every time the webpage is called. Conversely, sometimes a change which is only small, or of a particular type (e.g. the appearance of an explicit brand reference) can be significant.

In this article, I briefly explore the development and use of a Python-based revisitor script to inspect and then subsequently review a set of domain names of potential interest (using data from a domain monitoring service for a retail brand, as a case study). Having a simple, easily deployed script of this nature can be advantageous, in terms of being quick and efficient to roll out, and being fully customisable regarding the specific website characteristics analysed and the sensitivity thresholds to be used. Tools of this type can be highly useful in the case of watchlists featuring many hundreds or thousands of URLs to be reviewed, and can, of course, also be expanded to cover other website features and more complex types of site analysis.

Script specifics

The workflow is built on the basis of a 'site visitor' script, which inspects each of the domains in the watchlist, and extracts the following features (which are 'dumped' to a readable database file; a minimal sketch of this step follows the list):

  • HTTP status[1] - a numerical code corresponding to the type of response received when the domain name is queried; a code of '200' indicates a live website response (i.e. potentially an active webpage)
  • Page title[2] (as defined in the HTML source code of the page)
  • Full webpage content[3] (all text, plus formatting features and other content such as embedded scripts - i.e. the full HTML content)
  • Presence / absence of each of a set of pre-defined keywords[4] - applicable keywords for analysis might typically include brand terms or other relevance keywords (e.g. for a retail brand, terms indicating that e-commerce content is present ('buy', 'shop', 'cart', etc.))
  • Final URL[5] - i.e. the destination URL (e.g. after following any site re-direct)
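A minimal sketch of the visitor step, using the library calls given in the references below (with an illustrative keyword list and watchlist, JSON standing in for the 'database' file, and error handling for failed connections[6] omitted for brevity):

    import json
    import re
    import urllib.request
    from bs4 import BeautifulSoup

    KEYWORDS = ["buy", "shop", "cart"]  # illustrative relevance terms

    def inspect(domain: str) -> dict:
        """Fetch one watchlist domain and extract the features listed above."""
        resp = urllib.request.urlopen("http://" + domain, timeout=30)
        html = resp.read().decode("utf-8", errors="replace")
        soup = BeautifulSoup(html, "html.parser")
        return {
            "status": resp.status,
            "title": soup.title.text if soup.title else None,
            "content": html,
            "keywords": {k: bool(re.search(k, html, re.I)) for k in KEYWORDS},
            "final_url": resp.url,
        }

    # 'Dump' a snapshot of each domain to a readable database file
    watchlist = ["example.com"]  # placeholder
    with open("snapshots.json", "w") as f:
        json.dump({d: inspect(d) for d in watchlist}, f, indent=2)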

The basic function of the revisitor is then to inspect the same list of sites at subsequent times, as required (or on a regular basis, if configured to run accordingly), extract the same set of features, and compare these with the corresponding features for the same site from the previous round of analysis (as read from the database file). In an initial simple implementation of the script, the following are deemed to be significant changes (i.e. denoting that the site is now worthy of further (manual) inspection and consideration for follow-up action; a sketch of the comparison logic follows the list):

  • A change to an HTTP status of 200 (i.e. the appearance of a live website response)[6]
  • Any change to the page title
  • Any(*) change to the webpage content
  • Any instance of the appearance of a keyword of interest (where not previously present)
  • Any change to the final URL (e.g. the appearance or disappearance of a re-direct)
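A sketch of the corresponding comparison logic, applying the rules above to two successive snapshots of the same site (in the format produced by the visitor sketch above):

    def significant_change(old: dict, new: dict) -> bool:
        """Return True if any of the alerting rules listed above is met."""
        if new["status"] == 200 and old["status"] != 200:
            return True  # appearance of a live website response
        if new["title"] != old["title"]:
            return True  # any change to the page title
        if new["content"] != old["content"]:
            return True  # any(*) change to the full HTML content
        if any(new["keywords"].get(k) and not old["keywords"].get(k)
               for k in new["keywords"]):
            return True  # a keyword of interest has newly appeared
        return new["final_url"] != old["final_url"]  # re-direct change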

Of course, none of these changes guarantees that the website is now definitively of concern or infringing, but the process does generate a 'shortlist' of sites generally then requiring manual review for a definitive determination of appropriate next steps (much more efficiently than having to review the whole watchlist manually on a regular basis).

Considering content-change thresholds

As discussed above, one of the trickiest features is the determination of an appropriate 'threshold' for alerting to changes to webpage content. The simplest configuration is to trigger a notification for any change(*), but in some cases this option may turn out to be too 'sensitive', and might generate too many candidate sites for convenient further manual review (depending on the size of the watchlist and the interval between successive inspections).

As a further exploration, it is instructive to investigate a numerical basis for quantifying degrees of webpage change, and what these differing degrees 'look like' in practice. There are a number of potential algorithms for quantifying the degree of difference between two passages of text (as discussed, for example, in previous work on mark comparison[7]); however, the simple script discussed in this article employs the Python library module difflib.SequenceMatcher[8], applied to the full HTML of the page (split across spaces into individual 'words'), to calculate a difference score. This simple score is based on the ratio of the number of 'similar matches' (i.e. words in common) between the two versions of the page in question to the total number of elements (words). Furthermore, the script has been configured to also provide a more granular view of the exact nature of the change, comprising a summary of which elements (i.e. words in the HTML) have been removed from the (HTML of the) page between the two successive inspections, and which have been added (Figure 1); a sketch of this calculation follows the figure.

(a)

(b)

Figure 1: Examples / illustrations of identified content changes for specific individual webpages between successive inspections: 

  • a) a change to a single dynamically generated string (in this case, Javascript elements) 
  • b) a change from showing an error message to featuring distinct (Javascript) content
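One plausible reading of the scoring and summary logic described above, as a minimal sketch:

    import difflib

    def page_diff(old_html: str, new_html: str):
        """Compare two versions of a page, split across spaces into 'words'."""
        old_words, new_words = old_html.split(), new_html.split()
        sm = difflib.SequenceMatcher(None, old_words, new_words)
        score = 1.0 - sm.ratio()  # 0.0 = identical, 1.0 = wholly different
        removed, added = [], []
        for op, i1, i2, j1, j2 in sm.get_opcodes():
            if op in ("replace", "delete"):
                removed.extend(old_words[i1:i2])  # words no longer present
            if op in ("replace", "insert"):
                added.extend(new_words[j1:j2])    # words newly present
        return score, removed, added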

Discussion and Conclusions

The examples in Figure 1 provide some initial illustration that the nature of the identified changes is potentially much more important in any determination of significance than (for example) a numerical quantification of the extent of the change (as a proportion of the website as a whole). The first example (i.e. 'a') - a change to a dynamically generated string - is potentially something which might be seen on every occasion the site is inspected and might not correspond to any material change to the page (the visible site content may be entirely unaffected, for example). Conversely, the second example ('b'), representing a change from a simple error message (which, in this case, comprised essentially the content of the website in its entirety) to the appearance of some sort of live, script-generated content (potentially wholly different website content), might be much more significant.

However, these differences may not be apparent from an inspection of just the numerical 'size' of the change on the page (i.e. the 'difference score'); a variation in a piece of scripted content (such as in Figure 1a) might, for example, just pertain to a small element on a much larger page, or could constitute the dominant component of the webpage as a whole. For example, in a sample dataset, examples of single changes similar to that shown in Figure 1a were found to be equivalent - across the examples of websites in the dataset - to anywhere between less than 5% and more than 50% of the whole content of the website in question.

For these reasons, there is always some danger in specifying a specific threshold below which degrees of change to the page are disregarded. In some senses, it is safer to conduct a more detailed inspection of all pages which show any change in content between successive revisits, so as to avoid missing significant cases. However, depending on the numbers of sites under review, this may not be feasible. Accordingly, in future developments or more sophisticated versions of the script, it may be appropriate to refine the scoring algorithm to reflect the nature and/or content of any change. 

Regardless of the specifics, however, the approach discussed in this article is generally able to build efficiency into the review process for sites of possible future concern, potentially filtering large numbers of sites down into much smaller 'shortlists' of candidates identified for deeper inspection and analysis on any given occasion.

References

[1] Using Python library module: urllib.request.urlopen([URL]).status

[2] Using Python library module: bs4.BeautifulSoup([HTML], 'html.parser').title.text (applied to the fetched page content)

[3] Using Python library module: urllib.request.urlopen([URL]).read()

[4] Using Regex matching (Python library module: re.search) as applied to the full webpage (HTML) content

[5] Using Python library module: urllib.request.urlopen([URL]).url

[6] However, care must also be taken to distinguish a 'real' change in site status from an 'apparent' change, which can arise in instances where (for example) the connection speed to the site is slow, and a connectivity time-out may be mistaken for a real case of site inactivity.

[7] https://www.linkedin.com/posts/dnbarnett2001_measuring-the-similarity-of-marks-activity-7331669662260224000-rh-R/

[8] https://www.geeksforgeeks.org/python/compare-sequences-in-python-using-dfflib-module/

This article was first published on 23 October 2025 at:

https://www.iamstobbs.com/insights/playing-with-a-simple-revisitor-script-for-monitoring-changes-to-website-content

Friday, 10 October 2025

How the growth of AI may drive a fundamental step-change in the domain name landscape

by David Barnett and Lars Jensen (ShortDot)

Introduction

The rate of adoption of artificial intelligence (AI) systems over the last few years, particularly in online and technology-related contexts, has been striking. Automated web-based queries now account for over half of all traffic (51% as of 2024)[1], and nearly three-quarters (74%) of webpages now include some AI-generated content[2]. Overall, traffic generated by AI technologies grew by over 500% in the five months to May 2025[3], and a 2025 study of 3,000 websites found that 63% of them already receive traffic from AI-generated referrals[4]. Looking forward, it is predicted that, by 2028, AI-powered search and recommendation engines will drive more web traffic than traditional search[5].

Looking more generally at the landscape, it is estimated by Gartner and other sources that, by 2026 or 2028, 20% of online transactions will be carried out by AI agents[6,7,8,9]. Furthermore, by the end of 2026, 40% of enterprise applications may be integrated with task-specific AI agents, potentially generating 30% of enterprise application software revenue by 2035[10]. Additionally, by 2030, there may be in the region of 500 billion to 1 trillion connected devices comprising the wider ecosystem of the 'Internet of Things' (IoT)[11,12,13], and (in the absence of mediating factors[14]) this will almost invariably result in enormous growth in the proportion of DNS traffic categorised as 'machine-to-machine' communication.

It is likely that a significant proportion of these connected entities will require unique DNS identifiers, and many industry commentators are increasingly of the opinion that there will be a desire for many of these - particularly agentic AI systems - to be associated with unique domain names[15]. These names could serve as a 'birth certificate' or 'trusted identity' for the systems in question, helping to establish user confidence and familiarity. Any evolution along these lines would have an enormous impact on the overall size of the domain landscape (currently around 350 million names), and it may not be unreasonable to suggest that, by 2050, there may be of the order of 10 to 50 billion registered domains. This propounded evolution of the landscape echoes previous studies suggesting that, in the future, the growth of agentic AI will demand a new layer of verifiable identity infrastructure[16] and that it may be desirable for each distinct AI agent to be tied to an 'immutable root' (i.e. identifier)[17]. This trend would be in some ways analogous to the transition from the IPv4 to the IPv6 system for allocating IP addresses, which created a step-change in capacity from 2^32 (around 4 billion) to 2^128 (around 3.4 × 10^38) possible combinations.

Of course, the shape of the AI-related domain name landscape is already changing. Registrations of .ai domains (for example) have spiked massively since the launch of ChatGPT (notably also driving a significant boost to the revenues of parent country Anguilla)[18]. Across the full domain name landscape more generally, there are many tens of thousands of examples featuring keywords pertaining to popular and emerging technologies ('ai', 'crypto', etc.), and this demand is only likely to grow. Such trends may emerge in parallel with the forthcoming second phase of the new-gTLD (generic top-level domain) programme, which might see a push towards the availability of much larger numbers of new brand-, industry- or technology-specific domain-name extensions. Other possible evolutions in business behaviour - such as a move towards technology entrepreneurs taking advantage of greater opportunities for AI use and automation, so as to establish and run much larger numbers of businesses - may also drive increased demand in the domain-name landscape.

These comments must also be set against the fact that the current domain landscape is already - in some regards - beginning to run low on capacity. Whilst the total proportion of all possible domain names which are actually registered is still vanishingly small, there is a relative shortage of short, memorable domain names (particularly those comprising dictionary terms) across popular domain name extensions (TLDs). For example, there are currently essentially no .com domains of 4 characters or fewer available for registration, and very few (short) dictionary terms[19]. These observations are already generating a push towards the use of alternative domain name styles and emerging TLDs, in addition to distinct channels altogether (such as blockchain domains and the Web3 environment)[20].

In terms of the overall landscape of web addresses associated with (agentic) AI systems specifically, what might these trends look like? Two possible directions for development include: (a) the emergence and growth of dedicated domain names for specific AI agents (potentially of the form, for example, [role]AI.[TLD]), with the name signifying the function of the system in question; or (b) the increasing use of AI-specific subdomains (say, AI.[site].[TLD]) within the trusted webspaces (i.e. hosted on the primary domain names) of popular companies, to host agentic systems or other AI functionality. Companies are likely to continue predominantly using popular legacy TLDs such as .com for the foreseeable future, but - as part of these evolving trends - may start to branch out into other existing TLDs, or into new extensions emerging from phase two of the new-gTLD programme. Exactly which extensions succeed will ultimately depend on issues of usability and trust (rather than simply on whether they comprise an AI-specific label).

Case studies - the current landscape

As illustrations of the current state of the landscape pertaining to the two specific possibilities discussed above, we consider two datasets, as outlined below.

1. Agentic-AI-style domain names

For this analysis, we consider a list of 100 keywords relating to professions or industry areas (with a specific focus, where possible, on examples where AI applications may be relevant). For each of these, we consider whether a domain name consisting of the keyword, either prefixed or suffixed by the string 'ai', is registered, across each of the top-50 largest existing gTLDs (by size of the domain name zone file, i.e. the data file containing the names and configuration information of all registered domains). Therefore, for 'accountant' (for example), on .com, the analysis looks to determine whether accountantai[.]com or aiaccountant[.]com are registered as domain names. This methodology thereby yields 200 possible (or 'candidate') domain names for consideration, across each of the 50 TLDs, or 10,000 candidate domain names in total.
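A minimal sketch of this candidate-generation process is shown below. Note that the article's analysis determined registration status from TLD zone files; the sketch instead uses a DNS lookup as a rough (and less authoritative) proxy, since a registered but unconfigured domain will not resolve. The keyword and TLD lists shown are illustrative subsets, not the full datasets used in the analysis.

```python
import socket

keywords = ["accountant", "analyst", "doctor"]  # illustrative subset of the 100 keywords
tlds = ["com", "net", "org", "xyz", "app"]      # illustrative subset of the 50 gTLDs

def candidates(keyword: str, tld: str) -> list[str]:
    # The two candidate formats per keyword/TLD pair:
    # '[keyword]ai.[tld]' and 'ai[keyword].[tld]'
    return [f"{keyword}ai.{tld}", f"ai{keyword}.{tld}"]

def appears_registered(domain: str) -> bool:
    # DNS resolution as a rough proxy for registration status;
    # zone-file data (as used in the article) is authoritative
    try:
        socket.getaddrinfo(domain, None)
        return True
    except socket.gaierror:
        return False

for keyword in keywords:
    for tld in tlds:
        for domain in candidates(keyword, tld):
            status = "resolves (likely registered)" if appears_registered(domain) else "no DNS entry"
            print(f"{domain}: {status}")
```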

The analysis shows that, of the 10,000 possible domain names of this format, 2,053 (20.5%, or just over one in five) are already registered. A more granular breakdown is presented in Figure 1, which shows a 'registration map' of the names which are already registered (in red) versus those which are absent from the zone file, and therefore potentially unregistered and available (in green).

Figure 1: 'Registration map' for 'agentic-AI-style' domain names (red = registered, green = absent from zone file), where the second-level name (SLD) (i.e. the part of the domain name to the left of the dot) is shown on the vertical axis and the TLD (domain name extension) is shown on the horizontal axis. The dataset is sorted by (vertically, decreasing from top to bottom) the number of TLDs (out of 50) across which the SLD exists as a registered domain, and (horizontally, decreasing from left to right) the total number of SLDs (out of 200) which exist as registered domains across the TLD in question. Results are shown for the top 50 most commonly registered SLDs.

The top five most commonly registered SLD strings in the dataset are aiagent (with 'agent' likely referring to its technical, AI-related definition in most cases), agentai, aiart, aimusic, and aimarketing, existing as registered domains across 47, 41, 38, 38, and 36 (respectively) of the 50 TLDs considered in the analysis. Only three of the 200 strings do not appear as the SLDs of registered domain names across any of the 50 TLDs.

The top TLD in the dataset is .com (for which 197 of the 200 considered strings exist as the SLDs of registered domains), followed by .net (144), .org (139), .xyz (134), and .app (107). Only one TLD of the 50 (.ovh) does not feature any of the considered SLD strings as registered domains.

Examples of registered .com domains which also resolve to live website content are shown in Figure 2. Many of the remainder resolve to lower-threat content such as placeholder and parking pages, suggesting perhaps that they have been proactively registered for future intended use, or are being held as tradable commodities in their own right, given the potential use-cases for these types of name. aiagent[.]com (for example) resolves to a page offering the domain name for sale and requesting offers in excess of $1.5 million, and aibanking[.]com, aibarrister[.]com, aicontroller[.]com, aidesigner[.]com, and aiinvestment[.]com are all explicitly soliciting offers in excess of $100k.

Figure 2: Examples of 'agentic-AI-style' .com domain names resolving to live website content: aiaccountant[.]com, aianalyst[.]com, aidoctor[.]com, aiparalegal[.]com, aiphotographer[.]com, aireceptionist[.]com

2. AI-specific subdomains

The second piece of analysis considers the prevalence of AI-related subdomains (taking the specific example of hostnames of the form AI.[site].[TLD]) across a series of the most popular (i.e. highest-traffic) websites on the Internet. In particular, we consider the 47 most highly visited websites overall, derived from Similarweb and Semrush data[21] (truncated from a top-50 list, considering only entries comprising full second-level domain names), and a dataset of the top 20 information technology (IT) company websites (according to Semrush[22]) - i.e. one example of an industry vertical where AI may be particularly relevant (noting that two domains, live.com and office.com, appear in both lists).

The analysis shows that a hostname of the form AI.[site].[TLD] resolves (i.e. is configured with an active DNS entry) for 20 of the top 47 websites globally (43%, with 19 of these also generating a live HTTP (i.e. website) response) (Figure 3), and for 8 of the top 20 IT websites (40%, with 6 also showing a live HTTP response). This does not, of course, preclude the existence of other AI-specific areas of these websites using alternative naming conventions, so these figures represent very much a lower limit on the proportion of sites already featuring dedicated AI-related sections.

Figure 3: Examples of AI-specific subdomains (of the form AI.[site].[TLD]) on domains within the top-50 list of most popular websites: ai.google.com (redirects to ai.google), ai.facebook.com (redirects to ai.meta.com), ai.baidu.com, ai.microsoft.com (redirects to microsoft.com/en-us/ai)
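A minimal sketch of the two-step check described above (first an active DNS entry, then a live HTTP response) might look as follows. The site list is an illustrative subset drawn from Figure 3; urllib follows redirects by default, so the final URL captures cases such as ai.facebook.com redirecting to ai.meta.com.

```python
import socket
import urllib.request

sites = ["google.com", "facebook.com", "baidu.com", "microsoft.com"]  # subset from Figure 3

for site in sites:
    hostname = f"ai.{site}"
    # Step 1: does the hostname have an active DNS entry?
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        print(f"{hostname}: no DNS entry")
        continue
    # Step 2: does it also generate a live HTTP (i.e. website) response?
    try:
        with urllib.request.urlopen(f"https://{hostname}", timeout=30) as resp:
            print(f"{hostname}: HTTP {resp.status}, final URL {resp.url}")
    except Exception:
        print(f"{hostname}: DNS entry present, but no live HTTP response")
```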

Discussion and conclusions

Many of the points discussed in this article are reminiscent of terminology used in the futurology study 'From Malthus to Mars'[23]; the work describes certain emerging capabilities as '10x technologies', referring to their capacity to be ten times more effective than their predecessors and to expand accessibility to a far wider audience. Some of the predictions referenced in this article are even more significant, potentially pushing growth 'from 10x to 100x' - representing a step-change in capabilities, with the power to drive fundamental evolutions of the online landscape.

As AI continues to evolve in an ever-more-interconnected online ecosystem, it is likely that domain names will remain a foundational component of the overall landscape, comprising a permanent, trusted layer which is able to give every connected entity a unique identifier.

Some of these trends are already being observed, even across the existing legacy infrastructure, with significant growth in the numbers of registered domains with specific relevant name structures and/or containing relevant keywords. It will be interesting to see how near-future developments, such as the forthcoming second phase of the new-gTLD programme, the inevitable continued growth and evolution of AI technologies, the increasing interconnectedness of online channels, and the ongoing emergence of new AI use-cases and other areas of online technology, will contribute to this overall picture.

References

[1] https://www.imperva.com/blog/2025-imperva-bad-bot-report-how-ai-is-supercharging-the-bot-threat/

[2] https://ahrefs.com/blog/what-percentage-of-new-content-is-ai-generated/

[3] https://searchengineland.com/ai-traffic-up-seo-rewritten-459954

[4] https://ahrefs.com/blog/ai-traffic-study/

[5] https://www.semrush.com/blog/ai-search-seo-traffic-study/

[6] https://www.linkedin.com/pulse/2026-one-five-retail-transactions-completed-ai-agent-question-amit-6wl1e/

[7] https://onereach.ai/blog/agentic-ai-adoption-rates-roi-market-trends/

[8] https://www.gartner.com/en/documents/6894066

[9] https://www.pymnts.com/artificial-intelligence-2/2024/ai-to-power-personalized-shopping-experiences-in-2025/

[10] https://www.gartner.com/en/newsroom/press-releases/2025-08-26-gartner-predicts-40-percent-of-enterprise-apps-will-feature-task-specific-ai-agents-by-2026-up-from-less-than-5-percent-in-2025

[11] https://www.cisco.com/c/dam/global/fr_fr/solutions/data-center-virtualization/big-data/solution-cisco-sas-edge-to-entreprise-iot.pdf

[12] https://pmc.ncbi.nlm.nih.gov/articles/PMC11085491/

[13] N. Quadar, A. Chehri, G. Jeon, M.M. Hassan, G. Fortino (2022). Cybersecurity Issues of IoT in Ambient Intelligence (AmI) Environment. IEEE Internet Things Mag., 5, pp. 140-145. doi: 10.1109/IOTM.001.2200009.

[14] https://pdfs.semanticscholar.org/f6fb/3f56f29f23cb8724fce2a7667f08e1641eb4.pdf

[15] For example, from Domain Summit Europe 2025:

[16] https://www.kuppingercole.com/watch/future-of-identity

[17] 'A Novel Zero-Trust Identity Framework for Agentic AI: Decentralized Authentication and Fine-Grained Access Control'; https://arxiv.org/html/2505.19301v2

[18] https://www.imf.org/en/News/Articles/2024/05/15/cf-an-ai-powered-boost-to-anguillas-revenues

[19] 'Patterns in Brand Monitoring' (D.N. Barnett, Business Expert Press, 2025), Chapter 9: 'Domain landscape analysis'

[20] 'Patterns in Brand Monitoring' (D.N. Barnett, Business Expert Press, 2025), Chapter 13: 'Analyzing trends in Web3'

[21] https://en.wikipedia.org/wiki/List_of_most-visited_websites

[22] https://www.semrush.com/website/top/global/information-technology/

[23] https://frommalthustomars.com/

This article was first published on 9 October 2025 at:

https://circleid.com/posts/how-the-growth-of-ai-may-drive-a-fundamental-step-change-in-the-domain-name-landscape
