Thursday, 27 November 2025

Br'AI've New World: Experimenting with the use of generative AI for brand protection analysis tasks

EXECUTIVE SUMMARY

Generative artificial intelligence ('gen-AI') tools offer the potential to assist with, and add efficiencies to, a number of analysis tasks, especially where these tasks meet certain criteria. These criteria typically include: (i) being computationally or analytically intensive (i.e. where there are large amounts of data to consider) and/or highly repeatable; (ii) of a type such that there may be subjective elements to the analysis (e.g. where the data to be analysed can be presented in a variety of different ways); and (iii) of a type where the analysis would be hard to achieve manually, but the output is easy to verify for accuracy.

Our latest study explores the application of gen-AI in a range of general areas related to the analysis of data pertaining to brand monitoring services. The experiments make use of a proprietary business workflow automation system (the AI 'tool'), incorporating AI and natural language capabilities to generate agentic functionality, with which interaction is possible in a 'chatbot' style.

The study considers two broad areas. The first relates to the sets of tasks required to carry out 'clustering' analysis (i.e. the establishment of links between related findings, on the basis of shared associated characteristics, which is advantageous for identifying priority targets associated with high-volume or serial infringers, demonstrating bad-faith activity by bad actors, allowing efficient bulk takedowns, and providing data for entity investigations). The specific tasks considered in this area fall into the categories of generalised 'scraping' functionality (i.e. the extraction of data points from arbitrary webpages) and the parsing (i.e. interpretation) of free-form data, such as that provided in domain name whois records. The second area relates to the high-level categorisation of results (i.e. potentially infringing webpages or other content) identified through brand monitoring services.

The analysis shows that gen-AI does have significant potential in assisting with these broad task areas, although the specific use-cases must be carefully selected to ensure that the tools in question are a good fit to the tasks and the potential for undesirable outputs such as 'hallucinations' can be minimised. The specific AI tool used in the analysis is particularly applicable, as it incorporates functionality to save the necessary prompts as pre-defined 'tasks' (i.e. the creation of agentic systems), which should provide a basis for greater repeatability and efficiency going forward.

This article was first published on 27 November 2025 at:

https://www.iamstobbs.com/insights/braive-new-world-experimenting-with-the-use-of-generative-ai-for-brand-protection-analysis-tasks

* * * * *

WHITE PAPER

Introduction

Generative artificial intelligence ('gen-AI') technology has undoubtedly been one of the hot topics of the last couple of years, and the Internet is awash with discussions regarding potential use-cases for gen-AI, and with concerns over its ubiquity and the risks this presents. Whilst many of the cited uses for generative AI are (arguably) questionable at best, there is certainly potential for thoughtful and measured application of AI capabilities to make possible - and build efficiencies into - critical tasks which may not be easily achievable using 'classic' or manual approaches.

In considering appropriate use-cases, it is crucial to have a good understanding of what gen-AI is, and what it fundamentally can and cannot do. Firstly, gen-AI systems do not (and cannot) think or understand (at least, not in any useful conventional senses of the words), but merely take a probabilistic approach to generating a response to a prompt they have been given, based on a large language model (LLM) built from the set of content on which they have been trained. As such, they are engineered to have (only) the appearance of an entity which is (truly) intelligent and, by extension (at least for tools built in the way which is currently most common), do not have any real scope for creativity (in the sense of generating responses which are wholly distinct from their training material). Furthermore, many such tools are (unless explicitly instructed otherwise) usually poor at giving an honest response when requested to perform a task which is beyond their capabilities, instead (by design) generally just giving a response which is plausible, even if a correct response is not available or possible - essentially, the concept of 'hallucinations'.

That said, there are a number of areas where gen-AI applications are valuable, specifically where the task is (some or all of): (i) computationally or analytically intensive (i.e. where there are large amounts of data to consider) and/or highly repeatable; (ii) of a type such that there may be subjective elements to the analysis (as opposed to objective or deterministic tasks, where arguably a classic analytical approach (i.e. algorithmic, script- or code-based) is more appropriate); and (iii) of a type such that the output would be hard to achieve manually, but (perhaps most crucially) is easy to verify for accuracy. This type of appreciation is key to making a successful judgment of the types of task for which an application of gen-AI systems will be appropriate - typically, these are repeatable tasks in areas such as data analysis, pattern analysis, and extrapolation.

In this paper, I consider a number of such tasks which are applicable to the general area of analysis in brand monitoring programmes. As a general point, it is worth noting that many brand protection service providers have been citing the use of AI technologies for a number of years. In many cases, the associated functionality may be little more than deterministic (i.e. script- or algorithm-based) analysis or categorisation processes which, whilst technically applications of artificial intelligence (in that they involve systems making automated 'decisions'), would not be classed as AI according to the most common current interpretations of the term[1]. However, many providers are beginning to implement 'true' AI capabilities into their technologies. Examples across the set of product offerings from the Stobbs group of companies include computer vision capabilities (for logo and image analysis, and optical character recognition (OCR)), the ability to interrogate portals and databases using natural language, and predictive threat categorisation of potential infringements based on enforcement history. 

By contrast, this paper focuses on some other distinct general brand protection analysis task areas, and explores the extent to which elements of these tasks might be automatable through a gen-AI approach, using simple sets of prompts. The experiments utilise a proprietary business workflow automation system (the AI 'tool'), incorporating AI and natural language capabilities to generate agentic functionality, with which interaction is possible in a 'chatbot' style. The tool makes use of API connections to multiple LLM providers, configured in such a way that no data from prompts or other inputs can be used for model training, so that data can be submitted and analysed on a secure and confidential basis; the tool can also be utilised with local LLMs.

From this series of tests, the capabilities of gen-AI tools of this nature can be explored by considering the nature and accuracy of the responses.

Task types to be tested

'Clustering' of linked results from a large database of findings

This general problem, involving the establishment of links between related findings on the basis of shared associated characteristics, is the subject of a previous article[2] suggesting it as a candidate for AI applications. The benefits of clustering include the identification of priority targets associated with high-volume or serial infringers, the demonstration of bad-faith activity, the potential for efficient bulk takedowns, and the provision of data for building a fuller picture of an underlying entity (i.e. investigations). The ability to cluster is itself dependent on a number of 'sub-tasks', such as the ability to extract and analyse the relevant pieces of information and characteristics from potentially rich datasets of a range of types. Some examples of such tasks are discussed below.

Generalised 'scraping' functionality

'Scraping' refers to the process of extracting key pieces of information from a webpage, and is most usually applied to features such as seller names, prices, quantities, etc. (typically used for the prioritisation of results, and for carrying out return-on-investment ('ROI')-style analyses) from e-commerce marketplace listing pages. In the traditional approach, this process requires the building of site-specific scraper scripts, which are reliant on the format of the pages (and thus the locations on the page of the relevant pieces of information) being known in advance and fixed. Gen-AI offers the potential both to extract information from arbitrary webpages, and to extract it from rich data types (such as where a piece of contact information may appear as a watermark in a product image). 
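To make the limitation of the traditional approach concrete, the following is a minimal sketch of a site-specific scraper, written against a purely hypothetical page layout (the class names 'seller-name' and 'item-price' and the sample markup are assumptions for illustration, not taken from any real marketplace or from the tool used in this study). It extracts the fields correctly while the layout matches, and silently returns nothing once the layout changes.

```python
from html.parser import HTMLParser


class ListingScraper(HTMLParser):
    """Collects text from elements whose class attribute matches a known field."""

    # Hypothetical class names for one imagined marketplace layout.
    FIELD_CLASSES = {"seller-name": "seller", "item-price": "price"}

    def __init__(self):
        super().__init__()
        self.fields = {}
        self._current = None  # field expected from the next text node, if any

    def handle_starttag(self, tag, attrs):
        css_class = dict(attrs).get("class")
        self._current = self.FIELD_CLASSES.get(css_class)

    def handle_data(self, data):
        if self._current and data.strip():
            self.fields[self._current] = data.strip()
            self._current = None

    def handle_endtag(self, tag):
        self._current = None


def scrape_listing(html):
    """Return whichever known fields can be found in the page markup."""
    parser = ListingScraper()
    parser.feed(html)
    return parser.fields


# Invented sample markup: the same listing before and after a redesign.
known_layout = (
    '<div><span class="seller-name">Acme Trading Co</span>'
    '<span class="item-price">12.50</span></div>'
)
changed_layout = (
    '<div><p class="vendor">Acme Trading Co</p>'
    '<p class="cost">12.50</p></div>'
)

print(scrape_listing(known_layout))    # both fields found
print(scrape_listing(changed_layout))  # nothing found: the layout has changed
```

A scraper of this kind also has no access to information embedded in images, such as the watermarks considered in the tests below.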

Tests: scraping of key features from e-commerce listings

(i) In an initial test relating to a potentially infringing product listing on Alibaba, the tool was able to correctly extract certain features (such as seller name and location of product origin), and also provide a high-level description of the product, together with an assessment of the potential for infringement (Figure 1), an area which is also relevant to the issue of categorisation of brand monitoring findings (see section below). 

Figure 1: Gen-AI assessment of a potentially infringing product on Alibaba

(ii) In a second test, the tool was provided with screenshots of two pages from e-commerce marketplaces (product listings / seller profiles) which shared a brand name (displayed within an image) in common (Figure 2). The tool was able to correctly extract the name and establish its presence in both images (Figure 3), a key prerequisite for the ability to cluster. Note that the provision of the content as a screenshot may be necessary in cases where the tool is unable to access the material live via the provision of a URL (due, for example, to a requirement to be logged into the platform in question in order for the content to be visible). It also guarantees that the tool is truly extracting the information from within the image itself, rather than relying on textual meta-data which may also be present on the page.

Figure 2: Examples of marketplace pages featuring content sharing a brand name in common

Figure 3: Gen-AI analysis of the content shown in Figure 2

(iii) In a third test, the tool was also able to provide a product description from just a screenshot of a product image from an e-commerce listing on social media, and also correctly extract the code from the image watermark (Figures 4 and 5). In another similar test, a product price was also successfully extracted from a product image watermark.

Figure 4: Image of a product listed for sale on social media

Figure 5: Gen-AI analysis of the content shown in Figure 4

Parsing of rich datasets featuring information in a range of different formats

Tests: analysis of domain name whois information

For this set of tests, the tool was provided with a spreadsheet containing just a list of domain names (pertaining to a fashion brand) and the series of raw whois records (comprising free-form data, providing ownership and other registration and configuration information) pertaining to these domains.
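As a point of comparison, the following is a minimal sketch of a 'classic' deterministic parser for raw whois text, assuming simple 'Label: value' lines. The field aliases and both sample records are invented for illustration and do not come from the dataset used in these tests. The sketch copes with a few known label variants, but any field presented in an unanticipated form is simply missed, which is the gap a gen-AI approach might help to fill.

```python
import re

# Hypothetical label variants; real whois records use many more.
FIELD_ALIASES = {
    "registrar": ["registrar", "sponsoring registrar"],
    "created": ["creation date", "created", "registered on"],
    "registrant_email": ["registrant email", "registrant e-mail"],
}


def parse_whois(raw):
    """Return the first value found for each known field in a raw whois record."""
    # Map every lower-cased label in the record to the first value seen for it.
    seen = {}
    for line in raw.splitlines():
        match = re.match(r"\s*([^:]+?)\s*:\s*(.+?)\s*$", line)
        if match:
            seen.setdefault(match.group(1).lower(), match.group(2))

    result = {}
    for field, labels in FIELD_ALIASES.items():
        for label in labels:
            if label in seen:
                result[field] = seen[label]
                break
    return result


# Invented sample records in two different layouts.
record_a = """Domain Name: example-brand-shop.com
Registrar: Example Registrar Ltd
Creation Date: 2024-03-01T10:00:00Z
Registrant Email: owner@example.org
"""

record_b = """domain: example-brand-outlet.net
Registered on: 01-Mar-2024
Sponsoring Registrar: Another Registrar Inc
Registrant contact is withheld for privacy
"""

print(parse_whois(record_a))
print(parse_whois(record_b))  # no e-mail field: the last line is free text
```

Note that the two creation dates are returned in different formats; normalising them would need further rules, each of which would have to be anticipated in advance.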

(i) The tool was able to successfully parse this raw dataset and extract key pieces of information for each domain (Figure 6), which can serve as the basis for further clustering, and (through the extraction of contact details pertaining to the relevant registrars) can assist with the analysis required for the subsequent sending of enforcement (takedown) notices, if appropriate. 

Figure 6: (Anonymised) output from the gen-AI analysis of a series of raw domain whois records

(ii) As a follow-up task, the tool was also able to cross-reference the identified registrar contact e-mail addresses across the (external) official websites of the registrars in question, and extract additional / alternative contact addresses where present, in addition to providing a list of links from which the information was sourced (Figure 7).

Figure 7: Gen-AI summary of analysis of additional registrar contact e-mail addresses

(iii) Based on the dataset produced in part (i), the tool was also able to establish the existence of clusters of linked domains, based on the presence of details in common between the results, and automatically visualise this information in a range of different ways (Figures 8 and 9). In some cases (e.g. where the only shared detail is the use of the same privacy protection service provider), the characteristics used are unlikely to be distinctive enough to assert the presence of a meaningful link between the results, but there was also some success in prompting the tool to make a distinction between genuine 'real-world' details and less diagnostic pieces of information such as the names of privacy services or other fields indicating that the record had been redacted.

Figure 8: Gen-AI overview of the registration timeline for the domains

Figure 9: (Anonymised) gen-AI summary of the presence of domain 'clusters' within the dataset
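The clustering logic described in part (iii) can be sketched deterministically as follows. This is an illustrative assumption about how such linking might be implemented, not a description of how the tool works internally: the field names, the list of 'non-diagnostic' markers and the sample records are all invented. Domains are linked whenever they share a diagnostic value, and values that look like privacy-service or redaction placeholders are ignored.

```python
from collections import defaultdict

# Hypothetical markers of redacted or privacy-protected values.
NON_DIAGNOSTIC = ("privacy", "redacted", "withheld", "not disclosed")


def is_diagnostic(value):
    """True if the value is non-empty and does not look like a placeholder."""
    text = value.strip().lower()
    return bool(text) and not any(marker in text for marker in NON_DIAGNOSTIC)


def cluster_domains(records, fields=("registrant_email", "registrant_name")):
    """Group domains that share at least one diagnostic value in the given fields."""
    parent = {domain: domain for domain in records}

    def find(domain):
        # Union-find lookup with path halving.
        while parent[domain] != domain:
            parent[domain] = parent[parent[domain]]
            domain = parent[domain]
        return domain

    first_seen = {}  # (field, normalised value) -> first domain carrying it
    for domain, details in records.items():
        for field in fields:
            value = details.get(field, "")
            if not is_diagnostic(value):
                continue
            key = (field, value.strip().lower())
            if key in first_seen:
                parent[find(domain)] = find(first_seen[key])
            else:
                first_seen[key] = domain

    clusters = defaultdict(list)
    for domain in records:
        clusters[find(domain)].append(domain)
    # Report only groups of two or more linked domains.
    return sorted(sorted(group) for group in clusters.values() if len(group) > 1)


# Invented sample data.
records = {
    "brand-shop.example": {"registrant_email": "a@example.org",
                           "registrant_name": "Redacted for Privacy"},
    "brand-outlet.example": {"registrant_email": "A@example.org",
                             "registrant_name": "J Smith"},
    "brand-sale.example": {"registrant_email": "b@example.org",
                           "registrant_name": "j smith"},
    "brand-deals.example": {"registrant_email": "c@example.org",
                            "registrant_name": "Redacted for Privacy"},
}

print(cluster_domains(records))
```

In this sample, the first three domains form one cluster through a shared e-mail address and a shared registrant name, while the fourth is left unlinked because its only shared value is a redaction placeholder. A fixed marker list of this kind is brittle, which is where prompting a gen-AI tool to judge which details are genuinely diagnostic may add value.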

Categorisation of brand monitoring results

The categorisation (into high-level content types or potential infringement types) of brand-related webpages - as identified through a programme of monitoring - is a key component of many brand protection technologies (with the aim of aiding with review and prioritisation). In many cases, this categorisation is achieved through a 'hard-coded' rules-based analysis using pre-defined keywords, but there is clear potential for the application of an AI- or machine-learning-style approach to provide greater flexibility and efficiency.
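For illustration, a 'hard-coded' rules-based categoriser of the kind referred to above might look like the following minimal sketch. The categories, keywords and sample snippets are hypothetical and are not drawn from any particular brand protection product.

```python
# Hypothetical categories and keywords, checked in priority order.
RULES = [
    ("possible counterfeit", ["replica", "cheap", "wholesale"]),
    ("possible phishing", ["login", "verify your account", "password"]),
    ("complaint or review", ["scam", "review", "complaint"]),
]


def categorise(text):
    """Return the first category whose keywords appear in the text."""
    lowered = text.lower()
    for category, keywords in RULES:
        if any(keyword in lowered for keyword in keywords):
            return category
    return "uncategorised"


# Invented page snippets for a fictional brand.
print(categorise("Cheap replica ExampleBrand handbags"))
print(categorise("ExampleBrand: please verify your account"))
print(categorise("High-quality imitations of ExampleBrand bags at 1:1 grade"))
```

The third snippet describes the same kind of content as the first, but falls through to 'uncategorised' because none of the pre-defined keywords is present. This inflexibility to paraphrase is the limitation that a gen-AI or machine-learning approach might address.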

Based on a simple initial prompt, the tool is able to offer a suitable framework and methodology for this type of categorisation process (Figure 10).

Figure 10: Gen-AI proposal for the categorisation process for brand monitoring results

Following a request for a more focused set of categories based on 'general Internet content' specifically (including a range of arbitrary types of brand references, which may or may not be 'correct' / 'legitimate', and where the nature and context of the way in which the brand is referenced is likely to be the factor of greater interest), the tool is again able to provide meaningful suggestions (Figure 11).

Figure 11: (Anonymised) gen-AI proposal for categories for 'general Internet content' results (for a specified financial services brand)

Based on a less granular suggested framework, the tool was successfully able to review a provided list of brand-related URLs and provide a categorisation summary in a tabular form, including a high-level content description and any other relevant notes (such as comments on assessed legitimacy or threat-level) in each case (Figure 12).

Figure 12: (Anonymised) gen-AI-based categorisation and description of a set of brand-related webpages pertaining to a financial services brand

In a similar vein, some success was also noted in suggestions for categorisations based just on features of a set of brand-related domain names (and with no account taken of the content of any associated websites) (Figure 13). 

Figure 13: Gen-AI proposal for the categorisation process (by potential threat level) for domain names

Application of a gen-AI approach for this sort of task might be most appropriate in cases where (say) the domain monitoring has been configured to search for brand variants (such as typos), rather than just domains containing the exact spelling of the brand. In such cases, it may otherwise be more complex to configure a deterministic, script-based (i.e. 'classic') basis for analysing and ranking the domains[3], due to the potentially wide range of different ways in which the brand name (or the approximations to it) can be presented.
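As a rough indication of what a deterministic basis for handling brand variants involves, the sketch below scores a domain by the smallest Levenshtein (edit) distance between the brand and any similarly sized window of the domain label. This is an assumption made for illustration only and is not the scoring system described in the referenced article[3]; the brand and domain names are invented.

```python
def edit_distance(a, b):
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution or match
        previous = current
    return previous[-1]


def brand_distance(domain, brand):
    """Smallest edit distance between the brand and any window of the first label."""
    label = domain.lower().split(".")[0]
    best = len(brand)
    # Try windows one character shorter, equal to, and longer than the brand.
    for size in (len(brand) - 1, len(brand), len(brand) + 1):
        for start in range(max(len(label) - size, 0) + 1):
            best = min(best, edit_distance(label[start:start + size], brand))
    return best


# Invented examples for a fictional brand.
for domain in ("examplebrand-shop.com", "examplbrand.net", "weather-news.org"):
    print(domain, brand_distance(domain, "examplebrand"))
```

Even this simple measure needs further rules to handle homoglyphs, inserted separators, or brands that are also common words, which illustrates why a deterministic configuration can become complex for variant-based monitoring.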

Conclusions

Overall, the results from these initial tests are encouraging, in terms of the potential for gen-AI to provide meaningful assistance with certain standard brand protection analysis tasks (particularly for the specific examples presented in this paper), to augment existing capabilities. Other tests have, however, shown varying degrees of success, in terms of the accuracy and/or completeness of results (and the potential for 'hallucinatory' results). In some cases, these issues may be due to the required information being inaccessible to the tool, but in other cases it may be possible to make significant improvements through continued refinement of the prompts used. As optimal prompts are identified, the functionality of the tool utilised in this study allows for the instructions for any given analysis type to be saved as pre-defined 'tasks' (i.e. agentic systems), which should provide a basis for greater repeatability and efficiency going forward.

References

[1] 'Patterns in Brand Monitoring' (D.N. Barnett, Business Expert Press, 2025), Chapter 14: 'New developments'

[2] https://circleid.com/posts/braive-new-world-part-1-brand-protection-clustering-as-a-candidate-task-for-the-application-of-ai-capabilities

[3] https://www.iamstobbs.com/insights/exploring-a-domain-scoring-system-with-tricky-brands

This article was first published as a white paper on 27 November 2025 at:

https://www.iamstobbs.com/uploads/general/BrAIve-New-World-Gen-AI-in-BP-analysis-e-book.pdf
