Friday, 10 February 2023

Assessing and mediating the digital risk landscape for a brand

Introduction

For modern businesses, an Internet presence is a key part of day-to-day operations, both in terms of their own corporate infrastructure and their interaction with customers. However, the ease and low cost of access to online content, combined with the ubiquity of Internet use, means that online channels present a very attractive environment for bad actors to abuse trusted brands for their own gain. Traditional areas of infringement - including counterfeiting, online fraud (such as phishing) and piracy - continue to remain popular, in addition to other types of content which can be damaging to company revenue, reputation, or customer experience[1]. These might typically include categories such as traffic misdirection, false affiliation, potential brand confusion, or negative comment and activism[2]. Additionally, new types of content (such as 'Web3' areas including NFTs and blockchain domains[3]), and new online channels, are continually emerging. 

Other kinds of digital risk are also of concern, including the growing prevalence of malware and ransomware, DNS attacks and distributed denial-of-service (DDoS) attacks, and other direct attacks against company or employee infrastructure (such as social engineering and business e-mail compromise (BEC) attacks). Frequently, these areas are also linked, with (for example) phishing increasingly recognised as the most common attack vector for malware distribution[4].

This threat landscape produces an environment in which the protection of company brands is ever more important. A robust cybersecurity posture needs to encompass both protection of official corporate portfolio domains (i.e. domain-name management and security) and tackling of third-party activity on the open Internet ('outside the firewall') (i.e. 'classic' brand protection, which itself needs to consist of monitoring for threatening content and enforcement against infringements). Only by having both elements working successfully together can an organisation be confident that it is adequately protected from cybersecurity risks[5].

Before an effective ongoing programme can be put in place, it is essential to conduct an assessment of the pre-existing landscape, to determine the main issues and areas of risk, and work out where remediating action will be required going forward. Additionally, it is often necessary to attempt to numerically quantify the scale of the issue, so that determination can be made of the likely return on investment of a cybersecurity programme, and thereby justify the spend to management.

Assessing the landscape

i. The corporate domain name portfolio

The set of domain names owned by a company will typically include the 'core' or 'critical' domains used in the day-to-day execution of business (such as those providing infrastructure for their websites and e-mail) and a broader set of 'tactical' domains, held to prevent them from being acquired by third parties, or intended for future use relating to new brand or product launches, or geographical expansion. 

When considering the protection of corporate domains, there are a number of security products available, particularly where the domain-name management is overseen by an enterprise-class registrar. These products are designed to address a range of security issues, such as the risks of DDoS attacks (addressed by DNS hosting redundancy), hacking and site re-direction (by DNSSEC[6]), spam, e-mail spoofing, and phishing (by SPF[7], DMARC[8] and DKIM[9]), unauthorised DNS changes and domain hijacking (by MultiLock), and use of unauthorised certificates (mediated by the use of CAA[10] records). Implementing all of these measures for all corporate domains would generally be prohibitively costly, and therefore it is important to be able to identify the most business-critical domains, where the greatest levels of protection are required. An effective enterprise-class registrar will typically have technology to assist brand owners with making these determinations.

It is also generally advisable for brand owners to review their portfolio of tactical domains and determine where additional registrations may be required. Again, registrars geared towards the provision of services for large corporations will generally be able to assist with this determination. Typically, this assessment will include a domain availability analysis, where domain names consisting of key brand terms (particularly the brand name itself) as the second-level domain name (SLD)[11], across the full set of possible domain-name extensions (top-level domains, or TLDs), are investigated to see whether they are currently owned by the brand owner, are owned by third parties, or are available for registration. Where domain names are owned by third parties, it may be appropriate to launch enforcement or acquisition actions, depending on a number of factors (such as whether the domain name constitutes an infringement of intellectual property, or whether the brand owner would like to take ownership of the domain), or to monitor the domain name for future changes to site content or configuration features. These might include the presence of (for example) MX[12] records, which can indicate that the domain is being 'weaponised' for use of its e-mail functionality.

When considering which available domains may be desirable to register, there are several points to bear in mind. The most relevant domain names are likely to be those where the SLD consists just of the brand name itself. In general, it will be advisable to register these domains across all common TLDs and those corresponding to countries where the company has current or planned business operations. However, previous studies have established that there are also specific ('high-threat') TLDs which tend to be generally popular with fraudsters and infringers[13] (for example, those where the domain-name providers offer low- or no-cost domain registrations or have lax registration security policies[14]), so it may also be advantageous to secure key domain names across these extensions to prevent them being utilised elsewhere. Outside of these areas, domain names including high-relevance keywords, or typo / character-substitution variants commonly used by bad actors, may also be worth securing. For example, accented or non-Latin characters which appear visually similar to ordinary Latin characters, or replaced characters which are adjacent on standard keyboard layouts, are commonly used by infringers to create typo domain names which are deceptive or designed to collect misdirected web traffic[15,16]. However, a defensive policy can only take a brand owner so far; in practice, the number of variations of official corporate domain names which can be registered by bad actors is effectively unlimited, so management of a comprehensive corporate domain portfolio should always be accompanied by a robust brand-protection solution to monitor potentially harmful third-party activity.
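As a rough illustration of the typo and character-substitution variants described above, the following Python sketch generates candidate SLDs by omitting, transposing, and replacing characters. The function name `typo_variants` and the small keyboard-adjacency and lookalike tables are illustrative assumptions only; they are far from a complete model of the techniques used by bad actors.

```python
# Illustrative, hand-picked tables (assumptions, not exhaustive).
ADJACENT_KEYS = {"a": "qsz", "e": "wrd", "o": "ipl", "m": "n", "n": "bm"}
LOOKALIKES = {
    "o": ["0"],
    "l": ["1"],
    "a": ["\u00e0", "\u0430"],  # Latin a-grave and Cyrillic small a
}

def typo_variants(sld: str) -> set[str]:
    """Return candidate typo variants of a second-level domain name."""
    variants = set()
    for i, ch in enumerate(sld):
        # Missing character
        variants.add(sld[:i] + sld[i + 1:])
        # Transposed adjacent characters
        if i < len(sld) - 1:
            variants.add(sld[:i] + sld[i + 1] + sld[i] + sld[i + 2:])
        # Replacement by a keyboard-adjacent or visually similar character
        for sub in list(ADJACENT_KEYS.get(ch, "")) + LOOKALIKES.get(ch, []):
            variants.add(sld[:i] + sub + sld[i + 1:])
    # Drop the original name and any empty string
    variants.discard(sld)
    variants.discard("")
    return variants

if __name__ == "__main__":
    for candidate in sorted(typo_variants("brand")):
        print(candidate)
```

A list generated in this way could feed either a defensive-registration review or a monitoring watch-list; which variants merit registration remains a judgement call for the brand owner.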

ii. The Internet landscape

The first step in an online brand protection programme is often a one-off landscape audit, to determine the scope and scale of the issues and identify which Internet channels present the greatest problems. Some of this assessment can be carried out using simple manual searches - for example, it may be beneficial (for brand owners associated with the manufacture of physical goods which may be subject to counterfeiting and other types of e-commerce infringements) to carry out a 'marketplace sweep'. In its simplest form, this involves carrying out searches for the brand name across a range of popular e-commerce marketplaces, to determine how many results are returned. Using assumptions about the proportions of listings which are typically infringing on each site (potentially combined with additional filtering if, for example, the brand name is a relatively generic term in its own right), it is possible to make a high-level estimation of the total number and value of infringing listings on each site, which can be used as an input for a potential return-on-investment calculation[17].
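The arithmetic behind a marketplace sweep of this kind can be sketched in a few lines of Python. All of the figures below (result counts, infringing proportions, and average prices) are hypothetical placeholders rather than real data, and the function name `estimate_infringing_value` is an assumption introduced purely for illustration.

```python
def estimate_infringing_value(results: int, infringing_share: float,
                              avg_price: float) -> tuple[float, float]:
    """Return (estimated infringing listings, estimated total listed value)."""
    listings = results * infringing_share
    return listings, listings * avg_price

# Hypothetical inputs for two unnamed marketplaces (not real data).
sweep = {
    "marketplace_a": {"results": 2000, "infringing_share": 0.10, "avg_price": 25.0},
    "marketplace_b": {"results": 500, "infringing_share": 0.40, "avg_price": 15.0},
}

if __name__ == "__main__":
    for site, params in sweep.items():
        listings, value = estimate_infringing_value(**params)
        print(f"{site}: ~{listings:.0f} listings, ~{value:,.0f} in listed value")
```

The output is only as good as the assumed infringing proportions, which is why such a sweep is best treated as a high-level estimate ahead of a fuller monitoring scan.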

However, in practice, a full assessment of the infringement landscape can be carried out only by a comprehensive brand-monitoring scan. Brand-protection service providers will typically use monitoring technology to carry out searches, which may incorporate keyword-based and image-search or matching components, often with elements of artificial intelligence or machine learning, to automatically categorise and prioritise results. In many cases, this is then accompanied by manual analysis, to exclude false positives and extract the most significant findings. 

When carrying out any kind of online brand monitoring project, there are a number of factors to consider:

  • It is generally advisable to approach the problem holistically and cover as many channels as possible, typically incorporating (where appropriate) general Internet content (including branded domain names), social media, e-commerce platforms, mobile apps, and so on. This is important both because these areas are essentially just different environments where the same types of infringement can occur, and also because these channels are becoming increasingly interlinked, and the distinctions between them increasingly blurred.
  • Monitoring should make use of multiple data sources, in order to ensure that coverage is as comprehensive as possible. Typically these might include:
    • Internet metasearching and web crawling - This involves the submission of brand- and industry-related terms to search engines, analysing the pages returned, and crawling hyperlinks. Whilst it is not possible to identify all potentially threatening content via this route, it does address the areas of the Internet which have highest 'visibility' and are most likely to be encountered by general web users.
    • Domain-name zone-file analysis - Many brand-protection service providers will have access to zone files, which are datafiles published by the operators of registry organisations (the entities overseeing the infrastructure of specific domain-name extensions, or TLDs), and contain lists of all registered domain names across the TLD in question. By downloading and analysing these zone-files on a regular (typically daily) basis, and comparing current with previous versions, it is possible to identify the registration of new domains with names containing strings of interest (such as a brand name). This technique can yield comprehensive and timely detection of relevant branded domain names across TLDs where zone files are available; however, for some domain extensions (particularly country-specific TLDs), registries are not obliged to publish the zone files, so the information may be incomplete or unavailable. Some of these gaps can be filled in using other techniques, such as parallel look-ups (checking for the existence of 'cousin' versions of detected domains (i.e. those with the same SLD) across other domain extensions), or searches across the full set of TLDs for the existence of domains with specific name-strings (SLDs) of interest. The most sophisticated domain-monitoring solutions are also able to intelligently search for domains containing brand variants (such as fuzzy matches - including missing, additional, replaced, or transposed characters - or soundalike versions)[18].
    • Direct site querying - For sites known to be of interest in advance of the monitoring (such as specific social-media platforms, e-commerce marketplaces, mobile app stores, etc.), it may be possible to search the sites directly using their own in-built search functionality or (if available) via an API, which provides relevant data in a structured, database-like format.
    • Phishing detection techniques - Some of the most egregious infringements (such as fake sites soliciting for the input of customer credentials) may not be accessible through any of the above monitoring techniques (for example, if the infringing site does not feature the brand name in the domain name, and is not linked-to from other sites which are indexed by search engines), and may be designed to receive traffic only (for example) via links in specifically-constructed e-mails. In these cases, it may be appropriate to augment 'classic' brand-monitoring methodologies with other techniques, such as the use of spam traps and honeypots, and other data feeds like customer abuse mailbox data and webserver logs[19].
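The zone-file comparison described in the list above can be sketched as a simple set difference. This minimal Python example assumes that the previous and current zone files have already been parsed into sets of simple 'name.tld' strings (the parsing itself is not shown), and the function name `new_branded_domains` is hypothetical.

```python
def new_branded_domains(previous: set[str], current: set[str],
                        brand_strings: list[str]) -> list[str]:
    """Domains present in the current snapshot but not the previous one
    whose SLD contains any of the strings of interest."""
    added = current - previous
    return sorted(
        domain for domain in added
        # Assumes simple 'name.tld' entries, so the first label is the SLD
        if any(s in domain.split(".")[0] for s in brand_strings)
    )

if __name__ == "__main__":
    # Hypothetical snapshots of a single TLD zone
    yesterday = {"example.com", "shop.com"}
    today = yesterday | {"brand-login.com", "mybrandstore.com", "unrelated.com"}
    print(new_branded_domains(yesterday, today, ["brand"]))
```

Plain substring matching of this kind will miss the fuzzy and soundalike variants mentioned above, so it is best seen as the simplest layer of a domain-monitoring approach.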

Determining where ongoing action is required

An initial audit of the pre-existing infringement landscape will help to identify the immediate areas of priority for ongoing monitoring, though it is generally advisable to maintain an element of holistic monitoring for the appearance of new threats across all relevant channels, combined with partnership with one or more consultancy providers who can advise on the emergence of new areas of potential concern.

In order to deal with identified infringements, the central idea is the use of enforcement techniques to ensure the deactivation of threatening content. The most effective enforcement programmes will offer a 'toolkit' of possible approaches, from low-cost, low-complexity, primary actions such as cease-and-desist (C&D) notices, through secondary approaches like host-level content removals or registrar- or registry-level suspensions, up to longer-term, complex tertiary approaches like domain-dispute processes and legal actions. This allows the brand owner to select the most cost-effective and efficient approach in any given case, whilst reserving other options for when escalation is required[20]. Enforcement not only protects the brand and its customers, but can provide a deterrent effect to infringers and can be a pre-requisite for retaining IP protection.

A key element of this aspect of the programme is the ability to prioritise the identified findings. Typically, a brand-monitoring solution will identify large numbers of findings of potential interest, and it is therefore essential to be able to determine which present the greatest level of actual (or potential future) threat, so as to be able to focus the initial enforcement efforts (or the targets for ongoing monitoring) in the most impactful places. 

This prioritisation process can incorporate a range of different ideas:

  • At a high level, infringements can be categorised into a number of types (comprising categories such as phishing, traffic diversion, negative brand association, potential brand confusion, false affiliation, etc.), which can themselves be assigned severity classifications. The exact specifications may vary from one brand owner to another (based on industry area, individual views on level of threat, risk tolerance, and so on), but might broadly be categorised as lower-threat 'brand abuse' (covering unenforceable content or simple breaches of corporate guidelines), through 'brand infringement' (constituting contravention of intellectual property protection), up to 'brand fraud' (where the brand usage is actively criminal, such as phishing or the sale of counterfeits).
  • Individual infringements can also be analysed using algorithms to quantify the level of threat they pose (or are likely to present in the future). This methodology can use a variety of the website’s characteristics as its inputs, many of which can be applied even when there is not yet any active site content. These characteristics might include features such as:
    • (If present,) the nature of any active content on the website.
    • The similarity of the domain name to that of the brand owner's official website (or the presence of brand-name variations, typos, etc.).
    • The domain name extension (TLD) – of particular concern will be the use of a 'high-threat' TLD.
    • The amount of web traffic received by the site.
    • Characteristics of the registrant, registrar or ISP (hosting provider) (specifically taking account of features such as the use of privacy-protection services, webmail e-mail addresses, high-threat or non-compliant service providers, locations in countries where enforcement is difficult, etc.).
    • The presence of an MX record - this indicates that the domain has been configured to be able to send and receive e-mails and could therefore be associated with phishing activity.
  • Clustering technology can be used to group together related infringements on the basis of shared characteristics. This can be advantageous as it can help identify instances of serial infringers which may be targets for prioritised enforcement action, can reveal evidence of bad-faith activity (e.g. multiple distinct brands being targeted) which can help build a stronger case for enforcement, and raises the possibility of efficient bulk takedowns in a single action.
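The algorithmic scoring of individual findings described in the list above could, in its simplest form, be a weighted sum of the characteristics. The Python sketch below is only one possible formulation: the `DomainFinding` fields, the weights, and the `threat_score` function are all hypothetical, and web traffic and hosting characteristics are omitted for brevity.

```python
from dataclasses import dataclass

# Hypothetical weights; a real programme would calibrate these per brand.
WEIGHTS = {
    "active_content": 3.0,
    "name_similarity": 3.0,
    "high_threat_tld": 1.5,
    "has_mx": 1.5,
    "privacy_protected": 1.0,
}

@dataclass
class DomainFinding:
    domain: str
    active_content: bool      # is there live content on the site?
    name_similarity: float    # 0.0 (unrelated) to 1.0 (identical to official name)
    high_threat_tld: bool
    has_mx: bool              # MX record present
    privacy_protected: bool   # registrant uses a privacy-protection service

def threat_score(finding: DomainFinding) -> float:
    """Weighted sum of the finding's characteristics (higher = more concerning)."""
    return sum(weight * float(getattr(finding, name))
               for name, weight in WEIGHTS.items())

if __name__ == "__main__":
    findings = [
        DomainFinding("a.example", True, 0.5, True, True, False),
        DomainFinding("b.example", False, 1.0, False, False, True),
    ]
    for f in sorted(findings, key=threat_score, reverse=True):
        print(f.domain, threat_score(f))
```

Sorting by score gives a simple priority queue for enforcement or ongoing monitoring; because most of the inputs are available before any content is live, the same score can be applied to dormant domains.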

Conclusion

A thorough brand risk assessment is essential to ensure that the resources associated with a cybersecurity and brand-protection programme are focused in the correct places. Typically, this process involves:

  • An initial assessment of the key areas of concern, using:
    • (For the official domain portfolio,) a review of all domains in the portfolio, to determine: (a) which are the critical 'core' domains; (b) which security measures are currently in place for each domain; and (c) what are the 'gaps' in the domain portfolio.
    • (For general Internet content,) a broad sweep across a wide range of channels and using a comprehensive set of data-collection techniques.
  • Filtering and prioritisation of the findings to identify the targets for follow-up action (i.e. additional domain security measures to be deployed, requirements for additional defensive domain registrations, and the key third-party infringements requiring enforcement action or future monitoring).
  • Implementation of the above actions and ongoing monitoring.

It is also generally appropriate to include a subsequent periodic review process, to assess the impact of the programme and realign strategies where necessary. Impact is often measured through some sort of return-on-investment calculation, which can typically be much more robust once specific measurable mediating actions (such as enforcement takedowns) have been carried out[21].

The choice of a suitable service provider with which to partner is frequently a crucial component of this process. An enterprise-class provider will typically be able to offer a more holistic range of security products and solutions, will avoid practices (such as the operation of domain marketplaces, and monetisation of trademarked domain names using pay-per-click links) which can contribute to fraud and brand abuse, and will typically operate under strict internal security policies to reduce the risk of hacks and data breaches[22,23,24,25]. Use of an enterprise-class provider also generally improves the security posture of a brand owner, increasing the ease of access to - and lowering the cost of - cyberinsurance[26,27]. Brand owners should also be mindful of their selection of suppliers and vendors more generally, as bad actors frequently target corporations via the weakest point in their supply chain, particularly at times of heightened vulnerability arising from external real-world events[28,29].

References

[1] https://www.cscdbs.com/blog/brand-abuse-and-ip-infringements/

[2] https://www.linkedin.com/pulse/holistic-brand-fraud-cyber-protection-using-domain-threat-barnett/

[3] https://www.linkedin.com/pulse/rise-nft-david-barnett

[4] https://www.cisa.gov/stopransomware/general-information

[5] https://securityboulevard.com/2022/07/online-brand-abuse-is-a-cybersecurity-issue/

[6] DNSSEC = Domain Name System Security Extensions

[7] SPF = Sender Policy Framework

[8] DMARC = Domain-based Message Authentication, Reporting and Conformance

[9] DKIM = DomainKeys Identified Mail

[10] CAA = Certification Authority Authorisation

[11] The second-level domain name (SLD) is the part of the domain name immediately to the left of the TLD (e.g. 'example' in 'example.com')

[12] MX = Mail exchange

[13] https://www.cscdbs.com/blog/the-highest-threat-tlds-part-2/

[14] https://www.cscdbs.com/blog/the-highest-threat-tlds-part-1/

[15] https://www.cscdbs.com/en/resources-news/threatening-domains-targeting-top-brands/

[16] https://www.cscdbs.com/assets/pdfs/Domain_Security_Report_2021.pdf

[17] https://www.worldtrademarkreview.com/anti-counterfeiting/return-investment-proving-protection-pays

[18] https://www.worldtrademarkreview.com/global-guide/anti-counterfeiting-and-online-brand-enforcement/2022/article/creating-cost-effective-domain-name-watching-programme

[19] https://www.cscdbs.com/blog/the-continued-rise-of-phishing-and-the-case-of-the-customizable-site/

[20] https://www.cscdbs.com/blog/four-steps-to-an-effective-brand-protection-program/

[21] https://www.linkedin.com/pulse/calculation-return-investment-brand-protection-thoughts-david-barnett/

[22] https://www.cscdbs.com/en/resources-news/impact-of-covid-on-internet-security/

[23] https://vmblog.com/archive/2023/01/11/csc-2023-predictions-staying-secure-in-2023-and-making-it-the-year-of-action.aspx

[24] https://techbeacon.com/enterprise-it/enterprise-it-predictions-2023

[25] https://www.cscdbs.com/en/resources-news/domain-security-report/ (2022)

[26] https://securityscorecard.com/resources/the-impact-of-enterprise-class-domain-registrar-utilization-on-overall-security-ratings

[27] https://www.wsj.com/articles/buying-cyber-insurance-gets-trickier-as-attacks-proliferate-costs-rise-11659951000

[28] https://www.cscdbs.com/en/resources-news/supply-chain-report-form/

[29] https://www.csoonline.com/article/3672155/global-companies-say-supply-chain-partners-expose-them-to-ransomware.html

This article was first published on 10 February 2023 at:

https://www.linkedin.com/pulse/assessing-mediating-digital-risk-landscape-brand-david-barnett/

Thursday, 9 February 2023

Calculation of return on investment for brand-protection programmes: Thoughts towards a new paradigm

Pre-existing ideas

Numerous previous studies have considered methodologies for calculating the return on investment (ROI) of brand-protection programmes which incorporate components of monitoring and enforcement. These ideas can be important both to justify the spend on a programme in the first place, and to assess its impact once established. Correspondingly, 'classic' ROI calculations can be categorised into two main types: the first (known as 'a priori' calculations) considers the probable infringement landscape in advance of the implementation of a brand-protection programme; the second aims to quantify the actions taken as part of an active enforcement initiative[1]. It is the latter category with which we are primarily concerned in this article.

At a very high level, many ROI calculation methodologies use a formulation along the lines of:

R = C × E

where R, the ROI within a given timeframe (i.e. the benefit of the brand-protection programme, to be offset against the associated spend), is equal to the product of C, the 'cost' of a pre-existing infringement being active, and E, the number of infringements removed through enforcement as part of the brand-protection programme (in the same timeframe).
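As a worked illustration of the formulation above, the following Python sketch computes R for a single timeframe and offsets it against the programme spend. The figures are hypothetical, and the helper names `classic_roi` and `net_benefit` are assumptions introduced for the example.

```python
def classic_roi(cost_per_infringement: float, enforcements: int) -> float:
    """R = C x E for a single timeframe."""
    return cost_per_infringement * enforcements

def net_benefit(roi: float, programme_spend: float) -> float:
    """Benefit of the programme after offsetting the associated spend."""
    return roi - programme_spend

if __name__ == "__main__":
    # Hypothetical month: C = 150 per live infringement, E = 40 takedowns
    r = classic_roi(150.0, 40)
    print("R =", r)
    print("Net of a hypothetical spend of 4000:", net_benefit(r, 4000.0))
```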

Very many assumptions are typically required in order to estimate these figures. In some methodologies, the assumed 'cost' associated with a live infringement may be reflective of an estimate of its direct financial impact (e.g. the typical loss from a phishing incident); in others it may be calculated as the proportion of lost revenue which is reclaimable following deactivation of the infringement (i.e. the 'cost' in the above formulation essentially reflecting the pre-enforcement impact of not yet having taken the infringement down). In these types of approaches, it is very rare that these figures can be measured directly and therefore a number of assumptions (or 'proxies' for the data) are required. In cases of domain acquisition, for example, it may be appropriate to make use of figures such as web traffic when quantifying impact; for marketplace listings, it is typically necessary to consider factors such as price and quantity of items in the listings removed. In both cases, the methodology needs to consider assumed conversion rates (i.e. the proportion of customers who can be 'monetised' by the brand owner - e.g. those who will make a legitimate purchase once the source of infringements is removed)[2,3]. Even this part of the process is far from simple; complications include factors such as: 

  1. The conversion rate will be (strongly) dependent on the nature and price of the item (e.g. it will be much lower for (say) an obvious counterfeit, such as an item passing off as a high-end luxury brand but with a very low price point)[4].

  2. The conversion rate for customers knowingly navigating to an official brand website will potentially be different to that for those Internet users intending to visit a third-party standalone e-commerce site (if we are considering the case where this domain may subsequently have been acquired by the brand owner and its traffic re-directed to their official site) - this consideration involves taking account of a principle sometimes referred to as the 'substitution effect'[5].
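For marketplace listings, the price, quantity, and conversion-rate factors discussed above might be combined as in the Python sketch below. This is one possible formulation rather than an established method: the function name `reclaimable_revenue` is hypothetical, the figures are invented, and whether 'price' should be the listing price or the price of the genuine item is itself a modelling choice.

```python
def reclaimable_revenue(price: float, quantity: int,
                        conversion_rate: float) -> float:
    """Estimated revenue recovered when a listing is removed."""
    return price * quantity * conversion_rate

# Hypothetical removed listings. The obvious low-priced counterfeit is
# given a much lower assumed conversion rate, in line with point 1 above.
removed_listings = [
    {"price": 20.0, "quantity": 100, "conversion_rate": 0.05},
    {"price": 200.0, "quantity": 10, "conversion_rate": 0.25},
]

if __name__ == "__main__":
    total = sum(reclaimable_revenue(**listing) for listing in removed_listings)
    print(f"Estimated reclaimable revenue: {total:,.2f}")
```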

Alternative proxies for the above figures may also need to be utilised, depending on the web channel under consideration (e.g. where absolute estimates of web traffic are not available or appropriate). For example, on social media, the 'exposure' or 'reach' of content can be estimated using numbers of 'likes' or followers; for mobile apps, the number of downloads may be relevant; for file sharing, it may be appropriate to consider the number of individuals accessing the content (e.g. 'seeds' and 'leechers' for BitTorrent). 

Numerous other approaches can also be taken. The ultimate objective when estimating the 'value' of a website is the identification of a direct measure of the revenue it generates (e.g. via direct sales of products, for an e-commerce site). In practice, this information is almost never publicly available, though it is sometimes possible to make estimations via shipping or logistics information available through third-party databases. Some methodologies will utilise web-analytics tools to estimate value based on factors such as advertising spend by the site owner, or will analyse outgoing site traffic (e.g. to payment service provider platforms) to estimate customer volume and/or conversion rates[6].

It has also previously been noted that sometimes determination of ROI can reflect more qualitative goals (i.e. the statements of 'what success looks like' for a brand-protection programme). For example, a brand owner may consider a programme 'successful' once there are no infringing results returned on the first page of search-engine results, or in pages of search results on a range of key marketplace sites, in response to brand-specific queries. Similarly, the 'ownership of the buy button' (i.e. being the first vendor listed for a particular product on an e-commerce marketplace site) might be a key aim.

The success of a brand-protection initiative can also be judged based on other (again, more quantitative) metrics which may be only available to the brand owner themselves (as opposed to, say, a brand-protection service provider partner). These might include factors such as increases in the numbers of visitors to physical stores, or in volumes of traffic to official websites (as might be directly measurable using the brand owner's webserver log information).

Beyond this, wholly different methodologies can also be applied. Some will take account of 'intangible' factors such as brand value[7], considering the spend on brand protection to be a business cost necessary to lower the risk of damage to the brand. This type of approach is also not straightforward - higher levels of abuse can be considered an indicator that the brand is a desirable one, which can actually be reflective of greater brand value. Other factors, such as new product launches, can also affect the visibility of the brand and its likelihood of being targeted, all of which can serve to further complicate the landscape. 

However, in this article, we will primarily consider the simpler approaches discussed in previous work, and look at how they can potentially be modified to better account for the overall impact of a brand-protection programme.

Variations over time in the infringement landscape

Part 1: Single-brand analysis

In this section, we consider an extremely simplified model looking at changes in the infringement landscape over time for a brand, considering in the first instance the example of a newly-launched brand. In this case, the growth in the number of infringements over time might look something like that shown in Figure 1.

Figure 1: Mock-up of the changing infringement landscape over time for a newly-launched brand

The above framework is formulated using a timeframe expressed as numbers of months for convenience, though the timescales observed in practice may vary hugely. There is also a deliberate choice to avoid stating any quantitative numbers for the volumes of infringements, as these will also be dependent on any number of different factors - one brand may see tens or hundreds of infringements; others may see many thousands or more. Beyond these points, the construction of the above trend lines is based on the following scenario:

  • Following the launch of the brand (in month 1), there is a ramp-up in the monthly number of new infringements ('N') appearing online, up to a constant level.
  • There is also a (slower) ramp-up in the rate of infringements disappearing naturally from the Internet ('natural removal', 'R', not to be confused with the ROI figure R used earlier) even in the absence of any enforcement activity. This will arise through a combination of factors, including: content which is deactivated by the infringer following a period of use; domains expiring after their registration period; older content gradually dropping down search-engine rankings (and potentially therefore eventually ceasing to have any damaging impact), and so on.
  • There is a resulting growth in the cumulative number of active online infringements ('I'), caused by the difference between the monthly values of 'N' and 'R'. 
  • Finally, it seems reasonable to assume in most cases that 'I' will eventually reach a steady state, rather than continuing to grow indefinitely. This implies that 'R' will eventually 'catch up' with 'N' (possibly in part due to the fact that 'N' may also drop off slightly over time, after an initial peak in infringement activity). 

Of course, in practice the exact balance between the above numbers will be dependent on an enormous range of factors, including considerations such as the type of Internet channel. For example, marketplace listings will typically have a shorter 'lifetime' than domain registrations (affecting, for example, the rate at which 'R' catches up with 'N').
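The scenario above can be reproduced with a toy simulation. The Python sketch below assumes linear ramp-ups for 'N' and 'R' and entirely arbitrary parameter values (the source deliberately gives no figures, and Figure 1 may use a different curve shape); the function name `simulate_landscape` is hypothetical.

```python
def simulate_landscape(months: int, n_max: float = 100.0,
                       ramp_n: int = 3, ramp_r: int = 9) -> list[dict]:
    """Monthly new infringements (N), natural removals (R), and the
    cumulative number of active infringements (I)."""
    rows, active = [], 0.0
    for month in range(1, months + 1):
        new = n_max * min(month / ramp_n, 1.0)      # N ramps up quickly
        removed = n_max * min(month / ramp_r, 1.0)  # R ramps up more slowly
        active += new - removed                     # I grows by N - R
        rows.append({"month": month, "N": new, "R": removed, "I": active})
    return rows

if __name__ == "__main__":
    for row in simulate_landscape(12):
        print(row["month"], round(row["N"]), round(row["R"]), round(row["I"]))
```

With these assumed parameters, 'R' catches up with 'N' in month 9, after which 'I' stays constant; shortening `ramp_r` mimics channels such as marketplaces, where listings have a shorter lifetime.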

Let us now consider the case where a brand-protection programme, incorporating the introduction of enforcement actions for the removal of infringing content, is added into the picture (say, after the landscape has reached steady state in month 12) (Figure 2).

Figure 2: Mock-up of the changing infringement landscape over time, with an enforcement programme introduced in month 12

In this case, we use the following formulation:

  • In month 12, the enforcement programme is introduced, which incorporates a particular level of resource sufficient to action a certain maximum number of takedowns each month. This number will of course need to be greater than the rate at which new infringements appear, if the programme is to be successful. 
  • Following the introduction of the enforcement programme, the rate of natural removal ('R') of infringements will quickly drop off to zero (essentially, the infringements are being removed via enforcement quicker than the rate at which they would otherwise naturally disappear).
  • As enforcement progresses, the cumulative number of infringements drops off from its pre-existing level, until we reach a steady state (the 'whackamole' phase[8]) where the monthly number of enforcements ('E') simply needs to 'keep up' with the rate at which new infringements appear ('N'). In other words, each month a certain number of new infringements appear and these are all removed through the actions of the enforcement programme. (N.B. Equivalently, at this point we could express the 'cumulative number of infringements' ('I') as zero, depending on the point in the month at which we carry out the calculation (i.e. whether pre- or post-enforcement).)
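The formulation above can be sketched as a toy month-by-month simulation. This is only an illustration under stated assumptions, not the model used to draw the figures: the constant rate of new infringements, the assumption that a fixed fraction of live infringements lapses naturally each month, the takedown capacity and all of the numerical values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Month:
    month: int
    new: float       # 'N': new infringements appearing this month
    natural: float   # 'R': infringements disappearing without enforcement
    enforced: float  # 'E': takedowns carried out this month
    active: float    # 'I': live infringements at month end (post-enforcement)


def simulate(months=36, new_per_month=100.0, natural_rate=0.25,
             start_month=12, capacity=150.0):
    """Toy model; every parameter value is an illustrative assumption."""
    active = 0.0
    history = []
    for month in range(1, months + 1):
        # Assumption: a fixed fraction of live infringements lapses each month.
        natural = natural_rate * active
        active += new_per_month - natural
        # Enforcement starts in start_month and is capped by monthly capacity.
        enforced = min(capacity, active) if month >= start_month else 0.0
        active -= enforced
        history.append(Month(month, new_per_month, natural, enforced, active))
    return history


history = simulate()
for row in history[9:18]:
    print(f"month {row.month:2d}  N={row.new:6.1f}  R={row.natural:6.1f}  "
          f"E={row.enforced:6.1f}  I={row.active:6.1f}")
```

With these values the backlog is cleared within a few months of the programme starting, after which 'E' simply matches 'N' each month and both 'R' and the post-enforcement value of 'I' sit at zero. A capacity below the rate of new infringements would never reach that state.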

In reality, the situation is likely to be far less straightforward, with a number of additional factors complicating the picture, including (but not limited to) the facts that:

  • The types of infringements actioned over time may change (potentially starting with higher-impact or easier takedowns).
  • Monitoring will inevitably start to uncover lower visibility and/or lower severity infringements once the initial high-visibility, high-impact infringements have been taken down.
  • The rate of appearance of online infringements may change in response to the enforcement programme (e.g. infringers turning their attention to easier targets).
  • The infringers may change their tactics in response to the enforcement programme (e.g. describing goods in different ways) - accordingly, both the monitoring approach and the enforcement methodologies may need to evolve in order to account for this.

Nevertheless, the above very simplistic picture does reflect some of the top-level trends typically seen in a brand-protection programme, with an initial period of 'cleaning up' the pre-existing backlog of infringements followed by a steady-state period of lower required activity, just keeping pace with new infringements as they appear. 

This being the case, we can look to this model to draw insights into how our classic ROI calculation methods could be augmented to provide a fuller picture. In many of the traditional approaches, monthly ROI calculation methodologies make use just of the total monthly numbers of enforcements carried out ('E'). Although the drop-off in the numbers of pre-existing infringements is reflected in the ROI calculations associated with the enforcements carried out during the 'ramp-down' phase itself, it is usually not reflected in the ongoing calculations during the subsequent 'whackamole' phase. Really, it may be preferable to make use of the difference between the ongoing number of infringements ('X') and that observed at the start of the programme ('Y'), if we are to fully assess the impact of the brand-protection programme. In other words, rather than using the number 'X' as the basis of our monthly ROI calculation, it might instead be better to use 'Y – X'. This number instead provides a measure of the value of the ongoing brand-protection programme - essentially, reflecting the difference in the ongoing number of infringements (with the associated 'cost' of them being live) compared with that which would have been observed if the programme were not in place. In practice, determination of these numbers will require the brand-protection initiative to incorporate a comprehensive programme of monitoring (as well as enforcement) throughout, incorporating a full landscape 'audit' at the outset.

Part 2: Benchmarking and the use of controls

To further complicate the situation, the above approach fails to consider any changes to the infringement landscape which would have occurred if the brand-protection programme were not being carried out. This is known as the 'attribution' issue in the physical sciences. Of course, once enforcement starts being carried out, we lose the ability to see what would have happened to the numbers of infringements if they were not being actively taken down. It is well established that external factors can significantly change the infringement landscape. For example, numerous previous studies show that real-world events can drive spikes in resulting infringement activity[9].

One way in which this problem can be addressed is via comparison with another 'control' brand of a similar type, operating in a similar industry area, but for which brand-protection activity is not being carried out. In practice, a brand owner can never be completely sure what any given competitor is doing, so a more realistic scenario is the analysis of a group of industry peers, across which the infringement trends over time can be averaged to create a 'benchmark'. Of course, this requires active monitoring across all these brands, and so may be far from straightforward.

In this case, we may end up with a scenario such as that shown in Figure 3, where the control or benchmark brand (actually ideally an average of the data collected across multiple third-party brands) - which we have to assume reflects external drivers in infringement trends in the absence of enforcement initiatives - shows a change in the infringement landscape since the start of the programme for the brand being protected.

Figure 3: Mock-up of the changing infringement landscape over time, with an enforcement programme introduced for the customer brand in month 12, and compared against a (pre-existing, established) benchmark brand(s)

In the above example, the control brand shows a ramp-up in infringements during the period of the brand-protection programme, perhaps driven by an external event of some sort. Additionally, by using a benchmark comprising data from across numerous brands, we reduce the likelihood that the change is driven by some characteristic specific to one brand (such as a new product launch) and increase the likelihood that the change is representative of the industry landscape in general. 

In this case we can assume that, in the absence of a brand-protection programme, the infringement landscape for the customer brand would have increased by the same proportion as that seen for the benchmark brand(s). Therefore, instead of our ROI calculation being a function (' f ') of 'Y – X' (written as 'ROI = f [Y – X]'), we can say that:

ROI = f [ ( (B/A) × Y ) – X ]

Essentially, we are saying that, had the brand-protection programme not been in place, we might have expected the 'background' level of infringements for the customer brand also to have increased by a factor of ('B/A') by the end of the monitoring period, and so the benefit of the programme is in reducing it from this value to the value observed ('X').

Of course, the same approach can also be used if the benchmark shows a decrease in infringements across the monitoring period.
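As a minimal sketch of the quantity inside the function ' f ', the snippet below computes both the simple 'Y – X' basis and the benchmark-adjusted version. It assumes that 'A' and 'B' are the benchmark infringement levels at the start and end of the monitoring period respectively (my reading of the factor 'B/A' above); the function name and the example numbers are hypothetical, and the form of ' f ' itself is deliberately left open.

```python
def avoided_infringements(y_start, x_now, bench_start=None, bench_now=None):
    """Return the ROI basis: (B/A) * Y - X, or Y - X without a benchmark.

    y_start     -- 'Y': infringements observed at the start of the programme
    x_now       -- 'X': ongoing number of infringements
    bench_start -- 'A': benchmark level at the start (assumed meaning)
    bench_now   -- 'B': benchmark level at the end (assumed meaning)
    """
    if bench_start is None or bench_now is None:
        return y_start - x_now
    if bench_start <= 0:
        raise ValueError("benchmark starting level must be positive")
    # Scale the starting level by the change seen across the benchmark brands.
    return (bench_now / bench_start) * y_start - x_now


# Hypothetical figures: 400 infringements at the outset, 100 ongoing,
# and a benchmark that rose from 200 to 300 over the same period.
print(avoided_infringements(400, 100))            # simple basis
print(avoided_infringements(400, 100, 200, 300))  # benchmark-adjusted basis
```

In this example the benchmark adjustment raises the basis from 300 to 500, because the benchmark suggests that the unprotected landscape would have grown by half over the period.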

Discussion

The calculation of ROI for brand protection is fiendishly complicated, and no single approach will be applicable in all cases. In any selected methodology, it is necessary to make use of a wide range of assumptions and proxies for the data to which we would ideally like to have access. Nevertheless, there are some general industry-accepted standards for these calculations, many of which utilise metrics around ongoing levels of enforcement activity. In this article, we have considered some approaches which could be taken to modify these methodologies towards a new framework of ideas, involving the following two fundamental changes:

  • Considering the difference between the ongoing levels of enforcement (as a measure of the ongoing level of infringement activity), and those seen at the outset of the programme, as a measure of the overall impact of the brand-protection programme (rather than just considering the ongoing levels of enforcement in their own right).
  • Considering the use of one or (ideally) more benchmark brands, to separate out the observed change in infringement levels (for the customer brand) arising from the enforcement activity, from other background or landscape changes applicable to the industry vertical in general.

Even then, there are still other factors to consider - the customer brand may also have experienced (company-specific) issues (such as product launches, changes in sales channels or target markets, etc.) which themselves could have driven changes in the number of infringements, even in the absence of an enforcement programme or industry issues. All of this can further complicate the calculations to be carried out.

Additionally, I anticipate that the general philosophy behind ROI calculations may need to evolve further to reflect other issues more directly tied to cybersecurity, as the importance of this area becomes more widely appreciated. A former colleague of mine recently asked in a LinkedIn posting[10]:

"'So what's the cost?' is a frequent question I hear. Rather than thinking about the budget required, brands need to consider the financial and reputational costs of repairing the damage when they are impacted."

The key point here is thinking about proactive rather than reactive measures. This issue is particularly relevant when it comes to domain security, where a range of products are available to allow corporations to secure their domains from external attack vectors which can be highly damaging (from both financial and reputational points of view)[11]. The matter is of even greater urgency in a landscape where we still see significant proportions of the world's top companies failing to adequately protect themselves[12].

The expected financial loss ('L') per year due to (say) cybersecurity issues (an 'attack') is given[13] by:

L = p_att × C_att

where p_att is the probability of an attack occurring during the year, and C_att is the financial cost (the 'damage') resulting from the attack. From this, we can say that, if the probability of an attack can be reduced (from p_att,without_security to p_att,with_security) through the implementation of domain security measures, the saving ('S') to the organisation can be written as:

S = ( p_att,without_security – p_att,with_security ) × C_att

Whilst easy to formulate, this can be much harder to quantify. However, a recent study showed that 88% of organisations were subject to some form of DNS attack in 2021, with each attack costing the enterprise an average of almost $1 million[14]. If, then, the risk of an attack can be (conservatively) reduced from (say) 10% to 1% through the introduction of security measures, this equates to an equivalent annual saving to the company of the order of $90k. If the cost of implementing the security measures is less than this value, the return on investment will be positive. If we factor in also the implications for access to - and cost of - cyberinsurance cover, the importance of domain security products and services becomes ever clearer.
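The arithmetic behind the $90k figure can be checked with a short sketch. The probabilities and the attack cost are the illustrative values quoted above; the function name and the annual cost of the security measures are hypothetical, and are included only to show the comparison that determines whether the return is positive.

```python
def expected_annual_saving(p_without, p_with, attack_cost):
    """Saving = (probability without security - probability with) x cost."""
    for p in (p_without, p_with):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
    return (p_without - p_with) * attack_cost


saving = expected_annual_saving(0.10, 0.01, 1_000_000)

# Hypothetical annual cost of the security measures, for illustration only.
measures_cost = 40_000
net_benefit = saving - measures_cost

print(f"expected annual saving: ${saving:,.0f}")
print(f"net annual benefit:     ${net_benefit:,.0f}")
print("positive return" if net_benefit > 0 else "negative return")
```

The same function makes it easy to see how sensitive the result is to the assumed probabilities, which are the hardest inputs to estimate in practice.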

Acknowledgements

Thanks must go to Angharad Baber, Mark Barrett and David Riley for their feedback and input into this article.

References

[1] https://www.worldtrademarkreview.com/anti-counterfeiting/return-investment-proving-protection-pays

[2] https://www.worldtrademarkreview.com/global-guide/anti-counterfeiting-and-online-brand-enforcement/2022/article/creating-cost-effective-domain-name-watching-programme

[3] https://www.cscdbs.com/blog/four-steps-to-an-effective-brand-protection-program/

[4] https://circleid.com/posts/20220726-calculating-the-return-on-investment-of-online-brand-protection-projects

[5] 'Digital Brand Protection: Investigating Brand Piracy and Intellectual Property Abuse' by Steven Ustel (2019). Chapter 17: 'Accounting and Accountability'

[6] 'Digital Brand Protection: Investigating Brand Piracy and Intellectual Property Abuse' by Steven Ustel (2019). Chapter 9: 'Pivots'

[7] https://www.cscdbs.com/blog/brand-abuse-and-ip-infringements/

[8] By 'whackamole' in this context, I am referring to a consistent state in which infringements are reactively taken down as quickly as they appear (rather than implying a random or disordered approach).

[9] https://www.linkedin.com/pulse/four-new-case-studies-domain-registration-activity-spikes-barnett/

[10] https://www.linkedin.com/posts/stuart-fuller-17a7411_what-cisos-can-do-about-brand-impersonation-activity-7027979839747846144-E6ak

[11] https://www.linkedin.com/pulse/holistic-brand-fraud-cyber-protection-using-domain-threat-barnett/

[12] https://www.cscdbs.com/en/resources-news/domain-security-report/ (2022)

[13] This follows from the fact that, mathematically, the expected value ('Ex') of a variable ('X') is given by:

Ex = Σ_i ( p(X_i) × X_i ), where p(X_i) is the probability of X taking the i-th value

[14] https://www.efficientip.com/wp-content/uploads/2022/05/IDC-EUR149048522-EfficientIP-infobrief_FINAL.pdf

This article was first published on 9 February 2023 at:

https://www.linkedin.com/pulse/calculation-return-investment-brand-protection-thoughts-david-barnett/

Wednesday, 8 February 2023

Hyphenated-domain infringements

Introduction

In this latest study, I consider domain-name infringements consisting of close matches to official brand websites, but differing only in the addition of a hyphen within the domain name. This follows on from previous studies looking at highly-convincing deceptive URLs, such as those utilising exact matches, homoglyphs or fuzzy matches[1], or hostname-based infringements[2]. An example of this type of infringement being used for fraudulent purposes was identified in November 2022, for a financial-services brand. The scam comprised a phishing attack utilising an SMS message as the attack vector; a mock-up of the SMS message (represented using the fictitious brand financebrand.com) is shown in Figure 1.

Figure 1: Mock-up of an SMS phishing message utilising a hyphenated-domain infringement

The scam - which utilises the infringing domain name financebran-d.com - has been cleverly designed to take advantage of the tendency of mobile SMS clients to split URLs after the '-' symbol, thereby creating the appearance of the official domain name (financebrand.com) split across a line-break with a breaking-hyphen (as is seen in the other text at the start of the message).

Methodology

To investigate the popularity of this type of infringement, I considered domain registration activity in which the domain name is an exact match to the name of any of the top ten most valuable brands in 2022 according to Interbrand[3], but including a hyphen between any pair of adjacent characters (e.g. for Google, I searched for 'googl-e', 'goog-le', 'goo-gle', 'go-ogle' and 'g-oogle')[4]. The analysis encompasses new registrations ('N'), re-registrations ('R') and drops (domain lapses) ('D') (collectively referred to as 'events').

In practice, the types of variation considered in this study would be covered by the 'fuzzy' match category included within sophisticated domain monitoring technologies, when simply searching for the brand string itself.
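The variant-generation step can be sketched as follows. This is my reading of the methodology and of the exclusion described in note [4], not the tooling actually used for the study: for brands containing a hyphen or space, variants are built from both the hyphenated and the unhyphenated form of the name, and any candidate identical to the brand's own hyphenated form is discarded. The function name is hypothetical; the brand list is the one given in reference [3].

```python
def hyphen_variants(brand: str) -> set[str]:
    """Insert one extra hyphen between each pair of adjacent characters."""
    canonical = brand.lower().replace(" ", "-")
    # For names such as 'coca-cola', also start from the form 'cocacola'.
    bases = {canonical, canonical.replace("-", "")}
    variants = set()
    for base in bases:
        for i in range(1, len(base)):
            if base[i - 1] == "-" or base[i] == "-":
                continue  # never place a hyphen next to an existing hyphen
            candidate = base[:i] + "-" + base[i:]
            if candidate != canonical:  # exact match to the brand name
                variants.add(candidate)
    return variants


brands = ["apple", "microsoft", "amazon", "google", "samsung", "toyota",
          "coca-cola", "mercedes-benz", "disney", "nike"]

print(sorted(hyphen_variants("google")))
print(sum(len(hyphen_variants(b)) for b in brands))
```

Under this reading, the ten brands yield 73 variants in total, which agrees with the number of possible .com permutations quoted in the findings.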

Findings

The dataset included 252 distinct domain registration activity events for the brand variations under consideration, representing 140 distinct domain names (of which 83 were still registered as of the time of analysis[5] - i.e. those for which the most recent event was not a 'D'). The breakdown of these domains by targeted brand and TLD is shown in Figures 2 and 3.
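The rule for deciding whether a domain is still registered (the most recent event is not a 'D') can be illustrated with a few lines of code. The event tuples, domain names and dates below are invented purely for the example and do not come from the study dataset.

```python
from datetime import date


def currently_registered(events):
    """Return domains whose most recent event is not a drop ('D').

    events -- iterable of (domain, event_date, kind), with kind one of
              'N' (new registration), 'R' (re-registration) or 'D' (drop).
    """
    latest = {}
    for domain, when, kind in events:
        # Keep only the latest event seen so far for each domain.
        if domain not in latest or when >= latest[domain][0]:
            latest[domain] = (when, kind)
    return {domain for domain, (_, kind) in latest.items() if kind != "D"}


# Hypothetical events for three made-up domains.
events = [
    ("bran-d.example", date(2020, 5, 1), "N"),
    ("bran-d.example", date(2021, 5, 1), "D"),
    ("bran-d.example", date(2022, 2, 1), "R"),
    ("bra-nd.example", date(2021, 7, 1), "N"),
    ("bra-nd.example", date(2022, 7, 1), "D"),
    ("b-rand.example", date(2022, 9, 1), "N"),
]

print(sorted(currently_registered(events)))
```

Counting the distinct domains in the event list, and then the subset returned by the function, gives the two headline figures of the kind reported above.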

Figure 2: Breakdown of the 140 distinct hyphenated domain variants by targeted brand

Figure 3: Breakdown of the 140 distinct hyphenated domain variants by TLD

Of the 140 domain names, only 14 (10%) are explicitly registered to the associated brand owner (where the domains are registered and whois information is available), with the remainder registered to third parties and/or utilising privacy-protection services or having redacted information. 11 of the 14 officially owned domains have been configured to re-direct to the main brand website (with the other three not resolving to any live site).

The following is a summary of the characteristics of the 126 remaining sites:

  • 27 (21%) are configured with active MX records, indicating that they have been configured to be able to send and receive e-mails, and could potentially be used for phishing attacks.
  • One (no longer live) displays a browser warning indicating that dangerous content was formerly present.
  • Two are configured to re-direct to the corresponding official brand website.
  • The remainder display a range of content types, as shown in Figure 4.

Figure 4: Overview of content types on the 126 non-official domains in the dataset

Of the 73 possible permutations of .com domains (i.e. those with the greatest potential for confusion with the primary official .com site for the respective brand in question), 30 are present in the dataset, of which only 9 are registered to the brand owner, and 9 are configured with active MX records (of which only one is officially owned). 

Figure 5 shows examples of some of the unofficial sites within the overall dataset found to resolve to live content of potential concern.

Figure 5: Examples of live sites hosted on hyphenated domain-name variants targeting the Nike (top), Amazon (middle), and Microsoft (bottom) brands

Summary and recommendations

The analysis shows that the registration of hyphenated domain-name variants targeting the most valuable brand names, by entities other than the brand owners, is a significant issue and may be growing (as 24 of the 71 third-party domains for which creation dates are available were registered in 2022, compared with 17 in 2021, 6 in 2020, and 24 across all earlier years). 

Around one in five of the domains are configured with active MX records, and of the domains resolving to live content, a range of types of site content were identified. These include examples where web traffic is misdirected to third-party content, and others where the sites are being monetised through the inclusion of pay-per-click links or offers to sell the domain name. This indicates that not only do these domains present the potential for convincing attack vectors in phishing activity, but they may also be taking advantage of misdirected traffic arising from mis-typed search queries or browser requests. It is also noteworthy that the list of top TLDs within the dataset includes a number of new-gTLDs, many of which have previously been noted as being popular with infringers[6,7,8,9].

These findings highlight the importance of brand owners carrying out proactive and comprehensive programmes of brand monitoring and enforcement, to identify and take down infringing third-party content. Additionally, brand owners may wish proactively to consider defensively registering hyphenated variants of their core domain names, to prevent them being registered by third parties for fraudulent or infringing use.

References

[1] https://www.cscdbs.com/en/resources-news/threatening-domains-targeting-top-brands/

[2] https://www.linkedin.com/pulse/exploring-domain-hostname-based-infringements-david-barnett/

[3] https://interbrand.com/best-global-brands-2022-download-form/; the brands are: Apple, Microsoft, Amazon, Google, Samsung, Toyota, Coca-Cola, Mercedes-Benz, Disney, Nike

[4] N.B. I exclude from this study any variants where the hyphen appears in the same location as a hyphen or space in the brand name itself (i.e. 'coca-cola' and 'mercedes-benz'), since these are considered exact matches to the brand name, rather than hyphenated variants. I do, however, consider the existence of variants such as 'coca-col-a' and 'cocacol-a'.

[5] All observations correct as of 22-Nov-2022

[6] https://www.cscdbs.com/blog/branded-domains-are-the-focal-point-of-many-phishing-attacks/ 

[7] https://circleid.com/posts/20210908-credential-hinting-domain-names-a-phishing-lure

[8] https://unit42.paloaltonetworks.com/top-level-domains-cybercrime/ 

[9] https://www.cscdbs.com/blog/the-highest-threat-tlds-part-2/

This article was first published on 8 February 2023 at:

https://www.linkedin.com/pulse/hyphenated-domain-infringements-david-barnett/

Tuesday, 7 February 2023

Exploring the domain of hostname-based infringements

Introduction

As noted in numerous previous studies, one of the main objectives in the construction of a deceptive infringement (such as a phishing site) may be the use of a URL which appears similar to that of the official site being targeted. 

One way in which this can be achieved is by constructing a hostname (consisting of a subdomain and domain name combination) which is identical (apart from an additional dot) to that of the genuine brand site. Active use of this technique has been observed in numerous cases - e.g. considering the case of the fictitious banking brand bankbrand.com, the use of a URL such as ba.nkbrand.com to target the bank's customers with a phishing attack. In order to put this type of attack into practice, the infringer needs to register a domain name which is a truncated form of the official brand site (in the above case, nkbrand.com), allowing them to construct the full hostname by configuring the required subdomain (in this case, 'ba.').

Study methodology

In order to investigate the scale of this practice being used for fraud and other brand infringements, I consider hostname-based variations of each of the top 50 most popular brand websites on the Internet[1] (see Appendix). For example, for the domain google.com, I investigate whether any live content exists at any of the following hostname-based variations:

  • g.oogle.com
  • go.ogle.com
  • goo.gle.com
  • goog.le.com
  • googl.e.com

This approach (i.e. checking the subdomain specifically) is more robust than simply checking whether the truncated versions of the domain names (e.g. oogle.com, ogle.com, etc.) have been registered, since some of these (particularly the shortest domain names) may be in use by unrelated third parties. 
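The list of candidate hostnames for a given site can be generated mechanically, as in the sketch below. It assumes the second-level name and the extension are supplied separately (multi-label extensions such as 'co.jp' would otherwise need a public-suffix lookup), and it returns each hostname alongside the truncated domain an infringer would need to register. The function name is hypothetical, and the DNS checks themselves are outside the scope of the example.

```python
def hostname_variants(sld: str, suffix: str) -> list[tuple[str, str]]:
    """Return (hostname, truncated_domain) pairs for every split of the SLD."""
    pairs = []
    for i in range(1, len(sld)):
        subdomain, remainder = sld[:i], sld[i:]
        # The infringer registers the right-hand remainder as a domain...
        truncated_domain = f"{remainder}.{suffix}"
        # ...and configures the left-hand part as a subdomain of it.
        pairs.append((f"{subdomain}.{truncated_domain}", truncated_domain))
    return pairs


for hostname, domain in hostname_variants("google", "com"):
    print(f"{hostname:<14} requires registration of {domain}")
```

For google.com this reproduces the five hostnames listed above, paired with oogle.com, ogle.com, gle.com, le.com and e.com respectively.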

Findings

Of the 262 candidate URLs[2] (i.e. the hostname-based variants of the top 50 domain names), 89 (34%) have active A records (indicating that they point at a live IP address) and 37 (14%) have active MX records (indicating that they have been configured to be able to send and receive e-mails), as shown in Figure 1. Significantly (where whois information is available), only six (2.6%) of the 233[3] truncated domain-name variants are registered to the brand owner who could be targeted using an associated hostname infringement. 

Figure 1: Breakdown of URLs by presence of A and MX records

Of the 89 URLs with active A records, a range of content types were observed, including:

  • Live third-party content - Pages where the URL resolves or re-directs to content unrelated to the brand in question (i.e. traffic misdirection)
  • PPC - Sites monetised through the inclusion of pay-per-click links
  • Domain-for-sale pages - Pages where the domain name is explicitly being offered for sale

A breakdown of the numbers is shown in Figure 2.

Figure 2: Breakdown of URLs with active A records by content type

It is worth noting that some of the instances of URLs resolving to live content may arise through the use of wildcard DNS records[4] (i.e. where the domain has been configured such that any arbitrary subdomain will resolve, rather than the specific subdomain having been explicitly configured). However, any URL pointing to a live IP address raises the potential for fraudulent or infringing use. At the time of analysis, none of the 262 URLs resolved to live phishing sites targeting the brand in question; however, it has been previously noted that in many cases, sites are left in a dormant state - in some cases, for an extended period of time - before being weaponised[5,6]. Consequently, many of the sites resolving to parking, holding or inactive pages may be worthy of monitoring for future changes in content. Furthermore, some of the identified instances of URLs resolving to third-party content may be of particular concern to the brand owner, if they misdirect web-users to competitor content or provide an undesirable brand association. Some examples include:

  • Hostname-based variant of google[.]com - Resolves to a page promoting a VPN product
  • Hostname-based variant of yandex[.]com - Re-directs to a flight-sales website
  • Hostname-based variant of xvideos[.]com - Resolves to a third-party adult website
  • Hostname-based variant of pornhub[.]com - Re-directs to a third-party adult website
  • Hostname-based variant of linkedin[.]com - Resolves to a gambling-site portal page
  • Hostname-based variant of ebay[.]com - Re-directs to the Google website

Additionally, the frequency of PPC pages within the dataset indicates the popularity to infringers of monetising domains whilst in their dormant state. Furthermore, the fact that many of these examples display content unrelated to the brand in question may also suggest that they have been configured to attract web traffic arising from mistyped browser requests, rather than being intended as explicitly deceptive variants of the brand domain name in question.

As a final observation, we can compare the date of registration with the length (in characters) of the second-level domain (SLD) name string (i.e. the portion of the domain name prior to the TLD, or domain extension), for each of the 233 potentially infringing domain names in the dataset (where these are registered and have whois information available) (Figure 3).

Figure 3: Comparison of date of registration with length of the SLD name, for the domains comprising (right-) truncated versions of the top 50 most popular domain names

The dataset shows that the domains in question have been registered over an extended period, between 1986 and 2022. The shorter domain names - i.e. those which are more likely to have been used for unrelated third-party or generic use - tend to comprise the oldest registrations. However, many of the domains with longer SLD string lengths - i.e. those less likely to be associated with 'accidental' brand collisions, and more likely to have been registered specifically to create hostname-based infringements - tend to have been registered over the last few years, highlighting a potential growth in popularity over time of this particular attack vector.

Summary and recommendations

The proportion of hostname-based infringements resolving to live content, or configured with active A and/or MX records - combined with previous observations of the use of this type of infringement as a phishing attack vector - highlights the scale of this infringement type as a potential source of concern. Consequently, brand owners may wish to consider proactively registering or acquiring domain names comprising truncated versions (where the right-hand end is retained) of their core domain name, to prevent registration and abuse by a third party. In cases where acquisition is not possible, it may be advisable to monitor the hostname-based infringements for future changes in content and - if and when active infringing content is detected - to launch a timely enforcement action for the takedown of the material.

Appendix

Top 50 most popular websites according to Similarweb (October 2022).

Rank   Website   Category
1   google[.]com   Computers Electronics and Technology → Search Engines
2   youtube[.]com   Arts & Entertainment → Streaming & Online TV
3   facebook[.]com   Computers Electronics and Technology → Social Media Networks
4   twitter[.]com   Computers Electronics and Technology → Social Media Networks
5   instagram[.]com   Computers Electronics and Technology → Social Media Networks
6   baidu[.]com   Computers Electronics and Technology → Search Engines
7   wikipedia[.]org   Reference Materials → Dictionaries and Encyclopedias
8   yandex[.]ru   Computers Electronics and Technology → Search Engines
9   yahoo[.]com   News & Media Publishers
10   xvideos[.]com   Adult
11   whatsapp[.]com   Computers Electronics and Technology → Social Media Networks
12   pornhub[.]com   Adult
13   amazon[.]com   eCommerce & Shopping → Marketplace
14   xnxx[.]com   Adult
15   yahoo[.]co[.]jp   News & Media Publishers
16   live[.]com   Computers Electronics and Technology → Email
17   netflix[.]com   Arts & Entertainment → Streaming & Online TV
18   docomo[.]ne[.]jp   Computers Electronics and Technology → Telecommunications
19   tiktok[.]com   Computers Electronics and Technology → Social Media Networks
20   reddit[.]com   Computers Electronics and Technology → Social Media Networks
21   office[.]com   Computers Electronics and Technology → Programming and Developer Software
22   linkedin[.]com   Computers Electronics and Technology → Social Media Networks
23   dzen[.]ru   Community and Society → Faith and Beliefs
24   vk[.]com   Computers Electronics and Technology → Social Media Networks
25   xhamster[.]com   Adult
26   samsung[.]com   Computers Electronics and Technology → Consumer Electronics
27   turbopages[.]org   News & Media Publishers
28   mail[.]ru   Computers Electronics and Technology → Email
29   bing[.]com   Computers Electronics and Technology → Search Engines
30   naver[.]com   News & Media Publishers
31   microsoftonline[.]com   Computers Electronics and Technology → Programming and Developer Software
32   twitch[.]tv   Games → Video Games Consoles and Accessories
33   discord[.]com   Computers Electronics and Technology → Social Media Networks
34   bilibili[.]com   Arts & Entertainment → Animation and Comics
35   pinterest[.]com   Computers Electronics and Technology → Social Media Networks
36   zoom[.]us   Computers Electronics and Technology → Other Computers Electronics and Tech.
37   weather[.]com   Science and Education → Weather
38   qq[.]com   News & Media Publishers
39   microsoft[.]com   Computers Electronics and Technology → Programming and Developer Software
40   globo[.]com   News & Media Publishers
41   roblox[.]com   Games → Video Games Consoles and Accessories
42   duckduckgo[.]com   Computers Electronics and Technology → Search Engines
43   news[.]yahoo[.]co[.]jp   News & Media Publishers
44   quora[.]com   Reference Materials → Dictionaries and Encyclopedias
45   msn[.]com   News & Media Publishers
46   realsrv[.]com   Adult
47   fandom[.]com   Arts & Entertainment → Other Arts and Entertainment
48   ebay[.]com   eCommerce & Shopping → Marketplace
49   aajtak[.]in   News & Media Publishers
50   ok[.]ru   Computers Electronics and Technology → Social Media Networks

References

[1] https://www.similarweb.com/top-websites/ (data correct for October 2022) 

[2] All observations correct as of 11-Nov-2022

[3] Excluding duplicates

[4] https://en.wikipedia.org/wiki/Wildcard_DNS_record

[5] https://www.cscdbs.com/en/resources-news/impact-of-covid-on-internet-security/

[6] https://unit42.paloaltonetworks.com/strategically-aged-domain-detection/

This article was first published on 7 February 2023 at:

https://www.linkedin.com/pulse/exploring-domain-hostname-based-infringements-david-barnett/
