Tuesday, 6 December 2022

Threatening domains targeting the top ten most valuable brands (homoglyph and typo domains)

Key findings and highlights

  • Between August 2021 and August 2022, CSC identified 8,552 unique domain names comprising a very close match to the brand names of the global top ten most valuable companies, with more than 99% of those (where information is available) registered by third parties, leaving those brands vulnerable to targeted attacks by bad actors. Domains definitively determined to be official were excluded from the remainder of the analysis.
  • Where registration information is available, two thirds of the domains used domain privacy services - indicating an intention by the owner to mask their identity - or have redacted information.
  • 56% of the domains still registered at the time of analysis resolved to a live webpage. Among the live sites, we observed a range of high-concern content types, including fraud issues like potential phishing sites, and other brand infringements.
  • 35% of the registered domains were configured with active MX (mail exchange) records, indicating their ability to send and receive e-mails, making them capable of launching phishing attacks.
  • Many of the domain names are chosen to appear deceptively similar to official brand domain names or feature common typo variants to catch misdirected web traffic. Domains using non-Latin characters raise the greatest potential for confusion (and thereby for fraudulent use) as they can appear almost identical to their Latin equivalents, for example:
    • amaƶon.com 
    • ɑpple.com 
    • faceboo.com 
    • gᴑᴑgle.com 
    • mıcrosoft.com

* * *

Methodology of the analysis

In this analysis, we dive into potentially the most egregious and threatening set of domains targeting the top ten most valuable company brands in 2022[1]. These domains are those where the second-level domain name (SLD) - the part of the domain name to the left of the dot - consists solely of an exact or very close match to the targeted brand name.

We used CSC’s 3D Domain Security and Enforcement technology - powered by the DomainSec℠ platform, which uses proprietary Machine Learning Deep Search (MLDS) technology and combines machine learning, artificial intelligence, and clustering technology to identify leading indicators of compromise - to conduct the analysis. We considered new registrations (N), re-registrations (R) or drops (D) - collectively referred to as domain registration activity events - over the period August 2021 to August 2022.

The study focuses on domains containing any of the following brand variants as the SLD:

  • Exact matches - where the SLD is identical to the brand term under consideration, but with a different extension (TLD) to the official brand website (i.e. a 'cousin' domain).
  • Homoglyph matches - where one or more characters in the brand term are replaced by a visually similar, non-Latin character.
  • Fuzzy matches - featuring typos (misspellings) affecting a single character, covering missing characters, additional characters, transposed characters, and other character replacements.

Each of these domain names therefore appears extremely similar to that of the brand's official site and raises significant potential for targeted attacks. These variants may have been deliberately selected by those registering the domains to be confusing or to circumvent detection efforts by brand owners, who may be monitoring only for exact matches to the brand string. Such activity presents significant threat vectors to the targeted brands, as those domains can be used for a variety of purposes, including active fraudulent use (e.g. creating phishing sites) or potential brand confusion and traffic misdirection, by taking advantage of the status of and consumer trust in the brand being infringed, or attracting traffic from mis-typed browser requests or search engine queries.
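The fuzzy-match categories above can be illustrated with a short sketch that enumerates every string a single edit away from a brand term. This is a minimal illustration only, not CSC's detection technology: the function name, the candidate character set (ASCII letters, digits and hyphen) and the grouping keys are all assumptions made for the example, and homoglyph substitutions are not covered.

```python
import string

# Assumed candidate characters for insertions and replacements (ASCII only).
ALPHABET = string.ascii_lowercase + string.digits + "-"


def single_edit_variants(brand: str) -> dict[str, set[str]]:
    """Return every string one edit away from `brand`, grouped by edit type."""
    variants = {"missing": set(), "extra": set(), "transposed": set(), "replaced": set()}
    for i in range(len(brand)):
        # Missing character: drop the character at position i.
        variants["missing"].add(brand[:i] + brand[i + 1:])
        # Replaced character: substitute a different character at position i.
        for ch in ALPHABET:
            if ch != brand[i]:
                variants["replaced"].add(brand[:i] + ch + brand[i + 1:])
    for i in range(len(brand) + 1):
        # Extra character: insert one character at position i.
        for ch in ALPHABET:
            variants["extra"].add(brand[:i] + ch + brand[i:])
    for i in range(len(brand) - 1):
        # Transposed characters: swap an adjacent pair (identical pairs change nothing).
        if brand[i] != brand[i + 1]:
            variants["transposed"].add(brand[:i] + brand[i + 1] + brand[i] + brand[i + 2:])
    # A domain label may not start or end with a hyphen, so discard those strings.
    return {
        kind: {v for v in group if v and not v.startswith("-") and not v.endswith("-")}
        for kind, group in variants.items()
    }


if __name__ == "__main__":
    for kind, group in single_edit_variants("apple").items():
        print(kind, len(group))
```

Even for a five-letter brand, the single-edit space runs to several hundred strings per TLD, which is one reason monitoring only for exact matches leaves gaps.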

Findings

1. Domain activity

Over the analysis period, CSC identified more than 11,000 domain registration activity events across the ten brands under consideration, focusing only on the very close matches as described in the methodology. As with a previous CSC study considering deceptive domains with names beginning 'www' or 'http'[2], we saw continuous activity across the period (Figure 1).

Figure 1: Daily numbers of new registrations (N), re-registrations (R) and dropped (D) domains with names consisting of a close match to any of the ten brand names under consideration

In some cases, individual spikes in activity can be tied to specific events or coordinated registration campaigns. For example, a batch of 56 domains featuring misspellings of 'mcdonalds.co.uk' was registered on 27-Aug-2021, followed by 40 domains on 01-Sep-2021 which comprised 'microsoft' typos across the .biz and .one extensions, and then 46 'google.fr' and five 'amazon.fr' typo variants on 16-Oct-2021. Additionally, a set of domains featuring 'amazon' (or typos) and 'ten-cent' as the SLD was registered on 22-Apr-2022 across a range of different new gTLD extensions. Previous research by CSC[3] established that a peak in registrations of domain names combining a key pharmaceutical brand with energy-related keywords coincided with the launch of the Energize programme[4].

In total, we identified 8,552 unique domain names in the dataset. For the active domains where whois information was available, only 72 domains (<1%) were explicitly registered by the official brand owners - presumably as defensive registrations or acquired domains to prevent third-party use. The remaining domains were registered by third parties. The analysis presented here focuses on these 8,480 third-party domains.

2. Brand variant types

Several types of brand variants (listed below) were present in the set of third-party SLDs, with their relative proportions within the dataset shown in Figure 2.

  • Exact match (i.e. 'cousin' domains)
  • Missing character
  • Extra character
  • Transposed characters (i.e. a swapped pair of adjacent characters)
  • Replaced characters - either non-Latin homoglyphs or other character replacements, with the number of replaced characters shown in Figure 2
N.B. Those domains with multiple character substitutions are generally homoglyph (internationalised-character) domain names, since the fuzzy-match searches (covering other character substitutions) explicitly focus on close matches with a single character differing from the search term.
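As a rough illustration of how an observed SLD might be assigned to one of the variant types listed above, the following sketch compares it with the brand term. The function name and the returned labels are assumptions for the example; the comparison treats each Unicode character as one position and makes no attempt to decide whether a replacement is a homoglyph.

```python
def classify_variant(brand: str, sld: str) -> str:
    """Assign `sld` to a variant type relative to `brand` (illustrative labels)."""
    if sld == brand:
        return "exact"
    if len(sld) == len(brand) - 1:
        # Missing character: deleting one character from the brand gives the SLD.
        if any(brand[:i] + brand[i + 1:] == sld for i in range(len(brand))):
            return "missing character"
    if len(sld) == len(brand) + 1:
        # Extra character: deleting one character from the SLD gives the brand.
        if any(sld[:i] + sld[i + 1:] == brand for i in range(len(sld))):
            return "extra character"
    if len(sld) == len(brand):
        diffs = [i for i, (a, b) in enumerate(zip(brand, sld)) if a != b]
        # Transposition: exactly two adjacent positions differ and are swapped.
        if (
            len(diffs) == 2
            and diffs[1] == diffs[0] + 1
            and brand[diffs[0]] == sld[diffs[1]]
            and brand[diffs[1]] == sld[diffs[0]]
        ):
            return "transposed characters"
        return f"{len(diffs)} replaced character(s)"
    return "no close match"


if __name__ == "__main__":
    for brand, sld in [("google", "googel"), ("amazon", "aamazon"),
                       ("google", "googe"), ("facebook", "fakebook")]:
        print(brand, sld, "->", classify_variant(brand, sld))
```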

The high potential for confusion between these domain names and those of the official brands' websites poses a significant threat to their security postures, as well as the risk of infringing use by bad actors for phishing activity, or other traffic misdirection.

Figure 2: Frequency of brand variant types used in the dataset of unique domains

Within the dataset, 3% of the domains featured an exact match to the targeted brand name (i.e. cousin domains), 44% featured a missing or extra character, 3% featured transposed characters, and 50% featured one or more replaced characters. In total, just over 3% of the dataset featured homoglyph domains, i.e. those incorporating non-ASCII characters (characters outside the basic Latin letters, digits and hyphen).

3. Observations on frequently occurring features within the dataset of unique domains

A. Top TLDs

The following are the most popular TLDs represented within the dataset of 8,480 third-party domains.

  TLD  No. domains
  .com 856
  .xyz 564
  .net 302
  .top 295
  .shop 260
  .co.uk 236
  .online 221
  .live 205
  .store 205
  .info 185
  .org 184
  .site 184
  .fr 162
  .club 159
  .work 150

As has been seen with previous studies[5,6,7], this list of extensions is dominated by popular TLDs (e.g. .com), and new gTLDs (e.g. .xyz and .top), which are proven to be popular with infringers.

B. Brand variants consisting of transposed characters

Within the dataset of unique domains, the following are the most observed brand variants where a pair of adjacent characters have been swapped. In total, 253 of the domains were found to incorporate transposed characters.

  Brand variant  No. domains
  googel 29
  appel 25
  goolge 23
  micorsoft 17
  faecbook 15
  amzaon 14
  mircosoft 13
  amazno 13
  gogole 13
  amaozn 11

C. Most popular character replacements

The dataset of unique domains encompassed more than 3,000 individual replaced characters[8].

The top ten most common replacements with Latin alphabet characters were:

Character replacement  No. instances
l → i 93
n → m 59
o → i 52
a → o 47
c → k 47
o → a 45
a → e 42
o → g 40
g → d 38
g → b 37

The top ten most common replacements with other characters (non-Latin characters and numbers) were:

Character replacement  No. instances
o → 0 78
l → 1 24
o → õ 20
o → ó 17
o → ö 15
l → ł 15
e → 3 13
e → é 12
z → - 10
o → ᴏ 10

In many of these cases, the characters have been replaced with visually similar alternatives to create a convincing lookalike domain name. There are instances, however - notably in the Latin alphabet character replacements - where characters have been replaced with characters that are adjacent to them on a standard QWERTY keyboard (e.g. n → m, o → i, g → b, etc.) presumably to try to catch misdirected traffic from browser requests containing common typing errors. It is also worth noting that some of the common character replacements observed in the dataset may relate to true third-party brand use or unrelated terms (e.g. all the observed g → m replacements in the dataset appear as the term 'moogle', which could refer to a creature in the Final Fantasy gaming series).
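One way to separate likely keyboard slips from visual lookalikes is to test whether the two characters sit next to each other on the keyboard. The sketch below is a simplified model, not part of the original analysis: it covers only the three letter rows of a QWERTY layout, and the row offsets (in key widths) are approximate values assumed for the example.

```python
# Letter rows of a QWERTY layout with an assumed horizontal offset (in key
# widths) for each row relative to the top letter row.
QWERTY_ROWS = [("qwertyuiop", 0.0), ("asdfghjkl", 0.25), ("zxcvbnm", 0.75)]

# Map each key to an (x, row) position.
KEY_POSITIONS = {
    key: (index + offset, row)
    for row, (keys, offset) in enumerate(QWERTY_ROWS)
    for index, key in enumerate(keys)
}


def is_keyboard_adjacent(a: str, b: str) -> bool:
    """True if keys `a` and `b` are neighbours in this simplified layout."""
    if a == b or a not in KEY_POSITIONS or b not in KEY_POSITIONS:
        return False
    (xa, ya), (xb, yb) = KEY_POSITIONS[a], KEY_POSITIONS[b]
    # Neighbours are at most one row apart and at most one key width apart.
    return abs(ya - yb) <= 1 and abs(xa - xb) <= 1


if __name__ == "__main__":
    for original, replacement in [("n", "m"), ("o", "i"), ("g", "b"), ("l", "i"), ("c", "k")]:
        print(original, "->", replacement, is_keyboard_adjacent(original, replacement))
```

Under this model, n → m, o → i and g → b are adjacent pairs, whereas l → i and c → k are not, which is consistent with reading the latter as visual or phonetic substitutions rather than typing errors.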

D. Top registrant and registrar characteristics

Among the dataset, we found whois (domain registration) information for 4,513 domains, of which two thirds (3,007) used domain privacy services or had redacted registration information. Use of anonymisation services demonstrates a domain owner's attempt to mask their identity and could indicate nefarious intentions[9].

The following tables show the top organisations and countries given in the sets of registration contact details for the domains.

Most common registrant organisations:

  Organisation  No. domains
  Domains By Proxy, LLC 752
  Privacy service provided by Withheld for Privacy ehf 401
  See PrivacyGuardian.org 245
  Contact Privacy Inc. Customer 7151571251 153
  Private by Design, LLC 134

Most common registrant countries:

  Country  No. domains
  US (United States) 1,809
  CN (China) 624
  IS (Iceland) 447
  CA (Canada) 235
  JP (Japan) 94
  RU (Russia) 91
  DE (Germany) 71
  GB (United Kingdom) 71
  VN (Vietnam) 43
  UA (Ukraine) 41

The following table shows the top registrars through which the domains in the dataset were registered.

Most common registrars:

  Registrar  No. domains
  GoDaddy.com, LLC 839
  Namecheap, Inc. 485
  NameSilo, LLC 286
  Dynadot LLC 263
  Alibaba Cloud Computing Ltd. d/b/a HiChina 148
  DNSPod, Inc. 142
  Porkbun LLC 176
  Name.com, Inc. 111
  PDR Ltd. d/b/a PublicDomainRegistry.com 106
  Google LLC 97

The registrar landscape within the dataset is dominated by consumer-grade providers, a trend which has also been previously seen in other CSC studies of domains registered for potentially infringing use[10].

4. Cybersecurity, fraud protection, and brand protection observations

Of the 8,480 unique third-party domains in the dataset, 4,552 (54%) were still registered at the time of analysis (i.e. those domains where the most recent activity event was a registration or re-registration). Just less than a third of the domains - equivalent to 56% of the registered domains - were found to produce a live website response[11]. Furthermore, 1,590 domains (19% of the dataset, or 35% of the registered domains) were configured with active MX records, indicating that they can send or receive e-mails (e.g. for use in a phishing attack). This is possible even when there is no live site content. Additionally, as noted in previous studies[12], dormant domains also have potential to be used fraudulently in the future, so monitoring for changes to configuration status and site content is advisable.
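The percentages quoted above follow directly from the published counts, as the short check below shows. The count of live sites is not stated in the text, so the figure used here is an estimate derived from the published 56% share and should be read as an assumption.

```python
third_party = 8480   # unique third-party domains in the dataset
registered = 4552    # domains still registered at the time of analysis
with_mx = 1590       # domains configured with active MX records

# Assumption: live-site count estimated from the published 56% share.
estimated_live = round(0.56 * registered)


def pct(part: int, whole: int) -> int:
    """Percentage of `whole` represented by `part`, rounded to a whole number."""
    return round(100 * part / whole)


if __name__ == "__main__":
    print("registered / dataset:", pct(registered, third_party), "%")
    print("live (est.) / dataset:", pct(estimated_live, third_party), "%")
    print("MX / dataset:", pct(with_mx, third_party), "%")
    print("MX / registered:", pct(with_mx, registered), "%")
```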

Live sites in the dataset featured a range of content, including: lookalike sites; websites using potentially unauthorised official branding; third-party sites using similar branding; sites featuring content relating to third parties operating in a similar business area as the infringed brand; and gambling-related or adult material. Additionally, many displayed pay-per-click links - as a means of monetising the web traffic - or holding pages offering to sell the domain names, i.e. potential cybersquatting. Many displayed browser warnings indicating threatening content is or has been present.

Figure 3 shows examples of some of the most significant infringements identified within the dataset, with the associated SLD (domain-name string) shown in square brackets in each case. These include websites trying to pass off as the brand in question (as part of a fraudulent brand-impersonation attack, for example), unauthorised use of official intellectual property (IP) - particularly concerning if the content provides an undesirable brand association - or generating revenue for a third party operating in a similar business area through misdirecting web traffic which is arguably intended for the official brand.

Figure 3: Examples of brand infringements identified within the dataset: (a) Lookalike site [googe]; (b) Potential phishing [mc-donalds]; (c) Potential unauthorised use of branding [aamazon]; (d) Similar branding [armazon]; (e) Third-party content in related business area [fgoogle]; (f) Other brand infringement [fakebook]

Across the dataset, 3% of the domains included at least one character outside a standard Latin (ASCII) character set (i.e. domains that can be represented in Punycode[13] format). Many of these are visually almost identical to the official domain name for the brand in question, with no additional, missing or transposed characters, and therefore could be used as highly convincing attack vectors. Around 10% of these were configured with active MX records, indicating e-mail capability.
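To show what the Punycode representation of such a domain looks like, the sketch below converts each non-ASCII label using Python's built-in `punycode` codec and adds the `xn--` prefix. It is a simplification: the helper names are assumptions, and the normalisation and validation steps that a full IDNA implementation applies before encoding are deliberately skipped.

```python
def to_ascii_domain(domain: str) -> str:
    """Encode non-ASCII labels as 'xn--' + Punycode (no IDNA validation)."""
    labels = []
    for label in domain.lower().split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


def from_ascii_domain(domain: str) -> str:
    """Decode 'xn--' labels back to their Unicode form."""
    labels = []
    for label in domain.split("."):
        if label.startswith("xn--"):
            labels.append(label[4:].encode("ascii").decode("punycode"))
        else:
            labels.append(label)
    return ".".join(labels)


if __name__ == "__main__":
    for name in ["mıcrosoft.com", "ɑpple.com", "apple.com"]:
        encoded = to_ascii_domain(name)
        print(name, "->", encoded, "->", from_ascii_domain(encoded))
```

Any label beginning `xn--` in a monitoring feed is therefore worth decoding and comparing visually against the brand term.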

Examples of highly convincing lookalike domain names:

  • amazoņ.com
  • amaƶon.com
  • aþþle.com
  • ɑpple.com
  • faceboo.com
  • ƭacebook.com
  • googɫe.com
  • gᴑᴑgle.com
  • mcdonaldƨ.com
  • micrsoft.com
  • mıcrosoft.com

As shown earlier in Figure 2, the majority of domains incorporating character replacements featured a single replaced character. In part, this reflects the fact that the fuzzy searches explicitly focused on matches only one character different. In the dataset, we identified 33 domains where at least half of the characters were replaced by homoglyphs, e.g. gõõgłé.com and fäcëböök.com.
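The 'at least half of the characters replaced' criterion can be expressed as a simple positional comparison, sketched below. The function name is an assumption, and the approach only applies where the lookalike has the same number of characters as the brand term; the SLD is normalised to NFC first so that an accented letter counts as one character.

```python
import unicodedata


def replaced_fraction(brand: str, sld: str) -> float:
    """Fraction of positions at which `sld` differs from `brand`."""
    # Compose base letter + combining mark into a single character.
    sld = unicodedata.normalize("NFC", sld)
    if len(sld) != len(brand):
        raise ValueError("positional comparison needs equal-length strings")
    replaced = sum(1 for a, b in zip(brand, sld) if a != b)
    return replaced / len(brand)


if __name__ == "__main__":
    for brand, sld in [("google", "gõõgłé"), ("facebook", "fäcëböök")]:
        share = replaced_fraction(brand, sld)
        print(sld, f"{share:.2f}", "at least half" if share >= 0.5 else "under half")
```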

The full list includes a number of examples where all of the character replacements make use of non-Latin versions that look almost identical to the original characters (e.g. ɡᴏoɡle.com), or where the alternative version simply appears visually to be just in a different font (e.g. Fᴀᴄᴇʙᴏᴏᴋ.com). These could form the basis of extremely deceptive infringements.

We found 183 distinct characters (including non-Latin homoglyphs from numerous alphabets and character sets) being used as character replacements across the full dataset, as shown here.

0 1 2 3 4 5 6 7 8 9 - ǀ a ᴀ á à ȧ â ä ǎ ă ā ã å ą ạ ɑ ɒ æ b ʙ ḃ ɓ ḇ ƅ c ᴄ ć ċ ĉ ç ƈ d e ᴇ é è ė ê ë ě ē ę ɇ ẻ ẹ ᶒ ḛ ǝ ə ɘ ɛ f ꜰ ḟ ƒ g ɢ ǵ ġ ĝ ǧ ğ ḡ ģ ɠ ɡ h i ı í ì ǐ ī į ɨ ỉ ị ɩ j k ᴋ ĸ ḱ ķ ƙ ḵ l ĺ ɫ ļ ł ɭ ḻ ꞁ m ṃ n ń ñ ņ ᶇ ṇ o ᴏ ᴑ ó ò ȯ ô ö ŏ ō õ ő ɵ ø ỏ ơ ọ ở œ p ṗ q r ʀ s ƨ ẝ t ť ƭ ṭ ṯ þ u ü v w ƿ ʍ x y z ᴢ ź ž ƶ ʑ ʘ а е і ї р ꓐ ꓑ ꓖ ꓗ ꓜ ꓝ ꓟ ꓠ ꓡ ꓮ ꓰ ꓳ  

Summary

CSC's findings highlight the degree to which trusted brands can be targeted by bad actors, which presents significant threats to their security postures, revenue, and reputations. They also highlight the need for domain intelligence as part of a layered security approach, including comprehensive domain monitoring as the foundation of a holistic brand protection initiative.

CSC's 3D Domain Security and Enforcement solution (powered by our DomainSec platform) can provide this overview, enabling brand owners to have visibility of domain activity encompassing detection of brand variants, including fuzzy matches like typos and homoglyph domains, and soundalike variations, across a range of domain name extensions. The system also incorporates MLDS technology to intelligently modify monitoring based on previously identified infringements.

The closeness of the match of a third-party domain name to that of the brand owner's official name is also one key input into algorithms that can help determine the threat level that may be posed in the future by that domain. These concepts are central to the idea of threat scoring, which can be used to prioritise infringements for analysis, future monitoring, and enforcement.

References

[1] https://www.kantar.com/inspiration/brands/what-are-the-most-valuable-global-brands-in-2022; the brand terms used in our analysis are apple; google; amazon; microsoft; tencent; mcdonalds; visa; facebook; alibaba; vuitton.

[2] https://www.cscdbs.com/blog/registration-patterns-of-deceptive-domains/

[3] 'Domain registration patterns analysis' (unpublished)

[4] https://www.se.com/ww/en/about-us/newsroom/news/press-releases/10-global-pharmaceutical-companies-launch-first-of-its-kind-supplier-program-to-advance-climate-action-6182848cf01af478b619ddd4

[5] https://www.cscdbs.com/blog/branded-domains-are-the-focal-point-of-many-phishing-attacks/

[6] https://circleid.com/posts/20210908-credential-hinting-domain-names-a-phishing-lure

[7] https://unit42.paloaltonetworks.com/top-level-domains-cybercrime/

[8] In this analysis, we exclude examples where the replaced-character version forms an alternative word in its own right (e.g. 'apply' or 'ample' for 'apple'), and all replacements for the Visa brand (since the small number of characters means that many of the variations are significantly different from the brand name and may pertain to unrelated third-party use), as the replaced versions in these cases may not be intended to be deceptive variants of the brand name in question.

[9] https://www.cscdbs.com/en/resources-news/supply-chain-report/

[10] https://www.cscdbs.com/en/resources-news/impact-of-covid-on-internet-security/

[11] Those returning an HTTP status code of 200

[12] https://unit42.paloaltonetworks.com/strategically-aged-domain-detection/

[13] https://en.wikipedia.org/wiki/Punycode

This article was first published on 6 December 2022 at:

https://www.cscdbs.com/en/resources-news/threatening-domains-targeting-top-brands/

Thursday, 20 October 2022

The Highest-Threat TLDs – Part 1

by Justin Hartland and David Barnett

A domain name consists of two main elements: the second-level domain name to the left of the dot - often consisting of a brand name or relevant keywords - and the domain extension or top-level domain (TLD) to the right of the dot. Domain names form the key elements of the readable web addresses allowing users to access pages on the Internet and also allow the construction of e-mail addresses.

There are different types of TLDs, including generic or global TLDs (gTLDs) that were originally intended to provide a description of the site type, such as .com for company websites or .org for charitable organisations. There are also country-code TLDs (ccTLDs) for specific countries, e.g. .co.uk for the UK, .fr for France, etc. Finally, there is a range of new gTLDs that have launched since 2013[1], usually relating to specific content types, business areas, interests, or geographic locations (e.g. .shop, .club, .tokyo). Each TLD is overseen by a registry organisation, which manages its infrastructure.

Domain names are associated with the full spectrum of Internet content, from legitimate use by brands or individuals, to infringing or criminal activity. CSC has observed that certain TLDs get used more for egregious content.

There are several possible reasons why particular TLDs are more attractive to infringers, including the cost of domain registration, and difficulties in conducting enforcement (takedown) actions against infringing content. TLDs operated by certain registries, like those offering low- or no-cost domain registrations or those with lax registration security policies, are more likely to be used for infringing activities. Additionally, domain extensions lacking well-defined, reliable enforcement routes like .vn (Vietnam) and .ru (Russia) prove to be especially high risk. Other factors are also significant; for example, a country's wealth affects the levels of technical expertise of Internet service providers (ISPs) and therefore the likelihood of domains being compromised.

In this two-part blog post, we aim to quantify the threat levels associated with specific domain extensions, i.e. the likelihood that a domain on a particular TLD might be registered for fraudulent purposes.

Part 1: Phishing site TLDs

Determining the overall threat frequency for each TLD is useful in several ways:

  • Helping to prioritise results identified via a brand protection service. For example, the TLD can be used to identify top targets for future tracking for content changes.
  • Identifying TLDs where it is advisable to register domains featuring key brand-related strings defensively to avoid them being registered by third parties with malicious intent.
  • Identifying TLDs where it is advantageous for brand protection service providers to offer blocks or alerts when, for example, a third party attempts to register a domain containing a brand-related term.

Analysis and discussion

For this first post, we analysed data from CSC's Fraud Protection services to uncover the TLDs associated with domains used for phishing activity. The analysis covers all sites detected between November 2021 and April 2022 for those TLDs with more than 10 phishing cases and where domain-based phishing cases were recorded (as opposed to subdomain-based). This yielded results for 115 distinct TLDs.

We also consider the frequency of domain use associated with threatening content across the TLD in question. We do this by expressing the raw numbers as a proportion of the total number of domains registered across the TLD[2]. We then normalise the data, so the value for the highest-threat TLD is 1, with all other values in that dataset scaled accordingly. It is important to note that this value reflects the proportion of malicious domains across each TLD, rather than absolute numbers. Some other TLDs see high numbers of infringements by virtue of the total numbers of domain registrations across these extensions. Table 1 shows the top 20 TLDs represented in CSC's phishing dataset (by absolute numbers), together with the normalised threat frequencies for these TLDs.

  TLD   % of total phishing cases   Total no. of regd. domains across TLD   Normalised threat frequency within dataset
  .com 45.7% 221,858,334     0.014
  .org 6.9% 15,550,733     0.031
  .app 6.2% 1,155,807     0.377
  .net 4.8% 19,773,315     0.017
  .xyz 2.5% 10,841,304     0.016
  .ru 2.5% 10,627,033     0.016
  .co 2.1% 4,110,132     0.035
  .cn 1.7% 25,147,816     0.005
  .me 1.3% 1,669,800     0.054
  .dev 1.2% 391,929     0.222
  .br 1.2% 5,519,378     0.015
  .top 1.2% 8,830,142     0.009
  .io 1.1% 923,588     0.085
  .in 1.1% 3,271,337     0.023
  .page 1.0% 368,474     0.195
  .id 0.9% 760,240     0.080
  .icu 0.8% 7,956,385     0.007
  .info 0.8% 7,852,896     0.007
  .de 0.7% 22,881,115     0.002
  .ke 0.7% 165,907     0.288

Table 1: Top 20 TLDs represented in CSC's phishing dataset, by absolute numbers

We have observed similar patterns in other analyses of threatening content. Interisle's 'Malware Landscape 2022' study found that the top 10 TLDs associated with malware domains also featured a mix of legacy gTLDs (.com at position one, .net at five, .org at six, and .biz at 10), new gTLDs (.xyz at position two, .club at seven, and .top at nine) and ccTLDs (.br, .in, and .ru at positions three, four and eight, respectively)[3]. Eight of these 10 extensions feature in the top 14 of CSC's phishing list above. Similarly, the Anti-Phishing Working Group's (APWG's) 'Phishing Activity Trends Report' for Q4 2021 analysed top phishing TLDs, with a top nine including new gTLDs .xyz, .buzz and .vip, and ccTLDs .br and .in, alongside legacy gTLDs.

In the APWG dataset, new gTLDs were more than twice as heavily represented as would be expected purely from the total number of domains registered across these extensions[4]. A Q1 2022 study by Agari and PhishLabs also showed similar patterns, where the top 10 TLDs abused by phishing (by number of sites) included the new gTLDs .vip, .xyz and .monster, and ccTLDs .br, .ly, and .tk[5,6].

Table 2 shows the pattern is rather different when looking at the top TLDs by their normalised threat frequency; the list is dominated by a distinct set of ccTLDs, a smaller number of new gTLDs, and excludes many of the more popular TLDs shown previously.

  TLD   Normalised threat frequency within dataset   Total no. of regd. domains across TLD   % of total phishing cases
  .gd 1.000 3,306     0.05%
  .gy 0.910 4,037     0.05%
  .ms 0.739 9,440     0.10%
  .zm 0.531 4,838     0.04%
  .app 0.377 1,155,807     6.21%
  .ly 0.356 25,801     0.13%
  .ke 0.288 165,907     0.68%
  .dev 0.222 391,929     1.24%
  .page 0.195 368,474     1.03%
  .ug 0.187 10,810     0.03%
  .sn 0.187 9,842     0.03%
  .do 0.176 30,215     0.08%
  .bd 0.127 37,465     0.07%
  .sbs 0.120 44,222     0.08%
  .np 0.112 57,379     0.09%
  .sh 0.110 25,070     0.04%
  .ng 0.097 240,668     0.33%
  .io 0.085 923,588     1.11%
  .id 0.080 760,240     0.86%
  .sa 0.079 60,246     0.07%

Table 2: Top 20 TLDs represented in CSC's phishing dataset, by normalised threat frequency
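The normalisation described earlier (phishing cases as a proportion of registered domains, scaled so that the highest-threat TLD has a value of 1) can be sketched as follows, using a handful of rows from Tables 1 and 2. The function and variable names are assumptions. Because the tables give the share of phishing cases as rounded percentages rather than raw counts, the output only approximates the published values (closely for .com, less so for smaller TLDs).

```python
# (share of total phishing cases in %, total registered domains), from Tables 1 and 2
TLD_DATA = {
    ".com": (45.7, 221_858_334),
    ".app": (6.21, 1_155_807),
    ".ke": (0.68, 165_907),
    ".gd": (0.05, 3_306),
}


def normalised_threat_frequency(data: dict[str, tuple[float, int]]) -> dict[str, float]:
    """Scale cases-per-registered-domain so the highest value equals 1."""
    # The share of cases stands in for the raw case count; the constant factor
    # relating the two cancels out in the normalisation.
    rates = {tld: share / domains for tld, (share, domains) in data.items()}
    highest = max(rates.values())
    return {tld: rate / highest for tld, rate in rates.items()}


if __name__ == "__main__":
    scores = normalised_threat_frequency(TLD_DATA)
    for tld, score in sorted(scores.items(), key=lambda item: -item[1]):
        print(f"{tld:6} {score:.3f}")
```

The sketch makes the key point of the two tables visible: .com accounts for by far the largest share of cases, yet ranks lowest of these four once the size of the TLD is taken into account.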

In the second article in this series, we compare these findings with those from additional datasets to produce an overall measure of TLD threat frequency, considering a range of fraudulent uses. We then consider cybersecurity implications, discuss mitigation measures, and cover how CSC can help with this process.

References

[1] https://newgtlds.icann.org/en/program-status/delegated-strings

[2] https://domainnamestat.com/statistics/tldtype/all (statistics correct as of 13 June 2022)

[3] https://interisle.net/MalwareLandscape2022.pdf

[4] https://docs.apwg.org/reports/apwg_trends_report_q4_2021.pdf

[5] https://info.phishlabs.com/hubfs/Agari%20PhishLabs_QTTI%20Report%20-%20May%202022.pdf

[6] https://www.tripwire.com/state-of-security/security-data-protection/phishing-threat-trends-intelligence-report/

This article was first published on 20 October 2022 at:

https://www.cscdbs.com/blog/the-highest-threat-tlds-part-1/

Also published at:

https://circleid.com/posts/20230112-the-highest-threat-tlds-part-1

Tuesday, 18 October 2022

Energy-crisis-related scams highlight how bad actors seek to capitalise on global events

Fraudsters can be counted on to be quick to take advantage of those who may be struggling, and the latest example is the cost-of-energy crisis. Our uncovering of related scams in the UK follows numerous previous studies illustrating how real-world events can trigger associated spikes in online infringement activity, including efforts focused on the invasion of Ukraine[1] and the pandemic[2].

Events such as the war in Ukraine and associated supply-chain issues have triggered huge rises in the cost of energy, resulting in support programmes being introduced by governments. In the UK, for example, the Energy Price Guarantee[3] (which reduces energy unit costs to consumers) and Energy Bills Support Scheme[4] (providing an automatic energy payment rebate) come into effect this month, in addition to energy price caps for corporations.

In response to these initiatives, bad actors have instigated a range of phishing campaigns designed to harvest users' personal information, under the guise of soliciting applications for participation in the schemes.

In the two examples shown below, we identified SMS messages of a similar style (sent on 26 September), directing users to phishing sites hosted on the domains via-rebate-scheme[.]com and energy bills-support[.]com. 


Figure 1: Examples of SMS messages directing users to phishing sites related to the UK Energy Bill [sic] Support Scheme

The two domains in question had been registered in the previous few days (25 and 21 September, respectively), and both had redacted whois records. Neither of the sites was active by the time of analysis (on 26 September). 

Searches for reports of other scams featuring similar text revealed that several additional domain names had also been utilised in scams of this type, with a selection of examples listed below:

  • energy-bill-online[.]com
  • energy-bill-support[.]com
  • energybills-rebate[.]com
  • my-energybill-online[.]com
  • mygov-energy-help[.]com
  • online-energybill-rebate[.]com
  • rebate-application[.]com
  • support-rebatescheme[.]com
  • energy[.]bill-rebate[.]com

The majority of these sites were inactive by the date of analysis; however, two of the above domains were found still to resolve to active sites – displaying very convincing lookalikes of the government's official 'gov.uk' sites, including webforms prompting for the input of names, dates of birth, mobile numbers, and addresses.

Figure 2: Phishing site content visible on fake UK government domains mygov-energy-help[.]com and rebate-application[.]com (live as of 26 September 2022)

Considering the above observations, we utilised our monitoring technology to look for patterns in the registration of domains with names containing the strings ‘energy’ and ‘rebate’, in the period to 26 September. Analysing the raw data, we found that there has been continuous activity (in terms of the registration, re-registration and lapse of relevant domain names) across the preceding year, with numerous 'noisy' peaks and troughs, and no obvious trends. 

This is perhaps unsurprising given the generic nature of the keywords under consideration, and the numerous different ways they can be utilised in domain names unrelated to the programmes and scams of interest. However, our tools allow us to look at specific match types, and thereby drill down more closely into examples which are more likely to be of direct relevance. Accordingly, we next considered only those domain names containing a 'word match' for the keywords 'rebate', 'energy', 'energybill' or 'energybills' (i.e. those domains where these terms appear in isolation, or are separated from the remainder of the domain name by hyphens, mirroring the patterns seen in the known scam domains listed above).
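A 'word match' of this kind can be approximated by splitting the domain name on hyphens and checking each token against the keyword list. The sketch below is illustrative rather than a description of CSC's tooling: the function name is an assumption, defanged `[.]` separators are restored before parsing, and the name being tested is assumed to be the label immediately before a single-label TLD such as .com.

```python
KEYWORDS = {"rebate", "energy", "energybill", "energybills"}


def word_matches(domain: str, keywords: set[str] = KEYWORDS) -> set[str]:
    """Return the keywords appearing as whole hyphen-separated tokens."""
    labels = domain.lower().replace("[.]", ".").split(".")
    # Assumption: single-label TLD, so the name of interest is the label before it.
    sld = labels[-2] if len(labels) >= 2 else labels[0]
    return keywords & set(sld.split("-"))


if __name__ == "__main__":
    for name in ["energy-bill-rebate[.]com", "energybills-rebate[.]com",
                 "support-rebatescheme[.]com", "rebates-online.com"]:
        print(name, sorted(word_matches(name)))
```

Note that a stricter match type trades recall for precision: support-rebatescheme[.]com, one of the reported scam domains, contains 'rebate' only as part of a longer token and so is not a word match.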

For 'energy' domains, this still yields a rather noisy dataset. However, for the (somewhat more distinctive) keyword 'rebate', there is a much clearer ramp-up in activity in the latter part of September 2022, in the lead-up to the launch of the related UK government scheme.

Figure 3: Five-day centred rolling averages of the total daily number of registrations (including re-registrations) of domains with names containing 'energy' (top) and 'rebate' (bottom) (as 'word matches'), between March and September 2022

Of the 39 distinct 'rebate' (word-match) domains registered in the final two weeks of the analysis period, a significant proportion featured additional keywords suggesting that they may have been registered with similar scams in mind - seven referenced 'energy', six 'scheme', three 'application' and two 'claim'. 

This dataset included three additional domains (energy-bill-rebate[.]com, mytax-rebate-application[.]com and rebate-applications[.]com) resolving to active 'gov.uk' branded phishing sites as of 26 September, together with several more which (though inactive) still featured the 'gov.uk' favicon. Six further examples featured browser-level warnings that they had previously featured 'dangerous' content. 

Five of the domains were found to have been both registered and then lapsed within the two-week period (with delays of between one and five days between the two events).

These observations once again highlight how real-world events can trigger peaks in infringement activity by bad actors wishing to take advantage of difficult situations for their own financial gain, at the expense of their victims. 

The phishing campaigns highlighted in this analysis make use of domains which are specifically registered for use in the campaign, and are typically used only for a short period (potentially in an attempt to circumvent detection and takedown efforts), before being allowed to lapse. 

In general, phishing activity is most effectively detected using product sets that incorporate spam traps, honeypot accounts and other feeds (such as brand-owner webserver logs), as a complement to other detection methodologies.

However, the findings presented here also highlight how nimble infringers can be, and why particular vigilance is important when mission-related incidents occur - for example, in the case of organisations and not-for-profits involved in responding to crises and global events.

References

[1] https://www.cscdbs.com/blog/how-to-manage-the-online-effects-of-the-ukraine-war/

[2] https://www.cscdbs.com/en/resources-news/impact-of-covid-on-internet-security/

[3] https://www.gov.uk/government/publications/energy-bills-support/energy-bills-support-factsheet-8-september-2022

[4] https://www.gov.uk/guidance/getting-the-energy-bills-support-scheme-discount

This article was first published on 17 October 2022 at:

https://www.worldtrademarkreview.com/article/energy-crisis-related-scams-highlight-how-bad-actors-seek-capitalise-global-events

Tuesday, 4 October 2022

The continued rise of phishing and the case of the customisable site

As noted in previous CSC studies[1,2,3], phishing continues to be an extremely popular threat vector with bad actors and shows no signs of subsiding - in part, because of the COVID-19 pandemic and the rise in popularity of remote working. Indeed, the most recent figures from the Anti-Phishing Working Group (APWG)[4,5] show that the numbers of phishing attacks are higher than ever before, with the quarterly total of identified unique phishing attacks exceeding 1 million for the first time in Q1 2022, and over 600 distinct brands attacked each month.

Figure 1: Total monthly numbers of unique phishing attacks from Q1 2018 to Q2 2022, as reported by APWG[6]

An earlier report by APWG[7] noted that over 80% of phishing sites were found to be employing SSL (secure sockets layer) or TLS (transport layer security) certificates (allowing use of HTTPS) - an increase from around 5% at the end of 2016 - and 90% of these certificates had been issued by free providers, such as cPanel and Let's Encrypt.

Furthermore, Interisle's 2022 Phishing Landscape study[8,9] reported the detection of over 1.1 million phishing attacks between April 2021 and April 2022, with over 2,000 brands targeted, but the majority targeting just 10 top brands. Overall, 69% of attacks made use of specifically registered domains, with the attacks disproportionately concentrated on new generic top-level domains (gTLDs). Additionally, a small number of registrars dominate the malicious registrations. Around 41% of domains reported for phishing were found to have been used within 14 days of their registration, and most of these were reported within 48 hours.

Modern phishing is driven by the desire for credential theft and business impersonation, but it is also increasingly recognised as the gateway for launching malware and ransomware attacks, which often lead to serious compromises of corporate systems and other security issues, such as DNS (domain name system) attacks.

The customisable phishing site

Central to many phishing attacks is the use of a fraudulent lookalike site mimicking the appearance of the official site of the brand being targeted - often including a log-in form prompting the input of sensitive customer information which thereby falls into the hands of the fraudster. In a classic phishing attack, the site will impersonate a specific brand, and cybercriminals will send e-mails to a wide group of users driving them to the site. This strategy uses the assumption that a certain portion of the recipients will be genuine customers of the targeted brand and may be fooled.

However, over the last two years, CSC has noted the emergence of a much more egregious style of phishing site, the appearance of which is dynamically tailored to the specific recipient in each case and can successfully target a much broader portion of recipients from a single campaign.

An example was first identified in February 2020, using a URL of the form https://[fraudsite.com]/[directory]/?usr=[string], where 'string' was a series of apparently random characters. The site appeared to target the user of a specific corporate e-mail address relating to a brand owner, with the address pre-populated into the log-in form on the page. The background of the site displayed a framed version of the official company website, giving the appearance that the user was logging into their own corporate site. All of this content appeared to be hard-coded into the HTML of the phishing site.

However, closer inspection revealed that the content actually appeared to be dynamically generated, with the string in the URL comprising a Base64-encoded version of the recipient e-mail address (Base64 being a standard method of converting binary data, such as a string of standard ASCII characters, into an alternative text format).
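To make the encoding concrete, the sketch below shows how an e-mail address maps to a Base64 string, and how an analyst might test whether the 'usr' parameter of a suspicious URL decodes to something resembling an e-mail address. It uses only the Python standard library; the function names, the placeholder host fraudsite.example and the address user@example.com are hypothetical, and the check for an '@' character is a deliberately crude assumption.

```python
import base64
from urllib.parse import parse_qs, urlsplit


def encode_address(address):
    """Base64-encode an ASCII e-mail address."""
    return base64.b64encode(address.encode("ascii")).decode("ascii")


def decode_usr_parameter(url):
    """Return the decoded 'usr' value if it looks like an e-mail address, else None."""
    values = parse_qs(urlsplit(url).query).get("usr")
    if not values:
        return None
    # parse_qs turns '+' into a space, so restore it before decoding.
    raw = values[0].replace(" ", "+")
    # Re-add any '=' padding that may have been stripped from the URL.
    raw += "=" * (-len(raw) % 4)
    try:
        decoded = base64.b64decode(raw, validate=True).decode("ascii")
    except ValueError:  # covers binascii.Error and UnicodeDecodeError
        return None
    return decoded if "@" in decoded else None


if __name__ == "__main__":
    url = "https://fraudsite.example/dir/?usr=" + encode_address("user@example.com")
    print(url)
    print(decode_usr_parameter(url))
```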

To determine how the phishing site handled this information in practice, a modified URL was generated, replacing the previous Base64 string with an encoded version of a CSC employee e-mail address. This produced the page shown in Figure 2, for which the HTML source code again appeared to be hard-coded when viewed.

Figure 2: A version of the phishing site constructed by modifying the string in the original URL, showing how it would appear if targeted towards the user of a specific CSC corporate e-mail address (obfuscated in the screenshot for privacy purposes)

The site is presumably running a script to dynamically generate the HTML of the page, based on the content of the Base64 string within the URL. This provides the potential to generate a very convincing, customised phishing attack whereby, given a recipient e-mail address, the fraudulent site is configured to display a framed version of the website on the host domain of the e-mail address, overlain by a log-in box pre-populated with that address. Consequently, the same phishing e-mail could potentially be sent to large numbers of e-mail addresses, with no further requirement to customise the e-mail or the corresponding phishing site to the recipients in question - beyond ensuring that a Base64-encoded version of the recipient e-mail address is appended to the link in the phishing e-mail in each case (which could easily be automated via use of a script).

It was also established that the behaviour of the site appears to be dependent on exactly how and where it is viewed, with the site appearing inactive when viewed in a virtual machine environment. This type of configuration has previously been noted as a technique used by fraudsters to thwart forensic analysis of their sites by security professionals, who often work in virtual environments.

It is also notable that this type of site would be very difficult to detect using traditional brand monitoring approaches. Aside from the fact that the site may have been set up as an unindexed island site, intended to be accessed only via links in spam e-mails, there is potentially no reference to any brand in the site content itself, with brand-specific content being generated dynamically in the HTML only when a specific URL is accessed. In this type of case, detection would be dependent on the ability of CSC's anti-fraud engine, working in conjunction with web referrer information provided by the brand owner, to identify when the phishing site draws information from the brand owner's official site when the framing process is carried out.
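The referrer-based idea can be illustrated with a very simple sketch: given the referrer URLs recorded in a brand owner's webserver logs, count the hosts that are not on an allow list of official sites. This is not a description of CSC's anti-fraud engine; the function name, the input format (a plain list of referrer strings) and the example hosts are all assumptions made for illustration.

```python
from collections import Counter
from urllib.parse import urlsplit


def unfamiliar_referrers(referrers, official_hosts):
    """Count referrer hosts that are neither official hosts nor their subdomains."""
    official = {h.lower() for h in official_hosts}
    counts = Counter()
    for ref in referrers:
        host = urlsplit(ref).hostname  # None for empty or '-' log entries
        if host is None:
            continue
        if host in official or any(host.endswith("." + o) for o in official):
            continue
        counts[host] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical log extract, for illustration only.
    log_referrers = [
        "https://www.brand.example/home",
        "https://fraudsite.example/dir/?usr=abc",
        "-",
        "https://fraudsite.example/dir/?usr=def",
    ]
    print(unfamiliar_referrers(log_referrers, ["brand.example"]))
```

In practice, many unfamiliar referrers will be benign (search engines, partner sites), so a list of this kind is a starting point for review rather than a verdict.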

A study in July 2022[10,11] reported the identification of an extremely similar style of attack, in this case using a bit-for-bit mirror of the official site of the brand being targeted.

Conclusions

These findings highlight the importance of a comprehensive phishing detection and enforcement programme, able to identify threats of a variety of types. Detection should incorporate domain monitoring (to identify phishing sites where the brand name - or a variant - is included in the domain name) and Internet monitoring (to identify other fraudulent sites linked from content indexed by search engines) components. However, other data sources - such as spam traps and honeypots, and other data feeds like customer abuse mailbox data and webserver logs - should also be used to identify phishing sites that are unindexed or feature content that is more dynamically generated.

However, even this is only part of the solution. As noted above, phishing attacks often form the basis for subsequent malware attacks or other security incursions. Accordingly, a robust security posture should also include the deployment of a range of domain security measures - such as those offered by an enterprise-class registrar - to protect critical corporate domains. It is also advisable for brand owners to avoid the use of service providers who allow unsavoury practices such as typosquatting, domain name auctions, and name spinning (the sale of domains containing brand variations) - all of which can facilitate phishing attacks.

References

[1] https://www.cscdbs.com/blog/branded-domains-are-the-focal-point-of-many-phishing-attacks/

[2] https://www.worldtrademarkreview.com/global-guide/anti-counterfeiting-and-online-brand-enforcement/2022/article/going-phishing-countering-fraudulent-campaigns

[3] https://www.cscdbs.com/blog/going-phishing-countering-fraudulent-campaigns-2/

[4] https://docs.apwg.org/reports/apwg_trends_report_q1_2022.pdf

[5] https://docs.apwg.org/reports/apwg_trends_report_q2_2022.pdf

[6] https://apwg.org/trendsreports/

[7] https://docs.apwg.org/reports/apwg_trends_report_q2_2021.pdf

[8] https://interisle.net/PhishingLandscape2022.html

[9] https://interisle.net/PhishingLandscape2022.pdf

[10] https://www.darkreading.com/endpoint/apt-phishing-mirrors-landing-pages-credential-harvesting

[11] https://www.avanan.com/blog/mirroring-actual-landing-pages-for-convincing-credential-harvesting

This article was first published on 4 October 2022 at:

https://www.cscdbs.com/blog/the-continued-rise-of-phishing-and-the-case-of-the-customizable-site/

Also published at:

https://circleid.com/posts/20221010-the-continued-rise-of-phishing-and-the-case-of-the-customizable-site

Tuesday, 27 September 2022

Four steps to an effective brand protection programme

by Elliott Champion and David Barnett 

Internet use has become ever more pervasive. With around five billion global users[1], it generates an economy equivalent to around 15%[2] of global GDP (gross domestic product)[3] - around $15 trillion - a figure which is growing 2.5 times faster than GDP itself. This makes the Internet an attractive channel for infringers.

Phishing and other fraud tactics, selling counterfeit goods online, and digital piracy are primary areas of concern. Unauthorised branded content use, traffic misdirection, false affiliation claims, or negative comment and activism are also significant issues. All of these can directly affect a brand's revenue, reputation and value.

This makes a comprehensive, holistic brand protection programme crucial for any brand owner, including monitoring to identify potentially damaging third-party content, and using enforcement strategies to take down infringing material. A brand protection programme should also cover a range of online channels like Internet content, branded domain names, social media, mobile apps, e-commerce marketplaces, etc., as these areas are becoming increasingly interlinked, providing different environments where the same kinds of infringements happen.

An effective enforcement programme not only addresses issues directly affecting company revenue, but also makes a protected brand a less attractive target for criminals. Furthermore, it helps protect customers and official online partners, can positively affect brand reputation and value, satisfies regulatory requirements, and can be a pre-requisite for retaining intellectual property (IP) protection.

Below we outline a four-step process for the effective and efficient implementation of a holistic brand protection programme.  

1. Evaluate the landscape and establish goals

Step one is to determine where any problems lie, and what to focus on. An initial brand snapshot or landscape audit will establish this by conducting a series of brand-related searches across all relevant channels. A marketplace sweep is also beneficial, as it looks at the numbers of results returned in response to brand-specific searches on a range of key e-commerce marketplaces. Results can be prioritised via threat scoring, clustering technology (to identify serial or high-volume infringers), web-traffic analysis, sales volume information, and so on.

It is essential to ensure that the focus areas of any programme align with the organisation's business plan and strategic goals. These might relate to geographic areas where the company has operations (or is planning to expand), and the channels where it has online presence. We advise appointing a digital governance team, including representatives from marketing, IP and legal, security, and domain operations, to ensure that brand protection is a collaborative interdepartmental effort.

It is also necessary to review the organisation’s IP protection portfolio to ensure that relevant brand terms are protected (e.g. trademarks registered in the appropriate product classes and geographic jurisdictions). Having the correct IP protection is vital for an effective enforcement programme. It is also useful to have an overview of official websites and partners, so they can be added to an 'allow' list for monitoring, and any pre-existing compliance issues can be addressed. If this information is unavailable, compiling such lists can be an objective of the monitoring programme.

Finally, it is vital to set out the overall goals of the brand protection initiative in advance, to measure its effectiveness. This can be as simple as reducing infringements - that is, removing significant numbers of them from the top e-commerce marketplaces and social media sites, or cleaning search engine results to eliminate infringements appearing on the first page. It may also be relevant to see an increase in web traffic to official channels or to pre-empt infringement activity against planned new product or brand launches.

2. Monitor critical online channels

Having established the key focus areas, the next step is to agree on monitoring parameters. This includes determining which channels and platforms to monitor and assessing which search terms to use. The minimum requirement is to search for the brand name itself - essentially mirroring customer searches - to identify the most visible content. It is often useful to search for content featuring brand variants like typos, abbreviations, and character replacements. This helps capture content where the infringer has deliberately not used the exact brand name in order to evade detection, or used confusing or deceptive brand variations. Additionally, it may be necessary to configure search terms incorporating other brand terms or industry- or product-specific keywords. This helps identify relevant material when the brand name itself is a generic term. Conversely, some monitoring services use exclusion keywords, where content is actively ignored if terms are found that imply the brand name is being used in a non-relevant context.
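As a rough illustration of how brand variants might be generated for use as search terms, the sketch below produces single-character omissions, doublings, adjacent-character swaps and a few lookalike-character replacements. The function name and the very small HOMOGLYPHS table are assumptions made for the example; a production monitoring service would use far larger variant sets and additional rules.

```python
# Tiny illustrative table of lookalike characters (an assumption, not a complete list).
HOMOGLYPHS = {"a": ["ɑ"], "i": ["ı"], "o": ["0"], "l": ["1"]}


def brand_variants(brand):
    """Return a set of simple typo and character-replacement variants of a brand name."""
    brand = brand.lower()
    variants = set()
    for i, ch in enumerate(brand):
        variants.add(brand[:i] + brand[i + 1:])      # omitted character
        variants.add(brand[:i] + ch + brand[i:])     # doubled character
        if i + 1 < len(brand):
            # adjacent characters swapped
            variants.add(brand[:i] + brand[i + 1] + ch + brand[i + 2:])
        for lookalike in HOMOGLYPHS.get(ch, []):
            variants.add(brand[:i] + lookalike + brand[i + 1:])  # lookalike replacement
    # Drop the exact brand name (e.g. from swapping identical letters) and empty strings.
    variants.discard(brand)
    variants.discard("")
    return variants


if __name__ == "__main__":
    print(sorted(brand_variants("brand")))
```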

It is essential to extract as much rich data as possible from the webpages and listings identified through monitoring, because it allows findings to be prioritised and triaged effectively, and then clustered to identify key targets and infringers. Data extraction can be done in several ways, including rules-based analysis of page content, scraping to extract relevant data from known locations on a page (especially effective on e-commerce marketplaces, social media sites, mobile app stores etc., where the page structure is known in advance), or using an API (application programming interface) provided by the monitored site. With e-commerce marketplace listings, for example, relevant data points include seller information, item quantity and supply, price, item description, etc. Aggregated historical data for individual sellers - like numbers of previous infringements and enforcement history - can also provide a measure of overall seller risk. Finally, in cases of particular concern, carrying out detailed entity investigations can build a fuller picture of a particular seller or organisation's online profile and associated activities.

Furthermore, visible characteristics (e.g. counterfeit indicators) in the product image can help determine whether a listing is infringing. This can be achieved using both automated image analysis and manual inspection by an analyst.

It is generally also useful to ensure that the visible page content for any relevant results is recorded using snapshots or page caching, as this provides evidence of the presence of an infringement at the point of discovery.

3. Enforce using the most impactful strategies

A key element of a brand protection programme is removing infringing content that would otherwise result in lost revenue for a brand, or damage its integrity and reputation. To avoid a 'whack-a-mole' approach, identifying and tackling the highest value targets first and then using the most efficient takedown method - which varies depending on the channel, platform, and the nature of the infringement - creates the greatest impact.

Certain platforms have specific IP protection programmes to remove brand-damaging content (e.g. AliProtect for Alibaba Group sites and VeRO for eBay). Clever use of these programmes can help achieve greater impact; for example, aggregating takedowns in batches to take advantage of a marketplace's 'three strikes' policy can result in quicker seller suspension. Some platforms also have good-faith programmes where brand owners or their representatives can achieve rapid takedowns by having a low false-positive rate in submitted infringements. The Amazon Brand Registry and Brand Gating schemes are examples of programmes where brand owners can proactively reduce the appearance of infringements.

For other Internet content, having a toolkit of enforcement approaches is beneficial - from low-cost, low-complexity, rapid primary actions, like cease-and-desist notices, through secondary tactics like host-level content removals or registrar- or registry-level suspensions, up to longer-term, complex tertiary approaches like domain-dispute processes and legal actions. In some cases, other techniques like payment gateway suspensions or search engine de-listings may be appropriate. Having a range of enforcement options allows a brand owner to select the most cost-effective and efficient approach, reserving others for escalation. Some of the more complex dispute or acquisition options may only be appropriate when the brand owner wants to reclaim a domain for their own use.

Other supplementary actions can help build the most efficient and impactful enforcement programme, e.g. test purchases to prove a product is counterfeit, engagement with local law enforcement, or establishing reseller policy agreements.

4. Evaluate impact and realign strategies

As a brand protection programme matures, brand owners can evaluate its impact using a variety of techniques, many of which measure the financial return-on-investment (ROI) of the actions taken. This calculation can involve the total value of infringing goods removed from e-commerce marketplaces, the total amount of web traffic received by infringing sites, or both. Determining the amount of lost revenue that is reclaimable after successful enforcement is key to demonstrating ROI. For e-commerce de-listings, for example, this considers the conversion rate of customers who will buy a legitimate item when the counterfeit version is made unavailable. This conversion rate depends on the item's price[4] - or more specifically, the price differential between the genuine item and a counterfeit. Conversely, with a successful domain acquisition, the traffic for the infringing site can be re-directed to the brand owner's official website (and thereby monetised) once the domain is added to the brand owner's official portfolio[5,6].
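A simplified version of the e-commerce calculation can be sketched as follows. The formula (items removed, multiplied by the genuine item price and by the conversion rate) is an assumed simplification for illustration rather than the method set out in the cited references, and the function names and all figures are hypothetical. The conversion rate is taken as an input here, whereas in practice it would itself need to be estimated from the price differential.

```python
def reclaimed_revenue(items_removed, genuine_price, conversion_rate):
    """Estimated revenue reclaimed when counterfeit listings are taken down.

    conversion_rate is the assumed fraction of would-be counterfeit buyers
    who go on to buy the genuine item instead (between 0 and 1).
    """
    if not 0 <= conversion_rate <= 1:
        raise ValueError("conversion_rate must be between 0 and 1")
    return items_removed * genuine_price * conversion_rate


def return_on_investment(reclaimed, programme_cost):
    """ROI expressed as net gain divided by programme cost."""
    if programme_cost <= 0:
        raise ValueError("programme_cost must be positive")
    return (reclaimed - programme_cost) / programme_cost


if __name__ == "__main__":
    # Hypothetical figures, for illustration only.
    reclaimed = reclaimed_revenue(items_removed=1000, genuine_price=50.0, conversion_rate=0.25)
    print(reclaimed, return_on_investment(reclaimed, programme_cost=5000.0))
```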

Following a successful enforcement programme, brand owners can also directly measure other positive results, including increases in web traffic and sales volumes for their official network of sites, resellers, and partners. It may also be possible to see a clean presence on search engines and other platforms, with no infringing content returned for brand-specific searches.

Knowing how a brand is being referenced online through a monitoring solution can have other less tangible benefits, even where enforcement is not possible. For example, intelligence on negative customer comments allows brand owners to make informed decisions on their marketing and product development strategies. Monitoring can also uncover issues like brand confusion and brand dilution[7].

Combining a brand protection programme with factors like customer education and the use of product verification tools also protects the consumer base from exposure to non-legitimate products and content. Overall, this can have a positive impact on trust levels, and ultimately on the intrinsic value of the brand.

Reviewing the process and realigning strategies in response to observations, trends, or changes in business strategy is also beneficial. New channels or platforms may emerge, or additional takedown techniques may become available (e.g. the introduction of a new IP protection programme). A brand owner may introduce new brands and products, change their geographic footprint, or increase their portfolio of protected IP, e.g. through registering new trademarks. Infringement patterns may also change over time, as sellers move to different marketplaces or change the way they describe the products (sometimes in response to the enforcement actions of the brand owner). Finally, the emergence of new technologies or significant world events can also affect the infringement landscape[8,9].

Any of the above factors can necessitate changes to how a holistic brand protection programme is executed, to keep it focused, relevant, and effective. For this reason, the approach should always be circular and iterative, with brand owners keeping a close eye on activity and trends, and constantly evolving their methods to respond to any changes.

References

[1] https://www.statista.com/statistics/617136/digital-population-worldwide/

[2] https://www.worldbank.org/en/topic/digitaldevelopment/overview

[3] https://data.worldbank.org/indicator/NY.GDP.MKTP.CD

[4] https://circleid.com/posts/20220726-calculating-the-return-on-investment-of-online-brand-protection-projects

[5] https://www.worldtrademarkreview.com/anti-counterfeiting/return-investment-proving-protection-pays

[6] https://www.worldtrademarkreview.com/global-guide/anti-counterfeiting-and-online-brand-enforcement/2022/article/creating-cost-effective-domain-name-watching-programme

[7] https://securityboulevard.com/2022/07/online-brand-abuse-is-a-cybersecurity-issue/

[8] https://www.cscdbs.com/en/resources-news/impact-of-covid-on-internet-security/

[9] https://www.cscdbs.com/en/resources-news/supply-chain-report-form/

This article was first published on 27 September 2022 at:

Also published at: 

https://circleid.com/posts/20221005-four-steps-to-an-effective-brand-protection-program
