Thursday, 20 October 2022

The Highest-Threat TLDs – Part 1

by Justin Hartland and David Barnett

A domain name consists of two main elements: the second-level domain name to the left of the dot - often consisting of a brand name or relevant keywords - and the domain extension or top-level domain (TLD) to the right of the dot. Domain names form the key elements of the readable web addresses allowing users to access pages on the Internet and also allow the construction of e-mail addresses.

There are different types of TLDs. Generic or global TLDs (gTLDs) were originally intended to describe the site type, such as .com for company websites or .org for charitable organisations. Country-code TLDs (ccTLDs) serve specific countries, e.g. .uk for the UK (most often seen as .co.uk), .fr for France, etc. Finally, there is a range of new gTLDs that have launched since 2013[1], usually relating to specific content types, business areas, interests, or geographic locations (e.g. .shop, .club, .tokyo). Each TLD is overseen by a registry organisation, which manages its infrastructure.

Domain names are associated with the full spectrum of Internet content, from legitimate use by brands or individuals, to infringing or criminal activity. CSC has observed that certain TLDs are used disproportionately for egregious content.

There are several possible reasons why particular TLDs are more attractive to infringers, including the cost of domain registration, and difficulties in conducting enforcement (takedown) actions against infringing content. TLDs operated by certain registries, like those offering low- or no-cost domain registrations or those with lax registration security policies, are more likely to be used for infringing activities. Additionally, domain extensions lacking well-defined, reliable enforcement routes like .vn (Vietnam) and .ru (Russia) prove to be especially high risk. Other factors are also significant; for example, a country's wealth affects the levels of technical expertise of Internet service providers (ISPs) and therefore the likelihood of domains being compromised.

In this two-part blog post, we aim to quantify the threat levels associated with specific domain extensions, i.e. the likelihood that a domain on a particular TLD might be registered for fraudulent purposes.

Part 1: Phishing site TLDs

Determining the overall threat frequency for each TLD is useful in several ways:

  • Helping to prioritise results identified via a brand protection service. For example, the TLD can be used to identify top targets for future tracking for content changes.
  • Identifying TLDs where it is advisable to register domains featuring key brand-related strings defensively to avoid them being registered by third parties with malicious intent.
  • Identifying TLDs where it is advantageous for brand protection service providers to offer blocks or alerts when, for example, a third party attempts to register a domain containing a brand-related term.

Analysis and discussion

For this first post, we analysed data from CSC's Fraud Protection services to uncover the TLDs associated with domains used for phishing activity. The analysis covers all sites detected between November 2021 and April 2022 for those TLDs with more than 10 phishing cases and where domain-based phishing cases were recorded (as opposed to subdomain-based). This yielded results for 115 distinct TLDs.

We also consider the frequency of domain use associated with threatening content across the TLD in question. We do this by expressing the raw numbers as a proportion of the total number of domains registered across the TLD[2]. We then normalise the data, so the value for the highest-threat TLD is 1, with all other values in that dataset scaled accordingly. It is important to note that this value reflects the proportion of malicious domains across each TLD, rather than absolute numbers. Some other TLDs see high numbers of infringements simply because of the large total numbers of domain registrations across these extensions. Table 1 shows the top 20 TLDs represented in CSC's phishing dataset (by absolute numbers), together with the normalised threat frequencies for these TLDs.

TLD      % of total        Total no. of regd.    Normalised threat
         phishing cases    domains across TLD    frequency within dataset
.com     45.7%             221,858,334           0.014
.org     6.9%              15,550,733            0.031
.app     6.2%              1,155,807             0.377
.net     4.8%              19,773,315            0.017
.xyz     2.5%              10,841,304            0.016
.ru      2.5%              10,627,033            0.016
.co      2.1%              4,110,132             0.035
.cn      1.7%              25,147,816            0.005
.me      1.3%              1,669,800             0.054
.dev     1.2%              391,929               0.222
.br      1.2%              5,519,378             0.015
.top     1.2%              8,830,142             0.009
.io      1.1%              923,588               0.085
.in      1.1%              3,271,337             0.023
.page    1.0%              368,474               0.195
.id      0.9%              760,240               0.080
.icu     0.8%              7,956,385             0.007
.info    0.8%              7,852,896             0.007
.de      0.7%              22,881,115            0.002
.ke      0.7%              165,907               0.288

Table 1: Top 20 TLDs represented in CSC's phishing dataset, by absolute numbers

We have observed similar patterns in other analyses of threatening content. Interisle's 'Malware Landscape 2022' study found that the top 10 TLDs associated with malware domains also featured a mix of legacy gTLDs (.com at position one, .net at five, .org at six, and .biz at 10), new gTLDs (.xyz at position two, .club at seven, and .top at nine) and ccTLDs (.br, .in, and .ru at positions three, four and eight, respectively)[3]. Eight of these 10 extensions feature in the top 14 of CSC's phishing list above. Similarly, the Anti-Phishing Working Group's (APWG's) 'Phishing Activity Trends Report' for Q4 2021 analysed top phishing TLDs, with a top nine including new gTLDs .xyz, .buzz and .vip, and ccTLDs .br and .in, alongside legacy gTLDs.

In the APWG dataset, new gTLDs were more than twice as extensively represented as would be expected purely based on the total number of domains registered across these extensions[4]. A Q1 2022 study by Agari and PhishLabs showed similar patterns: the top 10 TLDs abused for phishing (by number of sites) included the new gTLDs .vip, .xyz and .monster, and the ccTLDs .br, .ly, and .tk[5,6].
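The overlap noted above between Interisle's malware list and Table 1 can be checked with a few lines of Python. This is only an illustrative sketch: the two lists are transcribed from the text and from the first 14 rows of Table 1, and the variable names are our own.

```python
# Top-10 malware TLDs in rank order, as listed in the text (Interisle).
interisle_top10 = [".com", ".xyz", ".br", ".in", ".net",
                   ".org", ".club", ".ru", ".top", ".biz"]

# First 14 rows of Table 1 (CSC phishing dataset, by absolute numbers).
csc_top14 = [".com", ".org", ".app", ".net", ".xyz", ".ru", ".co",
             ".cn", ".me", ".dev", ".br", ".top", ".io", ".in"]

# TLDs from the Interisle list that do / do not appear in the CSC top 14.
shared = [tld for tld in interisle_top10 if tld in csc_top14]
missing = [tld for tld in interisle_top10 if tld not in csc_top14]

print(len(shared), "shared:", shared)
print("not in CSC top 14:", missing)
```

The two extensions absent from the CSC top 14 are .club and .biz.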

Table 2 shows the pattern is rather different when looking at the top TLDs by their normalised threat frequency; the list is dominated by a distinct set of ccTLDs, a smaller number of new gTLDs, and excludes many of the more popular TLDs shown previously.

TLD      Normalised threat           Total no. of regd.    % of total
         frequency within dataset    domains across TLD    phishing cases
.gd      1.000                       3,306                 0.05%
.gy      0.910                       4,037                 0.05%
.ms      0.739                       9,440                 0.10%
.zm      0.531                       4,838                 0.04%
.app     0.377                       1,155,807             6.21%
.ly      0.356                       25,801                0.13%
.ke      0.288                       165,907               0.68%
.dev     0.222                       391,929               1.24%
.page    0.195                       368,474               1.03%
.ug      0.187                       10,810                0.03%
.sn      0.187                       9,842                 0.03%
.do      0.176                       30,215                0.08%
.bd      0.127                       37,465                0.07%
.sbs     0.120                       44,222                0.08%
.np      0.112                       57,379                0.09%
.sh      0.110                       25,070                0.04%
.ng      0.097                       240,668               0.33%
.io      0.085                       923,588               1.11%
.id      0.080                       760,240               0.86%
.sa      0.079                       60,246                0.07%

Table 2: Top 20 TLDs represented in CSC's phishing dataset, by normalised threat frequency
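As a rough illustration of the normalisation behind Tables 1 and 2, the following Python sketch divides each TLD's share of phishing cases by its number of registered domains and rescales so that the highest value is 1. Using shares rather than raw case counts gives the same result, because the constant total cancels out. The function name is our own, and the inputs are the rounded figures from the tables, so the output only approximates the published values (for example, .gy comes out nearer 0.82 than 0.910, presumably because both .gd and .gy are rounded to 0.05%).

```python
def normalised_threat_frequency(case_share, registered):
    """Scale per-domain phishing rates so the highest-rate TLD scores 1."""
    # Raw rate: share of phishing cases divided by domains registered on the TLD.
    raw = {tld: case_share[tld] / registered[tld] for tld in case_share}
    top = max(raw.values())
    return {tld: rate / top for tld, rate in raw.items()}


# Rounded values copied from Tables 1 and 2 (% of total phishing cases,
# total registered domains). Only four TLDs are shown for brevity.
case_share = {".gd": 0.05, ".gy": 0.05, ".app": 6.21, ".com": 45.7}
registered = {".gd": 3_306, ".gy": 4_037,
              ".app": 1_155_807, ".com": 221_858_334}

scores = normalised_threat_frequency(case_share, registered)
for tld, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{tld:5s} {score:.3f}")
```

The sketch makes the article's central point concrete: .com accounts for by far the most cases, yet its per-domain rate is a small fraction of that of a tiny ccTLD such as .gd.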

In the second article in this series, we compare these findings with those from additional datasets to produce an overall measure of TLD threat frequency, considering a range of fraudulent uses. We then consider cybersecurity implications, discuss mitigation measures, and cover how CSC can help with this process.

References

[1] https://newgtlds.icann.org/en/program-status/delegated-strings

[2] https://domainnamestat.com/statistics/tldtype/all (statistics correct as of 13 June 2022)

[3] https://interisle.net/MalwareLandscape2022.pdf

[4] https://docs.apwg.org/reports/apwg_trends_report_q4_2021.pdf

[5] https://info.phishlabs.com/hubfs/Agari%20PhishLabs_QTTI%20Report%20-%20May%202022.pdf

[6] https://www.tripwire.com/state-of-security/security-data-protection/phishing-threat-trends-intelligence-report/

This article was first published on 20 October 2022 at:

https://www.cscdbs.com/blog/the-highest-threat-tlds-part-1/

Also published at:

https://circleid.com/posts/20230112-the-highest-threat-tlds-part-1

Tuesday, 18 October 2022

Energy-crisis-related scams highlight how bad actors seek to capitalise on global events

Fraudsters can be counted on to be quick to take advantage of those who may be struggling, and the latest example is the cost-of-energy crisis. Our uncovering of related scams in the UK follows numerous previous studies illustrating how real-world events can trigger associated spikes in online infringement activity, including efforts focused on the invasion of Ukraine[1] and the pandemic[2].

Events such as the war in Ukraine and associated supply-chain issues have triggered huge rises in the cost of energy, resulting in support programmes being introduced by governments. In the UK, for example, the Energy Price Guarantee[3] (which reduces energy unit costs to consumers) and the Energy Bills Support Scheme[4] (providing an automatic energy payment rebate) come into effect this month, in addition to energy price caps for corporations.

In response to these initiatives, bad actors have instigated a range of phishing campaigns designed to harvest users' personal information, under the guise of soliciting applications for participation in the schemes.

In the two examples shown below, we identified SMS messages of a similar style (sent on 26 September), directing users to phishing sites hosted on the domains via-rebate-scheme[.]com and energy bills-support[.]com. 


Figure 1: Examples of SMS messages directing users to phishing sites related to the UK Energy Bill [sic] Support Scheme

The two domains in question had been registered in the previous few days (25 and 21 September, respectively), and both had redacted whois records. Neither of the sites was active by the time of analysis (on 26 September). 

Searches for reports of other scams featuring similar text revealed that several additional domain names had also been utilised in scams of this type, with a selection of examples listed below:

  • energy-bill-online[.]com
  • energy-bill-support[.]com
  • energybills-rebate[.]com
  • my-energybill-online[.]com
  • mygov-energy-help[.]com
  • online-energybill-rebate[.]com
  • rebate-application[.]com
  • support-rebatescheme[.]com
  • energy[.]bill-rebate[.]com

The majority of these sites were inactive by the date of analysis; however, two of the above domains were found still to resolve to active sites – displaying very convincing lookalikes of the government's official 'gov.uk' sites, including webforms prompting for the input of names, dates of birth, mobile numbers, and addresses.

Figure 2: Phishing site content visible on fake UK government domains mygov-energy-help[.]com and rebate-application[.]com (live as of 26 September 2022)

Considering the above observations, we used our monitoring technology to look for patterns in the registration of domains with names containing the strings 'energy' and 'rebate', in the period to 26 September. Analysing the raw data, we found continuous activity (in terms of the registration, re-registration and lapse of relevant domain names) across the preceding year, with numerous 'noisy' peaks and troughs, and no obvious trends.

This is perhaps unsurprising given the generic nature of the keywords under consideration, and the numerous different ways they can be utilised in domain names unrelated to the programmes and scams of interest. However, our tools allow us to look at specific match types, and thereby drill down more closely into examples which are more likely to be of direct relevance. Accordingly, we next considered only those domain names containing a 'word match' for the keywords 'rebate', 'energy', 'energybill' or 'energybills', i.e. those domains where these terms appear in isolation, or are separated from the remainder of the domain name by hyphens. These are similar patterns to those appearing in the known examples of the scam domains listed above.
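The 'word match' idea can be sketched in Python as follows. This is an illustrative approximation rather than the monitoring tool's actual logic: the function name is hypothetical, the final label is simply assumed to be the TLD, and dots between labels are treated as word boundaries in the same way as hyphens.

```python
KEYWORDS = {"rebate", "energy", "energybill", "energybills"}


def word_matches(domain, keywords=KEYWORDS):
    """Return the keywords that appear as whole hyphen- or dot-delimited tokens."""
    # Undo the "[.]" defanging used in the article, then drop the final
    # label (assumed here to be the TLD; multi-label suffixes are not handled).
    labels = domain.lower().replace("[.]", ".").split(".")[:-1]
    # Split each remaining label on hyphens to get candidate words.
    tokens = {token for label in labels for token in label.split("-")}
    return tokens & set(keywords)


for name in ["energybills-rebate[.]com",
             "my-energybill-online[.]com",
             "rebate-application[.]com",
             "support-rebatescheme[.]com"]:
    print(name, sorted(word_matches(name)))
```

Note that support-rebatescheme[.]com is not a word match under this definition, because 'rebate' is fused with 'scheme' rather than standing alone between hyphens.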

For 'energy' domains, this still yields a rather noisy dataset. However, for the (somewhat more distinctive) keyword 'rebate', there is a much clearer ramp-up in activity in the latter part of September 2022, in the lead-up to the launch of the related UK government scheme.

Figure 3: Five-day centred rolling averages of the total daily number of registrations (including re-registrations) of domains with names containing 'energy' (top) and 'rebate' (bottom) (as 'word matches'), between March and September 2022
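A five-day centred rolling average of the kind plotted in Figure 3 can be computed as in the sketch below. The daily counts are invented for illustration, and the handling of the first and last two days (left as None because a full window is unavailable) is an assumption; the article does not say how the edges were treated.

```python
def centred_rolling_mean(values, window=5):
    """Average each value with its neighbours on both sides (odd window only)."""
    if window % 2 == 0:
        raise ValueError("window must be odd so it can be centred")
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        if i < half or i + half >= len(values):
            # Not enough neighbours on one side for a full window.
            smoothed.append(None)
        else:
            chunk = values[i - half:i + half + 1]
            smoothed.append(sum(chunk) / window)
    return smoothed


# Hypothetical daily registration counts for seven consecutive days.
daily_registrations = [2, 0, 1, 3, 4, 9, 8]
smoothed = centred_rolling_mean(daily_registrations)
print(smoothed)  # [None, None, 2.0, 3.4, 5.0, None, None]
```

Centring the window means each smoothed value reflects the two days before and the two days after it, so a ramp-up appears on the dates it actually occurred rather than being delayed, as it would be with a trailing average.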

Of the 39 distinct 'rebate' (word-match) domains registered in the final two weeks of the analysis period, a significant proportion featured additional keywords suggesting that they may have been registered with similar scams in mind - seven referenced 'energy', six 'scheme', three 'application' and two 'claim'. 

This dataset included three additional domains (energy-bill-rebate[.]com, mytax-rebate-application[.]com and rebate-applications[.]com) resolving to active 'gov.uk' branded phishing sites as of 26 September, together with several more which (though inactive) still featured the 'gov.uk' favicon. Six further examples featured browser-level warnings that they had previously featured 'dangerous' content. 

Five of the domains were found to have been both registered and then lapsed within the two-week period (with delays of between one and five days between the two events).

These observations once again highlight how real-world events can trigger peaks in infringement activity by bad actors wishing to take advantage of difficult situations for their own financial gain, at the expense of their victims. 

The phishing campaigns highlighted in this analysis make use of domains which are specifically registered for use in the campaign, and are typically used only for a short period (potentially in an attempt to circumvent detection and takedown efforts), before being allowed to lapse. 

Phishing activity generally is most effectively detected through the implementation of product sets - which incorporate use of spam traps and honeypot accounts, and other feeds such as brand-owner webserver logs - as a complement to other detection methodologies. 

However, the findings presented here also highlight how nimble infringers can be, and why particular vigilance is important when mission-related incidents occur, for example in the case of organisations and not-for-profits involved in responding to crises and global events.

References

[1] https://www.cscdbs.com/blog/how-to-manage-the-online-effects-of-the-ukraine-war/

[2] https://www.cscdbs.com/en/resources-news/impact-of-covid-on-internet-security/

[3] https://www.gov.uk/government/publications/energy-bills-support/energy-bills-support-factsheet-8-september-2022

[4] https://www.gov.uk/guidance/getting-the-energy-bills-support-scheme-discount

This article was first published on 17 October 2022 at:

https://www.worldtrademarkreview.com/article/energy-crisis-related-scams-highlight-how-bad-actors-seek-capitalise-global-events

Tuesday, 4 October 2022

The continued rise of phishing and the case of the customisable site

As noted in previous CSC studies[1,2,3], phishing continues to be an extremely popular threat vector with bad actors and shows no signs of subsiding - in part, because of the COVID-19 pandemic and the rise in popularity of remote working. Indeed, the most recent figures from the Anti-Phishing Working Group (APWG)[4,5] show that the numbers of phishing attacks are higher than ever before, with the quarterly total of identified unique phishing attacks exceeding 1 million for the first time in Q1 2022, and over 600 distinct brands attacked each month.

Figure 1: Total monthly numbers of unique phishing attacks from Q1 2018 to Q2 2022, as reported by APWG[6]

An earlier report by APWG[7] noted that over 80% of phishing sites were found to be employing SSL (Secure Sockets Layer) or TLS (Transport Layer Security) certificates (allowing use of HTTPS) - an increase from around 5% at the end of 2016 - and 90% of these certificates had been issued by free providers, such as cPanel and Let's Encrypt.

Furthermore, Interisle's 2022 Phishing Landscape study[8,9] reported the detection of over 1.1 million phishing attacks between April 2021 and April 2022, with over 2,000 brands targeted, but the majority targeting just 10 top brands. Overall, 69% of attacks made use of specifically registered domains, with the attacks disproportionately concentrated on new generic top-level domains (gTLDs). Additionally, a small number of registrars dominate the malicious registrations. Around 41% of domains reported for phishing were found to have been used within 14 days of their registration, and most of these were reported within 48 hours.

Modern phishing is driven by the desire for credential theft and business impersonation, but it is also increasingly recognised as the gateway for launching malware and ransomware attacks, which often lead to serious compromises of corporate systems and other security issues, such as DNS (domain name system) attacks.

The customisable phishing site

Central to many phishing attacks is the use of a fraudulent lookalike site mimicking the appearance of the official site of the brand being targeted - often including a log-in form prompting the input of sensitive customer information which thereby falls into the hands of the fraudster. In a classic phishing attack, the site will impersonate a specific brand, and cybercriminals will send e-mails to a wide group of users driving them to the site. This strategy relies on the assumption that a certain portion of the recipients will be genuine customers of the targeted brand and may be fooled.

However, over the last two years, CSC has noted the emergence of a much more egregious style of phishing site, the appearance of which is dynamically tailored to the specific recipient in each case and can successfully target a much broader portion of recipients from a single campaign.

An example was first identified in February 2020, using a URL of the form https://[fraudsite.com]/[directory]/?usr=[string], where 'string' was a series of apparently random characters. The site appeared to target the user of a specific corporate e-mail address relating to a brand owner, with the address pre-populated into the log-in form on the page. The background of the site displayed a framed version of the official company website, giving the appearance that the user was logging into their own corporate site. All of this content appeared to be hard-coded into the HTML of the phishing site.

However, closer inspection revealed that the content actually appeared to be dynamically generated, with the string in the URL comprising a Base64-encoded version (a standard method of converting binary data, such as a string of standard ASCII characters, into an alternative text format) of the recipient e-mail address.
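For analysts examining a URL of this form, the following Python sketch shows how the 'usr' parameter can be decoded to recover the embedded address, and how a test URL carrying a different address can be built for investigation. The host name, path and e-mail address are placeholders, the function names are our own, and the sketch assumes the standard Base64 alphabet with the parameter URL-encoded in the usual way.

```python
import base64
from urllib.parse import parse_qs, urlencode, urlsplit


def build_url(base, email):
    """Append a Base64-encoded e-mail address as the 'usr' query parameter."""
    token = base64.b64encode(email.encode("ascii")).decode("ascii")
    return f"{base}?{urlencode({'usr': token})}"


def decode_recipient(url):
    """Recover the e-mail address carried in the 'usr' query parameter."""
    token = parse_qs(urlsplit(url).query)["usr"][0]
    # Restore any '=' padding that may have been stripped from the token.
    token += "=" * (-len(token) % 4)
    return base64.b64decode(token).decode("ascii")


# Placeholder host and address, for illustration only.
url = build_url("https://fraudsite.example/directory/", "user@example.com")
print(url)
print(decode_recipient(url))
```

Because Base64 is an encoding rather than encryption, anyone holding the URL can read the targeted address, which is useful when triaging reported links.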

To determine how the phishing site handled this information in practice, a modified URL was generated, replacing the previous Base64 string with an encoded version of a CSC employee e-mail address. This produced the page shown in Figure 2, for which the HTML source code again appeared to be hard coded when viewed.

Figure 2: A version of the phishing site constructed by modifying the string in the original URL, showing how it would appear if targeted towards the user of a specific CSC corporate e-mail address (obfuscated in the screenshot for privacy purposes)

The implication is that the site is running a script to dynamically generate the HTML of the page, based on the content of the Base64 string within the URL. This provides the potential to generate a very convincing, customised phishing attack whereby, given a recipient e-mail address, the fraudulent site is configured to display a framed version of the host domain of the e-mail address, overlain by a log-in box pre-populated with that address. Consequently, the same phishing e-mail could potentially be sent to large numbers of e-mail addresses, with no further requirement to customise the e-mail or the corresponding phishing site to the recipients in question - beyond ensuring that a Base64-encoded version of the recipient e-mail address is appended to the link in the phishing e-mail in each case (which could easily be automated via use of a script).

It was also established that the behaviour of the site appears to be dependent on exactly how and where it is viewed, with the site appearing inactive when viewed in a virtual machine environment. This type of configuration has previously been noted as a technique used by fraudsters to thwart forensic analysis of their sites by security professionals, who often work in virtual environments.

It is also notable that this type of site would be very difficult to detect using traditional brand monitoring approaches. Aside from the fact that the site may have been set up as an unindexed island site, intended to be accessed only via links in spam e-mails, there is potentially no reference to any brand in the site content itself, with brand-specific content being generated dynamically in the HTML only when a specific URL is accessed. In this type of case, detection would be dependent on the ability of CSC's anti-fraud engine, working in conjunction with web referrer information provided by the brand owner, to identify when the phishing site draws information from the brand owner's official site when the framing process is carried out.

A study in July 2022[10,11] reported the identification of an extremely similar style of attack, in this case using a bit-for-bit mirror of the official site of the brand being targeted.

Conclusions

These findings highlight the importance of a comprehensive phishing detection and enforcement programme, able to identify threats of a variety of types. Detection should incorporate domain monitoring (to identify phishing sites where the brand name - or a variant - is included in the domain name) and Internet monitoring (to identify other fraudulent sites linked from content indexed by search engines) components. However, other data sources - such as spam traps and honeypots, and other data feeds like customer abuse mailbox data and webserver logs - should also be used to identify phishing sites that are unindexed or feature content that is more dynamically generated.

However, even this is only part of the solution. As noted above, phishing attacks often form the basis for subsequent malware attacks or other security incursions. Accordingly, a robust security posture should also include the deployment of a range of domain security measures - such as those offered by an enterprise-class registrar - to protect critical corporate domains. It is also advisable for brand owners to avoid the use of service providers who allow unsavoury practices such as typosquatting, domain name auctions, and name spinning (the sale of domains containing brand variations) - all of which can facilitate phishing attacks.

References

[1] https://www.cscdbs.com/blog/branded-domains-are-the-focal-point-of-many-phishing-attacks/

[2] https://www.worldtrademarkreview.com/global-guide/anti-counterfeiting-and-online-brand-enforcement/2022/article/going-phishing-countering-fraudulent-campaigns

[3] https://www.cscdbs.com/blog/going-phishing-countering-fraudulent-campaigns-2/

[4] https://docs.apwg.org/reports/apwg_trends_report_q1_2022.pdf

[5] https://docs.apwg.org/reports/apwg_trends_report_q2_2022.pdf

[6] https://apwg.org/trendsreports/

[7] https://docs.apwg.org/reports/apwg_trends_report_q2_2021.pdf

[8] https://interisle.net/PhishingLandscape2022.html

[9] https://interisle.net/PhishingLandscape2022.pdf

[10] https://www.darkreading.com/endpoint/apt-phishing-mirrors-landing-pages-credential-harvesting

[11] https://www.avanan.com/blog/mirroring-actual-landing-pages-for-convincing-credential-harvesting

This article was first published on 4 October 2022 at:

https://www.cscdbs.com/blog/the-continued-rise-of-phishing-and-the-case-of-the-customizable-site/

Also published at:

https://circleid.com/posts/20221010-the-continued-rise-of-phishing-and-the-case-of-the-customizable-site
