by Justin Hartland and David Barnett
A domain name consists of two main elements: the second-level domain name to the left of the dot - often consisting of a brand name or relevant keywords - and the domain extension or top-level domain (TLD) to the right of the dot. Domain names are the key element of human-readable web addresses, which allow users to access pages on the Internet, and they also form the basis of e-mail addresses.
There are different types of TLDs, including generic or global TLDs (gTLDs), which were originally intended to describe the site type, such as .com for company websites or .org for charitable organisations. There are also country-code TLDs (ccTLDs) for specific countries, e.g. .uk for the UK, .fr for France, etc. Finally, there is a range of new gTLDs that have launched since 2013[1], usually relating to specific content types, business areas, interests, or geographic locations (e.g. .shop, .club, .tokyo). Each TLD is overseen by a registry organisation, which manages its infrastructure.
Domain names are associated with the full spectrum of Internet content, from legitimate use by brands or individuals to infringing or criminal activity. CSC has observed that certain TLDs are used disproportionately for egregious content.
There are several possible reasons why particular TLDs are more attractive to infringers, including the cost of domain registration and the difficulty of conducting enforcement (takedown) actions against infringing content. TLDs operated by certain registries, such as those offering low- or no-cost domain registrations or those with lax registration security policies, are more likely to be used for infringing activities. Additionally, domain extensions that lack well-defined, reliable enforcement routes, such as .vn (Vietnam) and .ru (Russia), prove to be especially high risk. Other factors are also significant; for example, a country's wealth affects the levels of technical expertise of Internet service providers (ISPs) and therefore the likelihood of domains being compromised.
In this two-part blog post, we aim to quantify the threat levels associated with specific domain extensions, i.e. the likelihood that a domain on a particular TLD might be registered for fraudulent purposes.
Part 1: Phishing site TLDs
Determining the overall threat frequency for each TLD is useful in several ways:
- Helping to prioritise results identified via a brand protection service. For example, the TLD can be used to identify top targets for future tracking for content changes.
- Identifying TLDs where it is advisable to register domains featuring key brand-related strings defensively to avoid them being registered by third parties with malicious intent.
- Identifying TLDs where it is advantageous for brand protection service providers to offer blocks or alerts when, for example, a third party attempts to register a domain containing a brand-related term.
Analysis and discussion
For this first post, we analysed data from CSC's Fraud Protection services to uncover the TLDs associated with domains used for phishing activity. The analysis covers all sites detected between November 2021 and April 2022 for those TLDs with more than 10 phishing cases and where domain-based phishing cases were recorded (as opposed to subdomain-based). This yielded results for 115 distinct TLDs.
We also consider how frequently domains on each TLD are associated with threatening content. We do this by expressing the raw number of phishing cases as a proportion of the total number of domains registered across the TLD[2]. We then normalise the data, so that the value for the highest-threat TLD is 1 and all other values in the dataset are scaled accordingly. It is important to note that this value reflects the proportion of malicious domains across each TLD, rather than absolute numbers. Some TLDs see high numbers of infringements simply because of the large total number of domains registered across them. Table 1 shows the top 20 TLDs represented in CSC's phishing dataset (by absolute numbers), together with the normalised threat frequencies for these TLDs.
TLD | % of total phishing cases | Total no. of registered domains across TLD | Normalised threat frequency within dataset |
---|---|---|---|
.com | 45.7% | 221,858,334 | 0.014 |
.org | 6.9% | 15,550,733 | 0.031 |
.app | 6.2% | 1,155,807 | 0.377 |
.net | 4.8% | 19,773,315 | 0.017 |
.xyz | 2.5% | 10,841,304 | 0.016 |
.ru | 2.5% | 10,627,033 | 0.016 |
.co | 2.1% | 4,110,132 | 0.035 |
.cn | 1.7% | 25,147,816 | 0.005 |
.me | 1.3% | 1,669,800 | 0.054 |
.dev | 1.2% | 391,929 | 0.222 |
.br | 1.2% | 5,519,378 | 0.015 |
.top | 1.2% | 8,830,142 | 0.009 |
.io | 1.1% | 923,588 | 0.085 |
.in | 1.1% | 3,271,337 | 0.023 |
.page | 1.0% | 368,474 | 0.195 |
.id | 0.9% | 760,240 | 0.080 |
.icu | 0.8% | 7,956,385 | 0.007 |
.info | 0.8% | 7,852,896 | 0.007 |
.de | 0.7% | 22,881,115 | 0.002 |
.ke | 0.7% | 165,907 | 0.288 |
Table 1: Top 20 TLDs represented in CSC's phishing dataset, by absolute numbers
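The normalisation described above can be sketched in a few lines of Python. This is a minimal illustration, not CSC's actual tooling: the function name is hypothetical, and because raw case counts are not published in this post, the sketch uses each TLD's percentage share of phishing cases as a stand-in for the count (the common total cancels when scaling to the maximum). The published shares are rounded, so the outputs only approximate the table values.

```python
# Hypothetical helper for illustration; not CSC's implementation.
def normalised_threat_frequency(case_share_pct, registered_domains):
    """Scale cases-per-registered-domain so the highest-threat TLD scores 1."""
    raw = {
        tld: case_share_pct[tld] / registered_domains[tld]
        for tld in case_share_pct
    }
    peak = max(raw.values())
    return {tld: value / peak for tld, value in raw.items()}


# A few rows taken from the tables in this post. .gd is included because it
# is the highest-frequency TLD in the dataset and therefore sets the scale.
case_share_pct = {".com": 45.7, ".app": 6.21, ".ke": 0.68, ".gd": 0.05}
registered_domains = {
    ".com": 221_858_334,
    ".app": 1_155_807,
    ".ke": 165_907,
    ".gd": 3_306,
}

frequency = normalised_threat_frequency(case_share_pct, registered_domains)

# Rank TLDs from highest to lowest normalised threat frequency.
ranked = sorted(frequency, key=frequency.get, reverse=True)
for tld in ranked:
    print(f"{tld:5} {frequency[tld]:.3f}")
```

With these inputs, .com accounts for by far the largest share of cases yet scores roughly 0.014, while .gd, with a very small share, defines the top of the scale.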
We have observed similar patterns in other analyses of threatening content. Interisle's 'Malware Landscape 2022' study found that the top 10 TLDs associated with malware domains also featured a mix of legacy gTLDs (.com at position one, .net at five, .org at six, and .biz at 10), new gTLDs (.xyz at position two, .club at seven, and .top at nine) and ccTLDs (.br, .in, and .ru at positions three, four and eight, respectively)[3]. Eight of these 10 extensions feature in the top 14 of CSC's phishing list above. Similarly, the Anti-Phishing Working Group's (APWG's) 'Phishing Activity Trends Report' for Q4 2021 analysed top phishing TLDs, with a top nine including new gTLDs .xyz, .buzz and .vip, and ccTLDs .br and .in, alongside legacy gTLDs. In the APWG dataset, new gTLDs were more than twice as extensively represented as would be expected purely based on the total number of domains registered across these extensions[4].
A Q1 2022 study by Agari and PhishLabs showed similar patterns: the top 10 TLDs abused for phishing (by number of sites) included the new gTLDs .vip, .xyz and .monster, and the ccTLDs .br, .ly, and .tk[5,6].
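The overlap between Interisle's malware top 10 and the top 14 rows of Table 1 can be checked with a simple list comparison. The sketch below uses only the TLDs and positions quoted in this post; the variable names are illustrative.

```python
# Interisle 'Malware Landscape 2022' top 10, in the order quoted in this post.
interisle_malware_top10 = [
    ".com", ".xyz", ".br", ".in", ".net",
    ".org", ".club", ".ru", ".top", ".biz",
]

# First 14 rows of Table 1 (CSC phishing dataset, by absolute numbers).
csc_phishing_top14 = [
    ".com", ".org", ".app", ".net", ".xyz", ".ru", ".co",
    ".cn", ".me", ".dev", ".br", ".top", ".io", ".in",
]

csc_set = set(csc_phishing_top14)
shared = [tld for tld in interisle_malware_top10 if tld in csc_set]
missing = [tld for tld in interisle_malware_top10 if tld not in csc_set]

print(len(shared), "shared:", shared)
print("not in CSC top 14:", missing)
```

Only .club and .biz from the Interisle list are absent from the top 14 of the phishing list, which gives the eight shared extensions.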
Table 2 shows that the pattern is rather different when the TLDs are ranked by normalised threat frequency: the list is dominated by a distinct set of ccTLDs, includes a smaller number of new gTLDs, and excludes many of the more popular TLDs shown previously.
TLD | Normalised threat frequency within dataset | Total no. of registered domains across TLD | % of total phishing cases |
---|---|---|---|
.gd | 1.000 | 3,306 | 0.05% |
.gy | 0.910 | 4,037 | 0.05% |
.ms | 0.739 | 9,440 | 0.10% |
.zm | 0.531 | 4,838 | 0.04% |
.app | 0.377 | 1,155,807 | 6.21% |
.ly | 0.356 | 25,801 | 0.13% |
.ke | 0.288 | 165,907 | 0.68% |
.dev | 0.222 | 391,929 | 1.24% |
.page | 0.195 | 368,474 | 1.03% |
.ug | 0.187 | 10,810 | 0.03% |
.sn | 0.187 | 9,842 | 0.03% |
.do | 0.176 | 30,215 | 0.08% |
.bd | 0.127 | 37,465 | 0.07% |
.sbs | 0.120 | 44,222 | 0.08% |
.np | 0.112 | 57,379 | 0.09% |
.sh | 0.110 | 25,070 | 0.04% |
.ng | 0.097 | 240,668 | 0.33% |
.io | 0.085 | 923,588 | 1.11% |
.id | 0.080 | 760,240 | 0.86% |
.sa | 0.079 | 60,246 | 0.07% |
Table 2: Top 20 TLDs represented in CSC's phishing dataset, by normalised threat frequency
In the second article in this series, we compare these findings with those from additional datasets to produce an overall measure of TLD threat frequency, considering a range of fraudulent uses. We then consider cybersecurity implications, discuss mitigation measures, and cover how CSC can help with this process.
References
[1] https://newgtlds.icann.org/en/program-status/delegated-strings
[2] https://domainnamestat.com/statistics/tldtype/all (statistics correct as of 13 June 2022)
[3] https://interisle.net/MalwareLandscape2022.pdf
[4] https://docs.apwg.org/reports/apwg_trends_report_q4_2021.pdf
[5] https://info.phishlabs.com/hubfs/Agari%20PhishLabs_QTTI%20Report%20-%20May%202022.pdf
This article was first published on 20 October 2022 at:
https://www.cscdbs.com/blog/the-highest-threat-tlds-part-1/
Also published at:
https://circleid.com/posts/20230112-the-highest-threat-tlds-part-1