Introduction
As noted in numerous previous studies, a common objective when constructing a deceptive infringement (such as a phishing site) is to use a URL which closely resembles that of the official site being targeted.
One way in which this can be achieved is by constructing a hostname (consisting of a subdomain and domain name combination) which is identical (apart from an additional dot) to that of the genuine brand site. Active use of this technique has been observed in numerous cases. For example, for the fictitious banking brand bankbrand.com, an infringer might use a URL such as ba.nkbrand.com to target the bank's customers with a phishing attack. In order to put this type of attack into practice, the infringer needs to register a domain name which is a truncated form of the official domain name (in the above case, nkbrand.com), and then configure the required subdomain (in this case, 'ba.') to construct the full hostname.
Study methodology
In order to investigate how widely this practice is used for fraud and other brand infringements, I consider hostname-based variations of the domain names of each of the 50 most popular websites on the Internet[1] (see Appendix). For example, for the domain google.com, I investigate whether any live content exists at any of the following hostname-based variations:
- g.oogle.com
- go.ogle.com
- goo.gle.com
- goog.le.com
- googl.e.com
This approach (i.e. checking the subdomain specifically) is more robust than simply checking whether the truncated versions of the domain names (e.g. oogle.com, ogle.com, etc.) have been registered, since some of these (particularly the shortest domain names) may be in use by unrelated third parties.
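As an illustration only, the candidate hostnames for a given domain can be enumerated by inserting a single dot at each interior position of its first label. The Python sketch below is not the tooling used in this study; the function name `hostname_variants` is hypothetical, and it assumes that the brand label is the first label of the domain (the treatment of multi-level names such as news.yahoo.co.jp is not specified in this article).

```python
def hostname_variants(domain: str) -> list[tuple[str, str]]:
    """Return (hostname, registrable_domain) pairs for each dotted variant.

    Assumption: `domain` looks like '<label>.<suffix>', e.g. 'google.com'.
    Only the first label is split; everything after the first dot is
    treated as the suffix and left unchanged.
    """
    label, _, suffix = domain.partition(".")
    variants = []
    # Insert one extra dot at every interior position of the first label.
    for i in range(1, len(label)):
        subdomain, truncated = label[:i], label[i:]
        registrable = f"{truncated}.{suffix}"  # domain the infringer must register
        variants.append((f"{subdomain}.{registrable}", registrable))
    return variants


for hostname, registrable in hostname_variants("google.com"):
    print(f"{hostname:<12} requires registration of {registrable}")
```

Each pair couples the hostname to be checked with the truncated domain name that a third party would need to register in order to create it.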
Findings
Of the 262 candidate URLs[2] (i.e. the hostname-based variants of the top 50 domain names), 89 (34%) have active A records (indicating that they point at a live IP address) and 37 (14%) have active MX records (indicating that they have been configured to be able to send and receive e-mails), as shown in Figure 1. Significantly (where whois information is available), only six (2.6%) of the 233[3] truncated domain-name variants are registered to the brand owner who could be targeted using an associated hostname infringement.
Figure 1: Breakdown of URLs by presence of A and MX records
The 89 URLs with active A records display a range of content types, including:
- Live third-party content - Pages where the URL resolves or re-directs to content unrelated to the brand in question (i.e. traffic misdirection)
- PPC - Sites monetised through the inclusion of pay-per-click links
- Domain-for-sale pages - Pages where the domain name is explicitly being offered for sale
A breakdown of the numbers is shown in Figure 2.
Figure 2: Breakdown of URLs with active A records by content type
It is worth noting that some of the instances of URLs resolving to live content may arise through the use of wildcard DNS records[4] (i.e. where the domain has been configured such that any arbitrary subdomain will resolve, rather than the specific subdomain having been explicitly configured). However, any URL pointing to a live IP address raises the potential for fraudulent or infringing use. At the time of analysis, none of the 262 URLs resolved to live phishing sites targeting the brand in question; however, it has been previously noted that in many cases, sites are left in a dormant state - in some cases, for an extended period of time - before being weaponised[5,6]. Consequently, many of the sites resolving to parking, holding or inactive pages may be worthy of monitoring for future changes in content. Furthermore, some of the identified instances of URLs resolving to third-party content may be of particular concern to the brand owner, if they misdirect web-users to competitor content or provide an undesirable brand association. Some examples include:
- Hostname-based variant of google[.]com → Resolves to a page promoting a VPN product
- Hostname-based variant of yandex[.]com → Re-directs to a flight-sales website
- Hostname-based variant of xvideos[.]com → Resolves to a third-party adult website
- Hostname-based variant of pornhub[.]com → Re-directs to a third-party adult website
- Hostname-based variant of linkedin[.]com → Resolves to a gambling-site portal page
- Hostname-based variant of ebay[.]com → Re-directs to the Google website
Additionally, the frequency of PPC pages within the dataset indicates how popular it is among infringers to monetise domains whilst they are in a dormant state. Furthermore, the fact that many of these examples display content unrelated to the brand in question may also suggest that they have been configured to attract web traffic arising from mistyped browser requests, rather than being intended as explicitly deceptive variants of the brand domain name in question.
As a final observation, for each of the 233 potentially infringing domain names in the dataset which is registered and has whois information available, we can compare the date of registration with the length (in characters) of the second-level domain (SLD) name string, i.e. the portion of the domain name prior to the TLD, or domain extension (Figure 3).
Figure 3: Comparison of date of registration with length of the SLD name, for the domains comprising truncated versions (with the right-hand end retained) of the top 50 most popular domain names
The dataset shows that the domains in question have been registered over an extended period, between 1986 and 2022. The shorter domain names - i.e. those which are more likely to have been used for unrelated third-party or generic use - tend to comprise the oldest registrations. However, many of the domains with longer SLD string lengths - i.e. those less likely to be associated with 'accidental' brand collisions, and more likely to have been registered specifically to create hostname-based infringements - tend to have been registered over the last few years, highlighting a potential growth in popularity over time of this particular attack vector.
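For readers wishing to reproduce a comparison of this kind, the SLD length can be computed and grouped against registration year as sketched below. The helper names are hypothetical, the domain suffix is supplied explicitly (a production version would more likely consult a public suffix list), and the records in the demonstration are invented placeholders rather than data from this study.

```python
from collections import defaultdict
from statistics import median


def sld_length(domain: str, suffix: str) -> int:
    """Length of the SLD, i.e. the label immediately to the left of `suffix`."""
    if not domain.endswith("." + suffix):
        raise ValueError(f"{domain!r} does not end with suffix {suffix!r}")
    stem = domain[: -(len(suffix) + 1)]   # strip '.<suffix>'
    return len(stem.rsplit(".", 1)[-1])   # keep only the right-most label


def median_year_by_length(records):
    """records: iterable of (domain, suffix, registration_year) tuples.

    Returns {sld_length: median registration year}, ordered by length.
    """
    years = defaultdict(list)
    for domain, suffix, year in records:
        years[sld_length(domain, suffix)].append(year)
    return {n: median(ys) for n, ys in sorted(years.items())}


# Placeholder records for illustration only (not study data).
records = [
    ("d.example", "example", 1990),
    ("k.example", "example", 1994),
    ("nkbrand.example", "example", 2021),
]
print(median_year_by_length(records))
```

A rising median year with increasing SLD length would be consistent with the trend described above, although the scatter of individual registrations in Figure 3 carries more information than a single summary statistic.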
Summary and recommendations
The proportion of hostname-based infringements resolving to live content, or configured with active A and/or MX records - combined with previous observations of the use of this type of infringement as a phishing attack vector - highlights the scale of this infringement type as a potential source of concern. Consequently, brand owners may wish to consider proactively registering or acquiring domain names comprising truncated versions (where the right-hand end is retained) of their core domain name, to prevent registration and abuse by a third party. In cases where acquisition is not possible, it may be advisable to monitor the hostname-based infringements for future changes in content and - if and when active infringing content is detected - to launch a timely enforcement action for the takedown of the material.
Appendix
Top 50 most popular websites according to Similarweb (October 2022).
Rank | Website | Category |
---|---|---|
1 | google[.]com | Computers Electronics and Technology → Search Engines |
2 | youtube[.]com | Arts & Entertainment → Streaming & Online TV |
3 | facebook[.]com | Computers Electronics and Technology → Social Media Networks |
4 | twitter[.]com | Computers Electronics and Technology → Social Media Networks |
5 | instagram[.]com | Computers Electronics and Technology → Social Media Networks |
6 | baidu[.]com | Computers Electronics and Technology → Search Engines |
7 | wikipedia[.]org | Reference Materials → Dictionaries and Encyclopedias |
8 | yandex[.]ru | Computers Electronics and Technology → Search Engines |
9 | yahoo[.]com | News & Media Publishers |
10 | xvideos[.]com | Adult |
11 | whatsapp[.]com | Computers Electronics and Technology → Social Media Networks |
12 | pornhub[.]com | Adult |
13 | amazon[.]com | eCommerce & Shopping → Marketplace |
14 | xnxx[.]com | Adult |
15 | yahoo[.]co[.]jp | News & Media Publishers |
16 | live[.]com | Computers Electronics and Technology → Email |
17 | netflix[.]com | Arts & Entertainment → Streaming & Online TV |
18 | docomo[.]ne[.]jp | Computers Electronics and Technology → Telecommunications |
19 | tiktok[.]com | Computers Electronics and Technology → Social Media Networks |
20 | reddit[.]com | Computers Electronics and Technology → Social Media Networks |
21 | office[.]com | Computers Electronics and Technology → Programming and Developer Software |
22 | linkedin[.]com | Computers Electronics and Technology → Social Media Networks |
23 | dzen[.]ru | Community and Society → Faith and Beliefs |
24 | vk[.]com | Computers Electronics and Technology → Social Media Networks |
25 | xhamster[.]com | Adult |
26 | samsung[.]com | Computers Electronics and Technology → Consumer Electronics |
27 | turbopages[.]org | News & Media Publishers |
28 | mail[.]ru | Computers Electronics and Technology → Email |
29 | bing[.]com | Computers Electronics and Technology → Search Engines |
30 | naver[.]com | News & Media Publishers |
31 | microsoftonline[.]com | Computers Electronics and Technology → Programming and Developer Software |
32 | twitch[.]tv | Games → Video Games Consoles and Accessories |
33 | discord[.]com | Computers Electronics and Technology → Social Media Networks |
34 | bilibili[.]com | Arts & Entertainment → Animation and Comics |
35 | pinterest[.]com | Computers Electronics and Technology → Social Media Networks |
36 | zoom[.]us | Computers Electronics and Technology → Other Computers Electronics and Tech. |
37 | weather[.]com | Science and Education → Weather |
38 | qq[.]com | News & Media Publishers |
39 | microsoft[.]com | Computers Electronics and Technology → Programming and Developer Software |
40 | globo[.]com | News & Media Publishers |
41 | roblox[.]com | Games → Video Games Consoles and Accessories |
42 | duckduckgo[.]com | Computers Electronics and Technology → Search Engines |
43 | news[.]yahoo[.]co[.]jp | News & Media Publishers |
44 | quora[.]com | Reference Materials → Dictionaries and Encyclopedias |
45 | msn[.]com | News & Media Publishers |
46 | realsrv[.]com | Adult |
47 | fandom[.]com | Arts & Entertainment → Other Arts and Entertainment |
48 | ebay[.]com | eCommerce & Shopping → Marketplace |
49 | aajtak[.]in | News & Media Publishers |
50 | ok[.]ru | Computers Electronics and Technology → Social Media Networks |
References
[1] https://www.similarweb.com/top-websites/ (data correct for October 2022)
[2] All observations correct as of 11-Nov-2022
[3] Excluding duplicates
[4] https://en.wikipedia.org/wiki/Wildcard_DNS_record
[5] https://www.cscdbs.com/en/resources-news/impact-of-covid-on-internet-security/
[6] https://unit42.paloaltonetworks.com/strategically-aged-domain-detection/
This article was first published on 7 February 2023 at:
https://www.linkedin.com/pulse/exploring-domain-hostname-based-infringements-david-barnett/