Thursday, 26 December 2024

Brand Protection Data is (Still) Beautiful - Part 1: 'Year' domains

Introduction

My recent study of new-year-related domain names[1] highlighted the case of 2025[.]com, registered on 23-Aug-1998 (the oldest '2025'-specific .com domain currently registered). The domain is also an example of a numeric name, the subject of another recent study[2]. 'Year' domain names - that is, those where the SLD (second-level domain, i.e. the part of the domain name to the left of the dot) is simply a four-digit number in the form of a year in the modern era - can be highly attractive from the point of view of memorability, use-cases, search-engine prominence and tradability. For example, 2025[.]org resolves to a Sedo domain marketplace page offering the domain name for sale at $1M.

In this study, I consider the sets of 200 domain names with SLDs between '1900' and '2099', across popular domain extensions (top-level domains, or TLDs), to identify any trends and patterns in the registrations.

Analysis

The analysis considers the 'big five' legacy TLDs (that is, .com, .net, .biz, .org and .info), for which comprehensive information is available, both from zone-file data (though this is not strictly necessary in this study, because the SLDs are defined in advance) and from domain registration data obtained through automated whois look-ups. It is worth noting that all 1,000 possible domain names within the dataset are already taken, with none available for registration.
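As an illustration of how the candidate set can be defined in advance, the following is a minimal Python sketch that enumerates the 200 SLDs across the five TLDs. The function name `year_domains` is a hypothetical helper introduced here for illustration, and the sketch does not perform any whois look-ups.

```python
from itertools import product

# The five legacy TLDs and the 200 'year' SLDs considered in the study
TLDS = ["com", "net", "biz", "org", "info"]
SLDS = [str(year) for year in range(1900, 2100)]  # '1900' .. '2099'


def year_domains(slds=SLDS, tlds=TLDS):
    """Return every SLD.TLD combination as a plain domain-name string."""
    return [f"{sld}.{tld}" for sld, tld in product(slds, tlds)]


domains = year_domains()
print(len(domains))  # 200 SLDs x 5 TLDs = 1000
```

Each name in the resulting list could then be passed to whichever whois look-up tool is in use.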

Figure 1 shows the registration dates for each of the 200 .com examples, as a function of the SLD (i.e. the year string). The domains were registered over a period between 13-Jan-1995 (2020[.]com) and 28-Jul-2010 (2038[.]com) - noting that these are the most recent registration dates, and some of the names may have been registered previously and subsequently allowed to lapse. 2038[.]com, for example, actually has a registration history (based on cached historical records) dating back to 22-May-1998.

Figure 1: (Most recent) registration (i.e. creation) dates for the 200 .com 'year' domains in the dataset

The next point to note is that groups of potentially related registrations appear on the graph as horizontal 'clusters' (i.e. similarly named domains appearing at identical or similar dates). For example, two obvious such groups are:

  • 2081[.]com, 2082[.]com, 2083[.]com, 2085[.]com, 2086[.]com, 2087[.]com, 2089[.]com, 2093[.]com, 2094[.]com, 2096[.]com, 2098[.]com - all registered on 17 or 18-Nov-1999
  • 2034[.]com, 2037[.]com, 2041[.]com, 2043[.]com, 2044[.]com, 2046[.]com, 2047[.]com, 2049[.]com, 2051[.]com, 2052[.]com, 2053[.]com, 2054[.]com, 2055[.]com, 2056[.]com, 2057[.]com, 2065[.]com, 2066[.]com - all registered between 09 and 11-Dec-1999
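The idea of spotting groups of registrations made at identical or similar dates can be sketched in code. The example below is only one possible approach (not necessarily the method used to produce the figures): it sorts the records by creation date and starts a new group whenever the gap to the previous record exceeds a chosen number of days. The function name `cluster_by_date` and the two-day gap are assumptions for illustration; the sample dates are taken from this article. Note that the sketch considers dates only and ignores how close the SLDs are to each other.

```python
from datetime import date, timedelta


def cluster_by_date(records, max_gap_days=2):
    """Group (domain, created) pairs into runs of closely spaced dates.

    A new group starts whenever the gap between consecutive creation
    dates (after sorting) is greater than max_gap_days.
    """
    ordered = sorted(records, key=lambda rec: rec[1])
    clusters = []
    for domain, created in ordered:
        previous = clusters[-1][-1][1] if clusters else None
        if previous is not None and created - previous <= timedelta(days=max_gap_days):
            clusters[-1].append((domain, created))
        else:
            clusters.append([(domain, created)])
    return clusters


# Sample (most recent) creation dates quoted in this article
records = [
    ("2020.com", date(1995, 1, 13)),
    ("2037.com", date(1999, 12, 9)),
    ("2046.com", date(1999, 12, 9)),
    ("2049.com", date(1999, 12, 10)),
    ("2038.com", date(2010, 7, 28)),
]

clusters = cluster_by_date(records)
# Keep only groups with more than one member
multi = [[domain for domain, _ in group] for group in clusters if len(group) > 1]
print(multi)  # [['2037.com', '2046.com', '2049.com']]
```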

Like many domains in the post-GDPR world (and as remarked in my recent article on 'dark' whois records[3]), the whois details for these domains are almost all entirely redacted, meaning that other factors - such as commonalities in registration dates (as discussed here), the registrar, and historical records (as discussed below) - are necessary in identifying probable clusters of associated registrations. Nevertheless, in cases of potential brand infringements or other fraudulent use (for example), this exercise can be key in identifying serial infringers, demonstrating bad-faith activity and allowing bulk takedowns.

The analysis can then be widened out by adding in the data for the other four TLDs, as shown in Figure 2.

Figure 2: (Most recent) registration (i.e. creation) dates for the 1,000 'year' domains in the dataset (covering all five TLDs considered)

Apart from the high-level trends (for example, that the most recent registrations of the .com and .net names are, in general, considerably older than those of the .biz and .info names), other groups of registrations which are likely linked to each other become apparent. With the data plotted in this format, registrations covering similar SLDs at similar or identical dates, even when registered across different TLDs, appear as physical clusters on the plot.

What is perhaps less straightforward to see in this format is the (arguably most significant) case where the same SLD is registered across different TLDs on the same date (in which case the data points will overlay each other). These cases can be explicitly identified by visualising the data in a different way: for each SLD, there are five TLDs (or distinct domain names) (.com, .net, .biz, .org, .info) being considered, and therefore ten pairs of domains (net/com, biz/com, org/com, info/com, biz/net, org/net, info/net, org/biz, info/biz, info/org) for which the registration dates are to be compared. Appendix A shows the intervals (in days) between the registrations of the same SLD for each of the ten pairs of TLDs. The simplest way of drawing insights from the data is by highlighting all cases where the interval is less than a certain threshold (in this case, 7 days) - i.e. where the domains SLD.TLD1 and SLD.TLD2 were registered less than a week apart.
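The pairwise comparison described above can be sketched as follows. This is a minimal illustration rather than the actual analysis code: the function name `close_pairs` and the input layout (a dictionary keyed by SLD and TLD) are assumptions, while the sample dates are taken from Appendix B and from the text above.

```python
from datetime import date
from itertools import combinations

TLDS = ["com", "net", "biz", "org", "info"]
TLD_PAIRS = list(combinations(TLDS, 2))  # the ten TLD pairs


def close_pairs(created, threshold_days=7):
    """Find same-SLD registrations made less than threshold_days apart.

    `created` maps (sld, tld) -> creation date. Returns a list of
    (sld, tld1, tld2, interval_in_days) tuples.
    """
    hits = []
    for sld in sorted({sld for sld, _ in created}):
        for tld1, tld2 in TLD_PAIRS:
            d1 = created.get((sld, tld1))
            d2 = created.get((sld, tld2))
            if d1 is None or d2 is None:
                continue  # no date available for one side of the pair
            interval = abs((d1 - d2).days)
            if interval < threshold_days:
                hits.append((sld, tld1, tld2, interval))
    return hits


# Sample creation dates quoted in this article
created = {
    ("2034", "com"): date(1999, 12, 8),
    ("2034", "net"): date(1999, 12, 10),
    ("2052", "com"): date(1999, 12, 10),
    ("2052", "net"): date(1999, 12, 14),
    ("2054", "com"): date(1999, 12, 10),
    ("2054", "net"): date(1999, 12, 10),
    ("2038", "com"): date(2010, 7, 28),
}

for hit in close_pairs(created):
    print(hit)
```

With the sample data, the com/net pairs for 2034, 2052 and 2054 are flagged, with intervals of 2, 4 and 0 days respectively.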

From this analysis, we can widen out the second of the two potential clusters listed above (for example) to include non-.com examples (Table 1).

* Those domains least likely to be connected to the remainder of the cluster

Table 1: Registration details for a potential cluster of associated registrations

If the associated sites represented some sort of infringement requiring enforcement, in many cases it may be helpful to uncover 'real-world' contact details for the actual domain owner(s). Possible ways of achieving this objective may be to launch some sort of domain dispute (though this can be slow and costly) or through an unmasking request to (say) the registrar (though this typically requires proof of a breach of terms and conditions, and registrars differ markedly in their levels of compliance). It is often more efficient to use an open-source intelligence (OSINT) investigation approach, which can include analysis of the current or cached historical content of the websites in question, or analysis of cached whois records. In many cases, whois records yielded richer information prior to the introduction of GDPR in 2018 and - given in these cases that the domains have been continuously registered since December 1999 - contact details from any point subsequent to this date may be associated with the current owner. For 2034[.]com (for example), the following historical details are given:

By following these threads across the other domains, a deeper view of the cluster can be ascertained. For example, the earliest of the three e-mail addresses listed above appears in the historical whois records of 80 domains in total - including a number of additional numeric domains - giving an overview of the owner's portfolio (Appendix B). One of these appears to be the owner's personal website; it is not currently active, but a cached view from 2002 from archive.org[4] includes a range of pieces of personal information (Figure 3).
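The 'pivot' from a contact detail to a wider portfolio can be thought of as building a reverse index over historical whois records. The sketch below assumes the historical records have already been collected into a dictionary of domain to e-mail addresses; the function name `domains_by_email` and all of the sample domains and addresses are hypothetical placeholders, not data from this study.

```python
from collections import defaultdict


def domains_by_email(history):
    """Build a reverse index: e-mail address -> set of domains.

    `history` maps each domain to the e-mail addresses seen in its
    historical whois records. Addresses are lower-cased so that the
    same address in different cases is counted once.
    """
    index = defaultdict(set)
    for domain, emails in history.items():
        for email in emails:
            index[email.strip().lower()].add(domain)
    return index


# Hypothetical placeholder records, for illustration only
history = {
    "example-one.com": ["Owner@Example.org"],
    "example-two.net": ["owner@example.org", "admin@example.net"],
    "example-three.com": ["admin@example.net"],
}

index = domains_by_email(history)
print(sorted(index["owner@example.org"]))  # ['example-one.com', 'example-two.net']
```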

Figure 3: A historical view of the website at kylecrothers[.]com

Appendix A: Intervals (in days) between the registration of the same SLD across different TLDs (for each of the ten TLD pairs under consideration)

Instances where this value is less than 7 days are highlighted in red.

Appendix B: Domains where kyle_crothers[at]bigfoot.com appears in the historical whois record

  Domain   Created   Registrar
  1087[.]com 08-Dec-1999   GoDaddy.com, LLC
  1096[.]com 11-Jan-2009   GoDaddy.com, LLC
  1125[.]com 09-Dec-1999   Name.com, Inc.
  1224[.]com 09-Dec-1999   Alibaba Cloud Computing Ltd. d/b/a HiChina (www.net.cn)
  1650[.]com 13-Dec-1999   eName Technology Co.,Ltd.
  1772[.]com 13-Dec-1999   eName Technology Co.,Ltd.
  1889[.]com 09-Dec-1999   GoDaddy.com, LLC
  1908[.]com 08-Dec-1999   GoDaddy.com, LLC
  1914[.]com 09-Dec-1999   Name.com, Inc.
  1924[.]com 09-Dec-1999   eName Technology Co.,Ltd.
  2026[.]net 13-Dec-1999   GoDaddy.com, LLC
  2032[.]net 13-Dec-1999   GoDaddy.com, LLC
  2034[.]com 08-Dec-1999   GoDaddy.com, LLC
  2034[.]net 10-Dec-1999   GoDaddy.com, LLC
  2037[.]com 09-Dec-1999   GoDaddy.com, LLC
  2041[.]com 08-Dec-1999   GoDaddy.com, LLC
  2043[.]com 08-Dec-1999   GoDaddy.com, LLC
  2044[.]com 08-Dec-1999   GoDaddy.com, LLC
  2044[.]net 10-Dec-1999   GoDaddy.com, LLC
  2046[.]com 09-Dec-1999   Name.com, Inc.
  2047[.]com 08-Dec-1999   GoDaddy.com, LLC
  2049[.]com 10-Dec-1999   GoDaddy.com, LLC
  2051[.]com 10-Dec-1999   GoDaddy.com, LLC
  2052[.]com 10-Dec-1999   GoDaddy.com, LLC
  2052[.]net 14-Dec-1999   GoDaddy.com, LLC
  2053[.]com 10-Dec-1999   GoDaddy.com, LLC
  2054[.]com 10-Dec-1999   GoDaddy.com, LLC
  2054[.]net 10-Dec-1999   GoDaddy.com, LLC
  2055[.]com 10-Dec-1999   GoDaddy.com, LLC
  2055[.]net 10-Dec-1999   GoDaddy.com, LLC
  2056[.]com 10-Dec-1999   GoDaddy.com, LLC
  2056[.]net 10-Dec-1999   GoDaddy.com, LLC
  2057[.]com 10-Dec-1999   GoDaddy.com, LLC
  24-7admin[.]com 18-Oct-2018   NameSilo, LLC
  2772[.]com 10-Dec-1999   Alibaba Cloud Computing Ltd. d/b/a HiChina (www.net.cn)
  30000[.]com 14-Dec-1999   eName Technology Co.,Ltd.
  3773[.]com 09-Dec-1999   GoDaddy.com, LLC
  40000[.]com 14-Dec-1999   eName Technology Co.,Ltd.
  4010[.]com 08-Dec-1999   Deutsche Telekom AG
  4020[.]com 13-Dec-1999   Squarespace Domains II LLC
  5010[.]com 01-Sep-1999   GoDaddy.com, LLC
  5025[.]com 11-Dec-1999   Name.com, Inc.
  6010[.]com 08-Dec-1999   GoDaddy.com, LLC
  6050[.]com 13-Dec-1999   GoDaddy.com, LLC
  7010[.]com 08-Dec-1999   GoDaddy.com, LLC
  8010[.]com 08-Dec-1999   eName Technology Co.,Ltd.
  8050[.]com 13-Dec-1999   GoDaddy.com, LLC
  barfface[.]com 08-Dec-1999   GoDaddy.com, LLC
  bigjoke[.]com 16-Dec-1999   GoDaddy.com, LLC
  crothers[.]org 24-Feb-2015   GoDaddy.com, LLC
  e2011[.]com 14-Jan-2000   NameSilo, LLC
  e2012[.]com 14-Jan-2000   GoDaddy.com, LLC
  e2015[.]com 14-Jan-2000   GoDaddy.com, LLC
  geekslacker[.]com 18-Dec-1999   GoDaddy.com, LLC
  greenlobster[.]com 08-Jun-2000   GoDaddy.com, LLC
  i2011[.]com 14-Jan-2000   NameSilo, LLC
  i2012[.]com 14-Jan-2000   GoDaddy.com, LLC
  i2014[.]com 10-Mar-2000   GoDaddy.com, LLC
  i2015[.]com 14-Jan-2000   GoDaddy.com, LLC
  kyle-crothers[.]com 08-Dec-1999   godaddy.com, llc
  kylecrothers[.]com 09-Dec-1999   GoDaddy.com, LLC
  lawson-techs[.]com 24-Aug-1999   godaddy.com, llc
  lawsonadmin[.]com 15-Aug-2000   GoDaddy.com, LLC
  lawsonadmin[.]net 28-Sep-2000   GO DADDY SOFTWARE INC
  lawsonadmin[.]org 28-Sep-2000   GO DADDY SOFTWARE INC
  lawsonexperts[.]com 03-Jan-2001   GoDaddy.com, LLC
  lawsonexperts[.]net 23-Mar-2017   RJG VENTURES, L.L.C
  lawsonpeople[.]com 16-Mar-2001   GoDaddy.com, LLC
  lawsonpeople[.]net 16-Mar-2001   GoDaddy.com, LLC
  lawsonrecruiting[.]com 16-Mar-2001   GoDaddy.com, LLC
  lawsonrecruiting[.]net 16-Mar-2001   GoDaddy.com, LLC
  powermodem[.]com 14-Dec-1999   GoDaddy.com, LLC
  powermodems[.]com 14-Dec-1999   GoDaddy.com, LLC
  quickwires[.]com 16-Dec-1999   GoDaddy.com, LLC
  realtx[.]net 10-Dec-1999   GoDaddy.com, LLC
  stefco[.]com 06-Dec-1995   Squarespace Domains II LLC
  streetyacht[.]com 29-Nov-2013   TurnCommerce, Inc. DBA NameBright.com
  streetyachts[.]com 29-Nov-2019   TurnCommerce, Inc. DBA NameBright.com
  tourneyrank[.]com 03-Apr-2021   Gname.com Pte. Ltd.
  tournyrank[.]com 30-Aug-2006   godaddy.com, llc

References

[1] https://www.iamstobbs.com/opinion/christmas-and-new-year-brand-protection-trends-new-year-domain-names

[2] https://www.iamstobbs.com/opinion/the-universe-of-numeric-domain-names

[3] https://www.iamstobbs.com/opinion/its-a-dark-whois-world

[4] https://web.archive.org/web/20020604060934/http://kylecrothers.com/

This article was first published on 26 December 2024 at:

https://www.linkedin.com/pulse/brand-protection-data-still-beautiful-part-1-year-domains-barnett-juwhe/
