Monday, 23 June 2025

'Notorious IP Addresses' and initial steps towards the formulation of an overall threat score for websites

Part of the 'Patterns in Brand Monitoring: Brand Protection Data is Beautiful' series of articles[1,2,3]

EXECUTIVE SUMMARY

The ability to rank results according to the level of threat they pose is a key component of many brand protection services, offering the ability to identify priority targets for further analysis, content tracking or enforcement.

Metrics providing the capability to rank results in this way are often based on a range of website characteristics, including webpage content and technical configuration features of the associated domain name. 

This study considers the case of website hosting characteristics, with a specific focus on the IP address at which the website is hosted. The IP address - and, by extension, the associated hosting service provider - can be an important factor to consider, as hosting providers can vary in their level of attractiveness to infringers, based on a range of factors such as their compliance with takedown requests.

The analysis presented in this case utilises data from an IP address 'blacklist', compiled using insights from any identified association of the address in question with content found to be infringing, such as use for spamming or malware distribution. The construction of one possible formulation of a threat-score component based on the host IP-address is then presented, calculated using the proximity of the IP address in question to other addresses explicitly included in the blacklist. The algorithm is based on the subdivision of IP-address space into 'netblocks', across which patterns in the frequency of infringing content are also considered.

This article was first published on 19 June 2024 at:

https://www.iamstobbs.com/insights/notorious-ip-addresses-and-initial-steps-towards-the-formulation-of-an-overall-threat-score-for-websites

* * * * *

WHITE PAPER

Introduction

The identification of those website characteristics which are disproportionately associated with infringing or illicit activity is a key element of the process of threat quantification for brand-protection findings. Quantifying the level of (current or potential future) risk of an identified domain name or website has a number of benefits, including the ability to identify priority targets for further analysis, content tracking or enforcement, amongst potentially large datasets[4,5]. The same datasets can also provide insights into 'clusters' of associated findings[6], and into the likelihood of enforcement success in any particular case.

Two such characteristics are the registrar (the organisation through which the associated domain name was registered) and the hosting provider (the organisation supplying the physical infrastructure - i.e. a webserver - on which a site is hosted) of any given website in question. There are many possible reasons why specific registrars and hosting providers may be disproportionately popular with infringers, including differences in their inherent level of cooperativeness ('compliance') to notifications of IP infringements, their speed of response[7], geographic region(s) of operation, and so on. 

In the case of registrars, various research organisations collate information on those providers which are found to be more commonly associated with infringing activity of a range of types. The most meaningful datasets are those in which the numbers of infringements are expressed as a proportion of the total number of domains registered, to give an overall 'trust' or 'reputation' score for each registrar, rather than just considering the raw numbers of infringing sites (since this will skew the data towards those registrars which are simply more popular generally). One such dataset is that provided by Spamhaus[8], which (as of 29-Jan-2025) gives the top five 'low-trust' registrars (by quantitative 'bad reputation score') as 'Ultahost, Inc.', 'Domain International Services Limited', 'nicenic.net (ZhuHai NaiSiNiKe Information Technology)', '香港翼优有限公司' ('Hong Kong Wingyou Co., Ltd.') and 'Dnsgulf Pte. Ltd.'. Note that this list does not necessarily imply that the registrars in question are non-compliant with enforcement notices, although it has been noted that the frequent or repeated association of a registrar with infringing activity is often an indication of non-compliance[9]. Examples of non-compliant registrars are also discussed in forums between infringers looking for providers to use for their content[10].

Moreover, many brand protection service providers will have collated (in many cases, quantifiable) information on the compliance of individual registrars, based on their previous enforcement experience. This allows for the construction of a risk 'score' for each registrar, which can serve as an input into algorithms for quantifying the overall level of potential threat of any associated website.

Similar comments are also true of hosting providers. Indeed, some providers explicitly bill themselves as 'bulletproof' - implying a lack of compliance with enforcement notices - as a means of attracting business from providers of illicit content (Figure 1).

Figure 1: Examples of websites of self-proclaimed ‘bulletproof’ hosting providers (and/or registrars)

Other websites also exist to serve as resources for content producers looking for recommendations of non-compliant providers (Figure 2).

Figure 2: Website offering recommendations of 'bulletproof' hosting providers

As with 'high-risk' registrars, a number of resources are also available in which information on infringing hosting providers is collated, such as that provided, again, by Spamhaus[11].

In this study, we aim to collate information from a related dataset; namely a 'blacklist' of IP addresses which has been compiled based on reports of associated infringing activity of a variety of types, and from a range of sources. This analysis aims to identify any trends and patterns in the groups of high-risk IP addresses[12] - and, by extension, the hosting providers with which they are associated - as a means of establishing additional datasets which could be used as data inputs into algorithms for assessing overall website potential risk level (i.e. if a website is hosted on a high-risk IP address, it is potentially more likely to be associated with illicit activity). 

Analysis

The dataset used in this case is the IP address blacklist provided by Myip.ms[13], containing around 169,000 listings (0.0039% of the total possible IP-space)[14] as of 29-Jan-2025. The (IPv4) addresses are of the format xx.xx.xx.xx, where each 'xx' is a number between 0 and 255. In this study, we use the terminology 'netblock' to refer to a group of IP addresses with the same initial elements; a group of addresses of the form A.xx.xx.xx (with fixed 'A') would be a 'first-level netblock', A.B.xx.xx a 'second-level netblock' and A.B.C.xx a 'third-level'.
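The netblock terminology can be made concrete with a short Python sketch. The function name `netblock` and the 'xx' label format are illustrative choices rather than part of the original analysis; the final lines simply re-derive the quoted share of IP-space from the approximate listing count.

```python
import ipaddress

def netblock(ip: str, level: int) -> str:
    """Return the first-, second- or third-level netblock label for an IPv4 address."""
    if level not in (1, 2, 3):
        raise ValueError("level must be 1, 2 or 3")
    # IPv4Address rejects malformed input (e.g. octets above 255).
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(octets[:level] + ["xx"] * (4 - level))

# Share of the IPv4 address space covered by roughly 169,000 listings.
share_percent = 169_000 / 2**32 * 100

print(netblock("159.138.153.7", 1))  # 159.xx.xx.xx
print(netblock("159.138.153.7", 2))  # 159.138.xx.xx
print(netblock("159.138.153.7", 3))  # 159.138.153.xx
print(f"{share_percent:.4f}%")       # 0.0039%
```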

The most obvious initial stage of analysis would simply be to consider the hosting provider and country associated with each of the IP addresses in the dataset. This 'granular' approach in some ways provides more meaningful information than any insights gained by grouping together the individual IP addresses into their respective netblocks, not least because there is not necessarily any reason to believe that all addresses in a particular netblock are associated with each other, or with a common hosting provider (although it is often the case that major providers may control entire netblocks). Nevertheless, a netblock-based analysis can provide some useful insights.

The most obvious observation is that the blacklisted IP addresses are not distributed evenly across IP-space; Figure 3 shows the total number of such addresses within each first-level netblock.

 

Figure 3: Total number of blacklisted IP addresses within each first-level netblock

The 10.xx.xx.xx, 11.xx.xx.xx and 127.xx.xx.xx netblocks, and all blocks from 224.xx.xx.xx onwards, do not contain any blacklisted addresses. The majority of these have special uses, however, such as the 127 netblock, which is reserved for (internal) loopback addresses[15], and the 10 netblock, reserved for private networks[16].

Next, we consider the IP address 'universe' grouped into second-level netblocks (A.B.xx.xx), of which there are 65,536 (i.e. 256²) in total. Using this framework, it is possible to determine how many blacklisted IP addresses appear in each block (which may provide valuable insights, working on the principle that those blocks more highly populated with blacklisted addresses could, all other factors being equal, be deemed 'higher risk' for any arbitrary associated other websites). This dataset is presented graphically in Figure 4.

Figure 4: Number of blacklisted IP addresses (out of a possible maximum of 65,536) in each second-level netblock - first-level address component ('A' in A.B.xx.xx) (from 0 to 255) shown across the horizontal axis; second-level address component ('B' in A.B.xx.xx) (from 0 to 255) shown down the vertical axis
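As a minimal sketch of how the counts behind such a grid might be assembled (the function names and the three sample addresses are purely illustrative, and the blacklist is assumed to be a simple collection of dotted IPv4 strings):

```python
from collections import Counter

def second_level_counts(blacklist):
    """Count blacklisted addresses per (A, B) pair, i.e. per A.B.xx.xx netblock."""
    counts = Counter()
    for ip in set(blacklist):  # de-duplicate, in case an address is listed twice
        a, b, _, _ = (int(part) for part in ip.split("."))
        counts[(a, b)] += 1
    return counts

def as_grid(counts):
    """Lay the counts out as a 256 x 256 grid: rows indexed by B, columns by A."""
    grid = [[0] * 256 for _ in range(256)]
    for (a, b), n in counts.items():
        grid[b][a] = n
    return grid

# Illustrative addresses only, not taken from the blacklist.
sample = ["114.119.0.1", "114.119.7.20", "159.138.153.4"]
counts = second_level_counts(sample)
grid = as_grid(counts)
print(counts.most_common(1))  # [((114, 119), 2)]
```

The resulting grid can then be passed to any plotting tool to shade each cell by its count; `counts.most_common(10)` gives a ranking of the most heavily populated blocks.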

The next associated insight is the identification of those individual netblocks which contain the greatest numbers of blacklisted addresses (i.e. the brightest 'hotspots' in the figure), of which the top ten are shown in Table 1.

Netblock          No. blacklisted addresses
114.119.xx.xx     2,353
159.138.xx.xx     1,606
104.21.xx.xx      1,253
172.67.xx.xx      986
47.251.xx.xx      882
17.241.xx.xx      670
183.130.xx.xx     658
54.36.xx.xx       604
3.145.xx.xx       507
116.2.xx.xx       496

Table 1: Top ten 'high-risk' (second-level) netblocks, by the numbers of blacklisted IP addresses (out of a possible maximum of 65,536)

In addition to the individual 'hotspot' netblocks, a number of vertical 'stripes' are present in the visualisation, indicating groups of adjacent netblocks, all (or many) of which are associated with unusually high levels of infringements (and also more strongly suggesting meaningful links between them). Examples include the first-level netblocks 45.xx.xx.xx (3,279 blacklisted addresses out of a possible 16.7 million (i.e. 256³)), 103.xx.xx.xx (4,210 addresses), 185.xx.xx.xx (4,867 addresses) (red arrows in Figure 5), and the groups of second-level blocks 35.159.xx.xx to 35.243.xx.xx (962 addresses out of a possible 5.5 million), 54.144.xx.xx to 54.246.xx.xx (1,492 / 6.8 million), and 91.190.xx.xx to 91.247.xx.xx (1,026 / 3.8 million) (blue arrows in Figure 5).

Figure 5: Version of Figure 4, but with arrows highlighting 'clusters' of adjacent netblocks all (or many) of which contain high numbers of blacklisted IP addresses

Note that it would also be possible to carry out a similar analysis looking at the third-level netblocks, in which the equivalent of Figure 4 would be a visualisation as a 3D cube. Although a graphical analysis is somewhat more cumbersome, it is a relatively simple matter to identify the highest-risk netblocks (by the number of blacklisted IP addresses - out of a possible maximum of 256 - contained within them), in a way analogous to Table 1. This analysis is shown in Table 2, for all third-level netblocks in which at least half the IP addresses are blacklisted.

Netblock          No. blacklisted addresses
54.36.148.xx      256
195.154.122.xx    255
95.108.213.xx     254
213.180.203.xx    253
87.250.224.xx     252
110.52.235.xx     252
17.241.219.xx     226
17.241.75.xx      225
17.241.227.xx     219
5.255.231.xx      209
113.123.0.xx      200
52.167.144.xx     195
54.36.150.xx      192
20.171.206.xx     179
117.45.252.xx     175
95.163.255.xx     160
185.220.101.xx    159
195.154.123.xx    146
159.138.152.xx    142
13.66.139.xx      141
52.233.106.xx     136
159.138.128.xx    133
159.138.156.xx    132
159.138.157.xx    132
64.124.8.xx       130
159.138.154.xx    129
159.138.155.xx    129
159.138.153.xx    128

Table 2: Top 'high-risk' (third-level) netblocks, by the numbers of blacklisted IP addresses (out of a possible maximum of 256)
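A ranking of this kind could be produced along the following lines. This is a sketch only: the function name and the `min_count` parameter are illustrative, and the sample blacklist is synthetic, constructed so that the block totals mirror two rows of Table 2 (the individual addresses are not taken from the real list).

```python
from collections import Counter

def high_risk_third_level(blacklist, min_count=128):
    """Rank third-level netblocks containing at least min_count blacklisted addresses."""
    # The third-level netblock is everything before the final dot.
    counts = Counter(ip.rsplit(".", 1)[0] for ip in set(blacklist))
    return [(f"{prefix}.xx", n) for prefix, n in counts.most_common() if n >= min_count]

# Synthetic sample: one fully blacklisted block, one half-blacklisted block,
# and a small block that falls below the threshold.
sample = (
    [f"54.36.148.{i}" for i in range(256)]
    + [f"159.138.153.{i}" for i in range(128)]
    + [f"192.0.2.{i}" for i in range(10)]
)
ranked = high_risk_third_level(sample)
print(ranked)  # [('54.36.148.xx', 256), ('159.138.153.xx', 128)]
```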

From this data, we can start to see the possible basis of a threat scoring algorithm for arbitrary websites. A website hosted on an IP address which is actually blacklisted is highly likely to be of concern; however, one hosted in one of the netblocks featured in Table 2 (for example) will still warrant careful analysis (i.e. being assigned a 'secondary' level of concern), even if it is hosted on one of the specific IP addresses within the block which is not explicitly blacklisted.

The next stage of analysis is to consider the hosting provider and geographical country of location associated with each of the blacklisted addresses, in order to determine which providers and countries appear most commonly in the dataset and might therefore be deemed 'highest risk'. This information is generally readily available via an IP address 'whois' look-up in each case. 

From this dataset, some patterns are immediately apparent. For example, the set of 'high-risk' addresses between 35.159.xx.xx and 35.243.xx.xx are all associated with Amazon Technologies Inc. and Google LLC as hosting providers, and the 54.144.xx.xx to 54.246.xx.xx set is also under the management of Amazon Technologies Inc.

As a simple way of post-processing the data (to extract a 'clean' version of each hosting provider's name, and to group together, at a high level, IP addresses pertaining to what is actually the same provider), the provider name returned by the whois look-up in each case is truncated at the first comma, so that, for example, 'GoDaddy.com' and 'GoDaddy.com, LLC' are treated as the same entity. This yields a set of 8,757 distinct entities.
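The truncation step is straightforward to express in code; the following is a minimal sketch (the function name `clean_provider` is illustrative):

```python
def clean_provider(name: str) -> str:
    """Truncate a whois provider name at the first comma and trim whitespace."""
    return name.split(",", 1)[0].strip()

names = ["GoDaddy.com", "GoDaddy.com, LLC", "Amazon Technologies Inc.", "Amazon.com"]
cleaned = {clean_provider(n) for n in names}
print(sorted(cleaned))  # ['Amazon Technologies Inc.', 'Amazon.com', 'GoDaddy.com']
```

Note that this simple rule does not merge variants which differ in other ways; 'Amazon Technologies Inc.' and 'Amazon.com', for instance, remain distinct entities (as can be seen in Table 3).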

It is worth pointing out that the whois look-ups required specifically to identify the hosting providers of the IP addresses in question (noting that the original IP address blacklist dataset itself also gives country information) failed in 51,696 cases, which may cause the statistics to be 'skewed' somewhat, if the failures are disproportionately associated with particular providers or geographic regions.

From the available data, Tables 3 and 4 show the top (i.e. 'highest risk') hosting providers and countries most commonly associated with the IP addresses in the blacklist.

Hosting provider                          No. blacklisted IP addresses
Amazon Technologies Inc.                  14,030
CHINANET jiangsu province network         7,285
Cloudflare                                3,317
Microsoft Corporation                     2,817
Amazon.com                                2,796
Huawei-Cloud-SG                           2,526
DigitalOcean                              2,329
HostPapa                                  2,157
Alibaba Cloud LLC                         1,971
CHINANET SHANDONG PROVINCE NETWORK        1,869
CHINANET Jiangxi province network         1,619
Google LLC                                1,584
Huawei HongKong Clouds                    1,538
CHINANET Anhui province network           1,382
CHINANET Guangdong province network       1,222
PSINet                                    1,206
PT TELKOM INDONESIA                       1,070
CHINANET-ZJ Zhongxin node network         873
CHINANET henan province network           821
Apple Inc.                                756

Table 3: Top (i.e. 'highest risk') hosting providers represented in the IP address blacklist (where data available)

Host country          No. blacklisted IP addresses
US (USA)              53,373
CN (China)            27,189
RU (Russia)           9,669
SG (Singapore)        6,099
DE (Germany)          4,734
ID (Indonesia)        4,264
BR (Brazil)           4,125
GB (UK)               3,607
IN (India)            3,557
FR (France)           2,450
VN (Vietnam)          2,140
UA (Ukraine)          2,065
PL (Poland)           2,061
BD (Bangladesh)       2,012
CA (Canada)           1,989
TH (Thailand)         1,886
NL (Netherlands)      1,604
RO (Romania)          1,558
RS (Serbia)           1,468
ZA (South Africa)     1,407

Table 4: Top (i.e. 'highest risk') host countries represented in the IP address blacklist (where data available)

In order to take a more granular view, it is possible to convert each IP address to a city-level location (and an associated latitude / longitude reference) through a process called 'geolocation', for which a number of standard tools are available[17]. From this analysis, we can also extract the top 'high risk' city locations for hosting blacklisted content (Table 5). 

Host city                 No. blacklisted IP addresses
Shanghai, CN              19,981
Columbus, US              7,061
Ashburn, US               6,088
Singapore, SG             4,270
San Francisco, US         3,553
Moscow, RU                3,287
Hong Kong, HK             3,077
Los Angeles, US           3,063
Jiaxing, CN               2,714
Frankfurt am Main, DE     2,497
San Jose, US              2,478
Seattle, US               2,127
New York City, US         1,822
Amsterdam, NL             1,687
Buffalo, US               1,674
London, GB                1,669
Jakarta, ID               1,571
Paris, FR                 1,431
Tokyo, JP                 1,318
Dallas, US                1,297

Table 5: Top (i.e. 'highest risk') city host locations represented in the IP address blacklist (where data available)

Following on from the above, it is also possible to construct a 'heat map' to visualise the host locations of the blacklisted IP addresses (essentially, aggregating together the geolocation information into grid squares, and shading them according to the number of blacklisted IP addresses within each square). This visualisation is shown in Figures 6 and 7 (where each grid square covers 1° of latitude / longitude).

Figure 6: Global heat map showing the host locations of the blacklisted IP addresses (shading denotes the number of addresses hosted within each grid square)

Figure 7: Detailed views of Figure 6 - top to bottom: Americas; Europe and Middle East; Asia
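The aggregation into 1° grid squares underlying a heat map of this type might be sketched as follows. The choice of rounding down to the south-west corner of each square is an assumption (the exact binning used for the figures is not specified), and the coordinates are approximate, illustrative values.

```python
import math
from collections import Counter

def grid_square(lat: float, lon: float) -> tuple:
    """Return the south-west corner of the 1-degree square containing the point."""
    return (math.floor(lat), math.floor(lon))

def heat_map_counts(points):
    """Count geolocated addresses per 1-degree grid square."""
    return Counter(grid_square(lat, lon) for lat, lon in points)

# Approximate, illustrative coordinates: two points near Shanghai, one near Columbus.
points = [(31.23, 121.47), (31.90, 121.10), (39.96, -82.99)]
squares = heat_map_counts(points)
print(squares[(31, 121)])  # 2
print(squares[(39, -83)])  # 1
```

Using `math.floor` rather than `int` ensures that negative longitudes and latitudes are binned consistently with positive ones.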

Whilst the numbers presented in this study are meaningful in their own right (in terms of reflecting where (and with whom) the blacklisted IP addresses are hosted - i.e. the 'dark spots' on the heat map in Figures 6 and 7), they do reflect both the locations of the infringements and the locations where content is most commonly hosted generally. For example, if a particular hosting provider is generally very commonly used, it might not be unreasonable to expect that provider also to be associated with high volumes of infringements (even if the extent of abuse is not disproportionate). For a future piece of analysis, it may be instructive to compare the extent to which locations and hosting providers are associated with high levels of threat (i.e. numbers of blacklisted IP addresses) with the overall numbers of IP addresses associated with those same locations and hosting providers (e.g. the total numbers of IP addresses under their management), so as to get a more meaningful measure of rate of association with infringing activity (i.e. a 'reputation' score).
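The idea of normalisation can be illustrated using figures already quoted in this study. Total address counts per hosting provider are not available here, so the sketch below instead normalises the first-level netblock 'stripes' against the blacklist-wide average; the function names are illustrative, and the blacklist total is the approximate figure of 169,000.

```python
IPV4_TOTAL = 2**32
BLACKLIST_TOTAL = 169_000  # approximate size of the blacklist, as quoted in this study

def abuse_rate(blacklisted: int, total_addresses: int) -> float:
    """Proportion of a group of addresses which is blacklisted."""
    if total_addresses <= 0:
        raise ValueError("total_addresses must be positive")
    return blacklisted / total_addresses

def relative_rate(blacklisted: int, total_addresses: int) -> float:
    """Abuse rate of a group, as a multiple of the rate across all of IPv4 space."""
    baseline = abuse_rate(BLACKLIST_TOTAL, IPV4_TOTAL)
    return abuse_rate(blacklisted, total_addresses) / baseline

first_level_size = 256**3  # addresses in one first-level netblock
stripes = [("45.xx.xx.xx", 3_279), ("103.xx.xx.xx", 4_210), ("185.xx.xx.xx", 4_867)]
for block, count in stripes:
    print(block, round(relative_rate(count, first_level_size), 1))
```

On these figures, the three netblocks show rates of roughly five to seven times the blacklist-wide average. The same calculation could be applied to a hosting provider, given a reliable figure for the total number of addresses under its management.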

Discussion: Steps towards a threat-scoring framework

The main application of this type of analysis is the determination of factors which are most commonly associated with infringing websites. Once these databases are in place, they can be used as inputs into overall algorithms to quantify the likely level of threat which may be posed by an arbitrary (perhaps newly-identified) website, even in cases where no live website content is yet present (in the cases of characteristics such as registrars and hosting providers and locations, which are inherent to the technical infrastructure of the domain name in question).

Looking at the case of hosting IP address as an example, it may also be appropriate to assign IP addresses, and IP address ranges, into threat 'tiers' (with associated threat-score components) based on the 'closeness' of their association with known infringing content. A host IP address which is actually blacklisted is likely to be associated with the highest level of potential threat, followed by a non-blacklisted IP address within a netblock which itself contains high numbers of blacklisted addresses. Lower tiers of threat may be appropriate for IP addresses in higher-level netblocks which are generally found to be associated with higher-than-average rates of abuse (such as those covered by the vertical 'stripes' in Figures 4 and 5). 
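One possible way of expressing such tiers is sketched below. The tier numbering, the threshold of 128 blacklisted addresses per third-level netblock, and the use of the three first-level 'stripes' as the wider elevated-risk set are all illustrative assumptions, not a definitive scheme; the sample blacklist is synthetic.

```python
from collections import Counter

def third_level_prefix(ip: str) -> str:
    return ip.rsplit(".", 1)[0]

def threat_tier(ip, blacklist, block_counts, elevated_first_level, min_block_count=128):
    """Return 1 (highest concern) to 4 (no IP-based indication of concern)."""
    if ip in blacklist:
        return 1  # explicitly blacklisted address
    if block_counts.get(third_level_prefix(ip), 0) >= min_block_count:
        return 2  # non-blacklisted address in a heavily blacklisted third-level netblock
    if ip.split(".")[0] in elevated_first_level:
        return 3  # address in a wider block with higher-than-average rates of abuse
    return 4

# Synthetic blacklist: half of one third-level netblock, plus one isolated address.
blacklist = {f"159.138.153.{i}" for i in range(128)} | {"45.10.20.30"}
block_counts = Counter(third_level_prefix(ip) for ip in blacklist)
elevated = {"45", "103", "185"}  # first-level 'stripes' highlighted in Figure 5

for ip in ["159.138.153.5", "159.138.153.200", "45.99.1.1", "192.0.2.1"]:
    print(ip, threat_tier(ip, blacklist, block_counts, elevated))
```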

A fuller formulation of a threat-scoring framework along these lines may also be a topic for future research, but it is instructive to test an initial prototype version based on the characteristics (high-risk IP addresses, hosting providers and registrars[18]) discussed in this study. For this analysis, we consider a sample set of arbitrary domain names registered on a particular day, based on zone-file analysis[19].

For this dataset of around 11,000 domain names, whois look-ups were run to determine the host IP address, the associated hosting provider and the registrar in each case. For each of these three characteristics, a threat-score component (nominally between 0 and 100) was calculated for each domain in question, based on comparison with the datasets outlined in this study, which reflect how frequently each characteristic is associated with infringing content. Details of the methodology are given in Appendix A.

These components were then aggregated together to yield an overall potential threat score for the domain; the simplest implementation of the threat score is that given by simply adding the three components together. In this case, this yields 398 jointly top-scored domains, all with a score of 171, all of which are hosted on an IP address which is explicitly blacklisted (score component = 100), with the dominant remaining component of the score being a contribution of 70, caused by the fact that the sites in question are hosted with Amazon Technologies Inc., which appears extensively in the IP address blacklist. However, the score for this provider is probably artificially high, as an artefact of the fact that Amazon is a very popular hosting provider generally, which highlights the requirement for some kind of normalisation according to the total number of websites / IP addresses under management.
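The aggregation step can be sketched as a simple (optionally weighted) sum. The `registrar_weight` parameter is a hypothetical addition for illustration, with the default of 1.0 reproducing the plain sum; the registrar component of 0 in the first example and the weight of 3 in the second are illustrative values only.

```python
def overall_threat_score(ip_score: float, provider_score: float,
                         registrar_score: float, registrar_weight: float = 1.0) -> float:
    """Combine the three threat-score components into a single overall score."""
    return ip_score + provider_score + registrar_weight * registrar_score

# Blacklisted host IP address (100) with Amazon Technologies Inc. as provider (70.15).
plain = overall_threat_score(100, 70.15, 0)
# Hypothetical up-weighting of a registrar component of 76 by a factor of 3.
weighted = overall_threat_score(0, 0, 76, registrar_weight=3)
print(round(plain, 2), weighted)
```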

The use of a high-threat registrar (according to the Spamhaus list) is probably a better indication of potential infringing activity than either of the other two domain characteristics being considered, so it may be appropriate to increase (by some factor) the weight of the contribution of the registrar score to the overall threat score. In so doing, we gain (apparently) a much more meaningful assessment of the level of potential threat posed by the domains, as verified in many cases by an inspection of site content (where present), or a simple analysis of the types of keywords present in the domain names (suggesting that, even where no live site is yet present, several of the most highly-scored names are likely to have been registered for use in conjunction with the types of content which are frequently of concern, making them worthy of future monitoring). The most highly-scored domain registrations by this weighted threat score are shown in Table 6 (noting that some of these may, of course, actually be legitimate).

* 'NiceNIC' = NiceNIC International Group Co., Limited

Table 6: Top-ranked domains in the dataset by potential threat score

Indeed, of the top twenty domains with the greatest potential threat scores (shown in Table 6), several feature characteristics of particular concern:

  • Two (eflowtollsystem[.]com and kraken2trfqodidvlh4aa337cpzfrhdlfldhve5nf7ujhnmwr7instad[.]com*) generate browser warning pages advising of 'dangerous' content
  • Some are blocked from viewing in certain geographic locations
  • Some resolve or re-direct to apparently innocuous content, but which may also be a means of 'masking' infringing content, which might only be visible at certain times or from certain locations (i.e. 'geoblocking')[20]
  • Some pertain to content which is commonly associated with scams or other types of abuse, such as blockchain technology or cryptocurrency (e.g. claim-pinlink[.]com - re-directs to claims-realios[.]net/main*, proposai-soniclabs[.]com*, resasfinance[.]com*)
  • Others are soliciting for the input of personal details and may be impersonating trusted brands (e.g. 1298245[.]com*)
  • Of the domains which do not resolve to live content (or where the content is not visible as of the date of analysis), several have domain names which are highly suggestive of suspicious or fraudulent use (e.g. unlock-e-trade[.]com, netbotrade[.]com, contactlloydsonline[.]com, secure-coinb[.]com)

Those examples marked with an asterisk are shown in Figure 8.

Figure 8: Examples of live site content of potential concern hosted on domains listed in Table 6.

In cases where this type of threat-scoring approach is applied to sets of domain registrations pertaining to a specific brand (or other issue of interest), the ranking is likely to offer an efficient way of determining which of the names in the dataset are most worthy of initial prioritised analysis or enforcement.

A final point to note is that insights regarding the geographical focuses of infringing activity, as presented in this study, can also help inform wider policies on intellectual property protection, such as identifying key territories in which additional trade mark protection would be advisable.

Appendix A: Methodology for calculating the prototype threat score components

i. Score component based on host IP address / third-level netblock

If the host IP address is explicitly one of the blacklisted addresses, it is automatically assigned a score of 100. If this is not the case, but if the IP address appears in the same third-level netblock as at least one blacklisted address, the score component is calculated as the ratio between the number of blacklisted addresses within the netblock, and 256 (i.e. the total number of possible addresses in the block), multiplied by 100. 

For example, if a domain was found to be hosted in a non-blacklisted IP address in the 159.138.153.xx netblock (which contains 128 blacklisted addresses in total), the threat score component is calculated as (128 / 256) × 100 = 50. 
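The calculation described above can be sketched as follows. The function names are illustrative, and the sample blacklist is synthetic: it places 128 blacklisted addresses in the 159.138.153.xx netblock to match the total given in the worked example, without reproducing the actual listed addresses.

```python
from collections import Counter

def third_level_prefix(ip: str) -> str:
    return ip.rsplit(".", 1)[0]

def ip_score_component(ip: str, blacklist: set, block_counts: Counter) -> float:
    """Score of 100 for a blacklisted address; otherwise the blacklisted share
    (as a percentage) of the 256 addresses in its third-level netblock."""
    if ip in blacklist:
        return 100.0
    return 100 * block_counts.get(third_level_prefix(ip), 0) / 256

# Synthetic blacklist: 128 addresses in the 159.138.153.xx netblock.
blacklist = {f"159.138.153.{i}" for i in range(128)}
block_counts = Counter(third_level_prefix(ip) for ip in blacklist)

print(ip_score_component("159.138.153.5", blacklist, block_counts))    # 100.0
print(ip_score_component("159.138.153.200", blacklist, block_counts))  # 50.0
print(ip_score_component("192.0.2.1", blacklist, block_counts))        # 0.0
```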

ii. Score component based on hosting provider

The score component assigned to each hosting provider is based on the frequency of association of each provider with blacklisted IP addresses contained within the dataset utilised in this study. The individual providers thereby fall into a range between 0 and 14,030 blacklisted addresses (Amazon Technologies Inc.). The score component assigned to a website associated with any given hosting provider is calculated as the ratio between the number of blacklisted addresses and (arbitrarily) 20,000, multiplied by 100 (giving a final value between 0 and 70.15). 

Note that the score as defined is therefore unnormalised relative to the total number of IP addresses under management for that hosting provider.

N.B. It is generally necessary to apply an element of data 'cleansing' before carrying out the matching of hosting provider names (and also to aggregate together entries in the blacklist, as necessary), as the same provider may be referenced differently by distinct whois look-ups - e.g. GoDaddy may variably be referenced as 'GoDaddy', 'GoDaddy.com', 'GoDaddy.com, LLC', etc.
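A minimal sketch of this component, using three of the provider totals from Table 3, is given below. The dictionary and function names are illustrative, and provider names are assumed to have been cleansed already; providers absent from the blacklist are given a score of zero.

```python
# Blacklisted-address totals for a few providers, taken from Table 3.
PROVIDER_COUNTS = {
    "Amazon Technologies Inc.": 14_030,
    "Cloudflare": 3_317,
    "Google LLC": 1_584,
}

def provider_score_component(provider: str, counts=PROVIDER_COUNTS, scale=20_000) -> float:
    """Blacklisted-address count for the provider, as a percentage of the
    (arbitrary) scale value of 20,000."""
    return 100 * counts.get(provider, 0) / scale

print(provider_score_component("Amazon Technologies Inc."))  # 70.15
print(provider_score_component("Google LLC"))                # 7.92
print(provider_score_component("Unlisted Provider"))         # 0.0
```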

iii. Score component based on registrar

The score component associated with the domain registrar is simply based on the dataset provided by Spamhaus, as referenced in the Introduction section of this study (which itself already incorporates an element of 'normalisation', based on the total numbers of domains under management).

The registrars in the Spamhaus database are assigned scores which sit in a range from 0 to 7.6. Wherever a registrar for a domain in the analysed dataset appears in the Spamhaus list, the associated threat-score component is calculated just as the Spamhaus score multiplied by ten (to give a score in the range from 0 to 76, for the dataset provided as of the date of analysis). 

N.B. (1) As for the hosting providers, it is generally necessary to apply an element of data 'cleansing' before matching the registrar given by a whois look-up against the contents of the Spamhaus list (rather than simply carrying out a straight look-up), since the same registrar may be referenced differently across the lists (e.g. 'CSL Computer Service Langenbach GmbH d/b/a joker.com' is referenced by Spamhaus as 'Joker (CSL Computer Service)').

N.B. (2) In cases where the same registrar appears more than once in the Spamhaus list with a variant name, but with different scores (e.g. 'Turkticaret.net Yazilim Hizmetleri Sanayi ve Ticaret A.S.' (0.0355) and 'Turkticaret.net Yazılım Hizmetleri Sanayi ve Ticaret A.Ş.' (0.0595)), the score used in this analysis is taken simply as the mean of the relevant Spamhaus scores (i.e. 0.0475 in the above case).  
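The registrar component, including the averaging of variant entries, might be sketched as follows. The sketch assumes that the name cleansing has already mapped each Spamhaus entry to a canonical registrar name; the shortened key 'Turkticaret.net' is a hypothetical canonical form used for illustration.

```python
from collections import defaultdict
from statistics import mean

def registrar_score_components(entries):
    """Map each canonical registrar name to ten times the mean of its Spamhaus scores.

    entries: iterable of (canonical_registrar_name, spamhaus_score) pairs.
    """
    grouped = defaultdict(list)
    for name, score in entries:
        grouped[name].append(score)
    return {name: 10 * mean(scores) for name, scores in grouped.items()}

# The two variant entries quoted above, mapped to one hypothetical canonical name.
entries = [("Turkticaret.net", 0.0355), ("Turkticaret.net", 0.0595)]
components = registrar_score_components(entries)
print(round(components["Turkticaret.net"], 3))  # 0.475
```

A registrar which does not appear in the Spamhaus list would simply be absent from the resulting dictionary, and could be assigned a component of zero at look-up time.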

References

[1] https://www.linkedin.com/pulse/brand-protection-data-beautiful-david-barnett-c66be/

[2] https://www.linkedin.com/pulse/brand-protection-data-still-beautiful-part-1-year-domains-barnett-juwhe/

[3] https://www.linkedin.com/pulse/brand-monitoring-data-niblet-5-law-firm-scam-websites-david-barnett-ap5de/

[4] 'Patterns in Brand Monitoring' (D.N. Barnett, Business Expert Press, 2025), Chapter 3: 'Brand content scoring'

[5] 'Patterns in Brand Monitoring' (D.N. Barnett, Business Expert Press, 2025), Chapter 5: 'Prioritization criteria for specific types of content'

[6] 'Patterns in Brand Monitoring' (D.N. Barnett, Business Expert Press, 2025), Chapter 6: 'Result clustering'

[7] https://brandsec.com.au/phishing-malicious-domain-names/

[8] https://www.spamhaus.org/reputation-statistics/registrars/domains/

[9] https://bfore.ai/navigating-domain-takedowns-with-non-cooperative-registrars/

[10] e.g. https://www.blackhatworld.com/seo/question-looking-for-bulletproof-domain-registrar.1412558/

[11] https://www.spamhaus.org/resource-hub/bulletproof-hosting/bulletproof-hosting-theres-a-new-kid-in-town/

[12] This study uses the terminology of 'notorious' IP addresses, in reference to the USTR 'Notorious Markets List', which is published annually to reflect those high-risk platforms most commonly associated with facilitating counterfeiting and piracy - see https://ustr.gov/about-us/policy-offices/press-office/press-releases/2025/january/ustr-releases-2024-review-notorious-markets-counterfeiting-and-piracy

[13] https://myip.ms/browse/blacklist/Blacklist_IP_Blacklist_IP_Addresses_Live_Database_Real-time

[14] Note that the analysis focuses only on the 'old format' (IPv4) IP addresses (of the form xx.xx.xx.xx, where each 'xx' is a number between 0 and 255) in the blacklist; this type of analysis is likely to become much more complex as IP address usage transitions to the IPv6 format (yyyy:yyyy:yyyy:yyyy:yyyy:yyyy:yyyy:yyyy, where each 'yyyy' is a four-digit hexadecimal (base-16) number) in the future.

[15] https://www.cronj.com/blog/localhost-127001-a-special-address/

[16] https://en.wikipedia.org/wiki/List_of_assigned_/8_IPv4_address_blocks

[17] In this study, we utilise the Python-based library tool 'IPinfo' (https://pypi.org/project/ipinfo/), which references the dataset available from IPinfo.io. In order to limit the number of geolocation look-ups required in this study, we perform a query only for one IP address in any range where (a) the second-level netblock, (b) the hosting provider name, and (c) the hosting provider country are all the same (i.e. for point (a), where the first- and second-level IP address components are the same). The latitude and longitude of the physical location of all other IP addresses in the range sharing these characteristics are then assumed to be identical.

[18] This approach allows us to incorporate more information than would be available by (say) just considering the host IP address as a means of identifying the associated hosting provider - this is appropriate given (for example) the fact that, just because an IP address under the management of a particular provider may be blacklisted, it does not necessarily follow that all of that provider's addresses will be higher risk.

[19] The dataset is taken from one day's worth of registrations (117,456) of .com domain names - a TLD for which registration information is generally readily available - as provided by the zonefiles.io website on 01-Feb-2025, relating to the previous day's registrations. The sample analysed in this study consists of every tenth domain name (when sorted into alphabetical order), yielding a dataset of 11,745 domains. Analysis of site content was carried out on 03-Feb-2025.

[20] https://circleid.com/posts/20220531-do-you-see-what-i-see-geotargeting-in-brand-infringements

This article was first published as a white paper on 19 June 2024 at:

https://www.iamstobbs.com/uploads/general/Notorious-IP-addresses-e-book.pdf
