Examining the Impact of Website Take-down on Phishing
Tyler Moore and Richard Clayton
Computer Laboratory, University of Cambridge
15 JJ Thomson Avenue, Cambridge CB3 0FD, United Kingdom
{tyler.moore},{richard.clayton}@cl.cam.ac.uk
ABSTRACT
Banks and other organisations deal with fraudulent phishing
websites by pressing hosting service providers to remove the
sites from the Internet. Until they are removed, the fraudsters
learn the passwords, personal identiﬁcation numbers
(PINs) and other personal details of the users who are fooled
into visiting them. We analyse empirical data on phishing
website removal times and the number of visitors that the
websites attract, and conclude that website removal is part
of the answer to phishing, but it is not fast enough to completely
mitigate the problem. The removal times have a good
ﬁt to a lognormal distribution, but within the general pattern
there is ample evidence that some service providers are
faster than others at removing sites, and that some brands
can get fraudulent sites removed more quickly. We particularly
examine a major subset of phishing websites (operated
by the ‘rock-phish’ gang) which accounts for around half
of all phishing activity and whose architectural innovations
have extended their average lifetime. Finally, we provide
a ballpark estimate of the total loss being suﬀered by the
banking sector from the phishing websites we observed.
Categories and Subject Descriptors
K.4.4 [Computing Milieux]: Computers and Society—
Electronic Commerce, Security
Keywords
phishing, security economics, electronic crime
1. INTRODUCTION
Phishing is the process of enticing people into visiting
fraudulent websites and persuading them to enter identity
information such as usernames, passwords, addresses, social
security numbers, personal identiﬁcation numbers (PINs)
and anything else that can be made to appear to be plausible.
This data is then used to impersonate the victim to
empty their bank account, run fraudulent auctions, launder
money, apply for credit cards, take out loans in their name,
and so on. Although most current phishing attacks target
the banks, phishing websites regularly appear for businesses
as diverse as online auctions (eBay), payments (PayPal),
share dealers (E*Trade), social-networking (MySpace), gambling
(PartyPoker) and online retailers (Amazon).
Copyright is held by the author/owner. Permission to make digital or hard
copies of all or part of this work for personal or classroom use is granted
without fee.
APWG eCrime Researchers Summit, Oct. 4–5, 2007, Pittsburgh, PA, USA.
The academic work on phishing has been diverse, with a
useful starting point being Jakobsson and Myers’ book [6].
Researchers have tried to understand the psychology of the
process [4], how to block the email containing the initial
enticement [9], how server operators might automatically
detect fraudulent sites [22], and whether there are patterns
to their occurrence [16]. There have been many proposals
for browser mechanisms to detect phishing websites [14, 24],
and schemes to prevent users from disclosing their secrets
to them [17]. Others have looked at disseminating information
about the trustworthiness of websites through central
repositories (blacklists) or social networks [2], although it
seems that users generally ignore any cues that tell them
that websites are likely to be malicious [18, 23].
In this paper we consider phishing from a completely different
angle. The banks (and other organisations being impersonated)
are dealing with the fake websites through ‘takedown’
procedures, so that there is nothing there for a misled
visitor to see. Our aim has been to determine how eﬀective
this strategy is, and whether it is likely to be suﬃcient on
its own to prevent phishing from being proﬁtable.
We monitored the availability of several thousand phishing
websites in Spring 2007. Our results show that a typical
phishing website can be visited for an average of 62 hours,
but this average is skewed by very long-lived sites – we ﬁnd
that the distribution is lognormal – and the median lifetime
is just 20 hours. We were able to examine web log summaries
at a number of sites, along with some detailed records of visitors
that a handful of phishers inadvertently disclosed. This
allowed us to estimate the number of visitors who divulged
their data on a typical site to be 18 if it remained up for one
day, and growing by 8 more per day thereafter.
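As a rough illustration, the two figures quoted above can be read as a simple linear model. The Python sketch below is only that reading; the function name and the handling of fractional days are assumptions made for illustration, not part of the measurements.

```python
def expected_victims(days_online: float) -> float:
    """Approximate number of visitors who divulge data on a typical site.

    Uses the quoted figures: 18 victims for the first day, then 8 more
    per additional day. Treating fractional days linearly is an assumption.
    """
    if days_online < 1:
        raise ValueError("the quoted figures only cover sites up for at least one day")
    return 18 + 8 * (days_online - 1)
```

Under this reading, a site that stays up for three days would be expected to collect 18 + 8 + 8 = 34 sets of details.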
We identiﬁed a signiﬁcant subset of websites (about half of
all URLs being reported) which were clearly being operated
by a single ‘rock-phish’ gang. These sites attacked multiple
banks and used pools of IP addresses and domain names.
We found that these sites remained available for an average
of 95 hours (again with a lognormal distribution, but with
a median of 55 hours). A newer architectural innovation
dubbed ‘fast-ﬂux’ that uses hundreds of diﬀerent compromised
machines per week extended the website availability
to an average of 196 hours. Within the overall ﬁgures, we
show that some brands are considerably faster than others in
getting spoof websites removed, and that hosting providers
exhibit a wide disparity in their response times.
We see ‘take-down’ as a reactive strategy, an increasingly
prevalent trend in the way that security issues are being
handled. Software vendors wait for vulnerabilities to be discovered
and then issue patches. Anti-virus tools update their
databases with new signatures as new viruses are identiﬁed.
In these reactive approaches, the defenders aim to identify
the bad guys as quickly as possible to minimise exposure,
while the bad guys scramble to open new security holes at
a suﬃciently fast rate to continue their activities.
In this case our figures demonstrate that a reactive strategy
does reduce the damage done by phishing websites. However,
take-down is clearly not happening fast enough to prevent
losses, and so it cannot be the only response. In
particular, we use the lifetime and visitor numbers above to
show that, on fairly conservative extrapolations, the banks’
losses that can be directly attributed to ordinary phishing
websites are some $160m per annum, with a similar amount
being raked in by the rock-phish gang.
The rest of the paper is arranged as follows. We ﬁrst set
out a model of the mechanics of a phishing attack in Section
2, presenting the arms race resulting from the tactics
available to both attacker and defender. In Section 3.1 we set
out our methodology for gathering data about phishing websites
to compute take-down times, and in Section 3.2 explain
how we estimate the time distribution of phishing responses.
In Section 4 we describe a particularly pernicious category
of phishing site called ‘rock-phish’, which simultaneously impersonates
many banks and regularly cycles through domain
names and IP addresses. In Section 5 we analyse our results
and ﬁnd that by the time phishing sites are removed, damage
has already been done: many responses have been received
and the attackers are moving on to new sites. Finally, in Section
6, we discuss what our results mean in terms of practical
strategies for the banks (and the phishing attackers).
2. THE MECHANICS OF PHISHING
To carry out phishing scams, attackers transmit large
numbers of spam emails which include links (URLs) to websites
under their control. The spam emails must resemble
legitimate email, so that unsuspecting users will consider
them genuine. The spam must also contain an appropriate
message so that users will act upon it, be it an impending
account suspension, a payment for a marketing survey, or a
report of a transaction that the user will know to be fake
and must therefore be cancelled [4]. The email must also be
able to evade the user’s spam ﬁlters. Looking like genuine
email clearly helps, but the ﬁlters may also have access to a
blacklist of URLs that are currently being promoted, so that
there is value in varying the URL to prevent matches.
The user connects to a spoof website by clicking on a link in
the email. Their web browser may access the website directly
or be redirected from an initial site (perhaps via legitimate
redirector systems at, for example, Google; see footnote 1) to the actual
phishing pages. At this stage browsers may apply their own
heuristics and consult their own blacklists to determine if the
site should be blocked as clearly illegitimate. Provided the
browser does not interfere, the user will then be presented
with an accurate imitation of the legitimate company’s pages
(often including all the links to warnings about fraud), and
thus reassured will fill in their personal details. Although a
handful of sites validate these details immediately, it is more
common for any response at all to be accepted.
Footnote 1: In February 2007 Google started to detect usage of their
redirectors and provide a warning message [3], so it is likely
that other redirectors will now be used in preference.
The compromised details are usually emailed to a webmail
address, but are sometimes stored in plain text ﬁles at the
spoof website, awaiting direct collection by the fraudster.
Once they have received the compromised details they will
discard the obviously fake and then sell on the details to
cashiers who will empty the bank accounts [19], perhaps
transferring the money via a mule who has been recruited via
further spam email seeking ‘ﬁnancial consultants’ to accept
and relay payments for a commission.
The spoof website is sometimes hosted on ‘free’ webspace,
where anyone can register and upload pages, but is more usually
placed on a compromised machine; perhaps a residential
machine, but often a server in a data centre. The hijacked
machine will have come under the attacker’s control either
through a security vulnerability (typically unpatched applications
within a semi-abandoned ‘blog’ or message-board),
or because the user is running some malware, delivered by
email or downloaded during a visit to a malicious website.
If the website is on ‘free’ webspace a typical URL would be
http://www.bankname.freehostsite.com/login where the
bankname is chosen to match or closely resemble the domain
name of the ﬁnancial institution being attacked. Changing
the hostname is not always possible for compromised
machines, and attackers may have restricted permissions,
so they will add their own web pages within an existing
structure, leading to URLs of the typical form
http://www.example.com/~user/www.bankname.com/ where, once again,
the bankname is present to lend specious legitimacy should
the user check which site they are visiting, yet fail to appreciate
the way in which URLs are really structured.
To avoid the use of example.com, the URL may use just
the IP address of the compromised machine, perhaps encoded
into hexadecimal to obscure its nature. However, to
further allay suspicion, the fraudsters will sometimes go to
the eﬀort of registering their own domain name, which they
will then point at either free webspace, which can often be
conﬁgured to allow this to work, or to a compromised machine
where they have suﬃcient control of the web server
conﬁguration. The domain names are usually chosen to be
a variation on bankname.com such as bankname-usa.com, or
they will use the bank’s name as a subdomain of some plausible,
but superficially innocuous domain, such as
bankname.xtrasecuresite.com. A half-way house to an actual domain
name is the use of systems that provide domain names
for dynamic IP address users, which results in the usage of
domains such as bankname.dyndns.org.
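For anyone analysing reported URLs, the hexadecimal IP encoding mentioned above is easy to reverse. The following Python sketch uses the standard `ipaddress` module; the function names are hypothetical and the address is taken from a documentation-only range.

```python
import ipaddress


def hex_encoded_url(ip: str, path: str = "/") -> str:
    """Build the obscured form of a URL, with the IPv4 address as one hex number."""
    value = int(ipaddress.IPv4Address(ip))
    return f"http://0x{value:08x}{path}"


def decode_hex_host(host: str) -> str:
    """Turn a host such as '0xc0000201' back into dotted-quad notation."""
    return str(ipaddress.IPv4Address(int(host, 16)))
```

For example, `decode_hex_host("0xc0000201")` gives `192.0.2.1`, so reports that use either form can be linked to the same machine.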
Defence against phishing attacks is primarily carried out
by the impersonated targets (banks etc.) themselves, with
signiﬁcant assistance from a number of technically-savvy volunteers,
who often work at Internet Service Providers (ISPs).
Suspicious emails will be reported by some of the users who
received them, either to the targeted institution, or to one
of several collators – entities that keep a record of reported
phishing sites. Newer web browsers, such as Microsoft’s Internet
Explorer 7 and Mozilla’s Firefox 2, contain single click
reporting systems [8, 10] to make user reporting as simple
as possible. In addition, spam filtering systems are increasingly
picking out phishing emails by generic characteristics,
and automatically generating reports where the link they
contain was not previously known.
The recipients of the reports will examine the site being
linked to, in order to determine if it is illegitimate. Once a
reported phish has been vetted, the URL will be added to
the blacklists to block further email spam and to assist anti-phishing
browser toolbars and other mechanisms in assessing
the site’s (in)validity. Meanwhile, the defenders will send a
take-down request to the operator of the free webspace, or in
the case of a compromised machine, to the relevant ISP who
will temporarily remove it from the Internet or otherwise
ensure that the oﬀending web pages are disabled. Where a
domain name has been registered by a phishing attacker, the
defenders will ask the domain name registrar to suspend the
oﬀending domain. However, not all ISPs and registrars are
equally co-operative and knowing that a phishing site exists
does not automatically cause its removal. Some ISPs take
down phishing sites immediately, while others do not act
especially promptly. Responsiveness often varies by company
and by country, as well as with the competence (and
language skills) of the organisation requesting the removal.
3. DATA COLLECTION
The average duration for which phishing sites are accessible
is an important measure of the state of phishing attack
and defence. Most phishing sites are identiﬁed and removed
within a few days, yet there must have been suﬃcient visitors
during that period – because the attackers do not appear
to be discouraged, but move on to new locations and
continue their activities. We now describe a methodology
for quantifying phishing site duration and determining the
distribution of user-responses.
3.1 Phishing website availability
We gathered phishing reports from ‘PhishTank’ [15], one
of the primary phishing-report collators. Comparison of
their datasets with other public sources such as ‘Castle Cops’
and Google showed that their collection was by far the most
complete and timely. The PhishTank database records the
URL that has been reported to them, the time of that report,
and sometimes further detail such as whois data or screenshots
of the website. Volunteers use the URL to examine the
website and determine whether it is indeed a phishing website
or an incorrect report (perhaps of a legitimate bank).
Unfortunately, PhishTank does not provide an exact indication
of when sites are removed, and its systems are regularly
misled when phishing websites are not disabled, but
replaced with generic advertising web pages. We therefore
constructed our own testing system which, of necessity, became
rather complex.
This system fetches reports of conﬁrmed phishing websites
from PhishTank and records exactly when PhishTank
ﬁrst learnt of the site. In order to track the existence of
the website independently of whether its host name can be
resolved, further records are constructed by replacing the
host name part of the URL with the IP address it resolves
to and the reverse DNS lookup of that IP address. These
extra records also help to link together multiple reports of
the same site. Additional canonicalisation is done to link together
reports with or without trailing / characters, or when
index.html (index.php etc.) are provided in some reports
and not others.
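The canonicalisation step described above might look something like the following Python sketch. It is a minimal illustration rather than the system actually used: the function name is hypothetical, and the list of default page names (in particular `index.htm`) is an assumption standing in for the "etc." in the text.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumed set of default page names; the text names index.html and index.php.
DEFAULT_PAGES = ("index.html", "index.htm", "index.php")


def canonicalise(url: str) -> str:
    """Map equivalent report URLs onto a single key."""
    parts = urlsplit(url)
    path = parts.path
    # Drop a trailing default page such as /index.html.
    for page in DEFAULT_PAGES:
        if path.endswith("/" + page):
            path = path[: -len(page)]
            break
    # Drop trailing slashes so /bank/ and /bank compare equal.
    path = path.rstrip("/")
    # Lower-case scheme and host; discard any fragment.
    return urlunsplit((parts.scheme.lower(), parts.netloc.lower(), path, parts.query, ""))
```

With this sketch, `http://Example.com/~user/bank/index.html` and `http://example.com/~user/bank/` both reduce to `http://example.com/~user/bank`.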
We tested all of the sites in our database on a continuous
basis, twice every hour, to determine if they were still accessible.
The web page data fetched (along with its HTTP
headers) was ﬁngerprinted so that signiﬁcant changes (anything
apart from date-stamps, session-IDs, etc.) could be
detected. Just prior to fetching the page, the host name
was once again resolved (having ensured that there was no
cached data in the DNS server) and if it had moved to a new
IP address further records for that IP address (and its reverse
DNS lookup) were added to the database as required.
A website that returned a ‘404’ error was removed from the
database, but timeouts and other temporary failures were
retried for at least 48 hours (see footnote 2).
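One plausible way to fingerprint a fetched page while ignoring date-stamps and session-IDs is sketched below in Python. The patterns are illustrative assumptions (the text does not list the exact fields that were masked), and for brevity this sketch hashes only the body, whereas the system described above also covered the HTTP headers.

```python
import hashlib
import re

# Assumed examples of volatile content to mask before hashing.
VOLATILE = [
    re.compile(rb"\b\d{1,2}:\d{2}(:\d{2})?\b"),               # clock times
    re.compile(rb"\b\d{4}-\d{2}-\d{2}\b"),                     # ISO-style dates
    re.compile(rb"(?i)(sessionid|sid|phpsessid)=[0-9a-z]+"),  # session IDs
]


def fingerprint(body: bytes) -> str:
    """Hash the page after removing content expected to change on every fetch."""
    for pattern in VOLATILE:
        body = pattern.sub(b"", body)
    return hashlib.sha256(body).hexdigest()
```

Two fetches of the same page then share a fingerprint even if the embedded time or session ID differs, while a replacement page produces a different one.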
This testing regime enables us to precisely (with an accuracy
of about 30 minutes) determine when a phishing website
is removed or changed, whilst remaining tolerant of temporary
outages. Where multiple database entries pointed at
the same web page, the ﬁngerprinting enabled us to detect
this and remove the duplicates. Also, for known malicious
sites with identical fingerprints (and, in particular, the rock-phish
attacks described in Section 4), we immediately categorised
the sites as malicious, without waiting to discover
whether the PhishTank volunteers had correctly done so.
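Given a log of half-hourly probes, a site lifetime could be derived along the following lines. This Python sketch is an assumption about one reasonable definition (time from the report to the last probe that still saw the phishing page); the function name and data layout are hypothetical.

```python
from datetime import datetime


def lifetime_hours(reported: datetime, probes: list[tuple[datetime, bool]]) -> float:
    """Hours from the report until the last probe that found the site alive.

    `probes` holds (time, alive) pairs; with probes every 30 minutes the
    result is accurate to roughly half an hour.
    """
    alive_times = [when for when, alive in probes if alive and when >= reported]
    if not alive_times:
        return 0.0  # already gone when first inspected
    return (max(alive_times) - reported).total_seconds() / 3600
```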
In practice, our observations showed that phishing websites
were entirely static, and hence any change in ﬁngerprint
was suﬃcient to indicate that it had been removed, or further
requests were showing a generic page. This simpliﬁed
our monitoring considerably, but it was still necessary to
view the ﬁrst page we captured to determine which institution
was being targeted or, as sometimes happened, whether
it was already removed by the time we learnt of its existence.
3.2 Visitor statistics
We also wished to gain a better understanding of the distribution
of user responses to phishing attacks, and were
able to gather some limited information about how many
visitors a typical website received, and how many ﬁlled in
the web form and provided any data.
In a small number of cases (less than two dozen that we
have detected) the site recorded details of victims into text
ﬁles that were stored on the site itself in such a way that
we could retrieve them. Inspection of these ﬁles showed how
many responses were received and whether or not they were
likely to be valid. Some of the entries were clearly testing
(random sequences of characters), or consisted of profanities
directed at the recipient of the data. The remainder of the
responses were counted as valid, although it is understood
that some banks deliberately provide data on dummy accounts
for their own tracing purposes, so our counts will to
some minor extent overestimate the number of people actually
compromised.
Footnote 2: At present, we are excluding all sites that involve non-standard
forms of redirection to reach the final phishing
webpage. This avoids considerable complexity (some phishers
even use Macromedia Flash files to redirect traffic), at the
expense of a lack of completeness.
In other cases we have collected publicly available web
page usage statistics collated by the sites where the phishing
pages are residing. Webalizer [21] is a particularly popular
package, which is often set up by default in a world-readable
state on the type of web servers that seem to be regularly
compromised. Indeed, it may be unpatched Webalizer vulnerabilities
that permitted access in the first place. These
statistical reports provide daily updates as to which URLs
are visited, and these can be used to determine the total
number of visitors and how many reached the ‘thank you’
page that is generally provided once personal data has been
uploaded. By assuming that similar proportions of these
‘hits’ are valid occurrences of visitors compromising their
identity information, it is possible to form a view as to the
eﬀectiveness of the phishing exercise and the distribution
of visitors day by day. As new reports are obtained from
PhishTank, we have automatically queried sites to determine
whether Webalizer is running; if so, we returned daily
to collect new reports. In all, we discovered over 2 500 sites
using Webalizer in this manner.
4. ROCK-PHISH ATTACKS
In Section 2 we described the way in which typical phishing
websites were operated with web pages added to existing
structures and the occasional use of misleading domain
names. However, the ‘rock-phish’ gang operate (in early
2007) in a rather diﬀerent manner. Having compromised
a machine they then cause it to run a proxy system that
relays requests to a back-end server system. This server is
loaded with a large number (up to 20 at a time) of fake bank
websites, all of which are available from any of the rock-phish
machines. The gang then purchase a number of domain
names with short, generally meaningless, names such
as lof80.info. The email spam then contains a long URL
such as:
http://www.volksbank.de.networld.id3614061.lof80.info/vr
where the first part of the URL is intended
to make the site appear genuine and a mechanism such as
‘wildcard DNS’ can be used to resolve all such variants to a
particular IP address.
Transmitting unique URLs trips up spam ﬁlters looking
for repeated links, fools collators like PhishTank into recording
duplicate entries, and misleads blacklist users who search
for exact matches. Since the numeric values are sent to the
DNS server (which the gang also hosts) it is clear that tracking
of responses is possible along with all kinds of customisation
of responses. However, which bank site is reached
depends solely upon the url-path (after the ﬁrst /). Hence,
a canonical URL such as http://www.lof80.info/ is sufﬁcient
to fetch a top level web page and its ﬁngerprint is
suﬃcient to identify the domain and associated IP address
as owned by the rock-phish gang.
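Reducing a reported rock-phish URL to its canonical form can be sketched in Python as follows. The function name is hypothetical, and taking the last two labels of the host name as the registered domain is a simplifying assumption that would not hold for registries that sell names at the third level.

```python
from urllib.parse import urlsplit


def rock_phish_canonical(url: str) -> str:
    """Collapse a unique rock-phish URL to the canonical top-level URL."""
    host = urlsplit(url).hostname or ""
    labels = host.split(".")
    # Assumption: the registered domain is the final two labels.
    registered = ".".join(labels[-2:])
    return f"http://www.{registered}/"
```

Applied to the example above, the long Volksbank URL reduces to `http://www.lof80.info/`, so thousands of unique reports collapse onto a small number of canonical entries.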
Because the gang use proxies, the real servers – that hold
all the web pages and collate the stolen information – can
be located almost anywhere. The number and location of
these servers might be found by inspecting the proxies and
determining who they were communicating with, but we did
not attempt to do this. However, we did see some small
variations between what the proxies returned both in the
range of pages and the minutiae of their headers, making it
clear that the gang were operating more than one server and
failing to completely synchronise them.
The gang’s methods have evolved over time – they originally
placed all their websites into a /rock directory (hence
their name), morphed later into /r1 but now this directory
name is dispensed with (although we found that /r1/vr/
still works as a synonym for /vr). The gang’s evolution has
been tracked well enough, and their methods diﬀer so much
from other phishing websites, that it is useful to measure
their activities separately for this study. In particular, their
email spam, which has a characteristic section of random
text followed by a GIF image containing the actual message,
is estimated to account for between one third and one
half of all phishing email. The rock-phish gang is believed
to be extremely successful, and it is claimed that they have
stolen in excess of $100m so far [7].
For traditional phishing sites, removing either the hosting
website or the domain (if only used for phishing), is suﬃcient
to remove a phishing site. However, rock-phish sites
share hosts – so that if one is removed, the site automatically
switches to working machines which are still hosting
a copy of the proxy. This switching behaviour provides the
strongest evidence that rock-phish sites collude. To verify
this collusion, we selected a random rock-phish domain and
examined each of the IP addresses associated with the domain.
We tallied each domain that also used one of these
IP addresses and recursively checked these domains’ associated
IP addresses. In this manner we identiﬁed every IP
address associated with rock-phish sites starting from just
one address.
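The recursive check described above amounts to a graph traversal over domains and IP addresses. The Python sketch below shows one way to express it as a breadth-first search; the function name and the dictionary input are illustrative assumptions, not the tooling used for the study.

```python
from collections import deque


def linked_infrastructure(start_domain, domain_to_ips):
    """Return every domain and IP reachable from one domain via shared IPs."""
    # Invert the mapping so each IP lists the domains that resolved to it.
    ip_to_domains = {}
    for domain, ips in domain_to_ips.items():
        for ip in ips:
            ip_to_domains.setdefault(ip, set()).add(domain)

    seen_domains, seen_ips = {start_domain}, set()
    queue = deque([start_domain])
    while queue:
        domain = queue.popleft()
        for ip in domain_to_ips.get(domain, ()):
            seen_ips.add(ip)
            for other in ip_to_domains.get(ip, ()):
                if other not in seen_domains:
                    seen_domains.add(other)
                    queue.append(other)
    return seen_domains, seen_ips
```

If the sites really do share a pool of hosts, the traversal from any one domain should return the whole pool; unrelated sites remain outside the result.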
It should be noted that our methodology meant that we
were rapidly aware of DNS changes, where domain names
were mapped to new IP addresses. Because we tended to
make all of our name lookups over a short period of time
we often recorded many names resolving to the same IP
address, and the next time we accessed the rock-phish site we
would see most of them resolving to another address. Users
would not see the same eﬀect because of caching by DNS
servers (usually at their ISP). This caching would mean that
their perception would be of a constant mapping between
name and IP address until the cache entry expired, when
the site would ‘move’. This caching eﬀect also means that
the removal of a domain name does not lead to the instant
disappearance of the website, provided that the machine at
the relevant IP address remains ‘up’. When another ISP
customer has resolved the name already, the site will remain
visible at that ISP for an extended period, and will often be
reachable via the ‘removed’ domain name for most of a day.
4.1 ‘Fast-ﬂux’ phishing domains
While we were collecting data for this paper the gang
introduced a new system dubbed ‘fast-flux’ by the anti-phishing
community, with trials in February and wider deployment
from March onwards (see footnote 3). They arranged for their
domains to resolve to a set of ﬁve IP addresses for a short
period, then switched to another ﬁve. This of course ‘eats
up’ many hundreds of IP addresses a week, but the agility
makes it almost entirely impractical to ‘take down’ the hosting
machines. The gang is likely to have large numbers of
compromised machines available, since if they are not used
to serve up phishing websites they are available for sending
email spam. For further obfuscation, the gang changed
from using the url-path to select the target bank to using
the Host: header from the HTTP connection. This makes
it somewhat more complex for ISPs and registrars to understand
the nature of the sites and to what extent they can be
considered to be ‘live’.
Footnote 3: We were able to identify several machines that were used
for both the original rock-phish scheme and for the new fast-flux
architecture, so we are confident the same gang is involved.
Further, although there are currently (August 2007)
three fairly distinct pools of fast-flux machines being used
for phishing, there are a handful of overlaps which indicate
to us that one gang is operating at least two of them.
[Figure 1 plot omitted: daily counts (vertical axis 10 to 70) of operational rock domains and operational rock IPs, March to April.]
Figure 1: Rock-phish site activity per day.
4.2 Rock-phish statistics
We analysed rock-phishing sites during a period of eight
weeks between February and April 2007. During this time,
we collected 18 680 PhishTank reports which we categorised
as rock-phish (52.6% of all PhishTank reports for the time
period). While these reports are intended to be unique, we
identiﬁed many duplicates due to the use of unique URLs as
described above. This yielded a signiﬁcant saving in eﬀort,
since just 421 canonical rock-phish URLs were observed.
Rock-phish sites used 125 IP addresses that were found to
be operational for any duration. In all, the rock-phish sites
impersonated 21 diﬀerent banks and 3 other organisations.
Meanwhile, fast-ﬂux sites triggered 1 803 PhishTank reports
during the collection period. These reports pare down
to 72 unique domains which resolve to 4 287 IP addresses.
Observed fast-ﬂux sites have targeted 18 banks and 10 other
organisations.
Rock-phish sites continue to work for a particular domain
that is mentioned in a spam email, provided that they can be
resolved to at least one working IP address. Figure 1 tracks
the average number of operational rock-phish domains and
IP addresses on a daily basis. Sites or domains were removed
constantly, but they were replenished frequently enough to
keep a number of sites working every day. Only once, right
at the start of our data collection period, did the sites fail to
work entirely, because the IP addresses being used for DNS
resolution all failed. Otherwise, between 1 and 75 domains
and between 2 and 22 IP addresses were always available.
Notably, the number of operational domains steadily increased
during the month of March, before falling steadily in
late March and early April. This is primarily attributed to a
large number of .hk domains bought from a single registrar,
which was slow to remove the oﬀending domains. But why
would the rock-phish gang continue to buy new domains
when their earlier ones still worked? One reason is that the
domains may lose eﬀectiveness over time as they are blocked
by spam ﬁlters. Indeed, comparing the number of domains
added per day to the number removed (see Figure 2-top)
reveals only a weak correlation (r = 0.340) between domain addition
and removal. This suggests the rock-phish gang are
motivated to purchase new domains even when registrars
are slow to take action.
The story is rather different for the machines that rock-phish
domains resolve to. Figure 2-middle plots the day-by-day
addition and removal of compromised machines used.
Here the correlation is strong: as soon as machines are removed,
new ones replace them. The correlation coeﬃcient
of 0.740 implies that 55% of the total variance is explained
by the correlation between adding and removing machines.
Perhaps the rock-phish gang have automated IP replacement;
automating domain acquisition, by contrast, is more
diﬃcult and costly – so it is not surprising that the data suggests
that manual selection prevails when adding domains.
Finally, we can infer whether co-ordination between rock-phish
domain and machine removal takes place by comparing
daily take-down rates for both (Figure 2-bottom). There
is almost no correlation between the number of domains removed
on a given day and the number of machines removed.
This suggests that very little co-operation between registrars
and ISPs is taking place. Furthermore, the lack of correlation
implies that either banks and other removal entities are
not communicating convincingly to both ISPs and registrars,
or they do not fully understand the rock-phish gang’s use of
domains and compromised machines.
5. WHO IS WINNING THE ARMS RACE?
Phishing targets invest signiﬁcant resources in removing
phishing sites. In this section we present data on the duration
of phishing sites and on user response to these sites to
determine the eﬀectiveness of the take-down strategy.
In addition to the collection of rock-phish sites, we also examined
reports of regular phishing sites targeting a number
of banks and other sites. From 15 030 reports gathered over
the same 8-week period from February to April 2007, we
identified 1 695 unique non-rock-phish sites that were alive
upon initial inspection. Because ordinary phishing sites do
not follow a consistent pattern, establishing uniqueness is
difficult. We considered two sites to be duplicates if they
were hosted on the same domain, impersonated the same
bank and were reported to PhishTank within two days of
each other.
[Figure 2 plots omitted: three daily time series, March to April. Top: rock domains added and rock domains removed. Middle: rock IPs added and rock IPs removed. Bottom: rock domains removed and rock IPs removed.]
Correlation coefficient                    r       r^2
Rock domains added–Rock domains removed    0.340   0.116
Rock IPs added–Rock IPs removed            0.740   0.547
Rock IPs removed–Rock domains removed      0.142   0.0200
Figure 2: (Top) new and removed rock-phish domains per day; (Middle) new and removed rock-phish IPs
per day; (Bottom) rock-phish domain and IP removal per day. Also included is a table of the respective
correlation coefficients.
[Figure 3 histogram omitted: share of sites (0% to 50%) by lifetime in days (1 to 14, then ‘More’) for non-rock phish, rock-phish domains, rock-phish IPs, fast-flux domains and fast-flux IPs.]
                     Sites    Mean lifetime (hours)   Median lifetime (hours)
Non-rock             1 695    61.69                   19.52
Rock domains           421    94.68                   55.14
Rock IPs               125    171.8                   25.53
Fast-flux domains       57    196.2                   111.0
Fast-flux IPs        4 287    138.6                   18.01
Figure 3: Histogram of phishing site lifetimes with table of sample size and mean and median lifetimes.
However, removing duplicates does not account for the entire
reduction of 14 062 reports. Many sites had already been
removed by the time they had been verified and promulgated
by PhishTank. Because we cannot evaluate whether
dead-on-arrival sites are in fact phishing sites or simply
malformed URLs, we exclude them from our lifetime analysis.
Thus, the lifetimes discussed below do not account for
the many sites that are removed immediately.
5.1 Phishing site lifetimes
The site lifetimes for each type of phishing attack are given
in the table in Figure 3. The mean lifetime of a normal
phishing site is 61.69 hours, while for rock-phish domains
the mean lifetime is 94.68 hours. Notably, for all phishing
types, the median take-down time is much less than the
average time. The reason can be seen in the histogram
of phishing site lifetimes in Figure 3. Each bin represents
one day, and the histogram covers two weeks, which is long
enough for most samples we collected (sites lasting longer
are indicated by the ‘More’ column). 57% of non-rock-phish
sites are removed within 24 hours of reporting, while the
remainder do not survive much longer. Only 28% of
non-rock-phish sites last more than 2 days, though notably the
tail carries on for several weeks. For instance, the
longest-lived ordinary phishing site from our sample stuck
around for over seventeen weeks!
For rock-phish sites, the distribution is slightly different.
While clearly skewed toward shorter times, the distribution
has a heavier tail: a small but substantial number of
rock-phish domains remain operational for longer periods. 25%
are removed on the first day, 19% on the second, and 56%
remain for 3 days or longer.
The slightly longer survival time of rock-phish sites may
be partially explained by the persistence of usable hosting
machines (see the ﬁnal histogram in Figure 3). Recall that
rock-phish spam always uses a domain name in the linked
URL. This allows the gang to cycle through IP addresses as
they fail. Several rock-phish domains resolve to the same
IP address at any given time; when the machine is removed,
they switch to another IP address in their pool. Figure 3
suggests that they do not have to switch all that often: IP
addresses work for an average of 171.8 hours. While many
are removed within one day, some remain for months before
being removed.
Another explanation for the longer lifetimes of rock-phish
sites is that their attack method is not widely understood,
leading to sluggish responses. Splitting up the components
of the phishing attack (domains, compromised machines and
hosting servers) obfuscates the phishing behaviour so that
each individual decision maker (the domain registrar, ISP
system administrator) cannot recognise the nature of the
attack as easily when an impersonated domain name is used
(e.g., barclaysbankk.com), or HTML for a bank site is found
in a hidden sub-directory on a hijacked machine.
Figure 4: Cumulative probability distributions with lognormal
curve fit: non-rock-phish lifetimes with µ = 3.01, σ = 1.47
fit (Left); rock-phish domain lifetimes with µ = 3.92,
σ = 1.22 fit (Centre); rock-phish IP lifetimes with µ = 3.43,
σ = 1.89 fit (Right). Each panel plots Prob(Lifetime > t)
against site lifetime t (hours).
              Lognormal                         Kolmogorov-Smirnov
              µ      Std err.  σ      Std err.  D        p-value
Non-rock      3.011  0.03562   1.467  0.02518   0.03348  0.3781
Rock domains  3.922  0.05966   1.224  0.04219   0.06289  0.4374
Rock IPs      3.434  0.1689    1.888  0.1194    0.09078  0.6750
Fast-flux sites exhibit markedly different behaviour. Domains
last much longer: over eight days on average, and
there are far fewer than used for rock-phish. The fast-flux
systems were also used to host a number of other dubious
domains, mainly websites devoted to the recruitment of
‘mules’. The 15 domains we tracked lasted an average of 463
hours (median 135 hours), indicating that their removal was
not a priority. Interestingly, the average lifetime of fast-ﬂux
IP addresses (138.6 hours) is a bit less than the lifetimes of
IPs used for rock-phish attacks (171.8 hours). We speculate
that the phishers are using machines at random and relying
upon the domains resolving to multiple IP addresses to
provide resilience, rather than actively selecting a handful of
hosts that they believe are more likely to remain available.
The skewed distribution of site lifetimes shows that while
most sites are removed promptly, a substantial number remain
for a very long time. These long-lived sites cause the
average lifetime to be much longer than the median lifetime.
We have managed to ﬁt some of the take-down data to match
the lognormal probability distribution. To do so, we ﬁrst
estimated the parameters µ and σ which specify the distribution
using maximum likelihood estimation. To test the
ﬁt, we computed the Kolmogorov-Smirnov test 1000 times
to compute the average maximum diﬀerence D between the
model and data.
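The fitting procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the analysis code behind the results: the function names are hypothetical, the sample is synthetic, and the sketch computes a single Kolmogorov-Smirnov distance D without reproducing the 1000 repetitions or the p-value. For a lognormal, the maximum likelihood estimates of µ and σ are the mean and (population) standard deviation of the log-lifetimes.

```python
import math
import random


def fit_lognormal(lifetimes):
    """Maximum likelihood estimates of mu and sigma for a lognormal."""
    logs = [math.log(t) for t in lifetimes]
    mu = sum(logs) / len(logs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / len(logs))
    return mu, sigma


def lognormal_cdf(t, mu, sigma):
    """P(Lifetime <= t) under the fitted lognormal."""
    z = (math.log(t) - mu) / (sigma * math.sqrt(2))
    return 0.5 * (1.0 + math.erf(z))


def ks_distance(lifetimes, mu, sigma):
    """Largest gap D between the empirical and the model CDF."""
    xs = sorted(lifetimes)
    n = len(xs)
    d = 0.0
    for i, t in enumerate(xs):
        model = lognormal_cdf(t, mu, sigma)
        # The empirical CDF steps from i/n to (i+1)/n at t.
        d = max(d, abs(model - i / n), abs((i + 1) / n - model))
    return d


# Synthetic lifetimes in hours; illustrative only, not the study's data.
rng = random.Random(1)
sample = [rng.lognormvariate(3.0, 1.5) for _ in range(2000)]
mu, sigma = fit_lognormal(sample)
print(round(mu, 2), round(sigma, 2), round(ks_distance(sample, mu, sigma), 3))
```

A small D means the fitted curve tracks the observed lifetimes closely across the whole range, including the tail.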
The lognormal distribution turns out to be a good fit for
the distribution of ordinary phishing sites as well as
rock-phish domains and IP address lifetimes. However, it is not
as good a fit for fast-flux sites. (Fast-flux IP addresses
are typically ignored by take-down procedures, while the
lifetime of fast-flux domains is consistent with a lognormal
distribution but there is too little data to confirm it.) The
table in Figure 4 gives the relevant attributes for each fitted
distribution, and the plots show the lognormal cumulative
probability distributions and the observed data points.
Note that both axes are logarithmic in scale to demonstrate
the goodness-of-ﬁt in the tail of the distribution. It is signiﬁcant
that the take-down times for these three diﬀerent
categories of phishing attack can each be modelled by the
same family of fat-tailed distribution, particularly since the
actors responsible for the take-down are diﬀerent (domain
registrars, ISPs and system administrators).
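As a rough consistency check, the fitted parameters can be turned back into lifetimes. A lognormal with parameters µ and σ has median exp(µ) and mean exp(µ + σ²/2), which is why the mean sits well above the median whenever σ is large. The sketch below uses the µ and σ values from the table in Figure 4 and the observed values from the table in Figure 3; the helper names are illustrative.

```python
import math

# (mu, sigma) from the table in Figure 4; observed (mean, median) in
# hours from the table in Figure 3.
fits = {
    "Non-rock": ((3.011, 1.467), (61.69, 19.52)),
    "Rock domains": ((3.922, 1.224), (94.68, 55.14)),
    "Rock IPs": ((3.434, 1.888), (171.8, 25.53)),
}


def lognormal_median(mu):
    """Median of a lognormal distribution, in the units of the data."""
    return math.exp(mu)


def lognormal_mean(mu, sigma):
    """Mean of a lognormal distribution, in the units of the data."""
    return math.exp(mu + sigma ** 2 / 2)


for name, ((mu, sigma), (obs_mean, obs_median)) in fits.items():
    print(f"{name}: model median {lognormal_median(mu):.1f} h "
          f"(observed {obs_median}), model mean "
          f"{lognormal_mean(mu, sigma):.1f} h (observed {obs_mean})")
```

The model values will not match the observed ones exactly, since the fit is to the whole distribution rather than to these two summary statistics.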
Lognormal distributions arise whenever outcomes depend
on realising a number of independent, randomly-distributed
preconditions. For example, software failure rates have been
shown to follow a lognormal distribution [11]. This is because
bugs occur in code locations surrounded by many
conditional statements. Similarly, in order to successfully
remove a phishing site, a number of conditions must be met:
the site must be detected, the site may or may not be located
at an ISP used to removing phishing sites, the bank
may or may not have a working relationship with the police
in the jurisdiction where the site is located, and so on.
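One way to see why multiplicative effects lead to a lognormal shape is a small simulation. The sketch below is purely illustrative and is not drawn from the take-down data: it assumes each precondition scales the removal delay by an independent random factor (the number of stages and the uniform range are arbitrary choices). The logarithm of the product is then a sum of independent terms, which is approximately normal.

```python
import math
import random
import statistics

rng = random.Random(7)


def simulated_delay(stages=8):
    """Product of independent positive factors, one per precondition."""
    delay = 1.0
    for _ in range(stages):
        delay *= rng.uniform(0.5, 3.0)  # arbitrary illustrative range
    return delay


delays = [simulated_delay() for _ in range(5000)]
logs = [math.log(d) for d in delays]

# The raw delays are right-skewed: the mean sits well above the median.
print(statistics.mean(delays), statistics.median(delays))
# The log-delays are close to symmetric, as a normal sample would be.
print(statistics.mean(logs), statistics.median(logs))
```

The same gap between mean and median appears in the lifetime tables, where a few long-lived sites pull the average upwards.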
5.2 User responses to phishing
Having established how long phishing sites remain operational,
we now estimate user-response rates to phishing sites.
We analysed the site usage statistics from 144 phishing sites,
from which we obtained daily snapshots of hit rates broken
down according to URLs. These statistics were generated by
Webalizer [21], a web server log analysis program. From each
list of popular URLs, we identified the phishing entry and
completion pages and cross-referenced the site’s PhishTank
report to establish the earliest report date. Note that these
were all ordinary phishing sites; the rock-phish gang do not
leave logging data visible.
Webalizer also provides a rank ordering of entry pages.
An entry page is the ﬁrst one that a site visitor views. By
tracking entry pages, we can readily distinguish between hits
to the phishing page and the rest of the site. Each time we
discovered a live site publishing Webalizer reports, we automatically
returned daily to obtain updated reports until the
site was taken oﬄine. Thus, we ended up with a time sequence
of reports used to estimate the distribution of victim
responses for the days surrounding the phishing report
(see footnote 4).
Footnote 4: Our system was not alone in visiting these
websites to determine if they were still operational. We took
steps to exclude these automated monitors from our datasets.
Figure 5: User responses to phishing sites over time (x-axis:
days after phish reported; y-axis: user responses). Data
includes specious responses.
For most phishing scams, when someone enters their details
on the site, they are taken to a fake confirmation page.
We picked out these conﬁrmation pages and noted the number
of hits they received, which was small compared with the
number of hits on the initial page. Regrettably, Webalizer
does not record the number of unique visits for all URLs,
so we could seldom obtain the number of unique visits to
the more popular entry pages. Instead, we estimated the
number of unique visits to each site’s conﬁrmation page by
taking the number of hits, and assuming the same fraction
of visits to hits that we saw for the entry page.
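The scaling step amounts to one line of arithmetic. The sketch below is a minimal illustration; the function name and the example numbers are made up and do not come from the collected statistics.

```python
def estimate_unique_visits(confirmation_hits, entry_hits, entry_visits):
    """Scale confirmation-page hits by the entry page's visits-to-hits
    ratio, on the assumption that both pages share that ratio."""
    if entry_hits <= 0:
        raise ValueError("entry page must have at least one hit")
    return confirmation_hits * entry_visits / entry_hits


# Made-up numbers: the entry page shows 400 hits from 250 unique
# visits, and the confirmation page shows 40 hits.
print(estimate_unique_visits(40, 400, 250))  # 25.0
```

The estimate is only as good as the assumption that visitors reload or revisit the confirmation page about as often as the entry page.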
Unfortunately, from the point of view of collecting good
data, in many cases the site statistics presented diﬃculties:
we could only obtain one reading before the site was removed,
it could be unclear which were the conﬁrmation
pages, or the Webalizer values were not fetched until several
days after the site went live. For these reasons, we were
only able to obtain usable day-by-day statistics from twenty
sites. An average of these results is given in Figure 5.
We estimate that 21 unique users reach the conﬁrmation
page on the same day that the phish is reported. On the next
day, another 15 responses are expected. A wide ﬂuctuation
in the average daily responses then occurs, from 5 responses
on the second day after reporting to 25 responses on the
third. This is doubtless due to the wide variation in the
overall number of responses each site receives.
Somewhat surprisingly, for many sites the user responses
sustain a fairly high level until the site is removed. We cannot
say whether this is caused by ongoing spamming activity,
or by users catching up with email backlogs in their in-boxes.
This ongoing activity was demonstrated to an extreme by
the usage statistics for a PayPal phishing site loaded onto a
web page for the Niger Water Basin Authority. This site remained
alive into March 2007 and received a steady stream
of phishing responses over a month and a half, so the failure
to take it down more quickly caused ongoing problems.
Thus it does appear that take-down, even when it is slow, is
always going to have some positive eﬀects.
We also observed noticeable variation in the number of
responses received. One site (excluded from the average
presented in Figure 5 because of missing data) drew over
500 responses in one day. Hence a small number of sites may
draw signiﬁcantly larger numbers, so the data presented here
should be viewed as a conservative estimate.
But how accurate is the conﬁrmation rate as a measure
of successful attack? Just because the conﬁrmation page
is visited, this does not necessarily mean that every hit
corresponds to a theft of personal details. To arrive at a
more accurate success rate, we have also gathered 414 user
responses with personal information published on phishing
sites in what the attacker believed to be an obscure location.
We examined each response by hand to determine whether
the responses appeared plausible. Many responses were obviously
fake, with names and addresses like ‘Die Spammer’
and ‘123 Do you think I am Stupid Street’. In fact, the responses
were evenly split: 214 responses were obviously fake,
while 200 appeared real. Hence, albeit from a small sample,
we can estimate that half the responses to a phishing site
represent actual theft of details.
So how does this user-response data relate to the phishing
site lifetimes we described in Section 5.1? Of the sites we
sampled, we might expect around 18 victims per site if they
are removed within one day of reporting, and rising by 8
victims for each successive day. This is a substantial number,
and it is unclear whether the phishing targets can act
suﬃciently quickly to reduce it by very much.
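The per-site figures can be reproduced with a short calculation. The sketch below is one reading of the numbers rather than a stated formula: it assumes the 18 victims come from halving the 21 + 15 responses seen on the report day and the following day, and it interpolates linearly at 8 victims per additional day. The function and constant names are hypothetical.

```python
REAL_FRACTION = 0.5            # roughly half of responses looked genuine
FIRST_DAY_RESPONSES = 21 + 15  # report day plus the following day
EXTRA_VICTIMS_PER_DAY = 8      # the text's figure for each later day


def expected_victims(lifetime_hours):
    """Rough victims per site; lifetimes under a day count as one day."""
    first_day = FIRST_DAY_RESPONSES * REAL_FRACTION
    extra_days = max(0.0, lifetime_hours / 24 - 1)
    return first_day + EXTRA_VICTIMS_PER_DAY * extra_days


print(expected_victims(24))         # site removed within one day
print(round(expected_victims(61)))  # site lasting around 61 hours
```

Under these assumptions a site removed within a day yields 18 victims and one lasting around 61 hours yields about 30.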
5.3 Estimating the cost of phishing attacks
We can now use our empirical data to estimate the cost
imposed by phishing attacks. We must of course qualify
our calculations by noting that we are using a number of
rather fuzzy estimates, so that substantial reﬁnement will
be possible in the future as better ﬁgures come to light.
We ﬁrst consider the cost imposed by ordinary (i.e., not
rock-phish or fast-ﬂux) phishing sites. We collected data
for eight weeks and conﬁrmed 1 438 banking phishing sites
(we exclude eBay phishing scams for the purpose of this
calculation). Extrapolating, we might expect 9 347 sites per
year. These particular sites remain operational for around
61 hours on average, which yields approximately 30 victims
based on the analysis in Section 5.2. Gartner has estimated
the cost of identity theft to be $572 per victim [5] (see
footnote 5). Hence, the estimated annual loss due to ordinary
phishing sites is 9 347 × 30 = 280 410 victims, at $572 each,
or approximately $160.4m. Gartner
estimates that 3.5 million Americans give away their details
annually, which leads to an estimated loss of $2bn.
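The arithmetic behind these figures is easy to check. The sketch below simply restates the calculation with the numbers given in the text; the only added assumption is a 52-week year for the extrapolation, and the variable names are illustrative.

```python
SITES_OBSERVED = 1438   # banking phishing sites confirmed in 8 weeks
WEEKS_OBSERVED = 8
VICTIMS_PER_SITE = 30   # from the response analysis in Section 5.2
LOSS_PER_VICTIM = 572   # US dollars, Gartner figure [5]

# Extrapolate the 8-week count to a year (52 weeks assumed).
sites_per_year = round(SITES_OBSERVED * 52 / WEEKS_OBSERVED)
victims_per_year = sites_per_year * VICTIMS_PER_SITE
annual_loss = victims_per_year * LOSS_PER_VICTIM

print(sites_per_year, victims_per_year, annual_loss)

# Gartner's own total: 3.5 million victims at the same loss per victim.
gartner_loss = 3_500_000 * LOSS_PER_VICTIM
print(gartner_loss)
```

Both totals scale linearly with the loss per victim, so any revision to that single Gartner figure moves them proportionally.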
We cannot reliably provide an estimate for the costs of
rock-phish and fast-ﬂux phishing scams since we do not have
similar response data. However, given that the rock-phish
gang send a large proportion of all spam [7], which drives
visitor numbers, it is fair to assume that they steal at least
as much money as ordinary phishers. Thus, we estimate
that, at an absolute minimum, $320m is lost annually
due to phishing scams. The disparity with Gartner’s
total of $2bn is doubtless due to the extremely rough approximations
used, both by ourselves and Gartner. But the
diﬀerence will also be accounted for by the other ways in
which personal data can be stolen, for example the theft
of merchant databases, and the activities of malware that
scans ﬁles or operates keyloggers.
6. DISCUSSION
6.1 Do weekends affect take-down?
Defenders working for targets of phishing attacks often
speculate that attackers deliberately wait to advertise phishing
sites until just before the weekend to maximise site uptime,
since many system administrators will be away. Upon
examining the data, we find that sites launched before the
weekend are no more likely to last longer than others.
Footnote 5: Gartner also gives a value of $1 244 per victim,
but reports that over half of this is subsequently recovered.
Figure 6: Phishing-site lifetimes (Top, in hours) and counts
(Bottom, as a fraction of sites), for non-rock-phish and
rock-phish sites, collated by weekday they were first reported.
We ﬁrst examine whether sites reported near the weekend
stay around longer than those reported earlier in the week.
The upper graph in Figure 6 shows the average duration of
phishing sites based upon the day of the week the site was
ﬁrst reported. Rock-phish sites reported on Tuesday last
longest, while those reported on Monday and Saturday are
removed quickest. It is unclear whether there is any signiﬁcance
to these differences. Non-rock-phish sites reported on
Saturday last around one day longer than those reported on
Sunday, so it seems as if reports from both Saturday and
Sunday are actioned at much the same time.
The next question we address is whether some days are
more popular for launching phishing sites than others. The
lower graph in Figure 6 measures the fraction of sites reported
on each day of the week. The most striking conclusion
to be drawn from this graph is that the weekend is the
least popular time for both rock-phish and ordinary phishermen
to set up sites. More accurately, fewer reports of new
phishing sites are created over the weekend. It is impossible
to tell whether there are fewer sites appearing, or fewer
people looking for them, on Saturday and Sunday.
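Both graphs in Figure 6 come from the same grouping step, which can be sketched as follows. The records below are made up for illustration and the variable names are hypothetical; only the grouping logic is the point.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical (first reported, lifetime in hours) pairs.
sites = [
    (datetime(2007, 3, 5, 10), 20.0),   # Monday
    (datetime(2007, 3, 6, 15), 48.0),   # Tuesday
    (datetime(2007, 3, 10, 9), 60.0),   # Saturday
    (datetime(2007, 3, 12, 8), 30.0),   # Monday
]

# Group lifetimes by the weekday of the first report (Monday is 0).
by_day = defaultdict(list)
for reported, lifetime in sites:
    by_day[DAYS[reported.weekday()]].append(lifetime)

for day in DAYS:
    lifetimes = by_day.get(day, [])
    share = len(lifetimes) / len(sites)           # lower graph
    average = mean(lifetimes) if lifetimes else None  # upper graph
    print(day, average, round(share, 2))
```

Because the grouping key is the report date, the counts measure reporting activity, which is why they cannot separate fewer sites from fewer people looking.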
6.2 Comparing bank performance
There are 122 banks and other institutions targeted from
our sample of ordinary phishing sites. However, some banks
are targeted a lot more than others: PayPal was impersonated
by 399 of the 1 695 sites, while 52 banks were only
spoofed once. A pie chart showing the proportion of targeted
banks is given in Figure 7.
Figure 7: Proportion of ordinary phishing sites impersonating
each bank. Labelled slices: USBANK, RBC, CHASE, LLOYDS,
NATIONWIDE, POSTE_IT, HALIFAX, HSBC, FARGO, WACHOVIA, BOA,
EBAY, PAYPAL.
While banks cannot control the number of fake sites that
appear, they can certainly help determine how long they
stick around. Here there is also signiﬁcant disparity. Figure
8 presents a rank-ordering of the average site lifetimes for
banks impersonated more than ﬁve times during the sample
period. Egg, TCF Bank, eGold and Citibank are slowest at
taking down sites (over 4 days), while Capital One, NatWest
and Flagstar Bank are quickest (around 12 hours). Note that
the results should be treated with caution because the differences
will, at least in part, result from diﬀerent choices by
the attackers as to where sites are hosted. Furthermore, a
few long-lived sites can drastically alter the average lifetimes
when banks are rarely impersonated.
6.3 Comparing free-hosting performance
We identiﬁed a number of providers of ‘free’ webspace that
regularly hosted phishing websites. We tracked ﬁve organisations’
take-down performance for phishing sites launched
between February 17, 2007 and June 30, 2007 (a longer period
than the other datasets we report upon). The results
are given in the following table:
Host        Sites  Mean lifetime  Median lifetime
yahoo.com     174  23.79 hours     6.88 hours
doramail      155  32.78 hours    18.06 hours
pochta.ru   1 253  33.79 hours    16.83 hours
alice.it      159  52.43 hours    18.83 hours
by.ru         254  53.11 hours    38.16 hours
As is apparent, the take-down times diﬀer between the organisations,
with Yahoo! being the fastest. Yahoo’s already
impressive take-down performance is understated, since approximately
half of the sites had already been removed before
appearing in PhishTank and are consequently ignored
by our calculations.
Figure 8: Phishing-site lifetimes per bank (only banks with
five or more sites are presented). The y-axis gives lifetime
in hours. Banks appear along the x-axis in this order: EGG,
TCF, EGOLD, BOA, CITIBANK, GERMANAMERICAN, WAMU, LLOYDS,
WACHOVIA, WESTUNION, EBAY, NCUA, DNCU, CHASE, PAYPAL,
DESJARDINS, FARGO, HALIFAX, HSBC, POSTE_IT, HAWAIIUSAFCU,
USBANK, MILBANK, STGEORGE, BANKOFQUEENSLAND, NATIONWIDE,
AMAZON, FNBSA, RBC, MONEYBOOKERS, BARCLAYS, VISA, WESTPAC,
CAPITAL1, NATWEST, FLAGSTAR.
However, the picture is a little more complex than the table
of hosting providers makes apparent. The vast majority of
phishing sites hosted on
doramail, pochta.ru, alice.it and by.ru impersonated
eBay, along with a few PayPal and Posteitaliane fakes. By
contrast, the sites on Yahoo’s free ‘GeoCities’ webspace
impersonated a wide range of different institutions, so it is not
possible to determine cause and effect with complete
confidence. There may be some delays not only at the hosting
provider but also within eBay (and these sites accounted
for over a third of all eBay sites). However, it is noteworthy
that in all cases the average lifetime of free-hosting sites
is shorter than for regular phishing sites. This is likely to
be due to diﬀerences in service obligations: ‘free’ webspace
can be pulled at the ﬁrst sign of foul play while regular hosts
must be sensitive to inconveniencing a paying customer with
a potential disruption.
6.4 The ‘clued-up’ effect on take-down speed
We also investigated whether the take-down performance
of the providers and registrars changed over time. Figure 9
presents scatter plots of phishing site lifetime based on the
date reported. Phishing sites started appearing on the free
host alice.it in April 2007. Yet nothing was done to remove
any of these sites until early May. This eﬀect can be
seen in its scatter plot (Figure 9-left), where the April site
lifetimes decrease linearly. Once the host became ‘clued up’
to the existence of phishing sites, it started removing sites
much more promptly. However, we did not observe a similar
eﬀect for the other free hosting ﬁrms. Most likely, they had
already been targeted prior to our data collection period, so
we could not witness a similar eﬀect.
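The linear decrease follows directly from the dates: if nothing is removed before the host first acts, every earlier site lives until that moment, so its lifetime is the gap between its report date and the clean-up. The sketch below illustrates this with a hypothetical clean-up date; the text says only ‘early May’, so the exact day is an assumption.

```python
from datetime import datetime

# Hypothetical date on which the host first acted; the text says only
# "early May".
FIRST_CLEAN_UP = datetime(2007, 5, 3)


def lifetime_hours(reported, removed=FIRST_CLEAN_UP):
    """Hours a site stays up when nothing is removed before `removed`."""
    return (removed - reported).total_seconds() / 3600


for day in (10, 17, 24):
    reported = datetime(2007, 4, day)
    print(reported.date(), lifetime_hours(reported))
```

Each day later that a site is reported shortens its lifetime by exactly 24 hours, which produces the straight diagonal line in the April portion of the scatter plot.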
We did, however, see the same eﬀect for domain name
registrars removing rock-phish domains. Both .hk (Hong
Kong) domains (Figure 9-centre) and .cn (China) domains
(Figure 9-right) lasted much longer in their ﬁrst month of
use when compared to later months. These plots support
the often-espoused notion that attackers beneﬁt by continuously
seeking out new targets, and suggest that some of the
relative success of the rock-phish gang may come from their
rate of innovation rather than innate technical ability. Such
a conclusion is consistent with Ohm’s warnings against the
myth of the ‘Superuser’ [12].
6.5 Collusion dividend for rock-phish gang
Collusion has enabled the rock-phish gang to pool its resources
to its advantage. First, co-operation has strengthened
its defence by swapping between compromised machines
as they are removed by ISPs. Second, the gang can
impersonate many banks on each domain.
Such overt co-operation creates additional risks, however.
Notably, collusion increases the site’s value as a take-down
target. All the banks whose sites are present on the
rock-phish servers ought to be motivated to remove a site, not
just one bank as for regular phishing sites. The effectiveness
of phishing defence will be the sum of the banks’ efforts, so
if they are fully co-operating, then one might expect faster
take-down times. However, we were told (off the record) that
banks tend not to worry about rock-phish sites until their
brand is mentioned in spam emails. It is also possible that
some of the banks targeted by rock-phish sites are not
co-operating at all, but are instead free-riding on the efforts
of a few more capable organisations [20]. Given the longer
take-down times for rock-phish sites, it appears that currently
the benefits to the gang from collusion outweigh the costs –
at the present level of co-operation by the defenders.
6.6 DNS trade-offs
When phishing ﬁrst became widespread it often used domain
names which were minor variations on the real site’s
identity. This is now rather less common. One of the reasons
for this will be that it gives the defenders the option of
getting either the site removed or having the domain name
suspended. The latter approach is simpler since it requires
co-operation by relatively ‘clued-up’ registrars who are already
experienced in dealing with the branding implications
of too-similar domain names, rather than seeking help from
ISPs who might not be familiar with phishing attacks.
Figure 9: Scatter plot of phishing site lifetimes over time,
with one panel each for alice.it, .hk and .cn (x-axis: day
reported; y-axis: site lifetime in hours).
The rock-phish gang use nondescript domain names and
avoid this issue of branding, leaving the registrar with the
dilemma of whether to break a contract on the word of a
third party who claims that the domain name is being used
for phishing. That registrars are now prepared to suspend
the names is apparent from our data – though it is interesting
to note that at present no systematic attempt is being
made to suspend the names that are being used for the DNS
servers associated with the rock-phish domains. This is despite
these names being created solely for the purpose of
providing an indirection for the DNS servers used to resolve
the rock-phish URLs. The argument that these too are entirely
fraudulent is not yet won – though as can be seen from
Figure 1, when the rock-phish DNS system is disrupted the
eﬀect can be dramatic. Of course, when these name service
names are regularly suspended the gang will use absolute IP
addresses to locate their DNS servers, thereby continuing to
operate, albeit with slightly less ﬂexibility.
The ﬁnal trade-oﬀ of note that relates to DNS is the
caching mentioned in Section 4. Setting a high value for
‘time-to-live’ will ensure that domain names may be resolved,
particularly at larger ISPs, for some time after the
domain is suspended by a registrar. However, lower values
oﬀer more agility as compromised machines are reclaimed
by their owners.
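The caching side of this trade-off can be made concrete with a toy model. The sketch below is a simplification under stated assumptions: a resolver caches an answer at some moment before the suspension, honours the TTL exactly, and cannot refresh the record once the domain is suspended. The function name and the times are hypothetical.

```python
def still_resolvable(cached_at, ttl, suspended_at, query_at):
    """True if a cached answer is still served after the suspension.

    All arguments are seconds on a common clock. The answer is assumed
    to have been cached before the suspension and to expire exactly
    `ttl` seconds after it was cached.
    """
    expires_at = cached_at + ttl
    return suspended_at <= query_at < expires_at


# Answer cached at t=3500 s; domain suspended at t=3600 s; a victim
# clicks the link at t=7200 s.
print(still_resolvable(3500, 86400, 3600, 7200))  # one-day TTL
print(still_resolvable(3500, 300, 3600, 7200))    # five-minute TTL
```

With the long TTL the cached answer survives the suspension for most of a day, whereas the short TTL expires within minutes; the short TTL in turn lets the attacker repoint the name quickly when a machine is lost.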
6.7 Countermeasures
So if take-down strategies are not completely mitigating
phishing attacks, what else can be done?
One important advance would be to reduce the information
asymmetry for the defenders. Phishers obfuscate their
behaviour and make sites appear independent, so that
phishing appears to many to be an intractable problem. It
is in the interest of security vendors to accept inﬂated statistics
to make the problem seem more important. Indeed, such
inﬂation has occurred frequently, from PhishTank boasting
about the large number of sites it identiﬁes [13] to APACS,
the UK payment association, asserting a 726% increase in
phishing attacks between 2005 and 2006 (with merely a 44%
rise in losses) [1]. But law enforcement will not prioritise investigations
if there appear to be hundreds of small-scale
phishing attacks, whereas their response would be diﬀerent
if there were just a handful of criminals to catch. Hence,
improving the measurement systems, and better identifying
patterns of similar behaviour, will give defenders the opportunity
to focus their response upon a smaller number of
unique phishing gangs.
Other entirely obvious countermeasures include reducing
the availability of compromised machines, rate-limiting domain
registration, dissuading users from visiting the sites,
and reducing the damage that disclosing private information
can do. Unfortunately, these strategies are either infeasible
or are being attempted with limited impact so far. What
does seem to be working, at least to some extent, is for the
banks that are attacked to improve their back-oﬃce controls.
The incentives to go phishing are much reduced if miscreants
cannot use the account numbers and passwords they steal
to transfer money out of accounts; or if they cannot move
money out of the banking system in such a manner that the
transfers cannot be clawed back.
7. CONCLUSION
In this paper we have empirically measured phishing site
lifetimes and user response rates, to better understand the
impact of the take-down strategies of the institutions that
are being targeted. While take-down certainly hastens the
fraudsters’ movement from one compromised site to another,
many users continue to fall victim. Furthermore, the data
reveals that sophisticated attackers can extend site lifetimes.
Indeed, the rock-phish gang has already demonstrated techniques
for adapting to regular removal. They have invented
(or stumbled upon) a relatively successful formula, and with
‘fast-ﬂux’ are experimenting with another, but it is far from
clear that all the defenders currently understand what those
mechanisms are, and how best to disrupt them.
Removing phishing websites is often perceived as a
Sisyphean task, but our analysis shows that even when it
is done slowly, it does reduce the damage that is done. We
have also demonstrated wide disparities in reaction time between
comparable organisations. We have shown that these
disparities extend across borders, that some banks work faster
than others, and that some web-hosting companies do a better
job at removing sites. Improving the transparency of attacker
strategy and defender performance is key to reducing
the success of phishing scams.
There is still much work to be done to better understand
attack behaviour and the extent to which defenders
are pulling their weight. Much more analysis can be carried
out on the data we are collecting to show how well
they are doing. For instance, we could compare site lifetimes
categorised by hosting country in order to estimate
the externality impact diﬀerent countries impose on others.
We aim to improve the dataset in several ways: addressing
the bias against sites removed before inspection, identifying
non-rock-phish duplicates better, and improving completeness
by examining other data sources. Finally, we would
also like to study how size and perceived security practices
impact the way in which attackers select particular organisations
as targets. It may be that a brief display of competence
will send the attackers to another target, much as burglar
alarms protect you and not your neighbours.
8. ACKNOWLEDGEMENTS
Tyler Moore is supported by the UK Marshall Aid Commemoration
Commission and by US National Science Foundation
grant DGE-0636782. Richard Clayton is working on
the spamHINTS project, funded by Intel Research.
9. REFERENCES
[1] APACS: Card fraud losses continue to fall. Press
Release, 14 March 2007.
http://www.apacs.org.uk/media_centre/press/07_14_03.html
[2] L. Jean Camp: Reliable, Usable Signaling to Defeat
Masquerade Attacks. The Fifth Workshop on the
Economics of Information Security (WEIS 2006), 2006.
[3] Ionut Alex. Chitu: Google Redirect Notice. 16 Feb
2007.
http://googlesystem.blogspot.com/2007/02/google-redirect-notice.html
[4] Christine E. Drake, Jonathan J. Oliver, and Eugene J.
Koontz: Anatomy of a Phishing Email. First
Conference on Email and Anti-Spam (CEAS),
Mountain View, CA, USA, 2–3 Aug 2004.
[5] Gartner Inc: Gartner Says Number of Phishing
E-Mails Sent to U.S. Adults Nearly Doubles in Just
Two Years, Press Release, 9 Nov 2006.
http://www.gartner.com/it/page.jsp?id=498245
[6] Markus Jakobsson and Steven Myers (Eds.): Phishing
and Countermeasures: Understanding the Increasing
Problem of Electronic Identity Theft. Wiley, Nov
2006, ISBN: 978-0-471-78245-2.
[7] Robert McMillan: ‘Rock Phish’ blamed for surge in
phishing. InfoWorld, 12 Dec 2006.
http://www.infoworld.com/article/06/12/12/HNrockphish_1.html
[8] Microsoft Inc.: Phishing Filter: Help protect yourself
from online scams. 28 Oct 2006.
http://www.microsoft.com/athome/security/online/phishing_filter.mspx
[9] Daisuke Miyamoto, Hiroaki Hazeyama, and Youki
Kadobayashi: SPS: a simple ﬁltering algorithm to
thwart phishing attacks. Asian Internet Engineering
Conference (AINTEC), 13–15 Dec 2005.
[10] Mozilla Corp.: Phishing Protection. 2006.
http://www.mozilla.com/en-US/firefox/phishing-protection/
[11] Robert Mullen: The Lognormal Distribution of
Software Failure Rates: Origin and Evidence. 9th
International Symposium on Software Reliability
Engineering, IEEE, 1998, pp. 134–142.
[12] Paul Ohm: The Myth of the Superuser: Fear, Risk,
and Harm Online. University of Colorado Law Legal
Studies Research Paper No. 07-14, May 2007.
[13] OpenDNS: OpenDNS Shares April 2007 PhishTank
Statistics, Press Release, 1 May 2007.
http://www.opendns.com/about/press_release.php?id=14
[14] Ying Pan and Xuhua Ding: Anomaly Based Web
Phishing Page Detection. 22nd Annual Computer
Security Applications Conference (ACSAC’06), IEEE,
2006, pp. 381–392.
[15] PhishTank: http://www.phishtank.com/
[16] John S. Quarterman: PhishScope: Tracking Phish
Server Clusters. Digital Forensics Practice, 1(2), 2006,
pp. 103–114.
[17] Blake Ross, Collin Jackson, Nick Miyake, Dan Boneh
and John C Mitchell: Stronger Password
Authentication Using Browser Extensions. 14th
USENIX Security Symposium, 2005.
[18] Stuart E. Schechter, Rachna Dhamija, Andy Ozment
and Ian Fischer: The Emperor’s New Security
Indicators: An evaluation of website authentication
and the eﬀect of role playing on usability studies. 2007
IEEE Symposium on Security and Privacy.
[19] Rob Thomas and Jerry Martin: The underground
economy: priceless. USENIX ;login: 31(6), Dec 2006.
[20] Hal Varian: System Reliability and Free Riding. In
Economics of Information Security, L. J. Camp, S.
Lewis, eds. (Kluwer Academic Publishers, 2004), vol.
12 of Advances in Information Security, pp. 1–15.
[21] Webalizer: http://www.mrunix.net/webalizer/
[22] Liu Wenyin, Guanglin Huang, Liu Xiaoyue, Zhang
Min and Xiaotie Deng: Detection of Phishing
Webpages based on Visual Similarity. Proc. 14th
International World Wide Web Conference, ACM
Press, 2005, pp. 1060–1061.
[23] Min Wu, Robert C. Miller and Simson L. Garﬁnkel:
Do security toolbars actually prevent phishing
attacks? Proceedings of the SIGCHI conference on
Human Factors in computing systems (CHI’06), ACM
Press, 2006, pp. 601–610.
[24] Yue Zhang, Serge Egelman, Lorrie Cranor, and Jason
Hong: Phinding Phish: Evaluating Anti-Phishing
Tools. In Proceedings of the 14th Annual Network &
Distributed System Security Symposium (NDSS
2007), 2007.