
Protocol-level hidden server discovery

TL;DR: This paper investigates the Tor hidden server protocol and develops a hidden server discovery system, which consists of a Tor client, a Tor rendezvous point, and several Tor entry onion routers.
Abstract: Tor hidden services are commonly used to provide a TCP based service to users without exposing the hidden server's IP address in order to achieve anonymity and anti-censorship. However, hidden services are currently abused in various ways. Illegal content such as child pornography has been discovered on various Tor hidden servers. In this paper, we propose a protocol-level hidden server discovery approach to locate the Tor hidden server that hosts the illegal website. We investigate the Tor hidden server protocol and develop a hidden server discovery system, which consists of a Tor client, a Tor rendezvous point, and several Tor entry onion routers. We manipulate Tor cells, the basic transmission unit over Tor, at the Tor rendezvous point to generate a protocol-level feature at the entry onion routers. Once our controlled entry onion routers detect such a feature, we can confirm the IP address of the hidden server. We conduct extensive analysis and experiments to demonstrate the feasibility and effectiveness of our approach.

Summary (4 min read)

Introduction

  • Tor is a broadly-used low-latency anonymous communication system which supports TCP applications over the Internet [1].
  • Unfortunately, hidden services have been misused or abused for various illegal purposes.
  • The attacker controls both a malicious client and entry routers, and uses a simple passive timing analysis, i.e., cell counting, to discover the same pattern in the client’s traffic and the entry’s traffic to identify the hidden service at the entry side.
  • In Phases I and II, related information of these cells observed at their clients, entry routers, and the rendezvous point has been sent to their central server.

A. Components of the Tor Network

  • Figure 1 illustrates the basic architecture of Tor network.
  • The first three-byte header of the Tor cell is not encrypted so that the Tor onion router can read this header.
  • The first two bytes are the circuit ID, while the third byte indicates the specific command of this cell.
  • The authors categorize the Tor cell into two types: the control cell as illustrated in Figure 2 (a) and the relay cell as shown in Figure 2 (b).
  • The Command field of a relay cell is CELL_RELAY, which is used to relay the application data.

B. Circuit Selection and Creation

  • To communicate with an application server via Tor, a Tor client first downloads all of the onion router information from the directory server and uses source routing by choosing a series of onion routers as a route.
  • The authors call the sequence of onion routers the path through Tor.
  • After that, the client initiates the procedure of creating a circuit incrementally, one hop at a time.
  • Then the client sends a CELL_CREATE cell through the TLS connection and uses the Diffie-Hellman (DH) handshake protocol to negotiate a base key $K_1 = g^{xy}$ with the entry onion router, which responds with a CELL_CREATED cell.
  • Once the circuit is established, the client sends a RELAY_COMMAND_BEGIN cell to the exit onion router, and the cell is encrypted as $\{\{\{\text{Begin}\langle IP, Port\rangle\}_{kf_1}\}_{kf_2}\}_{kf_3}$, where the subscript refers to the key used for encryption of one specific onion skin.

C. How Does the Hidden Service Work

  • A hidden service involves six participants, including a Tor client, the directory server, onion routers, a rendezvous point, an introduction point, and a hidden server.
  • Once the introduction point is decided, the hidden server establishes a circuit to the introduction point.
  • This key is used for end-to-end encryption between client and server.
  • The rendezvous point obtains the RELAY_COMMAND_RENDEZVOUS1 cell and compares the rendezvous cookie from the cell with the one from the Tor client.
  • From the above procedure, the Tor client only knows the introduction point instead of the hidden server directly, and the hidden server only knows the rendezvous point instead of the Tor client directly.

A. Basic Idea

  • Since only entry onion routers may know the real IP address of a hidden server, the authors assume that they are able to control several entry onion routers.
  • The authors' rendezvous point manipulates an appropriate cell [9], and forwards the mangled cell to the hidden server.
  • The rendezvous point also reports this cell to the central server.
  • (v) To determine if the hidden server chooses one of their entry onion routers, the central server searches for correlation between the time when the rendezvous point sends the manipulated cell, the time when the rendezvous point receives the destroy cell, and the time when the entry onion router receives the destroy cell.
  • This is reasonable because onion routers are set up by volunteers.

B. Details of Protocol-level Hidden Server Discovery

  • The hidden server discovery process can be divided into three phases.
  • When the hidden server receives the RELAY_COMMAND_INTRODUCE2 cell forwarded from the introduction point, the hidden server will build a circuit to the rendezvous point as shown in Step 8 of Figure 5.
  • These cells are denoted as the protocol-level feature.
  • After the central server receives the manipulated cell information, it can first filter out the circuits along which their entry onion routers are not chosen as the first router, by counting the number of CELL_CREATE, CELL_CREATED and CELL_RELAY cells based on the Tor circuit creation protocol.
  • Eventually, the authors choose the CELL_DESTROY cell information received from the rendezvous point as the end-sign cell of Phase II.

C. Make the Discovery Automatic

  • The authors want to emphasize that their discovery of hidden server is conducted fully automatically.
  • The central server builds TCP connections to the Tor client, entry onion routers and rendezvous points, and receives the information from those nodes.
  • Subsequently, the central server receives the begin-sign cell of Phase II from the rendezvous point and records the cell type, circuit ID, and the timing of the begin-sign cell, denoted as Tb.
  • Once the circuit is found, the central server checks the timing Td of the CELL_DESTROY cell from the entry onion router against the condition Tb < Td < Te. In this way, the authors make the entire discovery process automatic.

IV. ANALYSIS

  • For their approach to be effective, the key issue is that one of their controlled entry onion routers should be selected as the entry onion router by the hidden server.
  • The authors analyze the chance that the hidden server selects one of their onion routers as the entry router, and propose a strategy that can greatly increase this chance without incurring a large cost.

A. Catch Probability

  • To evaluate the effectiveness of their discovery approach, the authors analyze the catch probability, i.e., the probability that a circuit from a hidden server selects one of their controlled onion routers as the entry onion router.
  • To start with, the authors need to introduce how Tor determines a circuit among the many possible paths.
  • To be specific, the onion routers are categorized into four classes: pure entry routers (entry guards), pure exit routers, both entry and exit routers (denoted as EE routers), and neither entry nor exit routers (denoted as N-EE routers).
  • According to the Tor onion router selection algorithm, the bandwidth of each entry onion router is weighted.
  • The authors intentionally denote the catch probability as P(k, b) to emphasize that the probability depends on the number of injected onion routers and the claimed bandwidth of each injected onion router.

B. How to Improve the Catch Probability

  • Based on Equation (2), it is easy to prove the following claim; the proof can be found in their technical report [10].
  • The catch probability is determined by the aggregated bandwidth contributed by the controlled Tor entry routers.
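
The dependence on aggregated bandwidth can be illustrated with a deliberately simplified model. The sketch below is not the paper's Equation (2), which is not reproduced in this summary; it only assumes that an entry router is picked with probability proportional to its bandwidth. The function name and the parameter `total_entry_bw` (the bandwidth of all other eligible entry routers) are hypothetical.

```python
def catch_probability(k: int, b: float, total_entry_bw: float) -> float:
    """Simplified P(k, b): chance that one bandwidth-weighted draw
    lands on one of k injected routers, each claiming bandwidth b.

    Assumption: selection probability is proportional to bandwidth.
    This is an illustration, not the paper's Equation (2).
    """
    injected = k * b  # aggregated bandwidth of the controlled routers
    if injected == 0:
        return 0.0
    return injected / (total_entry_bw + injected)


# Under this model only the product k * b matters:
# 10 routers at bandwidth 10 behave like 1 router at bandwidth 100.
same = catch_probability(10, 10.0, 100.0) == catch_probability(1, 100.0, 100.0)
```

In this model the probability grows with either the number of injected routers or their claimed bandwidth, which matches the qualitative statement above.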

V. EVALUATION

  • The authors have implemented the proposed Tor hidden service discovery approach in Section III.
  • The authors elaborate the results of the empirical evaluation of the approach.
  • The authors' experimental results match the theoretical analysis presented in Section IV well.

A. Experiment setup

  • Figure 9 illustrates the experiment setup for protocol-level hidden service discovery over the real-world Tor.
  • The authors deployed a Tor client, entry onion router, and central server at two different campuses in North America.
  • To implement the discovery, the authors revised the source code of Tor at the Tor client side, rendezvous point, and entry router in order to establish the connections to their central server and send the related information.
  • The version of Tor in their experiment is the latest stable version 0.2.2.37.
  • In addition, at the client side, the authors implemented a web client to automatically send an HTTP request for the specific onion address, and then installed the HTTP proxy, i.e., Privoxy [12], in order to relay the HTTP request into the OP.

B. Experiment Results

  • Table I gives the detection rate by their protocol-level hidden server discovery approach.
  • The experiments were repeated 1000 times to derive the true positive rate, i.e., the probability that the hidden server is detected if it uses their entry router.
  • One experiment lasts for around 15 seconds.
  • This figure demonstrates that the catch probability increases quickly with the number of controlled entry routers and the bandwidth of each entry router.
  • The current Tor network has 776 pure entry routers and 273 EE routers.

VI. DISCUSSION

  • There are various complicated cases in discovering a hidden server via their protocol-level discovery approach.
  • Once the hidden server chooses one of their entry routers, it will use their entry router in the following 30-60 days and will be exposed by their discovery approach.
  • Another complicated case is that the operator of a hidden server can choose three of the operator's own trusted entry guards or Tor bridges (see footnote 6) to avoid choosing the authors' surveillance entry guards.
  • Recall that in Phase I of their original approach, if their controlled middle router is selected by the hidden server side circuit, the middle router will receive one CELL_CREATE cell and three CELL_RELAY cells.
  • Footnote 6: Tor bridges are a type of hidden onion router that is not public in the directory server.
  • The two-step discovery approach above can also be used to tackle Complicated case I.

VIII. CONCLUSION

  • The hidden service over Tor is a double-edged sword.
  • A system that tracks down a hidden service can effectively deter malicious users from abusing the Tor network for illegal usage.
  • The authors design, implement, and evaluate such a system.
  • The authors' method augments the arsenal of existing detection tools, but it is unique in that the authors do not rely on conventional time-consuming traffic analysis or watermarking techniques.
  • The authors also expect the debate of detrimental sides of anonymous communication to continue in the long run, and hope to give a choice to law enforcement for tracking notorious hidden services.


Protocol-level Hidden Server Discovery
Zhen Ling, Junzhou Luo, Kui Wu and Xinwen Fu
Southeast University, Email: {zhenling, jluo}@seu.edu.cn
University of Victoria, Email: wkui@cs.uvic.ca
University of Massachusetts Lowell, Email: xinwenfu@cs.uml.edu
Abstract—Tor hidden services are commonly used to provide a
TCP based service to users without exposing the hidden server’s
IP address in order to achieve anonymity and anti-censorship.
However, hidden services are currently abused in various ways.
Illegal content such as child pornography has been discovered on
various Tor hidden servers. In this paper, we propose a protocol-
level hidden server discovery approach to locate the Tor hidden
server that hosts the illegal website. We investigate the Tor hidden
server protocol and develop a hidden server discovery system,
which consists of a Tor client, a Tor rendezvous point, and
several Tor entry onion routers. We manipulate Tor cells, the
basic transmission unit over Tor, at the Tor rendezvous point
to generate a protocol-level feature at the entry onion routers.
Once our controlled entry onion routers detect such a feature,
we can confirm the IP address of the hidden server. We conduct
extensive analysis and experiments to demonstrate the feasibility
and effectiveness of our approach.
Keywords-Anonymous Communication, Tor, Hidden Service
I. INTRODUCTION
Tor is a widely used low-latency anonymous communication
system which supports TCP applications over the Internet [1].
It provides users with anonymity service, helps fight
against Internet censorship, and supports hidden services to
preserve the anonymity of web services [2]. Tor was deployed
in late 2003 and comprised hundreds of onion routers, while
hidden services were released in early 2004. Due to increasingly
high demand for privacy protection, the Tor network has
seen steady growth, consisting of around 3000 volunteer-based
Tor onion routers as of July 2012.
Unfortunately, hidden services have been misused or abused
for various illegal purposes. They can host botnets and illegal
web content such as drug trading information [3] and pornography.
The consequence is severe: if botnets are deployed with
the hidden service over Tor [4], they are hard to take down
because of the anonymity protected by the hidden service;
if a hidden service hosts a child pornography website [5] [6]
(see footnote 1), the hidden service in effect blindly protects
content that is illegal in most countries.
Footnote 1: We believe that illegal content is hosted at this hidden
website, although we did not examine it because of legal concerns.
Existing research work [7], [8] has been carried out to
investigate the attacks which can locate the Tor hidden server.
The approach in [7] is based on traffic analysis. The attacker
controls both a malicious client and entry routers, and uses a
simple passive timing analysis, i.e., cell counting, to discover
the same pattern in the client’s traffic and the entry’s traffic
to identify the hidden service at the entry side. Murdoch [8]
presented a clock skew based approach to identifying whether
or not a given Tor node is a hidden server. The attacker
evaluates the load of a given Tor node. Since the server’s
temperature would increase while its workload rises, the
attacker can identify whether the node is the attacked hidden
service by estimating its temperature through measuring the
clock skew. Nevertheless, the attacks based on traffic analysis
may suffer a high rate of false positives due to various factors,
such as Internet traffic dynamics, the load of Tor nodes, and
the large number of cells for the purpose of statistical traffic
analysis.
In this paper, we propose a protocol-level discovery approach
to locating a hidden server by utilizing Tor protocol
features. We (law enforcement) control a Tor client, a
rendezvous point, several entry onion routers, and a central
server. The discovery takes three phases. In Phase I, our Tor
client continues to create circuits to the hidden server until
one of our entry routers sees a special combination of cells
of different types. Such a combination, denoted as a protocol-
level feature, comes from the Tor protocol that creates the
circuits between a client and a hidden server. However, even
if our entry router observes such a feature, it may result from
other clients that create circuits to a hidden server through
our entry router. Phase II is to confirm that the hidden server
chooses our entry router. We manipulate cells from our client
to incur a special decryption error at the hidden server, which
will destroy all circuits to the client. If our entry router sees
the destroy cell, we know that our entry router is chosen by
the hidden server. Phase III is used to correlate all the events
above. In Phases I and II, related information of these cells
observed at our clients, entry routers, and the rendezvous point
has been sent to our central server. In Phase III, we exploit the
timing information of these cells, and identify the correlation
to confirm that the target hidden server is behind our entry
router. In this way, we have located the hidden server.
Our approach has several unique advantages. First, it is easy
to deploy our detection system. Second, compared to traffic
analysis based methods, our approach is significantly faster,
fully automatic and can quickly locate the hidden server using
only several cells. Third, our approach is accurate, with an
observed detection rate of 100% and an observed false
positive rate of 0%. Fourth, our approach works on the protocol
level and is oblivious to traffic patterns; it is more general
and can be used to identify malicious hidden services. A
hidden server may also use its trusted entry routers or Tor
bridges as the first hop into the Tor network. We discuss those
complicated cases of tracking hidden servers in Section VI.

Fig. 1. Tor network (client (OP); directory servers; entry (OR1), middle (OR2) and exit (OR3) onion routers; server)
The rest of the paper is organized as follows. In Section II,
we introduce the components of Tor, its basic operations and
the protocol of hidden service. In Section III, we present the
basic idea of our approach and then elaborate the algorithm.
In Section IV, we analyze the effectiveness of the approach.
In Section V, we show experimental results on Tor, and we
discuss complicated cases of tracking hidden servers in Section
VI. Related work is reviewed in Section VII. The paper is
concluded in Section VIII.
II. BACKGROUND
In this section, we first introduce the Tor network and then
present its basic operations and the protocol of hidden service.
A. Components of the Tor Network
Figure 1 illustrates the basic architecture of Tor network.
The following components are involved in the typical use of
Tor network:
  • Tor clients. A Tor client installs local software referred
    to as an onion proxy (OP), which packs application data into
    equal-sized cells (512 bytes) and delivers them into the Tor
    network. A cell is the basic transmission unit of Tor.
  • Onion routers (OR). The onion routers relay the cells on
    behalf of the Tor client and server.
  • Directory servers. Directory servers hold the information
    of onion routers and hidden services, such as the public
    keys of routers and hidden servers.
  • Application servers. They support TCP applications such as
    a web service and an IRC service.
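(a) Tor cell format: Circ_id (2 bytes) | Command (1 byte) | Data (509 bytes)
(b) Tor relay cell format: Circ_id (2 bytes) | Command (1 byte) | Relay Command (1 byte) | Recognized (2 bytes) | Stream_id (2 bytes) | Integrity (4 bytes) | Length (2 bytes) | Data (498 bytes)
Fig. 2. Tor cell format [1]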
Figure 2 illustrates the format of the Tor cell. The first three-byte
header of the Tor cell is not encrypted so that the Tor onion
router can read this header. The first two bytes are the circuit ID,
while the third byte indicates the specific command
of this cell. We categorize the Tor cell into two types: the
control cell as illustrated in Figure 2 (a) and the relay cell as
shown in Figure 2 (b). The Command field of a control cell
can be, for instance, CELL_CREATE/CELL_CREATE_FAST
or CELL_CREATED/CELL_CREATED_FAST, employed for
establishing a new circuit; and CELL_DESTROY, used for
tearing down a circuit. The Command field of a relay cell is
CELL_RELAY, which is used to relay the application data. In
addition, there are numerous types of relay commands (Relay
Command), whose names have the form RELAY_COMMAND_X,
where “X” is a word. In this paper, when we mention the
RELAY_COMMAND_X cell, it indicates a relay cell whose
content is onion-like encrypted. We will elaborate on
these commands in later sections when we discuss the
Tor operations at the protocol level.
Fig. 3. Circuit creation (Create/Created and Relay Extend/Extended messages exchanged among the client (OP), entry OR, middle OR and exit OR over TLS-encrypted links; the link from the exit OR to the server is unencrypted). Legend: E(x) denotes RSA encryption, {X} denotes AES encryption, and CN denotes a circuit ID numbered N.
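
The byte layout described above can be made concrete with a small parsing sketch. It follows the field widths listed for Fig. 2; the numeric command values are taken from the Tor protocol specification as we recall it, not from this paper, and should be treated as an assumption. The function names are hypothetical, and `parse_relay_payload` is only meaningful on a payload whose onion layers have already been removed.

```python
import struct

CELL_LEN = 512  # fixed cell size stated in the text

# Command codes: an assumption based on the Tor specification;
# the paper only gives the symbolic names.
CELL_CREATE, CELL_CREATED, CELL_RELAY, CELL_DESTROY = 1, 2, 3, 4


def parse_cell(cell: bytes) -> dict:
    """Split a cell into its unencrypted 3-byte header and 509-byte payload."""
    if len(cell) != CELL_LEN:
        raise ValueError("a Tor cell is exactly 512 bytes")
    circ_id, command = struct.unpack(">HB", cell[:3])  # 2-byte ID, 1-byte command
    return {"circ_id": circ_id, "command": command, "payload": cell[3:]}


def parse_relay_payload(payload: bytes) -> dict:
    """Parse the 11-byte relay header of Fig. 2(b) from a decrypted payload."""
    relay_cmd, recognized, stream_id, integrity, length = struct.unpack(
        ">BHH4sH", payload[:11]
    )
    return {
        "relay_command": relay_cmd,
        "recognized": recognized,
        "stream_id": stream_id,
        "integrity": integrity,
        "data": payload[11:11 + length],  # at most 498 bytes of data
    }
```

An onion router that does not hold the relevant keys can read only what `parse_cell` returns, which is why the later discovery phases work with cell types and circuit IDs rather than cell contents.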
B. Circuit Selection and Creation
To communicate with an application server via Tor, a Tor
client first downloads all of the onion router information from
the directory server and uses source routing by choosing a
series of onion routers as a route. We call the sequence of
onion routers the path through Tor. The number of onion
routers is called the path length. In the Tor network, a path is
also called a circuit, thus we use path/circuit interchangeably
in this paper. We employ the default path length of 3 as an
example in Figure 1 to show how a path is chosen. The client
first selects an appropriate exit onion router (OR3), which
should have an exit policy supporting the relay of the TCP
stream from the client. Then, the client chooses a proper entry
onion router (OR1) (also referred to as entry guard) and a
middle onion router (OR2). After that, the client initiates the
procedure of creating a circuit incrementally, one hop at a
time. Eventually, the client can communicate with the remote
server through this circuit, i.e., OR1 → OR2 → OR3.
Figure 3 illustrates the procedure by which a client builds a
circuit. As shown in Figure 3, the client first establishes a
TLS connection with the entry router.
Then the client sends a CELL_CREATE cell through the
TLS connection and uses the Diffie-Hellman (DH) handshake
protocol to negotiate a base key $K_1 = g^{xy}$ with the entry onion
router, which responds with a CELL_CREATED cell. Note that
$H(K_1)$ in Figure 3 is the hash value of $K_1$. From this base
key material, a forward symmetric key $kf_1$ and a backward
symmetric key $kb_1$ are generated. In this way, the first hop
of this circuit, denoted as C1, is created. Similarly, the client
extends the circuit to include the second hop (C2) and the
third hop (C3) of the circuit.
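
The key negotiation for one hop can be sketched as follows. This is a toy illustration only: the group parameters are far too small for real use, the RSA encryption of the client's DH value shown in Figure 3 is omitted, and the way the forward and backward keys are derived from the base key is a hypothetical stand-in rather than Tor's actual key derivation.

```python
import hashlib
import secrets

# Toy Diffie-Hellman group (assumed for illustration, not Tor's parameters).
P = 2**127 - 1  # a Mersenne prime
G = 3


def dh_keypair() -> tuple:
    """Return (private exponent, public value g^x mod P)."""
    x = secrets.randbelow(P - 3) + 2
    return x, pow(G, x, P)


def base_key(private: int, peer_public: int) -> int:
    """Both sides compute the same K = g^(xy) mod P."""
    return pow(peer_public, private, P)


def derive_hop_keys(k: int) -> tuple:
    """Return (H(K), kf, kb) using a hypothetical hash-based derivation."""
    raw = k.to_bytes(16, "big")
    h_k = hashlib.sha1(raw).digest()                     # sent back as H(K)
    kf = hashlib.sha1(raw + b"forward").digest()[:16]    # forward key
    kb = hashlib.sha1(raw + b"backward").digest()[:16]   # backward key
    return h_k, kf, kb


# Client side (x) and entry router side (y) of the first hop.
x, gx = dh_keypair()
y, gy = dh_keypair()
client_keys = derive_hop_keys(base_key(x, gy))
router_keys = derive_hop_keys(base_key(y, gx))
```

The client can check the hash it receives in the CELL_CREATED cell against its own `H(K)` before using the hop, which is the role $H(K_1)$ plays in Figure 3.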
Figure 4 shows the procedure of the data transmission over
the circuit. Once the circuit is established, the client sends a
RELAY_COMMAND_BEGIN cell to the exit onion router, and
the cell is encrypted as $\{\{\{\text{Begin}\langle IP, Port\rangle\}_{kf_1}\}_{kf_2}\}_{kf_3}$,
where the subscript refers to the key used for encryption of
one specific onion skin. The three layers of onion skin are
removed one by one each time the cell traverses an onion
router through the circuit. When the exit onion router removes
the last onion skin by decryption, it recognizes that the request
intends to open a TCP stream to a port at the destination IP
pointing to the remote server. Therefore, the exit onion router
acts as a proxy, builds a TCP connection with the server, and
sends a RELAY_COMMAND_CONNECTED cell back to the
client. Then the client can download the file.
Fig. 4. Data transmission over the circuit (Begin, Connected, Data and End relay cells exchanged among the client (OP), entry OR, middle OR and exit OR over TLS-encrypted links; the TCP handshake, data and teardown between the exit OR and the server are unencrypted)
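
The layered encryption can be illustrated with a short sketch in which each onion router strips exactly one layer. It is a simplification under stated assumptions: a hash-based XOR keystream stands in for the AES encryption used by Tor, the key values are placeholders, and the helper names are hypothetical. Because XOR layers commute, the sketch does not depend on the order in which the client applies the three keys.

```python
import hashlib


def keystream(key: bytes, n: int) -> bytes:
    """Generate n pseudo-random bytes from key (stand-in for a real cipher)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one onion skin; XOR is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(payload: bytes, hop_keys: list) -> bytes:
    """Client side: apply one layer per hop key."""
    for key in hop_keys:
        payload = xor_layer(key, payload)
    return payload


def relay_through(cell: bytes, hop_keys: list) -> list:
    """Return what each router forwards after stripping its own layer."""
    forwarded = []
    for key in hop_keys:  # entry, middle, exit in path order
        cell = xor_layer(key, cell)
        forwarded.append(cell)
    return forwarded


keys = [b"kf1-placeholder", b"kf2-placeholder", b"kf3-placeholder"]
request = b"Begin<192.0.2.10,80>"
hops = relay_through(wrap(request, keys), keys)
```

Only the last element of `hops` equals the plaintext request, mirroring the statement that the exit onion router is the first node able to read the Begin request.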
Fig. 5. Tor hidden service (client (OP), directory servers, rendezvous point (RPO), introduction point (IPO) and hidden server, connected through circuits of entry, middle and exit onion routers)
C. How Does the Hidden Service Work
A hidden service involves six participants, including a Tor
client, the directory server, onion routers, a rendezvous point,
an introduction point, and a hidden server. Since the first
three participants have been described above, we will introduce
the functionality of the rendezvous point and the introduction
point and how they work together to support a hidden service.
  • Introduction point (IPO). An introduction point is selected
    by the hidden server and published in the directory server
    together with the descriptor of the hidden service.
    Once the introduction point is decided, the hidden server
    establishes a circuit to the introduction point. Then the
    introduction point acts as the front interface to the Tor client:
    it waits until a Tor client creates a three-hop circuit to
    the introduction point and forwards the request data from
    the Tor client side circuit to the hidden server side circuit.
  • Rendezvous point (RPO). A rendezvous point is chosen
    by the Tor client. Both the Tor client and the hidden
    server will establish a three-hop circuit to the RPO, which
    acts as a message relay to transmit the application data
    between the hidden server side circuit and the Tor client
    side circuit.
  • Hidden server. A hidden server provides various TCP
    applications such as a web server and an IRC server. It could
    be deployed over an OP or an OR.
Figure 5 depicts the procedure of establishing a connection
between the Tor client and the specific hidden server.
1) The hidden server first selects several onion
routers as introduction points and builds the
circuits to these introduction points. To build
a circuit, the hidden server will send the
RELAY_COMMAND_ESTABLISH_INTRO cell to an
introduction point, and the introduction point will reply
with the RELAY_COMMAND_INTRO_ESTABLISHED
cell to inform the hidden server that the circuit is
established.
2) Once the circuits to introduction points are established,
the hidden server establishes a circuit to the directory
server and advertises the service descriptor to the directory
server, including the public key of the hidden server
and the information regarding the introduction points.
Then the owner of the hidden server can post the onion
address (see footnote 2) in a public place to attract users
to access the hidden service via Tor.
3) When a Tor client obtains the onion address, the client
creates a circuit to the directory server and fetches the
relevant information advertised by the hidden service.
Then the client learns the introduction points of the
hidden service.
4) The client selects a rendezvous point and builds a
circuit to the rendezvous point. The client will send a
RELAY_COMMAND_ESTABLISH_RENDEZVOUS
cell, which carries a rendezvous cookie, to
the rendezvous point, which replies with a
RELAY_COMMAND_RENDEZVOUS_ESTABLISHED
cell to indicate the successful circuit establishment.
5) The client creates a three-hop circuit to one
of the introduction points and transmits a
RELAY_COMMAND_INTRODUCE1 cell to the chosen
introduction point. The cell carries information such
as the rendezvous point, the rendezvous cookie and the
Diffie-Hellman data $g^x$ generated by the Tor client.
6) Once the introduction point receives the
RELAY_COMMAND_INTRODUCE1 cell, it replies
with a RELAY_COMMAND_INTRODUCE_ACK cell to
the client. After the client receives this ACK cell, it
tears down this circuit to the introduction point.
7) The introduction point repacks the
RELAY_COMMAND_INTRODUCE1 cell into a
RELAY_COMMAND_INTRODUCE2 cell, and then
sends the RELAY_COMMAND_INTRODUCE2 cell to
the hidden server.
8) After the hidden server receives the
RELAY_COMMAND_INTRODUCE2 cell, it knows
the information of the rendezvous point, the rendezvous
cookie and the Diffie-Hellman data $g^x$. The hidden server
can generate the Diffie-Hellman data $g^y$ and derive
the key $K = g^{xy}$. This key is used for end-to-end
encryption between client and server. Then the hidden
server builds a circuit to the rendezvous point and
sends a RELAY_COMMAND_RENDEZVOUS1 cell to
the rendezvous point via the circuit. The cell includes
the key data $g^y$, the hash value of the key $H(K)$ and
the rendezvous cookie.
9) The rendezvous point obtains the
RELAY_COMMAND_RENDEZVOUS1 cell and
compares the rendezvous cookie from the cell
with the one from the Tor client. Once the
rendezvous cookies are matched, the rendezvous
point removes the rendezvous cookie from the
RELAY_COMMAND_RENDEZVOUS1 cell and repacks
the remaining data into a RELAY_COMMAND_RENDEZVOUS2
cell, and then forwards the cell to the client.
10) When the Tor client receives the
RELAY_COMMAND_RENDEZVOUS2 cell, it can generate
the key $K = g^{xy}$ using $g^y$ and verify it based on
$H(K)$. The key is used to encrypt the data between
client and server and is the same as the one generated
by the hidden server at Step 8. In this way, the client
and hidden server complete the handshake. Then the
client sends a RELAY_COMMAND_BEGIN cell to
establish a stream to the hidden server via the six-hop
circuit.
Footnote 2: The onion address is generated by the hidden server. It is a
hostname of the form “x.onion”, where “x” consists of 16 random characters.
From the above procedure, the Tor client only knows the
introduction point instead of the hidden server directly, and
the hidden server only knows the rendezvous point instead
of the Tor client directly. In addition, the introduction point
acts as the front interface to the Tor client for service query
and request only, and it does not get involved any more once
the six-hop circuit passing through the rendezvous point is
established for data communication between the Tor client
and the hidden server. Since either introduction point or
rendezvous point knows neither the location of Tor client nor
the location of hidden server, anonymous web service is hence
achieved. Note that in the Tor network, only the entry onion
router knows IP addresses of hidden servers.
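
The cookie matching performed by the rendezvous point in Steps 4 and 9 can be sketched as a small lookup table. This is a minimal illustration under assumptions: the class and method names are hypothetical, circuits are represented by plain integers, and treating a cookie as single-use is our simplification rather than something stated in the text.

```python
from dataclasses import dataclass, field


@dataclass
class RendezvousPoint:
    # Maps a rendezvous cookie to the client-side circuit waiting on it.
    waiting: dict = field(default_factory=dict)

    def establish_rendezvous(self, client_circ: int, cookie: bytes) -> str:
        """Step 4: remember the cookie and acknowledge the client."""
        self.waiting[cookie] = client_circ
        return "RELAY_COMMAND_RENDEZVOUS_ESTABLISHED"

    def rendezvous1(self, cookie: bytes, gy: bytes, key_hash: bytes):
        """Step 9: match the cookie, strip it, and repack the rest.

        Returns (client circuit, RENDEZVOUS2 message), or None when the
        cookie is unknown.
        """
        client_circ = self.waiting.pop(cookie, None)  # assumed single-use
        if client_circ is None:
            return None
        return client_circ, ("RELAY_COMMAND_RENDEZVOUS2", gy, key_hash)


rp = RendezvousPoint()
ack = rp.establish_rendezvous(client_circ=11, cookie=b"c" * 20)
joined = rp.rendezvous1(b"c" * 20, gy=b"<g^y>", key_hash=b"<H(K)>")
```

Note that the rendezvous point handles only the cookie, $g^y$ and $H(K)$; it never learns the addresses at either end, which is consistent with the anonymity argument above.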
III. PROTOCOL-LEVEL HIDDEN SERVER DISCOVERY
In this section, we first introduce the basic idea of discovering
a Tor hidden server and then present detailed algorithms.
A. Basic Idea
Since only entry onion routers may know the real IP address
of a hidden server, we assume that we are able to control
several entry onion routers (see footnote 3). In addition, we need
a client and rendezvous point to cooperate with entry onion routers.
A central server is used to record information of related
cells forwarded from the Tor client, entry onion routers, and
rendezvous point.
The discovery is conducted as follows:
(i) Our Tor client obtains the introduction point information from the
directory server, builds a circuit to the introduction point, and reports
the circuit creation to our central server to indicate the start of
discovery.
(ii) The hidden server also establishes a circuit to the rendezvous point.
If the hidden server chooses our entry router, our entry router will see a
special combination of cells of different types, denoted as protocol-level
features, during the creation of those circuits. However, such
protocol-level features do not necessarily imply that the hidden server
chose our entry router. We perform the following actions to confirm that
it is a true positive.
(iii) Once the connection is established between the client and the hidden
server, the client sends cells that contain application data to the hidden
server, as illustrated in Step 10 of Figure 5. Our rendezvous point
manipulates an appropriate cell [9] and forwards the mangled cell to the
hidden server. The rendezvous point also reports this cell to the central
server.
(iv) The mangled cell arrives at the hidden server. Since the hidden server
cannot correctly decrypt the manipulated cell, it destroys the circuit
between the client and the hidden server by sending a destroy cell to the
client. This cell traverses the circuit to the client. The rendezvous point
detects it and reports it to the central server to indicate the end of the
discovery process. In addition, our controlled entry onion routers report
to the central server immediately when a destroy cell is received.
(v) To determine whether the hidden server chose one of our entry onion
routers, the central server searches for a correlation between the time
when the rendezvous point sends the manipulated cell, the time when the
rendezvous point receives the destroy cell, and the time when the entry
onion router receives the destroy cell. Since the entry onion router knows
the IP address of the circuit creator, once such a time correlation is
found, we can identify the hidden server.
Figure 6 illustrates the work flow of the protocol-level hidden server
discovery approach.
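The time correlation in step (v) can be illustrated with a minimal sketch. It assumes that all reporting nodes share a synchronized clock and that the entry router's destroy cell must be observed between the moment the rendezvous point sends the manipulated cell and the moment the rendezvous point receives the destroy cell. The record type `DestroyReport`, the function `correlate`, and the `clock_skew` tolerance are hypothetical names introduced only for illustration; the source does not specify the matching rule at this level of detail.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DestroyReport:
    """Hypothetical record an entry router sends to the central server."""
    entry_router: str   # label of the reporting entry router
    circuit_id: int     # circuit on which the destroy cell was seen
    source_ip: str      # address of the circuit creator
    timestamp: float    # seconds, on the assumed shared clock


def correlate(t_sent: float,
              t_rp_destroy: float,
              reports: Iterable[DestroyReport],
              clock_skew: float = 0.0) -> List[DestroyReport]:
    """Return reports whose destroy cell falls inside the window.

    t_sent:       time the rendezvous point sent the manipulated cell
    t_rp_destroy: time the rendezvous point received the destroy cell
    clock_skew:   assumed tolerance for imperfect synchronization
    """
    lower = t_sent - clock_skew
    upper = t_rp_destroy + clock_skew
    return [r for r in reports if lower <= r.timestamp <= upper]


# Example with made-up timestamps and documentation-range addresses.
reports = [
    DestroyReport("or-a", 7, "192.0.2.10", 100.35),
    DestroyReport("or-b", 9, "198.51.100.4", 97.00),
]
candidates = correlate(t_sent=100.0, t_rp_destroy=100.6, reports=reports)
```

A report surviving this filter is only a candidate; a narrow window and repeated trials would presumably be needed to keep coincidental destroy cells from matching.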
[Figure 6: diagram of the Tor network showing the client (OP), the hidden server, the entry, middle, and exit onion routers on each circuit, the IPO, and the RPO. Legend: Receive Create/Relay Cell; Send Intro Cell; Modify Begin Cell; Receive Destroy Cell.]
Fig. 6. Circuit creation and data transmission
Footnote 3: The same assumption was made in virtually all attacks on the Tor
network. It is reasonable because onion routers are set up by volunteers.

B. Details of Protocol-level Hidden Server Discovery
The hidden server discovery process can be divided into
three phases. Phase I: Presumably identify the hidden server
- the client continues to create circuits to the hidden server
until one of our entry routers sees a special combination of
cells of different types, i.e., protocol-level features. Phase II:
Verify the hidden server - our rendezvous point manipulates a
data cell and creates a decryption error at the hidden server,
which has to send out a destroy cell to destroy the circuit.
The destroy cell can be recognized by our entry router if the
hidden server uses our entry router. Phase III: Conclude by
time correlation - the central server uses the timing information
of the collected cells to correlate the unique sequence of events
produced by our discovery actions, determines whether the presumably
identified entry router was chosen by the hidden server, and locates
the hidden server accordingly.
[Figure 7: message sequence among the hidden server, the entry OR, the middle OR, the exit OR, and the rendezvous point; every link is TLS-encrypted. The hidden server exchanges Create_Fast/Created_Fast cells with the entry OR on circuit C1, extends the circuit hop by hop with Relay Extend/Extended and Create/Created cells (circuits C2 to C4), and finally sends a Rendezvous1 cell to the rendezvous point. Legend: E(x) --- RSA encryption; {X} --- AES encryption; CN --- a circuit ID numbered N.]
Fig. 7. Hidden server creating a circuit to the rendezvous point
Phase I: Presumably identify the hidden server. Recall
that the Tor client can derive the introduction point information
from the directory server, as illustrated in Step 3 of Figure 5. After
the client establishes a circuit to the rendezvous point, the
client sends a RELAY_COMMAND_INTRODUCE1 cell to
the introduction point in order to negotiate the Diffie-Hellman
key with the hidden server. We select this cell as the start signal
of our discovery approach and report it to the
central server.
When the hidden server receives the
RELAY_COMMAND_INTRODUCE2 cell forwarded from
the introduction point, it builds a circuit to
the rendezvous point, as shown in Step 8 of Figure 5. Once the
circuit is established, the hidden server promptly sends a
RELAY_COMMAND_RENDEZVOUS1 cell to the rendezvous
point. The circuit creation process is illustrated in Figure 7. As
Figure 7 shows, the entry onion router receives
one CELL_CREATE_FAST cell and four CELL_RELAY
cells, including a RELAY_COMMAND_INTRODUCE2 cell,
and relays one CELL_CREATED_FAST cell and three
CELL_RELAY cells to the hidden server (see footnote 4).
We refer to this combination of cells as the protocol-level feature.
Moreover, the entry onion router reports the related information of
each cell, including the cell type, the circuit ID, and the IP address
of the circuit creator, to the central server. Furthermore, once the
rendezvous point receives the RELAY_COMMAND_RENDEZVOUS1 cell,
it reports this information to the
central server immediately.
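The Phase I check at an entry router can be sketched as a per-circuit count of cell types in each direction. This is a minimal illustration, not the authors' implementation: the direction labels `"in"` (from the circuit creator) and `"out"` (to the circuit creator), the cell-type strings, and the function `matches_feature` are assumptions made for the example. Only the counts (one create-fast plus four relay cells received, one created-fast plus three relay cells relayed back) come from the text.

```python
from collections import Counter
from typing import Iterable, Tuple

# Counts taken from the description of Figure 7 in the text.
EXPECTED_RECEIVED = Counter({"CELL_CREATE_FAST": 1, "CELL_RELAY": 4})
EXPECTED_SENT = Counter({"CELL_CREATED_FAST": 1, "CELL_RELAY": 3})


def matches_feature(cells: Iterable[Tuple[str, str]]) -> bool:
    """Check one circuit's cells against the protocol-level feature.

    cells: (direction, cell_type) pairs observed on a single circuit,
           where direction is "in" (from the circuit creator) or
           "out" (relayed back to the circuit creator).
    """
    cells = list(cells)
    received = Counter(t for d, t in cells if d == "in")
    sent = Counter(t for d, t in cells if d == "out")
    return received == EXPECTED_RECEIVED and sent == EXPECTED_SENT


# A circuit that follows the sequence described for Figure 7.
observed = [
    ("in", "CELL_CREATE_FAST"), ("out", "CELL_CREATED_FAST"),
    ("in", "CELL_RELAY"), ("out", "CELL_RELAY"),
    ("in", "CELL_RELAY"), ("out", "CELL_RELAY"),
    ("in", "CELL_RELAY"), ("out", "CELL_RELAY"),
    ("in", "CELL_RELAY"),
]
is_candidate = matches_feature(observed)
```

Because the entry router cannot read relay commands inside encrypted cells, the sketch relies only on the outer cell type, which is consistent with the text's remark that a match is presumptive and must be verified in Phase II.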
[Figure 8: message sequence among the rendezvous point, the exit OR, the middle OR, the entry OR, and the hidden server over TLS links. The modified relay cell (payload shown as XXXX) is forwarded on circuits C4', C3', C2', and C1' to the hidden server, and the resulting Destroy cell (carrying a Reason) travels back on C1', C2', C3', and C4'.]
Fig. 8. Modify a cell at the RPO
Phase II: Verify the hidden server. Recall
that after the rendezvous point receives a
RELAY_COMMAND_RENDEZVOUS1 cell from the hidden server,
it repacks it into a RELAY_COMMAND_RENDEZVOUS2
cell and forwards it to the client, as illustrated in
Steps 8 and 9 of Figure 5. The client then sends a
RELAY_COMMAND_BEGIN cell to the hidden server
through the circuit in order to open a stream between the
client and the hidden server. Our controlled rendezvous
point can detect this special cell based on the hidden service
protocol, even though it cannot decrypt the cell
and obtain its content. Once the rendezvous point catches
this cell, we modify one bit of the cell and forward it to
the hidden server. Because of the lack of integrity verification,
the other onion routers cannot detect the manipulated cell. For
detection purposes, the rendezvous point sends the
timestamp of the manipulated cell to the central server.
Figure 8 illustrates the procedure of modifying the cell at the
rendezvous point.
When the manipulated cell reaches the hidden server, the
hidden server cannot correctly recognize it. According
to the design of Tor, the hidden server tears down the
circuit between the client and the hidden server by promptly
sending a CELL_DESTROY cell. The controlled entry onion router
is the first router to receive this cell, and it reports
the cell type, the timestamp of the cell, the circuit ID, and the
source IP address of the cell to the central server. The
rendezvous point receives this CELL_DESTROY cell as
well and also reports its timestamp
to the central server.
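Why a single flipped bit makes the cell unrecognizable at the endpoint can be shown with a toy model. The sketch below is an assumption-laden illustration, not Tor's cell format or cipher: it uses a hash-derived keystream in place of the real stream cipher, a single encryption layer instead of one layer per hop, and a 4-byte SHA-1 prefix as the integrity digest. All function names are hypothetical.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream: concatenated SHA-256 blocks of key || counter."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]


def xor_layer(data: bytes, key: bytes) -> bytes:
    """Apply or remove one stream-cipher layer (XOR is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def seal(payload: bytes) -> bytes:
    """Toy cell body: 4-byte digest of the payload, then the payload."""
    return hashlib.sha1(payload).digest()[:4] + payload


def recognized(plain: bytes) -> bool:
    """Endpoint check: does the digest prefix match the payload?"""
    return plain[:4] == hashlib.sha1(plain[4:]).digest()[:4]


key = b"toy shared key"
cell = xor_layer(seal(b"BEGIN example-stream"), key)

# A relay without the key flips one ciphertext bit, which flips the
# same bit of the plaintext once the layer is removed.
mangled = bytearray(cell)
mangled[-1] ^= 0x01

intact_ok = recognized(xor_layer(cell, key))
mangled_ok = recognized(xor_layer(bytes(mangled), key))
```

In this model the relay that flips the bit needs no key, and the hop that merely strips or forwards a layer has nothing to check against, so only the endpoint holding the digest notices the change.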
Phase III: Conclude by time correlation. Since the central
server may receive many cells from our entry onion routers,
we carefully choose several appropriate feature cells and
use them to filter out useless cell information. The central
server records the source IP address of each cell, circuit ID,
Footnote 4: CELL_CREATE_FAST and CELL_CREATED_FAST cells are used
for the first-hop creation by the hidden server instead of CELL_CREATE and
CELL_CREATED.
