Proceedings Article

Protocol-level hidden server discovery

Zhen Ling¹, Junzhou Luo¹, Kui Wu², Xinwen Fu³

Southeast University¹, University of Victoria², University of Massachusetts Lowell³

14 Apr 2013, pp. 1043-1051

TL;DR: This paper investigates the Tor hidden server protocol and develops a hidden server discovery system, which consists of a Tor client, a Tor rendezvous point, and several Tor entry onion routers.


Abstract: Tor hidden services are commonly used to provide a TCP-based service to users without exposing the hidden server's IP address, in order to achieve anonymity and anti-censorship. However, hidden services are currently abused in various ways. Illegal content such as child pornography has been discovered on various Tor hidden servers. In this paper, we propose a protocol-level hidden server discovery approach to locate the Tor hidden server that hosts the illegal website. We investigate the Tor hidden server protocol and develop a hidden server discovery system, which consists of a Tor client, a Tor rendezvous point, and several Tor entry onion routers. We manipulate Tor cells, the basic transmission unit over Tor, at the Tor rendezvous point to generate a protocol-level feature at the entry onion routers. Once our controlled entry onion routers detect such a feature, we can confirm the IP address of the hidden server. We conduct extensive analysis and experiments to demonstrate the feasibility and effectiveness of our approach.


Summary


Introduction

Tor is a broadly-used low-latency anonymous communication system which supports TCP applications over the Internet [1].
Unfortunately, hidden services have been misused or abused for various illegal purposes.
The attacker controls both a malicious client and entry routers, and uses a simple passive timing analysis, i.e., cell counting, to find the same pattern in the client's traffic and the entry's traffic and thereby identify the hidden service at the entry side.
In Phases I and II, related information of these cells observed at their clients, entry routers, and the rendezvous point has been sent to their central server.

A. Components of the Tor Network

Figure 1 illustrates the basic architecture of the Tor network.
The first three bytes of a Tor cell form a header that is not encrypted, so that the Tor onion router can read it.
The first two bytes are the circuit ID, while the third byte indicates the specific command of the cell.
The authors categorize Tor cells into two types: the control cell, illustrated in Figure 2(a), and the relay cell, shown in Figure 2(b).
The Command field of a relay cell is CELL RELAY, which is used to relay application data.

B. Circuit Selection and Creation

To communicate with an application server via Tor, a Tor client first downloads all of the onion router information from the directory server and uses source routing by choosing a series of onion routers as a route.
The authors call the sequence of onion routers the path through Tor.
After that, the client initiates the procedure of creating a circuit incrementally, one hop at a time.
Then the client sends a CELL CREATE cell through the TLS connection and uses the Diffie-Hellman (DH) handshake protocol to negotiate a base key $K_1 = g^{x_1 y_1}$ with the entry onion router, which responds with a CELL CREATED cell.
Once the circuit is established, the client sends a RELAY COMMAND BEGIN cell to the exit onion router, and the cell is encrypted as $\{\{\{\text{Begin}\langle IP, Port\rangle\}_{kf_1}\}_{kf_2}\}_{kf_3}$, where the subscript refers to the key used for encryption of one specific onion skin.

C. How Does the Hidden Service Work

A hidden service involves six participants, including a Tor client, the directory server, onion routers, a rendezvous point, an introduction point, and a hidden server.
Once the introduction point is decided, the hidden server establishes a circuit to the introduction point.
This key is used for end-to-end encryption between client and server.
The rendezvous point obtains the RELAY COMMAND RENDEZVOUS1 cell and compares the rendezvous cookie in the cell with the one received from the Tor client.
From the above procedure, the Tor client only knows the introduction point instead of the hidden server directly, and the hidden server only knows the rendezvous point instead of the Tor client directly.

A. Basic Idea

Since only entry onion routers may know the real IP address of a hidden server, the authors assume that they are able to control several entry onion routers.
The authors' rendezvous point manipulates an appropriate cell [9] and forwards the mangled cell to the hidden server.
The rendezvous point also reports this cell to the central server.
(v) To determine if the hidden server chooses one of their entry onion routers, the central server searches for correlation between the time when the rendezvous point sends the manipulated cell, the time when the rendezvous point receives the destroy cell, and the time when the entry onion router receives the destroy cell.
This is reasonable because onion routers are set up by volunteers.

B. Details of Protocol-level Hidden Server Discovery

The hidden server discovery process can be divided into three phases.
When the hidden server receives the RELAY COMMAND INTRODUCE2 cell forwarded from the introduction point, the hidden server will build a circuit to the rendezvous point as shown in Step 8 of Figure 5.
These cells are denoted as the protocol-level feature.
After the central server receives the manipulated cell information, it can first filter out the circuits on which their entry onion routers were not chosen as the first router, by counting the number of CELL CREATE, CELL CREATED and CELL RELAY cells according to the Tor circuit creation protocol.
Eventually, the authors choose the CELL DESTROY cell information received from the rendezvous point as the end-sign cell of Phase II.

C. Make the Discovery Automatic

The authors want to emphasize that their discovery of hidden server is conducted fully automatically.
The central server builds TCP connections to the Tor client, entry onion routers and rendezvous points, and receives the information from those nodes.
Subsequently, the central server receives the begin-sign cell of Phase II from the rendezvous point and records the cell type, circuit ID, and the timing of the begin-sign cell, denoted as Tb.
Once the circuit is found, the central server compares the timing of the CELL DESTROY cell from the entry onion router by using the condition Tb < Td < Te. Eventually, the authors make the entire discovery process automatic.

IV. ANALYSIS

For their approach to be effective, the key issue is that one of their controlled entry onion routers should be selected as the entry onion router by the hidden server.
The authors analyze the chance that the hidden server selects one of their onion routers as the entry router, and propose a strategy that can greatly increase this chance without incurring a large cost.

A. Catch Probability

To evaluate the effectiveness of their discovery approach, the authors analyze the catch probability, i.e., the probability that a circuit from a hidden server selects one of their controlled onion routers as the entry onion router.
To start with, the authors need to introduce how Tor determines a circuit among the many possible paths.
To be specific, the onion routers are categorized into four classes: pure entry routers (entry guards), pure exit routers, both entry and exit routers (denoted as EE routers), and neither entry nor exit routers (denoted as N-EE routers).
According to the Tor onion router selection algorithm, the bandwidth of each entry onion router is weighted.
The authors intentionally denote the catch probability as P(k, b) to emphasize that the probability depends on the number of injected onion routers and the claimed bandwidth of each injected onion router.

B. How to Improve the Catch Probability

Based on Equation (2), it is easy to prove the following claim; the proof can be found in their technical report [10].
The catch probability is determined by the aggregated bandwidth contributed by the controlled Tor entry routers.
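
The dependence on aggregate bandwidth can be illustrated with a deliberately simplified model. The sketch below is not Equation (2) from the paper, which is not reproduced in this summary: it assumes that an entry router is picked with probability proportional to its claimed bandwidth and ignores Tor's per-class weighting of entry, exit and EE routers. The function names and the parameter `honest_bandwidth` are hypothetical.

```python
def catch_probability(k: int, b: float, honest_bandwidth: float) -> float:
    """Simplified P(k, b): chance that one selection lands on a controlled router.

    Assumption (not from the paper): selection is purely proportional to
    bandwidth, so only the injected aggregate k * b matters.
    """
    injected = k * b
    return injected / (honest_bandwidth + injected)


def at_least_one(p: float, n: int) -> float:
    """Chance that at least one of n independent selections is caught."""
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Two routers at 10 units or four routers at 5 units give the same result,
    # because the model depends only on the aggregate injected bandwidth.
    print(catch_probability(2, 10.0, 100.0))
    print(catch_probability(4, 5.0, 100.0))
    print(at_least_one(catch_probability(2, 10.0, 100.0), 3))
```

Under this model, adding routers and raising the claimed bandwidth per router are interchangeable, which is the qualitative behavior the claim above describes.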

V. EVALUATION

The authors have implemented the Tor hidden service discovery approach proposed in Section III.
The authors elaborate on the results of the empirical evaluation of the approach.
The authors' experimental results match the theoretical analysis presented in Section IV well.

A. Experiment setup

Figure 9 illustrates the experiment setup for protocol-level hidden service discovery over the real-world Tor.
The authors deployed a Tor client, entry onion router, and central server at two different campuses in North America.
To implement the discovery, the authors revised the source code of Tor at the Tor client side, rendezvous point, and entry router in order to establish the connections to their central server and send the related information.
The version of Tor in their experiment is the latest stable version 0.2.2.37.
In addition, at the client side, the authors implemented a web client to automatically send an HTTP request for the specific onion address, and installed an HTTP proxy, i.e., Privoxy [12], to relay the HTTP request into the OP.

B. Experiment Results

Table I gives the detection rate by their protocol-level hidden server discovery approach.
The experiments were repeated 1000 times to derive the true positive rate, i.e., the probability that the hidden server is detected if it uses their entry router.
One experiment lasts for around 15 seconds.
This figure demonstrates that the catch probability increases quickly with the number of controlled entry routers and the bandwidth of each entry router.
The current Tor network has 776 pure entry routers and 273 EE routers.

VI. DISCUSSION

There are various complicated cases in discovering a hidden server via their protocol-level discovery approach.
Once the hidden server chooses one of their entry routers, it will use their entry router in the following 30-60 days and will be exposed by their discovery approach.
Another complicated case is that the operator of a hidden server can choose three of its own trusted entry guards or Tor bridges to avoid choosing the authors' surveillance entry guards. (Tor bridges are a type of hidden onion router that is not public in the directory server.)
Recall that in Phase I of their original approach, if their controlled middle router is selected by the hidden server side circuit, the middle router will receive one CELL CREATE cell and three CELL RELAY cells.
The two-step discovery approach above can also be used to tackle complicated case I.

VIII. CONCLUSION

The hidden service over Tor is a double-edged sword.
A system that tracks down a hidden service can effectively deter malicious users from abusing the Tor network for illegal usage.
The authors design, implement, and evaluate such a system.
The authors' method augments the arsenal of existing detection tools, but it is unique in that the authors do not rely on conventional, time-consuming traffic analysis or watermarking techniques.
The authors also expect the debate over the detrimental sides of anonymous communication to continue in the long run, and hope to give law enforcement an option for tracking notorious hidden services.


Figures (11)

TABLE I DETECTION RATE OF PROTOCOL-LEVEL HIDDEN SERVER DISCOVERY

Fig. 4. Data transmission over the circuit

Fig. 12. Probability that at least a circuit traverses through the controlled entry routers

Fig. 6. Circuit creation and data transmission

Fig. 7. Hidden server creating a circuit to the rendezvous point


Protocol-level Hidden Server Discovery

Zhen Ling∗†, Junzhou Luo∗, Kui Wu† and Xinwen Fu‡

∗Southeast University, Email: {zhenling, jluo}@seu.edu.cn

†University of Victoria, Email: wkui@cs.uvic.ca

‡University of Massachusetts Lowell, Email: xinwenfu@cs.uml.edu

Abstract: Tor hidden services are commonly used to provide a TCP-based service to users without exposing the hidden server's IP address, in order to achieve anonymity and anti-censorship. However, hidden services are currently abused in various ways. Illegal content such as child pornography has been discovered on various Tor hidden servers. In this paper, we propose a protocol-level hidden server discovery approach to locate the Tor hidden server that hosts the illegal website. We investigate the Tor hidden server protocol and develop a hidden server discovery system, which consists of a Tor client, a Tor rendezvous point, and several Tor entry onion routers. We manipulate Tor cells, the basic transmission unit over Tor, at the Tor rendezvous point to generate a protocol-level feature at the entry onion routers. Once our controlled entry onion routers detect such a feature, we can confirm the IP address of the hidden server. We conduct extensive analysis and experiments to demonstrate the feasibility and effectiveness of our approach.

Keywords-Anonymous Communication, Tor, Hidden Service

I. INTRODUCTION

Tor is a widely used low-latency anonymous communication system that supports TCP applications over the Internet [1]. It provides users with an anonymity service, helps fight Internet censorship, and supports hidden services to preserve the anonymity of web services [2]. Tor was deployed in late 2003 and comprised hundreds of onion routers, while hidden services were released in early 2004. Due to the increasingly high demand for privacy protection, the Tor network has seen steady growth, consisting of around 3000 volunteer-based Tor onion routers as of July 2012.

Unfortunately, hidden services have been misused or abused for various illegal purposes. They can host botnets and illegal web content such as drug trading information [3] and pornography. The consequences are severe: if botnets are deployed with the hidden service over Tor [4], they are hard to take down because of the anonymity provided by the hidden service; if a hidden service hosts a child pornography website [5], [6], the hidden service blindly protects content that is illegal in most countries.

Footnote: We believe that illegal content is hosted at this hidden website, although we did not examine it because of legal concerns.

Existing research [7], [8] has investigated attacks that can locate a Tor hidden server. The approach in [7] is based on traffic analysis. The attacker controls both a malicious client and entry routers, and uses a simple passive timing analysis, i.e., cell counting, to find the same pattern in the client's traffic and the entry's traffic and thereby identify the hidden service at the entry side. Murdoch [8] presented a clock-skew-based approach to identifying whether or not a given Tor node is a hidden server. The attacker evaluates the load of a given Tor node. Since the server's temperature increases as its workload rises, the attacker can identify whether the node is the attacked hidden service by estimating its temperature through measuring the clock skew. Nevertheless, attacks based on traffic analysis may suffer a high rate of false positives due to various factors, such as Internet traffic dynamics, the load of Tor nodes, and the large number of cells needed for statistical traffic analysis.

In this paper, we propose a protocol-level discovery approach to locating a hidden server by utilizing Tor protocol features. We (law enforcement) control a Tor client, a rendezvous point, several entry onion routers, and a central server. The discovery takes three phases. In Phase I, our Tor client continues to create circuits to the hidden server until one of our entry routers sees a special combination of cells of different types. Such a combination, denoted as a protocol-level feature, comes from the Tor protocol that creates the circuits between a client and a hidden server. However, even if our entry router observes such a feature, it may result from other clients that create circuits to a hidden server through our entry router. Phase II confirms that the hidden server chooses our entry router. We manipulate cells from our client to induce a special decryption error at the hidden server, which will destroy all circuits to the client. If our entry router sees the destroy cell, we know that our entry router was chosen by the hidden server. Phase III correlates all the events above. In Phases I and II, related information about these cells observed at our clients, entry routers, and the rendezvous point is sent to our central server. In Phase III, we exploit the timing information of these cells and identify the correlation to confirm that the target hidden server is behind our entry router. In this way, we locate the hidden server.

Our approach has several unique advantages. First, it is easy to deploy our detection system. Second, compared to traffic-analysis-based methods, our approach is significantly faster, fully automatic, and can quickly locate the hidden server using only several cells. Third, our approach is accurate, with an observed detection rate of 100% and an observed false positive rate of 0%. Fourth, our approach works at the protocol level and is oblivious to traffic patterns; it is more general and can be used to identify malicious hidden services. A hidden server may also use its trusted entry routers or Tor bridges as the first hop into the Tor network. We discuss those complicated cases of tracking hidden servers in Section VI.

[Figure: diagram of the Tor network showing the client (OP), the directory servers, the entry (OR1), middle (OR2) and exit (OR3) onion routers, and the server.]

Fig. 1. Tor network

The rest of the paper is organized as follows. In Section II, we introduce the components of Tor, its basic operations and the protocol of hidden service. In Section III, we present the basic idea of our approach and then elaborate the algorithm. In Section IV, we analyze the effectiveness of the approach. In Section V, we show experimental results on Tor, and we discuss complicated cases of tracking hidden servers in Section VI. Related work is reviewed in Section VII. The paper is concluded in Section VIII.

II. BACKGROUND

In this section, we first introduce the Tor network and then present its basic operations and the protocol of hidden service.

A. Components of the Tor Network

Figure 1 illustrates the basic architecture of the Tor network. The following components are involved in the typical use of the Tor network:

• Tor clients. A Tor client installs local software referred to as the onion proxy (OP), which packs application data into equal-sized cells (512 bytes) and delivers them into the Tor network. A cell is the basic transmission unit of Tor.

• Onion routers (OR). The onion routers relay the cells on behalf of the Tor client and server.

• Directory servers. Directory servers hold the information of onion routers and hidden services, such as the public keys of routers and hidden servers.

• Application servers. Application servers support TCP applications such as a web service and an IRC service.

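(a) Tor cell format: Circ_id (2 bytes) | Command (1 byte) | Data (509 bytes)

(b) Tor relay cell format: Circ_id (2 bytes) | Command (1 byte) | Relay Command (1 byte) | Recognized (2 bytes) | Stream_id (2 bytes) | Integrity (4 bytes) | Length (2 bytes) | Data (498 bytes)

Fig. 2. Tor cell format [1]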

Figure 2 illustrates the format of the Tor cell. The first three bytes of the Tor cell form a header that is not encrypted, so that the Tor onion router can read it. The first two bytes are the circuit ID, while the third byte indicates the specific command of the cell. We categorize Tor cells into two types: the control cell, illustrated in Figure 2(a), and the relay cell, shown in Figure 2(b). The Command field of a control cell can be, for instance, CELL CREATE/CELL CREATE FAST or CELL CREATED/CELL CREATED FAST, employed for establishing a new circuit, or CELL DESTROY, used for tearing down a circuit. The Command field of a relay cell is CELL RELAY, which is used to relay application data. In addition, there are numerous types of relay commands (Relay Command), whose names have the form RELAY COMMAND X, where "X" is a word. In this paper, when we mention a RELAY COMMAND X cell, we mean a relay cell whose content is onion-like encrypted. We elaborate on these commands in later sections when we discuss Tor operations at the protocol level.

[Figure: message sequence for building a three-hop circuit among Client (OP), Entry OR, Middle OR, Exit OR and Server, using Create/Created cells and Relay cells carrying Extend/Extended over circuit IDs C1, C2 and C3. The client-to-entry, entry-to-middle and middle-to-exit links are TLS-encrypted; the exit-to-server link is unencrypted. Legend: E(x): RSA encryption; {X}: AES encryption; CN: a circuit ID numbered N.]

Fig. 3. Circuit creation
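
The cell layout described above can be sketched as a small parser. This is a minimal illustration under stated assumptions, not Tor's implementation: the field widths follow Figure 2, while big-endian byte order, the numeric command codes, and the helper names `parse_cell`, `parse_relay_payload` and `build_relay_cell` are assumptions made for the example. The relay payload is treated as already decrypted.

```python
import struct

CELL_LEN = 512          # fixed cell size given in the text
HEADER_LEN = 3          # 2-byte circuit ID + 1-byte command
RELAY_HEADER_LEN = 11   # 1 + 2 + 2 + 4 + 2 bytes, per Figure 2(b)
MAX_RELAY_DATA = 498    # relay data field width, per Figure 2(b)

# Illustrative command codes (assumed values, not taken from the paper).
CELL_RELAY = 3
RELAY_COMMAND_DATA = 2


def parse_cell(cell: bytes) -> dict:
    """Split a cell into its unencrypted header and 509-byte payload."""
    if len(cell) != CELL_LEN:
        raise ValueError("a cell must be exactly 512 bytes")
    circ_id, command = struct.unpack(">HB", cell[:HEADER_LEN])
    return {"circ_id": circ_id, "command": command, "payload": cell[HEADER_LEN:]}


def parse_relay_payload(payload: bytes) -> dict:
    """Split an already-decrypted relay payload into the Figure 2(b) fields."""
    relay_cmd, recognized, stream_id, integrity, length = struct.unpack(
        ">BHH4sH", payload[:RELAY_HEADER_LEN]
    )
    data = payload[RELAY_HEADER_LEN:RELAY_HEADER_LEN + length]
    return {
        "relay_command": relay_cmd,
        "recognized": recognized,
        "stream_id": stream_id,
        "integrity": integrity,
        "length": length,
        "data": data,
    }


def build_relay_cell(circ_id: int, stream_id: int, data: bytes) -> bytes:
    """Build an unencrypted relay cell carrying data, zero-padded to 512 bytes."""
    if len(data) > MAX_RELAY_DATA:
        raise ValueError("relay data is limited to 498 bytes")
    header = struct.pack(">HB", circ_id, CELL_RELAY)
    relay_header = struct.pack(
        ">BHH4sH", RELAY_COMMAND_DATA, 0, stream_id, b"\x00" * 4, len(data)
    )
    body = relay_header + data
    return header + body.ljust(CELL_LEN - HEADER_LEN, b"\x00")


if __name__ == "__main__":
    cell = build_relay_cell(circ_id=7, stream_id=1, data=b"Hello")
    outer = parse_cell(cell)
    print(outer["circ_id"], outer["command"])
    print(parse_relay_payload(outer["payload"])["data"])
```

An onion router in this model only needs `parse_cell`, since the circuit ID and command are readable without any key, whereas the relay fields become meaningful only after the onion layers have been removed.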

B. Circuit Selection and Creation

To communicate with an application server via Tor, a Tor client first downloads all of the onion router information from the directory server and uses source routing by choosing a series of onion routers as a route. We call the sequence of onion routers the path through Tor. The number of onion routers is called the path length. In the Tor network, a path is also called a circuit, so we use path and circuit interchangeably in this paper. We employ the default path length of 3 as an example in Figure 1 to show how a path is chosen. The client first selects an appropriate exit onion router (OR3), which should have an exit policy supporting the relay of the TCP stream from the client. Then, the client chooses a proper entry onion router (OR1) (also referred to as an entry guard) and a middle onion router (OR2). After that, the client initiates the procedure of creating a circuit incrementally, one hop at a time. Eventually, the client can communicate with the remote server through this circuit, i.e., OR1 → OR2 → OR3.

Figure 3 illustrates the procedure by which a client builds a circuit. As shown in Figure 3, the client first establishes a TLS connection with the entry router. Then the client sends a CELL CREATE cell through the TLS connection and uses the Diffie-Hellman (DH) handshake protocol to negotiate a base key $K_1 = g^{x_1 y_1}$ with the entry onion router, which responds with a CELL CREATED cell. Note that $H(K_1)$ in Figure 3 is the hash value of $K_1$. From this base key material, a forward symmetric key $kf_1$ and a backward symmetric key $kb_1$ are generated. In this way, the first hop of the circuit, denoted as C1, is created. Similarly, the client extends the circuit to include the second hop (C2) and the third hop (C3).
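
The first-hop key agreement can be sketched as follows. This is a toy illustration only: the group parameters `P` and `G` are small stand-ins rather than Tor's real Diffie-Hellman group, the RSA encryption of $g^{x_1}$ shown in Figure 3 is omitted, and deriving the hash and the forward and backward keys with SHA-256 under different labels is an assumption made for the example rather than Tor's actual key derivation.

```python
import hashlib
import secrets

# Toy group parameters (assumptions for illustration, not Tor's real group).
P = 2**127 - 1
G = 3


def dh_keypair() -> tuple[int, int]:
    """Return a random private exponent and the matching public value."""
    secret = secrets.randbelow(P - 2) + 1
    return secret, pow(G, secret, P)


def derive_keys(base_key: int) -> dict:
    """Derive H(K), a forward key kf and a backward key kb from the base key."""
    material = base_key.to_bytes(16, "big")
    return {
        "hash": hashlib.sha256(b"H" + material).hexdigest(),
        "kf": hashlib.sha256(b"forward" + material).digest()[:16],
        "kb": hashlib.sha256(b"backward" + material).digest()[:16],
    }


def first_hop_handshake() -> tuple[dict, dict]:
    """Simulate the CELL CREATE / CELL CREATED exchange for hop C1."""
    x1, gx1 = dh_keypair()   # client side; g^x1 travels in CELL CREATE
    y1, gy1 = dh_keypair()   # entry router side; g^y1 travels in CELL CREATED
    client_keys = derive_keys(pow(gy1, x1, P))   # K1 computed by the client
    router_keys = derive_keys(pow(gx1, y1, P))   # K1 computed by the router
    return client_keys, router_keys


if __name__ == "__main__":
    client_keys, router_keys = first_hop_handshake()
    # Both sides hold the same kf1 and kb1; the client can also check H(K1).
    print(client_keys == router_keys)
```

Extending the circuit to C2 and C3 repeats the same exchange with the next router, with the handshake messages carried inside relay cells over the hops already built.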

Figure 4 shows the procedure of data transmission over the circuit. Once the circuit is established, the client sends a RELAY COMMAND BEGIN cell to the exit onion router, and the cell is encrypted as $\{\{\{\text{Begin}\langle IP, Port\rangle\}_{kf_1}\}_{kf_2}\}_{kf_3}$, where the subscript refers to the key used for encryption of one specific onion skin. The three layers of onion skin are removed one by one as the cell traverses each onion router along the circuit. When the exit onion router removes the last onion skin by decryption, it recognizes that the request intends to open a TCP stream to a port at the destination IP of the remote server. Therefore, the exit onion router acts as a proxy, builds a TCP connection with the server, and sends a RELAY COMMAND CONNECTED cell back to the client. Then the client can download the file.

[Figure: message sequence among Client (OP), Entry OR, Middle OR, Exit OR and Server. A Begin<IP,Port> request is relayed over C1, C2 and C3 and triggers a TCP handshake with the server; Connected, Data ("Hello") and End (Reason, with TCP teardown) messages follow. Links between Tor nodes are TLS-encrypted; the exit-to-server link is unencrypted.]

Fig. 4. Data transmission over the circuit
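
The layered encryption can be illustrated with a short sketch in which the client adds one skin per hop and each router removes one. The cipher here is a stand-in: a SHA-256-based keystream XORed with the data replaces the AES encryption that Tor uses, and the key values and function names are hypothetical. Because XOR layers commute, this toy does not model the order in which skins are applied; a real cipher chain would need a fixed order.

```python
import hashlib


def keystream(key: bytes, n: int) -> bytes:
    """Generate n pseudo-random bytes from key (stand-in for a stream cipher)."""
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:n]


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Add or remove one onion skin; XOR makes the two operations identical."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def wrap(payload: bytes, hop_keys: list[bytes]) -> bytes:
    """Client side: apply one skin for every hop key."""
    for key in hop_keys:
        payload = xor_layer(key, payload)
    return payload


def peel(cell: bytes, hop_key: bytes) -> bytes:
    """Router side: remove the skin that belongs to this hop."""
    return xor_layer(hop_key, cell)


if __name__ == "__main__":
    hop_keys = [b"kf1-demo", b"kf2-demo", b"kf3-demo"]  # hypothetical keys
    cell = wrap(b"Begin<IP,Port>", hop_keys)
    for key in hop_keys:            # entry, middle, exit each peel one skin
        cell = peel(cell, key)
    print(cell)
```

Only after all three skins are removed does the request become readable, which is why in the text the exit onion router is the one that recognizes the Begin request.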

[Figure: diagram of the Tor network showing the client (OP), hidden server, directory servers, introduction point (IPO), rendezvous point (RPO), and the entry, middle and exit onion routers on the circuits between them.]

Fig. 5. Tor hidden service

C. How Does the Hidden Service Work

A hidden service involves six participants, including a Tor client, the directory server, onion routers, a rendezvous point, an introduction point, and a hidden server. Since the first three participants have been described above, we will introduce the functionality of the rendezvous point and the introduction point and how they work together to support a hidden service.

• Introduction point (IPO). An introduction point is selected by the hidden server and published in the directory server as part of the descriptor of the hidden service. Once the introduction point is decided, the hidden server establishes a circuit to it. The introduction point then acts as the front interface to Tor clients: it waits until a Tor client creates a three-hop circuit to it, and forwards the request data from the Tor client side circuit to the hidden server side circuit.

• Rendezvous point (RPO). A rendezvous point is chosen by the Tor client. Both the Tor client and the hidden server establish a three-hop circuit to the RPO, which acts as a message relay to transmit the application data between the hidden server side circuit and the Tor client side circuit.

• Hidden server. A hidden server provides various TCP applications such as a web server and an IRC server. It can be deployed over an OP or an OR.

Figure 5 depicts the procedure of establishing a connection between the Tor client and the specific hidden server.

1) The hidden server first selects several onion routers as introduction points and builds circuits to these introduction points. To build a circuit, the hidden server sends a RELAY COMMAND ESTABLISH INTRO cell to an introduction point, and the introduction point replies with a RELAY COMMAND INTRO ESTABLISHED cell to inform the hidden server that the circuit is established.

2) Once the circuits to the introduction points are established, the hidden server establishes a circuit to the directory server and advertises the service descriptor to the directory server, including the public key of the hidden server and the information regarding the introduction points. Then the owner of the hidden server can post the onion address in a public place to attract users to access the hidden service via Tor.

3) When a Tor client obtains the onion address, the client creates a circuit to the directory server and fetches the relevant information advertised by the hidden service. The client thereby learns the introduction points of the hidden service.

4) The client selects a rendezvous point and builds a circuit to the rendezvous point. The client sends a RELAY COMMAND ESTABLISH RENDEZVOUS cell, which carries a rendezvous cookie, to the rendezvous point, which replies with a RELAY COMMAND RENDEZVOUS ESTABLISHED cell to indicate the successful circuit establishment.

5) The client creates a three-hop circuit to one of the introduction points and transmits a RELAY COMMAND INTRODUCE1 cell to the chosen introduction point. The cell carries information such as the rendezvous point, the rendezvous cookie and the Diffie-Hellman data $g^x$ generated by the Tor client.

6) Once the introduction point receives the RELAY COMMAND INTRODUCE1 cell, it replies with a RELAY COMMAND INTRODUCE ACK cell to the client. After the client receives this ACK cell, it tears down its circuit to the introduction point.

7) The introduction point repacks the RELAY COMMAND INTRODUCE1 cell into a RELAY COMMAND INTRODUCE2 cell, and then sends the RELAY COMMAND INTRODUCE2 cell to the hidden server.

8) After the hidden server receives the RELAY COMMAND INTRODUCE2 cell, it knows the rendezvous point, the rendezvous cookie and the Diffie-Hellman data $g^x$. The hidden server can generate the Diffie-Hellman data $g^y$ and derive the key $K = g^{xy}$. This key is used for end-to-end encryption between client and server. Then the hidden server builds a circuit to the rendezvous point and sends a RELAY COMMAND RENDEZVOUS1 cell to the rendezvous point via the circuit. The cell includes the key data $g^y$, the hash value of the key $H(K)$ and the rendezvous cookie.

9) The rendezvous point obtains the RELAY COMMAND RENDEZVOUS1 cell and compares the rendezvous cookie in the cell with the one received from the Tor client. Once the rendezvous cookies match, the rendezvous point removes the rendezvous cookie from the RELAY COMMAND RENDEZVOUS1 cell, repacks the remaining data into a RELAY COMMAND RENDEZVOUS2 cell, and then forwards the cell to the client.

10) When the Tor client receives the RELAY COMMAND RENDEZVOUS2 cell, it can generate the key $K = g^{xy}$ using $g^y$ and verify it based on $H(K)$. The key is used to encrypt the data between client and server and is the same as the one generated by the hidden server at Step 8. In this way, the client and hidden server complete the handshake. Then the client sends a RELAY COMMAND BEGIN cell to establish a stream to the hidden server via the six-hop circuit.

Footnote: The onion address is generated by the hidden server. It is a hostname of the form "x.onion", where "x" consists of 16 random characters.
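
Steps 5 to 10 can be sketched as a small simulation of the end-to-end handshake. It is a toy under stated assumptions: the cells are modeled as dictionaries, the group parameters `P` and `G` are small stand-ins rather than Tor's real Diffie-Hellman group, SHA-256 plays the role of $H$, the 20-byte cookie length is an assumption, and all function names are hypothetical.

```python
import hashlib
import secrets

# Toy group parameters (assumptions for illustration, not Tor's real group).
P = 2**127 - 1
G = 3


def key_hash(key: int) -> bytes:
    """H(K): hash of the shared key, used by the client for verification."""
    return hashlib.sha256(key.to_bytes(16, "big")).digest()


def client_introduce() -> tuple[int, dict]:
    """Step 5: the client picks x and a cookie and builds the INTRODUCE1 data."""
    x = secrets.randbelow(P - 2) + 1
    cookie = secrets.token_bytes(20)
    return x, {"cookie": cookie, "gx": pow(G, x, P)}


def hidden_server_reply(introduce2: dict) -> tuple[int, dict]:
    """Step 8: the hidden server derives K and builds the RENDEZVOUS1 data."""
    y = secrets.randbelow(P - 2) + 1
    key = pow(introduce2["gx"], y, P)
    rendezvous1 = {
        "cookie": introduce2["cookie"],
        "gy": pow(G, y, P),
        "hk": key_hash(key),
    }
    return key, rendezvous1


def rendezvous_point_forward(expected_cookie: bytes, rendezvous1: dict):
    """Step 9: match the cookie, strip it, and forward the rest as RENDEZVOUS2."""
    if not secrets.compare_digest(expected_cookie, rendezvous1["cookie"]):
        return None
    return {"gy": rendezvous1["gy"], "hk": rendezvous1["hk"]}


def client_finish(x: int, rendezvous2: dict) -> int:
    """Step 10: the client derives K from g^y and verifies it against H(K)."""
    key = pow(rendezvous2["gy"], x, P)
    if key_hash(key) != rendezvous2["hk"]:
        raise ValueError("key verification failed")
    return key


if __name__ == "__main__":
    x, introduce1 = client_introduce()
    server_key, rendezvous1 = hidden_server_reply(introduce1)  # via the IPO
    rendezvous2 = rendezvous_point_forward(introduce1["cookie"], rendezvous1)
    print(client_finish(x, rendezvous2) == server_key)
```

The rendezvous point in this model sees only the cookie, $g^y$ and $H(K)$, so it can join the two circuits without learning the key shared by the client and the hidden server.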

From the above procedure, the Tor client only knows the introduction point rather than the hidden server itself, and the hidden server only knows the rendezvous point rather than the Tor client itself. In addition, the introduction point acts as the front interface to the Tor client for service query and request only, and it is no longer involved once the six-hop circuit passing through the rendezvous point is established for data communication between the Tor client and the hidden server. Since neither the introduction point nor the rendezvous point knows the location of the Tor client or the location of the hidden server, anonymous web service is achieved. Note that in the Tor network, only the entry onion router knows the IP addresses of hidden servers.

III. PROTOCOL-LEVEL HIDDEN SERVER DISCOVERY

In this section, we first introduce the basic idea of discovering a Tor hidden server and then present detailed algorithms.

A. Basic Idea

Since only entry onion routers may know the real IP address of a hidden server, we assume that we are able to control several entry onion routers. In addition, we need a client and a rendezvous point to cooperate with the entry onion routers. A central server is used to record information about related cells forwarded from the Tor client, entry onion routers, and rendezvous point.

The discovery is conducted as follows:

(i) Our Tor client obtains the introduction point information from the directory server, builds a circuit to the introduction point, and reports the circuit creation to our central server to mark the start of the discovery.

(ii) The hidden server in turn establishes a circuit to the rendezvous point. If the hidden server chooses one of our entry routers, that router will see a special combination of cells of different types, denoted as the protocol-level feature, during the creation of those circuits. However, observing such a feature does not necessarily imply that the hidden server chose our entry router, so we perform the following actions to confirm that it is a true positive.

(iii) Once the connection is established between the client and the hidden server, the client sends cells that contain application data to the hidden server, as illustrated in Step 10 of Figure 5. Our rendezvous point manipulates an appropriate cell [9], forwards the mangled cell to the hidden server, and reports this cell to the central server.

(iv) The mangled cell arrives at the hidden server. Since the hidden server cannot correctly decrypt the manipulated cell, it destroys the circuit between the client and the hidden server by sending a destroy cell toward the client. This cell traverses the circuit toward the client. The rendezvous point detects it and reports it to the central server to mark the end of the discovery process. In addition, our controlled entry onion routers report to the central server immediately whenever they receive a destroy cell.

(v) To determine whether the hidden server chose one of our entry onion routers, the central server searches for a correlation among three times: when the rendezvous point sends the manipulated cell, when the rendezvous point receives the destroy cell, and when an entry onion router receives the destroy cell. Since the entry onion router knows the IP address of the circuit creator, once such a time correlation is found, we can identify the hidden server.

Figure 6 illustrates the workflow of the protocol-level hidden server discovery approach.
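The correlation in step (v) can be sketched as an ordering check on the three reported timestamps. This is a minimal sketch, not the authors' implementation: the record type, the function name, the `max_delay` bound, and the assumption that all reporters share a synchronized clock are hypothetical. It relies only on the ordering implied by the text, namely that the rendezvous point sends the manipulated cell first and that the destroy cell then passes the entry onion router before it reaches the rendezvous point.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class DestroyReport:
    """A destroy-cell report from a controlled entry onion router (hypothetical record)."""
    timestamp: float   # seconds; assumed to come from a clock shared by all reporters
    circuit_id: int
    source_ip: str     # IP address of the circuit creator as seen by the entry router


def correlate(t_mangled_sent: float,
              t_rp_destroy: float,
              entry_reports: Iterable[DestroyReport],
              max_delay: float = 5.0) -> List[DestroyReport]:
    """Return the entry-router reports that are consistent with one discovery run.

    A report is kept only if its timestamp lies strictly between the time the
    rendezvous point sent the manipulated cell and the time the rendezvous
    point received the destroy cell. max_delay is an assumed sanity bound on
    that interval, not a value from the paper.
    """
    if not (0.0 <= t_rp_destroy - t_mangled_sent <= max_delay):
        return []  # the rendezvous point's own timestamps are implausible
    return [r for r in entry_reports
            if t_mangled_sent < r.timestamp < t_rp_destroy]
```

Any report that survives this filter carries the source IP address of a candidate hidden server. A practical system would also have to account for clock skew and network jitter, which this sketch ignores.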

[Figure 6: diagram of the Tor network showing the client (OP), the IPO, the RPO, the entry, middle, and exit onion routers, and the hidden server. Legend: receive create/relay cell; send intro cell; modify begin cell; receive destroy cell.]

Fig. 6. Circuit creation and data transmission

Footnote (on the assumption that we control several entry onion routers): The same assumption is made in virtually all attacks against the Tor network. It is reasonable because onion routers are set up by volunteers.

B. Details of Protocol-level Hidden Server Discovery

The hidden server discovery process can be divided into three phases.

Phase I: Presumably identify the hidden server. The client keeps creating circuits to the hidden server until one of our entry routers sees a special combination of cells of different types, i.e., the protocol-level feature.

Phase II: Verify the hidden server. Our rendezvous point manipulates a data cell, causing a decryption error at the hidden server, which must then send a destroy cell to tear down the circuit. If the hidden server uses our entry router, that router can recognize the destroy cell.

Phase III: Conclude by time correlation. The central server uses the timing information of the collected cells to correlate the unique sequence of events produced by our discovery actions, decides whether the presumably identified entry router was indeed chosen by the hidden server, and, if so, locates the hidden server.

[Figure 7: message sequence diagram between the hidden server, the entry OR, the middle OR, the exit OR, and the rendezvous point; every link is TLS-encrypted. The hidden server sends Create_Fast on circuit C1 and receives Created_Fast, extends the circuit hop by hop with Relay Extend/Extended cells (Create/Created cells on circuits C2 to C4), and finally sends a Rendezvous1 relay cell to the rendezvous point. Legend: E(x) is RSA encryption; {X} is AES encryption; CN is a circuit ID numbered N.]

Fig. 7. Hidden server creating a circuit to the rendezvous point

Phase I: Presumably identify the hidden server. Recall that the Tor client can obtain the introduction point information from the directory server, as illustrated in Step 3 of Figure 5. After the client establishes a circuit to the rendezvous point, it sends a RELAY_COMMAND_INTRODUCE1 cell to the introduction point in order to negotiate the Diffie-Hellman key with the hidden server. We use this cell as the start signal of our discovery approach and report it to the central server.

When the hidden server receives the RELAY_COMMAND_INTRODUCE2 cell forwarded from the introduction point, it builds a circuit to the rendezvous point, as shown in Step 8 of Figure 5. Once the circuit is established, the hidden server promptly sends a RELAY_COMMAND_RENDEZVOUS1 cell to the rendezvous point. The circuit creation process is illustrated in Figure 7. As Figure 7 shows, the entry onion router receives one CELL_CREATE_FAST cell and four CELL_RELAY cells, including a RELAY_COMMAND_INTRODUCE2 cell, and relays one CELL_CREATED_FAST cell and three CELL_RELAY cells to the hidden server. We refer to this combination of cells as the protocol-level feature. Moreover, the entry onion router reports the relevant information about each cell, including the cell type, the circuit ID, and the IP address of the circuit creator, to the central server. Furthermore, as soon as the rendezvous point receives the RELAY_COMMAND_RENDEZVOUS1 cell, it reports this to the central server.
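The Phase I feature can be expressed as a set of per-circuit cell counts. The following is a minimal sketch, assuming the entry onion router logs each cell of a circuit as a (direction, cell type) pair; the direction labels `"in"` and `"out"` and the function name are hypothetical, while the counts are the ones stated above.

```python
from collections import Counter
from typing import Iterable, Tuple

# Expected counts taken from the text: cells received from the circuit creator
# ("in") and cells relayed back to it ("out"). The labels are illustrative.
EXPECTED_FEATURE = Counter({
    ("in", "CELL_CREATE_FAST"): 1,
    ("in", "CELL_RELAY"): 4,
    ("out", "CELL_CREATED_FAST"): 1,
    ("out", "CELL_RELAY"): 3,
})


def matches_feature(cells: Iterable[Tuple[str, str]]) -> bool:
    """Return True if the cells logged for one circuit match the expected counts exactly."""
    return Counter(cells) == EXPECTED_FEATURE
```

A match is only a presumptive identification, since other circuits may produce the same counts; that is why the verification in Phase II follows.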

[Figure 8: the rendezvous point forwards a modified relay cell (payload shown as XXXX) over circuits C4' to C1', through the exit OR, middle OR, and entry OR, to the hidden server; every link is a TLS link. The hidden server answers with a destroy cell carrying a reason, which travels back over circuits C1' to C4'.]

Fig. 8. Modify a cell at the RPO

Phase II: Verify the hidden server. Recall that after the rendezvous point receives a RELAY_COMMAND_RENDEZVOUS1 cell from the hidden server, it repacks it into a RELAY_COMMAND_RENDEZVOUS2 cell and forwards it to the client, as illustrated in Steps 8 and 9 of Figure 5. The client then sends a RELAY_COMMAND_BEGIN cell to the hidden server through the circuit in order to open a stream between the client and the hidden server. Our controlled rendezvous point can detect this cell from its position in the hidden service protocol, even though it cannot decrypt the cell and read its content. Once the rendezvous point catches this cell, we modify one bit of it and forward it to the hidden server. Because the other onion routers on the path perform no integrity verification, they cannot detect the manipulated cell. For detection purposes, the rendezvous point sends the timestamp of the manipulated cell to the central server. Figure 8 illustrates the procedure of modifying the cell at the rendezvous point.

When the manipulated cell reaches the hidden server, the hidden server cannot correctly recognize it. According to the design of Tor, the hidden server then tears down the circuit between the client and the hidden server by promptly sending a CELL_DESTROY cell. The controlled entry onion router is the first router to receive this cell, and it reports the cell type, the timestamp of the cell, the circuit ID, and the source IP address of the cell to the central server. The rendezvous point receives this CELL_DESTROY cell as well and likewise reports its timestamp to the central server.
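The effect described in Phase II can be reproduced in a toy model that shows why a single flipped bit passes through relays unnoticed yet is rejected at the endpoint. The model rests on assumptions that are not taken from the text above: each encryption layer behaves like an XOR stream cipher, and the endpoint checks a short digest over the payload. The SHA-256-based keystream is a stand-in for a real cipher, and all keys and function names are made up for illustration.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Toy keystream (SHA-256 in counter mode); a stand-in for a real stream cipher."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_layer(data: bytes, key: bytes) -> bytes:
    """Add or remove one encryption layer (XOR is its own inverse)."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


def seal(payload: bytes) -> bytes:
    """Prefix a 4-byte digest so that the endpoint can check the payload."""
    return hashlib.sha1(payload).digest()[:4] + payload


def recognized(cell: bytes) -> bool:
    """Endpoint check: does the digest prefix match the payload?"""
    return cell[:4] == hashlib.sha1(cell[4:]).digest()[:4]


def flip_bit(cell: bytes, bit_index: int) -> bytes:
    """Return a copy of the cell with exactly one bit inverted."""
    out = bytearray(cell)
    out[bit_index // 8] ^= 1 << (bit_index % 8)
    return bytes(out)


END_TO_END_KEY = b"client/hidden-server"    # made-up key material
HOP_KEYS = [b"exit", b"middle", b"entry"]   # made-up per-hop keys


def deliver(cell_at_rp: bytes) -> bool:
    """Carry a cell from the rendezvous point to the endpoint and check it there."""
    cell = cell_at_rp
    for key in HOP_KEYS:            # each hop adds its layer; none inspects the content
        cell = xor_layer(cell, key)
    for key in reversed(HOP_KEYS):  # the endpoint removes the hop layers
        cell = xor_layer(cell, key)
    cell = xor_layer(cell, END_TO_END_KEY)  # then removes the end-to-end layer
    return recognized(cell)


original = xor_layer(seal(b"BEGIN stream"), END_TO_END_KEY)
mangled = flip_bit(original, 0)
```

In this model `deliver(original)` succeeds while `deliver(mangled)` fails the endpoint check. In the setting described above, that failure corresponds to the point at which the hidden server tears down the circuit.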

Phase III: Conclude by time correlation. Since the central

server may receive many cells from our entry onion routers,

we carefully choose several appropriate feature cells and

use them to filter out useless cell information. The central

server records the source IP address of each cell, circuit ID,

Footnote (on the cell types in Phase I): CELL_CREATE_FAST and CELL_CREATED_FAST cells are used instead of CELL_CREATE and CELL_CREATED cells when the hidden server creates the first hop of its circuit.


