Protocol-level hidden server discovery
Summary
Introduction
- Tor is a broadly-used low-latency anonymous communication system which supports TCP applications over the Internet [1].
- Unfortunately, hidden services have been misused or abused for various illegal purposes.
- The attacker controls both a malicious client and entry routers, and uses a simple passive timing analysis, i.e., cell counting, to find the same pattern in the client's traffic and the entry router's traffic, and thereby identify the hidden service at the entry side.
- In Phases I and II, information about the cells observed at their client, entry routers, and rendezvous point is sent to their central server.
A. Components of the Tor Network
- Figure 1 illustrates the basic architecture of Tor network.
- The three-byte header at the start of the Tor cell is not encrypted, so that the Tor onion router can read it.
- The first two bytes are the circuit ID, while the third byte indicates the specific command of this cell.
- The authors categorize the Tor cell into two types: the control cell as illustrated in Figure 2 (a) and the relay cell as shown in Figure 2 (b).
- The Command field of a relay cell is CELL RELAY, which is used to relay the application data.
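The header layout described above can be sketched in a few lines. This is a minimal illustration, assuming a fixed 512-byte cell and big-endian byte order; the function name `parse_cell_header` and the command codes in `COMMANDS` are assumptions made for the example and are not taken from the paper.

```python
import struct

CELL_SIZE = 512                  # assumed fixed cell size in bytes
HEADER = struct.Struct(">HB")    # 2-byte circuit ID followed by 1-byte command

# Assumed command codes, for illustration only.
COMMANDS = {1: "CREATE", 2: "CREATED", 3: "RELAY", 4: "DESTROY"}


def parse_cell_header(cell: bytes) -> tuple[int, str]:
    """Return (circuit ID, command name) read from the unencrypted header."""
    if len(cell) != CELL_SIZE:
        raise ValueError(f"expected {CELL_SIZE} bytes, got {len(cell)}")
    circuit_id, command = HEADER.unpack_from(cell)
    return circuit_id, COMMANDS.get(command, f"UNKNOWN({command})")


# Example: a relay cell on circuit 0x1234 with an opaque (encrypted) payload.
example_cell = HEADER.pack(0x1234, 3) + bytes(CELL_SIZE - HEADER.size)
```

Only the three header bytes are interpreted here; everything after them is treated as opaque, which mirrors what an onion router can read without the circuit keys.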
B. Circuit Selection and Creation
- To communicate with an application server via Tor, a Tor client first downloads all of the onion router information from the directory server and uses source routing by choosing a series of onion routers as a route.
- The authors call this sequence of onion routers the path through Tor.
- After that, the client initiates the procedure of creating a circuit incrementally, one hop at a time.
- Then the client sends a CELL CREATE cell through the TLS connection and uses the Diffie-Hellman (DH) handshake protocol to negotiate a base key $K_1 = g^{xy}$ with the entry onion router, which responds with a CELL CREATED cell.
- Once the circuit is established, the client sends a RELAY COMMAND BEGIN cell to the exit onion router, and the cell is encrypted as $\{\{\{\mathrm{Begin}\langle IP, Port\rangle\}_{k_{f1}}\}_{k_{f2}}\}_{k_{f3}}$, where the subscript refers to the key used for encryption of one specific onion skin.
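The key negotiation and the layered encryption described above can be illustrated with a toy sketch. It assumes a small prime modulus, a generator of 3, and a SHA-256 counter keystream as the per-hop cipher; none of these choices come from the paper or from Tor itself, and the helper names are hypothetical. The sketch only shows that both sides of a DH exchange derive the same $g^{xy}$ and that one layer is added per key and removed per hop.

```python
import hashlib
import secrets

P = 2**127 - 1   # toy prime modulus (assumption; far too small for real use)
G = 3            # toy generator (assumption)


def negotiate_key() -> bytes:
    """Toy DH exchange: both sides derive the same value g^(xy) mod P."""
    x = secrets.randbelow(P - 2) + 1        # client secret
    y = secrets.randbelow(P - 2) + 1        # router secret
    g_x, g_y = pow(G, x, P), pow(G, y, P)   # public values exchanged in the handshake
    client_key, router_key = pow(g_y, x, P), pow(g_x, y, P)
    assert client_key == router_key
    return client_key.to_bytes(16, "big")


def xor_layer(key: bytes, data: bytes) -> bytes:
    """Apply or remove one onion skin using a SHA-256 counter keystream."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(d ^ s for d, s in zip(data, stream))


def wrap(payload: bytes, keys: list[bytes]) -> bytes:
    """Client side: add one layer per key, in list order."""
    for key in keys:
        payload = xor_layer(key, payload)
    return payload


def unwrap(cell: bytes, keys: list[bytes]) -> bytes:
    """Path side: each hop removes one layer, in reverse order of wrapping."""
    for key in reversed(keys):
        cell = xor_layer(key, cell)
    return cell
```

A usage example would call `negotiate_key()` three times, once per hop, and pass the resulting list to `wrap` and `unwrap`.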
C. How Does the Hidden Service Work
- A hidden service involves six participants, including a Tor client, the directory server, onion routers, a rendezvous point, an introduction point, and a hidden server.
- Once the introduction point is decided, the hidden server establishes a circuit to the introduction point.
- This key is used for end-to-end encryption between client and server.
- In Step 9, the rendezvous point receives the RELAY COMMAND RENDEZVOUS1 cell and compares the rendezvous cookie in the cell with the one it received from the Tor client.
- From the above procedure, the Tor client only knows the introduction point instead of the hidden server directly, and the hidden server only knows the rendezvous point instead of the Tor client directly.
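The cookie comparison at the rendezvous point can be sketched as a small lookup. This is an illustrative model only: the class `RendezvousPoint`, its method names, and the use of integer circuit IDs are assumptions for the example, not the Tor implementation.

```python
import hmac


class RendezvousPoint:
    """Toy model: joins a client circuit and a server circuit by cookie."""

    def __init__(self) -> None:
        self._waiting: dict[bytes, int] = {}   # cookie -> client circuit ID

    def register_client(self, cookie: bytes, client_circuit: int) -> None:
        """Remember the cookie that the Tor client handed over."""
        self._waiting[cookie] = client_circuit

    def on_rendezvous1(self, cookie: bytes, server_circuit: int):
        """Return (client circuit, server circuit) if the cookie matches."""
        for known, client_circuit in list(self._waiting.items()):
            if hmac.compare_digest(known, cookie):
                del self._waiting[known]       # a cookie is used only once here
                return client_circuit, server_circuit
        return None
```

Neither endpoint learns the other's address in this model: the rendezvous point only joins two circuit IDs that presented the same cookie.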
A. Basic Idea
- Since only entry onion routers may know the real IP address of a hidden server, the authors assume that they are able to control several entry onion routers.
- The authors' rendezvous point manipulates an appropriate cell [9], and forwards the mangled cell to the hidden server.
- The rendezvous point also reports this cell to the central server.
- (v) To determine if the hidden server chooses one of their entry onion routers, the central server searches for correlation between the time when the rendezvous point sends the manipulated cell, the time when the rendezvous point receives the destroy cell, and the time when the entry onion router receives the destroy cell.
- This is reasonable because onion routers are set up by volunteers.
B. Details of Protocol-level Hidden Server Discovery
- The hidden server discovery process can be divided into three phases.
- When the hidden server receives the RELAY COMMAND INTRODUCE2 cell forwarded from the introduction point, the hidden server will build a circuit to the rendezvous point as shown in Step 8 of Figure 5.
- The authors refer to these cells as the protocol-level feature.
- After the central server receives the manipulated cell information, it can first filter out the circuits along which their entry onion routers are not chosen as the first router, by counting the number of CELL CREATE, CELL CREATED and CELL RELAY cells based on the Tor circuit creation protocol.
- Eventually, the authors choose the CELL DESTROY cell information received from the rendezvous point as the end-sign cell of Phase II.
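The filtering step, in which the central server counts cell types per circuit, can be sketched as follows. The counts in `EXPECTED_FEATURE` are placeholders chosen for the example and are not the signature used in the paper; the function names and the log format (a list of command names per circuit ID) are likewise assumptions.

```python
from collections import Counter

# Placeholder signature: NOT the cell counts from the paper.
EXPECTED_FEATURE = Counter({"CREATE": 1, "CREATED": 1, "RELAY": 5})


def matches_feature(cell_log, expected=EXPECTED_FEATURE) -> bool:
    """True if the per-type cell counts on a circuit equal the expected ones."""
    return Counter(cell_log) == expected


def candidate_circuits(logs, expected=EXPECTED_FEATURE) -> list:
    """Keep only circuit IDs whose cell counts match the expected feature."""
    return [cid for cid, log in logs.items() if matches_feature(log, expected)]
```

Circuits that fail the count check are dropped before any timing comparison, which keeps the later correlation step small.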
C. Make the Discovery Automatic
- The authors want to emphasize that their discovery of hidden server is conducted fully automatically.
- The central server builds TCP connections to the Tor client, entry onion routers, and rendezvous points, and receives the information from those nodes.
- Subsequently, the central server receives the begin-sign cell of Phase II from the rendezvous point and records the cell type, circuit ID, and the timing of the begin-sign cell, denoted as Tb.
- Once the circuit is found, the central server compares the timing of the CELL DESTROY cell from the entry onion router by using the condition Tb < Td < Te. Eventually, the authors make the entire discovery process automatic.
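The timing check Tb < Td < Te can be sketched as a simple filter. The sketch assumes that Tb is the begin-sign time and Te the end-sign time reported by the rendezvous point, and that Td is the time at which a controlled entry onion router observes the CELL DESTROY cell; this reading of the symbols, the `DestroyEvent` record, and the function name are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DestroyEvent:
    """A CELL DESTROY observation reported by a controlled entry router."""
    router: str
    circuit_id: int
    t_d: float   # observation time in seconds


def correlate(t_b: float, t_e: float, events: list) -> list:
    """Return the destroy events whose time satisfies Tb < Td < Te."""
    return [event for event in events if t_b < event.t_d < t_e]
```

An empty result means no controlled entry router saw a destroy cell inside the window, so the hidden server most likely did not use one of them on that circuit.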
IV. ANALYSIS
- For their approach to be effective, the key issue is that one of their controlled entry onion routers should be selected as the entry onion router by the hidden server.
- The authors analyze the chance that the hidden server selects one of their onion routers as the entry router, and propose a strategy that can greatly increase this chance without incurring a large cost.
A. Catch Probability
- To evaluate the effectiveness of their discovery approach, the authors analyze the catch probability, i.e., the probability that a circuit from a hidden server selects one of their controlled onion routers as the entry onion router.
- To start with, the authors need to introduce how Tor determines a circuit among the many possible paths.
- To be specific, the onion routers are categorized into four classes: pure entry routers (entry guards), pure exit routers, both entry and exit routers (denoted as EE routers), and neither entry nor exit routers (denoted as N-EE routers).
- According to the Tor onion router selection algorithm, the bandwidth of each entry onion router is weighted.
- The authors intentionally denote the catch probability as P(k, b) to emphasize that the probability depends on the number of injected onion routers and the claimed bandwidth of each injected onion router.
B. How to Improve the Catch Probability
- Based on Equation (2), it is easy to prove the following claims; the proofs can be found in their technical report [10].
- The catch probability is determined by the aggregated bandwidth contributed by the controlled Tor entry routers.
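The dependence on aggregated bandwidth can be illustrated with a deliberately simplified model. The sketch below is not Equation (2) from the paper: it assumes that an entry router is chosen with probability proportional to its bandwidth and ignores the class-specific weights, so `catch_probability` and its `existing_bandwidth` parameter are hypothetical.

```python
def catch_probability(k: int, b: float, existing_bandwidth: float) -> float:
    """Simplified P(k, b): share of entry bandwidth held by injected routers.

    k: number of injected entry routers
    b: claimed bandwidth of each injected router
    existing_bandwidth: total bandwidth of all other entry-capable routers
    """
    if k < 0 or b < 0 or existing_bandwidth <= 0:
        raise ValueError("k and b must be non-negative, existing_bandwidth positive")
    injected = k * b
    return injected / (existing_bandwidth + injected)
```

Under this model, ten routers claiming 100 units each and two routers claiming 500 units each give the same probability, which is the sense in which only the aggregated bandwidth matters.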
V. EVALUATION
- The authors have implemented the proposed Tor hidden service discovery approach in Section III.
- The authors elaborate on the results of the empirical evaluation of the approach.
- The authors' experimental results match the theoretical analysis presented in Section IV well.
A. Experiment setup
- Figure 9 illustrates the experiment setup for protocol-level hidden service discovery over the real-world Tor.
- The authors deployed a Tor client, entry onion router, and central server at two different campuses in North America.
- To implement the discovery, the authors revised the source code of Tor at the Tor client side, rendezvous point, and entry router in order to establish the connections to their central server and send the related information.
- The version of Tor in their experiment is the latest stable version 0.2.2.37.
- In addition, at the client side, the authors implemented a web client that automatically sends an HTTP request to the specific onion address, and installed an HTTP proxy, i.e., Privoxy [12], to relay the HTTP request to the OP.
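A web client of this kind can be approximated with the Python standard library. The sketch assumes Privoxy is listening locally on 127.0.0.1:8118 and is already configured to forward requests to the onion proxy; the function names are hypothetical, and the onion address is passed in by the caller rather than assumed.

```python
import urllib.request

PRIVOXY_URL = "http://127.0.0.1:8118"   # assumed local Privoxy listener


def make_proxy_opener(proxy_url: str = PRIVOXY_URL) -> urllib.request.OpenerDirector:
    """Build an opener that sends HTTP requests through the given proxy."""
    handler = urllib.request.ProxyHandler({"http": proxy_url})
    return urllib.request.build_opener(handler)


def fetch(onion_url: str, timeout: float = 60.0) -> bytes:
    """Request a page from the hidden service via the proxy and return its body."""
    with make_proxy_opener().open(onion_url, timeout=timeout) as response:
        return response.read()
```

Calling `fetch` in a loop with the target onion URL would trigger a fresh rendezvous attempt per request, provided the onion proxy does not reuse an existing circuit.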
B. Experiment Results
- Table I gives the detection rate by their protocol-level hidden server discovery approach.
- The experiments were repeated 1000 times to derive the true positive rate, i.e., the probability that the hidden server is detected if it uses one of their entry routers.
- One experiment lasts for around 15 seconds.
- This figure demonstrates that the catch probability increases quickly with the number of controlled entry routers and the bandwidth of each entry router.
- The current Tor network has 814 pure entry routers and 273 EE routers.
VI. DISCUSSION
- There are various complicated cases in discovering a hidden server via their protocol-level discovery approach.
- Once the hidden server chooses one of their entry routers, it will use their entry router in the following 30-60 days and will be exposed by their discovery approach.
- Another complicated case is that the operator of a hidden server can choose three trusted entry guards or Tor bridges to avoid choosing the authors' surveillance entry guards.
- Tor bridges are a type of hidden onion router that is not listed publicly in the directory server.
- Recall that in Phase I of their original approach, if their controlled middle router is selected by the hidden server side circuit, the middle router will receive one CELL CREATE cell and three CELL RELAY cells.
- The two-step discovery approach above can also be used to tackle Complicated case I.
VII. RELATED WORK
- Øverlier and Syverson proposed packet-counting-based traffic analysis to identify the hidden server at the entry onion router.
- All those methods are based on traffic analysis, which may suffer a high rate of false positives due to various factors.
- The authors' approach differs substantially from the existing approaches.
- Existing traffic analysis attacks against anonymous communication can be largely categorized into two groups: passive traffic analysis and active watermarking techniques.
VIII. CONCLUSION
- The hidden service over Tor is a double-edged sword.
- A system that tracks down a hidden service can effectively deter malicious users from abusing the Tor network for illegal usage.
- The authors design, implement, and evaluate such a system.
- The authors' method augments the arsenal of existing detection tools, but it is unique in that the authors do not rely on conventional time-consuming traffic analysis or watermarking techniques.
- The authors also expect the debate over the detrimental sides of anonymous communication to continue in the long run, and hope to give law enforcement an option for tracking notorious hidden services.
Frequently Asked Questions (8)
Q1. What contributions have the authors mentioned in the paper "Protocol-level hidden server discovery" ?
In this paper, the authors propose a protocol-level hidden server discovery approach to locate the Tor hidden server that hosts the illegal website. The authors investigate the Tor hidden server protocol and develop a hidden server discovery system, which consists of a Tor client, a Tor rendezvous point, and several Tor entry onion routers. The authors manipulate Tor cells, the basic transmission unit over Tor, at the Tor rendezvous point to generate a protocol-level feature at the entry onion routers. The authors conduct extensive analysis and experiments to demonstrate the feasibility and effectiveness of their approach. Once their controlled entry onion routers detect such a feature, the authors can confirm the IP address of the hidden server.
Q2. What is the hidden server on the PlanetLab node?
The hidden server on the PlanetLab node is deployed on a Tor client, and an Apache server is installed as the HTTP server to offer the hidden service.
Q3. What are the two groups of traffic analysis techniques?
Existing traffic analysis attacks against anonymous communication can be largely categorized into two groups: passive traffic analysis and active watermarking techniques.
Q4. How many routers are there in the Tor network?
There are 3101 onion routers in the Tor network, including 814 pure entry routers and 776 pure exit routers, 273 EE routers and 1238 N-EE routers.
Q5. What is the hidden server's reaction to the mangled cell?
Since the hidden server cannot correctly decrypt the manipulated cell, it will destroy the circuit between the client and hidden server by sending a destroy cell to the client.
Q6. What is the procedure of establishing a circuit to introduction points?
Once the circuits to introduction points are established, the hidden server establishes a circuit to the directory server and advertises the service descriptor to the directory server, including the public key of the hidden server and the information regarding the introduction points.
Q7. What is the hidden server discovery process?
Phase I: Presumably identify the hidden server - the client continues to create circuits to the hidden server until one of their entry routers sees a special combination of cells of different types, i.e., protocol-level features.
Q8. What is the effect of the active watermarking techniques on traffic?
Such techniques can reduce the false positive rate significantly if the signal is long enough and does not require massive training study of traffic cross correlation as required in passive traffic analysis.