Blockchain CyberWarfare / ExoWarfare

Research: Deanonymization of apps on an Android mobile device running Tor

Tor is by far the most popular anonymous communication network, with over two million daily users from all parts of the globe. On desktops, Tor is mainly used to preserve anonymity while browsing the internet, although it can protect any form of TCP-based network traffic. On mobile devices, Tor is used to protect the identity of users and to shield them against profiling, whether it is attempted for marketing, government surveillance, or other purposes.

A recently published research paper aims to show that Tor is vulnerable to app deanonymization attacks on Android mobile devices by means of network traffic analysis. To this end, the authors propose a method for launching an attack that can deanonymize the Android apps running on a mobile device that uses Tor. The paper also presents a proof of concept of the method, which details how the attack can be launched and assesses its effectiveness. In this article, we give an overview of the deanonymization method presented in the paper and of the results of the experiments conducted by its authors.

The deanonymization attack model:

The threat model involves an adversary who aims to deanonymize the apps on a target Android smartphone that uses Tor. In other words, the adversary wants to identify which apps are running on the target device at any given time. The target device is assumed to be connected to the internet via a wireless network, either a Wi-Fi LAN or a cellular WAN, and the adversary is assumed to be capable of capturing the traffic between the access point and the target device. For example, the adversary might be the administrator of the Wi-Fi access point or of the base station to which the target device is connected. It is also assumed that the Tor client is installed on the Android device and that all of its apps’ internet traffic goes through the Tor client.

Figure (1) illustrates an overview of the deanonymization method. The method is based on the fact that different apps yield different patterns of network traffic, which are discernible using network traffic analysis techniques, even if the traffic is being anonymized via Tor.


Figure (1): An overview of the method for deanonymizing Android apps running behind Tor


The method implements network traffic analysis via machine learning and involves two separate phases:

1- Training phase:

This is the attack’s preparation phase, in which a machine learning model of the identifiable features of each app’s Tor traffic is constructed. For each app, raw traces of Tor traffic, such as PCAP files, are collected. Such traces can be obtained from public datasets or generated synthetically. The gathered traces are then processed to extract distinguishing features, which are fed into a training module based on a multi-class classifier.

2- Deanonymization phase:

This is the actual attack phase against the target device. The adversary monitors the device’s traffic and applies the trained machine learning model to identify which apps the victim is using. The phase is composed of two separate stages: a monitoring stage and a classification stage. The monitoring stage captures the target’s network traffic, while the classification stage processes the captured traces and outputs the resulting classification. For each trace, the classifier outputs the app to which it attributes the traffic, which deanonymizes the apps.
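
The two phases can be illustrated with a minimal Python sketch. It assumes each trace has already been reduced to a list of (timestamp, signed packet size) pairs. The five features, the nearest-centroid classifier, the app names, and the traces are all hypothetical stand-ins chosen to keep the example short; the article does not specify the paper's features or classifier.

```python
import math
import statistics
from collections import defaultdict


def extract_features(packets):
    """Turn one trace into a fixed-length feature vector.

    `packets` is a list of (timestamp_seconds, signed_size) tuples, where a
    positive size means device -> network and a negative size the reverse.
    This feature set is a hypothetical choice for illustration only.
    """
    if not packets:
        raise ValueError("trace contains no packets")
    sizes = [abs(size) for _, size in packets]
    times = [ts for ts, _ in packets]
    gaps = [b - a for a, b in zip(times, times[1:])] or [0.0]
    outgoing = sum(1 for _, size in packets if size > 0)
    return [
        statistics.mean(sizes),      # average packet size
        statistics.pstdev(sizes),    # spread of packet sizes
        statistics.mean(gaps),       # average inter-arrival time
        max(gaps),                   # longest idle gap
        outgoing / len(packets),     # share of outgoing packets
    ]


def train(labelled_traces):
    """Training phase: average the feature vectors of each app (centroids)."""
    grouped = defaultdict(list)
    for app, packets in labelled_traces:
        grouped[app].append(extract_features(packets))
    return {
        app: [statistics.mean(column) for column in zip(*vectors)]
        for app, vectors in grouped.items()
    }


def classify(model, packets):
    """Deanonymization phase: return the app whose centroid is closest."""
    features = extract_features(packets)
    return min(model, key=lambda app: math.dist(model[app], features))


# Made-up traces: a "streaming-like" app sends large packets back to back,
# a "browsing-like" app sends smaller packets with long pauses.
streaming = [(i * 0.01, -1400) for i in range(50)]
browsing = [(i * 2.0, 300 if i % 2 else -600) for i in range(10)]

model = train([("app_streaming", streaming), ("app_browsing", browsing)])
unknown = [(i * 0.012, -1380) for i in range(40)]
print(classify(model, unknown))  # prints "app_streaming"
```

The features here are left unscaled, so packet size dominates the distance. A realistic pipeline would normalize the features and would most likely use a stronger multi-class classifier.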

Proof of concept and practical implementation of the deanonymization method:

For the method’s proof of concept, the setup in figure (2) was built. The target Android smartphone connects to the internet through a wireless router that is connected to a workstation. The workstation is used to set up the router so that it can capture the target’s traffic, and to run the machine learning processes.


Figure (2): Proof-of-concept setup of the deanonymization method


A Xiaomi MiWiFi router was used. The experiments were conducted with two target smartphones: a Motorola Moto G running Android 6.0, and a Samsung Galaxy Nexus running LineageOS, a mobile operating system based on Android 6.0. Both devices ran Orbot, an Android proxy app that enables the use of Tor on smartphones. A standard PC was used as the setup’s workstation.

Tcpdump was used to sniff network traffic at the router level, while Wireshark was run on the workstation to perform network analysis. Machine learning was implemented with Scikit-Learn, a popular Python library that implements a large number of machine learning algorithms.
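
Because the traffic is protected by Tor, the analysis has to work from metadata such as packet timing and size. As a rough illustration of how such metadata can be read from a capture, the sketch below parses the classic pcap file layout (a 24-byte global header followed by 16-byte record headers) using only the Python standard library. It assumes a classic pcap file with microsecond timestamps, not pcapng. The function name `read_pcap_records` is hypothetical, and this is not the tooling used by the authors.

```python
import struct

PCAP_GLOBAL_HEADER_LEN = 24
PCAP_RECORD_HEADER_LEN = 16


def read_pcap_records(data):
    """Yield (timestamp_seconds, captured_length) for each packet record.

    `data` is the raw bytes of a classic pcap file (not pcapng) that uses
    microsecond timestamps. Packet payloads are skipped, not decoded.
    """
    if len(data) < PCAP_GLOBAL_HEADER_LEN:
        raise ValueError("too short to be a pcap file")
    magic = data[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # header fields stored little-endian
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # header fields stored big-endian
    else:
        raise ValueError("not a classic microsecond pcap file")
    offset = PCAP_GLOBAL_HEADER_LEN
    while offset + PCAP_RECORD_HEADER_LEN <= len(data):
        ts_sec, ts_usec, incl_len, _orig_len = struct.unpack_from(
            endian + "IIII", data, offset)
        yield ts_sec + ts_usec / 1_000_000, incl_len
        offset += PCAP_RECORD_HEADER_LEN + incl_len


# Build a tiny two-packet capture in memory instead of reading a real file.
header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0, 65535, 1)
packet_1 = struct.pack("<IIII", 10, 500000, 3, 3) + b"abc"
packet_2 = struct.pack("<IIII", 11, 0, 2, 2) + b"hi"
capture = header + packet_1 + packet_2

print(list(read_pcap_records(capture)))  # prints [(10.5, 3), (11.0, 2)]
```

The captured length includes link-layer headers, and working out packet direction would additionally require decoding the addresses in each packet, which this sketch leaves out.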

This setup was used in multiple experiments to evaluate the effectiveness of the proposed deanonymization method. The experiments showed that the method identifies the apps running on an Android mobile device using Tor with an accuracy of 97.3%. However, the accuracy varied from app to app. YouTube, Spotify, and the Tor Browser were the most difficult apps to identify, because the traffic patterns of these three apps are confused with one another. As both YouTube and Spotify stream media content, they generate relatively similar patterns of internet traffic. The same explanation applies to the Tor Browser, especially when it is used to visit websites with video streaming content.

The experiments also showed that the classifier of network traffic traces is misled mostly by three apps: Instagram, Facebook, and the Tor Browser. This is not surprising given the typical usage patterns of these apps. Instagram, Facebook, and the Tor Browser are associated with relatively long idle periods, i.e. the user’s “thinking or viewing time,” whereas apps that mainly involve streaming content have shorter and less frequent idle periods. The most easily identified app throughout the experiments was uTorrent, with an accuracy of 99%, while Skype could be identified with 98% accuracy. Replaio_radio could be identified with an accuracy of 95%.
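
Overall accuracy, per-app accuracy, and the confusions between apps can all be derived from pairs of true and predicted labels. The sketch below shows one way to compute them with the Python standard library. The labels are made up and do not reproduce the paper's data, and it is an assumption here that per-app accuracy means the fraction of an app's traces that were correctly identified.

```python
from collections import Counter


def confusion_counts(true_labels, predicted_labels):
    """Count how often each (true app, predicted app) pair occurs."""
    return Counter(zip(true_labels, predicted_labels))


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of all traces that were attributed to the right app."""
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


def per_app_recall(true_labels, predicted_labels):
    """Fraction of each app's traces that were correctly identified."""
    totals = Counter(true_labels)
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {app: hits[app] / totals[app] for app in totals}


# Made-up labels for ten traces: one YouTube trace is mistaken for Spotify
# and one Spotify trace is mistaken for YouTube.
truth = ["youtube"] * 4 + ["spotify"] * 4 + ["skype"] * 2
predicted = [
    "youtube", "youtube", "youtube", "spotify",
    "spotify", "spotify", "youtube", "spotify",
    "skype", "skype",
]

print(overall_accuracy(truth, predicted))  # prints 0.8
print(per_app_recall(truth, predicted))
print(confusion_counts(truth, predicted)[("youtube", "spotify")])  # prints 1
```

The off-diagonal entries of the confusion counts, such as ("youtube", "spotify"), are what reveal which apps are confused with one another.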

The authors of the paper made the software developed for the proof of concept, and all the datasets built during the experiments, publicly available, so that they can be used to evaluate Tor’s vulnerability to these forms of attack, to compare other potential methods, and to work on possible countermeasures.

Final thoughts:

Tor is relatively vulnerable to Android app deanonymization attacks, as demonstrated by the effectiveness of the method proposed in the paper we presented. An adversary who can capture the traffic of an Android smartphone running Tor can identify which apps are in use with high accuracy. Further research is needed to formulate countermeasures that can protect Android devices against app deanonymization.