TorMalTraffic2019 Dataset
These are the datasets used in the research paper "Characterization and Classification of Malware Traffic over the Tor Network." See the conference paper for details on how the datasets were generated.
TorTraffic2019
The dataset contains the following traffic:
Web - http, https
Mail - gmail, uplb
Chat - hangouts, messenger, utox
Audio - soundcloud, streamsquid
Video - tedtalks, youtube
FTP - mmnt, rebex, wftserver
VoIP - messenger, mumble, utox
P2P - qBittorrent, deluge
Whonix - TLS
TorMal2019
The dataset contains the following traffic:
Web - http, https
Malware - dexter, kazy, locky, parite, wannacry
License
These datasets are publicly available to researchers. If you use them, please cite our research paper, which describes how the datasets were generated:
Marie Betel B. de Robles, Joseph Anthony C. Hermocilla, and Jaderick P. Pabico. Characterization and classification of malware traffic over the Tor network. In Proceedings of the 20th Philippine Computing Science Congress (PCSC 2020), ISSN: 1908-1146, pages 78--87, Philippines, 2020. Computing Society of the Philippines.
@inproceedings{derobles-pcsc2020-characterization,
author = {de Robles, Marie Betel B. and Hermocilla, Joseph Anthony C. and Pabico, Jaderick P.},
title = {Characterization and Classification of Malware Traffic over the Tor Network},
booktitle = {Proceedings of the 20th Philippine Computing Science Congress (PCSC 2020)},
year = 2020,
issn = {1908-1146},
month = mar,
publisher = {Computing Society of the Philippines},
address = {Philippines},
pages = {78--87}
}
Download
Download and verify the file (Linux):
Choose and download the dataset:
[TorTraffic2019](6.76 GB) [SHA][MD5]
[TorMal2019](575.3 MB) [SHA][MD5]
Open a terminal and change to the directory with the downloaded file.
cd <path_to_file>
Type the following commands:
md5sum --check <dataset_filename>.tar.xz.md5
sha256sum --check <dataset_filename>.tar.xz.sha256
When both commands report OK, the checksums match. If either check fails, the downloaded dataset file is corrupted or incomplete. Download it again to get a valid copy.
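The verification steps above can be wrapped in a small shell function that stops at the first failed check. This is only a sketch: the function name verify_dataset is hypothetical and not part of the dataset distribution. It assumes that it is run from the directory containing the archive and that the checksum files are named <archive>.md5 and <archive>.sha256, as in the commands above.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with the datasets.
# Verifies an archive against its .md5 and .sha256 checksum files.
# Assumption: run from the directory that holds the archive and both
# checksum files, named <archive>.md5 and <archive>.sha256.
verify_dataset() {
    archive="$1"

    # Make sure the archive and both checksum files are present.
    for f in "$archive" "$archive.md5" "$archive.sha256"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f" >&2
            return 2
        fi
    done

    # Each tool prints "<file>: OK" on success and exits non-zero on mismatch.
    md5sum --check "$archive.md5" || return 1
    sha256sum --check "$archive.sha256" || return 1

    echo "$archive: both checksums match"
}
```

For example, `verify_dataset <dataset_filename>.tar.xz` returns 0 only when both checks pass, so it can be chained with `&& tar -xJf <dataset_filename>.tar.xz` to unpack the archive only after a successful verification.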