J 2019

Big Data Sanitization and Cyber Situational Awareness: A Network Telescope Perspective

BOU-HARB, Elias, Martin HUSÁK, Mourad DEBBABI and Chadi ASSI

Basic information

Original name

Big Data Sanitization and Cyber Situational Awareness: A Network Telescope Perspective

Authors

BOU-HARB, Elias (124 Canada), Martin HUSÁK (203 Czech Republic, guarantor, belonging to the institution), Mourad DEBBABI (124 Canada) and Chadi ASSI (124 Canada)

Edition

IEEE Transactions on Big Data, IEEE, 2019, 2332-7790

Other information

Language

English

Type of outcome

Article in a professional journal

Field of Study

10201 Computer sciences, information science, bioinformatics

Country of publisher

United States of America

Confidentiality degree

is not subject to a state or trade secret

References:

RIV identification code

RIV/00216224:14610/19:00108740

Organization unit

Institute of Computer Science

UT WoS

000501301600003

Keywords in English

Darknet sanitization;Time series analytics;Security analytics;Cyber threat intelligence

Tags

International impact, Reviewed
Changed: 30/3/2023 16:15, Mgr. Alena Mokrá

Abstract

In the original

This paper addresses the problems of data sanitization and cyber situational awareness by analyzing 910 GB of real Internet-scale traffic, which has been passively collected by monitoring close to 16.5 million darknet IP addresses from a /8 and a /13 network telescope. First, the paper offers a novel probabilistic darknet preprocessing model, which aims at sanitizing darknet data to prepare it for effective use in the task of cyber threat intelligence generation. This model has been engineered using a distributed multithreaded approach, rendering it highly effective on darknet big data. Second, the paper presents an innovative approach to inferring large-scale orchestrated probing campaigns by leveraging darknet data, for Internet cyber situational awareness. The approach reduces the dimensionality of such big data by utilizing its artifacts instead of processing the actual raw data. This is accomplished by extracting and analyzing probing time series using formal methods rooted in Fourier transform and Kalman filtering. Thorough empirical evaluations validate the accuracy and the performance of the proposed methods. We assert that such approaches are of significant value, given their highly applicable nature to the field of Internet measurements for cyber security in the era of big data.