Using machine learning to speed up attack attribution on the Internet of Vehicles ecosystem

Authors: George Raptis, Christina Katsini

Edited by: Christos Alexakos

Internet of Vehicles (IoV) ecosystems attract a wide and diverse range of threats, such as jamming, eavesdropping, and interference, which could affect system stability, robustness, security, and privacy [1]. Countering these threats is critical, because such vulnerabilities could lead to ineffective services and even to accidents that endanger quality of life and human safety.

Besides developing defense mechanisms to detect attacks and apply response or mitigation strategies, performing a successful attack attribution is crucial for an organization. But what is attack attribution? Put simply, attack attribution is the process that tells us who performed an attack, how they did it, and why.

Attack attribution is a challenging task that takes time and requires patience. In some cases, it requires years of extensive forensic investigation [2]. These investigations are not automatic; they typically require the involvement of security experts, who collect and analyze digital forensic evidence and historical data and assess the possible incentives and motives, taking into consideration the uniqueness of the case they investigate. The accuracy and confidence of the attack attribution depend on the available evidence and the complexity of the system under attack. Because IoV systems are complex, emerging systems with limited available data, attack attribution is even harder.

To overcome such challenges, we can adopt machine learning approaches to provide the security experts with tools that speed up the attack attribution process. By adopting machine learning techniques, we can quickly extract and analyze pieces of code within malicious files automatically. Then, the extracted pieces are compared with previously investigated samples from advanced persistent threats. The level of similarity could lead to identifying the threat actors behind the attack. Hence, machine learning can quickly link the new attack with known threats, actors, and campaigns.
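The comparison step can be illustrated with a minimal sketch. It assumes that a "piece of code" is approximated by overlapping byte n-grams and that similarity is measured with the Jaccard index; both choices, along with the function names and the sample family labels, are our own illustrative assumptions and not the method of any specific tool.

```python
from __future__ import annotations


def byte_ngrams(data: bytes, n: int = 4) -> set[bytes]:
    """Return the set of overlapping n-byte sequences found in data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def jaccard(a: set, b: set) -> float:
    """Jaccard index: shared items divided by all distinct items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_known_samples(new_sample: bytes,
                       known: dict[str, bytes],
                       n: int = 4) -> list[tuple[str, float]]:
    """Rank previously investigated samples by similarity to a new one.

    `known` maps a hypothetical family label to raw sample bytes.
    """
    new_grams = byte_ngrams(new_sample, n)
    scores = [(label, jaccard(new_grams, byte_ngrams(blob, n)))
              for label, blob in known.items()]
    # Highest similarity first: the top entries are the candidate links.
    return sorted(scores, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Made-up byte strings standing in for code extracted from samples.
    known = {
        "family_a": b"\x55\x8b\xec\x83\xec\x10\x53\x56\x57",
        "family_b": b"\x90\x90\xcc\xcc\x31\xc0\xc3\x00\x01",
    }
    new_sample = b"\x55\x8b\xec\x83\xec\x10\x53\x56\x00"
    for label, score in rank_known_samples(new_sample, known):
        print(f"{label}: {score:.2f}")
```

A high score only suggests shared code; the security expert still has to judge whether the overlap reflects a common author, a shared library, or deliberate imitation.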

An example of such an attack attribution process is the discovery of the Lamberts by Kaspersky [3]. The Lamberts are a collection of attack tools used against high-profile victims since 2008. In October 2014, the Lambert family malware was disclosed publicly. Over the following years, several related malware samples were discovered, which shared code, coding style, data formats, command and control servers, victims, etc. By putting all the pieces of information together, the attack attribution process led to a common family of attack tools: the Lamberts family. Having all this information in one place, accessible, processed, and analyzed, helps us use it preventatively.

Another direction is the profiling of threat actors based on attack patterns extracted from threat intelligence reports by following machine learning and natural language processing approaches [4]. When analyzing an attack, the extracted information is typically recorded in threat intelligence reports. This information varies, as it could be about the characteristics of the detected attacks, the analysis methods, the adversary tactics and techniques, the forensic investigation, etc. Although a vast amount of this information exists, it is scattered across various sources, such as the MITRE ATT&CK framework [1]. However, attacks can be linked with each other (e.g., they may have common characteristics, share adversary tactics, or have been analyzed and confronted with similar tools). Therefore, it is important to equip security experts with tools that analyze the threat intelligence reports and extract common patterns quickly and automatically. We can use cluster analysis techniques to discover similarities and differences between the collected reports and thus come closer to identifying or excluding known threat actors.
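Before reports can be compared, the attack patterns they describe have to be pulled out of free text. As a minimal sketch, assuming that reports mention techniques by their MITRE ATT&CK identifiers (a "T" followed by four digits, optionally with a three-digit sub-technique suffix), a regular expression can collect them. The function name and the sample report text are hypothetical, and the machine learning and natural language processing approaches referenced in [4] go well beyond this kind of pattern matching.

```python
from __future__ import annotations

import re

# ATT&CK technique IDs look like T1234, sub-techniques like T1234.001.
TECHNIQUE_ID = re.compile(r"\bT\d{4}(?:\.\d{3})?\b")


def extract_technique_ids(report_text: str) -> set[str]:
    """Return the distinct technique identifiers mentioned in a report."""
    return set(TECHNIQUE_ID.findall(report_text))


if __name__ == "__main__":
    # Hypothetical report excerpt, written only for this illustration.
    report = ("Initial access was gained via T1566.001; the actor then "
              "relied on T1059 for execution and used T1059 again later.")
    print(sorted(extract_technique_ids(report)))
```

The resulting sets of identifiers give each report a compact, comparable description that similarity measures and clustering can work on.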

In nIoVe, we adopt machine learning techniques to provide the security experts with a family of tools to link a detected attack with reported ones. We use a series of clustering algorithms to find similarities between collected threat intelligence reports recorded in the nIoVe repository. These algorithms create groups of similar reports and indicate threats that belong to the same family. For example, they might target the same IoV component types (e.g., devices, sensors), share similar attack techniques, etc. Based on the clustered information, the security experts can directly compare the new evidence to existing knowledge resulting from past events, evaluate tactics, techniques, and procedures of known threat actors, determine how confident they are about their judgments, and consider alternative scenarios to trace malicious operations back to their sources and, thus, assess attack attribution.
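To make the idea of grouping similar reports concrete, here is a minimal sketch. It is not the nIoVe implementation: it assumes each report has already been reduced to a set of attribute tags (targeted component types, attack techniques), measures similarity with the Jaccard index, and groups reports with single-linkage clustering at a fixed threshold. The report identifiers, tags, and threshold are hypothetical.

```python
from __future__ import annotations

from itertools import combinations


def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard index between two tag sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_reports(reports: dict[str, set[str]],
                    threshold: float = 0.5) -> list[set[str]]:
    """Single-linkage clustering: two reports end up in the same group
    if a chain of pairwise similarities >= threshold connects them."""
    parent = {rid: rid for rid in reports}

    def find(x: str) -> str:
        # Follow parent links to the group representative.
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in combinations(reports, 2):
        if jaccard(reports[a], reports[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups: dict[str, set[str]] = {}
    for rid in reports:
        groups.setdefault(find(rid), set()).add(rid)
    return list(groups.values())


if __name__ == "__main__":
    reports = {
        "R1": {"target:sensor", "technique:jamming", "technique:spoofing"},
        "R2": {"target:sensor", "technique:jamming"},
        "R3": {"target:backend", "technique:phishing"},
        "R4": {"target:backend", "technique:phishing",
               "technique:credential-theft"},
    }
    for group in cluster_reports(reports):
        print(sorted(group))
```

With these made-up reports, R1 and R2 fall into one group and R3 and R4 into another. A newly collected report can be added to the dictionary and the clustering rerun to see which family, if any, it joins.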


Hence, through machine learning, it is possible to conduct a more precise post-incident forensic investigation and accelerate the attack attribution process. As a result, we are able not only to confront the detected attack and limit its impact but also to deter similar attacks in the future, keeping the IoV a safe place.


[1] Zacharaki, I. Paliokas, K. Votis, C. Alexakos, D. Serpanos, and D. Tzovaras, Complex Engineering Systems as an Enabler for Security in Internet of Vehicles: The nIoVe Approach, in 2019 First International Conference on Societal Automation (SA). IEEE, 2019, pp. 1–8. [Online]. Available:

[2] C. Alexakos, C. Katsini, K. Votis, A. Lalas, D. Tzovaras, and D. Serpanos, Enabling Digital Forensics Readiness for Internet of Vehicles. Transportation Research Procedia, vol. 52, pp. 339–346, 2021, 23rd EURO Working Group on Transportation Meeting, EWGT2020, 16-18 September 2020, Paphos, Cyprus. [Online]. Available:

[3] Namestnikov, Y. Attribution in a World of Cyberespionage. Industrial Cybersecurity: Opportunities and challenges in Digital Transformation, 19-21 September 2018, Sochi, Russia. [Online]. Available:

[4] Noor, U., Anwar, Z., Amjad, T., & Choo, K.-K. R. (2019). A Machine Learning-based FinTech Cyber Threat Attribution Framework using High-level Indicators of Compromise. Future Generation Computer Systems, 96, 227–242.
