Megha Bose, supervised by Dr. Praveen Paruchuri, received her Master of Science – Dual Degree in Computer Science and Engineering (CSE). Here’s a summary of her research work on Learning Adaptive Strategies for Moving Target Defense in Uncertain and Dynamic Environments:
As cyber threats grow in complexity and sophistication, vulnerabilities are increasingly exploited due to expanding attack surfaces. Knowing all possible vulnerabilities in advance is impractical. Traditional defense techniques often fall short because attackers can conduct thorough reconnaissance on static systems and launch targeted attacks at their convenience. Moving Target Defense (MTD) offers a promising alternative by dynamically altering system configurations, making it harder for attackers to gain a stable understanding of the attack surface. However, implementing MTD can be costly: overly frequent configuration changes increase maintenance costs and degrade the quality of service, potentially negating the benefits. There is therefore a pressing need for cost-efficient MTD strategies that continuously balance robust defense against switching overhead. Many existing techniques also assume prior knowledge of the attacker’s intentions and payoffs, which limits their applicability in the real world, where such information is often not readily available.
This thesis proposes a novel MTD framework designed not only to thwart attacks but also to be cost-aware and to adapt continuously to the evolving attack landscape. We leverage the mathematical framework of the Factored Markov Decision Process (FMDP) to develop effective defense policies without assuming prior knowledge of attacker-side payoffs. Instead, we incorporate real-time attacker responses into the defender’s MDP using a dynamic Bayesian network. By employing an FMDP, we create a compact yet realistic representation of the system, which often consists of multiple changeable aspects that influence the attacker’s response and the rewards. We demonstrate theoretically that devising effective strategies against adaptive attackers in real-world scenarios presents inherent challenges, including a negative result on the regret bound. However, under certain assumptions, we show that the average regret of our method can be bounded.
We validate the efficacy of our method empirically in two domains: a web application and a network system. Our experiments cover various scenarios involving evolving and unknown attackers. The results highlight the potential of our approach to significantly improve defense outcomes and to broaden the range of complex, real-world scenarios that can be modelled. For comparison, we include a related co-authored work that addresses this problem using a multi-armed bandit (MAB) framework. That approach employs a “follow-the-perturbed-leader” style algorithm to learn effective MTD strategies.
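To give a rough sense of the “follow-the-perturbed-leader” idea mentioned above, here is a minimal Python sketch. It is not the algorithm from the thesis or the co-authored work. It assumes a simplified full-information setting in which the reward of every configuration is revealed after each round, whereas a bandit setting would observe only the chosen configuration's reward and would need reward estimates instead. It also ignores switching costs. The names `ftpl_choose`, `run_ftpl`, `reward_rows`, and `eta` are hypothetical and chosen only for illustration.

```python
import random


def ftpl_choose(cumulative_rewards, eta, rng):
    """Return the index with the highest perturbed cumulative reward.

    Each configuration's running total gets fresh exponential noise
    (rate eta, so mean 1/eta) before the leader is picked.
    """
    perturbed = [r + rng.expovariate(eta) for r in cumulative_rewards]
    return max(range(len(perturbed)), key=perturbed.__getitem__)


def run_ftpl(reward_rows, eta=0.5, seed=0):
    """Run follow-the-perturbed-leader over a sequence of reward vectors.

    reward_rows[t][i] is the (assumed fully observed) defender reward of
    configuration i in round t. Returns the chosen configurations and the
    total reward collected.
    """
    rng = random.Random(seed)
    n_configs = len(reward_rows[0])
    cumulative = [0.0] * n_configs
    choices = []
    total = 0.0
    for row in reward_rows:
        choice = ftpl_choose(cumulative, eta, rng)
        choices.append(choice)
        total += row[choice]
        # Full-information update: every configuration's reward is added.
        for i, reward in enumerate(row):
            cumulative[i] += reward
    return choices, total


if __name__ == "__main__":
    # Toy example: three configurations, the last one is always best.
    rows = [[0.0, 0.0, 1.0] for _ in range(200)]
    picks, collected = run_ftpl(rows)
    print("share of rounds on configuration 2:", picks.count(2) / len(picks))
    print("total reward:", collected)
```

The random perturbation keeps the defender's early choices unpredictable, while the growing cumulative totals make it settle on configurations that have performed well. A cost-aware variant would additionally have to penalize switching between configurations, which this sketch leaves out.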
February 2025