Overview
A research testbed that couples multi-agent reinforcement learning with a realistic mobile ad-hoc network (MANET) simulator. Agents must learn both how to act and how to communicate — deciding what to share, with whom, and when, while the underlying network is lossy, mobile, and bandwidth-constrained. The framework is built to study emergent coordination, protocol adaptation, and robustness under adversarial conditions.
The Problem
Classical MARL benchmarks assume perfect, cost-free communication. Real tactical networks do not: links drop, bandwidth is scarce, topology changes as units move, and adversaries jam or spoof traffic. Policies trained on idealized channels collapse when deployed over real MANETs. This testbed closes that gap by making the network a first-class part of the environment.
My Role & Contribution
- Designed the coupling layer between the MARL environment and the network simulator
- Implemented baseline cooperative and competitive agent policies
- Built the experiment harness and evaluation metrics for communication efficiency and task success (metric sketch below)
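The harness rolls per-episode logs up into task-success and communication-efficiency summaries. The sketch below is a hypothetical minimal version: the `EpisodeLog` fields and the specific ratios (delivery ratio, reward per message sent) are illustrative assumptions, not the testbed's exact metric definitions.

```python
# Hypothetical harness metrics: task success rate plus two simple
# communication-efficiency ratios. Field names and formulas are
# illustrative assumptions, not the testbed's actual definitions.
from dataclasses import dataclass


@dataclass
class EpisodeLog:
    success: bool
    total_reward: float
    messages_sent: int
    messages_delivered: int


def summarize(logs):
    n = len(logs)
    sent = max(1, sum(e.messages_sent for e in logs))  # avoid divide-by-zero
    return {
        "success_rate": sum(e.success for e in logs) / n,
        # Fraction of sent messages the network actually delivered.
        "delivery_ratio": sum(e.messages_delivered for e in logs) / sent,
        # Task reward earned per message spent: a crude efficiency measure.
        "reward_per_message": sum(e.total_reward for e in logs) / sent,
    }


logs = [EpisodeLog(True, 12.0, 40, 31), EpisodeLog(False, 4.5, 55, 20)]
print(summarize(logs))
```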
Approach
- Wrap an ns-3-based MANET simulator as a Gymnasium/PettingZoo environment so standard MARL algorithms can train against it (environment-wrapper sketch after this list)
- Model agent messages as explicit actions: sending costs bandwidth, and dropped messages are not delivered (action-space sketch after this list)
- Train with centralized-training / decentralized-execution methods (MAPPO, QMIX) plus an independent-learning baseline (IPPO), using Stable-Baselines3 and custom PyTorch implementations (CTDE sketch after this list)
- Evaluate on tasks that require coordination under degraded links — patrol, coverage, convoy escort, contested search
- Ablate against idealized-channel baselines to quantify the policy gap induced by realistic networking
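A minimal sketch of the environment coupling, assuming PettingZoo's parallel API. `ManetSimBridge` is a hypothetical stand-in for the ns3-gym connection, and the observation/action shapes are placeholders; the real wrapper would marshal actions into ns-3 and read per-agent radio state back each step.

```python
# Sketch of coupling a MANET simulator to PettingZoo's parallel API.
# ManetSimBridge is a hypothetical stand-in for the ns3-gym side: it
# would forward actions into ns-3 and return per-agent observations.
import functools

import numpy as np
from gymnasium import spaces
from pettingzoo import ParallelEnv


class ManetMarlEnv(ParallelEnv):
    metadata = {"name": "manet_marl_v0"}

    def __init__(self, bridge, n_agents=4):
        self.bridge = bridge  # hypothetical connection to the ns-3 process
        self.possible_agents = [f"agent_{i}" for i in range(n_agents)]

    @functools.lru_cache(maxsize=None)
    def observation_space(self, agent):
        # Placeholder: local task state plus link quality to every peer.
        return spaces.Box(low=-1.0, high=1.0, shape=(32,), dtype=np.float32)

    @functools.lru_cache(maxsize=None)
    def action_space(self, agent):
        return spaces.Discrete(5)  # movement primitives only, in this sketch

    def reset(self, seed=None, options=None):
        self.agents = self.possible_agents[:]
        obs = self.bridge.reset(seed=seed)  # dict: agent name -> observation
        return obs, {a: {} for a in self.agents}

    def step(self, actions):
        # One RL step advances the network simulation by a fixed slice of
        # simulated time, so packet latency and loss affect this transition.
        obs, rewards, done = self.bridge.step(actions)
        terminations = {a: done for a in self.agents}
        truncations = {a: False for a in self.agents}
        if done:
            self.agents = []
        return obs, rewards, terminations, truncations, {a: {} for a in obs}
```

Stepping the simulator in fixed slices of simulated time is what makes network effects visible inside a single RL transition rather than between episodes.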
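The message-as-action idea can be made concrete with a composite action space: each agent picks a movement and, optionally, a recipient and payload. The spaces, the cost constant, and the Bernoulli drop model below are illustrative assumptions, not the testbed's exact values.

```python
# Illustrative message-as-action sketch; sizes and constants are assumed.
import numpy as np
from gymnasium import spaces

MSG_BITS = 8       # assumed payload size per message
SEND_COST = 0.05   # assumed per-send reward penalty (bandwidth cost)

# Each agent's action: a movement choice plus an optional message
# (whether to send, to which peer, and a small discrete payload).
action_space = spaces.Dict({
    "move": spaces.Discrete(5),
    "send": spaces.Discrete(2),        # 0 = stay silent
    "recipient": spaces.Discrete(4),   # peer index
    "payload": spaces.MultiBinary(MSG_BITS),
})


def deliver(msg, link_quality, rng):
    """Deliver msg over a lossy link, or drop it.

    link_quality in [0, 1] would come from the network simulator;
    here it is just a number. A dropped message vanishes: the recipient
    observes nothing and the sender gets no acknowledgement.
    """
    return msg if rng.random() < link_quality else None


def comm_penalty(action):
    # Sending always pays the bandwidth cost, delivered or not.
    return SEND_COST if action["send"] == 1 else 0.0


rng = np.random.default_rng(0)
print(deliver({"to": 2, "payload": [1, 0, 1]}, link_quality=0.6, rng=rng))
```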
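For the CTDE methods, the structural point is that actors condition only on local observations at execution time, while a centralized critic sees the joint observation during training. A bare-bones PyTorch sketch of that split (dimensions and network sizes are placeholders):

```python
# Minimal CTDE sketch: decentralized actors on local observations,
# one centralized critic on the joint observation (MAPPO-style).
# All dimensions below are placeholder assumptions.
import torch
import torch.nn as nn

OBS_DIM, N_AGENTS, N_ACTIONS = 32, 4, 5


class Actor(nn.Module):
    """Executed on each agent at run time; sees only its local obs."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 64), nn.Tanh(), nn.Linear(64, N_ACTIONS)
        )

    def forward(self, local_obs):  # (batch, OBS_DIM)
        return torch.distributions.Categorical(logits=self.net(local_obs))


class CentralCritic(nn.Module):
    """Used only during training; sees the concatenated joint obs."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM * N_AGENTS, 128), nn.Tanh(), nn.Linear(128, 1)
        )

    def forward(self, joint_obs):  # (batch, OBS_DIM * N_AGENTS)
        return self.net(joint_obs).squeeze(-1)


# Shared-parameter actor: one network, applied to each agent's local input.
actor, critic = Actor(), CentralCritic()
obs = torch.randn(N_AGENTS, OBS_DIM)          # one local obs per agent
actions = actor(obs).sample()                 # decentralized execution
value = critic(obs.flatten().unsqueeze(0))    # centralized value estimate
```

Parameter sharing across actors, as here, is one common MAPPO-style choice; QMIX instead mixes per-agent value functions, which this sketch does not show.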
Tech Stack
Python
PyTorch
PettingZoo
Gymnasium
Stable-Baselines3
NumPy
ns-3 / ns3-gym
Matplotlib
Results & Impact
- Reproducible training and evaluation of MARL policies over realistic MANET conditions
- Open-source release so other researchers can build on the testbed