Mission Autonomy for Underwater Mine Countermeasures

Autonomous underwater navigation, object classification and mapping in a cluttered harbor environment using a lightweight world model.

Mission autonomy | AI-driven perception | Semantic mapping | World model & AI-driven decision making

Simulated harbor environment used for autonomous mine countermeasure experiments.

Motivation

Mine countermeasure (MCM) operations are among the most challenging and resource-intensive tasks in the maritime domain. Autonomous underwater vehicles (AUVs) have the potential to significantly improve the safety and efficiency of such missions by reducing human exposure to hazardous environments while increasing operational endurance.

The objective of this project is to investigate the application of modern artificial intelligence techniques to underwater mission autonomy within a simulated harbor environment. A BlueROV2 autonomous underwater robot is tasked with approaching a harbor from open water, detecting, classifying, and mapping mine-like objects, building situational awareness of the environment, and identifying safe navigation corridors toward the objective with the support of a world model.

The project combines several emerging technologies, including synthetic perception, object detection, semantic mapping, lightweight world models, and autonomous decision-making. Particular focus is placed on the use of compact AI world models capable of maintaining an internal representation of the operational environment and predicting future actions in GPS-denied underwater conditions.

Through the development of a digital twin and simulation-based experimentation, the project aims to explore how next-generation maritime autonomous systems could support future mine countermeasure, harbor inspection, and maritime security operations.

Mission Scenario

The vehicle is deployed from open water and tasked with approaching a protected harbor area. During transit, the system must detect mine-like objects, maintain situational awareness, and identify a safe route through the harbor entrance while operating in a GPS-denied underwater environment.

Open sea
Mine-like object
Debris
Quay wall
Harbor entrance
Safe navigation corridor

Harbor layout in Gazebo with explanations

A possible mission scenario: the vehicle approaches from open water and searches for a safe navigation corridor to the harbor entrance.
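The corridor search in this scenario can be illustrated with a minimal sketch. It assumes the harbor is discretized into a 2D grid, that hazards are inflated by a stand-off margin measured in cells, and that a breadth-first search finds a shortest 4-connected path. The grid representation, the function names, and the choice of breadth-first search are illustrative assumptions, not the planner used in the project.

```python
from collections import deque


def inflate_hazards(hazards, standoff, width, height):
    """Return all grid cells within `standoff` cells (Chebyshev distance) of a hazard."""
    blocked = set()
    for hx, hy in hazards:
        for x in range(hx - standoff, hx + standoff + 1):
            for y in range(hy - standoff, hy + standoff + 1):
                if 0 <= x < width and 0 <= y < height:
                    blocked.add((x, y))
    return blocked


def find_corridor(start, goal, walls, hazards, standoff, width, height):
    """Breadth-first search for a shortest path that avoids walls and inflated hazards."""
    blocked = set(walls) | inflate_hazards(hazards, standoff, width, height)
    if start in blocked or goal in blocked:
        return None
    parents = {start: None}
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk back through the parent links to recover the path.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parents[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and nxt not in blocked and nxt not in parents:
                parents[nxt] = cell
                queue.append(nxt)
    return None  # no safe corridor exists on this grid


# Hypothetical 7 x 5 grid with one mine-like object in the middle of the approach.
path = find_corridor(start=(0, 2), goal=(6, 2), walls=[], hazards=[(3, 2)],
                     standoff=1, width=7, height=5)
print(path)
```

Returning `None` when no path exists lets a higher layer decide how to react, for example by widening the search area or reporting back to the operator.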

Research Objectives

The project investigates how modern AI techniques can be applied to maritime mission autonomy in GPS-denied underwater environments. Particular focus is placed on situational awareness, hazard detection, semantic mapping, and decision-making under uncertainty.

Can a compact world model reduce the need for extensive hand-crafted mission logic?
Can an autonomous underwater vehicle learn harbor approach and navigation behaviors from simulated experience?
How efficiently can mine-like objects and other hazards be detected and incorporated into a semantic map during exploration?
Can future mission states and safe navigation corridors be predicted before executing a maneuver?
What information must be retained to maintain situational awareness in a dynamic maritime environment?
How can natural-language mission objectives be transformed into autonomous underwater behaviors?
How can mission progress be communicated to the operator in real time?
Can a world model act as an information compression mechanism for underwater autonomy, transmitting only mission-relevant knowledge instead of large volumes of raw sensor data?

Project Architecture

The autonomy stack is designed around mission-level decision-making rather than direct teleoperation. A natural-language mission interface defines the operator's intent, while the perception layer builds situational awareness from simulated sensor inputs. A lightweight world model maintains an internal representation of the environment and supports action prediction for autonomous navigation through a potentially hazardous harbor approach. The mission decision layer evaluates potential courses of action and selects behaviors that best satisfy mission objectives while minimizing operational risk. Selected actions are executed through the vehicle control layer, while a communication layer continuously relays mission-critical information, detected hazards, and mission progress to the operator via low-bandwidth underwater communication links.

Architecture diagram. Current implementation: digital twin, mission scenario, preliminary perception pipeline, world model design. In development: semantic mapping, world model training, autonomous decision layer.

Mission Intent Layer

The mission intent layer provides a high-level interface between the operator and the autonomous system. Mission objectives can be specified using natural language and are translated into structured mission goals that can be executed by the autonomy stack. Examples include harbor reconnaissance, mine countermeasure surveys, infrastructure inspection, and autonomous navigation tasks.

Inputs:

Natural language mission descriptions (e.g. "Investigate sector 5.")
Operator mission objectives (e.g. "Map mine-like objects and identify a safe navigation corridor into the harbor.")
Mission constraints (e.g. "Avoid entering restricted zones and maintain a minimum distance from detected hazards.")

Outputs:

Structured mission goals (e.g. Survey designated search sectors and generate a hazard map.)
Navigation objectives (e.g. Reach the harbor entrance while maintaining a safe stand-off distance from detected threats.)
Mission priorities (e.g. Threat detection and classification take precedence over minimizing transit time.)
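One way to picture the structured output of this layer is as a small typed record. The sketch below is only an illustration: the class names, field names, target identifiers, and the 10 m stand-off value are hypothetical, and the natural-language translation step itself is not shown.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class GoalType(Enum):
    SURVEY = "survey"      # e.g. survey search sectors and build a hazard map
    NAVIGATE = "navigate"  # e.g. reach the harbor entrance


@dataclass
class MissionGoal:
    goal_type: GoalType
    target: str                             # hypothetical sector or waypoint name
    priority: int                           # lower number = higher priority
    min_standoff_m: Optional[float] = None  # hypothetical stand-off constraint


@dataclass
class MissionSpec:
    source_text: str                        # the operator's original instruction
    goals: List[MissionGoal] = field(default_factory=list)
    restricted_zones: List[str] = field(default_factory=list)

    def ordered_goals(self) -> List[MissionGoal]:
        """Return the goals sorted so that higher-priority goals come first."""
        return sorted(self.goals, key=lambda g: g.priority)


spec = MissionSpec(
    source_text="Map mine-like objects and identify a safe navigation corridor into the harbor.",
    goals=[
        MissionGoal(GoalType.NAVIGATE, "harbor_entrance", priority=2, min_standoff_m=10.0),
        MissionGoal(GoalType.SURVEY, "sector_5", priority=1),
    ],
    restricted_zones=["zone_a"],
)

for goal in spec.ordered_goals():
    print(goal.priority, goal.goal_type.value, goal.target)
```

Giving the survey goal the higher priority mirrors the example priority above, where threat detection and classification take precedence over transit time.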

Perception Layer

The perception layer provides the vehicle's understanding of the surrounding environment. Sensor observations are fused to detect mine-like objects, identify harbor structures, and estimate free navigable space. The generated environmental representation is subsequently used by the world model to maintain situational awareness and support mission-level decision-making.

Inputs:

Vehicle pose
Sonar data
Camera image
Harbor element observations

Outputs:

Semantic environment representation
Mine-like object detections
Harbor structure detections
Navigable regions
Environmental observations for the world model

A first YOLO26s object-detection model was trained on a synthetic underwater dataset generated from a Gazebo maritime scene. The dataset includes visual variation in lighting, fog density, water color, and image blur to approximate different underwater visibility conditions.

Validation results:

Metric	Value
Precision	0.898
Recall	0.904
mAP50	0.955
mAP50-95	0.791
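For reference, precision and recall can be combined into a single F1 score (their harmonic mean). The value computed below is derived here from the two table entries and is not one of the reported validation metrics.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


f1 = f1_score(0.898, 0.904)
print(round(f1, 3))  # 0.901
```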

The strongest current classes are mine and mine_anchor. Other classes, including barrel, tire, stone, crate, and mine_chain, are underrepresented and will be expanded in the next dataset version.

These first results were obtained with a synthetic dataset of 288 images (70:20:10 train/validation/test split) covering varying ambient lighting, water color, fog density, and blur conditions. The model was trained for 150 epochs.
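A 70:20:10 split of 288 images does not divide evenly, so the exact per-split counts depend on the rounding rule. The sketch below assumes a seeded shuffle with floor rounding for the training and validation sets and the remainder assigned to the test set. The file names and the rounding rule are assumptions; the counts actually used in the project are not stated.

```python
import random


def split_dataset(items, ratios=(0.7, 0.2, 0.1), seed=0):
    """Shuffle reproducibly, then split into train/validation/test subsets."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = int(n * ratios[0])  # floor rounding
    n_val = int(n * ratios[1])    # floor rounding
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test


# Hypothetical file names standing in for the 288 synthetic images.
names = [f"img_{i:04d}.png" for i in range(288)]
train, val, test = split_dataset(names)
print(len(train), len(val), len(test))  # 201 57 30
```

Fixing the seed keeps the split reproducible between training runs, which matters when comparing dataset versions.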

Source code: GitHub repository

Semantic Mapping Layer

The semantic mapping layer maintains a structured representation of the operational environment by combining observations collected over the course of the mission. Detected objects (mine-like objects, harbor structures, navigable regions) are stored in a continuously updated mission map. This representation provides the foundation for situational awareness and future decision-making.

Inputs:

Mine-like object detections
Harbor structure detections
Navigable region estimates
Vehicle pose estimates
Environmental observations

Outputs:

Semantic environment map
Hazard locations
Harbor structure map
Navigable corridor representation
World model observations
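Since this layer is still in development, the following is only a sketch of one possible map update rule. It assumes detections have already been projected into a common world frame using the vehicle pose, and it merges repeated detections of the same class that fall within a fixed radius by keeping a running mean of their positions. The class names, the merge radius, and the merging rule are illustrative assumptions.

```python
import math
from dataclasses import dataclass


@dataclass
class MapObject:
    label: str
    x: float
    y: float
    hits: int = 1  # number of detections merged into this object


class SemanticMap:
    def __init__(self, merge_radius_m: float = 2.0):
        self.merge_radius_m = merge_radius_m
        self.objects = []

    def add_detection(self, label: str, x: float, y: float) -> MapObject:
        """Merge into the nearest same-label object within the radius, else add a new one."""
        best, best_d = None, self.merge_radius_m
        for obj in self.objects:
            d = math.hypot(obj.x - x, obj.y - y)
            if obj.label == label and d <= best_d:
                best, best_d = obj, d
        if best is None:
            best = MapObject(label, x, y)
            self.objects.append(best)
            return best
        # Incremental running mean of the object position.
        best.hits += 1
        best.x += (x - best.x) / best.hits
        best.y += (y - best.y) / best.hits
        return best

    def hazards(self, labels=("mine",)):
        """Return the mapped objects whose label counts as a hazard."""
        return [obj for obj in self.objects if obj.label in labels]


smap = SemanticMap(merge_radius_m=2.0)
smap.add_detection("mine", 10.0, 5.0)
smap.add_detection("mine", 11.0, 5.0)    # merged with the first mine
smap.add_detection("barrel", 10.5, 5.0)  # different class, kept separate
print([(o.label, o.x, o.y, o.hits) for o in smap.objects])
```

Tracking a hit count per object gives later layers a simple confidence proxy, for example to transmit only objects that have been seen more than once.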

World Model

The world model retains information from the semantic representation and predicts future mission states based on current observations and historical information. By compressing relevant environmental information into a compact latent representation, the model aims to support long-horizon planning and autonomous decision-making in uncertain underwater environments.

Inputs:

Semantic environment map
Vehicle state
Mission objectives
Historical observations

Outputs:

Internal environment representation
Predicted future states
Estimated mission progress
Navigation recommendations
Decision-support information

A recurrent neural network (GRU) was developed to learn the temporal dynamics of an autonomous underwater mission directly from structured mission logs. Each input sequence consists of the recent history of the vehicle state, detected objects, mission context, and executed actions. The network compresses this history into a latent representation and predicts the next world state using multiple output heads for continuous state variables (vehicle motion and object positions), binary events (object visibility), and discrete mission variables (contact type, decision, action, and mission state). The current implementation serves as a proof of concept and provides the foundation for future training on real mission data generated by the ROS2/Gazebo simulation environment and, ultimately, field deployments.

Architecture diagram of the GRU network used in the world model.

Current status: Proof of concept trained on synthetic data from the simulator. The architecture is ready for training on real mission data.
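To make the recurrent update and the multi-head output concrete, the sketch below implements a single GRU cell and three output heads in plain Python. It is an untrained toy with placeholder weights, not the project's network: the feature layout, layer sizes, head names, and the blending convention h' = (1 - z) * h + z * n are assumptions, and an actual implementation would normally use a deep learning framework.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def affine(W, v, b):
    return [a + c for a, c in zip(matvec(W, v), b)]


def rand_matrix(rows, cols, scale, rng):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def init_params(n_in, n_hid, head_sizes, scale=0.1, seed=0):
    """Untrained placeholder weights; head_sizes maps a head name to its output size."""
    rng = random.Random(seed)
    p = {}
    for gate in ("z", "r", "n"):
        p["W" + gate] = rand_matrix(n_hid, n_in, scale, rng)
        p["U" + gate] = rand_matrix(n_hid, n_hid, scale, rng)
        p["b" + gate] = [0.0] * n_hid
    for name, size in head_sizes.items():
        p["head_" + name] = (rand_matrix(size, n_hid, scale, rng), [0.0] * size)
    return p


def gru_step(x, h, p):
    """One GRU update: update gate z, reset gate r, candidate state n."""
    z = [sigmoid(a + b) for a, b in zip(affine(p["Wz"], x, p["bz"]), matvec(p["Uz"], h))]
    r = [sigmoid(a + b) for a, b in zip(affine(p["Wr"], x, p["br"]), matvec(p["Ur"], h))]
    rh = [ri * hi for ri, hi in zip(r, h)]
    n = [math.tanh(a + b) for a, b in zip(affine(p["Wn"], x, p["bn"]), matvec(p["Un"], rh))]
    return [(1.0 - zi) * hi + zi * ni for zi, hi, ni in zip(z, h, n)]


def softmax(v):
    m = max(v)
    e = [math.exp(a - m) for a in v]
    s = sum(e)
    return [a / s for a in e]


def predict_next(sequence, p, h0=None):
    """Compress a history of feature vectors into a latent state, then apply the heads."""
    h = list(h0) if h0 is not None else [0.0] * len(p["bz"])
    for x in sequence:
        h = gru_step(x, h, p)
    return {
        "latent": h,
        "state": affine(*p["head_state"][:1], h, p["head_state"][1]),           # continuous
        "visible": [sigmoid(v) for v in affine(*p["head_visible"][:1], h, p["head_visible"][1])],  # binary
        "action": softmax(affine(*p["head_action"][:1], h, p["head_action"][1])),  # discrete
    }


# Hypothetical features per time step: x, y, heading, object_visible.
history = [[0.0, 0.0, 0.1, 0.0], [0.5, 0.0, 0.1, 1.0], [1.0, 0.1, 0.2, 1.0]]
params = init_params(n_in=4, n_hid=8, head_sizes={"state": 3, "visible": 1, "action": 3})
pred = predict_next(history, params)
print(pred["state"], pred["visible"], pred["action"])
```

Using a separate head per variable type lets each output be trained with a matching loss, for example a regression loss for the continuous state and cross-entropy losses for the binary and discrete heads.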

Mission Decision Layer

The mission decision layer determines the most appropriate course of action by evaluating the current mission objectives, environmental observations, and predictions generated by the world model. The objective is to maximize mission success while minimizing operational risk and avoiding potential hazards.

Inputs:

Mission objectives
World model predictions
Semantic map
Vehicle state
Mission constraints

Outputs:

High-level navigation actions
Inspection commands
Hazard avoidance maneuvers
Mission execution decisions
Vehicle control objectives
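The trade-off between mission success and operational risk can be sketched as a simple scoring rule over candidate actions. This layer is still planned, so the candidate names, the predicted progress and risk values, the risk weight, and the hard risk limit below are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    progress: float  # predicted mission progress gain, in [0, 1]
    risk: float      # predicted operational risk, in [0, 1]


def select_action(candidates, risk_weight=2.0, max_risk=0.8):
    """Pick the feasible candidate with the highest progress minus weighted risk."""
    feasible = [c for c in candidates if c.risk <= max_risk]
    if not feasible:
        return None  # nothing acceptable: the caller can hold position or ask the operator
    return max(feasible, key=lambda c: c.progress - risk_weight * c.risk)


candidates = [
    Candidate("direct_transit", progress=0.9, risk=0.6),
    Candidate("detour", progress=0.6, risk=0.1),
    Candidate("inspect_contact", progress=0.3, risk=0.2),
]
best = select_action(candidates)
print(best.name)  # detour
```

A risk weight above 1 encodes the priority that hazard avoidance outweighs transit time; in a fuller system the progress and risk values would come from the world model predictions.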

Vehicle Control Layer

The vehicle control layer converts mission-level decisions into executable commands for the autonomous underwater vehicle. This layer is responsible for maneuvering the vehicle, controlling additional devices (e.g. cameras, sonars, or inspection payloads), following planned trajectories, and maintaining stable operation within the underwater environment.

Inputs:

Navigation objectives
Desired vehicle actions
Desired additional device actions
Mission execution commands

Outputs:

Thruster commands
Device commands
Vehicle motion control
Platform status information
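As a small illustration of turning a navigation objective into actuator commands, the sketch below applies a proportional heading controller and mixes the result into two normalized thruster commands. It assumes a generic differential two-thruster layout and a counterclockwise-positive yaw convention, which is a simplification and not the BlueROV2's actual thruster configuration. The gain and the function names are hypothetical.

```python
import math


def wrap_angle(angle):
    """Wrap an angle in radians to the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def clamp(value, lo=-1.0, hi=1.0):
    return max(lo, min(hi, value))


def heading_control(desired_yaw, current_yaw, surge=0.5, kp=1.0):
    """Return normalized (left, right) thruster commands in [-1, 1]."""
    # Wrapping the error makes the vehicle turn the short way around.
    yaw_cmd = clamp(kp * wrap_angle(desired_yaw - current_yaw))
    left = clamp(surge - yaw_cmd)
    right = clamp(surge + yaw_cmd)
    return left, right


print(heading_control(desired_yaw=0.2, current_yaw=0.0))
print(heading_control(desired_yaw=3.0, current_yaw=-3.0))
```

Clamping both the yaw command and the mixed outputs keeps the commands within the normalized actuator range even for large heading errors.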

Communication Layer

The communication layer addresses one of the basic challenges of autonomous underwater operations: the reliable transfer of mission information in an environment with no GPS coverage and constrained communication. Unlike surface or aerial systems, underwater vehicles cannot rely on conventional radio communication, cellular networks, or GPS. Instead, mission information is typically exchanged using acoustic communication systems.

Modern autonomous underwater vehicles use acoustic modems, which transmit low-bandwidth messages through the water using sound waves. While unsuitable for continuous video streaming, acoustic communication is sufficient for transmitting mission-critical information such as vehicle status, detected hazards, mission progress, and environmental observations.

Future versions of this project will investigate the integration of a suitable communication layer capable of transmitting compact semantic information generated by the world model, including detected mine-like objects, navigable corridors, and mission status updates. Such an approach may reduce communication bandwidth requirements while maintaining operator situational awareness during long-duration autonomous missions.

Inputs:

Semantic map updates
Mission status
Vehicle state information
Hazard detections
World model observations

Outputs:

Mission progress reports
Detected mine-like object locations
Safe navigation corridor updates
Vehicle health and status information
Operator situational awareness updates
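To give a sense of how compact a semantic update can be, the sketch below packs one hazard report into a fixed 12-byte binary message using Python's standard `struct` module. The message layout, the message type code, and the class identifiers are hypothetical; no specific acoustic modem or protocol is implied.

```python
import struct

# Hypothetical layout, big-endian, 12 bytes in total:
# uint8 msg_type, uint8 class_id, uint16 object_id, float32 x_m, float32 y_m
HAZARD_FORMAT = ">BBHff"
MSG_HAZARD = 1
CLASS_IDS = {"mine": 0, "mine_anchor": 1, "mine_chain": 2, "barrel": 3}
CLASS_NAMES = {v: k for k, v in CLASS_IDS.items()}


def encode_hazard(object_id: int, label: str, x_m: float, y_m: float) -> bytes:
    """Pack one hazard report into a compact binary payload."""
    return struct.pack(HAZARD_FORMAT, MSG_HAZARD, CLASS_IDS[label], object_id, x_m, y_m)


def decode_hazard(payload: bytes) -> dict:
    """Unpack a hazard report; raises ValueError on an unexpected message type."""
    msg_type, class_id, object_id, x_m, y_m = struct.unpack(HAZARD_FORMAT, payload)
    if msg_type != MSG_HAZARD:
        raise ValueError(f"unexpected message type {msg_type}")
    return {"object_id": object_id, "label": CLASS_NAMES[class_id], "x_m": x_m, "y_m": y_m}


payload = encode_hazard(object_id=7, label="mine", x_m=12.5, y_m=-3.25)
print(len(payload), decode_hazard(payload))
```

A fixed-size binary layout avoids the overhead of text formats, at the cost of needing a shared message definition on both the vehicle and the operator side.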

Current Status

The project is currently focused on the development of a simulation environment and the definition of an autonomy architecture for underwater mine countermeasure missions. A digital twin of a harbor environment has been created in Gazebo, including harbor infrastructure, mine-like objects, and underwater debris. The simulation environment serves as the foundation for future perception, mapping, and mission autonomy experiments.

Current efforts are focused on the generation of synthetic datasets and the development of an object detection pipeline capable of identifying mine-like objects within the simulated environment. In parallel, the architecture of a lightweight world model is being defined to support future semantic mapping and autonomous decision-making capabilities.

Task	Status
Harbor digital twin environment in Gazebo	Implemented
BlueROV2 simulation platform	Implemented
Harbor infrastructure and obstacle modeling	Implemented
Mine-like object and debris scenarios	Implemented
Mission autonomy architecture design	Implemented
World model concept and research framework	Implemented
Synthetic dataset generation	Implemented
Object detection and classification pipeline	In progress
Semantic environment representation	In progress
World model implementation	Implemented
Autonomous harbor approach navigation	Planned
Safe corridor identification	Planned
Mission-level decision making	Planned
Natural-language mission specification	Planned
Acoustic communication concepts for operator situational awareness	Planned
Multi-mission evaluation and benchmarking	Planned

Future Work

The current implementation focuses on the development of a digital twin environment and the design of an autonomy architecture for simulated mine countermeasure missions. Future work will gradually introduce additional capabilities required for mission-level autonomy in realistic underwater environments.

Planned developments:

Synthetic dataset generation for maritime object detection
Detection and classification of mine-like objects and underwater debris
Semantic mapping of hazards and harbor structures
Training of a lightweight world model for environment representation
Prediction of future mission states and navigation outcomes
Autonomous harbor approach and safe corridor identification
Natural-language mission specification using large language models
Integration of acoustic communication concepts for operator situational awareness
Evaluation across multiple harbor layouts and mission scenarios
Simulation-to-real transfer for deployment on physical underwater vehicles

Long-Term Vision

The long-term objective of the project is to investigate how compact AI-based autonomy systems can support future maritime autonomous operations. By combining perception, semantic mapping, world models, mission planning, and low-bandwidth communication concepts, the project aims to explore technologies relevant to mine countermeasure missions, harbor security, infrastructure inspection, and autonomous underwater reconnaissance.