DOMAIN MAP 4: Machine Learning, Digital Twins, and Decision Theory for Offshore Wind Geotechnics
Synthesised from: 45 batch summary files (batch01_agent1 through batch09_agent5), covering approximately 1,600 literature entries (1902--2026).
Domain at a glance
mindmap
  root((D4 · ML + DT + decision<br/>1 600 papers))
    Representation learning
      Autoencoders VAE
      Encoder models
      Self-supervised
      Latent space compression
      Physics-informed constraints
    Digital twins
      Grey-box Ritto Cross
      Physics-first Branlard
      Data-driven Zhong Zhang
      Ensemble Bull 2025
      Stadtmann survey
    Physics-informed ML
      PINN Raissi 2019
      PI-autoencoders
      Hybrid residual models
      Manifold constraints
      Equation-informed loss
    Bayesian decision
      Cost-of-inspection
      Value-of-information
      POD / POFOD
      Risk-based inspection
      MDPs for maintenance
    Reliability updating
      Bayesian networks
      Particle filtering
      Kalman / ensemble
      Importance sampling
      Surrogate-assisted
    Surrogates / emulators
      GP regression
      Kriging
      Neural network
      Polynomial chaos
      Cheap Monte Carlo
    Optimisation
      Bayesian optimisation
      Active learning
      Gradient-free search
      Topology optimisation
    Relevance to PhD
      E encoder cross-soil
      A Bayesian loop
      B feature ranking
      J5 Monte Carlo
      Op3 pipeline
The closed-loop decision pipeline the portfolio builds
%%{init: {"theme":"base","themeVariables":{"fontSize":"15px","fontFamily":"Inter, system-ui, -apple-system, sans-serif","primaryTextColor":"#1a1a1a","lineColor":"#666"},"flowchart":{"nodeSpacing":22,"rankSpacing":40,"padding":14,"useMaxWidth":true}}}%%
flowchart TB
subgraph Measurement [" <b>Measurement layer</b> "]
Field["<b>V1</b> · 32-month<br/>field monitoring"]:::f
end
subgraph Processing [" <b>Processing layer</b> "]
direction LR
Features["<b>B</b> · Buckingham-Pi<br/>feature ranking"]:::ft
EOV["<b>V2</b> · state-function<br/>EOV compensation"]:::v
end
subgraph Representation [" <b>Representation layer</b> "]
direction LR
Encoder["<b>E</b> · physics-informed<br/>encoder"]:::e
Capacity["<b>J5</b> · Monte Carlo<br/>capacity distribution"]:::j
end
subgraph Decision [" <b>Decision layer</b> "]
Bayes["<b>A</b> · Bayesian decision node"]:::a
end
Action["<b>Inspect · repair · wait</b><br/><span style='font-size:14px'>cost-optimal schedule</span>"]:::act
Field ==> EOV
Field ==> Features
EOV ==> Bayes
Features ==> Bayes
Encoder ==> Bayes
Capacity ==> Bayes
Encoder -. "compresses" .-> Capacity
Bayes ==> Action
Action -. "re-tests" .-> Field
classDef f fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#0d47a1
classDef ft fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
classDef v fill:#fce4ec,stroke:#c2185b,stroke-width:2px
classDef e fill:#fff8e1,stroke:#f57f17,stroke-width:2px
classDef j fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
classDef a fill:#fff3e0,stroke:#e65100,stroke-width:3px
classDef act fill:#ffebee,stroke:#c62828,color:#b71c1c,stroke-width:3px
style Measurement fill:#f0f7ff,stroke:#90caf9,stroke-dasharray:5 5
style Processing fill:#f1f8e9,stroke:#aed581,stroke-dasharray:5 5
style Representation fill:#fff8e1,stroke:#ffcc80,stroke-dasharray:5 5
style Decision fill:#fff3e0,stroke:#ffb74d,stroke-dasharray:5 5
Physics vs data axis: where the portfolio lives
quadrantChart
title Physics-first vs data-first · simple vs complex
x-axis "Data-first" --> "Physics-first"
y-axis "Simple model" --> "Complex model"
quadrant-1 Physics-heavy · complex
quadrant-2 Data-heavy · complex
quadrant-3 Data-heavy · simple
quadrant-4 Physics-heavy · simple
J11: [0.95, 0.85]
J2: [0.88, 0.80]
J3: [0.82, 0.60]
J5: [0.70, 0.55]
Op3: [0.75, 0.70]
V1: [0.55, 0.45]
V2: [0.45, 0.55]
E: [0.55, 0.85]
A: [0.50, 0.80]
B: [0.30, 0.40]
Placement is approximate and based on the portfolio's methodology distribution.
1. ESTABLISHED KNOWLEDGE
1.1 Machine Learning for Geotechnics and SHM
ANN-based bearing capacity prediction is now routine. Deep networks (5--7 layers) outperform shallow ANNs even with scarce training data -- as few as 6 samples for foundation capacity in Baginska & Srokosz (2018). Ensemble methods consistently dominate tabular geotechnical data: XGBoost achieves R² > 0.9 for pile bearing capacity (Arbi 2025; Gendy 2025), while AdaBoost reaches R² = 0.881 for shallow foundations on 169 data points (Shah 2025). ANN remains the most widely used architecture in geotechnical engineering (35% of applications according to the Liu 2024 review), followed by random forest (19%) and SVM (17%). Mechanical property prediction accounts for 59% of all AI applications in geotechnical computational mechanics.
For structural health monitoring, unsupervised methods trained on healthy-condition data are practical and scalable. Gaussian process time-series models track modal variability due to wind speed changes (Avendano-Valencia 2017). LSTM autoencoders combined with SVM achieve stable unsupervised outlier detection for multivariate environmental sensor data (Park 2023; Wei 2022, the latter reporting 99.5% accuracy for CO2 anomaly detection). Unsupervised novelty detection performs nearly as well as supervised methods for bridge fatigue damage detection (Svendsen 2021). This matters in practice because labeled damage data are rare for real structures.
Surrogate models are established for reducing computational cost: polynomial chaos expansion provides accurate site-specific fatigue load predictions (Dimitrov 2018); Kriging meta-models can replace millions of aeroelastic simulations for fatigue assessment with acceptable accuracy, covering both normal operation and idling states (Schmidt 2025/2026). Gaussian process regression surrogates trained on small sea-state subsets accurately predict long-term fatigue damage at tower base and mooring fairlead (Liu 2023).
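The surrogate idea can be illustrated with a minimal Gaussian process regression sketch. This is not the implementation used in any of the cited studies: the `expensive_model` function is a hypothetical stand-in for an aeroelastic or FEM run, and the kernel hyperparameters are fixed by assumption rather than fitted.

```python
import math

def rbf(x1, x2, length=1.0, variance=1.0):
    """Squared-exponential kernel between two scalar inputs."""
    return variance * math.exp(-0.5 * ((x1 - x2) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (M[i][n] - sum(M[i][j] * x[j] for j in range(i + 1, n))) / M[i][i]
    return x

def gp_predict(x_train, y_train, x_new, length=1.0, variance=1.0, noise=1e-6):
    """Zero-mean GP posterior mean and variance at a single new input."""
    K = [[rbf(a, b, length, variance) + (noise if i == j else 0.0)
          for j, b in enumerate(x_train)] for i, a in enumerate(x_train)]
    k_star = [rbf(x_new, a, length, variance) for a in x_train]
    alpha = solve(K, y_train)                      # K^-1 y
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve(K, k_star)                           # K^-1 k*
    var = variance - sum(k * vi for k, vi in zip(k_star, v))
    return mean, max(var, 0.0)

def expensive_model(x):
    """Hypothetical stand-in for a costly simulation (assumption)."""
    return math.sin(x) + 0.1 * x

x_train = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]          # six "simulation runs"
y_train = [expensive_model(x) for x in x_train]
mean, var = gp_predict(x_train, y_train, 2.5)      # between training points
mean_at_node, var_at_node = gp_predict(x_train, y_train, 2.0)
```

The posterior variance is near zero at a training point and grows between points. That property is what makes GP and Kriging surrogates convenient for deciding where additional simulations are worth running.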
1.2 Vibration-Based SHM and Natural Frequency Monitoring
Natural frequency is the primary dynamic health indicator for both OWT and bridge foundations. Every study in the corpus that examines scour effects on monopiles, suction buckets, or bridge piers reports capacity reduction and frequency decrease. The magnitude varies: 5--15% frequency drop per 1D scour depth for monopiles (Li 2020; Jawalageri 2022); up to 40% frequency change for 30% embedment loss in bridges (Kariyawasam 2020); maximum 5.3% reduction at 0.6D scour for tripod suction buckets (Kim 2025). The Weijtjens group at VUB demonstrated across 15+ turbine-years that tracked resonance frequencies shift measurably with scour, confirming OMA-based SHM viability (Weijtjens 2014--2017). Higher modes and mode-shape-based indicators (MAC, MNMS) offer improved sensitivity over first-mode frequency alone (Jawalageri 2022; Malekjafarian 2020).
Automated operational modal analysis pipelines are mature for continuous monitoring. Field-monitored natural frequencies are consistently 5--15% higher than design predictions (Stuyts 2022/2023), establishing a key benchmark for model calibration. PISA-based soil-pile interaction frameworks consistently outperform legacy API p-y curves for predicting monopile stiffness (Jurado 2022; Kheffache 2024; McAdam 2023).
1.3 Bayesian Methods and Reliability Analysis
Probabilistic and reliability-based design is displacing deterministic approaches. Monte Carlo simulation, FOSM, FORM, and LRFD-calibrated resistance factors are becoming standard for foundations and ground improvement. The JCSS Probabilistic Model Code (2006) provides the canonical framework for soil property distributions as stationary random fields with specified scales of fluctuation. Hierarchical Bayesian models combining generic databases (e.g., Ching's CLAY-Cc/6/6203 with 6,203 records from 429 studies) with sparse site-specific data consistently outperform single-source estimation for soil properties (Ching 2022; Tao 2023). Bayesian model updating reduces uncertainty in OWT foundation stiffness estimation (Simpson 2024; Moynihan 2023), with deterministic updating giving precise fits and Bayesian updating providing uncertainty quantification that tightens with more data.
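The statement that Bayesian uncertainty tightens with more data can be seen in the simplest conjugate case. The sketch below assumes a normal prior on a normalised stiffness factor and normally distributed observations with known scatter. All numbers are hypothetical and chosen only for illustration; the cited studies use far richer models.

```python
import math

def normal_update(prior_mean, prior_sd, observations, obs_sd):
    """Posterior mean and sd of an unknown mean (normal prior, known obs_sd)."""
    prior_prec = 1.0 / prior_sd ** 2
    data_prec = len(observations) / obs_sd ** 2
    post_prec = prior_prec + data_prec
    post_mean = (prior_prec * prior_mean + sum(observations) / obs_sd ** 2) / post_prec
    return post_mean, math.sqrt(1.0 / post_prec)

# Hypothetical values: stiffness factor relative to the design model
prior_mean, prior_sd = 1.0, 0.3        # generic-database prior (assumption)
site_data = [1.12, 1.08, 1.15]         # sparse site-specific estimates (assumption)
obs_sd = 0.1                           # measurement scatter (assumption)

mean_1, sd_1 = normal_update(prior_mean, prior_sd, site_data[:1], obs_sd)
mean_3, sd_3 = normal_update(prior_mean, prior_sd, site_data, obs_sd)
```

With one observation the posterior is already much narrower than the prior; with three it narrows further, and the posterior mean sits between the prior mean and the sample mean.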
Fragility analysis using lognormal distribution families is the accepted approach for seismic risk quantification of offshore structures (Kennedy & Ravindra 1984; Shinozuka 2000; Kim 2014; Wilkie 2019). Subset simulation outperforms FOSM, FORM, and direct MCS at low failure probabilities Pf < 0.001 (Kar 2022).
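A lognormal fragility curve gives the probability of exceeding a damage state as the standard normal CDF of the log ratio between the intensity measure and a median capacity, scaled by a dispersion. A minimal sketch follows; the median and dispersion values are hypothetical and not taken from the cited papers.

```python
import math

def lognormal_fragility(im, median, beta):
    """P(damage state exceeded | intensity measure im) for a lognormal fragility."""
    if im <= 0.0:
        return 0.0
    z = math.log(im / median) / beta
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))   # standard normal CDF

# Hypothetical parameters: median capacity 0.4 g, dispersion 0.5
median_pga, beta = 0.4, 0.5
p_low = lognormal_fragility(0.2, median_pga, beta)
p_median = lognormal_fragility(0.4, median_pga, beta)
p_high = lognormal_fragility(0.8, median_pga, beta)
```

By construction the curve passes through 0.5 at the median, and the dispersion controls how quickly it rises on either side.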
1.4 Software Frameworks
OpenFAST (and its predecessor FAST) is the de facto reference aeroelastic simulation tool for OWT analysis, used across dozens of papers for coupled aero-hydro-servo-elastic simulation. OpenSees serves as the primary platform for nonlinear structural and geotechnical FEM. The OxCaisson family (Suryasentana 2017--2023) provides FEM-calibrated Winkler-based surrogate models for suction caisson design at orders-of-magnitude lower computational cost. PLAXIS 2D/3D remains the industry-standard commercial geotechnical FEM tool. The PISA framework represents the state-of-the-art replacement for API p-y curves for large-diameter monopile design.
2. ACTIVE FRONTIERS (2023--2025)
2.1 Physics-Informed Neural Networks and Hybrid ML
Karniadakis et al. (2021, Nature Reviews Physics) established the taxonomy of PIML approaches: observational, learning, and inductive biases for embedding physics into ML. Cuomo et al. (2022) reviewed PINNs comprehensively, identifying variants (VPINN, CPINN, PCNN) but noting that theoretical foundations remain weak -- convergence and generalization theory is unresolved. Shen et al. (2023, Nature Reviews) advocate differentiable modelling that embeds physical equations within neural network training loops for end-to-end gradient-based learning. Cross et al. (2021) demonstrated physics-informed Gaussian process regression with physics kernels for SHM grey-box models that generalize beyond training regimes. Chen & Zhang (2020) applied physics-constrained LSTM to geomechanical log prediction, achieving higher accuracy than pure data-driven LSTM.
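The learning-bias route, where physics enters through the loss function, can be sketched without a neural network. The example below is an illustration under stated assumptions, not a PINN implementation: the candidate solution is any callable, the governing equation is an undamped oscillator with an assumed frequency `OMEGA`, and derivatives come from finite differences rather than automatic differentiation.

```python
import math

OMEGA = 2.0 * math.pi   # assumed natural circular frequency [rad/s]

def physics_residual(u, t, h=1e-3):
    """Residual of u'' + OMEGA^2 u = 0, using a central finite difference."""
    u_tt = (u(t + h) - 2.0 * u(t) + u(t - h)) / h ** 2
    return u_tt + OMEGA ** 2 * u(t)

def physics_informed_loss(u, t_data, y_data, t_colloc, weight=1.0):
    """Data misfit plus weighted mean squared physics residual."""
    data_term = sum((u(t) - y) ** 2 for t, y in zip(t_data, y_data)) / len(t_data)
    phys_term = sum(physics_residual(u, t) ** 2 for t in t_colloc) / len(t_colloc)
    return data_term + weight * phys_term

# Sparse measurements taken once per period: both candidates fit them exactly
t_data = [0.0, 1.0, 2.0]
y_data = [1.0, 1.0, 1.0]
t_colloc = [0.1 * i for i in range(21)]   # collocation points for the physics term

def true_solution(t):
    return math.cos(OMEGA * t)

def constant_fit(t):
    return 1.0

loss_true = physics_informed_loss(true_solution, t_data, y_data, t_colloc)
loss_const = physics_informed_loss(constant_fit, t_data, y_data, t_colloc)
```

The data term alone cannot separate the two candidates because the measurements are aliased. The physics term does, which is the basic argument for equation-informed losses when data are scarce. In practice the relative weight of the two terms needs tuning because they carry different units.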
Bakarji et al. (2022) proposed three data-driven methods (constrained optimization, BuckiNet deep learning, SINDy) that embed the Buckingham Pi theorem as a constraint to discover optimal dimensionless groups. The debate between tight physics integration (differentiable programming) and loose coupling (hybrid/ensemble) remains active (Shen 2023 vs. Terbuch 2023).
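The Buckingham Pi constraint itself reduces to a null-space computation on the dimension matrix: each basis vector of the null space is a set of exponents that makes a product of variables dimensionless. The sketch below shows only that classical step, not the data-driven selection of Bakarji et al. The variable set (frequency, length, Young's modulus, density) is an assumed example.

```python
from fractions import Fraction

def null_space(matrix):
    """Rational null-space basis of a matrix via reduced row echelon form."""
    rows = [[Fraction(v) for v in row] for row in matrix]
    n_cols = len(rows[0])
    pivots = []
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot here: column is free
        rows[r], rows[pivot] = rows[pivot], rows[r]
        scale = rows[r][c]
        rows[r] = [v / scale for v in rows[r]]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                f = rows[i][c]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        pivots.append(c)
        r += 1
        if r == len(rows):
            break
    basis = []
    for free in (c for c in range(n_cols) if c not in pivots):
        vec = [Fraction(0)] * n_cols
        vec[free] = Fraction(1)
        for row_idx, p in enumerate(pivots):
            vec[p] = -rows[row_idx][free]
        basis.append(vec)
    return basis

# Columns: f [1/T], L [L], E [M/(L T^2)], rho [M/L^3]  (assumed example variables)
names = ["f", "L", "E", "rho"]
dimension_matrix = [
    [0, 0, 1, 1],     # mass exponents
    [0, 1, -1, -3],   # length exponents
    [-1, 0, -2, 0],   # time exponents
]
pi_groups = null_space(dimension_matrix)
```

For this variable set the single group is f² L² ρ / E, the square of the familiar dimensionless frequency of an elastic bar. Any power or product of null-space vectors is equally valid, which is why a data-driven criterion is needed to pick the most predictive groups.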
2.2 Autoencoder Architectures for Engineering Data
Autoencoders are proliferating across domains: hierarchical CNNs for fluid field decomposition (Fukami 2020), stacked denoising autoencoders for missing data imputation (Kim 2020), LSTM-autoencoders for time-series anomaly detection (Wei 2022; Park 2023), a physics-informed vector-quantized VAE achieving a compression ratio (CR) of 85 with MSE of order 10^-3 for turbulence compression (Momenifar 2022), and sparse autoencoder-SVM for high-dimensional anomaly detection (Wei 2023). However, no paper in the corpus applies autoencoder-based representation learning specifically to geotechnical or OWT structural health monitoring sensor data -- a clear transfer opportunity identified by multiple batch summaries.
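The common thread in these architectures is that a model trained on healthy data reconstructs healthy samples well and anomalous ones poorly. The sketch below swaps the neural autoencoder for its simplest linear counterpart, a one-component PCA fitted by power iteration, purely to show the reconstruction-error logic. The three-channel synthetic data and the threshold rule are assumptions.

```python
import math
import random

def fit_linear_encoder(samples, iters=200):
    """Return (mean, unit direction) of the dominant principal axis."""
    n, d = len(samples), len(samples[0])
    mean = [sum(s[j] for s in samples) / n for j in range(d)]
    centred = [[s[j] - mean[j] for j in range(d)] for s in samples]
    cov = [[sum(row[a] * row[b] for row in centred) / n for b in range(d)]
           for a in range(d)]
    w = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):                      # power iteration on covariance
        w = [sum(cov[a][b] * w[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
    return mean, w

def reconstruction_error(x, mean, w):
    """Encode to a 1-D latent, decode, and return the residual norm."""
    centred = [xi - mi for xi, mi in zip(x, mean)]
    z = sum(c * wi for c, wi in zip(centred, w))   # encoder
    recon = [z * wi for wi in w]                   # decoder
    return math.sqrt(sum((c - r) ** 2 for c, r in zip(centred, recon)))

# Synthetic healthy data: three channels driven by one latent factor plus noise
random.seed(0)
healthy = []
for _ in range(200):
    t = random.uniform(-1.0, 1.0)
    healthy.append([t + random.gauss(0, 0.05),
                    2 * t + random.gauss(0, 0.05),
                    -t + random.gauss(0, 0.05)])

mean, w = fit_linear_encoder(healthy)
threshold = max(reconstruction_error(x, mean, w) for x in healthy)
on_manifold = reconstruction_error([0.5, 1.0, -0.5], mean, w)
off_manifold = reconstruction_error([1.0, 2.0, 1.0], mean, w)   # third channel breaks the pattern
```

A nonlinear autoencoder replaces the single direction `w` with learned encoder and decoder networks, but the detection rule (flag samples whose reconstruction error exceeds a healthy-state threshold) stays the same.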
2.3 Digital Twins for Offshore Wind
Digital twin maturity is advancing rapidly. Branlard et al. (2024) validated a full-scale floating turbine digital twin (TetraSpar) using a Kalman filter combined with an aerodynamic estimator, achieving 10--15% error on damage-equivalent tower loads. Weil et al. (2023) demonstrated quantifiable scour assessment (not just detection) from resonance frequency shifts using a digital twin combined with automated OMA. Bull et al. (2025) argued for probabilistic ensembles of digital twins rather than single best-fit models for risk-based inspection planning. Stadtmann et al. (2023) demonstrated DT capability levels 0--3 for an operational FOWT using neural networks and transfer learning. Stuyts (2022) pioneered a cloud-based serverless application for parametric back-analysis across entire wind farms. However, predictive and prescriptive DT levels (3--4) have only been demonstrated in limited prototypes; autonomous DTs (level 5) remain aspirational (Nickpasand 2023).
2.4 Value of Information and Risk-Based Inspection
Kamariotis et al. (2021) demonstrated Bayesian decision analysis for quantifying the Value of Information from vibration-based SHM, modelling the full data-to-decision chain for deteriorating bridges. Giordano et al. (2022) quantified the monetary benefit of permanent seismic SHM for evacuation decisions, balancing casualty risk against business interruption. The JCSS framework (Goyet et al. 2010) established RBI using Bayesian Probabilistic Networks for FPSO inspection planning. Bull et al. (2025) integrated probabilistic digital twins with SHM-OMA for risk-based inspection decisions. Efficient nested Monte Carlo (Heath & Baio 2018) reduced EVSI computation from days to seconds. Thons (2019, TU1402 COST Action) provided a framework for VoI analysis for SHM decision support. The LQI (Life Quality Index) criterion has been applied to structural design acceptability, with a societal willingness to pay (SWTP) of approximately 1.9--5 M USD per statistical life (Rackwitz 2008).
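The core of a VoI calculation is a preposterior analysis: compare the expected cost of the best action under the prior with the expected cost when the action is chosen after seeing an imperfect inspection outcome. The sketch below is a deliberately small two-state, two-action version. The costs, probability of detection (`pod`), and false-alarm probability (`pfa`) are hypothetical, and repair is assumed to remove the failure risk entirely.

```python
def expected_cost(p_damage, action, c_repair, c_failure):
    """Expected cost of an action for a given damage probability."""
    if action == "repair":
        return c_repair                 # assumption: repair removes the risk
    return p_damage * c_failure         # "wait": failure cost only if damaged

def best_cost(p_damage, c_repair, c_failure):
    """Cost of the cheaper of the two actions."""
    return min(expected_cost(p_damage, a, c_repair, c_failure)
               for a in ("repair", "wait"))

def value_of_information(p_damage, pod, pfa, c_repair, c_failure):
    """Prior expected cost minus preposterior expected cost (assumes 0 < p_alarm < 1)."""
    prior_cost = best_cost(p_damage, c_repair, c_failure)
    p_alarm = pod * p_damage + pfa * (1.0 - p_damage)
    p_dmg_given_alarm = pod * p_damage / p_alarm                   # Bayes' rule
    p_dmg_given_quiet = (1.0 - pod) * p_damage / (1.0 - p_alarm)
    preposterior_cost = (
        p_alarm * best_cost(p_dmg_given_alarm, c_repair, c_failure)
        + (1.0 - p_alarm) * best_cost(p_dmg_given_quiet, c_repair, c_failure)
    )
    return prior_cost - preposterior_cost

# Hypothetical normalised inputs: repair costs 1 unit, failure costs 20 units
voi = value_of_information(p_damage=0.1, pod=0.9, pfa=0.05,
                           c_repair=1.0, c_failure=20.0)
voi_useless = value_of_information(p_damage=0.1, pod=0.5, pfa=0.5,
                                   c_repair=1.0, c_failure=20.0)
```

An inspection is worth commissioning only if its cost is below the VoI. A monitoring system whose alarm rate does not depend on the damage state (`pod == pfa`) changes no decision and has zero value, which the second call illustrates.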
2.5 Transfer Learning and Domain Adaptation
Transfer learning is demonstrated for battery capacity estimation across chemistries (Zhu 2022, RMSE 1.1% base, <1.7% transfer), for DEL prediction under wake conditions (Haghi 2024), and for predictive digital twins (Stadtmann 2023). However, cross-site generalization of ML models in geotechnics remains unaddressed -- most studies (Arbi 2025; Gendy 2025; Huynh 2025) train and validate on single datasets. Gueye (2025) is the only paper explicitly testing model degradation under noisy inputs.
3. CONTESTED CLAIMS
3.1 When Deep Learning Adds Value Over Classical ML
Ng et al. (2020) showed that a CNN outperforms PLSR/Cubist only above approximately 2,000 training samples for spectral soil data; below this threshold simpler models are competitive. Schulz et al. (2020) found that for brain imaging at approximately 10k subjects, linear models match DL. Belhaouate et al. (2025) found shallow ANNs outperform deeper ones for earthen construction. The Yaghoubi et al. (2024) meta-analysis concluded that ensemble learning outperforms ANN, standalone ML, and DL overall in geotechnical prediction tasks, yet ANN remains the most widely used. The threshold sample size, data nonlinearity, and architectural depth for justified DL deployment remain domain-specific and unresolved.
3.2 Physics-Based vs. Data-Driven Digital Twins
The systematic review by Rathore et al. (2021) emphasizes data-driven DTs powered by big data and AI, while Ritto et al. (2021) argue that physics-based models provide essential interpretability and that ML should augment, not replace, physics. Branlard (2024) advocates physics-based models for transferability; Zhong (2023) uses deep-learning surrogates for a 500x speedup. Bull (2025) calls for probabilistic ensembles rather than deterministic DTs. The optimal ratio of physics to data remains contested.
3.3 Feature Selection for DL Models
Chen & Fan (2023) found that traditional feature selection methods (IGR, RFE) actually reduce DL model accuracy for landslide prediction, contradicting the common practice of pre-filtering inputs. Autoencoder-based feature extraction is recommended for DL, creating a methodological split between ML and DL preprocessing pipelines.
3.4 Low-Frequency Fatigue Dynamics
Sadeghi et al. (2023) demonstrated that up to 65% of fatigue damage relates to low-frequency dynamics (periods > 1 day) missed by standard 10-minute DLC analysis windows. This directly challenges the DNV industry-standard framework. Whether correction factors suffice or fundamental changes to data segmentation are needed is unresolved.
3.5 Constitutive Model Selection for Cyclic SSI
Orakci et al. (2024) compared SANISAND-MS and PM4SAND for cyclic strain accumulation in monopile sands and found that the two models differ in both accuracy and calibration complexity. Cudny & Truty (2020) addressed overshooting in HSS with the HS-Brick model. LeBlanc (2010) showed pile stiffness increases with cycling in sand (contradicting the API degradation assumption), while Lombardi (2013) found frequency changes depend on shear strain level in clay. No single constitutive model satisfies all cyclic loading requirements.
4. VERIFIED GAPS
4.1 Scour + Cyclic + Seismic Multi-Hazard Coupling
Individual papers address scour, cyclic capacity, and seismic loading separately. No paper in the corpus integrates all three into a unified probabilistic fatigue framework. Kontoni (2023) considers wind+wave+earthquake; Hachem (2023) considers scour alone. Ngo (2022) addresses scour+seismic fragility for suction buckets but not with cyclic degradation. Jia (2024) combines seismic and scour for bucket OWT but without fatigue accumulation.
4.2 Autoencoder-Based SHM for Geotechnical and OWT Sensor Data
Six autoencoder papers per batch target biomedical and remote-sensing domains; none applies autoencoders to vibration signals, soil sensor data, or load-displacement curves for OWT foundations. This is explicitly identified as a transfer opportunity by batch03_agent1, batch03_agent4, and batch03_agent5.
4.3 Long-Term Field Validation of ML-Based Geotechnical Predictions
Most ML models train on limited datasets (125--472 points) from laboratory or literature sources. Cross-site generalization, domain adaptation, and noise robustness are unaddressed. Systematic validation against large, diverse field databases is absent (Phoon 2019; Stuyts 2024).
4.4 Integrated Scour-SHM-Digital-Twin Closed Loop
Although digital twins (Branlard 2024; Sindi 2024; Tian 2024), scour monitoring (Weijtjens 2017; Kim 2025), and structural reliability updating (Bull 2025) exist independently, no paper connects real-time scour evolution into a full structural digital twin with automated decision support for OWT.
4.5 Population-Level Bayesian SHM for Offshore Wind Farms
Only Smith et al. (2023) explicitly address population-based SHM (PBSHM) across a wind farm using hierarchical Bayesian models. Scaling to full-size farms with hundreds of turbines and diverse soil conditions remains unexplored.
4.6 Data Augmentation for Small Geotechnical Datasets
Chadebec (2021) demonstrates geometry-based VAE augmentation for medical imaging; equivalent techniques have not been applied to geotechnical data (CPT profiles, load-displacement curves, sensor streams) despite similar small-sample challenges.
4.7 VoI-Driven Inspection Planning for OWT Foundations
While Kamariotis (2021) demonstrated VoI for bridge SHM and JCSS/Goyet (2010) for FPSO RBI, VoI analysis specifically integrating scour monitoring data with OWT foundation remaining-life estimation and inspection scheduling does not exist.
5. QUANTITATIVE BENCHMARKS
5.1 ML Model Performance
| Task | Best Model | Metric | Source |
|---|---|---|---|
| Pile bearing capacity (layered) | XGBoost + Grid Search | R² > 0.9 | Arbi 2025 |
| Pile load-settlement | GPR | Top performer | Gendy 2025 |
| Scour depth (bridge pier) | 1D CNN + Buckingham Pi | R² = 0.85 | Huynh 2025 |
| Scour depth (tripod) | ANN-PSO | R² = 0.99 | Jatoliya 2024 |
| Shallow foundation reliability | FNN on MC data | Relative error < 10^-5 | Savvides 2024 |
| Slope stability | LSTM | Outperforms CNN/SVM/RF | Huang 2023 |
| WT fault diagnosis | Decision tree / RF | 92.7% / 92.0% accuracy | Hsu 2020 |
| Wind profile extrapolation | Random forest | Outperforms ERA5 at 100 m | Frech 2025 |
| Battery capacity (transfer) | ML on voltage relaxation | RMSE 1.1%, <1.7% transfer | Zhu 2022 |
5.2 SHM and Digital Twin Performance
| Metric | Value | Source |
|---|---|---|
| Digital twin fatigue DEL accuracy (field) | 10--15% error | Branlard 2024 |
| Digital twin fatigue DEL accuracy (simulation) | 5--10% error | Branlard 2024 |
| ROM surrogate speedup vs physics model | 500x | Zhong 2023 |
| OxCaisson vs 3D FEM cost reduction | Several orders of magnitude | Suryasentana 2020 |
| OWT fn design underestimation | 5--15% below monitored | Stuyts 2022 |
| Arany closed-form fn error | <3.5% across 10 EU farms | Arany 2016 |
| Monopile fn drop per 1D scour | 5--15% | Li 2020, Jawalageri 2022 |
| Tripod bucket fn drop at 0.6D scour | 5.3% max | Kim 2025 |
| Scour-induced fatigue life reduction | ~24% at 1.3D scour | Cao 2024 |
| TMD fatigue life increase (1% mass) | ~65% at 1.3D scour | Cao 2024 |
| EOV frequency scatter reduction | 70% via RANSAC+CUSUM | Kim 2026 (submitted) |
| Scour detection threshold | 0.39D at 95% probability, zero false alarms | Kim 2026 (submitted) |
5.3 Reliability and Risk Benchmarks
| Parameter | Value | Source |
|---|---|---|
| IEC OWT design MRP | 50 years | Wilkie 2019 |
| O&G platform design MRP | 10,000 years | Wilkie 2019 |
| Foundation cost share of OWT | 15--40% of total | Houlsby 2000 |
| Monopile market share for OWT | 77--80% | Abdelhak 2024 |
| Soil damping ratio range (monopile) | 0.17--1.3% critical | Rezaei 2018 |
| SWTP per statistical life (LQI) | 1.9--5 M USD | Rackwitz 2008 |
| JCSS soil CoV tables | Per PMC Section 3.7 | JCSS 2006 |
| Fatigue damage variation from foundation model choice | Up to 180% | Katsikogiannis 2019 |
| Kriging training points for acceptable idling fatigue | ~2,000 | Schmidt 2025 |
5.4 Reference Turbines and Databases
| Benchmark | Description | Primary Users |
|---|---|---|
| NREL 5 MW | Legacy reference turbine | Robertson, Park, Smith, Sorum |
| DTU 10 MW | Intermediate reference | Katsikogiannis, Dimitrov |
| IEA 15 MW + VolturnUS-S | Modern floating reference | Guo, Mroczek, Wiley, Branlard |
| OC5 Alpha Ventus | Multi-code validation dataset | Popko 2019 |
| PISA Dunkirk/Cowden | Large-scale monopile field tests | Orakci, McAdam, Kheffache |
| Ching CLAY-Cc/6/6203 | 6,203 clay compressibility records | Ching 2022 |
| CWRU bearing dataset | Machinery fault detection benchmark | Neupane 2020, Zhang S 2020 |
| Gottardi (1999) VHM dataset | Circular footings on sand | All macro-element calibration studies |
| Borkum Riffgrund 1 SBJ | First full-scale suction bucket jacket monitoring | Shonberg 2017 |
| Robin Rigg failure case | Monopile decommissioned after scour | Carlos/Vicente 2023 |
Generated: 2026-04-17. Approximately 2,500 words. Source: 45 literature summary files spanning ~1,600 references.