DOMAIN MAP 4: Machine Learning, Digital Twins, and Decision Theory for Offshore Wind Geotechnics

Synthesised from: 45 batch summary files (batch01_agent1 through batch09_agent5), covering approximately 1,600 literature entries (1902--2026).

Domain at a glance

mindmap
  root((D4 · ML + DT + decision<br/>1 600 papers))
    Representation learning
      Autoencoders VAE
      Encoder models
      Self-supervised
      Latent space compression
      Physics-informed constraints
    Digital twins
      Grey-box Ritto Cross
      Physics-first Branlard
      Data-driven Zhong Zhang
      Ensemble Bull 2025
      Stadtmann survey
    Physics-informed ML
      PINN Raissi 2019
      PI-autoencoders
      Hybrid residual models
      Manifold constraints
      Equation-informed loss
    Bayesian decision
      Cost-of-inspection
      Value-of-information
      POD / POFOD
      Risk-based inspection
      MDPs for maintenance
    Reliability updating
      Bayesian networks
      Particle filtering
      Kalman / ensemble
      Importance sampling
      Surrogate-assisted
    Surrogates / emulators
      GP regression
      Kriging
      Neural network
      Polynomial chaos
      Cheap Monte Carlo
    Optimisation
      Bayesian optimisation
      Active learning
      Gradient-free search
      Topology optimisation
    Relevance to PhD
      E encoder cross-soil
      A Bayesian loop
      B feature ranking
      J5 Monte Carlo
      Op3 pipeline

The closed-loop decision pipeline the portfolio builds

%%{init: {"theme":"base","themeVariables":{"fontSize":"15px","fontFamily":"Inter, system-ui, -apple-system, sans-serif","primaryTextColor":"#1a1a1a","lineColor":"#666"},"flowchart":{"nodeSpacing":22,"rankSpacing":40,"padding":14,"useMaxWidth":true}}}%%
flowchart TB
    subgraph Measurement ["&nbsp;<b>Measurement layer</b>&nbsp;"]
        Field["<b>V1</b> · 32-month<br/>field monitoring"]:::f
    end

    subgraph Processing ["&nbsp;<b>Processing layer</b>&nbsp;"]
        direction LR
        Features["<b>B</b> · Buckingham-Pi<br/>feature ranking"]:::ft
        EOV["<b>V2</b> · state-function<br/>EOV compensation"]:::v
    end

    subgraph Representation ["&nbsp;<b>Representation layer</b>&nbsp;"]
        direction LR
        Encoder["<b>E</b> · physics-informed<br/>encoder"]:::e
        Capacity["<b>J5</b> · Monte Carlo<br/>capacity distribution"]:::j
    end

    subgraph Decision ["&nbsp;<b>Decision layer</b>&nbsp;"]
        Bayes["<b>A</b> · Bayesian decision node"]:::a
    end

    Action["<b>Inspect · repair · wait</b><br/><span style='font-size:14px'>cost-optimal schedule</span>"]:::act

    Field ==> EOV
    Field ==> Features
    EOV ==> Bayes
    Features ==> Bayes
    Encoder ==> Bayes
    Capacity ==> Bayes
    Encoder -. "compresses" .-> Capacity
    Bayes ==> Action
    Action -. "re-tests" .-> Field

    classDef f fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#0d47a1
    classDef ft fill:#e8f5e9,stroke:#2e7d32,stroke-width:2px
    classDef v fill:#fce4ec,stroke:#c2185b,stroke-width:2px
    classDef e fill:#fff8e1,stroke:#f57f17,stroke-width:2px
    classDef j fill:#f3e5f5,stroke:#7b1fa2,stroke-width:2px
    classDef a fill:#fff3e0,stroke:#e65100,stroke-width:3px
    classDef act fill:#ffebee,stroke:#c62828,color:#b71c1c,stroke-width:3px

    style Measurement fill:#f0f7ff,stroke:#90caf9,stroke-dasharray:5 5
    style Processing fill:#f1f8e9,stroke:#aed581,stroke-dasharray:5 5
    style Representation fill:#fff8e1,stroke:#ffcc80,stroke-dasharray:5 5
    style Decision fill:#fff3e0,stroke:#ffb74d,stroke-dasharray:5 5

Physics vs data axis: where the portfolio lives

quadrantChart
    title Physics-first vs data-first · simple vs complex
    x-axis "Data-first" --> "Physics-first"
    y-axis "Simple model" --> "Complex model"
    quadrant-1 Physics-heavy · complex
    quadrant-2 Data-heavy · complex
    quadrant-3 Data-heavy · simple
    quadrant-4 Physics-heavy · simple
    J11: [0.95, 0.85]
    J2: [0.88, 0.80]
    J3: [0.82, 0.60]
    J5: [0.70, 0.55]
    Op3: [0.75, 0.70]
    V1: [0.55, 0.45]
    V2: [0.45, 0.55]
    E: [0.55, 0.85]
    A: [0.50, 0.80]
    B: [0.30, 0.40]

Placements are approximate and reflect the portfolio's methodology distribution.

1. ESTABLISHED KNOWLEDGE

1.1 Machine Learning for Geotechnics and SHM

ANN-based bearing capacity prediction is now routine. Deep networks (5--7 layers) outperform shallow ANNs even with scarce training data -- as few as 6 samples for foundation capacity in Baginska & Srokosz (2018). Ensemble methods consistently dominate tabular geotechnical data: XGBoost achieves R² > 0.9 for pile bearing capacity (Arbi 2025; Gendy 2025), while AdaBoost reaches R² = 0.881 for shallow foundations on 169 data points (Shah 2025). ANN remains the most widely used architecture in geotechnical engineering (35% of applications according to the Liu 2024 review), followed by random forest (19%) and SVM (17%). Mechanical property prediction accounts for 59% of all AI applications in geotechnical computational mechanics.

For structural health monitoring, unsupervised methods trained on healthy-condition data are practical and scalable. Gaussian process time-series models track modal variability due to wind speed changes (Avendano-Valencia 2017). LSTM autoencoders combined with SVM achieve stable unsupervised outlier detection for multivariate environmental sensor data (Park 2023; Wei 2022, the latter reporting 99.5% accuracy for CO2 anomaly detection). Unsupervised novelty detection performs nearly as well as supervised methods for bridge fatigue damage detection (Svendsen 2021), which matters in practice because labeled damage data are rare for real structures.
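
The idea of training only on healthy-condition data can be illustrated with a much simpler detector than the LSTM autoencoders cited above. The sketch below swaps in a classical Mahalanobis-distance novelty detector; the two tracked frequencies, their scatter, and the 99th-percentile threshold are illustrative assumptions, not values from any cited study.

```python
import numpy as np

def fit_healthy_baseline(x_healthy):
    """Estimate the mean and inverse covariance of healthy-condition features."""
    mu = x_healthy.mean(axis=0)
    cov = np.cov(x_healthy, rowvar=False)
    # Tiny ridge term keeps the covariance invertible (assumed value).
    cov = cov + 1e-9 * np.eye(cov.shape[0])
    return mu, np.linalg.inv(cov)

def mahalanobis_sq(x, mu, cov_inv):
    """Squared Mahalanobis distance of each row of x from the baseline."""
    d = x - mu
    return np.einsum("ij,jk,ik->i", d, cov_inv, d)

rng = np.random.default_rng(0)
# Synthetic stand-in for two tracked modal frequencies in Hz (assumed values).
healthy = rng.normal([0.30, 1.50], [0.003, 0.015], size=(500, 2))
mu, cov_inv = fit_healthy_baseline(healthy)

# Alarm threshold: 99th percentile of the training distances (assumed choice).
threshold = np.percentile(mahalanobis_sq(healthy, mu, cov_inv), 99)

# Two new observations: one nominal, one with a 5% drop in both frequencies.
new = np.array([[0.30, 1.50], [0.285, 1.425]])
flags = mahalanobis_sq(new, mu, cov_inv) > threshold
print(flags)
```

No damage labels are used anywhere: the threshold comes entirely from the healthy baseline, which is the property that makes this family of methods attractive when damage examples are rare.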

Surrogate models are established for reducing computational cost: polynomial chaos expansion provides accurate site-specific fatigue load predictions (Dimitrov 2018); Kriging meta-models can replace millions of aeroelastic simulations for fatigue assessment with acceptable accuracy, covering both normal operation and idling states (Schmidt 2025/2026). Gaussian process regression surrogates trained on small sea-state subsets accurately predict long-term fatigue damage at tower base and mooring fairlead (Liu 2023).
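
A minimal sketch of the surrogate idea, assuming a one-dimensional input and a cheap analytic function standing in for the expensive simulator. The function `expensive_model`, the RBF kernel, and its length scale are hypothetical choices for illustration; they are not the setups used in the cited studies.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared-exponential covariance between two 1-D input arrays."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_fit_predict(x_train, y_train, x_test, noise=1e-6, **kernel_args):
    """Zero-mean GP regression: posterior mean and variance at x_test."""
    k_tt = rbf_kernel(x_train, x_train, **kernel_args) + noise * np.eye(len(x_train))
    k_st = rbf_kernel(x_test, x_train, **kernel_args)
    chol = np.linalg.cholesky(k_tt)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = k_st @ alpha
    v = np.linalg.solve(chol, k_st.T)
    prior_var = rbf_kernel(x_test, x_test, **kernel_args).diagonal()
    var = prior_var - np.sum(v**2, axis=0)
    return mean, np.maximum(var, 0.0)

def expensive_model(x):
    """Hypothetical stand-in for a costly aeroelastic or FEM run."""
    return np.sin(3 * x) + 0.5 * x

# Train on 8 "simulations", then predict on a dense grid at negligible cost.
x_train = np.linspace(0, 2, 8)
y_train = expensive_model(x_train)
x_test = np.linspace(0, 2, 50)
mean, var = gp_fit_predict(x_train, y_train, x_test, length_scale=0.5)

max_err = np.max(np.abs(mean - expensive_model(x_test)))
print(f"max abs surrogate error: {max_err:.4f}")
```

The predictive variance returned alongside the mean is what distinguishes GP and Kriging surrogates from plain regression: it indicates where additional simulator runs would be most informative.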

1.2 Vibration-Based SHM and Natural Frequency Monitoring

Natural frequency is the primary dynamic health indicator for both OWT and bridge foundations. Every study examining scour effects on monopiles, suction buckets, or bridge piers reports capacity reduction and frequency decrease. The magnitude varies: 5--15% frequency drop per 1D scour depth for monopiles (Li 2020; Jawalageri 2022); up to 40% frequency change for 30% embedment loss in bridges (Kariyawasam 2020); maximum 5.3% reduction at 0.6D scour for tripod suction buckets (Kim 2025). The Weijtjens group at VUB demonstrated across 15+ turbine-years that tracked resonance frequencies shift measurably with scour, confirming OMA-based SHM viability (Weijtjens 2014--2017). Higher modes and mode-shape-based indicators (MAC, MNMS) offer improved sensitivity over first-mode frequency alone (Jawalageri 2022; Malekjafarian 2020).

Automated operational modal analysis pipelines are mature for continuous monitoring. Field-monitored natural frequencies are consistently 5--15% higher than design predictions (Stuyts 2022/2023), establishing a key benchmark for model calibration. PISA-based soil-pile interaction frameworks consistently outperform legacy API p-y curves for predicting monopile stiffness (Jurado 2022; Kheffache 2024; McAdam 2023).

1.3 Bayesian Methods and Reliability Analysis

Probabilistic and reliability-based design is displacing deterministic approaches. Monte Carlo simulation, FOSM, FORM, and LRFD-calibrated resistance factors are becoming standard for foundations and ground improvement. The JCSS Probabilistic Model Code (2006) provides the canonical framework for soil property distributions as stationary random fields with specified scales of fluctuation. Hierarchical Bayesian models combining generic databases (e.g., Ching's CLAY-Cc/6/6203 with 6,203 records from 429 studies) with sparse site-specific data consistently outperform single-source estimation for soil properties (Ching 2022; Tao 2023). Bayesian model updating reduces uncertainty in OWT foundation stiffness estimation (Simpson 2024; Moynihan 2023), with deterministic updating giving precise fits and Bayesian updating providing uncertainty quantification that tightens with more data.
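
The way a generic-database prior and sparse site data combine can be sketched with a conjugate normal update. This is a deliberate simplification of the hierarchical models cited above, and every number (prior mean and spread, site measurements, measurement noise) is a hypothetical placeholder.

```python
def normal_update(prior_mean, prior_sd, data, noise_sd):
    """Posterior mean and sd of a normal mean with known measurement noise."""
    n = len(data)
    prior_precision = 1.0 / prior_sd**2
    data_precision = n / noise_sd**2
    post_var = 1.0 / (prior_precision + data_precision)
    sample_mean = sum(data) / n
    post_mean = post_var * (prior_precision * prior_mean + data_precision * sample_mean)
    return post_mean, post_var**0.5

# Hypothetical soil property in kPa: generic prior plus three site measurements.
prior_mean, prior_sd = 50.0, 15.0
site_data = [62.0, 58.0, 65.0]
post_mean, post_sd = normal_update(prior_mean, prior_sd, site_data, noise_sd=10.0)
print(f"posterior: {post_mean:.2f} +/- {post_sd:.2f} kPa")

# Doubling the site data tightens the posterior further.
_, post_sd_more = normal_update(prior_mean, prior_sd, site_data * 2, noise_sd=10.0)
```

The posterior mean sits between the generic prior and the site average, weighted by their precisions, and the posterior spread shrinks as measurements accumulate. That is the same qualitative behaviour the paragraph describes for Bayesian updating of foundation stiffness.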

Fragility analysis using lognormal distribution families is the accepted approach for seismic risk quantification of offshore structures (Kennedy & Ravindra 1984; Shinozuka 2000; Kim 2014; Wilkie 2019). Subset simulation outperforms FOSM, FORM, and direct MCS at low failure probabilities, Pf < 0.001 (Kar 2022).
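
For reference, a lognormal fragility curve gives the probability of reaching a damage state as the standard normal CDF of ln(IM/median)/beta, where IM is the intensity measure. The sketch below evaluates that expression; the median capacity and dispersion are hypothetical values, not taken from the cited studies.

```python
import math

def lognormal_fragility(im, median, beta):
    """P(damage state reached | intensity measure im) for a lognormal fragility."""
    if im <= 0:
        return 0.0
    z = math.log(im / median) / beta
    # Standard normal CDF expressed through the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical parameters: median capacity 0.4 g PGA, dispersion 0.5.
median_pga, dispersion = 0.4, 0.5
for pga in (0.1, 0.2, 0.4, 0.8):
    p = lognormal_fragility(pga, median_pga, dispersion)
    print(f"PGA = {pga:.1f} g -> P(failure) = {p:.3f}")
```

By construction the curve passes through 0.5 at the median, and the dispersion controls how steeply the probability rises around it.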

1.4 Software Frameworks

OpenFAST (and its predecessor FAST) is the de facto reference aeroelastic simulation tool for OWT analysis, used across dozens of papers for coupled aero-hydro-servo-elastic simulation. OpenSees serves as the primary platform for nonlinear structural and geotechnical FEM. The OxCaisson family (Suryasentana 2017--2023) provides FEM-calibrated Winkler-based surrogate models for suction caisson design at orders-of-magnitude lower computational cost. PLAXIS 2D/3D remains the industry-standard commercial geotechnical FEM tool. The PISA framework represents the state-of-the-art replacement for API p-y curves for large-diameter monopile design.

2. ACTIVE FRONTIERS (2023--2025)

2.1 Physics-Informed Neural Networks and Hybrid ML

Karniadakis et al. (2021, Nature Reviews Physics) established the taxonomy of PIML approaches: observational, learning, and inductive biases for embedding physics into ML. Cuomo et al. (2022) reviewed PINNs comprehensively, identifying variants (VPINN, CPINN, PCNN) but noting that theoretical foundations remain weak -- convergence and generalization theory is unresolved. Shen et al. (2023, Nature Reviews) advocate differentiable modelling that embeds physical equations within neural network training loops for end-to-end gradient-based learning. Cross et al. (2021) demonstrated physics-informed Gaussian process regression with physics kernels for SHM grey-box models that generalize beyond training regimes. Chen & Zhang (2020) applied physics-constrained LSTM to geomechanical log prediction, achieving higher accuracy than pure data-driven LSTM.
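
The learning-bias flavour of PIML adds a physics residual to the data misfit. The sketch below shows only that composite loss, under stated assumptions: the governing equation is an undamped oscillator u'' + OMEGA^2 u = 0, the second derivative is taken by finite differences (a PINN would use automatic differentiation), and two hand-written candidate functions replace a trained network.

```python
import numpy as np

OMEGA = 2.0  # assumed known natural frequency (rad/s) in u'' + OMEGA^2 u = 0

def physics_informed_loss(u_fn, t_data, u_data, t_colloc, weight=1.0, h=1e-3):
    """Data misfit plus weighted mean-square ODE residual at collocation points."""
    data_loss = np.mean((u_fn(t_data) - u_data) ** 2)
    # Central finite difference for u''.
    u_tt = (u_fn(t_colloc + h) - 2.0 * u_fn(t_colloc) + u_fn(t_colloc - h)) / h**2
    residual = u_tt + OMEGA**2 * u_fn(t_colloc)
    return data_loss + weight * np.mean(residual**2)

# Three sparse "measurements" of the true response u(t) = cos(OMEGA t).
t_data = np.array([0.0, 0.4, 0.8])
u_data = np.cos(OMEGA * t_data)
t_colloc = np.linspace(0.0, 3.0, 50)

def consistent(t):
    """Candidate that satisfies both the data and the ODE."""
    return np.cos(OMEGA * t)

# Quadratic that fits the three data points exactly but ignores the physics.
coeffs = np.polyfit(t_data, u_data, 2)

def interpolant(t):
    return np.polyval(coeffs, t)

loss_consistent = physics_informed_loss(consistent, t_data, u_data, t_colloc)
loss_interpolant = physics_informed_loss(interpolant, t_data, u_data, t_colloc)
print(loss_consistent, loss_interpolant)
```

Both candidates have zero data misfit, so a purely data-driven loss cannot tell them apart. The residual term is what penalises the interpolant outside the measured range, which is the mechanism behind the improved extrapolation claimed for grey-box models.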

Bakarji et al. (2022) proposed three data-driven methods (constrained optimization, BuckiNet deep learning, SINDy) that embed Buckingham Pi theorem as a constraint to discover optimal dimensionless groups. The debate between tight physics integration (differentiable programming) and loose coupling (hybrid/ensemble) remains active (Shen 2023 vs. Terbuch 2023).
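
The constraint these methods embed is the classical one: exponents of a dimensionless group lie in the null space of the dimension matrix. The sketch below computes that null space with an SVD. It is the textbook construction, not the BuckiNet or SINDy procedures of Bakarji et al., and the choice of variables (natural frequency f, pile diameter D, shear modulus G, soil density rho) is an illustrative assumption.

```python
import numpy as np

def pi_exponents(dim_matrix, tol=1e-10):
    """Basis of exponent vectors that make a product of variables dimensionless."""
    _, s, vh = np.linalg.svd(dim_matrix)
    rank = int(np.sum(s > tol))
    return vh[rank:]  # rows spanning the null space

# Columns: f [1/T], D [L], G [M/(L T^2)], rho [M/L^3]; rows: M, L, T.
dim_matrix = np.array([
    [0.0, 0.0, 1.0, 1.0],    # mass
    [0.0, 1.0, -1.0, -3.0],  # length
    [-1.0, 0.0, -2.0, 0.0],  # time
])

null_basis = pi_exponents(dim_matrix)
# Normalise so that the exponent of f equals 1.
pi_group = null_basis[0] / null_basis[0][0]
print(dict(zip(["f", "D", "G", "rho"], np.round(pi_group, 3))))
```

With four variables and three independent dimensions there is exactly one group, f·D·sqrt(rho/G). With more variables the null space has several basis vectors, and the data-driven methods cited above are about choosing the most predictive combination of them.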

2.2 Autoencoder Architectures for Engineering Data

Autoencoders are proliferating across domains: hierarchical CNNs for fluid field decomposition (Fukami 2020), stacked denoising autoencoders for missing data imputation (Kim 2020), LSTM-autoencoders for time-series anomaly detection (Wei 2022; Park 2023), a physics-informed vector-quantized VAE achieving a compression ratio (CR) of 85 with MSE of order 10^-3 for turbulence compression (Momenifar 2022), and a sparse autoencoder-SVM for high-dimensional anomaly detection (Wei 2023). However, no paper in the corpus applies autoencoder-based representation learning specifically to geotechnical or OWT structural health monitoring sensor data -- a clear transfer opportunity identified by multiple batch summaries.

2.3 Digital Twins for Offshore Wind

Digital twin maturity is advancing rapidly. Branlard et al. (2024) validated a full-scale floating turbine digital twin (TetraSpar) using a Kalman filter and an aerodynamic estimator, achieving 10--15% error on damage-equivalent tower loads. Weil et al. (2023) demonstrated quantifiable scour assessment (not just detection) from resonance frequency shifts using a digital twin combined with automated OMA. Bull et al. (2025) argued for probabilistic ensembles of digital twins rather than single best-fit models for risk-based inspection planning. Stadtmann et al. (2023) demonstrated DT capability levels 0--3 for an operational FOWT using neural networks and transfer learning. Stuyts (2022) pioneered a cloud-based serverless application for parametric back-analysis across entire wind farms. However, predictive and prescriptive DT levels (3--4) have only been demonstrated in limited prototypes; autonomous DTs (level 5) remain aspirational (Nickpasand 2023).
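
The state-estimation step inside such a twin can be illustrated with a scalar Kalman filter. This is only a sketch: a real implementation such as the one validated on TetraSpar uses a multi-degree-of-freedom structural model, whereas here the state is a single slowly drifting quantity with a random-walk model, and all noise variances are assumed values.

```python
def kalman_1d(measurements, x0, p0, process_var, meas_var):
    """Scalar Kalman filter with a random-walk state model."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement by the Kalman gain.
        gain = p / (p + meas_var)
        x = x + gain * (z - x)
        p = (1.0 - gain) * p
        estimates.append(x)
    return estimates

# Noisy readings of a quantity whose true value is 1.0 (synthetic data).
readings = [1.02, 0.98, 1.01, 0.99, 1.00, 1.03, 0.97, 1.00]
estimates = kalman_1d(readings, x0=0.0, p0=1.0, process_var=1e-4, meas_var=0.01)
print([round(e, 4) for e in estimates])
```

Starting from a deliberately poor initial guess, the estimate locks on within a few samples and then averages out the measurement noise. The ratio of `process_var` to `meas_var` sets how quickly the filter follows genuine change versus how strongly it smooths.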

2.4 Value of Information and Risk-Based Inspection

Kamariotis et al. (2021) demonstrated Bayesian decision analysis for quantifying the Value of Information (VoI) from vibration-based SHM, modelling the full data-to-decision chain for deteriorating bridges. Giordano et al. (2022) quantified the monetary benefit of permanent seismic SHM for evacuation decisions, balancing casualty risk against business interruption. The JCSS framework (Goyet et al. 2010) established RBI using Bayesian Probabilistic Networks for FPSO inspection planning. Bull et al. (2025) integrated probabilistic digital twins with SHM-OMA for risk-based inspection decisions. Efficient nested Monte Carlo (Heath & Baio 2018) reduced EVSI computation from days to seconds. Thons (2019, TU1402 COST Action) provided a framework for VoI analysis for SHM decision support. The Life Quality Index (LQI) criterion has been applied to structural design acceptability, with a societal willingness to pay (SWTP) of approximately 1.9--5 M USD per statistical life (Rackwitz 2008).
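
A worked preposterior calculation makes the VoI concept concrete. The sketch below uses a two-state, two-action problem with an inspection characterised by a probability of detection and a false-alarm probability. The costs, the prior damage probability, and the assumption that a repair fully removes the failure risk are all hypothetical simplifications.

```python
def best_expected_cost(p_damage, costs):
    """Minimum expected cost over the actions 'repair' and 'wait'."""
    repair = costs["repair"]              # paid regardless of true state
    wait = p_damage * costs["failure"]    # failure occurs only if damaged
    return min(repair, wait)

def value_of_information(p_damage, pod, pfa, costs):
    """Expected cost reduction from inspecting before deciding."""
    prior_cost = best_expected_cost(p_damage, costs)
    # Probability of each inspection outcome.
    p_alarm = pod * p_damage + pfa * (1.0 - p_damage)
    p_quiet = 1.0 - p_alarm
    # Bayes' rule: damage probability given each outcome.
    post_alarm = pod * p_damage / p_alarm if p_alarm > 0 else 0.0
    post_quiet = (1.0 - pod) * p_damage / p_quiet if p_quiet > 0 else 0.0
    preposterior_cost = (p_alarm * best_expected_cost(post_alarm, costs)
                         + p_quiet * best_expected_cost(post_quiet, costs))
    return prior_cost - preposterior_cost

# Hypothetical costs in arbitrary monetary units.
costs = {"repair": 1.0, "failure": 20.0}
voi_perfect = value_of_information(0.10, pod=1.0, pfa=0.0, costs=costs)
voi_imperfect = value_of_information(0.10, pod=0.9, pfa=0.05, costs=costs)
print(f"perfect inspection:   {voi_perfect:.3f}")
print(f"imperfect inspection: {voi_imperfect:.3f}")
```

An inspection is worth commissioning only if it costs less than the value computed here. The perfect-information case gives the upper bound, and detection errors reduce it.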

2.5 Transfer Learning and Domain Adaptation

Transfer learning is demonstrated for battery capacity estimation across chemistries (Zhu 2022, RMSE 1.1% base, <1.7% transfer), for DEL prediction under wake conditions (Haghi 2024), and for predictive digital twins (Stadtmann 2023). However, cross-site generalization of ML models in geotechnics remains unaddressed -- most studies (Arbi 2025; Gendy 2025; Huynh 2025) train and validate on single datasets. Gueye (2025) is the only paper explicitly testing model degradation under noisy inputs.

3. CONTESTED CLAIMS

3.1 When Deep Learning Adds Value Over Classical ML

Ng et al. (2020) showed that CNN outperforms PLSR/Cubist only above approximately 2,000 training samples for spectral soil data; below this threshold simpler models are competitive. Schulz et al. (2020) found that for brain imaging at approximately 10k subjects, linear models match DL. Belhaouate et al. (2025) found that shallow ANNs outperform deeper ones for earthen construction. The meta-analysis by Yaghoubi et al. (2024) concluded that ensemble learning outperforms ANN, standalone ML, and DL overall in geotechnical prediction tasks, yet ANN remains the most widely used. The threshold sample size, data nonlinearity, and architectural depth for justified DL deployment remain domain-specific and unresolved.

3.2 Physics-Based vs. Data-Driven Digital Twins

The systematic review by Rathore et al. (2021) emphasizes data-driven DTs powered by big data and AI, while Ritto et al. (2021) argue that physics-based models provide essential interpretability and that ML should augment, not replace, physics. Branlard (2024) advocates physics-based models for transferability; Zhong (2023) uses deep-learning surrogates for a 500x speedup. Bull (2025) demands probabilistic ensembles rather than deterministic DTs. The optimal ratio of physics to data remains contested.

3.3 Feature Selection for DL Models

Chen & Fan (2023) found that traditional feature selection methods (IGR, RFE) actually reduce DL model accuracy for landslide prediction, contradicting the common practice of pre-filtering inputs. Autoencoder-based feature extraction is recommended for DL, creating a methodological split between ML and DL preprocessing pipelines.

3.4 Low-Frequency Fatigue Dynamics

Sadeghi et al. (2023) demonstrated that up to 65% of fatigue damage relates to low-frequency dynamics (periods > 1 day) missed by standard 10-minute DLC analysis windows. This directly challenges the DNV industry-standard framework. Whether correction factors suffice or fundamental changes to data segmentation are needed is unresolved.

3.5 Constitutive Model Selection for Cyclic SSI

Orakci et al. (2024) compared SANISAND-MS and PM4SAND for cyclic strain accumulation in monopile sands and found that the two models differ in both accuracy and calibration complexity. Cudny & Truty (2020) addressed overshooting in HSS with the HS-Brick model. LeBlanc (2010) showed that pile stiffness increases with cycling in sand (contradicting the API degradation assumption), while Lombardi (2013) found that frequency changes depend on shear strain level in clay. No single constitutive model satisfies all cyclic loading requirements.

4. VERIFIED GAPS

4.1 Scour + Cyclic + Seismic Multi-Hazard Coupling

Individual papers address scour, cyclic capacity, and seismic loading separately. No paper in the corpus integrates all three into a unified probabilistic fatigue framework. Kontoni (2023) considers wind+wave+earthquake; Hachem (2023) considers scour alone. Ngo (2022) addresses scour+seismic fragility for suction buckets but not with cyclic degradation. Jia (2024) combines seismic and scour for bucket OWT but without fatigue accumulation.

4.2 Autoencoder-Based SHM for Geotechnical and OWT Sensor Data

Six autoencoder papers per batch target biomedical and remote-sensing domains; none applies autoencoders to vibration signals, soil sensor data, or load-displacement curves for OWT foundations. This gap is explicitly identified as a transfer opportunity by batch03_agent1, batch03_agent4, and batch03_agent5.

4.3 Long-Term Field Validation of ML-Based Geotechnical Predictions

Most ML models train on limited datasets (125--472 points) from laboratory or literature sources. Cross-site generalization, domain adaptation, and noise robustness are unaddressed. Systematic validation against large, diverse field databases is absent (Phoon 2019; Stuyts 2024).

4.4 Integrated Scour-SHM-Digital-Twin Closed Loop

Although digital twins (Branlard 2024; Sindi 2024; Tian 2024), scour monitoring (Weijtjens 2017; Kim 2025), and structural reliability updating (Bull 2025) exist independently, no paper connects real-time scour evolution into a full structural digital twin with automated decision support for OWT.

4.5 Population-Level Bayesian SHM for Offshore Wind Farms

Only Smith et al. (2023) explicitly address population-based SHM (PBSHM) across a wind farm using hierarchical Bayesian models. Scaling to full-size farms with hundreds of turbines and diverse soil conditions remains unexplored.

4.6 Data Augmentation for Small Geotechnical Datasets

Chadebec (2021) demonstrates geometry-based VAE augmentation for medical imaging; equivalent techniques have not been applied to geotechnical data (CPT profiles, load-displacement curves, sensor streams) despite similar small-sample challenges.

4.7 VoI-Driven Inspection Planning for OWT Foundations

While Kamariotis (2021) demonstrated VoI for bridge SHM and JCSS/Goyet (2010) for FPSO RBI, VoI analysis specifically integrating scour monitoring data with OWT foundation remaining-life estimation and inspection scheduling does not exist.

5. QUANTITATIVE BENCHMARKS

5.1 ML Model Performance

| Task | Best Model | Metric | Source |
|---|---|---|---|
| Pile bearing capacity (layered) | XGBoost + Grid Search | R² > 0.9 | Arbi 2025 |
| Pile load-settlement | GPR | Top performer | Gendy 2025 |
| Scour depth (bridge pier) | 1D CNN + Buckingham Pi | R² = 0.85 | Huynh 2025 |
| Scour depth (tripod) | ANN-PSO | R² = 0.99 | Jatoliya 2024 |
| Shallow foundation reliability | FNN on MC data | Relative error < 10^-5 | Savvides 2024 |
| Slope stability | LSTM | Outperforms CNN/SVM/RF | Huang 2023 |
| WT fault diagnosis | Decision tree / RF | 92.7% / 92.0% accuracy | Hsu 2020 |
| Wind profile extrapolation | Random forest | Outperforms ERA5 at 100 m | Frech 2025 |
| Battery capacity (transfer) | ML on voltage relaxation | RMSE 1.1%, <1.7% transfer | Zhu 2022 |

5.2 SHM and Digital Twin Performance

| Metric | Value | Source |
|---|---|---|
| Digital twin fatigue DEL accuracy (field) | 10--15% error | Branlard 2024 |
| Digital twin fatigue DEL accuracy (simulation) | 5--10% error | Branlard 2024 |
| ROM surrogate speedup vs physics model | 500x | Zhong 2023 |
| OxCaisson vs 3D FEM cost reduction | Several orders of magnitude | Suryasentana 2020 |
| OWT fn design underestimation | 5--15% below monitored | Stuyts 2022 |
| Arany closed-form fn error | <3.5% across 10 EU farms | Arany 2016 |
| Monopile fn drop per 1D scour | 5--15% | Li 2020, Jawalageri 2022 |
| Tripod bucket fn drop at 0.6D scour | 5.3% max | Kim 2025 |
| Scour-induced fatigue life reduction | ~24% at 1.3D scour | Cao 2024 |
| TMD fatigue life increase (1% mass) | ~65% at 1.3D scour | Cao 2024 |
| EOV frequency scatter reduction | 70% via RANSAC+CUSUM | Kim 2026 (submitted) |
| Scour detection threshold | 0.39D at 95% probability, zero false alarms | Kim 2026 (submitted) |
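
The change-detection half of the RANSAC+CUSUM entry can be sketched with a one-sided CUSUM on a tracked natural frequency. This is a generic textbook CUSUM, not the submitted Kim 2026 procedure; the frequency series is synthetic, and the reference value, slack, and alarm threshold are assumed.

```python
def cusum_lower(series, target, slack, threshold):
    """Index of the first alarm for a downward shift, or None if no alarm."""
    s = 0.0
    for i, x in enumerate(series):
        # Accumulate only drops larger than the slack; reset at zero.
        s = max(0.0, s + (target - x) - slack)
        if s > threshold:
            return i
    return None

# Synthetic first natural frequency in Hz: healthy, then a 2% drop.
healthy = [0.300 + 0.001 * (-1) ** i for i in range(20)]
scoured = [0.294 + 0.001 * (-1) ** i for i in range(20)]

alarm_index = cusum_lower(healthy + scoured, target=0.300, slack=0.002, threshold=0.01)
false_alarm = cusum_lower(healthy, target=0.300, slack=0.002, threshold=0.01)
print(alarm_index, false_alarm)
```

The slack absorbs ordinary scatter, so the statistic stays at zero while the structure is healthy and climbs steadily once the mean shifts. Reducing EOV scatter beforehand allows a smaller slack, which is what pushes the detectable scour depth down.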

5.3 Reliability and Risk Benchmarks

| Parameter | Value | Source |
|---|---|---|
| IEC OWT design MRP | 50 years | Wilkie 2019 |
| O&G platform design MRP | 10,000 years | Wilkie 2019 |
| Foundation cost share of OWT | 15--40% of total | Houlsby 2000 |
| Monopile market share for OWT | 77--80% | Abdelhak 2024 |
| Soil damping ratio range (monopile) | 0.17--1.3% critical | Rezaei 2018 |
| SWTP per statistical life (LQI) | 1.9--5 M USD | Rackwitz 2008 |
| JCSS soil CoV tables | Per PMC Section 3.7 | JCSS 2006 |
| Fatigue damage variation from foundation model choice | Up to 180% | Katsikogiannis 2019 |
| Kriging training points for acceptable idling fatigue | ~2,000 | Schmidt 2025 |
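
The gap between the two mean return periods (MRP) in the table is easier to appreciate as a lifetime exceedance probability. The sketch below assumes independent years and a 25-year service life; the service life is an illustrative assumption, not a value from the cited sources.

```python
def lifetime_exceedance_probability(mrp_years, lifetime_years):
    """P(design event exceeded at least once), assuming independent years."""
    annual_probability = 1.0 / mrp_years
    return 1.0 - (1.0 - annual_probability) ** lifetime_years

SERVICE_LIFE = 25  # assumed service life in years

p_owt = lifetime_exceedance_probability(50, SERVICE_LIFE)
p_oil_gas = lifetime_exceedance_probability(10_000, SERVICE_LIFE)
print(f"50-year MRP:     {p_owt:.3f}")
print(f"10,000-year MRP: {p_oil_gas:.5f}")
```

Under these assumptions a 50-year design event has roughly a 40% chance of being exceeded during the service life, against about 0.25% for a 10,000-year event.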

5.4 Reference Turbines and Databases

| Benchmark | Description | Primary Users |
|---|---|---|
| NREL 5 MW | Legacy reference turbine | Robertson, Park, Smith, Sorum |
| DTU 10 MW | Intermediate reference | Katsikogiannis, Dimitrov |
| IEA 15 MW + VolturnUS-S | Modern floating reference | Guo, Mroczek, Wiley, Branlard |
| OC5 Alpha Ventus | Multi-code validation dataset | Popko 2019 |
| PISA Dunkirk/Cowden | Large-scale monopile field tests | Orakci, McAdam, Kheffache |
| Ching CLAY-Cc/6/6203 | 6,203 clay compressibility records | Ching 2022 |
| CWRU bearing dataset | Machinery fault detection benchmark | Neupane 2020, Zhang S 2020 |
| Gottardi (1999) VHM dataset | Circular footings on sand | All macro-element calibration studies |
| Borkum Riffgrund 1 SBJ | First full-scale suction bucket jacket monitoring | Shonberg 2017 |
| Robin Rigg failure case | Monopile decommissioned after scour | Carlos/Vicente 2023 |

Generated: 2026-04-17. Approximately 2,500 words. Source: 45 literature summary files spanning ~1,600 references.