
J2 Results — Seven-Phase Claim-Driven Prototype

Date: 2026-04-18
Status: Fifth prototype; companion to the J2 methodology prototype. Applied retrospectively to the J2 results sections (§3 Validation + §4 Results and Discussion) ahead of the R2 submission on 2026-04-29.

Results sections have a different claim structure from methodology sections: headline finding / sensitivity / comparison against prior work / edge cases, rather than framework / assumptions / setup / V&V. Running the workflow on J2's results therefore surfaces a different class of defect than the methodology pass did.


Phase 1 — Thesis statement

The 3D-FE-calibrated Winkler model predicts the baseline natural frequency of the 4.2 MW Gunsan tripod within −3.9 % of 32 months of field data, establishes a power-law \(|\Delta f / f_0| = 0.167 (S/D)^{1.47}\) across 11 scour depths, and — across a four-representation comparison — is the only model that simultaneously preserves depth-resolved stiffness evolution and runs five orders of magnitude faster than direct 3D analysis.


Phase 2 — Claim list

  • R1 — Baseline field match. Predicted \(f_1\) within −3.9 % of 32-month Gunsan field data; prediction and measurement overlaid with scoured-frequency evolution annotated.
  • R2 — Power-law across 11 scour depths. Monotone, well-fit by \(|\Delta f / f_0| = 0.167 (S/D)^{1.47}\); residual plot confirms no systematic departure.
  • R3 — Four-representation comparison. Fixed-base, macro-element, standard BNWF, distributed BNWF on identical cases. Only distributed BNWF captures the depth-resolved evolution; fixed-base predicts zero response; macro-element misses scour-depth localisation.
  • R4 — Five-orders-of-magnitude speed-up. Wall-clock runtime of distributed BNWF eigenvalue analysis vs. direct 3D FE eigenvalue analysis; averaged over 11 scenarios with per-case timing table.
  • R5 — Per-bucket load-sharing and lid contribution. Load-share ratio per bucket at \(S/D \in \{0, 0.2, 0.4, 0.6\}\); lid-bearing vs skirt-friction breakdown as scour progresses.
  • R6 — SHM threshold alignment. 1P-resonance boundary crossed at \(S/D = 0.43\); mapped against DNV-RP-C212 and IEC 61400-3 fatigue DEL thresholds.
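
The depth-resolved behaviour behind R2 and R3 can be illustrated with a toy beam-on-Winkler model: distributed lateral springs over the embedded length, deactivated from the mudline down as scour progresses. This is a minimal sketch with placeholder properties (a single equivalent pile, arbitrary section, soil stiffness, and tip mass), not the paper's calibrated tripod model; it only demonstrates the monotone first-frequency drop that the distributed-BNWF representation captures and the condensed representations miss.

```python
import numpy as np
from scipy.linalg import eigh

# Toy beam-on-Winkler sketch -- placeholder properties, NOT the calibrated
# J2 tripod model. Node 0 is the pile toe; the beam runs toe -> tower top.
E, I = 210e9, 2.0           # steel modulus (Pa), second moment (m^4) -- arbitrary
rho, A = 7850.0, 0.5        # density (kg/m^3), cross-section area (m^2)
L_embed, L_tower = 10.0, 60.0
k_soil = 5.0e7              # Winkler stiffness per metre of embedment (N/m^2)
m_top = 3.0e5               # lumped tip mass standing in for the RNA (kg)
n_el = 70
le = (L_embed + L_tower) / n_el      # element length (1 m here)
n_embed = round(L_embed / le)        # nodes 0..n_embed lie below the mudline

def element_matrices(le):
    """Euler-Bernoulli element stiffness and consistent mass (w, theta DOFs)."""
    l2 = le * le
    K = (E * I / le**3) * np.array(
        [[ 12,    6*le, -12,    6*le],
         [ 6*le,  4*l2, -6*le,  2*l2],
         [-12,   -6*le,  12,   -6*le],
         [ 6*le,  2*l2, -6*le,  4*l2]])
    M = (rho * A * le / 420) * np.array(
        [[ 156,    22*le,  54,   -13*le],
         [ 22*le,  4*l2,   13*le, -3*l2],
         [ 54,     13*le,  156,  -22*le],
         [-13*le, -3*l2,  -22*le, 4*l2]])
    return K, M

def f1(scour_depth):
    """First natural frequency (Hz) with springs removed above the scour line."""
    ndof = 2 * (n_el + 1)
    K, M = np.zeros((ndof, ndof)), np.zeros((ndof, ndof))
    Ke, Me = element_matrices(le)
    for e in range(n_el):
        sl = slice(2 * e, 2 * e + 4)
        K[sl, sl] += Ke
        M[sl, sl] += Me
    for node in range(n_embed + 1):
        depth = L_embed - node * le          # depth below the original mudline
        if depth > scour_depth:              # scoured springs are deactivated
            K[2 * node, 2 * node] += k_soil * le
    M[-2, -2] += m_top                       # tip mass on the lateral DOF
    w2 = eigh(K, M, eigvals_only=True)
    return np.sqrt(w2[0]) / (2 * np.pi)
```

With these placeholder numbers, \(f_1\) drops monotonically as the scour depth increases, because the shallowest (most effective) springs are lost first — the qualitative signature the four-representation comparison turns on.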

Phase 3 — Claim → evidence map

```mermaid
flowchart TD
    T["<b>Thesis (results)</b><br/>−3.9% field match · PL-1 power-law ·<br/>10⁵× speedup · depth-resolved BNWF wins"]
    T --> R1["R1 · Baseline field match"]
    T --> R2["R2 · Power-law across 11 depths"]
    T --> R3["R3 · Four-representation comparison"]
    T --> R4["R4 · 5-orders speedup"]
    T --> R5["R5 · Per-bucket load-sharing"]
    T --> R6["R6 · SHM threshold alignment"]
    R1 --> E1["Fig · predicted vs field f1 time series (32 months)<br/>Stat · relative error = −3.9%"]
    R2 --> E2["Fig · PL-1 fit across 11 scour depths + residual panel<br/>Table · power-law coefficients with 95% CI"]
    R3 --> E3["Fig · four-representation f1 vs S/D overlay<br/>Table · per-representation error + computational cost"]
    R4 --> E4["Table · wall-clock benchmark (single case + batch of 100)"]
    R5 --> E5["Fig · per-bucket reactions at 4 scour depths<br/>Fig · lid vs skirt contribution as S/D progresses"]
    R6 --> E6["Fig · f1(S/D) vs DNV-RP-C212 + IEC fatigue DEL thresholds"]
    E1 --> S1["spec · fig-field-match-32month"]
    E2 --> S2["spec · fig-powerlaw-fit<br/>table · pl1-coefficients"]
    E3 --> S3["spec · fig-four-representation<br/>table · representation-comparison"]
    E4 --> S4["table · runtime-benchmark"]
    E5 --> S5["spec · fig-per-bucket-share<br/>spec · fig-lid-vs-skirt"]
    E6 --> S6["spec · fig-shm-thresholds"]
```

Six figures, three tables.


Phase 4 — Figure specs

fig-field-match-32month (R1). 32-month Gunsan field \(f_1\) time series (grey) overlaid with the Winkler model's predicted \(f_1\) (blue). Annotation: relative error −3.9 %. Shaded region for the calibration window vs. post-calibration prediction. Residual panel below.

fig-powerlaw-fit (R2). 11 scour depths' \(|\Delta f / f_0|\) vs. \(S/D\) on log-log axes with PL-1 fit line \(|\Delta f / f_0| = 0.167 (S/D)^{1.47}\). Residual panel below shows deviation from fit, confirming no systematic departure. \(R^2\) annotated.

table-pl1-coefficients (R2). Single-row table. Columns: coefficient \(a\), exponent \(b\), \(R^2\), RMSE, per-depth max residual. 95 % CI on \(a\) and \(b\) from bootstrap.
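
A sketch of the PL-1 fit and bootstrap CI, run here on synthetic data generated from the paper's reported coefficients (\(a = 0.167\), \(b = 1.47\)). The function names, the 11-depth grid, and the noise level are illustrative assumptions, not the paper's actual pipeline or measured scatter.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_power_law(sd, dff):
    """Fit |df/f0| = a * (S/D)**b by least squares in log-log space."""
    b, log_a = np.polyfit(np.log(sd), np.log(dff), 1)
    return np.exp(log_a), b

def bootstrap_ci(sd, dff, n_boot=2000, level=0.95):
    """Percentile bootstrap CI on (a, b), resampling the depth points."""
    fits = []
    for _ in range(n_boot):
        pick = rng.integers(0, len(sd), size=len(sd))
        if np.unique(sd[pick]).size < 2:      # degenerate resample, skip
            continue
        fits.append(fit_power_law(sd[pick], dff[pick]))
    q = (1 - level) / 2
    return np.quantile(fits, [q, 1 - q], axis=0)  # rows lo/hi, cols (a, b)

# Synthetic stand-in data: 11 scour depths, true coefficients from the paper,
# mild multiplicative noise (an assumption, not the measured scatter).
sd = np.linspace(0.05, 0.6, 11)
dff = 0.167 * sd**1.47 * np.exp(rng.normal(0.0, 0.02, sd.size))

a, b = fit_power_law(sd, dff)
ci_lo, ci_hi = bootstrap_ci(sd, dff)
```

Resampling depth points (rather than residuals) is the simplest bootstrap variant; if the per-depth residuals turn out to be heteroscedastic, a residual bootstrap on the log-log fit may be the better choice for the published CI.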

fig-four-representation (R3). Four curves (fixed-base, macro-element, standard BNWF, distributed BNWF) of \(f_1/f_{1,0}\) vs \(S/D\) on the same axes. Field data point overlay. Annotation: only distributed BNWF matches the observed trend.

table-representation-comparison (R3). Columns: representation, relative error at \(S/D = 0\), \(S/D = 0.25\), \(S/D = 0.5\); computational cost per case; verdict (✓ or specific-failure-mode note).

table-runtime-benchmark (R4). Rows: 3D FE (OptumGX), Mode A fixed-base, Mode B 6×6, Mode C distributed BNWF, Mode D dissipation-weighted. Columns: single-case wall-clock, batch of 100, speedup ratio vs 3D FE, hardware spec. Addresses reviewer comment R2.4 from the J2 R2 response.
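
A formatting sketch for the speedup column. The timings below are placeholders chosen only to make the arithmetic concrete — the published table must use the actual wall-clock measurements R2.4 asks for, plus the hardware spec.

```python
# Placeholder timings in seconds -- illustrative only, NOT measurements.
timings = {
    "3D FE (OptumGX)":             3.6e4,   # ~10 h, assumed
    "Mode A fixed-base":           0.2,
    "Mode B 6x6":                  0.3,
    "Mode C distributed BNWF":     0.4,
    "Mode D dissipation-weighted": 0.5,
}
t_ref = timings["3D FE (OptumGX)"]
print(f"{'representation':<30}{'single (s)':>12}{'batch 100 (s)':>15}{'speedup':>10}")
for name, t in timings.items():
    # batch column assumes naive linear scaling, good enough for the sketch
    print(f"{name:<30}{t:>12.3g}{100 * t:>15.3g}{t_ref / t:>10.3g}")
```

With these placeholders, Mode C's ratio is 3.6×10⁴ / 0.4 ≈ 9×10⁴ — the five-orders-of-magnitude regime the thesis claims — but only measured numbers can substantiate R4.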

fig-per-bucket-share (R5). Stacked bar chart at \(S/D \in \{0, 0.2, 0.4, 0.6\}\) showing load-share fraction across bucket A, B, C. Annotation: as scour progresses asymmetrically, load redistributes by 20–40 %.

fig-lid-vs-skirt (R5). Dual-axis line plot: lid-bearing capacity contribution (%) vs scour depth (left axis); skirt-friction contribution (%) vs scour depth (right axis). Annotation: at \(S/D = 0.5\), lid contributes ~30 % of lateral resistance — a mechanism absent in monopile theory.

fig-shm-thresholds (R6). \(f_1/f_{1,0}\) vs \(S/D\) with four horizontal threshold bands coloured green / yellow / orange / red; 1P-resonance boundary annotated at \(S/D = 0.43\); DNV-RP-C212 and IEC 61400-3 fatigue-DEL thresholds overlaid as dashed lines.
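
The crossing point can be checked by inverting the fitted power law. In this minimal sketch the frequency-drop value for the 1P band is back-calculated from the paper's reported \(S/D = 0.43\) (it comes out near a 4.8 % drop); it is an illustrative assumption, not a number taken from the standards.

```python
# Fitted PL-1 coefficients from the paper.
a, b = 0.167, 1.47

def sd_at_drop(delta):
    """Scour ratio S/D at which |df/f0| = a * (S/D)**b reaches `delta`."""
    return (delta / a) ** (1.0 / b)

# Drop implied at the reported 1P crossing (illustration, not a standards
# value): |df/f0| at S/D = 0.43 is about 0.048, i.e. a ~4.8 % drop.
delta_1p = a * 0.43**b
print(f"drop at crossing: {delta_1p:.3f}; S/D at that drop: {sd_at_drop(delta_1p):.2f}")
```

The same inversion gives the \(S/D\) value for each of the four coloured bands once their frequency-drop limits are fixed from the DNV/IEC thresholds.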


Phase 5 — Paragraph skeletons

§3 Validation. ¶1 (field baseline) [→ fig-field-match-32month]. Predicted \(f_1\) within −3.9 % of 32 months of Gunsan field data. Framing: baseline match is a necessary condition for the Winkler model's credibility; it is the first number a reviewer wants. ¶2 (centrifuge cross-check). Reference the J3 centrifuge power-law (|a| = 0.044–0.086 in saturated sand) vs the numerical |a| = 0.167; disparity attributed to the centrifuge-protocol OCR artefact (J3 §2.4.1), directionally conservative. ¶3 (validation coverage). Baseline-field + centrifuge-cross-check + benchmark-against-literature = three independent validation anchors. Stress that further scour-history validation requires bathymetric ground truth, flagged in limitations.

§4 Results and discussion. ¶1 (power-law extraction) [→ fig-powerlaw-fit, table-pl1-coefficients]. PL-1 fit across 11 scour depths; well-constrained power-law. ¶2 (four-representation comparison) [→ fig-four-representation, table-representation-comparison]. Only distributed BNWF captures both magnitude and depth-resolved evolution. Fixed-base ≡ 0 (no scour sensitivity by construction). Macro-element misses the scour-depth localisation because it condenses to a single node. ¶3 (computational cost) [→ table-runtime-benchmark]. Five-orders-of-magnitude speedup. Critical for real-time digital-twin updating (Paper A) and population-scale Monte Carlo (Paper J5). ¶4 (per-bucket load-sharing) [→ fig-per-bucket-share, fig-lid-vs-skirt]. Three-bucket load redistribution and lid-bearing contribution quantified — the mechanism-level answer to why tripods are less scour-sensitive than monopiles. Addresses reviewer R2.5. ¶5 (SHM thresholds) [→ fig-shm-thresholds]. 1P-resonance boundary at \(S/D = 0.43\) (\(\approx 3.4\) m scour); mapped against DNV-RP-C212 + IEC 61400-3 fatigue DEL. Addresses reviewer R2.2.


Phase 6 — Generate figures (figure_generator integration)

Six figure_generator sessions plus three tables. fig-field-match-32month and fig-powerlaw-fit are the hero pair — if either fails, the paper's headline fails.

Highest priorities before R2 submission (2026-04-29):

  • fig-per-bucket-share + fig-lid-vs-skirt are NEW figures for R2, addressing reviewer R2.5. Without them, the reviewer response to R2.5 is narrative only.
  • table-runtime-benchmark is NEW, addresses R2.4.
  • fig-shm-thresholds needs the DNV-RP-C212 + IEC 61400-3 overlay added for R2.2.

Each is one figure_generator session.


Phase 7 — Coherence passes

Pass A — paragraphs only.

  • §3 (validation) currently narrates the baseline match before showing the figure. Fix: hero figure goes first (¶1 opening sentence: "figure X shows the field match within −3.9 %"), then the narrative.
  • §4 ¶4 (load-sharing) currently claims the mechanism but does not derive it from the figures. Fix: the paragraph must walk the reader from the per-bucket reaction plot to the lid-vs-skirt decomposition and then to the mechanism conclusion — three sentences, each tied to a figure element.
  • §4 ¶5 (SHM thresholds) frames the result as operational guidance but does not say who the intended user is. Fix: one sentence naming the audience ("an asset manager setting inspection intervals under DNV-RP-C212 compliance").

Pass B — figures only.

  • Six figures is borderline dense for a 9,000-word paper. Consider combining fig-per-bucket-share + fig-lid-vs-skirt into a single two-panel figure (panel a = bucket shares, panel b = lid vs skirt). That saves one figure slot, and the mechanism argument still reads.
  • fig-powerlaw-fit with a residual panel underneath is visually heavy. Decide: split into a main figure (top panel) plus a supplementary residual, or keep them combined with a smaller residual panel. Either works.

Reviewer objection rehearsal (results-specific):

  1. "How do you know the 32-month baseline match is not overfit?" — answered by ¶2 centrifuge cross-check + ¶3 three-anchor validation narrative. Strengthen: cite the hold-out window (if any) where the model was not calibrated; if no hold-out, acknowledge and frame the −3.9 % as a single-anchor match.
  2. "Why 11 scour depths rather than a coarser set?" — answered by convergence rationale in methodology §2.2.3. Pointer needed in results §4 ¶1.
  3. "Is the power-law exponent 1.47 stable across soils?" — currently not addressed. Add one sentence on the coefficient's soil-dependence expected behaviour + citation to Paper B's Buckingham-Pi framework, or flag as future work if out of scope.

Actionable items before R2 submission:

| # | Item | Driving reviewer comment | Cost |
|---|------|--------------------------|------|
| 1 | Build fig-per-bucket-share + fig-lid-vs-skirt (or combined) | R2.5 | 1–2 sessions |
| 2 | Build table-runtime-benchmark with actual wall-clock measurements | R2.4 | 1 session |
| 3 | Overlay DNV-RP-C212 + IEC fatigue DEL on SHM threshold figure | R2.2 | 30 min |
| 4 | Add one-sentence pointer to soil-dependence of power-law exponent | new | 5 min |
| 5 | Coherence edit on §4 ¶4 walking figure → figure → mechanism | new | 15 min |

Five action items, roughly four sessions of work in total. Completable in the first two days of the R2 work plan.


What this pass surfaces that the methodology pass did not

  • Three of the five R2 reviewer items (R2.2, R2.4, R2.5) are results-section figures, not methodology-section figures. The methodology prototype flagged R2.4 and R3.1 as missing evidence; this results-prototype pass catches R2.5 and R2.2 additionally.
  • The paragraph-level coherence fixes are different between methodology and results. Methodology fixes tend to be framing sentences (claim-to-section links). Results fixes tend to be mechanism walk-throughs tying figures together.
  • The "who is the intended user" sentence on the SHM thresholds slide is a results-section-specific fix — in methodology sections the reader is the peer reviewer; in results sections it is the engineer or asset manager. This distinction should propagate into other papers' results-section prototypes (J3 results comes next).