C CONV2D_FP64: 2D convolution, double-precision floating point,
C single-image, multi-channel input, multi-filter output.
C Stride 1, no padding (valid convolution).
C
C Reference: standard textbook formulation; see e.g.
C Goodfellow, Bengio, Courville, "Deep Learning" (2016), ch. 9.
C Equivalent in shape to cuDNN's CUDNN_TENSOR_NCHW direct
C convolution with stride 1 and padding 0 (no padding). The kernel
C is applied without flipping, i.e. as a cross-correlation.
C
C Hand-written reference for the Dark Factory's Phase 3 inference
C kernel ladder, 2026-05-24. Public domain.
C
C Mathematical definition (valid convolution):
C   For every output position (OF, OH, OW):
C     Y(OF, OH, OW) = sum over (IC, KR, KS) of
C                       X(IC, OH + KR - 1, OW + KS - 1)
C                       * KER(OF, IC, KR, KS)
C   with OF in 1..OC, OH in 1..HO, OW in 1..WO,
C   IC in 1..C, KR in 1..KH, KS in 1..KW.
C
C Shapes (column-major Fortran):
C   X   is C  x H  x W          (input feature map)
C   KER is OC x C  x KH x KW    (filter bank)
C   Y   is OC x HO x WO         (output feature map)
C   where HO = H - KH + 1, WO = W - KW + 1.
C
C Overflow note: each output element is a sum of C * KH * KW
C double-precision products. For typical inputs this is well
C within IEEE 754 binary64 dynamic range, so no special handling
C is applied.
      SUBROUTINE CONV2D_FP64
     +     (H, W, C, KH, KW, OC, X, KER, Y)
C     Inputs:
C       H, W   : input feature-map height and width (positive)
C       C      : input channel count (positive)
C       KH, KW : kernel height and width
C                (positive, KH <= H, KW <= W)
C       OC     : output channel count (positive)
C       X      : C x H x W input feature map
C       KER    : OC x C x KH x KW filter bank
C     Output:
C       Y      : OC x (H - KH + 1) x (W - KW + 1) output feature map
      INTEGER H, W, C, KH, KW, OC
      DOUBLE PRECISION X(C, H, W)
      DOUBLE PRECISION KER(OC, C, KH, KW)
      DOUBLE PRECISION Y(OC, H - KH + 1, W - KW + 1)
      INTEGER OF, OH, OW, IC, KR, KS, HO, WO
      DOUBLE PRECISION ACC
      HO = H - KH + 1
      WO = W - KW + 1
      DO 50 OF = 1, OC
         DO 40 OH = 1, HO
            DO 30 OW = 1, WO
               ACC = 0.0D0
               DO 20 IC = 1, C
                  DO 15 KR = 1, KH
                     DO 10 KS = 1, KW
                        ACC = ACC +
     +                     X(IC, OH + KR - 1, OW + KS - 1)
     +                     * KER(OF, IC, KR, KS)
   10                CONTINUE
   15             CONTINUE
   20          CONTINUE
               Y(OF, OH, OW) = ACC
   30       CONTINUE
   40    CONTINUE
   50 CONTINUE
      RETURN
      END
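C
C Illustrative self-check driver. This is NOT part of the original
C reference kernel; it is a minimal sketch showing one way the
C routine above might be exercised. The program name CONVCHK, the
C array EXPECT, and the test data are all assumptions made for
C illustration.
C
C Assumed case: one input channel, one filter, a 3 x 3 input
C holding 1..9 in column-major order (X(1, I, J) = I + 3*(J - 1)),
C and a 2 x 2 kernel of ones. Each output is then the sum of one
C 2 x 2 window of X, so the expected values follow by hand:
C   Y(1,1,1) = 1 + 2 + 4 + 5 = 12
C   Y(1,2,1) = 2 + 3 + 5 + 6 = 16
C   Y(1,1,2) = 4 + 5 + 7 + 8 = 24
C   Y(1,2,2) = 5 + 6 + 8 + 9 = 28
C Exact comparison is used only because every value here is a
C small integer, which binary64 represents exactly.
      PROGRAM CONVCHK
      INTEGER H, W, C, KH, KW, OC
      PARAMETER (H = 3, W = 3, C = 1, KH = 2, KW = 2, OC = 1)
      DOUBLE PRECISION X(C, H, W)
      DOUBLE PRECISION KER(OC, C, KH, KW)
      DOUBLE PRECISION Y(OC, H - KH + 1, W - KW + 1)
      DOUBLE PRECISION EXPECT(OC, H - KH + 1, W - KW + 1)
      INTEGER I, J
      LOGICAL OK
C     DATA fills each array in storage order (first index fastest).
      DATA X /1.0D0, 2.0D0, 3.0D0, 4.0D0, 5.0D0,
     +        6.0D0, 7.0D0, 8.0D0, 9.0D0/
      DATA KER /4 * 1.0D0/
      DATA EXPECT /12.0D0, 16.0D0, 24.0D0, 28.0D0/
      CALL CONV2D_FP64(H, W, C, KH, KW, OC, X, KER, Y)
C     Compare every output element against the hand-computed value.
      OK = .TRUE.
      DO 70 J = 1, W - KW + 1
         DO 60 I = 1, H - KH + 1
            IF (Y(1, I, J) .NE. EXPECT(1, I, J)) OK = .FALSE.
   60    CONTINUE
   70 CONTINUE
      IF (OK) THEN
         WRITE (*, *) 'CONV2D_FP64 self-check passed'
      ELSE
         WRITE (*, *) 'CONV2D_FP64 self-check FAILED'
      END IF
      END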