EPC-3D-Diff: Equivariant Physics Consistent Conditional 3D Latent Diffusion for CBCT to CT Synthesis

arXiv, 21 May 2026

AI Insight

EPC-3D-Diff is a conditional 3D latent diffusion model that synthesizes CT-quality volumes from cone-beam CT (CBCT) volumes, addressing the degraded Hounsfield Unit (HU) accuracy caused by scatter, noise, and reconstruction artifacts in CBCT. The framework introduces a physics-based equivariance loss derived from the relationship between in-plane volume rotations and angular shifts in projection data: during training, forward projections of the rotated synthesized CT are matched to angle-shifted projections of the paired target CT. Validated on a paired head phantom dataset and on paired clinical data, the method improved PSNR by 7.4 dB (phantom) and 1.8 dB (clinical) over the compared state-of-the-art methods, with additional gains in SSIM and HU accuracy.
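To put the reported PSNR gains in perspective, the standard definition PSNR = 10 log10(MAX^2 / MSE) can be sketched in a few lines. This is a minimal illustration, not the paper's evaluation code: the function names `psnr` and `mse_ratio_from_psnr_gain` and the `data_range` parameter are hypothetical, and the source does not state which intensity range was used.

```python
import math


def psnr(pred, ref, data_range):
    """Peak signal-to-noise ratio in dB for two equal-length value lists.

    data_range is the assumed maximum intensity span (hypothetical here;
    the source does not specify the range used for its PSNR figures).
    """
    mse = sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref)
    if mse == 0:
        return math.inf  # identical inputs: PSNR is unbounded
    return 10.0 * math.log10(data_range ** 2 / mse)


def mse_ratio_from_psnr_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain at fixed data_range."""
    return 10.0 ** (gain_db / 10.0)


# Gains quoted in the summary above, converted to MSE reduction factors.
phantom_factor = mse_ratio_from_psnr_gain(7.4)
clinical_factor = mse_ratio_from_psnr_gain(1.8)
```

Under this definition and a fixed data range, a 7.4 dB gain corresponds to roughly a 5.5x reduction in mean squared error, and a 1.8 dB gain to roughly 1.5x.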

Why it matters

Accurate HU values in CBCT-derived images are essential for adaptive radiotherapy dose planning, and a method that reliably converts CBCT to CT-quality images could reduce the need for additional CT acquisitions, lowering patient radiation exposure and streamlining clinical workflows.

Confidence

5/10 · Peer-reviewed · Physics

arXiv:2605.20470v1 Announce Type: cross
Abstract: Cone-beam CT (CBCT) is routinely acquired during radiotherapy for patient setup, but its quantitative reliability is degraded by scatter, noise, and reconstruction artifacts, limiting Hounsfield Unit (HU) accuracy. We propose EPC-3D-Diff, a novel conditional 3D latent diffusion framework for volumetric CBCT to CT synthesis that introduces a projection-domain equivariance loss derived from acquisition physics. Unlike common image-domain equivariance, we exploit the fact that an in-plane rotation of the volume corresponds to an angular shift in its projections. During training, we enforce this relationship by forward-projecting rotated synthesized CT volumes and matching them to appropriately angle-shifted projections of the paired target CT, yielding a physics-consistent equivariance constraint integrated into the diffusion objective. To capture full 3D context efficiently, conditional diffusion is performed in a compact latent space learnt by a lightweight 3D autoencoder, preserving axial depth while downsampling in-plane resolution for stable training. We validate on a paired head CBCT/CT phantom dataset, including repeat scans, and paired clinical data using patient-wise splits, and perform single- and mixed-domain training, ablations, and comparisons with diffusion and CycleGAN. EPC-3D-Diff generalizes well and achieved substantial improvements, +7.4 dB (phantom) and +1.8 dB (clinical data) in PSNR compared to state-of-the-art methods, alongside improved SSIM and HU accuracy within tissue boundaries. Overall, EPC-3D-Diff improves robustness and physics consistency, supporting HU-aware synthesis for downstream radiotherapy workflows.
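The rotation-to-angular-shift relationship described in the abstract can be illustrated with a deliberately simplified toy. The sketch below assumes a single 2D slice, an idealized parallel-beam projector limited to four angles (0, 90, 180, 270 degrees), and quarter-turn rotations only, so that everything is exact in integer arithmetic. The paper's actual cone-beam forward projector, rotation angles, and loss weighting are not given in the abstract, and the names `rot90`, `project`, and `equivariance_loss` are hypothetical.

```python
def rot90(img):
    """Rotate a square 2D slice by 90 degrees counterclockwise."""
    n = len(img)
    return [[img[j][n - 1 - i] for j in range(n)] for i in range(n)]


def project(img, k):
    """Toy parallel-beam projection at k quarter-turns (k * 90 degrees).

    Assumption: line integrals reduce to column or row sums, read in
    forward or reversed detector order depending on the view angle.
    """
    cols = [sum(row[x] for row in img) for x in range(len(img[0]))]
    rows = [sum(row) for row in img]
    return [cols, rows, cols[::-1], rows[::-1]][k % 4]


def equivariance_loss(synth, target):
    """MSE between projections of the rotated synthesized slice and the
    angle-shifted projections of the target slice."""
    rotated = rot90(synth)
    total, count = 0.0, 0
    for k in range(4):
        p_synth = project(rotated, k)
        p_target = project(target, k + 1)  # rotation <-> angular shift
        total += sum((a - b) ** 2 for a, b in zip(p_synth, p_target))
        count += len(p_synth)
    return total / count


target = [[1, 2, 0], [0, 3, 1], [4, 0, 2]]
synth_bad = [[1, 2, 0], [0, 3, 1], [4, 0, 5]]  # one voxel is wrong

loss_perfect = equivariance_loss(target, target)
loss_bad = equivariance_loss(synth_bad, target)
```

In this toy setting the loss is zero when the synthesized slice equals the target and positive when a voxel deviates, which is the behavior a projection-domain consistency term is meant to have. Note that a zero loss here does not imply the slices are identical, since four projections do not determine an image uniquely.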

Source: EPC-3D-Diff: Equivariant Physics Consistent Conditional 3D Latent Diffusion for CBCT to CT Synthesis
