Calibrating Generative Models to Distributional Constraints

arXiv, 29 May 2026


AI Insight

This paper addresses miscalibration in generative models, where statistics of the generated outputs deviate from their intended values. The researchers formulate calibration as a constrained optimization problem and propose two fine-tuning objectives: the relax loss, which penalizes miscalibration, and the reward loss, which recasts calibration as a reward fine-tuning problem. Experiments on models with up to nine billion parameters across protein design, image generation, and language modeling show substantial reductions in calibration error while enforcing hundreds of distributional constraints simultaneously.
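As a rough sketch of the formulation described above, the constrained problem and a penalized relaxation might be written as follows. The symbols are assumptions introduced here for illustration and do not come from the source: $p_{\text{base}}$ is the original model, $p_\theta$ the fine-tuned model, $h(x)$ a vector of statistics (for example, class indicators), $h^\ast$ their target values, and $\lambda > 0$ a penalty weight. The direction of the KL divergence and the squared-norm penalty are also assumed; the paper's exact objectives may differ.

```latex
% Constrained problem: closest model to the base model that is calibrated
\[
\min_{\theta}\; \mathrm{KL}\!\left(p_\theta \,\|\, p_{\text{base}}\right)
\quad \text{subject to} \quad
\mathbb{E}_{x \sim p_\theta}\!\left[h(x)\right] = h^\ast
\]

% Penalized relaxation: the hard constraint becomes a miscalibration penalty
\[
\min_{\theta}\; \mathrm{KL}\!\left(p_\theta \,\|\, p_{\text{base}}\right)
+ \lambda \left\| \mathbb{E}_{x \sim p_\theta}\!\left[h(x)\right] - h^\ast \right\|_2^2
\]
```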

Why it matters

Better-calibrated generative models are more reliable and controllable in applications such as protein design for drug development and content generation. The methods offer practical tools for bringing AI-generated outputs in line with desired statistical properties, which matters in scientific and industrial settings where outputs must follow a target distribution.

Confidence

6/10 · Peer-reviewed · Biology

arXiv:2510.10020v4
Abstract: Generative models frequently suffer miscalibration, wherein statistics of the sampling distribution, such as the fraction of generations in a given class, deviate from desired values. We frame calibration as a constrained optimization problem and seek the closest model in Kullback-Leibler divergence satisfying a calibration constraint. To address the intractability of imposing these constraints exactly, we introduce two surrogate objectives for fine-tuning: (1) the relax loss, which replaces the constraint with a miscalibration penalty, and (2) the reward loss, which converts calibration into a reward fine-tuning problem. We demonstrate that these approaches substantially reduce calibration error across hundreds of simultaneous constraints and models with up to nine billion parameters, spanning applications in protein design, image generation, and language modeling.
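To make "the closest model in Kullback-Leibler divergence satisfying a calibration constraint" concrete, here is a minimal toy sketch on a three-outcome categorical distribution, where the problem can be solved exactly. It is not the paper's method, which targets large neural models where this is intractable. It assumes the divergence is KL(calibrated || base), in which case the solution is a standard exponential tilt of the base distribution. The names `tilt`, `expectation`, and `calibrate`, and the example numbers, are hypothetical.

```python
import math

def tilt(base, h, lam):
    """Reweight `base` by exp(lam * h) and renormalize (exponential tilt)."""
    weights = [p * math.exp(lam * hi) for p, hi in zip(base, h)]
    total = sum(weights)
    return [w / total for w in weights]

def expectation(q, h):
    """Expected value of the statistic h under distribution q."""
    return sum(qi * hi for qi, hi in zip(q, h))

def calibrate(base, h, target, lo=-50.0, hi=50.0, iters=200):
    """Find lam by bisection so the tilted expectation of h hits `target`.

    The tilted expectation is non-decreasing in lam (its derivative is the
    variance of h under the tilted distribution), so bisection applies.
    """
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if expectation(tilt(base, h, mid), h) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Assumed toy setup: the base model emits class 0 with probability 0.7,
# but the desired fraction of class-0 generations is 0.5.
base = [0.7, 0.2, 0.1]
h = [1.0, 0.0, 0.0]      # indicator statistic: 1 if the sample is class 0
target = 0.5

lam = calibrate(base, h, target)
calibrated = tilt(base, h, lam)
calibration_error = abs(expectation(calibrated, h) - target)
```

In this toy case the calibrated distribution keeps the relative proportions of the other classes (0.2 to 0.1 stays 2 to 1) and only shifts mass between class 0 and the rest, which is what "closest to the base model" means here.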

Source: Calibrating Generative Models to Distributional Constraints