The Unreasonable Effectiveness of Discrete-Time Gaussian Process Mixtures for Robot Policy Learning

arXiv AI 11 Jun 2026

AI Insight

Researchers developed MiDiGap (Mixture of Discrete-time Gaussian Processes), an imitation learning approach that lets robots learn manipulation tasks from as few as five demonstrations using only camera observations. It learns long-horizon behaviors, highly constrained motions, dynamic actions, and multimodal tasks in under a minute on a CPU, and it supports inference-time steering for obstacle avoidance and for transferring learned policies between robot embodiments. On few-shot manipulation benchmarks, MiDiGap improves policy success by 76 percentage points on constrained RLBench tasks and by 48 percentage points on multimodal tasks, and it more than doubles policy success in cross-embodiment transfer.

Why it matters

This work lowers the barriers to teaching robots new tasks: it needs only a handful of demonstrations and trains on a CPU in under a minute. The ability to steer learned behaviors at inference time and to transfer skills between robot platforms could accelerate the deployment of flexible robotic systems in manufacturing, healthcare, and domestic environments.

Confidence

6/10 Peer-reviewed AI & Computational Science

arXiv:2505.03296v2
Abstract: We present Mixture of Discrete-time Gaussian Processes (MiDiGap), a novel approach for flexible policy representation and imitation learning in robot manipulation. MiDiGap enables learning from as few as five demonstrations using only camera observations and generalizes across a wide range of challenging tasks. It excels at long-horizon behaviors such as making coffee, highly constrained motions such as opening doors, dynamic actions such as scooping with a spatula, and multimodal tasks such as hanging a mug. MiDiGap learns these tasks on a CPU in less than a minute and scales linearly to large datasets. We also develop a rich suite of tools for inference-time steering using evidence such as collision signals and robot kinematic constraints. This steering enables novel generalization capabilities, including obstacle avoidance and cross-embodiment policy transfer. MiDiGap achieves state-of-the-art performance on diverse few-shot manipulation benchmarks. On constrained RLBench tasks, it improves policy success by 76 percentage points and reduces trajectory cost by 67%. On multimodal tasks, it improves policy success by 48 percentage points and increases sample efficiency by a factor of 20. In cross-embodiment transfer, it more than doubles policy success. We make the code publicly available at https://midigap.cs.uni-freiburg.de.
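
To make the idea of a Gaussian policy over discrete time steps and of steering it with evidence more concrete, here is a minimal sketch in plain Python. It is not MiDiGap's implementation. It assumes a single component rather than a mixture, one-dimensional actions, independent time steps, and a standard product-of-Gaussians update as a stand-in for steering. The function names `fit_discrete_time_gaussian` and `steer` and the demonstration values are hypothetical.

```python
from statistics import mean, pvariance


def fit_discrete_time_gaussian(demos, min_var=1e-6):
    """Fit an independent Gaussian (mean, variance) at each time step.

    demos: list of equal-length 1-D trajectories (lists of floats).
    Assumption: time steps are treated as independent, which drops the
    correlations across time that a full Gaussian process would model.
    """
    steps = list(zip(*demos))  # regroup the values by time step
    means = [mean(s) for s in steps]
    # Floor the variance so that identical demonstrations stay well defined.
    variances = [max(pvariance(s), min_var) for s in steps]
    return means, variances


def steer(means, variances, t, target, target_var):
    """Fuse the Gaussian at step t with Gaussian evidence N(target, target_var).

    Product of two Gaussians: precisions add, and the new mean is the
    precision-weighted average of the prior mean and the evidence.
    """
    prior_precision = 1.0 / variances[t]
    evidence_precision = 1.0 / target_var
    post_var = 1.0 / (prior_precision + evidence_precision)
    post_mean = post_var * (prior_precision * means[t] + evidence_precision * target)
    new_means, new_vars = list(means), list(variances)
    new_means[t], new_vars[t] = post_mean, post_var
    return new_means, new_vars


# Five hypothetical 1-D demonstrations with four time steps each (made-up values).
demos = [
    [0.0, 1.0, 2.0, 3.0],
    [0.0, 1.2, 2.2, 3.0],
    [0.0, 0.8, 1.8, 3.0],
    [0.0, 1.1, 2.1, 3.0],
    [0.0, 0.9, 1.9, 3.0],
]
means, variances = fit_discrete_time_gaussian(demos)

# Hypothetical evidence: at step 2 the robot should pass near 2.5.
steered_means, steered_vars = steer(means, variances, t=2, target=2.5, target_var=0.02)
```

In this toy setup the demonstrations give a variance of about 0.02 at step 2, which equals the evidence variance, so the steered mean falls roughly midway between the demonstrated mean (2.0) and the target (2.5). Fitting is a single pass over the data, so the cost of this sketch grows linearly with the number of demonstrations.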

Source: The Unreasonable Effectiveness of Discrete-Time Gaussian Process Mixtures for Robot Policy Learning