Accepted at LLA 2026 & ReALM-GEN 2026 · ICLR 2026 Workshops
Diffusion models and flow matching have become cornerstones of robotic imitation learning, yet they suffer from a structural inefficiency: inference is typically bound to a fixed integration schedule that is agnostic to state complexity. This paradigm forces the policy to expend the same computational budget on trivial motions as on complex tasks. We introduce Generative Control as Optimization (GeCO), a time-unconditional framework that transforms action synthesis from trajectory integration into iterative optimization.
GeCO learns a stationary velocity field in the action-sequence space where expert behaviors form stable attractors. As a result, test-time inference becomes adaptive: it can exit early for simple states while refining longer for difficult ones. The same stationary geometry also provides an intrinsic, training-free safety signal, since the field norm at the optimized action acts as a robust out-of-distribution detector.
We validate GeCO on standard simulation benchmarks and demonstrate that it scales naturally to π0-series Vision-Language-Action models. As a plug-and-play replacement for standard flow-matching heads, GeCO improves both success rate and efficiency while offering an optimization-native mechanism for safer deployment.
Replace fixed-step trajectory integration with an adaptive iterative optimization process in action space.
Allocate more refinement to hard states and stop early on easy states, improving inference efficiency.
Use the stationary field norm as a training-free signal for anomaly detection and safer deployment.
Conventional diffusion and flow-matching policies learn time-dependent vector fields and rely on a pre-defined inference schedule. GeCO removes this dependency by learning a single stationary velocity field in the action-sequence space. Expert actions become stable attractors, turning action generation into a convergence problem rather than a rollout through fictitious time.
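The convergence view above can be sketched in a few lines. The snippet below is a minimal illustration, not the actual GeCO implementation: `velocity_fn` is a hypothetical stand-in for the trained stationary network (here a toy field whose attractor is a fixed expert action), and the loop simply follows the field until its norm falls below a tolerance, which is what makes early exit possible.

```python
import numpy as np

# Toy stand-in for a learned stationary velocity field. The expert action
# A_STAR acts as a stable attractor: the field points toward it and
# vanishes exactly at it. In GeCO this would be a neural network
# conditioned on the observation.
A_STAR = np.array([0.5, -0.2])

def velocity_fn(action):
    return A_STAR - action  # zero at the attractor

def geco_inference(a0, step=0.5, tol=1e-3, max_iters=100):
    """Iterate a <- a + step * v(a) until the field norm drops below tol.

    Because the field is stationary (no time conditioning), the loop can
    stop as soon as it converges -- few steps for easy states, more for
    hard ones -- instead of running a fixed integration schedule.
    """
    a = a0.copy()
    for k in range(max_iters):
        v = velocity_fn(a)
        if np.linalg.norm(v) < tol:  # early exit: reached an attractor
            return a, k
        a = a + step * v
    return a, max_iters

action, iters = geco_inference(np.zeros(2))
```

With this toy linear field the error halves every step, so the loop exits after roughly ten iterations rather than a fixed budget.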
To ensure robust optimization in continuous control, GeCO introduces a velocity rescaling mechanism that modulates the field magnitude based on the distance to the expert manifold. This creates a geometric sink around valid action modes and yields a practical inference loop that is both adaptive and stable.
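One simple way to realize such a rescaling, shown here as an assumed sketch rather than the paper's exact mechanism, is to keep the field's direction but squash its magnitude through a saturating function: distant points take bounded steps (no overshoot), while near an attractor the rescaled norm still vanishes smoothly, producing the sink-like geometry described above.

```python
import numpy as np

def rescale_velocity(v, max_norm=1.0, eps=1e-8):
    """Hypothetical velocity rescaling: preserve direction, bound magnitude.

    tanh saturates the norm at max_norm far from the expert manifold,
    preventing overshoot, while for small norms tanh(x) ~ x, so the field
    still decays to zero at valid action modes (a 'geometric sink').
    """
    norm = np.linalg.norm(v)
    return v * (max_norm * np.tanh(norm / max_norm) / (norm + eps))

v_far = rescale_velocity(np.array([10.0, 0.0]))   # capped below max_norm
v_near = rescale_velocity(np.array([0.01, 0.0]))  # almost unchanged
```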
Because the learned field is stationary, its residual norm directly reflects whether the current state-action pair lies near a learned in-distribution manifold. This provides a simple but effective uncertainty estimate without auxiliary heads, ensembles, or extra safety networks.
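As a concrete illustration of this idea (using the same toy attractor field standing in for a trained GeCO network), the anomaly score is just the norm of the stationary field evaluated at the optimized action: small near the expert manifold, large away from it, with no auxiliary head or ensemble required.

```python
import numpy as np

# Toy stationary field whose attractor is the expert action A_STAR;
# a stand-in for the trained GeCO velocity network.
A_STAR = np.array([0.5, -0.2])
velocity_fn = lambda a: A_STAR - a

def ood_score(action):
    """Training-free anomaly signal: the residual field norm at an action.

    Near a learned in-distribution mode the stationary field vanishes, so
    the score is near zero; off-manifold actions leave a large residual.
    """
    return np.linalg.norm(velocity_fn(action))

in_dist = ood_score(A_STAR + 0.01)          # near the attractor -> small
out_dist = ood_score(np.array([5.0, 5.0]))  # far from any mode -> large
```

At deployment time, thresholding this score after the inference loop converges gives a cheap trigger for fallback or human handover.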
Coming Soon