If \(z(p) > p^\top b,\) the instance is infeasible. Otherwise, let \(x(p)\) be a minimiser. We assign the payoff \(a_i x(p) - b_i\) to each expert \(i\), so each expert's payoff is exactly its constraint's violation. The expected payoff for the mixed strategy \(p\) is then \(z(p) - p^\top b \leq 0\), i.e., non-positive. After \(T\) iterations, the regret is at most \(\mathcal{O}(\sqrt{T \ln k})\), and we know that our mixed strategies' total payoff is at most 0. This means each expert's cumulative violation is at most \(\mathcal{O}(\sqrt{T \ln k})\), so we can simply take the average of the \(T\) solutions \(x(p)\) to obtain a solution where each constraint is violated by at most \(\mathcal{O}\left(\frac{\sqrt{T \ln k}}{T}\right) = \mathcal{O}\left(\sqrt{\frac{\ln k}{T}}\right)\). If we use the MWU or AdaHedge algorithms, and the payoffs (violation / satisfaction) are always in \([-1, 1]\), we can let \(T \geq \frac{4 \ln k}{\varepsilon^2}\) and know that we will always find an \(\varepsilon\)-feasible solution.
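The reduction above can be sketched in a few lines of Python. This is a minimal illustration, not the text's exact algorithm: it assumes Hedge-style exponential weights with a fixed step size \(\eta = \sqrt{\ln k / T}\), a hypothetical `oracle(p)` that minimises \(p^\top A x\) over \(X\), and a toy instance where \(X = [0,1]^2\) so the oracle is a coordinatewise box minimisation.

```python
import math

def mwu_feasibility(A, b, oracle, T):
    """Approximately solve A x <= b, x in X, via the experts reduction.

    One expert per constraint. Each round, the normalised weights give a
    mixed strategy p, the oracle returns x(p) minimising p^T A x over X,
    and expert i receives payoff a_i . x(p) - b_i (its constraint's
    violation, assumed to lie in [-1, 1]). Returns the average of the T
    iterates, or None if some round certifies infeasibility.
    """
    k, n = len(A), len(A[0])
    eta = math.sqrt(math.log(k) / T)      # fixed Hedge step size
    w = [1.0] * k                         # one weight per expert
    avg = [0.0] * n
    for _ in range(T):
        s = sum(w)
        p = [wi / s for wi in w]
        x = oracle(p)
        z = sum(p[i] * sum(A[i][j] * x[j] for j in range(n)) for i in range(k))
        if z > sum(p[i] * b[i] for i in range(k)) + 1e-12:
            return None                   # z(p) > p^T b: infeasible
        for j in range(n):
            avg[j] += x[j] / T
        for i in range(k):
            viol = sum(A[i][j] * x[j] for j in range(n)) - b[i]
            w[i] *= math.exp(eta * viol)  # upweight violated constraints
    return avg

# Toy instance (illustrative): x1 <= 0.6, x2 <= 0.6, x1 + x2 >= 1,
# with X = [0, 1]^2; all payoffs stay in [-1, 1].
A = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.6, 0.6, -1.0]

def box_oracle(p):
    # min of p^T A x over the box is attained coordinatewise:
    # set x_j = 0 when its aggregated coefficient is positive, else 1.
    c = [sum(p[i] * A[i][j] for i in range(len(A))) for j in range(2)]
    return [0.0 if cj > 0 else 1.0 for cj in c]

x_bar = mwu_feasibility(A, b, box_oracle, T=2000)
```

With \(T = 2000\) and \(k = 3\), the average regret is about \(2\sqrt{\ln 3 / 2000} \approx 0.047\), so every constraint of the toy instance should be violated by well under \(0.1\) at `x_bar`.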
This doesn't work for optimisation: we're effectively doing Lagrangian relaxation with multipliers normalised to sum to 1 (to fit the online learning mould), and that normalisation is only valid for pure feasibility problems.
Surrogate relaxation to reduce linear optimisation to the experts problem
The reduction for feasibility works with any closed convex set \(X\) in