and find a solution that maximises feasibility, while keeping the objective value above our current best upper bound.
When \(X_0\) is a box and we solve the surrogate subproblem as a linear knapsack problem, we simply stop accumulating elements greedily as soon as either the knapsack is full, or the objective value reaches the best upper bound.
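A minimal sketch of that early-exit greedy loop, in Python. All names here are hypothetical, items are `(value, weight)` pairs with positive weights, and the classic density ordering stands in for whatever ordering criterion the surrogate subproblem actually calls for:

```python
def greedy_knapsack_with_bound(items, capacity, best_upper_bound):
    """Greedily fill a linear (fractional) knapsack, stopping early.

    `items` is a list of (value, weight) pairs with weight > 0.
    We stop as soon as the knapsack is full *or* the accumulated
    objective value reaches `best_upper_bound`.
    """
    # Classic greedy order for the fractional knapsack:
    # best bang for the buck first.
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)

    objective, remaining = 0.0, capacity
    taken = []
    for value, weight in items:
        if remaining <= 0 or objective >= best_upper_bound:
            break  # knapsack full, or bound already matched
        fraction = min(1.0, remaining / weight)
        taken.append((value, weight, fraction))
        objective += fraction * value
        remaining -= fraction * weight
    return objective, taken
```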
I believe this shows that the true nature of the reduction from linear programming to no-regret online learning is a surrogate relaxation method \cite{glover1965,Glover_1975}, not a Lagrangian relaxation. The fact that the weights lie in the probability simplex (non-negative and summing to 1) is a strong hint that we're dealing with surrogate constraints. It also helps explain why the reduction gives us stronger worst-case bounds than optimal subgradient methods: even when everything is nicely normalised, subgradient methods must search the space of Lagrange multipliers \([0, 1]^k\), while the surrogate relaxation's master problem only needs to explore the much smaller probability simplex \(\Delta^k\). While the surrogate relaxation approach scales its iteration count with \(\ln k\), the logarithm of the number of relaxed constraints, subgradient methods scale with \(\| w_0 - w^\star \|^2\), the squared Euclidean distance between the initial solution and the closest optimal vector of multipliers. In general, that squared distance is linear in the number of relaxed constraints.
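To make the simplex structure concrete, here's a sketch of a Hedge-style multiplicative weights update for the master problem. This is an illustration, not necessarily the exact update scheme in the references; the loss signs and step-size schedule depend on how the reduction is set up:

```python
import math

def hedge_update(weights, violations, step_size):
    """One multiplicative-weights (Hedge) step over the k relaxed
    constraints.  `weights` lives on the probability simplex; a
    constraint that is more violated gets more weight in the next
    surrogate constraint.
    """
    scaled = [w * math.exp(step_size * v)
              for w, v in zip(weights, violations)]
    total = sum(scaled)
    return [s / total for s in scaled]  # re-normalise onto the simplex
```

The \(\ln k\) factor in the iteration count is the usual regret bound for Hedge over \(k\) experts, here one expert per relaxed constraint.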
The surrogate subproblem is slightly more complicated than the Lagrangian relaxation subproblem, but, as the instance grows, the pricing problem becomes exponentially easier for surrogate relaxation than for Lagrangian relaxation! However, there's a hidden problem here: the number of iterations grows with the maximum violation (or satisfaction) over all constraints, and that's not polynomial. We can fix that by making sure the relaxed constraints can only be satisfied or violated by a bounded amount.
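One simple way to enforce that bound is to rescale and clamp each constraint's violation before feeding it to the no-regret learner. A sketch, assuming constraints of the form \(\mathrm{lhs} \leq \mathrm{rhs}\) and a hypothetical per-constraint normalisation constant `scale` chosen ahead of time:

```python
def bounded_violation(lhs, rhs, scale):
    """Violation of one relaxed constraint `lhs <= rhs`, rescaled by
    `scale` and clamped to [-1, 1] so the no-regret master problem
    only ever sees losses of bounded magnitude.
    """
    return max(-1.0, min(1.0, (lhs - rhs) / scale))
```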
Surrogate decomposition for free
Lagrangian decomposition dominates Lagrangian relaxation \cite{kim1987}. Instead of relaxing constraints from the original formulation, we introduce new auxiliary variables and constraints, and relax these new constraints \cite{guignard2003}.
For example, in order to relax