For linear programming (i.e. all variables are continuous):

Geometrically in R^n, the set of all points that satisfy the constraints is a high-dimensional polyhedron. The best points, the ones that minimize or maximize the linear objective function, lie on the boundary of that polyhedron. There is either exactly one optimal point (a vertex) or infinitely many (a whole face), but in standard reformulations (e.g. all variables >= 0) there is always at least one optimal vertex. Optimal points can be found with iterative methods that either travel through the interior of the polyhedron ("interior point methods") or walk from vertex to vertex along its edges (the "simplex method").

In theory, interior point methods can solve the problem to optimality in a time that is polynomial in the size of the input problem. All known variants of the simplex method are exponential in the worst case.

In practice, interior point methods are extremely fast and solve most problems in 10-100 iterations regardless of size! You can expect a problem with 100k variables and constraints to be solved within 5-20 minutes. However, with floating-point arithmetic the solutions are not always very accurate, so it is customary to finish with a few iterations of the simplex method to land on a vertex (vertices have a combinatorial structure, so there is no accumulation of error and their coordinates can be computed to high accuracy). On its own, the simplex method is still very fast: expect 10-60 minutes for the same 100k-variable problem (with very high variance, though).
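To make the two families concrete, here is a minimal sketch using SciPy's linprog, whose HiGHS backend exposes both an interior point method and a dual simplex (the toy data here is made up for illustration):

    import numpy as np
    from scipy.optimize import linprog

    # Maximize 3x + 2y subject to x + y <= 4, x + 3y <= 6, x, y >= 0.
    # linprog minimizes, so the objective is negated.
    c = np.array([-3.0, -2.0])
    A_ub = np.array([[1.0, 1.0],
                     [1.0, 3.0]])
    b_ub = np.array([4.0, 6.0])

    for method in ("highs-ipm", "highs-ds"):  # interior point vs. dual simplex
        res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None)] * 2, method=method)
        print(method, res.x, -res.fun)        # both find the vertex (4, 0), value 12

On a tiny problem like this both methods land on the same optimal vertex; on large degenerate problems, the "crossover" to a vertex mentioned above is what turns an interior point answer into a clean combinatorial solution.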

For integer programming (i.e. some or all of the variables are discrete):

This is much harder: the problem is NP-hard. In theory it looks hopeless, and indeed brute force is hopeless in practice.

But one can still do a lot better than brute force. IP solvers use branch-and-cut. The fastest ones (into which millions of PhD-holder hours have been poured) can solve problems with hundreds of thousands of variables in typically less than 24h. Of course, it is possible to construct a problem with 4 variables that will take years to solve, but that typically does not happen with naturally occurring problems. For specially-structured problems (like knapsack, TSP or vehicle routing), customized codes can use special techniques to solve problems with tens of billions of variables.
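For a feel of what calling such a solver looks like, here is a sketch handing a tiny knapsack to HiGHS's branch-and-cut through scipy.optimize.milp (SciPy >= 1.9 assumed; the item data is invented):

    import numpy as np
    from scipy.optimize import milp, LinearConstraint, Bounds

    values = np.array([10.0, 13.0, 7.0, 5.0])   # maximize total value...
    weights = np.array([4.0, 6.0, 3.0, 2.0])    # ...within a weight budget of 9

    res = milp(c=-values,                       # milp minimizes, so negate
               constraints=LinearConstraint(weights[np.newaxis, :], ub=9.0),
               integrality=np.ones(4),          # 1 marks a variable as integer
               bounds=Bounds(0, 1))             # binary: take an item or not
    print(res.x, -res.fun)                      # picks items 0, 2, 3 for value 22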



To add some insight into what branch-and-cut actually does: even with integer programming you can still work with the high-dimensional polyhedron. If you are lucky and the optimal vertex of the relaxation turns out to be integral, you know you have the optimal integer solution. But of course, the vertex may have non-integer components that should be integer, and that is where the real work starts. A lot of the magic in commercial solvers lies in clever tricks for getting from this optimal non-integer point to an optimal integer point. With branching you split up the polyhedron, eliminating your current non-integer point; with cutting you add constraints that remove the non-integer point but are guaranteed not to remove any optimal integer points. There is typically a lot of choice in what to try first, and these choices can have a huge impact on how fast the solver is.
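If it helps, here is a toy branch-and-bound (branching only, no cuts) to make that concrete. It is a sketch for intuition built on SciPy's linprog, not a practical solver, and the example instance is made up:

    import math
    import numpy as np
    from scipy.optimize import linprog

    def branch_and_bound(c, A_ub, b_ub, bounds):
        """Minimize c @ x s.t. A_ub @ x <= b_ub, x integer. Returns (value, x)."""
        best_val, best_x = math.inf, None
        stack = [bounds]                    # each node = a list of (lo, hi) per variable
        while stack:
            node = stack.pop()
            res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=node, method="highs")
            if not res.success or res.fun >= best_val:
                continue                    # infeasible, or pruned by the LP bound
            frac = [i for i, v in enumerate(res.x) if abs(v - round(v)) > 1e-6]
            if not frac:                    # the relaxation is integral: new incumbent
                best_val, best_x = res.fun, np.round(res.x)
                continue
            i, v = frac[0], res.x[frac[0]]  # branch: x_i <= floor(v) or x_i >= ceil(v)
            left, right = list(node), list(node)
            left[i] = (node[i][0], math.floor(v))
            right[i] = (math.ceil(v), node[i][1])
            stack += [left, right]
        return best_val, best_x

    # Maximize 5x + 4y s.t. 6x + 4y <= 24, x + 2y <= 6, x, y >= 0 integer.
    val, x = branch_and_bound(np.array([-5.0, -4.0]),
                              np.array([[6.0, 4.0], [1.0, 2.0]]),
                              np.array([24.0, 6.0]),
                              [(0, None), (0, None)])
    print(-val, x)  # relaxation peaks at 21 at (3, 1.5); the integer optimum is 20 at (4, 0)

Real solvers add cutting planes, presolve, heuristics for finding incumbents early, and much smarter choices of which variable and which node to explore next; those choices are exactly where the huge performance differences come from.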

I am always happy to see people point out that NP-hard problems are solved to optimality in practice. Too often you come across someone who believes it is impossible to prove you have an optimal solution for an NP-hard problem, no matter the instance. But NP-hardness does not rule out the possibility that you might be much luckier than the worst case with the particular instance you are interested in. It only says that there exist worst-case families of instances that become much more difficult to solve as their size increases.

By the way, I remember having learned that Integer Programming with a constant number of variables but a variable number of constraints can actually be solved in polynomial time with help from the Lenstra–Lenstra–Lovász lattice basis reduction algorithm, so I would not expect an integer problem with four variables to need years to solve. A constant number of constraints is actually more difficult than a constant number of variables (the Knapsack Problem is NP-hard and has a single constraint). AFAIK, for a constant number of constraints we have nothing better than a pseudo-polynomial time algorithm by Papadimitriou.


(The generalized term for a polyhedron in higher dimensions is a polytope, by the way.)

Don't forget that you can also use LP relaxation ("well, what if we didn't need everything to be integral?"). If it comes out with integral weights (as was the case in the original example, modulo floating-point error), then you're done, as the sibling commenter points out. If you're willing to accept approximate solutions to your integer program, you can just pick an integer point near the "optimal" relaxed solution (though be careful: a rounded point is not guaranteed to be feasible).
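In code, that check is one solve and a comparison. A sketch (the data and the 1e-6 tolerance are my own choices, not from the thread):

    import numpy as np
    from scipy.optimize import linprog

    values = np.array([10.0, 13.0, 7.0, 5.0])
    weights = np.array([4.0, 6.0, 3.0, 2.0])

    # Relax the 0/1 knapsack to 0 <= x <= 1 and just solve the LP.
    res = linprog(c=-values, A_ub=weights[np.newaxis, :], b_ub=[9.0],
                  bounds=[(0, 1)] * 4, method="highs")
    if np.allclose(res.x, np.round(res.x), atol=1e-6):
        print("lucky: the relaxation is already integral:", np.round(res.x))
    else:
        # Rounding a fractional relaxation is only a heuristic; the rounded
        # point can even be infeasible, which is why branch-and-cut exists.
        print("fractional:", res.x)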

And, of course, modern IP solvers use the relaxed LP problem to compute "cheap" upper bounds (assuming MAX, per standard form) for entire branches of the solution space: the optimal value of a less-constrained problem is never worse than that of the more-constrained one, since any feasible assignment remains feasible when you remove a constraint.
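That bound is easy to see numerically, e.g. on the same toy instance as the branch-and-bound sketch above (SciPy >= 1.9 assumed for milp):

    import numpy as np
    from scipy.optimize import linprog, milp, LinearConstraint, Bounds

    c = np.array([-5.0, -4.0])                 # maximize 5x + 4y (negated for a minimizer)
    A = np.array([[6.0, 4.0], [1.0, 2.0]])
    b = np.array([24.0, 6.0])

    lp = linprog(c, A_ub=A, b_ub=b, bounds=[(0, None)] * 2, method="highs")
    ip = milp(c=c, constraints=LinearConstraint(A, ub=b),
              integrality=np.ones(2), bounds=Bounds(0, np.inf))
    print(-lp.fun, ">=", -ip.fun)              # 21.0 >= 20.0: the relaxation bounds the IP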


This is exactly right — too many people equate NP-hard to unsolvable, but in fact the modern world (logistics, plant operations, airline scheduling, etc.) relies on us solving NP-hard problems.



