Only tangentially related, but I solved a problem at my work this week with a couple of cleverly crafted MILP programs. It's really exciting in the nerdiest way possible and I need to share it with someone, anyone! I can't write a blog post about it because I don't have a personal website, plus it'd be a trade secret anyway. Most of my coworkers aren't technical, so they're happy to get an answer to their problem, but they don't really care about the how part.
The coolest instance of MILP I have encountered "in the wild" at work was a disaggregation problem where we could only receive vendor reports that contained aggregated statistics, each report aggregating the original data along a different dimension (say one by customer zip code, another by age range, etc). By lining up these aggregated reports, you could turn that into an MILP system that could be solved to unwind all of the aggregation and get back to the original data. This gave us way more flexibility in deciding how to use the vendor data.
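The shape of that problem, as a toy sketch (the dimensions, numbers and the choice of PuLP are all mine, just for illustration): the unknowns are the original per-cell counts, and each aggregated report becomes a set of equality constraints on sums along one dimension. With only two reports the reconstruction is usually far from unique; in practice it takes several reports (plus integrality and side constraints) to pin things down.

    import pulp

    zips = ["10001", "94105", "60601"]
    ages = ["18-34", "35-54", "55+"]
    zip_report = {"10001": 5, "94105": 7, "60601": 4}   # report aggregated by zip
    age_report = {"18-34": 6, "35-54": 7, "55+": 3}     # report aggregated by age range

    prob = pulp.LpProblem("disaggregate", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("cell", [(z, a) for z in zips for a in ages],
                              lowBound=0, cat="Integer")
    prob += pulp.lpSum(x.values())              # arbitrary objective; the total is fixed anyway
    for z in zips:                              # each report row becomes one equality constraint
        prob += pulp.lpSum(x[(z, a)] for a in ages) == zip_report[z]
    for a in ages:
        prob += pulp.lpSum(x[(z, a)] for z in zips) == age_report[a]

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print({k: int(v.varValue) for k, v in x.items()})   # one consistent disaggregation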
Yup, k-anonymity is hard. State of the art is rapidly evolving, requiring higher and higher k (i.e. more coarse/less useful reports). Most companies out there prefer to, um, not notice that.
Yes, and I mostly agree. Most companies just randomize an ID or do some group-bys and consider it "anonymized", which are susceptible to a variety of correlation techniques (such as setting up this MILP problem). And the more data sets you have with the same underlying masked primary key (e.g. person), the easier it can get. I have seen these weak attempts at hiding information in multiple industries. Just for the record, in my use case there was never an intent to map data to individuals, just to get the most granular information possible about key metrics.
Write it up. Even if you can't publish it now, you may not recall the details later. It's quite likely circumstances will change such that your current employer has no interest in blocking publication, so if you already have a document, you'll be able to advance human knowledge.
Doesn't hurt to ask for written approval from your manager to write about it! I'd love to read it as well. Getting to use optimization algorithms at your job is super rare (unless you're in specific industries).
Solving for the funding plan of a bank, to balance the balance sheet and meet all the regulatory constraints. (It actually uses a linearisation of a piecewise-linear convex problem that I had learned in college decades ago but completely forgot about; I asked one of my old professors for guidance, and he in turn asked for permission to use the case as an exercise for his students. A generic sketch of the trick follows below.)
Assigning cash flows to changes in balances when the two feeds are inconsistent and the timing is off (you need to do some complex combinations to match them).
But, same as OP: you get a sense of self-satisfaction, though that won't get you any credit internally.
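For anyone curious what the linearisation mentioned above looks like in general (this is just the standard epigraph trick on invented numbers, not the actual bank model): a convex piecewise-linear cost can be written as the max of its affine pieces, and minimising an epigraph variable bounded below by each piece turns the whole thing into an LP.

    import pulp

    # cost(x) = max of the affine pieces; slopes increase, so the function is convex
    segments = [(0.01, 0.0), (0.02, -1.0), (0.04, -7.0)]    # (slope, intercept)
    sources = ["deposits", "wholesale"]

    prob = pulp.LpProblem("funding_plan", pulp.LpMinimize)
    x = {s: pulp.LpVariable(f"x_{s}", lowBound=0, upBound=350) for s in sources}
    t = {s: pulp.LpVariable(f"t_{s}") for s in sources}     # epigraph (cost) variables

    prob += pulp.lpSum(t.values())                          # total funding cost
    prob += pulp.lpSum(x.values()) >= 400                   # funding requirement
    for s in sources:
        for slope, intercept in segments:
            prob += t[s] >= slope * x[s] + intercept        # one inequality per piece

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print({s: x[s].varValue for s in sources}, pulp.value(prob.objective))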
Please do write it up! It's always interesting to read how others solve problems using similar tools. If you have a github account you can always gist it. If not, perhaps one of the prolific writers in HN would be willing to publish it for you?
I suppose so. A first pass on which resources to acquire with different costs and different (and diminishing) rates of return. Then a second pass to assign those resources to workers who have different capacities to do work, and also can't do every kind of work (so bin packing?).
Great to see OR highlighted here! Thank you for sharing.
The syntax demonstrated is very similar to the syntax of other libraries, including the python tools provided by Gurobi [1].
For experimentation and rapid prototyping of LP, QP, MIP and geometric programming problems, I highly recommend CVXPY [2]. CVXPY allows the user to select a solver. Among the solvers is GLOP [3].
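As a quick illustration (the numbers are the article's army example; integer variables need a MIP-capable solver such as GLPK_MI or CBC to be installed, so treat the solver choice here as an assumption about your setup):

    import cvxpy as cp
    import numpy as np

    x = cp.Variable(3, integer=True)            # swordsmen, bowmen, horsemen
    power = np.array([70, 95, 230])
    A = np.array([[60, 80, 140],                # food
                  [20, 10, 0],                  # wood
                  [0, 40, 100]])                # gold
    b = np.array([1200, 800, 600])

    prob = cp.Problem(cp.Maximize(power @ x), [A @ x <= b, x >= 0])
    prob.solve(solver=cp.GLPK_MI)               # swap in any installed MIP solver
    print(prob.value, x.value)                  # expect 1800 with x = [6, 0, 6]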
For modelling libraries in general-purpose languages, Gurobi's python bindings have the best reputation. But of course Gurobi is very expensive (I have heard about $50k for a fully unrestricted license, plus $10k yearly for support). On the open-source side, besides Google's OR-Tools, there is Pyomo [1] and PuLP [2] in Python (as the article mentions). In Julia, there is JuMP [3], whose development community is extremely enthusiastic.
Traditionally, however, mathematical models were encoded in domain-specific languages. The most prominent one is AMPL [4], which is proprietary. The GLPK [5] people have developed a very neat open-source clone of AMPL: the GNU MathProg language. For a more modern take on AMPL-type modelling DSLs, look at ZIMPL [6], which is open source as well.
I've used pulp a lot to set up MIP problems in python. The problem is that every variable often uses 10kb of memory or so. When problems get large, the machine I have that has access to gurobi will run out of memory just setting up the problem in python (or when calling gurobi from within python, which for a short time seems to double memory usage when calling an external process), even though gurobi can still handle the actual model.
I've often given up and just built the problems in python by outputting text files, which is very cumbersome.
I wonder whether there’s a better alternative, i.e. a python library that will deal well with large problems.
CVXPY is what you're looking for: your variables and constraints can be defined as numpy arrays, which is game changing. It also has a giant selection of solvers, is open source and actively developed, and even has tensorflow integrations.
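A minimal sketch of why that helps with the memory issue above (random data, sizes picked arbitrarily): one vector variable and one matrix constraint stand in for what would otherwise be thousands of individual variable and constraint objects.

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 2_000, 1_000
    A = rng.random((m, n))
    b = 0.5 * A.sum(axis=1)                 # keeps the problem feasible and non-trivial
    c = rng.standard_normal(n)

    x = cp.Variable(n)                      # one Python object for all 1,000 variables
    prob = cp.Problem(cp.Minimize(c @ x), [A @ x <= b, x >= 0, x <= 1])
    prob.solve()                            # uses whichever installed solver fits
    print(prob.status, prob.value)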
I don't know if it's applicable to your use case, but if you work on these models a lot you might want to consider using Minizinc/Flatzinc as a modeling language.
Nothing against python, but those models get unwieldy very quickly, especially when there are tons of overlapping constraints popping up.
Well, I usually use MIP in larger processing pipelines, which I tend to write in python. For example, see step 4 in this algorithm for automatically generating a public transit map layer:
https://site.transitapp.com/how-we-built-the-worlds-pretties...
Use cvxpy: much nicer API, and you can drop numpy arrays into it so your optimizations can be one line instead of awful lists of constraint objects. Also, it can be a TensorFlow layer now with cvxpylayers.
The article makes it sound like they are guaranteed to find the optimum solution. However, the number of iterations/execution time implies they don't simply use brute force.
So how do these solvers guarantee an optimal solution? My first thought is that they somehow model the problem as an equation whose minima/maxima they can determine analytically, but I'm unsure how to build such an equation automatically (especially when multiple constraints like food, gold, wood etc. are involved).
For linear programming (i.e. all variables are continuous):
Geometrically in R^n, the set of all points that satisfy the constraints is a high-dimensional polyhedron. The best points, the ones that minimize or maximize the linear objective function, are on the boundary of that polyhedron. There is either one (a vertex) or possibly infinitely many (a face), but in some standard reformulations (e.g. all variables are >= 0), there is always at least one optimal vertex. Optimal points can be found using iterative methods either going through the polyhedron ("interior point methods"), or going from vertex to vertex ("simplex method").
In theory, interior point methods can solve the problem to optimality in a time that is polynomial in the size of the input problem. All known variants of the simplex method are exponential in the worst case.
In practice, interior point methods are extremely fast and solve most problems in 10-100 iterations regardless of size! You can expect a problem with 100k variables and constraints to be solved within 5-20 minutes. However, using floating-point arithmetic, the solutions are not always very accurate, so it is customary to add some iterations of the simplex method at the end to get a vertex (vertices have a combinatorial structure; there is no accumulation of error, and one can get their value to high accuracy). On its own, the simplex method is still very fast, count 10-60 minutes for the same 100k-variable problem (with very high variance, though).
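To make that concrete, here is a small sketch using SciPy's bundled HiGHS solvers on the article's army example, relaxed to an LP (linprog minimises, so the objective is negated); "highs-ipm" is an interior-point code and "highs-ds" a dual simplex.

    import numpy as np
    from scipy.optimize import linprog

    c = np.array([70, 95, 230])                 # power per unit
    A_ub = np.array([[60, 80, 140],             # food
                     [20, 10, 0],               # wood
                     [0, 40, 100]])             # gold
    b_ub = np.array([1200, 800, 600])

    for method in ("highs-ipm", "highs-ds"):
        res = linprog(-c, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None)] * 3, method=method)
        print(method, -res.fun, res.x)          # both report ~1800 at ~[6, 0, 6]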
For integer programming (i.e. some or all of the variables are discrete):
This is much harder. The problem is NP-hard. In theory it looks hopeless. Indeed brute force is hopeless in practice.
But one can still do a lot better than brute force. IP solvers use branch-and-cut. The fastest ones (into which millions of PhD-holder hours have been poured) can typically solve problems with hundreds of thousands of variables in less than 24h. Of course, it is possible to construct a problem with 4 variables that will take years to solve, but that typically does not happen with naturally-occurring problems. For specially-structured problems (like knapsack, TSP or vehicle routing), customized codes can use special techniques to solve problems with tens of billions of variables.
To add some insight into what branch-and-cut actually does: even with integer programming you can still work with the high-dimensional polyhedron. If you are lucky and the optimal vertex turns out to be integral, you know you have the optimal integer solution. But of course, it may turn out that the vertex has non-integer components that should be integer, and that is where the real work starts. A lot of the magic in commercial solvers involves the development of clever tricks to get from this optimal non-integer point to an optimal integer point. With branching you split up the polyhedron, eliminating your current non-integer point, and with cutting you add constraints that remove the non-integer point but are guaranteed not to remove any optimal integer points. There is typically a lot of choice in what to try first, and these choices typically have a huge impact on how fast the solver is.
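To make the branching half of that concrete, here is a toy branch-and-bound sketch (branching only, no cutting planes), with SciPy's linprog solving each node's LP relaxation; the data at the bottom reuses the article's army example, and everything else is a simplification, not how real solvers are implemented.

    import math
    import numpy as np
    from scipy.optimize import linprog

    def branch_and_bound(c, A_ub, b_ub, bounds):
        """Maximise c @ x subject to A_ub @ x <= b_ub, bounds on x, x integer."""
        best_val, best_x = -math.inf, None
        nodes = [list(bounds)]                    # each node = a set of variable bounds
        while nodes:
            node = nodes.pop()
            res = linprog(-np.asarray(c), A_ub=A_ub, b_ub=b_ub,
                          bounds=node, method="highs")
            if not res.success:                   # infeasible node: prune
                continue
            if -res.fun <= best_val + 1e-9:       # bound: relaxation can't beat incumbent
                continue
            frac = [i for i, v in enumerate(res.x) if abs(v - round(v)) > 1e-6]
            if not frac:                          # integral vertex: new incumbent
                best_val, best_x = -res.fun, np.round(res.x)
                continue
            i = frac[0]                           # branch on a fractional variable
            lo, hi = node[i]
            left, right = list(node), list(node)
            left[i] = (lo, math.floor(res.x[i]))  # child with x_i <= floor(x_i*)
            right[i] = (math.ceil(res.x[i]), hi)  # child with x_i >= ceil(x_i*)
            nodes += [left, right]
        return best_val, best_x

    c = [70, 95, 230]
    A = [[60, 80, 140], [20, 10, 0], [0, 40, 100]]
    b = [1200, 800, 600]
    # For this particular instance the root relaxation is already integral (as noted
    # elsewhere in the thread), so no branching is actually needed: expect 1800, [6, 0, 6].
    print(branch_and_bound(c, A, b, [(0, None)] * 3))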
I am always happy to see people point out that NP-hard problems are solved to optimality in practice. Too often you come across someone who believes it is impossible to prove you have an optimal solution for an NP-hard problem, no matter which instance. But NP-hardness does not rule out the possibility that you might be much luckier than the worst case with the particular instance you are interested in. It only says that there exist worst-case classes of instances that become much more difficult to solve as their size increases.
By the way, I remember having learned that Integer Programming with a constant number of variables but a variable number of constraints can actually be solved in polynomial time with help from the Lenstra–Lenstra–Lovász lattice basis reduction algorithm, so I would not expect an integer problem with four variables to need years to solve. A constant number of constraints is actually more difficult than a constant number of variables (the Knapsack Problem is NP-hard and has a single constraint). AFAIK, for a constant number of constraints we have nothing better than a pseudo-polynomial time algorithm by Papadimitriou.
(The generalized term for a polyhedron in higher dimensions is a polytope, by the way.)
Don't forget that you can also use LP relaxation ("well, what if we didn't need everything to be integral?"). If it comes out with integral weights (as was the case in the original example, modulo floating point error), then you're done as the sibling commenter points out. If you're willing to accept approximate solutions to your integer program, you can just pick a point near the "optimal" solution.
And, of course, modern IP solvers will use the relaxed LP problem to implement "cheap" upper bounds (assuming MAX, per standard form) for entire branches of the solution space, as the optimal solution to a less-constrained problem is never worse than for the more-constrained problem (since any feasible assignment remains so if you remove a constraint).
This is exactly right — too many people equate NP-hard to unsolvable, but in fact the modern world (logistics, plant operations, airline scheduling, etc.) relies on us solving NP-hard problems.
It is unclear whether you are asking about integer programs or linear, so I'll address both.
Integer programming is very hard, and is an active area of research. The problems that arise in practice typically have sparsity, special structure, and other features that allow shortcuts to be employed, so brute force isn't needed.
Interior point solvers are guaranteed to find a global solution, and it can be proved that they will do so efficiently [1]. Practical implementations leverage decades of experience to improve upon the theoretical guarantees.
There are ways (called heuristics) to solve some NP-hard problems faster than exhaustive search.
Of course, for theoretical reasons, a heuristic is not guaranteed to be faster on all problem instances.
The general idea is to do some engineering and find a heuristic that works well for your problem. More sophisticated algorithms will try multiple heuristics at the same time (I believe OR tools has options that fall in this category). There is even research by now that aims to use ML to find good heuristics for a given family of problems (where the common characteristics of the problems might lend to a dominant heuristic).
These heuristics can still have guarantees of correctness. A good place to start is by looking up "branch and bound" algorithms, e.g., for ILP problems. The idea there is to narrow the search space while maintaining some theoretical guarantees.
(Not an expert, but I did a project on this once. I'm sure someone more expert can come along and correct me where I'm wrong.)
> There is even research by now that aims to use ML to find good heuristics for a given family of problems (where the common characteristics of the problems might lend to a dominant heuristic).
I think I have spotted some papers about this for combinatorial problems but not for continuous variables, which makes sense given their relative hardness.
Do you know some papers, journals, conferences about this?
I'm especially interested in the view from people working in optimization research, rather than in AI.
I think we're probably thinking of the same papers. There was a competition/session at last year's NeurIPS, and there are related papers linked from their webpage.
Also, just FYI, there are actually two simplex methods. One is an algorithm to solve LPs (by George Dantzig), while the other is a derivative-free optimization algorithm for solving general unconstrained problems (the Nelder–Mead method).
It’s unfortunate they both share the same name, because they’re completely different algorithms.
Simplex AFAIK isn’t used to solve real world “regular” linear programs as worst-case complexity is exponential, and polynomial algorithms exist.
As it happens, MILP solvers often do use the simplex solver internally, as it can be hot-restarted - you can modify the problem a bit and quickly get an amended solution.
On the contrary the simplex method IS used to solve regular linear programs, despite the existence of polynomial time barrier/interior-point algorithms.
Yes, simplex is worst-case exponential and barrier is worst-case polynomial, but the average-case characteristics are very different, and depending on the problem, simplex can beat barrier methods. For many large LP/MILP problems, heuristics and randomness usually dominate solution time rather than the choice of simplex vs barrier. That's why commercial solvers so thoroughly beat open-source solvers: their heuristics are far superior.
For NP-hard problems, it's not a good idea to judge algorithms by their worst-case complexity, which only provides a bound, but by their real-world performance. Before Karmarkar's method, there was another polynomial-time algorithm by Khachiyan (the ellipsoid method). Despite being polynomial time, it was too slow to be of any practical interest.
Yes, it is true: logic programming languages like Prolog are indeed often a good fit for such combinatorial tasks.
For example, using Scryer Prolog and its CLP(ℤ) library for constraint logic programming over integers, we can solve the first optimization task from the article on the Prolog toplevel, without even having to define a predicate:
?- use_module(library(clpz)).
true.
?- Vs = [Swordsmen,Bowmen,Horsemen],
Vs ins 0..sup,
60*Swordsmen + 80*Bowmen + 140*Horsemen #=< 1200,
20*Swordsmen + 10*Bowmen #=< 800,
40*Bowmen + 100*Horsemen #=< 600,
Power #= Swordsmen*70 + Bowmen*95 + Horsemen*230,
labeling([max(Power)], Vs).
yielding solutions in decreasing order of Power:
Vs = [6,0,6], Swordsmen = 6, Bowmen = 0, Horsemen = 6, Power = 1800
; Vs = [7,1,5], Swordsmen = 7, Bowmen = 1, Horsemen = 5, Power = 1735
; Vs = [5,0,6], Swordsmen = 5, Bowmen = 0, Horsemen = 6, Power = 1730
; ... .
On backtracking, alternatives are automatically generated. The above interaction shows that in this concrete example, the optimal solution is unique. In Scryer Prolog, we can press "a" on the toplevel, and it will automatically enumerate all solutions, 558 in total for this example, ending with:
...,
; Vs = [2,0,0], Swordsmen = 2, Bowmen = 0, Horsemen = 0, Power = 140
; Vs = [0,1,0], Swordsmen = 0, Bowmen = 1, Horsemen = 0, Power = 95
; Vs = [1,0,0], Swordsmen = 1, Bowmen = 0, Horsemen = 0, Power = 70
; Vs = [0,0,0], Swordsmen = 0, Bowmen = 0, Horsemen = 0, Power = 0
; false.
If we omit the actual search for solutions from the query above, i.e., if we post only the part before labeling/2:
?- Vs = [Swordsmen,Bowmen,Horsemen],
Vs ins 0..sup,
60*Swordsmen + 80*Bowmen + 140*Horsemen #=< 1200,
20*Swordsmen + 10*Bowmen #=< 800,
40*Bowmen + 100*Horsemen #=< 600,
Power #= Swordsmen*70 + Bowmen*95 + Horsemen*230.
then we see from the answer that the constraint solver has automatically deduced, from the posted constraints, significantly restricted domains of the occurring integer variables:
...,
clpz:(Swordsmen in 0..20),
clpz:(Bowmen in 0..15),
clpz:(Horsemen in 0..6),
clpz:(Power in 0..4205).
This automatic restriction is called constraint propagation, and this is what reduces the search space so significantly. It also allows the application of heuristics that often significantly speed up the search in practice, such as starting with domain variables with the smallest domains, in the hope of detecting infeasible cases earlier. We can flexibly select these heuristics as options of labeling/2, while leaving all other parts of the query the same. For instance, in this concrete example, I obtained a more than 2-fold speedup by using the "min" labeling option, i.e., writing labeling([max(Power),min], Vs), so that the search always selects a variable with the smallest lower bound for branching.
Constraint solvers blend in naturally in Prolog, and are often a reason for buying a fast commercial Prolog system. In fact, Prolog itself can be regarded as an instance of constraint programming: constraint logic programming over Herbrand terms, CLP(H), where (=)/2, i.e., terms are the same, and dif/2, i.e., terms are different, are the only constraints. Other specialized domains include CLP(B) over Boolean values, and CLP(Q) over the rational numbers, where dedicated algorithms allow very efficient and convenient solutions.
Constraint programming was also recently discussed in The Programming Paradigm to Find One Solution Among 8,080,104 Candidates:
Thanks for the detailed write-up, I find this stuff super interesting. If I wanted to read up and study the maths involved (CLP(Z)) what would be the best place to start (textbooks, courses etc)?
The second course is the last part of a three part series. The first two parts are on modeling of problems with MiniZinc. They are also quite enjoyable.
I appreciate the effort you have put into this comment, you have renewed my interest in learning Prolog. It's such a beautiful and still esoteric language to my eyes.
That seems to be the crux of it, right? How to decompose a problem into bounds, constraints, and an objective function? I’m thinking of finance applications.
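One way to see that decomposition on a deliberately tiny, made-up allocation example (not any real model): the objective is expected return, the constraints are "fully invested" plus a cap on a risky sleeve, and the bounds are per-position limits; each piece maps onto one argument of SciPy's linprog. A real finance model would add risk terms, which pushes you towards QP (e.g. in CVXPY).

    import numpy as np
    from scipy.optimize import linprog

    exp_ret = np.array([0.04, 0.06, 0.09, 0.12])   # objective: maximise expected return
    A_ub = np.array([[0, 0, 1, 1]])                # constraint: risky sleeve (assets 3+4) <= 50%
    b_ub = np.array([0.5])
    A_eq = np.array([[1, 1, 1, 1]])                # constraint: weights sum to 1 (fully invested)
    b_eq = np.array([1.0])
    bounds = [(0.0, 0.4)] * 4                      # bounds: at most 40% in any one asset

    res = linprog(-exp_ret, A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    print(res.x, -res.fun)                         # optimal weights and expected return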
The nice thing with Google OR-Tools is that it gives you several solvers with a common API. In my experience a solver might sometimes struggle with a problem that another one will solve without trouble, so it's useful to allow your users to switch the solver used with a simple drop-down.
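As a sketch of what that looks like with OR-Tools' pywraplp wrapper (reusing the article's army numbers; which backends are actually available depends on your install, and GLOP only handles the continuous relaxation), the "drop-down" really is just a string:

    from ortools.linear_solver import pywraplp

    def solve_army(backend: str):
        solver = pywraplp.Solver.CreateSolver(backend)
        if solver is None:                          # backend not available in this install
            return None
        s = solver.IntVar(0, solver.infinity(), "swordsmen")
        b = solver.IntVar(0, solver.infinity(), "bowmen")
        h = solver.IntVar(0, solver.infinity(), "horsemen")
        solver.Add(60 * s + 80 * b + 140 * h <= 1200)   # food
        solver.Add(20 * s + 10 * b <= 800)              # wood
        solver.Add(40 * b + 100 * h <= 600)             # gold
        solver.Maximize(70 * s + 95 * b + 230 * h)
        if solver.Solve() == pywraplp.Solver.OPTIMAL:
            return solver.Objective().Value()
        return None

    for backend in ("CBC", "SCIP"):                     # switching solvers = changing this string
        print(backend, solve_army(backend))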
By the way, isn't that for-loop in section IV, point 2, iterating over resources, unnecessary? You're not using the loop variable r, and the constraint is on the minimum power of the army, which has nothing to do with the resources.
The issues tab. Mostly to check for any critical / common pitfalls (by searching for issues with the most comments) and to get a sense of how battle-hardened the repo is (how many issues have been raised).
For Python libraries that do significant computation, the actual code is often in C, C++, or Fortran with the Python part just serving to tie the libraries together.
Like it or not, Fortran is a very fast and memory efficient compiled language with compilers supporting most platforms and many open source libraries written in it.
These days, it survives mostly as an authoring language for math and statistical libraries.
I was told recently to use some math library for .NET that was clearly a wrapper around a Fortran library. All the method names were 4-character hex-like codes, and I think the same was true for the parameter names. I just refused.
That's probably a wrapper for BLAS/LAPACK. The majority of BLAS/LAPACK implementations use the original function names from the reference implementation found on Netlib. The names are not hex, but acronyms and short forms, e.g. DGEMV refers to double-precision general matrix-vector multiplication.
What did you end up using as alternative? In C++ for example, there's a template based library (Eigen) that fuses multiple operations and gives you some speedup compared to the usual libraries, at the cost of very long compilation times. For C#, I imagine there could be some library that does the same thing in JIT instead.
It was the NAG library for .NET [1]. What I really wanted to use is Extreme Optimization, but couldn't get the company to buy a license. In the end I realised I could express the problem as a linear program and used Google OR-Tools and LPSolve.
The issue is, MILP in general is NP-hard to solve... so reformulating a problem as MILP doesn't gain you anything in the worst case. In practice, though, solutions can often be found quickly, or at least good approximations.