Only tangentially related, but I solved a problem at my work this week with a couple of cleverly crafted MILP programs. It's really exciting in the nerdiest way possible and I need to share it with someone, anyone! I can't write a blog post about it because I don't have a personal website, plus it'd be a trade secret anyway. Most of my coworkers aren't technical, so they're happy to get an answer to their problem, but they don't really care about the how part.
The coolest instance of MILP I have encountered "in the wild" at work was a disaggregation problem where we could only receive vendor reports that contained aggregated statistics, each report aggregating the original data along a different dimension (say one by customer zip code, another by age range, etc). By lining up these aggregated reports, you could turn that into an MILP system that could be solved to unwind all of the aggregation and get back to the original data. This gave us way more flexibility in deciding how to use the vendor data.
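The shape of that problem, as a toy sketch (the dimensions, numbers and the choice of PuLP are all mine, just for illustration): the unknowns are the original per-cell counts, and each aggregated report becomes a set of equality constraints on sums along one dimension. With only two reports the reconstruction is usually far from unique; in practice it takes several reports (plus integrality and side constraints) to pin things down.

    import pulp

    zips = ["10001", "94105", "60601"]
    ages = ["18-34", "35-54", "55+"]
    zip_report = {"10001": 5, "94105": 7, "60601": 4}   # report aggregated by zip
    age_report = {"18-34": 6, "35-54": 7, "55+": 3}     # report aggregated by age range

    prob = pulp.LpProblem("disaggregate", pulp.LpMinimize)
    x = pulp.LpVariable.dicts("cell", [(z, a) for z in zips for a in ages],
                              lowBound=0, cat="Integer")
    prob += pulp.lpSum(x.values())              # arbitrary objective; the total is fixed anyway
    for z in zips:                              # each report row becomes one equality constraint
        prob += pulp.lpSum(x[(z, a)] for a in ages) == zip_report[z]
    for a in ages:
        prob += pulp.lpSum(x[(z, a)] for z in zips) == age_report[a]

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print({k: int(v.varValue) for k, v in x.items()})   # one consistent disaggregation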
Yup, k-anonymity is hard. State of the art is rapidly evolving, requiring higher and higher k (i.e. more coarse/less useful reports). Most companies out there prefer to, um, not notice that.
Yes, and I mostly agree. Most companies just randomize an ID or do some group-bys and consider it "anonymized", which are susceptible to a variety of correlation techniques (such as setting up this MILP problem). And the more data sets you have with the same underlying masked primary key (e.g. person), the easier it can get. I have seen these weak attempts at hiding information in multiple industries. Just for the record, in my use case there was never an intent to map data to individuals, just to get the most granular information possible about key metrics.
Write it up. Even if you can't publish it now, you may not recall the details later. It's quite likely circumstances will change such that your current employer has no interest in blocking publication, so if you already have a document, you'll be able to advance human knowledge.
Doesn't hurt to ask for written approval from your manager to write about it! I'd love to read it as well. Getting to use optimization algorithms at your job is super rare (unless you're in specific industries).
Solving for the funding plan of a bank, to balance the balance sheet and meet all the regulatory constraints. (It actually uses a linearisation of a piecewise-linear convex problem that I had learned in college decades ago but completely forgot about; I asked one of my old professors for guidance, and he in turn asked for permission to use the case as an exercise for his students. A generic sketch of the trick follows below.)
Assigning cash flows to changes in balances when the two feeds are inconsistent and the timing is off (you need to do some complex combinations to match them).
But, same as OP: you get a sense of self-satisfaction, though that won't get you any credit internally.
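For anyone curious what the linearisation mentioned above looks like in general (this is just the standard epigraph trick on invented numbers, not the actual bank model): a convex piecewise-linear cost can be written as the max of its affine pieces, and minimising an epigraph variable bounded below by each piece turns the whole thing into an LP.

    import pulp

    # cost(x) = max of the affine pieces; slopes increase, so the function is convex
    segments = [(0.01, 0.0), (0.02, -1.0), (0.04, -7.0)]    # (slope, intercept)
    sources = ["deposits", "wholesale"]

    prob = pulp.LpProblem("funding_plan", pulp.LpMinimize)
    x = {s: pulp.LpVariable(f"x_{s}", lowBound=0, upBound=350) for s in sources}
    t = {s: pulp.LpVariable(f"t_{s}") for s in sources}     # epigraph (cost) variables

    prob += pulp.lpSum(t.values())                          # total funding cost
    prob += pulp.lpSum(x.values()) >= 400                   # funding requirement
    for s in sources:
        for slope, intercept in segments:
            prob += t[s] >= slope * x[s] + intercept        # one inequality per piece

    prob.solve(pulp.PULP_CBC_CMD(msg=False))
    print({s: x[s].varValue for s in sources}, pulp.value(prob.objective))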
Please do write it up! It's always interesting to read how others solve problems using similar tools. If you have a github account you can always gist it. If not, perhaps one of the prolific writers in HN would be willing to publish it for you?
I suppose so. A first pass on which resources to acquire with different costs and different (and diminishing) rates of return. Then a second pass to assign those resources to workers who have different capacities to do work, and also can't do every kind of work (so bin packing?).
Great to see OR highlighted here! Thank you for sharing.
The syntax demonstrated is very similar to the syntax of other libraries, including the python tools provided by Gurobi [1].
For experimentation and rapid prototyping of LP, QP, MIP and geometric programming problems, I highly recommend CVXPY [2]. CVXPY allows the user to select a solver. Among the solvers is GLOP [3].
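As a quick illustration (the numbers are the article's army example; integer variables need a MIP-capable solver such as GLPK_MI or CBC to be installed, so treat the solver choice here as an assumption about your setup):

    import cvxpy as cp
    import numpy as np

    x = cp.Variable(3, integer=True)            # swordsmen, bowmen, horsemen
    power = np.array([70, 95, 230])
    A = np.array([[60, 80, 140],                # food
                  [20, 10, 0],                  # wood
                  [0, 40, 100]])                # gold
    b = np.array([1200, 800, 600])

    prob = cp.Problem(cp.Maximize(power @ x), [A @ x <= b, x >= 0])
    prob.solve(solver=cp.GLPK_MI)               # swap in any installed MIP solver
    print(prob.value, x.value)                  # expect 1800 with x = [6, 0, 6]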
For modelling libraries in general-purpose languages, Gurobi's python bindings have the best reputation. But of course Gurobi is very expensive (I have heard about $50k for a fully unrestricted license, plus $10k yearly for support). On the open-source side, besides Google's OR-Tools, there is Pyomo [1] and PuLP [2] in Python (as the article mentions). In Julia, there is JuMP [3], whose development community is extremely enthusiastic.
Traditionally, however, mathematical models were encoded in domain-specific languages. The most prominent one is AMPL [4], which is proprietary. The GLPK [5] people have developed a very neat open-source clone of AMPL: the GNU MathProg language. For a more modern take on AMPL-type modelling DSLs, look at ZIMPL [6], which is open source as well.
I've used pulp a lot to set up MIP problems in python. The problem is that every variable often uses 10kb of memory or so. When problems get large, the machine I have that has access to gurobi will run out of memory just setting up the problem in python (or when calling gurobi from within python, which for a short time seems to double memory usage when calling an external process), even though gurobi can still handle the actual model.
I've often given up and just built the problems in python by outputting text files, which is very cumbersome.
I wonder whether there’s a better alternative, i.e. a python library that will deal well with large problems.
CVXPY is what you're looking for: your variables and constraints can be defined as numpy arrays, which is game changing. It also has a giant selection of solvers, is open source and actively developed, and even has tensorflow integrations.
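A minimal sketch of why that helps with the memory issue above (random data, sizes picked arbitrarily): one vector variable and one matrix constraint stand in for what would otherwise be thousands of individual variable and constraint objects.

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 2_000, 1_000
    A = rng.random((m, n))
    b = 0.5 * A.sum(axis=1)                 # keeps the problem feasible and non-trivial
    c = rng.standard_normal(n)

    x = cp.Variable(n)                      # one Python object for all 1,000 variables
    prob = cp.Problem(cp.Minimize(c @ x), [A @ x <= b, x >= 0, x <= 1])
    prob.solve()                            # uses whichever installed solver fits
    print(prob.status, prob.value)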
I don't know if it's applicable to your use case, but if you work on these models a lot you might want to consider using Minizinc/Flatzinc as a modeling language.
Nothing against python, but those models get unwieldy very quickly, especially when there are tons of overlapping constraints popping up.
Well, I usually use MIP in larger processing pipelines, which I tend to write in python. For example, see step 4 in this algorithm for automatically generating a public transit map layer:
https://site.transitapp.com/how-we-built-the-worlds-pretties...
Use cvxpy: much nicer API, and you can drop numpy arrays into it so your optimizations can be one line instead of awful lists of constraint objects. Also, it can be a TensorFlow layer now with cvxpylayers.
The article makes it sound like they are guaranteed to find the optimum solution. However, the number of iterations/execution time implies they don't simply use brute force.
So how do these solvers guarantee an optimal solution? My first thought is that they somehow model the problem as an equation whose minima/maxima they can determine analytically, but I'm unsure how to build such an equation automatically (especially when multiple constraints like food, gold, wood etc. are involved).
For linear programming (i.e. all variables are continuous):
Geometrically in R^n, the set of all points that satisfy the constraints is a high-dimensional polyhedron. The best points, the ones that minimize or maximize the linear objective function, are on the boundary of that polyhedron. There is either one (a vertex) or possibly infinitely many (a face), but in some standard reformulations (e.g. all variables are >= 0), there is always at least one optimal vertex. Optimal points can be found using iterative methods either going through the polyhedron ("interior point methods"), or going from vertex to vertex ("simplex method").
In theory, interior point methods can solve the problem to optimality in a time that is polynomial in the size of the input problem. All known variants of the simplex method are exponential in the worst case.
In practice, interior point methods are extremely fast and solve most problems in 10-100 iterations regardless of size! You can expect a problem with 100k variables and constraints to be solved within 5-20 minutes. However, using floating-point arithmetic, the solutions are not always very accurate, so it is customary to add some iterations of the simplex method at the end to get a vertex (vertices have a combinatorial structure; there is no accumulation of error, and one can get their value to high accuracy). On its own, the simplex method is still very fast, count 10-60 minutes for the same 100k-variable problem (with very high variance, though).
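To make that concrete, here is a small sketch using SciPy's bundled HiGHS solvers on the article's army example, relaxed to an LP (linprog minimises, so the objective is negated); "highs-ipm" is an interior-point code and "highs-ds" a dual simplex.

    import numpy as np
    from scipy.optimize import linprog

    c = np.array([70, 95, 230])                 # power per unit
    A_ub = np.array([[60, 80, 140],             # food
                     [20, 10, 0],               # wood
                     [0, 40, 100]])             # gold
    b_ub = np.array([1200, 800, 600])

    for method in ("highs-ipm", "highs-ds"):
        res = linprog(-c, A_ub=A_ub, b_ub=b_ub,
                      bounds=[(0, None)] * 3, method=method)
        print(method, -res.fun, res.x)          # both report ~1800 at ~[6, 0, 6]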
For integer programming (i.e. some or all of the variables are discrete):
This is much harder. The problem is NP-hard. In theory it looks hopeless. Indeed brute force is hopeless in practice.
But one can still do a lot better than brute force. IP solvers use branch-and-cut. The fastest ones (into which millions of PhD-holder hours have been poured) can typically solve problems with hundreds of thousands of variables in less than 24h. Of course, it is possible to construct a problem with 4 variables that will take years to solve, but that typically does not happen with naturally-occurring problems. For specially-structured problems (like knapsack, TSP or vehicle routing), customized codes can use special techniques to solve problems with tens of billions of variables.
To add some insight into what branch-and-cut actually does: even with integer programming you can still work with the high-dimensional polyhedron. If you are lucky and the optimal vertex turns out to be integral, you know you have the optimal integer solution. But of course, it may turn out that the vertex has non-integer components that should be integer, and that is where the real work starts. A lot of the magic in commercial solvers involves the development of clever tricks to get from this optimal non-integer point to an optimal integer point. With branching you split up the polyhedron, eliminating your current non-integer point, and with cutting you add constraints that remove the non-integer point but are guaranteed not to remove any optimal integer points. There is typically a lot of choice in what to try first, and these choices typically have a huge impact on how fast the solver is.
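To make the branching half of that concrete, here is a toy branch-and-bound sketch (branching only, no cutting planes), with SciPy's linprog solving each node's LP relaxation; the data at the bottom reuses the article's army example, and everything else is a simplification, not how real solvers are implemented.

    import math
    import numpy as np
    from scipy.optimize import linprog

    def branch_and_bound(c, A_ub, b_ub, bounds):
        """Maximise c @ x subject to A_ub @ x <= b_ub, bounds on x, x integer."""
        best_val, best_x = -math.inf, None
        nodes = [list(bounds)]                    # each node = a set of variable bounds
        while nodes:
            node = nodes.pop()
            res = linprog(-np.asarray(c), A_ub=A_ub, b_ub=b_ub,
                          bounds=node, method="highs")
            if not res.success:                   # infeasible node: prune
                continue
            if -res.fun <= best_val + 1e-9:       # bound: relaxation can't beat incumbent
                continue
            frac = [i for i, v in enumerate(res.x) if abs(v - round(v)) > 1e-6]
            if not frac:                          # integral vertex: new incumbent
                best_val, best_x = -res.fun, np.round(res.x)
                continue
            i = frac[0]                           # branch on a fractional variable
            lo, hi = node[i]
            left, right = list(node), list(node)
            left[i] = (lo, math.floor(res.x[i]))  # child with x_i <= floor(x_i*)
            right[i] = (math.ceil(res.x[i]), hi)  # child with x_i >= ceil(x_i*)
            nodes += [left, right]
        return best_val, best_x

    c = [70, 95, 230]
    A = [[60, 80, 140], [20, 10, 0], [0, 40, 100]]
    b = [1200, 800, 600]
    # For this particular instance the root relaxation is already integral (as noted
    # elsewhere in the thread), so no branching is actually needed: expect 1800, [6, 0, 6].
    print(branch_and_bound(c, A, b, [(0, None)] * 3))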
I am always happy to see people point out that NP-hard problems are solved to optimality in practice. Too often you come across someone who believes it is impossible to prove you have an optimal solution for an NP-hard problem, no matter which instance. But NP-hardness does not rule out the possibility that you might be much luckier than the worst case with the particular instance you are interested in. It only says that there exist worst-case classes of instances that become much more difficult to solve as their size increases.
By the way, I remember having learned that Integer Programming with a constant number of variables but a variable number of constraints can actually be solved in polynomial time with help from the Lenstra–Lenstra–Lovász lattice basis reduction algorithm, so I would not expect an integer problem with four variables to need years to solve. A constant number of constraints is actually more difficult than a constant number of variables (the Knapsack Problem is NP-hard and has a single constraint). AFAIK, for a constant number of constraints we have nothing better than a pseudo-polynomial time algorithm by Papadimitriou.
(The generalized term for a polyhedron in higher dimensions is a polytope, by the way.)
Don't forget that you can also use LP relaxation ("well, what if we didn't need everything to be integral?"). If it comes out with integral weights (as was the case in the original example, modulo floating point error), then you're done as the sibling commenter points out. If you're willing to accept approximate solutions to your integer program, you can just pick a point near the "optimal" solution.
And, of course, modern IP solvers will use the relaxed LP problem to implement "cheap" upper bounds (assuming MAX, per standard form) for entire branches of the solution space, as the optimal solution to a less-constrained problem is never worse than for the more-constrained problem (since any feasible assignment remains so if you remove a constraint).
This is exactly right — too many people equate NP-hard to unsolvable, but in fact the modern world (logistics, plant operations, airline scheduling, etc.) relies on us solving NP-hard problems.
It is unclear whether you are asking about integer programs or linear, so I'll address both.
Integer programming is very hard, and is an active area of research. The problems that arise in practice typically have sparsity, special structure, and other features that allow shortcuts to be employed, so brute force isn't needed.
Interior point solvers are guaranteed to find a global solution, and it can be proved that they will do so efficiently [1]. Practical implementations leverage decades of experience to improve upon the theoretical guarantees.
There are ways (called heuristics) to solve some NP-hard problems faster than exhaustive search.
Of course, for theoretical reasons, a heuristic is not guaranteed to be faster on all problem instances.
The general idea is to do some engineering and find a heuristic that works well for your problem. More sophisticated algorithms will try multiple heuristics at the same time (I believe OR tools has options that fall in this category). There is even research by now that aims to use ML to find good heuristics for a given family of problems (where the common characteristics of the problems might lend to a dominant heuristic).
These heuristics can still have guarantees of correctness. A good place to start is by looking up "branch and bound" algorithms, e.g., for ILP problems. The idea there is to narrow the search space while maintaining some theoretical guarantees.
(Not an expert, but I did a project on this once. I'm sure someone more expert can come along and correct me where I'm wrong.)
> There is even research by now that aims to use ML to find good heuristics for a given family of problems (where the common characteristics of the problems might lend to a dominant heuristic).
I think I have spotted some papers about this for combinatorial problems but not for continuous variables, which makes sense given their relative hardness.
Do you know some papers, journals, conferences about this?
I'm especially interested in the view from people working in optimization research, rather than in AI.
I think we're probably thinking of the same papers. There was a competition/session at last year's NeurIPS, and there are related papers linked from their webpage.
Also, just FYI, there are actually two simplex methods. One is an algorithm to solve LPs (by George Dantzig), while the other is a derivative-free optimization algorithm for solving general unconstrained problems (the Nelder–Mead method).
It’s unfortunate they both share the same name, because they’re completely different algorithms.
Simplex AFAIK isn’t used to solve real world “regular” linear programs as worst-case complexity is exponential, and polynomial algorithms exist.
As it happens, MILP solvers often do use the simplex solver internally, as it can be hot-restarted - you can modify the problem a bit and quickly get an amended solution.
On the contrary the simplex method IS used to solve regular linear programs, despite the existence of polynomial time barrier/interior-point algorithms.
Yes, simplex is worst-case exponential and barrier is worst-case polynomial, but the average-case characteristics are very different, and depending on the problem, simplex can beat barrier methods. For many large LP/MILP problems, heuristics and randomness usually dominate solution time rather than the choice of simplex vs barrier. That's why commercial solvers so thoroughly beat open-source solvers: their heuristics are far superior.
For NP-hard problems, it's not a good idea to judge algorithms by their worst-case complexity, which only provides a bound, but by their real-world performance. Before Karmarkar's method, there was another polynomial-time algorithm by Khachiyan (the ellipsoid method). Despite being polynomial time, it was too slow to be of any practical interest.
Yes, it is true: logic programming languages like Prolog are indeed often a good fit for such combinatorial tasks.
For example, using Scryer Prolog and its CLP(ℤ) library for constraint logic programming over integers, we can solve the first optimization task from the article on the Prolog toplevel, without even having to define a predicate:
?- use_module(library(clpz)).
true.
?- Vs = [Swordsmen,Bowmen,Horsemen],
Vs ins 0..sup,
60*Swordsmen + 80*Bowmen + 140*Horsemen #=< 1200,
20*Swordsmen + 10*Bowmen #=< 800,
40*Bowmen + 100*Horsemen #=< 600,
Power #= Swordsmen*70 + Bowmen*95 + Horsemen*230,
labeling([max(Power)], Vs).
yielding solutions in decreasing order of Power:
Vs = [6,0,6], Swordsmen = 6, Bowmen = 0, Horsemen = 6, Power = 1800
; Vs = [7,1,5], Swordsmen = 7, Bowmen = 1, Horsemen = 5, Power = 1735
; Vs = [5,0,6], Swordsmen = 5, Bowmen = 0, Horsemen = 6, Power = 1730
; ... .
On backtracking, alternatives are automatically generated. The above interaction shows that in this concrete example, the optimal solution is unique. In Scryer Prolog, we can press "a" on the toplevel, and it will automatically enumerate all solutions, 558 in total for this example, ending with:
...,
; Vs = [2,0,0], Swordsmen = 2, Bowmen = 0, Horsemen = 0, Power = 140
; Vs = [0,1,0], Swordsmen = 0, Bowmen = 1, Horsemen = 0, Power = 95
; Vs = [1,0,0], Swordsmen = 1, Bowmen = 0, Horsemen = 0, Power = 70
; Vs = [0,0,0], Swordsmen = 0, Bowmen = 0, Horsemen = 0, Power = 0
; false.
If we omit the actual search for solutions from the query above, i.e., if we post only the part before labeling/2:
?- Vs = [Swordsmen,Bowmen,Horsemen],
Vs ins 0..sup,
60*Swordsmen + 80*Bowmen + 140*Horsemen #=< 1200,
20*Swordsmen + 10*Bowmen #=< 800,
40*Bowmen + 100*Horsemen #=< 600,
Power #= Swordsmen*70 + Bowmen*95 + Horsemen*230.
then we see from the answer that the constraint solver has automatically deduced, from the posted constraints, significantly restricted domains of the occurring integer variables:
...,
clpz:(Swordsmen in 0..20),
clpz:(Bowmen in 0..15),
clpz:(Horsemen in 0..6),
clpz:(Power in 0..4205).
This automatic restriction is called constraint propagation, and this is what reduces the search space so significantly. It also allows the application of heuristics that often significantly speed up the search in practice, such as starting with domain variables with the smallest domains, in the hope of detecting infeasible cases earlier. We can flexibly select these heuristics as options of labeling/2, while leaving all other parts of the query the same. For instance, in this concrete example, I obtained a more than 2-fold speedup by using the "min" labeling option, i.e., writing labeling([max(Power),min], Vs), so that the search always selects a variable with the smallest lower bound for branching.
Constraint solvers blend in naturally in Prolog, and are often a reason for buying a fast commercial Prolog system. In fact, Prolog itself can be regarded as an instance of constraint programming: constraint logic programming over Herbrand terms, CLP(H), where (=)/2, i.e., terms are the same, and dif/2, i.e., terms are different, are the only constraints. Other specialized domains include CLP(B) over Boolean values, and CLP(Q) over the rational numbers, where dedicated algorithms allow very efficient and convenient solutions.
Constraint programming was also recently discussed in The Programming Paradigm to Find One Solution Among 8,080,104 Candidates:
Thanks for the detailed write-up, I find this stuff super interesting. If I wanted to read up and study the maths involved (CLP(Z)) what would be the best place to start (textbooks, courses etc)?
The second course is the last part of a three part series. The first two parts are on modeling of problems with MiniZinc. They are also quite enjoyable.
I appreciate the effort you have put into this comment, you have renewed my interest in learning Prolog. It's such a beautiful and still esoteric language to my eyes.
That seems to be the crux of it, right? How to decompose a problem into bounds, constraints, and an objective function? I’m thinking of finance applications.
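One way to see that decomposition on a deliberately tiny, made-up allocation example (not any real model): the objective is expected return, the constraints are "fully invested" plus a cap on a risky sleeve, and the bounds are per-position limits; each piece maps onto one argument of SciPy's linprog. A real finance model would add risk terms, which pushes you towards QP (e.g. in CVXPY).

    import numpy as np
    from scipy.optimize import linprog

    exp_ret = np.array([0.04, 0.06, 0.09, 0.12])   # objective: maximise expected return
    A_ub = np.array([[0, 0, 1, 1]])                # constraint: risky sleeve (assets 3+4) <= 50%
    b_ub = np.array([0.5])
    A_eq = np.array([[1, 1, 1, 1]])                # constraint: weights sum to 1 (fully invested)
    b_eq = np.array([1.0])
    bounds = [(0.0, 0.4)] * 4                      # bounds: at most 40% in any one asset

    res = linprog(-exp_ret, A_ub=A_ub, b_ub=b_ub,
                  A_eq=A_eq, b_eq=b_eq, bounds=bounds, method="highs")
    print(res.x, -res.fun)                         # optimal weights and expected return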
The nice thing with Google OR-Tools is that it gives you several solvers with a common API. In my experience a solver might sometimes struggle with a problem that another one will solve without trouble, so it's useful to allow your users to switch the solver used with a simple drop-down.
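As a sketch of what that looks like with OR-Tools' pywraplp wrapper (reusing the article's army numbers; which backends are actually available depends on your install, and GLOP only handles the continuous relaxation), the "drop-down" really is just a string:

    from ortools.linear_solver import pywraplp

    def solve_army(backend: str):
        solver = pywraplp.Solver.CreateSolver(backend)
        if solver is None:                          # backend not available in this install
            return None
        s = solver.IntVar(0, solver.infinity(), "swordsmen")
        b = solver.IntVar(0, solver.infinity(), "bowmen")
        h = solver.IntVar(0, solver.infinity(), "horsemen")
        solver.Add(60 * s + 80 * b + 140 * h <= 1200)   # food
        solver.Add(20 * s + 10 * b <= 800)              # wood
        solver.Add(40 * b + 100 * h <= 600)             # gold
        solver.Maximize(70 * s + 95 * b + 230 * h)
        if solver.Solve() == pywraplp.Solver.OPTIMAL:
            return solver.Objective().Value()
        return None

    for backend in ("CBC", "SCIP"):                     # switching solvers = changing this string
        print(backend, solve_army(backend))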
By the way, isn't that for-loop in section IV, point 2, iterating over resources, unnecessary? You're not using the loop variable r, and the constraint is on the minimum power of the army, which has nothing to do with the resources.
The issues tab. Mostly to check for any critical / common pitfalls (by searching for issues with the most comments) and to get a sense of how battle-hardened the repo is (how many issues have been raised).
For Python libraries that do significant computation, the actual code is often in C, C++, or Fortran with the Python part just serving to tie the libraries together.
Like it or not, Fortran is a very fast and memory efficient compiled language with compilers supporting most platforms and many open source libraries written in it.
These days, it survives mostly as an authoring language for math and statistical libraries.
I was told recently to use some math library for .NET that was clearly a wrapper around a Fortran library. All the method names were 4-character hex-like codes, and I think the same was true for the parameter names. I just refused.
That's probably a wrapper for BLAS/LAPACK. The majority of BLAS/LAPACK implementations use the original function names from the reference implementation found on Netlib. The names are not hex, but acronyms and short forms, e.g. DGEMV refers to double-precision general matrix-vector multiplication.
What did you end up using as alternative? In C++ for example, there's a template based library (Eigen) that fuses multiple operations and gives you some speedup compared to the usual libraries, at the cost of very long compilation times. For C#, I imagine there could be some library that does the same thing in JIT instead.
It was the NAG library for .NET [1]. What I really wanted to use is Extreme Optimization, but couldn't get the company to buy a license. In the end I realised I could express the problem as a linear program and used Google OR-Tools and LPSolve.
The issue is, MILP in general is NP-hard to solve... so reformulating a problem as MILP doesn't gain you anything in the worst case. In practice, though, solutions can often be found quickly, or at least good approximations.