The way one solves other problems on the D-Wave is by reducing them to the D-Wave problem, so, save in extraordinary circumstances, the answer will be "Yes."
The "D-Wave problem" (as Aaronson calls it) is kind of peculiar.
It's a certain kind of binary optimization problem that comes up in models of magnetic media ("Ising model"). In the original magnetic context, the two states are N and S. Each magnetic element within a 2D array of states jiggles around locally trying to align with its neighbors, and by doing so, each state-flip influences a global energy function.
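To make the "global energy function" concrete, here is a minimal sketch, assuming the simplest textbook form of the Ising energy on a 2D grid: nearest-neighbor coupling `J`, a uniform field `h`, and open boundaries. The function names and parameters are illustrative assumptions, not the D-Wave's actual parameterization.

```python
def ising_energy(spins, J=1.0, h=0.0):
    """Total energy of a 2D grid of +1/-1 spins (N/S), open boundaries.

    Assumed form: E = -J * sum over neighbor pairs s_i*s_j - h * sum s_i.
    """
    rows, cols = len(spins), len(spins[0])
    energy = 0.0
    for r in range(rows):
        for c in range(cols):
            s = spins[r][c]
            # Count each neighbor pair once: right and down only.
            if c + 1 < cols:
                energy -= J * s * spins[r][c + 1]
            if r + 1 < rows:
                energy -= J * s * spins[r + 1][c]
            energy -= h * s
    return energy


def flip_delta(spins, r, c, J=1.0, h=0.0):
    """Change in total energy if the spin at (r, c) were flipped."""
    rows, cols = len(spins), len(spins[0])
    neighbor_sum = 0
    for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
        rr, cc = r + dr, c + dc
        if 0 <= rr < rows and 0 <= cc < cols:
            neighbor_sum += spins[rr][cc]
    return 2 * spins[r][c] * (J * neighbor_sum + h)


aligned = [[1, 1], [1, 1]]
print(ising_energy(aligned))      # all four neighbor pairs agree
print(flip_delta(aligned, 0, 0))  # cost of breaking two aligned pairs
```

With `J > 0`, aligned neighbors lower the energy, which is the "trying to align with its neighbors" behavior: a single local flip only needs the neighboring spins to compute its effect on the global total.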
The same problem comes up in image processing. The probabilistic equations describing segmentation of on-object versus off-object regions are the same mathematically as the magnetic energy function of the Ising model.
This particular problem is important in its niche, but it's really not that general. It's possible that even something as trivial (analytically) as going from 2 to 3 states will not generalize well to the D-Wave hardware.
And it's possible that, by the time you transform a given problem (say, graph matching) to encode it in this model, you end up with either (a) something with more variables than the original problem, (b) something with exotic parameter settings that the D-Wave hardware cannot handle, or (c) a model having an energy surface that is not well-suited to the particular D-Wave computational mechanism (annealing).
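As a rough illustration of point (a), consider the 3-state case mentioned above. One common way to fit a multi-state variable into a binary model is a one-hot encoding with a penalty term; this is an assumed, generic encoding chosen for illustration, not necessarily what anyone uses on the D-Wave, and the helper names are hypothetical.

```python
from itertools import product


def one_hot_penalty(bits, weight=1.0):
    """Penalty that is zero only when exactly one bit is set.

    Expanding (sum(bits) - 1)**2 gives only linear and pairwise terms,
    so it fits a pairwise binary model. Bits are 0/1 here; a spin s in
    {-1, +1} corresponds to bit (1 + s) / 2.
    """
    return weight * (sum(bits) - 1) ** 2


def binary_variable_count(num_vars, num_states):
    """Binary variables needed to one-hot encode num_vars multi-state variables."""
    return num_vars * num_states


# Of the 8 assignments of 3 bits, only the one-hot ones carry no penalty.
valid = [bits for bits in product((0, 1), repeat=3) if one_hot_penalty(bits) == 0]
print(valid)
print(binary_variable_count(10, 3))  # binary variables for 10 three-state variables
```

The variable count triples, and the penalty weight has to be large enough to dominate the rest of the energy, which is one way the "exotic parameter settings" of point (b) can arise.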
In some applications, linear programming (LP) relaxations (I have not read the detailed paper, but I assume this is the CPLEX result discussed in the post) are much slower than annealing; in others, they are more competitive. LP relaxations sometimes give much better results, but for the Ising problem they tend to be much, much slower.
Also note that the topology of D-Wave's architecture strongly limits which instances of the Ising model you can implement, so the instance size blows up twice: once going from your problem to the Ising model, and again when mapping that Ising problem onto D-Wave's architecture.
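The second blowup can be sketched with a toy example. Below, the "hardware" is a made-up 4-qubit bipartite graph (a stand-in, not the real D-Wave layout), and the logical problem is a triangle of three mutually coupled variables. A triangle cannot sit on a bipartite graph one-to-one, so one logical variable has to be represented by a chain of two physical qubits. The checker function and all names are hypothetical.

```python
def is_valid_embedding(logical_edges, hardware_edges, chains):
    """Check that chains of physical qubits realize the logical graph."""
    hw = {frozenset(e) for e in hardware_edges}

    # No physical qubit may belong to two chains.
    used = [q for chain in chains.values() for q in chain]
    if len(used) != len(set(used)):
        return False

    # Each chain must be connected using hardware couplers only.
    for chain in chains.values():
        seen = {chain[0]}
        frontier = [chain[0]]
        while frontier:
            q = frontier.pop()
            for other in chain:
                if other not in seen and frozenset((q, other)) in hw:
                    seen.add(other)
                    frontier.append(other)
        if len(seen) != len(chain):
            return False

    # Each logical coupling needs at least one coupler between the two chains.
    for u, v in logical_edges:
        if not any(frozenset((p, q)) in hw for p in chains[u] for q in chains[v]):
            return False
    return True


# Toy bipartite hardware: a0, a1 on one side, b0, b1 on the other.
hardware_edges = [("a0", "b0"), ("a0", "b1"), ("a1", "b0"), ("a1", "b1")]
triangle = [("x", "y"), ("y", "z"), ("x", "z")]

one_to_one = {"x": ["a0"], "y": ["b0"], "z": ["a1"]}
with_chain = {"x": ["a0"], "y": ["b0"], "z": ["a1", "b1"]}

print(is_valid_embedding(triangle, hardware_edges, one_to_one))
print(is_valid_embedding(triangle, hardware_edges, with_chain))
```

Here three logical variables consume four physical qubits; on larger, denser problems the overhead from such chains grows accordingly.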