Great work! The fact that it captures "long-range order" seemingly perfectly is something not many have been able to do before! And the "collapse" visualization is great fun to watch.
But is your algorithm really qualitatively all that different from previous search methods (e.g. Efros and Leung), if you are still (uniform random?) sampling over the input distribution of patches?
I also notice your input textures tend to have sharp boundaries (as is common in pixel art). It would be interesting to see the results when the input has a lot of "noise", such as high-def images of rocks or clouds ;)
While I still prefer search methods because they are easy to implement (and give better results), "deep" methods are definitely gaining some ground:
Deep Textures
http://bethgelab.org/deeptextures/

Combining Markov Random Fields and Convolutional Neural Networks for Image Synthesis
https://arxiv.org/abs/1601.04589
Efros and Leung's method doesn't satisfy the (C1) condition. The closest previous work is Paul Merrell's model synthesis.
WFC and texture synthesis serve similar purposes: they produce images similar to the input image. However, the definition of what is "similar" differs in each case. If you have a high-def input with noise (like realistic rocks and clouds), then you really want to use texture synthesis methods. If you have an indexed image with few colors and you want to capture... something like the inner rules of that image and long-range correlations (if you have the output of a cellular automaton, for example, or a dungeon), then you want to use WFC-like methods.
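To make "inner rules" a bit more concrete, here is a toy Python sketch of the kind of constraints that can be read off an indexed image (the real overlapping model works with whole N-by-N patterns, not single-pixel adjacencies; the names here are mine):

```python
from collections import defaultdict

def adjacency_rules(img):
    """Record which palette indices may appear to the right of / below
    which others in a 2D indexed image. A toy stand-in for the rules
    WFC extracts from its input."""
    right, down = defaultdict(set), defaultdict(set)
    h, w = len(img), len(img[0])
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                right[img[y][x]].add(img[y][x + 1])
            if y + 1 < h:
                down[img[y][x]].add(img[y + 1][x])
    return right, down

# In a checkerboard, 0 may only sit next to 1 and vice versa:
board = [[(x + y) % 2 for x in range(4)] for y in range(4)]
print(dict(adjacency_rules(board)[0]))  # {0: {1}, 1: {0}}
```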
> something like the inner rules of that image and long range correlations
I assume that if you feed WFC a large input image, it just thinks of that as a very complex set of rules that are harder to satisfy than those of a small input?
Is there a way, then, to instead train the WFC algorithm on a large corpus of small, similar samples, such that it can try to derive the rules common to all the inputs in the corpus, and produce one image that "really" fits the rules, rather than just the ephemeral quirks from an individual sample?
Would there be, for example, a way to train WFC to produce outputs matching the level-design "aesthetic" of a given game, rather than just "continuing" a particular level?
About "harder and easier to satisfy": the question of how the rate at which the algorithm runs into contradictions depends on the input is not easy at all. There is no simple correlation between the contradiction rate and the size of the input.
But the first thing you'll notice if you feed it an image with a lot of patterns is that it will work very slowly.
Yeah, the corpus thing can be done if we cut out rare patterns and leave only frequent ones. I haven't tried it though.
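Something like this minimal sketch, in Python (assuming each sample has already been reduced to a Counter of its N-by-N patterns; the function name and the cutoff are mine):

```python
from collections import Counter

def corpus_patterns(sample_counts, min_count=2):
    """Pool per-sample pattern counts and keep only patterns that are
    frequent across the whole corpus; rare patterns are treated as
    quirks of an individual sample, not rules shared by all inputs."""
    pooled = Counter()
    for counts in sample_counts:  # one Counter per small sample
        pooled.update(counts)     # update() sums the counts
    return Counter({p: c for p, c in pooled.items() if c >= min_count})
```

The surviving patterns, weighted by their pooled counts, would then take the place of the patterns extracted from a single bitmap.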
A very good question! The opposite is also important: can we follow some heuristics while creating tilesets to minimize contradiction rates, without making the tilesets too easy? I don't know. If someone knows, please tell me.
> I assume that if you feed WFC a large input image, it just thinks of that as a very complex set of rules that are harder to satisfy than those of a small input?
Since the input is shredded into a multiset of N-by-M rectangles, it's the opposite: assuming the small input image is a portion of the large one, the large-image model adds examples to those in the small-image model, so the set of cases it can match an example to is the same or larger.
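Concretely, the shredding step is just a window count; a minimal Python sketch (assuming the image is a 2D array of palette indices), which also makes that superset property easy to state:

```python
from collections import Counter

def window_multiset(img, n=3, m=3):
    """Multiset of all overlapping n-by-m windows of a 2D index array."""
    h, w = len(img), len(img[0])
    return Counter(
        tuple(img[y + dy][x + dx] for dy in range(n) for dx in range(m))
        for y in range(h - n + 1)
        for x in range(w - m + 1)
    )

# If `small` is a crop of `large`, every window of the crop also occurs
# in the full image, so the larger model only ever gains examples:
#   all(window_multiset(large)[p] >= c
#       for p, c in window_multiset(small).items())
```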
You mentioned rotation- and mirroring-type bitmap ops in your algorithm. But how do we precisely apply a series of dynamic filters to the input to "evolve" the patch? Even using overlapping kernels it seems quite intensive! And prone to artefacts...
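As far as I know, the rotation and mirroring aren't dynamic filters at all: each patch is expanded once, at model-build time, into its eight symmetries (the dihedral group of the square). A minimal Python sketch of that enumeration (function name is mine):

```python
def dihedral_variants(patch):
    """The eight rotations/reflections of a square patch given as a
    list of lists; symmetric patches yield duplicates, which in a
    counting model simply add weight."""
    def rotate(p):  # 90 degrees clockwise
        return [list(row) for row in zip(*p[::-1])]
    def mirror(p):  # horizontal flip
        return [list(reversed(row)) for row in p]
    out = []
    for _ in range(4):
        out.append(patch)
        out.append(mirror(patch))
        patch = rotate(patch)
    return out
```

So the cost is a flat 8x blow-up of the pattern set, applied to the input once, with no filtering over the output.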
You might be interested in this work showing texture synthesis over a 2D surface with non-uniform geometry, rotation, scale, or even velocity: http://hhoppe.com/proj/apptexsyn/
Didn't know where else to contact you, but I made a straight [Java port](https://github.com/Aqwis/JavaWaveFunctionCollapse) just as a fun exercise. Feel free to add it to your README.md if you want.