This reminds me of the problems we did 15-20 ago at IOI or ACM ICPC. We did these in pure C then, sometimes C++.
I would have kept the states in a different way. Instead of making a vector/array of actors, I would make a pair of bitvectors the size of the grid. 1 is set if there is a blue (resp. red) actor at that position. No sorting is needed and it seems that for more practical puzzles this gives smaller state. All move operations are still easy to implement.
That would definitely work, and I’d be interested in the performance impact. This was written so that the state size would scale with the number of actors rather than the size of the grid. There is a degenerate case where a massive mostly empty grid becomes difficult not only to store in memory, but also to transition on move. The transition function would take time proportional to the size of the grid rather than the number of actors.
I would have kept the states in a different way. Instead of making a vector/array of actors, I would make a pair of bitvectors the size of the grid. 1 is set if there is a blue (resp. red) actor at that position. No sorting is needed and it seems that for more practical puzzles this gives smaller state. All move operations are still easy to implement.