Yeah, the highlight of my current solution is that the graph is computed lazily / on demand (supplies are passed in the current path and provide estimates) - very CPU heavy but light on memory. There no way I can precompute every possibility, but is there anything specific to A* that rules out lazy node creation? In this problem the nature of every node is a function of the path you took to get there.
You need memory proportional to the number of nodes you visit (score of the node, previous node on the shortes path to this node, list of next candidates). There is nothing preventing you from creating everything else lazily, and in practise most nodes will never get visited.
A* also allows you to find a solution faster by overestimating your heuristic: For example if you multiply your heurisitic with 1.1 you are guaranteed to find a path whose length is within 10% of the shortest path, requiring less CPU time and memory for the search.
No, you can definitely use A* with lazy node creation. It's not clear to me that you'll get a reasonable runtime though with an NP-complete problem. Seems like you might have a pretty high branching factor there, but if you're already using depth-first search, A* should be a major improvement over that with a good heuristic.