So I just built this, with a few changes to the approach, and it's usable as a simple pi extension without having to use what-the-pi. It seems to work pretty well so far.
Why do we need a hash for every line? Why can't we mark every fifth line (or get smarter, calculate the entropy of lines, and take longer jumps over empty boilerplate)? I feel that adding a random 3-char header to every line, while making the edit tool smarter, will make the content as a whole harder to understand.
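To make the "every fifth line" idea concrete, here is a rough sketch of what I mean. This is not how the tool in question works; `short_hash`, `tag_lines`, and the `number:hash|` prefix format are all made up for illustration, and the entropy-based variant is left out.

```python
import hashlib


def short_hash(text: str, length: int = 3) -> str:
    """Hypothetical anchor: the first few hex chars of a SHA-1 of the line."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:length]


def tag_lines(source: str, every: int = 5) -> str:
    """Prefix only every `every`-th line with 'number:hash|'; leave the rest untouched."""
    out = []
    for number, line in enumerate(source.splitlines(), start=1):
        if number % every == 0:
            # Anchor line: the edit tool can address it by number plus hash.
            out.append(f"{number}:{short_hash(line)}|{line}")
        else:
            # Unmarked line: reaches the model exactly as written.
            out.append(line)
    return "\n".join(out)
```

An edit would then be addressed relative to the nearest anchor (say, "two lines after 5:abc"), which trades some addressing precision for less noise in the text the model reads.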
Yes, this looks like O(1) work per edit. Before, harnesses were likely ingesting and outputting huge portions of the source file at each step, and each local str_replace() call is itself O(N) on the user's computer. The excess reads and writes from the LLM are O(N^2).
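Here is a back-of-the-envelope version of that argument, as a toy model only. It assumes the old approach re-emits the whole file on every edit and that the number of edits grows in proportion to file size; both function names and the numbers in the usage note are invented for illustration, not measured.

```python
def whole_file_cost(file_lines: int, edits: int) -> int:
    """Lines the model emits if every edit re-outputs the entire file."""
    return file_lines * edits


def anchored_cost(lines_per_edit: int, edits: int) -> int:
    """Lines the model emits if each edit only touches the addressed lines."""
    return lines_per_edit * edits


def edits_for(file_lines: int, lines_per_edit_site: int = 100) -> int:
    """Assumption: roughly one edit per `lines_per_edit_site` lines of source."""
    return max(1, file_lines // lines_per_edit_site)
```

Under those assumptions, a 1000-line file with 10 edits costs 10,000 emitted lines the old way versus 30 with 3-line anchored edits. Doubling the file size quadruples the first number and only doubles the second.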
Not sure what they're calculating, but this seems to me like it could be many times more efficient than 20%.