The fundamental techniques they use are highly lossy and far inferior to ultra-long-context models, where you can do it all in one prompt. Hate to break it to you and all the others.
The methods they employ improve the context given to the model, irrespective of the context length. Even as context lengths grow, these methods will still be used to shrink the search space and the resources required for a single task (think stream search vs. indexed search; a toy sketch follows below).
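To make the analogy concrete, here's a minimal sketch with a hypothetical toy corpus (the names and data are mine, just for illustration): a stream search rescans every document on every query, while a one-time inverted index turns each query into a cheap lookup over a much smaller candidate set.

    # Stream search: O(N * doc length) per query, touches every document.
    # Indexed search: one O(N) build pass, then each query is a dict lookup.
    from collections import defaultdict

    corpus = {  # hypothetical toy corpus
        "doc1": "long context windows are expensive",
        "doc2": "retrieval narrows the search space",
        "doc3": "indexes trade build time for query time",
    }

    def stream_search(query: str) -> list[str]:
        # Scans the full corpus for every query.
        return [doc_id for doc_id, text in corpus.items() if query in text]

    def build_index() -> dict[str, set[str]]:
        # One-time pass; afterwards lookups skip non-matching documents entirely.
        index: dict[str, set[str]] = defaultdict(set)
        for doc_id, text in corpus.items():
            for token in text.split():
                index[token].add(doc_id)
        return index

    index = build_index()
    print(stream_search("search"))   # ['doc2']
    print(sorted(index["search"]))   # ['doc2']

Same results either way; the point is that the index bounds the work per query no matter how large the corpus (or the context window) gets.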
I’m also curious which paper you’re referencing that finds more context, rather than more relevant context, yields better results?