The approach I was going to take is to generate English descriptions/summaries of the code in related chunks of under 4k tokens each; those summaries can then be combined and summarized in turn.
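Here's a rough sketch of that chunk-then-combine idea, just to make it concrete. Everything named here is my own placeholder: `estimate_tokens` is a crude 4-characters-per-token guess rather than a real tokenizer, and `summarize` is a hypothetical callable you'd back with whatever model call you use.

```python
from typing import Callable, List

MAX_TOKENS = 4000  # per-chunk budget, taken from the "under 4k tokens" figure


def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. Swap in a real tokenizer
    # if you need accurate counts.
    return max(1, len(text) // 4)


def chunk_texts(texts: List[str], max_tokens: int = MAX_TOKENS) -> List[List[str]]:
    """Greedily pack consecutive texts into chunks that fit the budget.

    A single text larger than the budget still gets a chunk of its own;
    splitting it further is left out of this sketch.
    """
    chunks: List[List[str]] = []
    current: List[str] = []
    used = 0
    for text in texts:
        cost = estimate_tokens(text)
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(text)
        used += cost
    if current:
        chunks.append(current)
    return chunks


def summarize_tree(
    texts: List[str],
    summarize: Callable[[str], str],  # hypothetical: wraps your model call
    max_tokens: int = MAX_TOKENS,
) -> str:
    """Summarize each chunk, then summarize the summaries until one remains."""
    if not texts:
        return ""
    while True:
        chunks = chunk_texts(texts, max_tokens)
        summaries = [summarize("\n\n".join(chunk)) for chunk in chunks]
        if len(summaries) == 1:
            return summaries[0]
        if len(summaries) >= len(texts):
            # No reduction this round (summaries too long to pack together):
            # combine everything in one final call to avoid looping forever.
            return summarize("\n\n".join(summaries))
        texts = summaries
```

You'd pass in the source of related files (or functions) as `texts`, ordered so that neighbours belong together, since the packing is purely sequential.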
In my opinion, the code descriptions GPT-3.5 Turbo has been spitting out for me are good quality and concise. I'd argue they are probably better than what many developers would write themselves, especially when English isn't the developer's native language.
If you're trying to get codebase-level introspection, you might need to wait a bit for some of these technologies to mature.
Exciting space and yeah, learning heaps myself day by day.