GPT-4 is a scary improvement over 3.5, especially for handling code. It will be genuinely awesome when these models get a context window large enough to hold a small-to-medium-sized codebase.
I've been playing around with it for an hour seeing what it can do to refactor some of the things we have with the most tech debt, and it is astounding how well it does with how little context I give it.
The fundamental techniques they use are highly lossy and far inferior to ultra-long-context models where you can do it all in one prompt. Hate to break it to you and all the others.
The methods they employ improve the context given to the model, irrespective of the context length. Even when context lengths improve, these methods will still be used to decrease the search space and the resources required for a single task (think stream search vs indexed search).
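To make the stream-vs-indexed analogy concrete, here's a minimal toy sketch of my own (not anything a real tool ships): index the repo up front, then pull only the few most relevant files into the prompt instead of streaming the whole codebase through the model. Real retrieval layers use embeddings or AST-aware chunking rather than this crude keyword overlap.

```python
from pathlib import Path

def build_index(repo_root: str) -> dict[str, set[str]]:
    """Map each Python file to the set of whitespace-separated tokens it contains."""
    index = {}
    for path in Path(repo_root).rglob("*.py"):
        index[str(path)] = set(path.read_text(errors="ignore").split())
    return index

def relevant_files(index: dict[str, set[str]], query: str, top_k: int = 3) -> list[str]:
    """Rank files by keyword overlap with the task and keep only the top few."""
    terms = set(query.split())
    ranked = sorted(index, key=lambda f: len(index[f] & terms), reverse=True)
    return ranked[:top_k]

# Usage: paste only relevant_files(build_index("."), task_description) into the prompt.
```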
I’m also curious: which paper are you referencing that finds more context, rather than more relevant context, yields better results?
GPT-4 supports 32k tokens, which I'd guesstimate leaves ~25k tokens for code, or roughly ~2.5k lines. So it's already enough to work on a whole module at a time. If you're using small microservices, it might already be good enough.
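For what it's worth, the back-of-envelope behind that estimate looks roughly like this; the reserve for instructions/output and the tokens-per-line ratio are my assumptions, not measured figures:

```python
context_window = 32_000      # GPT-4 32k variant
reserved = 7_000             # rough allowance for instructions + reply
tokens_per_code_line = 10    # ballpark for tokenized source, varies by language

code_tokens = context_window - reserved
print(code_tokens, "code tokens ~", code_tokens // tokens_per_code_line, "lines")
# -> 25000 code tokens ~ 2500 lines
```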
Also, if you were doing it personally you'd probably not take the whole codebase, but look at function signatures/docs as you get further away from the code you care about. While there are probably benefits to clever systems for summarising and iteratively working on the code, you can get away with just cramming a load of context in.
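One low-tech way to do the "signatures/docs for far-away code" part is to strip distant files down to their headers before pasting them in. A rough sketch of my own, assuming Python source and single-line signatures:

```python
import ast

def skeleton(source: str) -> str:
    """Keep only def/class headers plus the first docstring line of each."""
    lines = source.splitlines()
    out = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            header = lines[node.lineno - 1].strip()  # the "def ..." / "class ..." line
            doc = ast.get_docstring(node)
            out.append(header + (f"  # {doc.splitlines()[0]}" if doc else ""))
    return "\n".join(out)

# Usage: full source for the file you're editing, skeleton(src) for everything else.
```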
I have been playing with GPT-4 and it's good. It's easy to imagine a bunch of use cases where even the maximum cost of a few dollars ($1.80-ish for the context alone) for a single query is worth it. If it's good enough to save you time, it's very easy to cross the threshold where it's cheaper than a person.
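For reference, the ballpark works out like this, assuming roughly $0.06 per 1k prompt tokens for the 32k model (check current pricing before relying on it):

```python
prompt_tokens = 30_000          # a nearly full 32k context
price_per_1k_prompt = 0.06      # assumed USD rate for the 32k model
print(f"${prompt_tokens / 1000 * price_per_1k_prompt:.2f}")  # -> $1.80
```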
I'm worried it's going to decimate the already wobbly junior market: where will the senior devs of tomorrow come from to wrangle the LLM output, once it replaces the juniors' job of coding all day?
All programmers who use AI just got a boost. I tried it and it is amazing; it writes complex code on demand like I've never seen before. But the competition gets the boost too, so now we have to compete with devs armed with GPT-4. Humans are still the differentiating factor: which company will use AI better?
Yup, there's a bit of a learning curve in getting it to output the most useful stuff with the fewest prompts.
Figure out what your goal is, then start by giving it wide context; at first it will give wrong answers due to the lack of context. With each wrong answer, give it some more context for whatever you think its proposed solution is missing the most. Eventually you get 100% working code, or something so close to it that you can easily and quickly finish it yourself.
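Roughly, the loop looks like this. A sketch only: ask_model and run_tests are hypothetical placeholders for whatever LLM call and test runner you use, and I do the "add more context" step by hand in the chat rather than automating it.

```python
def ask_model(prompt: str) -> str:
    raise NotImplementedError("your LLM call goes here (hypothetical stub)")

def run_tests(code: str) -> tuple[bool, str]:
    raise NotImplementedError("your test runner goes here (hypothetical stub)")

def refine(task: str, context: str, max_rounds: int = 5) -> str:
    """Wide context first, then feed back whatever each wrong answer was missing."""
    prompt = f"{task}\n\nRelevant code:\n{context}"
    code = ""
    for _ in range(max_rounds):
        code = ask_model(prompt)
        passed, failure = run_tests(code)
        if passed:
            return code          # 100% working, stop here
        # in practice: paste in the extra source you think the attempt was missing
        prompt += f"\n\nThat attempt failed:\n{failure}\nExtra context:\n..."
    return code                  # close enough to finish by hand
```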
This is one strategy, but there are many that you can use to get it to reduce the burden of refactoring.