You don’t mention it explicitly, but I assume you manually copied and pasted all the code, as well as the various patches with updates? In my experience, that quickly makes new suggestions from ChatGPT go out of sync with the actual state of the code. Did you occasionally start the conversation over and paste in all the code you currently had, or did this not turn out to be an issue for you?
I have found that with GPT-4 this isn't an issue, but you do have to watch it and let it know when it makes up new syntax. Scolding it will usually get it to correct itself.
The bigger issue for me has been hallucinated libraries. It will link to libraries that don't exist and insist that they do. Sometimes I've been able to get it to write out the library it hallucinated, though.
It also makes more syntax errors than a person would, in my experience, but that is made up for by its being really good at troubleshooting bugs and by the speed of its output.
Indeed, I manually copied the outputs. If the network lost context (or I ran out of GPT-4 credits and reverted to GPT-3), or if for some reason I needed to start a new chat, I would start by feeding in the other modules' docstrings to rebuild context. Sometimes I had to pass these in again after a few prompts.
A good example looks like:
```
I am trying to model associative memory that i may attach to a gpt model
here is the code for the memory:
....
can we keep the input vectors in another array so we can fetch items from it directly instead of having to reconstruct?
```
I asked GPT to write a program which displays the skeleton of a project, i.e. folders, files, functions, classes and methods. I put that at the top of the prompt.
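For reference, here is a minimal sketch of what such a skeleton-printer might look like. This is my own reconstruction in Python, not the code GPT produced: it walks a directory tree, lists folders and files, and uses the standard-library `ast` module to pull classes, methods, and top-level functions out of `.py` files.

```python
# Sketch of a "project skeleton" printer: folders, files, and for Python
# files the classes, methods, and top-level functions they define.
import ast
import os


def print_skeleton(root: str) -> None:
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip hidden and cache directories in place.
        dirnames[:] = [d for d in dirnames if not d.startswith((".", "__"))]
        depth = dirpath[len(root):].count(os.sep)
        indent = "  " * depth
        print(f"{indent}{os.path.basename(dirpath) or root}/")
        for name in sorted(filenames):
            print(f"{indent}  {name}")
            if not name.endswith(".py"):
                continue
            with open(os.path.join(dirpath, name), encoding="utf-8") as f:
                try:
                    tree = ast.parse(f.read())
                except SyntaxError:
                    continue  # skip files that don't parse
            for node in tree.body:
                if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
                    print(f"{indent}    def {node.name}()")
                elif isinstance(node, ast.ClassDef):
                    print(f"{indent}    class {node.name}:")
                    for item in node.body:
                        if isinstance(item, (ast.FunctionDef, ast.AsyncFunctionDef)):
                            print(f"{indent}      def {item.name}()")


if __name__ == "__main__":
    print_skeleton(".")
```

The output is compact enough to paste at the top of a prompt, as described above, without eating much of the context window.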
I had a spooky experience with a project that was written almost entirely by GPT. I gave it the skeleton and one method and asked it for a modification. It gave it and also said "don't forget to update this other method", and showed me the updated code for that too.
The spooky part is, I never told it the code for that method, but it was able to tell from context what it should be. (I told it that it itself had written the code, but I don't know if that made any difference: does GPT know how it "would" have done things, i.e. can it predict code that it wrote?)
It's very good at guessing from function names and context. This is very impressive the first few times, and very frustrating when working with existing codebases, because it assumes the existence of functions based on naming schemes. When those don't exist, you can ask it to write them, but this is a rabbit hole that leads to more distractions than necessary (oh, now we need that function, which needs this one). It starts to feel like too much work.
Often, it will posit the existence of a function that is named slightly differently. This is great for helping you find corners of an API with functionality you didn't know existed, but insanely frustrating when you're just trying to get code working the first time. You end up manually verifying the API calls.
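One cheap way to do that manual verification, at least in Python, is to check each dotted name GPT used against the installed library before running anything. A sketch (the `magic_fix` name below is deliberately made up to stand in for a hallucinated call):

```python
# Sanity check for hallucinated API calls: does each dotted name
# GPT suggested actually resolve to a real attribute?
import importlib


def resolves(dotted: str) -> bool:
    """Return True if e.g. 'os.path.join' resolves to a real object."""
    module_name, _, rest = dotted.partition(".")
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    for part in rest.split(".") if rest else []:
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True


print(resolves("os.path.join"))       # real function -> True
print(resolves("os.path.magic_fix"))  # hallucinated  -> False
```

It won't catch wrong signatures, but it filters out the flatly nonexistent names quickly.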
In my experience, it only works well with codebases that are probably quite well represented in its training data. For more obscure ones, you're better off just doing it yourself. Fine-tuning may be a way to overcome this, though.