Quick testing on vscode to see if I'd consider replacing Copilot with this.
The biggest showstopper for me right now is that the output length is quite short. The default length is set to 256 tokens, but even if I raise it to 4096, I'm not getting any larger chunks of code.
Is this because of a max latency setting, or the internal prompt, or am I doing something wrong? Or is it only really made to autocomplete lines, and not blocks like Copilot does?
- Generation time exceeded (configurable in the plugin config)
- Number of tokens exceeded (not the case since you increased it)
- Indentation - stops generating if the next line has a shorter indent than the first line
- Low probability of the sampled token
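To illustrate the indentation criterion, here is a rough sketch of how such a check could work (the function names are mine, not the plugin's actual code):

```python
def indent_width(line: str) -> int:
    """Number of leading whitespace characters on a line."""
    return len(line) - len(line.lstrip())

def should_stop_on_indent(first_line: str, next_line: str) -> bool:
    """Stop once a non-empty generated line is less indented than the
    line where the completion started, i.e. we likely left the block."""
    if not next_line.strip():
        return False  # blank lines don't end a block
    return indent_width(next_line) < indent_width(first_line)
```

So a completion started inside an indented block would be cut off at the first line that dedents back out of it.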
Most likely you are hitting the last criterion. It's something that should be improved in some way, but I am not sure how yet. Currently it uses a very basic token-sampling strategy with custom threshold logic that stops generating when the probability of the sampled token is too low. That logic is likely too conservative.
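As a rough sketch of what that threshold logic amounts to (the 0.10 cutoff and names here are hypothetical, not the plugin's actual values):

```python
import math

PROB_THRESHOLD = 0.10  # hypothetical cutoff, not the plugin's real value

def stop_on_low_probability(token_logprob: float,
                            threshold: float = PROB_THRESHOLD) -> bool:
    """Convert the sampled token's log-probability back to a probability
    and stop generation if the model was 'unsure' about that token."""
    return math.exp(token_logprob) < threshold
```

With a fixed cutoff like this, generation ends as soon as the model emits one uncertain token, which is why longer multi-line completions get truncated even when the token budget is far from exhausted.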
Thanks :)