There’s a project called Tabby that you can host as a server on a machine that has a GPU, and it has a VSCode extension that connects to the server.
The default model is StarCoder, in its small 1B-parameter version. The downside is that it’s not super smart (though still an improvement over built-in tools), but because the model is so small, I’m getting sub-second processing times.
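
If you want to check the latency on your own setup, here is a rough sketch of how you could time a single completion request. Treat the details as assumptions to verify against the Tabby docs for your version: I'm assuming the server listens on `localhost:8080` and accepts a POST to `/v1/completions` with a `language` field and `segments.prefix` / `segments.suffix`. Newer releases may also require an auth token header. The helper names (`build_payload`, `timed`, `request_completion`) are just made up for this example.

```python
import json
import time
import urllib.request

# Assumption: a Tabby server on this machine, port 8080. Adjust to your setup.
TABBY_URL = "http://localhost:8080/v1/completions"


def build_payload(prefix: str, suffix: str = "", language: str = "python") -> dict:
    """Build the JSON body for a completion request (field names are an assumption)."""
    return {"language": language, "segments": {"prefix": prefix, "suffix": suffix}}


def timed(fn, *args, **kwargs):
    """Call fn(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def request_completion(payload: dict, url: str = TABBY_URL, timeout: float = 10.0) -> dict:
    """POST the payload to the server and return the decoded JSON response."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, something like `response, seconds = timed(request_completion, build_payload("def fib(n):\n    "))` gives you the round-trip time in seconds. Note this measures the whole HTTP round trip, not just model inference, so the first call may be slower than later ones.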