That leads to a kind of fluid distinction similar to interpreted vs. compiled languages.
You tell the AI what you want it to do. The AI does what you want. It might process the requests itself, working at the "code level" of your input, which is the prompt. It might also generate some specific bytecode, which takes time and effort up front but pays for itself by processing later inputs more efficiently. You could have something like JIT, where the AI decides which program to use for a given request, occasionally making and caching a new one if none fit.
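To make the JIT idea a bit more concrete, here is a rough sketch of what such a dispatcher could look like. Everything in it is made up for illustration: `interpret` and `compile_program` are toy stand-ins for expensive model calls, the `kind: payload` request format is an assumption, and the "compile after N sightings" threshold is just one possible policy.

```python
from typing import Callable, Dict, List

Program = Callable[[str], str]

# Toy operations standing in for "what the user asked for".
OPS: Dict[str, Program] = {
    "upper": str.upper,
    "reverse": lambda s: s[::-1],
}

def interpret(kind: str, payload: str) -> str:
    """Hypothetical slow path: the model works on the prompt directly."""
    return OPS[kind](payload)

def compile_program(kind: str) -> Program:
    """Hypothetical: the model emits a cheap, specialized program once."""
    return OPS[kind]

class JitDispatcher:
    """Interpret rare request kinds; compile and cache frequent ones."""

    def __init__(self, threshold: int = 2) -> None:
        self.threshold = threshold          # sightings before compiling
        self.seen: Dict[str, int] = {}      # how often each kind was seen
        self.cache: Dict[str, Program] = {} # compiled programs by kind
        self.log: List[str] = []            # which path each request took

    def handle(self, request: str) -> str:
        # Assumed request format: "<kind>: <payload>"
        kind, _, payload = request.partition(":")
        kind, payload = kind.strip(), payload.strip()

        program = self.cache.get(kind)
        if program is not None:
            self.log.append("cached")
            return program(payload)

        self.seen[kind] = self.seen.get(kind, 0) + 1
        if self.seen[kind] >= self.threshold:
            # Hot enough: pay the one-time cost and cache the result.
            program = compile_program(kind)
            self.cache[kind] = program
            self.log.append("compiled")
            return program(payload)

        self.log.append("interpreted")
        return interpret(kind, payload)
```

With `threshold=2`, the first request of a kind is interpreted, the second triggers compilation, and later ones hit the cache. A real system would also need some way to decide whether an existing program actually "fits" a new request, which this sketch sidesteps by keying on an exact kind string.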
Yeah, AI, at least for now, is so energy inefficient. There is only so much sunlight hitting Earth that we can afford to be stupid with. Using AI for everything makes Electron apps seem efficient! Once the hype runs out and you have to pay full price, much of today's AI will be unattractive. Hopefully that leads to more efficient AI (which I suspect is more interesting anyway).