> Java has the largest, oldest and most explicit data set for the LLM to reference
That seems to be a recommendation for coding with LLMs that don't have access to tools to look up APIs, docs and 3rd party source-code, rather than something you'd chose for "Agentic Coding".
Once the tooling can automatically figure out what is right, what language you use matters less, as long as source code ends available somewhere the agent can read it when needed.
Agree much with your 2nd point though, all outputs still require careful review and what better language to use than one you know inside-out?
Why is this? is there just a insanely large codebase of open source projects in Java (the only thing i can think of is the entire Apache suite)? Or is it because the docs are that expressive and detailed for a given OSS library?
Certain points about the language, as well as certain long-existing open source projects have been discussed ad-nauseum online. This all adds to the body of knowledge.
1) Java has the largest, oldest and most explicit data set for the LLM to reference, so it's likely to be the most thorough, if not the most correct.
2) Go with the language YOU know best because you'll be able to spot when the LLM is incorrect, flawed in its 'reasoning', hallucinating etc.