For llama.cpp users at least, a recent PR (https://github.com/ggerganov/llama.cpp/pull/1773) adds grammar-based sampling, which restricts generation to tokens the grammar permits and so could improve the structural reliability of LLaMA output. The PR provides an example JSON grammar as well.
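For anyone unfamiliar with the idea, here is a toy sketch of grammar-constrained sampling in Python. It is not the PR's implementation and uses no llama.cpp API: the vocabulary, the `allowed_next` grammar (a non-empty JSON array of single digits), and the random `fake_logits` "model" are all made up for illustration. The point is only that masking out tokens the grammar forbids at each step guarantees well-formed output regardless of what the model prefers.

```python
import json
import random

# Hypothetical toy vocabulary; "hello" and "{" are tokens the grammar never permits.
VOCAB = ["[", "]", ",", "0", "1", "2", "hello", "{"]
DIGITS = {"0", "1", "2"}


def allowed_next(tokens):
    """Toy grammar: '[' digit (',' digit)* ']'. Returns the tokens permitted next."""
    if not tokens:
        return {"["}
    last = tokens[-1]
    if last == "]":
        return set()          # the array is closed, nothing may follow
    if last in ("[", ","):
        return set(DIGITS)    # a value must come next
    return {",", "]"}         # after a digit: continue or close


def fake_logits(rng):
    """Stand-in for a language model: one random score per vocabulary token."""
    return {tok: rng.gauss(0.0, 1.0) for tok in VOCAB}


def constrained_generate(seed, max_tokens=20):
    """Greedy decoding, but only over tokens the grammar currently allows."""
    rng = random.Random(seed)
    tokens = []
    while True:
        allowed = allowed_next(tokens)
        if not allowed:
            break
        # Assumed length cap: force the closing bracket once it is legal.
        if len(tokens) >= max_tokens and "]" in allowed:
            tokens.append("]")
            break
        logits = fake_logits(rng)
        tokens.append(max(sorted(allowed), key=lambda t: logits[t]))
    return "".join(tokens)


if __name__ == "__main__":
    for seed in range(3):
        text = constrained_generate(seed)
        print(text, "->", json.loads(text))
```

Even though the fake model is free to score "hello" or "{" highest, those tokens are never emitted, so every result passes `json.loads`. A real implementation has to deal with tokens that span several grammar symbols, which this sketch ignores.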