A couple of things you can try to see if it improves with the prompt/formatting. E.g. with Davinci (and J a bit, but I didn't test it much) you can get better results by:
- Using few-shot examples of similar length to the targets (e.g. for 10-digit math, use 10-digit few-shot examples)
- Chunking numbers with commas
- Having it double-check itself
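
Rough sketch of how those three tricks could be combined when building a prompt. Everything here is an assumption for illustration: the `Q:`/`A:`/`Check:` layout, the helper names (`random_operand`, `format_example`, `build_prompt`), and using subtraction as the "double-check" step. It only builds the prompt string; it doesn't call any model.

```python
import random

def random_operand(digits, rng):
    """Return a random integer with exactly `digits` digits."""
    return rng.randint(10 ** (digits - 1), 10 ** digits - 1)

def format_example(a, b, with_answer=True):
    """Render one addition problem with comma-chunked numbers."""
    lines = [f"Q: {a:,} + {b:,} = ?"]
    if with_answer:
        total = a + b
        lines.append(f"A: {total:,}")
        # Hypothetical "double-check" step: verify by subtracting back.
        lines.append(f"Check: {total:,} - {b:,} = {a:,}. Correct.")
    else:
        # Leave the answer open for the model to complete.
        lines.append("A:")
    return "\n".join(lines)

def build_prompt(a, b, shots=3, seed=0):
    """Build a few-shot prompt whose examples match the target's digit length."""
    rng = random.Random(seed)
    digits = max(len(str(a)), len(str(b)))
    examples = [
        format_example(random_operand(digits, rng), random_operand(digits, rng))
        for _ in range(shots)
    ]
    examples.append(format_example(a, b, with_answer=False))
    return "\n\n".join(examples)

print(build_prompt(1234567890, 9876543210))
```

The few-shot examples are generated with the same number of digits as the target, so a 10-digit question gets 10-digit examples, and the prompt ends on an open `A:` line so the model fills in the answer and (hopefully) its own `Check:` line.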