I was thinking the same about consistency, which is useful in many contexts, but if you let another LLM do the extraction, you are again at the mercy of that LLM's quality and hallucinations.
Of course, mitigation strategies would be to use several different LLMs and compare the results (voting), or to use a highly trained/specialised model for entity/context extraction only. An interesting benchmark would be the point at which those extraction techniques/models exceed what a human professional is able to do.
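To make the voting idea a bit more concrete, here is a rough sketch of how it could look. It assumes each model has already returned a plain list of entity strings; the function name `vote_entities`, the `min_votes` parameter, and the sample outputs are all made up for illustration and are not from any particular library.

```python
from collections import Counter


def vote_entities(extractions, min_votes=None):
    """Keep entities reported by at least `min_votes` extractors.

    `extractions` is a list of entity lists, one per model.
    By default a strict majority of models must agree.
    """
    if min_votes is None:
        min_votes = len(extractions) // 2 + 1
    counts = Counter()
    for entities in extractions:
        # Normalise and deduplicate so each model votes at most once per entity.
        counts.update({e.strip().lower() for e in entities})
    return sorted(e for e, n in counts.items() if n >= min_votes)


# Hypothetical outputs from three different models for the same text.
model_outputs = [
    ["Berlin", "Angela Merkel", "CDU"],
    ["berlin", "Angela Merkel", "Bundestag"],
    ["Berlin", "Angela Merkel ", "CDU", "SPD"],
]

agreed = vote_entities(model_outputs)
print(agreed)  # ['angela merkel', 'berlin', 'cdu']
```

Entities that only one model reports ("Bundestag", "SPD" here) get dropped, which filters out one-off hallucinations but also anything only a single model was good enough to catch. Exact string matching after lowercasing is a simplification; real entity output would probably need fuzzier matching.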