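For whatever it's worth, asking a model to count is a terrible idea because of how these models work: they generate text one token at a time and have no built-in counter for keeping an exact tally.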

You may have more luck with a hybrid approach, using LLMs for language understanding and computers for the counting. For example, ask them to write a short, one-line description of every instance where something happens, and then use a traditional program to count the lines.
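Here is a rough sketch of the counting half, assuming the model's response has already been saved as plain text with one description per line. The function name `count_instances` and the sample response are made up for illustration; no real LLM API is called.

```python
def count_instances(model_output: str) -> int:
    """Count the non-empty lines in a model's response.

    Assumes the model was asked to write exactly one line per instance,
    so each non-blank line corresponds to one occurrence.
    """
    return sum(1 for line in model_output.splitlines() if line.strip())


# Hypothetical model response: one line per interruption found in a transcript.
sample_output = """\
Alice interrupts Bob during the budget discussion.
Bob interrupts Carol while she presents the timeline.

Alice interrupts Carol near the end of the meeting.
"""

print(count_instances(sample_output))  # prints 3
```

Blank lines are skipped because models often pad their output with them. If the model wraps a single description across several lines, this will overcount, so it helps to insist on one line per instance in the prompt.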