The tests I've done on most terminals were an organic process.
I simply typed into the terminal and let my brain decide. I spent about an hour just flipping between different terminals and programs, doing a bunch of different types of typing tests. A "better or worse?" comparison between them was very clear and repeatable.
I tried typing fast, typing slow and also just holding down a key (this made the most noticeable difference). Terminals that have low input latency will appear to spit out characters liquid smooth if you hold down a key.
I also talked to some friends who use Windows and we came to the same conclusion in a blind test.
As for the latency numbers themselves, they're just my best estimates based on how I perceived the delay. I don't think the numbers matter as much as the "this feels better" or "this feels worse" test, since in the end that's the deciding factor.