"The user is then presented 10 images (a tiger,
a house, a moose, etc) from a library of 10,000 images."
I should have made this clearer: the ten images are chosen randomly from the group of 10,000.
The question is whether there are 10,000 images that are different enough that people won't be fooled. Say my picture is a green house and a prompt shows a red house: will I accidentally think it's the right site key?
The good news is that most people won't even have a picture of a house as their site key, so it will still protect a large percentage of users.
That is a fair attempt. The weak point, of course, is the bit of data that stores which image you chose. If the attacker is able to read that, then he can display the right image.
1) If the attacker can scrape the screen, they can detect which image you are using - securing the entire pipeline to the screen is hard.
2) 10,000 images is way too few.
Even if we assume an even distribution of images, as an attacker I can serve the same image to all targets, and 1 in 10,000 will then think that they are interacting with a trusted component.
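
To put rough numbers on that, here is a minimal Python sketch. It assumes each target's site key is picked uniformly and independently from the 10,000 images, and that the attacker shows one fixed image to everyone. The function names and the target counts are made up for illustration.

```python
LIBRARY_SIZE = 10_000  # size of the image library discussed in the thread


def expected_fooled(targets: int, library_size: int = LIBRARY_SIZE) -> float:
    """Expected number of targets whose site key happens to be the attacker's image."""
    # Each target matches with probability 1 / library_size (uniformity assumed).
    return targets / library_size


def prob_at_least_one(targets: int, library_size: int = LIBRARY_SIZE) -> float:
    """Probability that at least one target sees their own site key."""
    # Complement of "no target matches", assuming independent choices.
    return 1.0 - (1.0 - 1.0 / library_size) ** targets


if __name__ == "__main__":
    # Hypothetical campaign sizes, chosen only to show the trend.
    for n in (1_000, 10_000, 1_000_000):
        print(n, expected_fooled(n), round(prob_at_least_one(n), 4))
```

Under those assumptions, a phishing run against a million targets would be expected to show about 100 of them the image they were hoping to see, with no knowledge of any individual's choice.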