Just to mention a modification of the "simple and intuitive cardinality estimator" that's far more accurate and is actually used for Google's SZL [2]:
Given N objects, hash each object and keep only the lowest M hashes (with M << N), then calculate what percentage of the hash space those M hashes cover and compare it to how much you'd expect them to cover for a given number of distinct values.
This removes the need to assume the N objects themselves are evenly distributed, since hash(object) should exhibit that property.
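In case it's useful, here's a rough Python sketch of that idea (a toy version only, not the szl code or my linked implementation; the truncated SHA-1 hash, the helper names and M = 1024 are just placeholder choices):

    # Toy "keep the M smallest hashes" estimator, as described above.
    import hashlib
    import heapq

    HASH_SPACE = 2 ** 64  # hashes are truncated to 64 bits

    def hash64(obj):
        # Map an object to a pseudo-uniform 64-bit integer.
        digest = hashlib.sha1(str(obj).encode('utf-8')).digest()
        return int.from_bytes(digest[:8], 'big')

    def estimate_cardinality(objects, m=1024):
        smallest = set()  # the m smallest distinct hashes seen so far
        heap = []         # max-heap (negated values) over that same set
        for obj in objects:
            h = hash64(obj)
            if h in smallest:
                continue
            if len(smallest) < m:
                smallest.add(h)
                heapq.heappush(heap, -h)
            elif h < -heap[0]:
                smallest.remove(-heapq.heappop(heap))
                smallest.add(h)
                heapq.heappush(heap, -h)
        if len(smallest) < m:
            return len(smallest)  # fewer than m distinct values: count is exact
        # The m kept hashes cover a fraction (m-th smallest hash / HASH_SPACE)
        # of the hash space; if n uniform hashes produced that coverage with
        # m points, n is roughly (m - 1) / fraction.
        fraction_covered = -heap[0] / HASH_SPACE
        return int((m - 1) / fraction_covered)

e.g. estimate_cardinality(line.strip() for line in open('corpus.txt')) gives an approximate distinct-line count in a single pass while storing at most 1024 hashes.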
I actually cover that in the post (well, more or less: I talk about hashing to remove bias, after describing the 'min element' algorithm). According to the papers cited, though, taking the count of leading zeroes is more space efficient, allowing you to fit more buckets in the same amount of space.
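For comparison, a crude sketch of the leading-zeroes idea (LogLog-style; the bucket count, helper names and constant are illustrative, not taken from the post or the papers), mainly to show why each bucket only needs a few bits:

    # Crude LogLog-style estimate: each bucket stores only the longest run of
    # leading zeroes it has seen, which fits in ~6 bits, rather than a whole hash.
    import hashlib

    def hash64(obj):
        digest = hashlib.sha1(str(obj).encode('utf-8')).digest()
        return int.from_bytes(digest[:8], 'big')

    def loglog_estimate(objects, bucket_bits=10):
        num_buckets = 1 << bucket_bits
        rest_bits = 64 - bucket_bits
        max_run = [0] * num_buckets
        for obj in objects:
            h = hash64(obj)
            bucket = h >> rest_bits                  # leading bits pick a bucket
            rest = h & ((1 << rest_bits) - 1)        # remaining bits
            run = rest_bits - rest.bit_length() + 1  # position of the first 1-bit
            max_run[bucket] = max(max_run[bucket], run)
        alpha = 0.39701  # approximate LogLog bias-correction constant
        mean_run = sum(max_run) / num_buckets
        return int(alpha * num_buckets * 2 ** mean_run)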
The hashing isn't the point. The leading-zeroes method is more likely to produce unreliable results when you can't guarantee in advance that the dataset will be large, whilst the "keep the min M elements" method works fine for unexpectedly small sets. That's why it's used for SZL, where the user can request approximate unique counts on any data, regardless of size.
Have you read the papers I linked in detail? Some of them, such as HyperLogLog, provide corrections to give better estimates for small sets, and although I can't follow the proof in its entirety, they claim to be more efficient than the alternatives, including the one you propose.
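For what it's worth, the small-set correction in the HyperLogLog paper is itself simple; roughly (a sketch assuming the same per-bucket layout as above, with the 2.5*m threshold as given in the paper):

    # Small-range correction (sketch): if the raw estimate is small and some
    # buckets were never touched, fall back to linear counting over the
    # empty buckets.
    import math

    def corrected_estimate(raw_estimate, max_run):
        # max_run: per-bucket longest leading-zero runs (0 = bucket never hit)
        m = len(max_run)
        if raw_estimate <= 2.5 * m:
            empty = max_run.count(0)
            if empty > 0:
                return m * math.log(m / empty)
        return raw_estimate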
[1] My naive Python implementation for tests on the NLP WSJ corpus -- https://github.com/Smerity/Snippets/blob/master/analytics/ap...
[2] Szl's implementation -- http://code.google.com/p/szl/source/browse/trunk/src/emitter...