Assuming your keys are short strings, 1M elements is 32MB plus btree overhead. If your keys are longer strings (URLs are typically too long to benefit from the short string optimization), 1M elements is still 32MB, but now each element is also a pointer to somewhere else in memory with more data at the other end. So there's that; probably more than just 1-2 cache misses.
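For the curious: the 32 bytes/key figure matches sizeof(std::string) on a typical 64-bit libstdc++ build, which keeps roughly 15 characters inline via the short string optimization (libc++ is 24 bytes with ~22 inline). A quick, purely illustrative back-of-the-envelope check:

```cpp
#include <iostream>
#include <string>

int main() {
    // Typical 64-bit libstdc++: sizeof(std::string) == 32, with ~15 chars of
    // inline (SSO) capacity. Longer keys, e.g. most URLs, spill to a separate
    // heap allocation that the btree lookup then has to chase.
    std::cout << "sizeof(std::string) = " << sizeof(std::string) << " bytes\n";
    std::cout << "1M keys             = "
              << (1'000'000 * sizeof(std::string)) / (1024 * 1024)
              << " MiB of string headers, before node overhead and spilled data\n";
}
```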
That being said, in the case of just 1M elements, a Bloom filter won't buy you much. For a Bloom filter to be truly effective, you need all of the following:
1. Many queries come back negative; that is, you spend a lot of time searching for keys that are not in the DB. This is not the common case: when a typical web app searches for a _key_, the key is usually in the DB (as opposed to "find all sales on November 7th" or whatever, which Bloom filters can't help with anyway).
2. The actual query is disproportionately expensive. Unless a Bloom filter can save you a disk/network call, it's generally not worth the extra overhead.
3. You rarely/never remove items, and your data grows slowly.
Many problems that can be solved with a Bloom filter can also be solved by simply buying more RAM. Many problems that can be solved with a Bloom filter can also be solved by using a faster lookup. (everybody overlooks hash tables. I don't know why.) Bloom filters are _not_ a general purpose algorithm. You really need to grok your data before pulling the trigger on a Bloom filter. (don't forget to benchmark)
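For concreteness, this is roughly the data structure being discussed (a minimal sketch; the class name, double-hashing scheme, and parameter choices are mine, not from any particular library):

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Minimal Bloom filter sketch: an m-bit array, k hash probes per operation.
class BloomFilter {
public:
    BloomFilter(std::size_t bits, unsigned k) : bits_(bits, false), k_(k) {}

    void insert(const std::string& key) {
        for (unsigned i = 0; i < k_; ++i) bits_[index(key, i)] = true;
    }

    // "false" means definitely absent; "true" means probably present.
    bool maybe_contains(const std::string& key) const {
        for (unsigned i = 0; i < k_; ++i)
            if (!bits_[index(key, i)]) return false;
        return true;
    }

private:
    // Derive k indices from two hash values (double hashing); fine for a sketch.
    std::size_t index(const std::string& key, unsigned i) const {
        std::size_t h1 = std::hash<std::string>{}(key);
        std::size_t h2 = std::hash<std::string>{}(key + '\x01');
        return (h1 + i * h2) % bits_.size();
    }

    std::vector<bool> bits_;
    unsigned k_;
};
```

The only win is that a false from maybe_contains() lets you skip the expensive lookup entirely; also note that each of the k probes can be its own cache miss once the bit array is tens of MB.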
> (everybody overlooks hash tables. I don't know why.)
It is probably worse than you think, since a hash table can also replace the Bloom filter in its "false positives allowed" filtration role. Not sure if that's what you meant.
If all you save is the "fingerprint" (i.e., a B-bit truncated hash code of the key), then you get a very similar probabilistic data structure. Further, if you put these small integer fingerprints in an open-addressed hash table with linear probing, you get a (very likely) single-cache-miss test for membership. This alternate structure was, in fact, also analyzed by Burton Bloom, in the very same paper in which he introduced his famous filter. He found fingerprint filters to be slower only on a very unrealistic memory system with neither cache nor "word fetch" structure, where access to every bit had unit cost no matter its location. If you can grab a big batch of bits at unit cost, the analysis changes a lot. Modern systems grab 512..1024 bits at the same cost as 1...
This fingerprint filter alternative does take 2..4x more space, depending on scale and the allowed false positive rate, but that space is organized better in terms of caches. Most would choose to spend that space to save k-1 cache misses of time in today's era of 60-120 ns memory and multi-GHz CPUs. It is slightly easier to implement a large bitvector than an array of B-bit numbers, depending on B. { It's pretty trivial for B=8*m, though. ;-) } It's kind of crazy that hash tables for this purpose fell into "also-ran/obscure" status...
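Something like the following is what I have in mind (a rough sketch, not Bloom's original formulation; the 16-bit fingerprint width, names, and the assumption that the load factor stays well below 1 are all mine):

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Fingerprint filter sketch: store only a B-bit hash of each key in an
// open-addressed table with linear probing. Assumes an empty slot always
// exists so probe loops terminate.
class FingerprintFilter {
public:
    explicit FingerprintFilter(std::size_t slots) : table_(slots, kEmpty) {}

    void insert(const std::string& key) {
        std::uint16_t fp = fingerprint(key);
        std::size_t i = home(key);
        while (table_[i] != kEmpty && table_[i] != fp)   // linear probing
            i = (i + 1) % table_.size();
        table_[i] = fp;
    }

    // Bloom-filter-like semantics: false positives possible, false negatives not.
    // A query scans one contiguous run of 2-byte slots, so it usually costs a
    // single cache miss.
    bool maybe_contains(const std::string& key) const {
        std::uint16_t fp = fingerprint(key);
        for (std::size_t i = home(key); table_[i] != kEmpty; i = (i + 1) % table_.size())
            if (table_[i] == fp) return true;
        return false;
    }

private:
    static constexpr std::uint16_t kEmpty = 0;

    // B = 16 here; home() uses the low bits of the hash, the fingerprint the
    // high bits, so the two are (mostly) independent.
    static std::uint16_t fingerprint(const std::string& key) {
        auto fp = static_cast<std::uint16_t>(std::hash<std::string>{}(key) >> 48);
        return fp == kEmpty ? std::uint16_t{1} : fp;   // reserve 0 for "empty"
    }

    std::size_t home(const std::string& key) const {
        return std::hash<std::string>{}(key) % table_.size();
    }

    std::vector<std::uint16_t> table_;
};
```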
If you switch from linear probing to cuckoo hashing, you end up with a cuckoo filter, which is more compact than a Bloom filter for a given false-positive rate. The compactness comes from the fact that cuckoo hashing allows load factors of over 90%, which are impractical with linear probing.
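The skeleton looks roughly like this (a sketch with arbitrary illustrative parameters: 4-slot buckets, 16-bit fingerprints, a fixed kick limit; not the reference implementation). The partial-key trick is that the alternate bucket is computed from the current bucket and the fingerprint alone, so residents can be kicked around without their original keys:

```cpp
#include <cstddef>
#include <cstdint>
#include <functional>
#include <random>
#include <string>
#include <utility>
#include <vector>

class CuckooFilter {
public:
    // num_buckets must be a power of two for the XOR trick below to work.
    explicit CuckooFilter(std::size_t num_buckets) : buckets_(num_buckets) {}

    bool insert(const std::string& key) {
        auto [i1, fp] = index_and_fp(key);
        std::size_t i2 = alt_bucket(i1, fp);
        if (place(i1, fp) || place(i2, fp)) return true;

        // Both buckets full: evict a random resident to its alternate bucket.
        std::size_t i = (rng_() & 1) ? i1 : i2;
        for (int kicks = 0; kicks < kMaxKicks; ++kicks) {
            std::swap(fp, buckets_[i].slots[rng_() % kSlotsPerBucket]);
            i = alt_bucket(i, fp);
            if (place(i, fp)) return true;
        }
        return false;  // full; a real implementation would stash the evicted fp as a "victim"
    }

    bool maybe_contains(const std::string& key) const {
        auto [i1, fp] = index_and_fp(key);
        return has(i1, fp) || has(alt_bucket(i1, fp), fp);
    }

private:
    static constexpr int kSlotsPerBucket = 4;
    static constexpr int kMaxKicks = 500;
    static constexpr std::uint16_t kEmpty = 0;

    struct Bucket { std::uint16_t slots[kSlotsPerBucket] = {}; };

    std::pair<std::size_t, std::uint16_t> index_and_fp(const std::string& key) const {
        std::size_t h = std::hash<std::string>{}(key);
        auto fp = static_cast<std::uint16_t>(h >> 48);
        if (fp == kEmpty) fp = 1;                      // reserve 0 for empty slots
        return {h & (buckets_.size() - 1), fp};
    }

    // Depends only on the current bucket and the fingerprint (an involution when
    // the bucket count is a power of two), which is what makes eviction possible.
    std::size_t alt_bucket(std::size_t i, std::uint16_t fp) const {
        return (i ^ std::hash<std::uint16_t>{}(fp)) & (buckets_.size() - 1);
    }

    bool place(std::size_t i, std::uint16_t fp) {
        for (auto& s : buckets_[i].slots)
            if (s == kEmpty) { s = fp; return true; }
        return false;
    }

    bool has(std::size_t i, std::uint16_t fp) const {
        for (auto s : buckets_[i].slots)
            if (s == fp) return true;
        return false;
    }

    std::vector<Bucket> buckets_;
    std::minstd_rand rng_;
};
```

A lookup still touches at most two buckets (two potential cache misses), and previously inserted keys can be deleted by removing a matching fingerprint, which a plain Bloom filter can't do.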
Well, sure. You can also switch to Robin Hood linear probing to work well at high load factors and retain the "almost never more than 1 DRAM hit" property. These are all just different hash table options for clawing back some of the space overhead via higher load factors.