Dumb question, as I haven't used the feature: why is dedupe flawed? Is it because it requires "enormous" amounts of RAM? Does it eventually slow down writes?
Dedup on ZFS is problematic because ZFS, in exchange for some of its core features, promises that a block's location on disk is immutable once it has been written.
So the only place you can do dedup is inline, as the data is being written for the first time, not after the fact.
In addition, this requires you to keep a huge indirection table (the DDT, or dedup table) that has to be consulted on every write, so either it lives in memory or on fast storage, or you've just turned every write into one or more random reads plus the write itself.
This also means that even if you turn dedup off after turning it on, the performance implications remain until the DDT no longer contains any blocks (i.e. until you've rewritten or deleted all the data that was written while dedup was on).
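To make that concrete, here's a toy sketch in Python of what an inline, DDT-style write path looks like. The names and structures are made up for illustration (the real DDT is an on-disk structure keyed by checksum, not a dict in RAM); the point is just that every write has to consult the table first, and that an entry only goes away when the last block referencing it does:

    import hashlib
    from dataclasses import dataclass

    @dataclass
    class DDTEntry:
        block_addr: int   # where the single physical copy lives
        refcount: int     # how many logical blocks point at it

    ddt = {}              # checksum -> DDTEntry; in practice this gets huge
    next_addr = 0

    def write_block(data: bytes) -> int:
        """Every write has to look up its checksum in the DDT before it completes."""
        global next_addr
        key = hashlib.sha256(data).digest()
        entry = ddt.get(key)            # a random read if the DDT isn't cached
        if entry is not None:           # duplicate: bump the refcount, write nothing new
            entry.refcount += 1
            return entry.block_addr
        addr = next_addr                # unique: allocate a fresh block and record it
        next_addr += 1
        ddt[key] = DDTEntry(block_addr=addr, refcount=1)
        return addr

    def free_block(data: bytes) -> None:
        """A DDT entry only disappears when its last reference is freed, which is
        why the overhead lingers even after the dedup property is switched off."""
        key = hashlib.sha256(data).digest()
        entry = ddt[key]
        entry.refcount -= 1
        if entry.refcount == 0:
            del ddt[key]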
There are feature proposals to make the performance of dedup less pathological, but nobody has carried them through so far. (Someone even did a proof-of-concept implementation of one of them, but it still hasn't been finished and integrated.)
It's not just the RAM issue; it's that dedupe is only attainable with the online (inline) DDT method. There is no offline deduping at the block level, and there are related annoyances like no COW between filesystems.
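To put a rough number on the "enormous amounts of RAM" part of the question: the commonly quoted ballpark is a few hundred bytes of DDT per unique block. Here's a back-of-the-envelope sketch in Python, where the 320-bytes-per-entry figure and the 128K recordsize are assumed round numbers rather than anything exact:

    def ddt_ram_estimate(pool_bytes, recordsize=128 * 1024, bytes_per_entry=320):
        # Worst case: nothing actually dedupes, so every block gets its own DDT entry.
        unique_blocks = pool_bytes / recordsize
        return unique_blocks * bytes_per_entry

    # 10 TiB of 128 KiB records works out to roughly 25 GiB of DDT to keep hot.
    print(ddt_ram_estimate(10 * 2**40) / 2**30, "GiB")

Smaller blocks make it proportionally worse, which is where dedup's reputation for eating RAM comes from.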