HN actively strips certain Unicode ranges to prevent emoji abuse and zalgo text. Other sites and apps do it too nowadays, just not with emoji. It's the other side of your point: Unicode can do too much, and that kind of styled text isn't regular text, so you can't search within that sort of bold, validate it, etc. So people choose to work with a subset, which may still leak: https://news.ycombinator.com/item?id=42231608
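
To make the search problem concrete, here's a rough Python sketch. It is not HN's actual filter (I don't know how that is implemented); `fold_for_search` and `limit_combining` are names I made up for illustration. It shows that "math bold" letters don't match a plain substring search until you NFKC-normalize, and one possible way to cap stacked combining marks of the zalgo kind.

```python
import unicodedata

# "bold" written with MATHEMATICAL BOLD SMALL letters (U+1D41B, U+1D428, U+1D425, U+1D41D)
BOLD = "\U0001D41B\U0001D428\U0001D425\U0001D41D"


def fold_for_search(text: str) -> str:
    """Map compatibility characters (e.g. math bold) to plain ones, then casefold."""
    return unicodedata.normalize("NFKC", text).casefold()


def limit_combining(text: str, max_marks: int = 2) -> str:
    """Keep at most max_marks consecutive nonspacing/enclosing marks (hypothetical policy)."""
    out = []
    run = 0
    for ch in text:
        if unicodedata.category(ch) in ("Mn", "Me"):
            run += 1
            if run > max_marks:
                continue  # drop the excess stacked marks
        else:
            run = 0
        out.append(ch)
    return "".join(out)


print("bold" in BOLD)                   # plain search misses the styled text
print("bold" in fold_for_search(BOLD))  # matches after normalization
print(len(limit_combining("a" + "\u0301" * 10)))
```

Note that this only covers two cases. A real filter would still have to decide what to do with emoji, lookalike characters from other scripts, and so on, which is where the "subset that may still leak" comes from.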