I'm surprised that this is still such a big issue. I'd expect that, with the Unicode standards, this would be handled well out of the box.
Ages ago, in 2005 or thereabouts, I was working on a website for a Jewish organisation. I don't know Hebrew, but Hebrew has many of these same issues, including the right-to-left direction, and I was quite surprised to see that our own CMS, based on Java, with tons of XML pipelines (Cocoon! XSLT!) generating HTML to be viewed in a browser, just handled this correctly without any problems.
At least the text was right-to-left, which was the only thing I knew, but the customer presumably knew more and they were happy.
Though selecting text in a mixed left-to-right and right-to-left piece of text looks really weird. Not sure what we did with alignment there, but I vaguely recall that everything may have been centered. An ugly compromise perhaps, but centering text was still popular at the time.
For the most part, these problems only pop up when you render RTL text visually and then copy it back out of UI elements. In internal pipelines, however, it's just another string of opaque bytes, stored in logical order, that you perform a few very well-specified standard library operations on.
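A rough sketch of what that looks like in practice, using only Python's standard library. The sample string is a made-up example (Latin letters followed by the Hebrew word "shalom"), and the variable names are just for illustration:

```python
import unicodedata

# Hypothetical sample: "abc", a space, then the Hebrew word "shalom"
# written as escapes so the source file itself contains no RTL text.
text = "abc \u05e9\u05dc\u05d5\u05dd"

# Strings are stored in logical (reading) order, so the first Hebrew
# letter, SHIN, sits at index 4 no matter where a renderer draws it.
first_hebrew = text[4]
first_hebrew_name = unicodedata.name(first_hebrew)

# Each code point carries a bidi class that a renderer consults later:
# "L" for the Latin letters, "WS" for the space, "R" for the Hebrew ones.
bidi_classes = [unicodedata.bidirectional(ch) for ch in text]

# Ordinary string operations never look at direction at all.
words = text.split()
encoded = text.encode("utf-8")
round_trip = encoded.decode("utf-8")

print(first_hebrew_name)
print(bidi_classes)
print(len(encoded), round_trip == text)
```

Nothing here reorders anything. The reordering only happens when a renderer applies the bidirectional algorithm for display, which is why a pipeline that never draws the text tends not to notice.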
(Arabic also has the challenge of connecting letters, which Hebrew does not, but renderers that get RTL right usually handle the ligatures too.)
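To make the connecting-letters point a bit more concrete, here is a small sketch with Python's standard library. It assumes the legacy Arabic Presentation Forms block is a fair way to make the contextual shapes visible. Normally the renderer picks these glyphs itself and the stored text only contains the base letter:

```python
import unicodedata

# The base letter as it is normally stored in text.
beh = "\u0628"

# One of its contextual shapes, encoded separately for compatibility
# with older systems; real text should rarely contain this directly.
initial_beh = "\ufe91"

base_name = unicodedata.name(beh)
form_name = unicodedata.name(initial_beh)

# The compatibility decomposition records which shape this is and
# which base letter it belongs to.
decomposition = unicodedata.decomposition(initial_beh)

# NFKC normalisation folds the presentation form back to the base letter.
folded = unicodedata.normalize("NFKC", initial_beh)

print(base_name)
print(form_name)
print(decomposition, folded == beh)
```

So the shape is a rendering concern layered on top of the same stored letter, much like direction is.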