> This meant some customers received emails informing them their new premium was now $2700 instead of $27.00.
there's a secondary issue here, why in the world would you auto split a monetary value across a numeric decimal indicator? why would you split lines at all for this use case?
As mentioned, the SMTP protocol only allows for 1000 bytes of data per line. The author also mentions that they are sending html emails, which ignore line breaks.
So a message intended to be sent by an SMTP client:
DATA
Hello customer,<br>[978 characters] 27.00
Was erroneously formatted into:
DATA
Hello customer,<br>[978 characters] 27
.00
.
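The period after 27, which now starts its own line, will be removed: the receiving server treats a leading period as dot-stuffing and strips it. And this is how the html will be rendered.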
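To make the mechanism concrete, here is a minimal Python sketch of SMTP dot-stuffing. The helper names `dot_stuff` and `dot_unstuff` are made up for illustration and are not taken from the article's client; the sketch only assumes the standard rule that a sender doubles a leading period and the receiver strips one.

```python
def dot_stuff(lines):
    """Sender side: double a leading '.' so it cannot be read as end-of-data."""
    return ["." + line if line.startswith(".") else line for line in lines]


def dot_unstuff(lines):
    """Receiver side: strip one leading '.' from every line that has one."""
    return [line[1:] if line.startswith(".") else line for line in lines]


# Hypothetical body, already split so that the period starts the second line.
body = ["Hello customer,<br>... 27", ".00"]

# A client that forgets to stuff: the receiver strips the real period.
buggy = dot_unstuff(body)               # second line becomes "00"

# A client that stuffs first: the round trip preserves the period.
correct = dot_unstuff(dot_stuff(body))  # second line stays ".00"
```

The bug in the example above is the `buggy` path: the split itself is legal, but sending the `.00` line without doubling its period loses the decimal point.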
but html does not ignore line breaks. when part of body text, a run of whitespace (including newline) becomes a single whitespace when rendered.
so splitting 27.00 on the . becomes 27 00, because the CRLF is significant to the client.
you would want to split at whitespace, not at any other character -- unless you had a 999+ string of non-whitespace of course.
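a rough sketch of what splitting at whitespace could look like, in python. `wrap_at_whitespace` and `collapse_whitespace` are hypothetical helpers, the 998 limit assumes 1000 octets minus the CRLF, and the regex is only an approximation of how html body text folds whitespace.

```python
import re

MAX_CONTENT = 998  # assumed: 1000 octets per line minus the trailing CRLF


def wrap_at_whitespace(text, limit=MAX_CONTENT):
    """Break text into lines of at most `limit` characters, preferring spaces."""
    lines = []
    while len(text) > limit:
        cut = text.rfind(" ", 0, limit + 1)  # last space that keeps the line in bounds
        if cut <= 0:
            # No usable space: fall back to a hard split (dot-stuffing still required).
            lines.append(text[:limit])
            text = text[limit:]
        else:
            lines.append(text[:cut])  # the space itself is replaced by the line break
            text = text[cut + 1:]
    lines.append(text)
    return lines


def collapse_whitespace(s):
    """Approximate html rendering: any run of whitespace becomes a single space."""
    return re.sub(r"\s+", " ", s)


text = "Hello customer,<br>" + "x" * 975 + " your premium is $27.00 today"
lines = wrap_at_whitespace(text)
rendered = collapse_whitespace("\r\n".join(lines))
```

since each break replaces a space, the CRLF folds back into that same space on rendering and `$27.00` stays intact. the hard-split fallback can still put a period at the start of a line, so this does not replace dot-stuffing.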
perhaps the author didn't know or didn't realize or thought it insignificant to his point that in addition there was a quoted-printable encoding, in which case i believe the trailing/mandatory CRLF can be made non significant for client rendering. personally i still would have split on actual whitespace. (well, i wouldn't have written an smtp client in the first place.)
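for the quoted-printable point, here is a small sketch using python's standard `quopri` module. it only illustrates soft line breaks in general; nothing here is known about what the article's client actually did, and the sample body is made up.

```python
import quopri

# Hypothetical one-line html body, longer than a quoted-printable line allows.
html = ("Hello customer,<br>" + "x" * 100 + " your new premium is $27.00").encode("ascii")

# Encoding inserts soft line breaks (a trailing '=') to keep lines short.
encoded = quopri.encodestring(html)

# Decoding removes the soft breaks again, so no whitespace reaches the renderer.
decoded = quopri.decodestring(encoded)
```

the soft breaks exist only on the wire: after decoding, the body is byte-for-byte the original, so a break can fall anywhere (even inside `27.00`) without showing up in the rendered text.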
Hmmmm, html doesn't ignore line breaks, it just treats them as any other whitespace, where a consecutive sequence is folded into a single space. 27 00 would still be quite confusing, of course
I think GP was using the phrase "auto split ... across [character]" in reference to characters that can cause line breaks for "word wrap" purposes in page layouts. For example, a normal space is a character that causes line breaks, but a non-breaking space (nbsp) is not. A hyphen, a tab, a zero-width space (zwsp), and several other characters are also generally used for line breaking. I think GP is saying that the decimal indicator -- the fourth character of "$27.00" -- should not be used for breaking. I think GP assumes that the problematic line breaking in TFA is akin to the type of "word wrap" page layout logic I've just explained; in reality the line breaking in TFA has nothing to do with that, it's simply breaking at 1000 octets (probably for reasons of buffer size, certainly not page layout) regardless of what character is in that position, so this whole thing is moot. GP needs to RTFA!
If the period was the 999th character in the line, it would split it to the next line since the maximum line length in SMTP is 1000 characters including CRLF.
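A quick Python sketch of that case, assuming a client that blindly cuts every 998 characters (1000 minus the CRLF). `naive_split` and the filler text are hypothetical and only meant to reproduce the position arithmetic.

```python
LIMIT = 998  # assumed: 1000-octet line limit minus the trailing CRLF


def naive_split(text, limit=LIMIT):
    """Cut the text every `limit` characters, regardless of what sits at the cut."""
    return [text[i:i + limit] for i in range(0, len(text), limit)]


prefix = "Hello customer,<br>"
# Pad so that " 27" ends exactly at character 998 and the period is character 999.
filler = "x" * (LIMIT - len(prefix) - len(" 27"))
text = prefix + filler + " 27.00"

lines = naive_split(text)
# lines[0] ends with "27" and lines[1] is ".00", a line that starts with a period.
```

That second line is exactly the kind that needs its leading period doubled before it goes out over SMTP.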