I was interning at Rackspace this summer. For a while we were working on an email notification sender, and we wanted email clients to group messages related to the same event into a single conversation. However, different events might share the same message subject, so we wanted a way to tell clients how messages should be grouped. So we looked into the "In-Reply-To" (RFC 822) and "References" (RFC 2822) fields. We ended up implementing RFC 2822, since it obsoletes RFC 822, and we figured that if we wanted our message grouping to work on most email clients, the safest way was to use the up-to-date standard.
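As a sketch of the header wiring (using Python's standard library here, not our actual sender code), a follow-up message just points back at the first message's Message-ID:

    from email.message import EmailMessage
    from email.utils import make_msgid

    # Illustrative sketch of RFC 2822 threading headers, not our sender code.
    event_msgid = make_msgid()                # e.g. "<...@host>"

    first = EmailMessage()
    first["Message-ID"] = event_msgid
    first["Subject"] = "Alert: disk usage on web-01"
    first.set_content("First notification for this event.")

    followup = EmailMessage()
    followup["Message-ID"] = make_msgid()
    followup["Subject"] = "Alert: disk usage on web-01"   # same subject, same event
    # In-Reply-To names the direct parent; References carries the whole
    # ancestor chain (here just the first message's ID).
    followup["In-Reply-To"] = event_msgid
    followup["References"] = event_msgid
    followup.set_content("Follow-up notification for the same event.")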
An interesting fact: among the three clients we tested, only mutt faithfully implemented the standard. It grouped all messages that referenced the same ID under the same parent, regardless of subject or sending time. Neither Gmail nor Outlook respects the "References" field.
In Gmail, it seems the message subject plus either the send time or the "References" field is used for grouping. But it certainly doesn't rely exclusively on "References", since we saw messages referencing the same parent message grouped into different conversations.
In Outlook, the "References" field is ignored completely; only the message subject is used. We saw messages for different events, sent more than ten days apart, grouped into the same conversation.
Thanks! That's an interesting read. We had actually wondered what the maximum number of items in the References field was, but couldn't find an answer anywhere. This RFC does give one[1]!
People who haven't taken the time to work with mutt sometimes fail to understand how convenient and user-friendly it is.
Pretty colors and nice interfaces do not always increase productivity. Sometimes displaying threads properly is all that's needed.
Also, the "mark thread as read" feature in mutt (^R) is a life saver when sorting through dozens of discussions.
I have taken the time to try to work with Mutt, and some other console email tools. I would love to be using something like that full-time, since I practically live in terminal windows.
But ultimately I can't; I don't need pretty colors or graphical tricks. I do need something which understands that it's not 1975 anymore. My mail no longer lives in a local spool file, and there is no longer a local sendmail. I have multiple email addresses, which use IMAP, and remote SMTP.
And none of the console clients -- including Mutt -- can really do that. Every couple years I try Mutt again, and try whatever the current crop of attempted successors are, and get thrown right back into 1975 again and give up in frustration.
(and Mutt, last I checked, actually considers it both a feature and a point of philosophical purity/pride to refuse to acknowledge the fact that anything other than "shell out to local sendmail" exists. Also, IMAP was unbearably slow, usually requiring a full re-fetch/re-index of the entire remote inbox, potentially hundreds of thousands of messages, every time Mutt started up, and multiple IMAP accounts were an unholy mess)
While I find working with multiple accounts awkward, Mutt has no problem submitting mail via SMTP; and once I enabled header & body caching, IMAP accounts became perfectly fast to use.
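For anyone curious, the relevant settings are just a few muttrc lines; the hostnames, user names, and paths below are placeholders:

    # Illustrative muttrc; hostnames, user names, and paths are placeholders.
    set folder    = "imaps://mail.example.com/"          # remote IMAP folders
    set imap_user = "me@example.com"
    set smtp_url  = "smtp://me@mail.example.com:587/"    # SMTP submission, no local sendmail
    set header_cache     = ~/.cache/mutt/headers         # persistent header cache
    set message_cachedir = ~/.cache/mutt/bodies          # cache fetched message bodies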
As far as console clients go, Mew does very well with IMAP, in fact more reliably than most of the GUI clients I've tried. It happily pulled down multi-GB Gmail mailboxes and kept them in sync without any issues, has sane disconnected-mode behavior, handled network failures well, etc. It's implemented inside Emacs, but you don't have to be a heavy Emacs user to use it, and one of the vi-emulation modes might help if you aren't. Built-in ssh tunneling; good support for MIME, PGP, and S/MIME; and pluggable search options were what convinced me. It depends on stunnel for SSL/TLS, but I'm happy they're not reimplementing that themselves. The manual could use some work, but it's relatively extensive.
Like you, I have several different e-mail accounts (all "remote") and send through a number of different mail servers (depending on the account, of course). I use mutt 99% of the time, on my laptop, wherever I may be (home, friends' homes, customer sites, public hotspots, etc.). I couldn't (easily) do it with mutt alone, but with the help of a few external tools (offlineimap, msmtp, and notmuch) everything works day to day with no hoops to jump through.
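The glue amounts to a few lines of muttrc (paths here are placeholders): offlineimap mirrors the remote accounts into local maildirs, and outgoing mail is handed to msmtp, which routes it to the right SMTP server per account.

    # Illustrative muttrc glue; paths are placeholders.
    set mbox_type = Maildir
    set folder    = ~/Maildir        # where offlineimap syncs the accounts
    set spoolfile = +INBOX
    set sendmail  = /usr/bin/msmtp   # msmtp handles the per-account SMTP routing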
My personal experience is that I like how mutt looks, and that I can deal with email from the CLI. The reason I'm not using it as my main email client (yet) is that I can't remember all the shortcuts, and I don't know exactly what each keystroke does to my emails and my email server. That can all be fixed by reading the docs and practicing, though. I guess it's the same as vim: once you're used to it, it becomes your favorite.
And like vim, you will probably never use, nor need, all the shortcuts available. I use about a dozen keys in mutt, some of which are custom bindings I use to file messages into separate maildirs in one keystroke.
All my emails go into my Inbox (except for bugzilla stuff). I go through my Inbox several times a day, and as I read messages, I press 'o', 'n', 'e', or 'a' to file the message into one of the four folders I use most. The rest I sort manually with the 's' key.
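In muttrc terms, those one-key bindings are just macros along these lines (the folder names here are made up):

    # Hypothetical one-keystroke filing macros; folder names are made up.
    macro index o "<save-message>=online<enter>" "file into =online"
    macro index n "<save-message>=news<enter>"   "file into =news"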
This way, my Inbox is always empty. Or if it's not, I know it's a message I need to get back to later on. There's usually no more than a handful of those.
Granted, you can do that with any mail client. But mutt is particularly well suited for the job.
1. I'm impressed by the amount of analysis and the clarity of thought that went into designing this algorithm. It's not just something you can sit down at the keyboard and pound out.
2. This is a great example of the perils of re-writing code that you don't completely understand:
4.0 eliminated the "dummy thread parent" step, which is an absolute necessity to get threading right in the case where you don't have every message (e.g., because one has expired, or was never sent to you at all). The best explanation I was able to get from them for why they did this was, "it looked ugly and I didn't understand why it was there."
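To make the point concrete, here's a minimal sketch (mine, emphatically not jwz's code) of container threading with dummy parents:

    # Minimal sketch (not jwz's code) of container threading with dummy
    # parents. messages: iterable of (msg_id, references) pairs, where
    # references lists ancestor IDs oldest-first.
    class Container:
        def __init__(self):
            self.message = None    # stays None for a dummy (never-seen) parent
            self.parent = None
            self.children = []

    def thread(messages):
        by_id = {}

        def container(mid):
            return by_id.setdefault(mid, Container())

        for mid, refs in messages:
            c = container(mid)
            c.message = mid
            prev = None
            for ref in refs:
                rc = container(ref)  # creates a dummy if we never saw this ID
                if prev is not None and rc.parent is None and rc is not prev:
                    rc.parent = prev
                    prev.children.append(rc)
                prev = rc
            if prev is not None and c is not prev and c.parent is None:
                c.parent = prev
                prev.children.append(c)

        return [c for c in by_id.values() if c.parent is None]  # thread roots

    # "b@x" and "c@x" both reference the missing "a@x"; the dummy container
    # for "a@x" ties them into one thread. Drop the dummy step and you'd get
    # two unrelated threads instead.
    roots = thread([("b@x", ["a@x"]), ("c@x", ["a@x"])])
    assert len(roots) == 1 and roots[0].message is None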
Re (2): This is also a great example of the perils of writing a complex algorithm and failing to provide overview documentation (like this article!), not just inline comments.
However, in the absence of an overview, if the code that you don't understand was written by jwz, you might want to study it very hard before removing it. :)
> If the code works you'd damn well better understand it completely before rewriting
But then, one of the primary reasons that usually motivates a rewrite is that nobody understands the old code (sometimes acknowledged as such, and other times indirectly, in the form of "every time we try to fix a bug we break something else; this code is terrible").
Re-writing code that nobody understands and hoping that it will work correctly is just wishful thinking. If people really wanted to understand the code, they could do the hard work of reading it, tracing it, writing tests for it, etc. In many cases, you can transform the code into something readable by applying a long series of simple refactorings.
For some code, breaking behavior you don't understand doesn't make a difference. If you're Facebook, you can arbitrarily change the user interface of your site and your users have no say in the matter, since they're not paying customers. But other developers don't have it that easy. If you support code that people have built their own applications on top of, you can't just break stuff. If you make backward incompatible changes to the Linux kernel APIs, or break Microsoft Excel so that macros that have been working for years stop working, people all over the world will be very unhappy.
If you have a well defined spec and a reasonable test suite, you can throw away the code nobody understands and still have the replacement code work correctly.
The chances of having a well defined spec and reasonable test suite in a place where there's code nobody understands are left as a calculation exercise for the reader (but I'd start at 5% and work downwards.)
It's not likely you'll ever find an accurate spec for a piece of legacy software that's been around for years. Even if there was a spec that completely defined the original behavior (which is doubtful), that spec probably won't reflect all the new features and other changes that were added over the years. You'd have to merge the original product spec with all the new feature and change specs and hope that you didn't miss anything. (In many cases, it takes a bug report to realize that an item in the spec was defined incorrectly, incompletely or ambiguously.)
In practice, I think that the only "spec" that's likely to capture the exact behavior of the code as it exists today is the code itself.
Also, the lack of a complete spec probably implies a lack of a complete acceptance test suite. Note that I said "acceptance test", not "unit test". Unit tests from the original code are useless for testing the re-written code, since they're specific to the particular implementation of the product you already have, which may have a completely different set of classes from the one you'll be replacing it with.
The way this should have been done is there should have been a unit test that tested the correctness of whatever the "dummy thread parent" was supposed to achieve. With such a test in hand, the new implementation would have been obviously deficient.
Of course, in accordance with another of jwz's laws, the CADT model of software development, test suites are often discarded along with everything else when some attention-deficit teenager decides to rewrite everything from scratch.
1. Got tired after working on it for a year, mostly on my own. I felt the need to join a team doing something bigger.
2. Ran out of money
3. Wanted to relocate to Silicon Valley, but as a foreigner it would have been too complicated to move with my startup.
4. The project was too ambitious. A full blown email client is hard to write.
5. I was a Windows guy at that time, but all potential early adopters were OS X users. Though we did have OS X support, the app wasn't as nice or polished as it should have been.
It's interesting because we started working on this at roughly the same time as the Sparrow team. In the end, they released way before us because they focused on a narrower niche (a simple Gmail client for OS X, instead of the cross-platform email + todo list manager we were building). They won :)
Very insightful article. I do wonder, though, whether "say no to databases" still stands today. I agree that performance-wise files are hard to beat for most problems, but we store data in databases because they provide guarantees a filesystem doesn't, ease deployment and configuration, etc.
The problem with Netscape 4 was that it introduced a database that was (in theory) human-readable, but poorly specified, buggy, and inconsistent. If you are going to change, things should get better on at least one axis, and preferably regress on none of the others. https://en.wikipedia.org/wiki/Mork_(file_format)
It is also worth pointing out that this era predated SQLite.
There's been a movement away from flat files even (perhaps, especially) among the kinds of people who like old-school Unix CLI and console tools for mail. There hasn't been a lot of movement in tools that directly read a Maildir or mbox, partly because searching is painfully slow. Instead, people are now building on top of things like notmuch: http://notmuchmail.org/
I don't read it as "say no to databases," but "stop putting square pegs in round holes."
Too many programmers find a Next Greatest Thing and then try to reconfigure problems to fit the newfound solution. Too many managers "need" solutions that are buzzword-compliant. Combine these and you get projects that are "written in C++ and use databases" and are thus a success regardless of whether they work.
It stands up well for in-memory tasks where doing the simplest-thing-that-could-possibly-work is too slow.
The union of those two things is probably not a big component of most people's daily work though. Computers are fast now, you can get away with a lot of naive code.
Twitter has an explicit "in_reply_to" field which references the parent tweet ID. Since they've seized control of most clients (and most clients implement the correct parameters), this has wildly increased the coherence of Twitter threads, at least in comparison to a few years back.
Apologies, I made a poorly directed joke at the new UI of twitter. I'm honestly not against it, personally, but I know it has garnered the ire of quite a few of my friends.