> What's difficult is to ensure that the converged state is renderable as richtext. For example, is there a table cell that was inserted where a column was deleted?
Yes. This is one of the fundamental limitations of working at a textual level, which is sort of the local optimum that *nix ended up in. JSON in particular gets really messed up if you don't merge/rebase carefully. Diff has no real syntax to grab onto and doesn't understand the concept of indentation or commas, so the result turns into an ocean of line-swapping and incorrect block-swapping. Diff also does an excruciatingly poor job in the very common case where everyone is appending to the same area (let's say, the end of the file).
This is pretty much just an inherent weakness of textual matching. What you need is to work on trees of lexical token nodes, or some type of object structure stream like PowerShell's. Not that it's perfect either: I'm sure there are combinations of options that can leave a node in an inconsistent state, but it at least gives you a chance at correctness.
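To make the "work on trees" point concrete, here is a minimal sketch of a three-way merge over parsed JSON objects instead of lines. Everything in it (the `merge3` name, the `_MISSING` sentinel, the path format) is made up for illustration, and it makes simplifying assumptions: lists and scalars are treated as atomic, and on a real conflict it just keeps "ours" and reports the path.

```python
_MISSING = object()  # sentinel meaning "key is absent on this side"


def merge3(base, ours, theirs, path=""):
    """Three-way merge of parsed JSON values. Returns (merged, conflict_paths)."""
    if ours == theirs:
        return ours, []      # both sides agree (or neither changed)
    if ours == base:
        return theirs, []    # only theirs changed
    if theirs == base:
        return ours, []      # only ours changed
    # Both sides changed differently: recurse only if all three are objects.
    if all(isinstance(v, dict) for v in (base, ours, theirs)):
        merged, conflicts = {}, []
        for key in dict.fromkeys([*base, *ours, *theirs]):  # ordered union of keys
            value, sub = merge3(
                base.get(key, _MISSING),
                ours.get(key, _MISSING),
                theirs.get(key, _MISSING),
                f"{path}/{key}",
            )
            conflicts.extend(sub)
            if value is not _MISSING:  # _MISSING here means the key was deleted
                merged[key] = value
        return merged, conflicts
    # Genuine conflict (lists and scalars are atomic here): keep ours, report it.
    return ours, [path or "/"]


base = {"deps": {"a": "1.0"}}
ours = {"deps": {"a": "1.0", "b": "2.0"}}    # we appended "b"
theirs = {"deps": {"a": "1.0", "c": "3.0"}}  # they appended "c"
print(merge3(base, ours, theirs))
```

Both sides "appending to the same area" merge cleanly here because the merge is keyed on object members, not on neighbouring lines. It still doesn't solve the table-cell-in-a-deleted-column kind of problem: the merge result is always valid JSON, but nothing checks that it is semantically consistent.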
In some cases patience diff can help: it anchors on lines that occur exactly once in both files, which tends to produce bigger blocks of changed ranges, hopefully with some of the hunks being relatively well-formed syntactically, but there is still no guarantee. There is also JSON Diff (jdd), which implements this kind of structural diff for JSON files, in the same spirit as the "jq" util.
https://github.com/zgrossbart/jdd
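As a rough illustration of the difference (this is not how jdd is implemented, just a sketch of the idea), the snippet below compares two JSON documents once as parsed trees and once as lines with `difflib`. The `json_diff` helper and its path/kind tuple format are invented for the example, and lists are compared position by position, which is a simplification.

```python
import difflib
import json


def json_diff(old, new, path=""):
    """Yield (path, kind, old_value, new_value) for two parsed JSON values."""
    if isinstance(old, dict) and isinstance(new, dict):
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}/{key}"
            if key not in new:
                yield (child, "removed", old[key], None)
            elif key not in old:
                yield (child, "added", None, new[key])
            else:
                yield from json_diff(old[key], new[key], child)
    elif isinstance(old, list) and isinstance(new, list):
        # Naive positional comparison; a real tool would align list items.
        for i in range(max(len(old), len(new))):
            child = f"{path}/{i}"
            if i >= len(new):
                yield (child, "removed", old[i], None)
            elif i >= len(old):
                yield (child, "added", None, new[i])
            else:
                yield from json_diff(old[i], new[i], child)
    elif old != new:
        yield (path or "/", "changed", old, new)


old_text = '{\n  "name": "demo",\n  "tags": ["a", "b"]\n}'
new_text = '{\n  "tags": ["a", "b"],\n  "name": "demo",\n  "version": 2\n}'

structural = list(json_diff(json.loads(old_text), json.loads(new_text)))
line_changes = [
    line
    for line in difflib.unified_diff(
        old_text.splitlines(), new_text.splitlines(), lineterm=""
    )
    if line[:1] in "+-" and not line.startswith(("+++", "---"))
]
print(structural)
print(line_changes)
```

The only real change is the new `version` key, and that is all the structural diff reports. The line diff also reports the reordered keys and the trailing comma that moved, which is exactly the noise that makes textual merges of JSON so painful.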
I think that's viable for other lexable languages too. It would be cool to sed or find/replace on a tokenized representation of the code, have jQuery or jpath style selectors, etc. A lot of the time we kinda end up doing this with find/replace anyway.
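Something like this is already possible for any language with an accessible lexer. As a small sketch, here is a token-aware rename for Python source built on the standard library's `tokenize` module; `rename_identifier` is a hypothetical helper, and it assumes the input tokenizes cleanly and does no scope analysis at all.

```python
import io
import tokenize


def rename_identifier(source, old, new):
    """Rename NAME tokens equal to `old`; strings and comments are left alone."""
    lines = source.splitlines(keepends=True)
    hits = [
        tok
        for tok in tokenize.generate_tokens(io.StringIO(source).readline)
        if tok.type == tokenize.NAME and tok.string == old
    ]
    # Apply edits from the last hit to the first so earlier offsets stay valid.
    for tok in reversed(hits):
        row, col = tok.start  # row is 1-based, col is 0-based
        line = lines[row - 1]
        lines[row - 1] = line[:col] + new + line[col + len(old):]
    return "".join(lines)


source = 'count = 1\nprint("count is", count)  # count stays in text\n'
print(rename_identifier(source, "count", "total"))
print(source.replace("count", "total"))  # plain find/replace, for comparison
```

The token-aware version only touches the two identifier tokens, while plain find/replace also rewrites the string literal and the comment. It is still just a lexer-level tool: it can't tell two unrelated variables with the same name apart, which would need a real parse tree plus scoping.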