While it looks pretty simple, it cleans MS Word and browser artifacts out of pasted markup pretty well.
But I must admit that such simplicity is possible only with Sciter (which html-notepad is based on). For example, canonicalizeDOM gets called before the content appears in the target document, so none of this affects the undo/redo stack, etc.
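
To illustrate why the ordering matters, here is a rough sketch of the idea. It is not html-notepad's canonicalizeDOM and does not use Sciter's API: the node model, `cleanFragment` and `EditorDocument` are hypothetical names made up for this example, and the cleanup rules (unwrapping namespaced tags such as `o:p`, dropping `style`/`class`/`lang`) are only assumptions about what typical Word artifacts look like.

```typescript
// Hypothetical minimal node model, for illustration only (not Sciter's DOM).
interface PastedElement {
  kind: "element";
  tag: string;
  attrs: Record<string, string>;
  children: PastedNode[];
}
interface PastedText {
  kind: "text";
  text: string;
}
type PastedNode = PastedElement | PastedText;

// Assumed cleanup rules: attributes that carry only presentation.
const DROPPED_ATTRS = ["style", "class", "lang"];

// Cleans a detached fragment and returns new nodes; the input is not mutated.
function cleanFragment(nodes: PastedNode[]): PastedNode[] {
  const out: PastedNode[] = [];
  for (const node of nodes) {
    if (node.kind === "text") {
      out.push(node);
      continue;
    }
    const children = cleanFragment(node.children);
    const tag = node.tag.toLowerCase();
    // Unwrap namespaced tags (<o:p>) and pure styling wrappers, keeping their content.
    if (tag.indexOf(":") !== -1 || tag === "span" || tag === "font") {
      for (const child of children) out.push(child);
      continue;
    }
    const attrs: Record<string, string> = {};
    for (const name of Object.keys(node.attrs)) {
      if (DROPPED_ATTRS.indexOf(name.toLowerCase()) === -1) {
        attrs[name] = node.attrs[name];
      }
    }
    out.push({ kind: "element", tag: tag, attrs: attrs, children: children });
  }
  return out;
}

// Toy document with an undo stack that records one entry per edit.
class EditorDocument {
  body: PastedNode[] = [];
  undoStack: string[] = [];

  paste(fragment: PastedNode[]): void {
    // Cleanup runs while the fragment is still detached from the document,
    // so none of its intermediate steps are recorded as edits.
    const cleaned = cleanFragment(fragment);
    for (const node of cleaned) this.body.push(node);
    this.undoStack.push("paste"); // the only undoable step
  }
}
```

Because the cleanup touches only the detached fragment, the document sees a single insert, and one undo removes the whole paste rather than stepping back through each cleanup operation.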