"NTFS allows any sequence of 16-bit values for name encoding (file names, stream names, index names, etc.) except 0x0000. This means UTF-16 code units are supported, but the file system does not check whether a sequence is valid UTF-16 (it allows any sequence of short values, not restricted to those in the Unicode standard). "
- from the Wikipedia NTFS page [1]
So if you assume that an NTFS filename is valid UTF-16 and convert it to UTF-8, you may run into problems. A name can be any sequence of 16-bit values (other than 0x0000), including unpaired surrogates, which have no valid UTF-8 encoding.
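To make the failure mode concrete, here is a minimal Python sketch. It does not read real NTFS metadata; it just packs a list of 16-bit values the way a name might be stored and tries to convert it. The helper names (`units_to_bytes`, `is_valid_utf16`, `to_utf8_strict`, `to_utf8_lossy`) are made up for illustration.

```python
import struct

def units_to_bytes(units):
    """Pack 16-bit values into little-endian bytes, as a stored name would be."""
    return struct.pack(f"<{len(units)}H", *units)

def is_valid_utf16(units):
    """Return True if the 16-bit values form well-formed UTF-16."""
    try:
        units_to_bytes(units).decode("utf-16-le")
        return True
    except UnicodeDecodeError:
        return False

def to_utf8_strict(units):
    """Convert to UTF-8, raising UnicodeDecodeError on ill-formed input."""
    return units_to_bytes(units).decode("utf-16-le").encode("utf-8")

def to_utf8_lossy(units):
    """Convert to UTF-8, substituting U+FFFD for anything ill-formed."""
    text = units_to_bytes(units).decode("utf-16-le", errors="replace")
    return text.encode("utf-8")

# 'a', an unpaired high surrogate, 'b': a legal NTFS name, but not valid UTF-16.
weird_name = [0x0061, 0xD800, 0x0062]
print(is_valid_utf16(weird_name))   # False
print(to_utf8_lossy(weird_name))    # lossy: the original name cannot be recovered
```

The lossy variant is only safe for display. If the converted name has to be mapped back to the file on disk, a replacement character loses the information needed to find it, so the raw 16-bit values should be kept around as well.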
There was a time when some of our customers had lots of problems with gigantic files on their drives that were impossible to delete with Windows Explorer. I would visit them and help them delete the files from the command line, using filename*.ext to catch them. My guess was that the filenames contained some protected characters that Windows Explorer didn't allow. I don't remember how they ended up with the files, but most likely it was some download program and someone having a laugh :-)
There are a number of characters, such as path separators, that cannot be part of a file name on Windows. However, I am not sure whether this is enforced by the OS APIs or by NTFS itself. It is entirely possible that NTFS allows something that the higher layers don’t.
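For reference, a rough sketch of the character check as Microsoft's Win32 file-naming rules describe it: the reserved characters `< > : " / \ | ?  *` plus control characters 0 through 31. This only mirrors the documented Win32 rule; it says nothing about which layer enforces it, and the function name is made up for illustration.

```python
# Characters reserved by the documented Win32 file-naming rules.
WIN32_RESERVED = set('<>:"/\\|?*')

def win32_illegal_chars(name):
    """Return the characters in name that the Win32 naming rules reject."""
    return [ch for ch in name if ch in WIN32_RESERVED or ord(ch) < 32]

print(win32_illegal_chars("a:b?.txt"))    # [':', '?']
print(win32_illegal_chars("report.txt"))  # []
```

A name that fails this check but still exists on disk would be consistent with the guess that something wrote it through a path that bypasses the usual Win32 validation.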