The price of C++ compatibility is that it doesn't use Rust strings internally. It's all "wchar" and "WString", which are Microsoft C++ "wide character strings".
This may be more of a Microsoft backwards compatibility issue than a C++ issue.
It has nothing to do with Windows. fish doesn't support Windows. Their use of wchar_t is the glibc wchar_t (wchar_t is not Microsoft-specific) which is a 32-bit type and stores UTF-32-encoded codepoints. The Rust type they're using is also the same ( https://github.com/fish-shell/fish-shell/blob/master/doc_int... ).
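To make the "32-bit type storing UTF-32 code points" point concrete, here is a minimal sketch using only the standard library. The helper names (`char_width_bytes`, `to_wide`, `from_wide`) are made up for illustration, and `Vec<char>` is only a stand-in for a wide string, not the type fish actually uses.

```rust
use std::mem::size_of;

/// Size of Rust's `char` in bytes; a `char` is one Unicode scalar value.
fn char_width_bytes() -> usize {
    size_of::<char>()
}

/// Decode a UTF-8 `&str` into one 32-bit element per code point,
/// which is roughly what a UTF-32 "wide" string stores.
/// (Hypothetical helper, not fish's API.)
fn to_wide(s: &str) -> Vec<char> {
    s.chars().collect()
}

/// Re-encode the code points back into a UTF-8 `String`.
/// (Hypothetical helper, not fish's API.)
fn from_wide(w: &[char]) -> String {
    w.iter().collect()
}
```

With this layout, indexing by code point is constant time, at the cost of four bytes per element even for ASCII text.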
The Cygwin issue isn't strings (well, that could be another issue) but that Rust doesn't support Cygwin in the first place, at least according to the comments in the linked thread.
The sibling comments are correct, but imprecise in a way that could be misleading.
Rust has one built-in string type: str, commonly written &str. This is a sequence of UTF-8 encoded bytes.
Rust also has several standard library string types:
* String is an owned version of &str. Also UTF-8.
* CString/CStr: like String and &str, but null terminated, with no specific encoding
* OsString/OsStr: an "OS-native" encoded string. In practice, this is the WTF-8 spec on Windows and arbitrary bytes on Unix.
And of course, all sorts of user library-created string types.
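A rough sketch of how the standard types listed above differ in what they accept. The function names are made up for illustration; the calls themselves are plain standard library.

```rust
use std::ffi::{CString, OsString};

/// `String` owns its UTF-8 bytes; `&str` borrows them.
fn utf8_len(s: &str) -> usize {
    let owned: String = s.to_owned();
    owned.len() // length in bytes, not characters
}

/// `CString` appends a trailing NUL and rejects interior NUL bytes,
/// but does not check the encoding of the remaining bytes.
fn c_bytes(bytes: &[u8]) -> Option<Vec<u8>> {
    let c = CString::new(bytes).ok()?;
    Some(c.as_bytes_with_nul().to_vec())
}

/// Converting `&str` to `OsString` always succeeds; the way back is
/// fallible because an `OsString` may hold data that is not valid Unicode.
fn os_round_trip(s: &str) -> Option<String> {
    let os = OsString::from(s);
    os.into_string().ok()
}
```

For example, `c_bytes(&[0xff, 0x41])` succeeds even though `0xff` is never valid UTF-8, while `c_bytes(b"a\0b")` fails because of the interior NUL.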
The issue at hand isn't that there's some sort of special string type that Rust can't represent, it's that they have an existing codebase that uses a specific string type, so they're going to keep using that type for compatibility reasons, rather than one of the more usual Rust types. This means they won't have to transcode between the two types.
If we're competing to be pedantic here, the problem is that Windows encodes paths as UTF-16 and Linux can support just about any random jumble of bytes in a path, regardless of whether those bytes are valid UTF-8. Neither of these plays nicely with Rust's "if it can be represented, it's valid" approach to code safety, so OsStr(ing) exists as a more permissive but less powerful analogue for such cases.
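A small Unix-only sketch of the "more permissive but less powerful" trade-off: an `OsStr` will hold bytes that are not valid UTF-8, but you only get a `&str` out of it if the contents happen to be valid. The function name `inspect_unix_name` is made up for illustration.

```rust
/// Returns whether the name is valid UTF-8, plus a lossy rendering of it.
/// Unix-only, because it relies on `OsStr` being a thin wrapper over bytes there.
#[cfg(unix)]
fn inspect_unix_name(bytes: &[u8]) -> (bool, String) {
    use std::ffi::OsStr;
    use std::os::unix::ffi::OsStrExt;

    // No validation happens here: any byte sequence is accepted.
    let name = OsStr::from_bytes(bytes);

    // `to_str` is the strict view; `to_string_lossy` substitutes U+FFFD
    // for anything that is not valid UTF-8.
    (name.to_str().is_some(), name.to_string_lossy().into_owned())
}
```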
Rust's String type is capable of UTF-8 and nothing else. Any sequence of bytes that is not a valid UTF-8 string cannot be represented by the String type.
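A minimal sketch of what that looks like in practice, with made-up helper names: the strict conversion refuses invalid bytes outright, and the lossy one replaces them rather than storing them.

```rust
/// Strict conversion: `None` if the bytes are not valid UTF-8.
fn try_utf8(bytes: &[u8]) -> Option<String> {
    String::from_utf8(bytes.to_vec()).ok()
}

/// Lossy conversion: invalid sequences become U+FFFD, so the
/// original bytes cannot be recovered from the result.
fn lossy_utf8(bytes: &[u8]) -> String {
    String::from_utf8_lossy(bytes).into_owned()
}
```

Either way, the invalid byte never ends up inside a `String`; if you need to keep it, you hold a `Vec<u8>` or an `OsString` instead.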
Rust strings enforce UTF-8 encoding, yes. However, it seems Windows (which uses UTF-16) allows ill-formed UTF-16; in particular, it tolerates unpaired surrogates.
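The unpaired-surrogate case can be sketched with the standard library's UTF-16 decoders. The helper names are made up for illustration; `0xD800` is a high surrogate with nothing following it.

```rust
/// Strict decoding: `None` if the input contains an unpaired surrogate.
fn try_utf16(units: &[u16]) -> Option<String> {
    String::from_utf16(units).ok()
}

/// Lossy decoding: unpaired surrogates are replaced with U+FFFD.
fn lossy_utf16(units: &[u16]) -> String {
    String::from_utf16_lossy(units)
}
```

A well-formed pair such as `[0xD83D, 0xDE00]` decodes to a single code point, while `[0x0066, 0xD800]` is rejected by the strict path, which is why a type other than `String` is needed to round-trip such data.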