Ah I see, thanks. You shouldn't require a BOM though. Plenty of files don't have one, and not all files containing UTF-16 are text either (EXEs etc.). I would just search for the pattern in all the possible UTF byte sequences (UTF-7, UTF-8, UTF-16LE/BE, UTF-32), regardless of BOM, possibly with a switch for specifying subsets or additional encodings if you can support that.
In ripgrep itself you can apparently only search files in encodings other than UTF-16LE with BOM by manually specifying `--encoding utf-16be` etc.
I could maybe add encoding detection myself, but I'm kind of discouraged, since not even the Unix `file` tool detects those files as text, and a normal editor opens at least a UTF-16BE file completely wrong. So I'm not sure I want to spend my time writing heuristic detection for those, especially since UTF-16 itself is broken and shouldn't really exist at all...
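For reference, BOM-only detection (no heuristics) is small. This is a rough Python sketch, not how ripgrep or rga actually does it; `sniff_bom` and the `_BOMS` table are hypothetical names, and only the `codecs.BOM_*` constants come from the standard library.

```python
import codecs

# UTF-32 comes first: the UTF-32LE BOM (FF FE 00 00) begins with the
# UTF-16LE BOM (FF FE), so checking UTF-16 first would shadow it.
_BOMS = (
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def sniff_bom(head):
    """Return the codec name implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if head.startswith(bom):
            return name
    return None
```

A file without a BOM simply returns `None` here, which is exactly the case this sketch cannot handle.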
Thanks! Yeah, I wouldn't try to detect encodings or use heuristics either. If you could just reduce a single pattern into the OR of a bunch of byte sequences, one per encoding, I think that should work? I'm not sure how easy that is with the interface you're given. (I wouldn't call UTF-16 'broken', but either way... it's a reality; a huge fraction of the time when you're searching binary files on Windows, it's to find text inside executables, and on Windows that text is generally UTF-16.)
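To make the "OR of byte sequences" idea concrete, here is a minimal Python sketch for a literal (non-regex) pattern. It is only an illustration under assumptions: `compile_multi_encoding`, `find_offsets` and the `ENCODINGS` tuple are made-up names, the encoding list is just a plausible default, and a real regex would need translating per encoding rather than plain escaping.

```python
import re

# Assumed default set; the BOM-less codec names avoid Python prepending a BOM.
ENCODINGS = ("utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be")


def compile_multi_encoding(literal, encodings=ENCODINGS):
    """Compile one bytes regex that matches `literal` in any of `encodings`."""
    # A set drops duplicates (e.g. pure-ASCII text encodes identically in
    # several ASCII-compatible encodings); longest variants are tried first.
    variants = sorted(
        {literal.encode(enc) for enc in encodings}, key=len, reverse=True
    )
    return re.compile(b"|".join(re.escape(v) for v in variants))


def find_offsets(data, literal, encodings=ENCODINGS):
    """Return the byte offsets in `data` where any encoded form of `literal` starts."""
    pattern = compile_multi_encoding(literal, encodings)
    return [m.start() for m in pattern.finditer(data)]
```

Usage on a blob that mixes encodings and has no BOM anywhere:

```python
blob = (
    b"\x00\x01"
    + "hello".encode("utf-16-le")
    + b"\xff"
    + "hello".encode("utf-8")
    + "hello".encode("utf-16-be")
)
print(find_offsets(blob, "hello"))  # [2, 13, 18]
```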
Nope, haven't tried it. I just saw that ripgrep uses AppVeyor for Windows instead, so I assumed it doesn't work on Travis. I was actually just trying to add AppVeyor to this [1], but I'm getting a weird error.
> doesn't support other UTF encodings like UTF-16
UTF-16 should in fact work, since ripgrep supports it too. Looks like my binary file detection is at fault [1].
> Not sure if it can search in single-line mode either
That works fine; just use `rga --multiline '\n' fname`.
[1]: https://github.com/phiresky/ripgrep-all/issues/5