DropDead's comments

DropDead · 2026-03-15T15:14:36 1773587676

Why didn't someone make an AV rule to find stuff like this? They are just plain text files.

nine_k · 2026-03-15T16:05:58 1773590758

The rule must be very simple: any occurrence of `eval()` should be a BIG RED FLAG. It should be handled like a live bomb, which it is.

Then, any appearance of unprintable characters should also be flagged. There are rather few legitimate uses of some zero-width characters, like ZWJ in emoji composition. Ideally all such characters should be inserted as \uNNNN escape sequences, not as literal characters.

Simple lint rules would suffice for that, with zero AI involvement.
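
A minimal sketch of what such a lint pass could look like, in Python. The function name `lint_source` and the choice to treat Unicode categories Cf (format) and Cc (control, except tab and carriage return) as "invisible" are assumptions made for illustration; this is a plain text scan, not a real parser.

```python
import re
import unicodedata

# Matches a call to eval( but not identifiers that merely end in "eval".
EVAL_CALL = re.compile(r"\beval\s*\(")

def lint_source(text):
    """Return (line_number, message) pairs for eval() calls and invisible characters."""
    findings = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        if EVAL_CALL.search(line):
            findings.append((lineno, "eval() call"))
        for ch in line:
            category = unicodedata.category(ch)
            # Cf covers ZWSP, ZWNJ, ZWJ, and BOM; Cc covers control characters.
            if category == "Cf" or (category == "Cc" and ch not in "\t\r"):
                findings.append((lineno, f"invisible character U+{ord(ch):04X}"))
    return findings
```

Being text-based, this also fires on `eval(` inside comments and string literals, which is arguably acceptable for a red-flag rule.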

hamburglar · 2026-03-15T17:30:51 1773595851

I think there’s debate (which I don’t want to participate in) over whether or not invisible characters have their uses in Unicode. But I hope we can all agree that invisible characters have no business in code, and banishing them is reasonable.

WalterBright · 2026-03-15T17:14:59 1773594899

> There are rather few legitimate uses of some zero-width characters, like ZWJ in emoji composition.

Emojis are another abomination that should be removed from Unicode. If you want pictures, use a gif.

_flux · 2026-03-15T18:09:09 1773598149

Arguably them being in Unicode is an accessibility issue, unless we thought to standardize GIF names, and then that already sounds a lot like Unicode.

WalterBright · 2026-03-15T18:20:51 1773598851

How is it an accessibility issue? HTML allows things like little gif files. I've done this myself when I wrote text that contained Egyptian hieroglyphs. It works just fine!

_flux · 2026-03-15T18:43:15 1773600195

I mean if you don't have sight.

WalterBright · 2026-03-15T18:46:40 1773600400

Then use words. Or tooltips (HTML supports that). I use tooltips on my web pages to support accessibility for screen readers. Unicode should not be attempting to badly reinvent HTML.

sghitbyabazooka · 2026-03-15T18:03:26 1773597806

( ꏿ ﹏ ꏿ ; )

trollbridge · 2026-03-15T16:25:21 1773591921

In our repos, we have some basic stuff like ruff that runs, and that includes a hard error on any non-ASCII characters. We mostly did this after some un-fun times when byte order marks somehow ended up in a file and made something fail.

I have considered allowing a short list that does not include emojis, joining characters, and so on (basically just currency symbols, accent marks, and everything else you'd find in CP-1252), but never got around to it.
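
One way such an allowlist could be sketched in Python is to use the codec itself as the list: anything that cannot be encoded as CP-1252 is rejected. The function name `disallowed_chars` is hypothetical, and this is only an illustration of the idea, not what any particular repo runs.

```python
def disallowed_chars(text, encoding="cp1252"):
    """Return (line, column, codepoint) for characters outside the allowlist encoding."""
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for col, ch in enumerate(line, start=1):
            try:
                # Encodable means the character is in the allowlist.
                ch.encode(encoding)
            except UnicodeEncodeError:
                hits.append((lineno, col, f"U+{ord(ch):04X}"))
    return hits
```

Currency symbols and accented letters pass, while byte order marks, joiners, and emoji do not. Note that CP-1252 still contains the no-break space and the soft hyphen, so a stricter list might exclude those too.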

hrmtst93837 · 2026-03-16T12:03:42 1773662622

Automatic escaping sounds nice until you need to grep or diff across repos and get buried in opaque escapes that turn ordinary review into unreadable junk. Once that lands in a repo, even routine deps updates can turn into edge-case mismatch roulette.

Lint zero-width chars, sure. But if the actual sink is runtime string injection, banning eval is only half a fix because Function and friends still get you to the same bad place while the linter congratulates itself.
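
To illustrate the point about Function and friends, here is a rough Python sketch that treats `eval` as one entry in a list of dynamic-code sinks. The sink list and the name `find_sinks` are assumptions for illustration, and regexes over raw text will miss aliased or computed references such as `globalThis["ev" + "al"]`.

```python
import re

# Hypothetical sink list: each pattern is one way a string can become running code.
SINKS = {
    "eval": re.compile(r"\beval\s*\("),
    "Function constructor": re.compile(r"\b(?:new\s+)?Function\s*\("),
    # setTimeout/setInterval only count when the first argument is a string literal.
    "string timer": re.compile(r"\bset(?:Timeout|Interval)\s*\(\s*[\"'`]"),
}

def find_sinks(text):
    """Return (line_number, sink_name) pairs for every sink pattern found."""
    findings = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        for name, pattern in SINKS.items():
            if pattern.search(line):
                findings.append((lineno, name))
    return findings
```

An AST-based linter is the more robust route; ESLint's `no-eval`, `no-new-func`, and `no-implied-eval` rules cover roughly this territory.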

abound · 2026-03-15T15:27:04 1773588424

Yeah, it would have been nice to end with "and here's a five-line shell script to check if your project is likely affected". But to their credit, they do have an open-source tool [1]; I'm just not willing to install a big blob of JavaScript to look for vulns in my other big blobs of JavaScript.

[1] https://github.com/AikidoSec/safe-chain

nine_k · 2026-03-15T16:30:17 1773592217

Something like this should work, assuming your encoding is Unicode (normally UTF-8), which grep would interpret:

  grep -P '[\x{200B}\x{200C}\x{200D}\x{FEFF}]' code.ts

See https://stackoverflow.com/q/78129129/223424

pwoyke · 2026-03-23T13:51:19 1774273879

The grep approach catches zero-width joiners and BOM characters but misses what GlassWorm uses - variation selectors (U+FE00-FE0F and U+E0100-E01EF). Those don't show up in most regex patterns people reach for, and they're valid Unicode so editors don't flag them either. ESLint won't catch it because variation selectors are legal characters - they're meant for glyph selection in CJK text and emoji.

The issue is that GlassWorm uses thousands of them per line where legitimate use is 1-2. It's a density problem, not a character-class problem.

We ran into this while analyzing the waves at work and ended up building a scanner around it - counts variation selector clusters per line, matches the decoder pattern (codePointAt + the specific arithmetic GlassWorm uses) in a narrow window to cut false positives from minified code. Open-sourced it last week: https://github.com/afine-com/glassworm-hunter
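
The density idea can be sketched in a few lines of Python. This is not the glassworm-hunter implementation: the function names and the threshold of 8 are assumptions picked for illustration, and the decoder-pattern matching described above is left out entirely.

```python
def is_variation_selector(ch):
    """True for U+FE00-FE0F and U+E0100-E01EF, the two ranges named above."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF

def dense_lines(text, threshold=8):
    """Return (line_number, count) for lines with more variation selectors than threshold."""
    hits = []
    for lineno, line in enumerate(text.split("\n"), start=1):
        count = sum(1 for ch in line if is_variation_selector(ch))
        # Legitimate text carries one or two per line; a payload carries far more.
        if count > threshold:
            hits.append((lineno, count))
    return hits
```

A line with a single emoji presentation selector passes, while a line stuffed with dozens of selectors is reported along with its count.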

charcircuit · 2026-03-15T20:03:34 1773605014

Isn't that what this article is about? Advertising an AV rule in their product that catches this.

DropDead · 2026-02-27T08:42:04 1772181724

For me, it's better to just use a local password store like KeePass.

DropDead · 2026-02-27T08:39:41 1772181581

I liked the movies and played FireRed (now on Switch)