It's interesting that our best "AI" is essentially a hacking tool. It's like we're trying to build skynet here :)
I wonder if you could make something like afl-fuzz but point it at your test cases, until it brute-forces some code that makes all your tests pass :) You can throw the syntax of all the modules it should be using at it.
While the result might not make as much sense as code you'd write yourself, arguably everything humanity has done has been the result of evolutionary fuzzing and test cases.
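In toy form the idea looks something like this (a Python sketch; the primitives and test cases are made up, and real afl-fuzz mutates inputs rather than programs):

    # Enumerate short compositions of whitelisted primitives until one
    # composition passes every test case.
    import itertools

    PRIMITIVES = {                # "the syntax of the modules it should be using"
        "double": lambda x: x * 2,
        "inc":    lambda x: x + 1,
        "square": lambda x: x * x,
    }
    TESTS = [(2, 9), (3, 16), (5, 36)]    # the suite the search must satisfy

    def passes(pipeline):
        def f(x):
            for name in pipeline:
                x = PRIMITIVES[name](x)
            return x
        return all(f(i) == o for i, o in TESTS)

    # Brute force over all pipelines up to length 3 -- exponential in
    # length, which is exactly the objection raised downthread.
    for length in range(1, 4):
        for pipeline in itertools.product(PRIMITIVES, repeat=length):
            if passes(pipeline):
                print(" -> ".join(pipeline))    # finds: inc -> square
                raise SystemExit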
> I wonder if you could make something like afl-fuzz but point it at your test cases, until it brute-forces some code that makes all your tests pass :)
Isn't that how genetic algorithms are supposed to work?
That's how genetic programming could work; with a suitable test suite you should be able to evolve an implementation. See FINCH for some early work on GP for Java [1].
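A minimal sketch of that loop (FINCH evolves real Java bytecode; this toy just evolves straight-line token programs against a made-up suite):

    import random

    OPS = ["x*2", "x+1", "x*x", "x-3"]
    TESTS = [(2, 9), (3, 16), (5, 36)]

    def run(prog, x):
        for op in prog:
            x = eval(op)          # fine for a toy; never eval untrusted input
        return x

    def fitness(prog):            # number of passing test cases
        return sum(run(prog, i) == o for i, o in TESTS)

    pop = [[random.choice(OPS) for _ in range(2)] for _ in range(50)]
    for _ in range(200):
        pop.sort(key=fitness, reverse=True)
        if fitness(pop[0]) == len(TESTS):
            print("evolved:", pop[0])     # e.g. ['x+1', 'x*x']
            break
        survivors = pop[:10]              # selection
        pop = survivors + [               # mutate copies of the survivors
            [random.choice(OPS) if random.random() < 0.3 else op for op in p]
            for p in random.choices(survivors, k=40)
        ]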
This might be computationally feasible in this lifetime for a simple app, but we are dealing with a potentially insane number of combinations that would have to be tried.
We still have trouble calculating all possible 20-character passwords.
You have considerably less trouble checking all 20-character passwords if you get to introspect on the program doing the checking, though! (To see which character lets it advance to the next loop iteration or otherwise changes its state; this is similar to a timing attack.)
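Concretely, with a (hypothetical) checker that leaks how far it matched, you can lock in one character at a time -- linear work instead of exponential:

    import string

    SECRET = "hunter2hunter2hunter"       # 20 chars, made-up target

    def check(guess):
        # the side channel: how many leading characters matched
        matched = 0
        for a, b in zip(guess, SECRET):
            if a != b:
                break
            matched += 1
        return guess == SECRET, matched

    recovered, ok = "", False
    while not ok:
        for c in string.ascii_lowercase + string.digits:
            ok, matched = check(recovered + c)
            if ok or matched > len(recovered):
                recovered += c
                break
    print(recovered)    # ~alphabet * length tries instead of alphabet**length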
Without doing that, btw, I think you understate the difficulty of "calculating all possible 20 character passwords." It's not just that we 'still have trouble' doing it. If you just take lowercase a-z (and hell, even exclude Q, A, Z, M, and Y so that it's the same on all major keyboard layouts) you're left with 21 letters; call it 20, which gives 20^20 ≈ 100,000,000,000,000,000,000,000,000 combinations, i.e. 10^26. Which at 1000 gigatries per second leaves you with 10^14 seconds, i.e. about 3 million years. Hey, it's not so bad :)
And that's the cut-down alphabet. Take the full printable-ASCII set (95 characters) and you get 95^20 ≈ 4E+39 combinations. That compares favorably with the number of grams in the whole solar system (something like 1.99E+33), including, obviously, the whole Earth, the sun, all the planets, etc. It's the number of grams in roughly 2,000,000 of our solar systems. So don't expect us to start calculating all possible 20 character passwords anytime soon :)
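For anyone checking the arithmetic (constants as above):

    combos = 20 ** 20                          # cut-down alphabet: ~1.05e26
    rate = 1000 * 10**9                        # 1000 gigatries per second
    print(combos / rate / (3600 * 24 * 365))   # ~3.3e6 years

    full = 95 ** 20                            # full printable ASCII: ~3.6e39
    print(full / 1.99e33)                      # ~1.8e6 solar systems' grams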
>It's interesting that our best "AI" is essentially a hacking tool. It's like we're trying to build skynet here :)
It's interesting? It should be no surprise that technology often draws its highest and most inspirational motivation from the dreams recorded in the totality of human literature. For it is a fact: artists dream, engineers build. And both roles are equally capable of authoring things.
It should be less of a surprise, and more of a cause for a call to arms, that we are in fact building skynet, and have been for decades now.
If the failures are crashes and you have existing unit tests, some work out of the University of Virginia on using genetic programming to evolve patches (GenProg) could possibly work: http://dijkstra.cs.virginia.edu/genprog/
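The fitness idea there is roughly this (a sketch; the weights and the test interface are made up): reward a candidate patch for making the failing tests pass while keeping the already-passing ones green.

    W_NEG, W_POS = 10, 1    # fixing the crash outweighs any single unit test

    def patch_fitness(candidate, negative_tests, positive_tests):
        # each test is a callable: test(candidate) -> True if it passes
        fixed = sum(t(candidate) for t in negative_tests)
        kept  = sum(t(candidate) for t in positive_tests)
        return W_NEG * fixed + W_POS * kept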
Wouldn't overfitting be a huge issue? You'd probably have to put a large penalty on size/running time vs correctness.
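Something like this, say (constants and the run_prog interface are made up):

    import time

    def penalized_fitness(run_prog, prog, tests, size, a=0.01, b=0.001):
        # run_prog(prog, input) -> output; a and b trade correctness
        # against bloat and slowness
        t0 = time.perf_counter()
        passed = sum(run_prog(prog, i) == o for i, o in tests)
        return passed - a * size - b * (time.perf_counter() - t0)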
Getting trivial solutions is really easy, and they don't "connect" much to the "space" of good algorithms. That's a primary reason for the development of neural networks: with continuous, differentiable elements you can do a nicer "hill climb" toward small representations through back-propagation. With hard-valued logic you're more or less stumbling in the dark until you get lucky, and that's O(exp(N)) in the program size.
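A toy contrast of the two search modes (all numbers made up):

    import random

    # Smooth: minimize (w - 3)^2 by following the derivative 2*(w - 3).
    w = 0.0
    for _ in range(100):
        w -= 0.1 * 2 * (w - 3)
    print(round(w, 3))      # 3.0, reached in a hundred cheap steps

    # Hard-valued: hit a 16-bit target with no gradient signal; expected
    # work is ~2**16 blind tries, i.e. exponential in the representation.
    target = [1] * 16
    tries, bits = 0, None
    while bits != target:
        bits = [random.randint(0, 1) for _ in range(16)]
        tries += 1
    print(tries)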
This is a very good point. I suppose you could in some cases make the test cases "exhaustive" (or at least draw them from all possible inputs), which would possibly mitigate this problem. Putting a limit on size/running time as you suggest is certainly a good idea, though it's not like any algorithm can brute-force more than a couple of composed tokens at a time...
> I wonder if you could make something like afl-fuzz but point it at your test cases, until it brute-forces some code that makes all your tests pass :) You can throw the syntax of all the modules it should be using at it.
AIUI Prolog kind of works like that. You give it what it can use and what it needs to produce, and it'll fill in the implementation.
(In practice it's slow as hell and figuring out how to tell it the information in the "right" way so that it can "figure out" the proper implementation is a black art.)
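A rough Python analogue of the flavor (toy example): state the relation, let the engine enumerate the solutions, including "backwards".

    def append_rel(target):
        # every (a, b) with a + b == target, like running append/3 in reverse
        for i in range(len(target) + 1):
            yield target[:i], target[i:]

    for a, b in append_rel([1, 2, 3]):
        print(a, b)    # ([], [1,2,3]), ([1], [2,3]), ([1,2], [3]), ...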
IANAL, but from the README [1], it's Apache-licensed, so fork away - just make sure you retain their copyright notices and follow the rest of the license.
-edit-
In addition, the bottom of the README which you quoted just mentions the need to sign a CLA if you want your code pushed upstream. See the FSF page about why they require copyright assignment [2] for their projects for some background.
Note that the Contributor License Agreement (CLA) and copyright assignment are rather different things. Google and the FSF have very different reasons for requiring their respective arrangements.
Someone should combine AFL with an x86 code emulator. Sure, it'll be slow. But it'll be better than nothing. (And you'd be fuzzing the emulator at the same time!)
(An interesting alternative route would be to rewrite the binary in-place, using something like PEBIL.)
AFL already has support for using QEMU for black-box binary instrumentation. The NixOS packages (maintained by me) support it out of the box. There is also third-party support for using Dyninst to statically instrument binaries in place (which is faster than QEMU and looks more robust than PEBIL): https://github.com/vrtadmin/moflow/tree/master/afl-dyninst
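Usage is pretty much (paths are placeholders):

    cd qemu_mode && ./build_qemu_support.sh     # one-time build of the QEMU shim
    afl-fuzz -Q -i testcases/ -o findings/ -- ./target_binary @@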