I'm one of the co-authors on the paper. Let me know if you have any questions! (And no, we didn't come up with the title of the article. One of the Columbia journalists did.)
I'd just like to point out that `cryptography' is _not_ just about testing that `decrypt(encrypt(msg)) == msg`. memfrob'ing a region of memory does this, but would we consider that a 'secure' encryption scheme? I don't think so . . .
Even if it were -- how do you unit-test a PRNG to ensure the output is uniformly random? How do you ensure you don't have collisions in a hash function? How can you ensure that there's no replay attack on your protocol? How can you use unit-testing to demonstrate that there is no hole in your protocol like, say, when the Germans repeated the key twice at the beginning of the message while using Enigma?
These are all very real and valid cryptography concerns. Sorry if this sounds snarky, but in all seriousness -- I'm very curious as to how you would unit-test these.
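For the PRNG question, the closest thing to a unit test I can picture is a statistical smoke test rather than any real guarantee -- something like the rough Python sketch below, where `my_prng_bytes` is a hypothetical stand-in for the generator under test. It only checks crude uniformity of byte frequencies and says nothing at all about unpredictability.

```python
# A minimal sanity check, not a proof: a chi-square test on byte frequencies
# can catch gross non-uniformity in a PRNG's output, but passing it says very
# little about cryptographic quality. `my_prng_bytes` is a hypothetical
# stand-in for the generator under test.
import os
from collections import Counter

def my_prng_bytes(n):
    return os.urandom(n)   # placeholder for the PRNG being tested

def test_byte_distribution_roughly_uniform(n=1_000_000):
    counts = Counter(my_prng_bytes(n))
    expected = n / 256
    chi_sq = sum((counts.get(b, 0) - expected) ** 2 / expected for b in range(256))
    # ~310.5 is the approximate 99th percentile of chi-square with 255 degrees
    # of freedom; exceeding it suggests (but does not prove) non-uniform output.
    assert chi_sq < 310.5, f"byte distribution looks skewed (chi^2 = {chi_sq:.1f})"
```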
> How do you ensure you don't have collisions in a hash function?
All hash functions have collisions, by definition, since they map a large amount of data into a smaller keyspace. What you are really asking is probably "how do you show that collisions are difficult to predict?" I don't know. (I assume the mechanisms for doing that are based on statistical analysis of the propagation breadth and speed of single-bit changes in the input stream over varying numbers of rounds?)
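As a concrete (if crude) illustration of that kind of statistical analysis, here's a small Python sketch of an avalanche measurement: flip one input bit and count how many output bits change. For a well-behaved hash, roughly half of them should flip on average. This is only an indicator, nothing like a proof of collision resistance, and SHA-256 is used here purely as a convenient example.

```python
# Rough avalanche measurement: flip a single input bit and count how many
# output bits change. For a 256-bit digest the average should sit near 128.
import hashlib, os, random

def bit_diff(a: bytes, b: bytes) -> int:
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

def avalanche_score(trials=1000, msg_len=64):
    total = 0
    for _ in range(trials):
        msg = bytearray(os.urandom(msg_len))
        h1 = hashlib.sha256(msg).digest()
        bit = random.randrange(msg_len * 8)
        msg[bit // 8] ^= 1 << (bit % 8)          # flip one input bit
        h2 = hashlib.sha256(msg).digest()
        total += bit_diff(h1, h2)
    return total / trials                         # ideally ~128 for SHA-256

print(avalanche_score())
```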
> How can you ensure that there's no replay attack on your protocol?
A combination of session keys and message sequencing? (E.g., each message must include a hash of the last.)
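As a toy illustration of that idea (a per-session key plus chaining each message to the previous one), here's a rough Python sketch -- not a vetted protocol, just the shape of it. A replayed message no longer verifies because the MAC has to cover the hash of the last accepted message.

```python
# Toy replay protection: MAC each payload together with the digest of the
# previously accepted message, under a per-session key.
import hmac, hashlib, os

SESSION_KEY = os.urandom(32)          # hypothetical per-session key

def make_message(payload: bytes, prev_digest: bytes):
    mac = hmac.new(SESSION_KEY, prev_digest + payload, hashlib.sha256).digest()
    return payload, mac

def verify_message(payload: bytes, mac: bytes, prev_digest: bytes) -> bool:
    expected = hmac.new(SESSION_KEY, prev_digest + payload, hashlib.sha256).digest()
    return hmac.compare_digest(mac, expected)

# Usage: both sides track the digest of the last accepted message.
prev = hashlib.sha256(b"session-start").digest()
payload, mac = make_message(b"transfer $10", prev)
assert verify_message(payload, mac, prev)
prev = hashlib.sha256(payload + mac).digest()   # advance the chain
assert not verify_message(payload, mac, prev)   # a straight replay now fails
```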
> How can you use unit-testing to demonstrate that there is no hole in your protocol like, say, when the Germans repeated the key twice at the beginning of the message while using Enigma?
There's an entire field of computer science around formal proofs; I don't know enough about it to really comment, but a partial answer may lie there. Essentially that means adopting a design-time proof strategy rather than a post-facto unit-testing proof strategy, though.
Good points, but there is a subset of these problems that can be covered by automated testing, that's all I'm saying.
And no, you can't find security holes with tests (unless they are random-input style tests, I guess), but you can at least cover the hole afterwards with a test.
Just reading your second reference, the bit about expert systems is pretty interesting . . . what's your stance on this? It seems that, given sufficient instrumentation information, one could probably construct a fairly decent expert system that might be able to shed some light onto these sorts of problems. (Keep in mind the fact that I'm still very much a junior software developer!)
The example of FDs from the original article seems like a good one; a properly developed expert system might have been able to notice a lack of changes in the current software environment and suggest looking elsewhere. (Problem is, if I got that response from my automated problem-solver I'd just ignore it and keep looking . . .)
So, any insightful points to crush my youthful idealism? :)
Expert systems for monitoring and management of large server farms and networks? Been there. Done that. Got the T-shirt. Wrote papers, gave one at an AAAI IAAI conference at Stanford, and shipped two commercial products.

First problem: you need some 'experts', so the system only knows about the problems the experts do. Their knowledge is supposed to be mostly empirical, from their 'expert experience'. Even if they have 'deep knowledge', that is, of how the systems work internally (e.g., the engine connects to the torque converter, which connects to the transmission, which connects to the rear differential, which connects to the rear wheels), the specific problems they know about are usually just the ones they have encountered. So essentially such a system can only address problems that have been seen before and are well understood. But the problems of the OP were being seen for the first time. Bummer. Actually, with some irony, if the problems have been seen before and are well understood, then why the heck have they not already been solved?

Second, with expert systems we're working just intuitively, just from experience. So we have little idea whether what we are doing is the best possible in any reasonable sense, or even any good at all. In particular, in monitoring, the main, first operational problem on the 'bridge' or in the 'NOC' (network operations center) is that the false alarm rate is too high, with no good way to lower it (except just to ignore some alarms, but we should be able to do much better).

Net, for a serious attack on system monitoring and management, I can't recommend taking expert systems very seriously.
Speaking as a student studying systems (particularly security and OS development) . . . the second point sounds really interesting.
Recently I've been working on a simple microkernel (called Sydi) that should actually make this possible. It's a distributed system straight down to the kernel: when two instances of the kernel detect each other running on a network, they freely swap processes and resources between each other as required to balance the load. This is a nightmare from a security perspective, of course, and I haven't figured out the authentication bits yet -- too busy working on an AML interpreter because I don't like ACPICA -- but that's not the issue here. (It's far from finished, but I'm pretty confident that I've worked out the first 85% of the details, with just the remaining 115% left, plus the next 95% of the programming . . .)
Due to some aspects of its design (asynchronous system calls via message-passing, transparent message routing between nodes in the network, etc.) I feel that it will be entirely possible to take a snapshot of an entire cluster of machines running the kernel. It would be expensive -- requiring a freeze of all running processes while the snapshot takes place to maintain any level of precision -- but I'm confident I could code that in the space of a week or two . . .
I haven't thought much about what kind of analysis one could do on the instrumentation/snapshot results, though. I'm sadly too inexperienced with `real-world' systems stuff to be able to say. Anyone have any suggestions for possible analysis avenues to explore?
This sounds interesting! Have you published anything more substantial on it? I yearn for the day when I can have a Plan 9-ish resource sharing and file system connecting my personal computer to my cloud computers.
> It would be expensive -- requiring a freeze of all running processes while the snapshot is taking place to maintain any level of precision.
Have you thought about distributed algorithms like the Chandy-Lamport Snapshot algorithm (http://en.wikipedia.org/wiki/Snapshot_algorithm) that take a consistent snapshot and do not require a system freeze?
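For what it's worth, here's a heavily compressed Python sketch of the marker-passing idea, simulated in a single process. The names (`Node`, `MARKER`, and so on) are purely illustrative and have nothing to do with any real kernel; it just shows how a node records its local state on the first marker it sees and records in-flight messages per channel until that channel's marker arrives, so no global freeze is needed.

```python
# Simplified Chandy-Lamport-style snapshot over FIFO channels, simulated
# in one process. Illustrative only.
from collections import deque

MARKER = object()

class Node:
    def __init__(self, name, state):
        self.name, self.state = name, state
        self.inbox = {}            # peer name -> incoming FIFO channel
        self.peers = {}            # peer name -> peer Node
        self.recorded_state = None
        self.recording = {}        # peer name -> recorded in-flight messages

    def connect(self, peer):
        # one-way channel self -> peer; call both ways for a bidirectional link
        self.peers[peer.name] = peer
        peer.inbox[self.name] = deque()

    def send(self, peer_name, msg):
        self.peers[peer_name].inbox[self.name].append(msg)

    def start_snapshot(self):
        self._record_and_broadcast()

    def _record_and_broadcast(self):
        self.recorded_state = self.state
        self.recording = {p: [] for p in self.inbox}
        for p in self.peers:
            self.send(p, MARKER)

    def deliver(self, from_name):
        msg = self.inbox[from_name].popleft()
        if msg is MARKER:
            if self.recorded_state is None:
                self._record_and_broadcast()
            self.recording.pop(from_name, None)   # this channel's state is complete
        else:
            if self.recorded_state is not None and from_name in self.recording:
                self.recording[from_name].append(msg)  # message was in flight
            self.state = msg                            # "apply" the message
```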
This is exactly my take on the Parallella board, and it's the exact reason why I donated to the Kickstarter back in the Fall.
I want to try my hand at writing a real, efficient, many:many message-passing API on top of SHM. It's something I've been interested in for a while (and am doing in an a side project for x86_64). Not because it hasn't been done a thousand times before, but because it's neat.
I want to write a compiler for the Parallella. Not because there aren't compilers already, but because I've never written a compiler that targets a RISC architecture before. I've never written a compiler that respects pipelining.
I want to write a parallelized FFT, based on the original paper, for the Parallella. I've used various FFT libraries before, but never actually implemented an FFT straight up. Why? Not because it's never been done before, but just because it's an idea that appeals to me. And for practice parallelizing algorithms . . .
I want to write a raytracer for the Parallella. Not because I haven't written a raytracer before, but because I think that I'll be able to do something interesting with a Parallella raytracer that I haven't done before: real-time (ish) raytracing. Not because that hasn't been done before, but because it'd be neat to build.
I want to build a distributed physics engine. Not because there aren't excellent open-source physics engines (Bullet, ODE, etc.) -- but because I find the problem interesting. It's something I've wanted to do for a while, but never got around to. Why? Because it's interesting.
I could go on, but I'll stop here. The Parallella, I think, is a catalyst for a lot of small projects that I've wanted to do for a while. The Parallella is my excuse to spend time on random projects that will never go anywhere beyond a page on my website describing what they are, plus a link to the source code.
And, you know what? That seems perfect to me. That's why I want a Parallella, and that's why I'm eagerly awaiting mine within the next month or three. (Hopefully!)
Sounds cool, but it's worth pointing out that the Parallella is Zynq-based, and so comes with a Xilinx FPGA built into the SoC alongside the dual ARM cores. The FPGA provides the "glue" for the Epiphany chip to talk to the CPU, but there's plenty of spare capacity.
The more the merrier, though. I wish I had time to play with FPGAs - I have a Minimig (an Amiga reimplementation where the custom chips are all in an FPGA) and I'm on the list for an FPGA Replay (targeting reimplementation of assorted home computers, including the Amiga, and arcade machines in an FPGA).
I've been using Lua in a project for quite some time, replacing Python as my embedded scripting language of choice.
It never ceases to amaze me, as a language and as an interpreter. Sure, the syntax bites you occasionally, and sure, the lack of built-in functionality is occasionally annoying. But hey, I can add callbacks into a simulation engine that execute 10 million times per second with almost no slowdown in the result. Like the article mentions, the string matching library isn't as powerful as Perl's, but it's more than adequate for all of my uses.
My only wish is for an updated LNUM patch that works with Lua 5.2. I deal with integers too much in my projects for its absence to not be annoying.
Roberto gave a keynote at the Lua Workshop 2012 where he detailed what keeps the Lua team busy; a video of it exists somewhere on the web. Half of it was about how to integrate IEEE doubles and 64-bit ints soundly into the language. There are some challenges left, but they're really trying to overcome them.
Same thing here. When reading fiction (and some types of non-fiction) I can do away with my internal monologue and absorb the text directly, with full or near-full comprehension. [0] With textbooks etc. I need to actually subvocalize the words for full comprehension.
I'm skeptical of the claim about subvocalization made here as well, but I'm realistic enough to know that I'm not different enough to fall outside the scope of the study referenced.
[0] My reading speed in this `mode', so to speak, ranges from 300-750 WPM, depending on how engaged I am in the material. I measure comprehension by asking others (with a copy of the material in question) to quiz me on the content after such a reading session.
Strangely enough, I find that retention is always better (for me at least) in casual reading of novels and the like; I'm fine there at ~400-500 WPM. I do skim through reports, but that mostly happens when I already roughly know the content, so higher speeds may be achievable simply because I know what to expect. I think for me it's mostly prior knowledge that makes a difference of 100-200 WPM.
I've found that my impression of CtrlP's speed varies dramatically depending on the project that I'm working on. For projects such as Bochs [0] (~750 files in CtrlP), I get annoyed at the initial .5-sec wait [1], but after that it's extremely fast. I don't have a particularly fast processor, either.
I find CtrlP breaks down with large projects such as the Linux kernel. While it's fast to find the initial list of files (2 seconds or so) it's extremely slow to actually search through said list, about .5-sec per keystroke.
[0] I'm not a maintainer, I just happen to be using its wonderful instrumentation hooks in a research project.
[1] It's about half a second on my laptop, which has an SSD. This time is dramatically larger if you have a spinning disk, more like 10-20 seconds IIRC.
Regarding the company in question -- my own best guess would be one that does static code analysis -- Coverity [0] comes to mind, for example. I can't really think of any other companies that would benefit greatly from having a large pool of employees who all know the C++ standard intimately. Except of course for those that write compilers, but as you say, there aren't many of those around.
I can't find it now, but they published a paper saying that "there is no such thing as C", in which they discuss the widely varying implementations and expectations of C -- different enough that special flags had to be added to their tool to get weird programs to pass.
> An updated version of LAN Manager known as NTLM was introduced with Windows NT 3.1. It lowered the susceptibility of Windows passwords to rainbow table attacks, but didn't eliminate the risk. To this day, the authentication system still doesn't apply cryptographic "salt" to passwords to render such attacks infeasible.
Is this true? Do WinNT-derived systems _still_ not use password salting? If that is the case, my opinion of Windows's security just dropped quite a bit. I know security can exist without salting, and I've not done any in-depth Windows development, but from experience w/backend dev and Linux dev it seems like it'd be pretty cheap to add . . .
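To be clear about what I mean by "cheap": this is not how NT stores credentials, just a generic Python sketch of salted, slowed-down hashing. A unique salt per account means a single precomputed (rainbow) table no longer cracks every account at once.

```python
# Generic salted password storage sketch (not Windows' mechanism): a random
# per-account salt plus a deliberately slow KDF makes precomputed tables useless.
import os, hmac, hashlib

def store_password(password: str):
    salt = os.urandom(16)                       # unique per account
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return salt, digest

def check_password(password: str, salt: bytes, digest: bytes) -> bool:
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)
    return hmac.compare_digest(candidate, digest)  # constant-time comparison
```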
The problem wasn't the difficulty of adding a better hash algorithm. The lousy hash was kept around to maintain compatibility with network clients that didn't support the new algorithms.