Julia does vectors well, but scalar computations are very important. The problem with vector computations:
z = x*(v+w)
This involves one memory allocation for v+w, and a second for the product of x with that result. One way to avoid this in numpy is careful use of out:
z = zeros(shape=x.shape, dtype=float)
add(v,w,out=z)
multiply(x,z,out=z)
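Filled out as a runnable sketch (the input arrays and their values are mine, for illustration), the out= version computes the same thing as the naive expression while reusing one output buffer:

```python
import numpy as np

# Example inputs (names and sizes assumed for illustration).
x = np.array([1.0, 2.0, 3.0])
v = np.array([4.0, 5.0, 6.0])
w = np.array([7.0, 8.0, 9.0])

# Naive version: allocates a temporary for v+w, then another for the product.
z_naive = x * (v + w)

# out= version: one pre-allocated buffer, written in place twice.
z = np.zeros(shape=x.shape, dtype=float)
np.add(v, w, out=z)       # z <- v + w
np.multiply(x, z, out=z)  # z <- x * z

assert np.array_equal(z, z_naive)
```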
An even faster way, which only requires traversing the arrays once, and is more readable in my opinion, is a simple for loop:
for i = 1:length(x)
    z[i] = x[i]*(v[i]+w[i])
end
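The same single-pass shape, written in Python/numpy terms just to show the structure (in CPython this explicit loop is slow, which is exactly why Julia making it fast matters; array names are assumed as above):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0])  # example data
v = np.array([4.0, 5.0, 6.0])
w = np.array([7.0, 8.0, 9.0])

z = np.empty_like(x)
for i in range(len(x)):
    # One traversal: read x[i], v[i], w[i], write z[i]; no temporary arrays.
    z[i] = x[i] * (v[i] + w[i])

assert np.array_equal(z, x * (v + w))
```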
This also lets you more easily put if statements inside loops, etc. You can also do accumulative calculations without creating any intermediate arrays at all. For example, one way to create a random walk is:
walk = cumsum(bernoulli(p).rvs(1024))
end_result = walk[-1]
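A runnable version of the cumsum approach (using numpy.random's binomial in place of scipy.stats' bernoulli(p).rvs to keep it self-contained; p and the seed are my choices). Note the whole 1024-element walk is materialized even though only the last value is used:

```python
import numpy as np

rng = np.random.default_rng(0)
p = 0.5
steps = rng.binomial(1, p, size=1024)  # 1024 Bernoulli(p) draws, an intermediate array

walk = np.cumsum(steps)  # running count of successes: another 1024-element array
end_result = walk[-1]    # only this last value is actually needed

assert end_result == steps.sum()
```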
Another way is:
end_result = 0
for i = 1:1024
    if rand() < p
        end_result += 1
    end
end
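The loop version, sketched in Python so it can be checked against the vectorized one on identical draws (pre-drawing the uniforms into an array is my scaffolding for the comparison; the Julia loop above needs no array at all):

```python
import numpy as np

rng = np.random.default_rng(42)
p = 0.5
u = rng.random(1024)  # pre-draw the uniforms so both versions see the same randomness

# Accumulate a single scalar; no cumsum array is ever built.
end_result = 0
for i in range(1024):
    if u[i] < p:
        end_result += 1

# Agrees with the vectorized count on the same draws.
assert end_result == int((u < p).sum())
```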
This is not a problem with vector computations, but with the particular implementation. The vector expression in your example is by far the most readable: it is literally the math you had in mind.
De-vectorizing code is like embedding ASM: you had to write it out because the compiler sucks. Good language design should favor lucid, concise syntax, and a good compiler implementation should make it unnecessary to circumvent that syntax for performance. In this case, the compiler should be implementing those vector expressions without allocating unnecessary memory.
I've sort of been assuming that Julia is capable of inlining the vectorized notation into a single traversal over the array, at least for simple types. Is that not true?