I agree with point (1), wholeheartedly - by default the syntax is ugly unless yo...

quotemstr · on April 6, 2020

Regarding #2: whether you use the same buffer or a new buffer, you still have to actually write to the memory. The bottleneck at numpy scale isn't the memory allocation (mmap or whatever) but the actual memory writes (saturating the memory bus), and you need to perform the same number of writes no matter which destination array you use.

a_t48 · on April 6, 2020

It should still be a perf gain:

1. Memory allocation isn't free 2. Doing multiple loops over the buffer (once per operation) is going to be slower than doing all the operations at once, both because of caching and the opportunity to do the operations on a register and write at the end (though who knows if the python interpreter will do that).

quotemstr · on April 6, 2020

The Python interpreter is too stupid to do operation coalescing. That's the whole point of my initial comment.