When I allocate large arrays with NumPy, it appears as though some kind of lazy allocation is taking place, which I do not understand.
If I do
import numpy as np
a = np.empty(10**9)
while watching the memory usage of the system (e.g. via htop), nothing happens. This allocates a billion 8-byte floats, so I would expect about 8 GB of additional memory to be used. Also, the operation only takes a few milliseconds.
If I now do
a[:] = 0
the memory jumps up to what is expected.
One might think that np.empty() is somehow clever. However, the same behavior is seen if I instead do
b = np.zeros(10**9)
Again, the memory does not seem to be allocated until I do e.g.
b[:] = 0
which ought to be a no-op. I can even loop over all elements without the memory going up.
Lastly, the same behavior is not seen with np.ones(). Here the memory is consumed on creation, which now takes about a second.
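In case it helps to reproduce this without htop, here is a minimal sketch of how the readings could be taken from inside Python. It assumes Linux (it reads the resident set size from /proc/self/statm), and the helper names rss_mib and measure are made up for illustration; they are not part of NumPy.

```python
import os
import numpy as np


def rss_mib():
    """Resident set size of this process in MiB (Linux only, via /proc)."""
    with open("/proc/self/statm") as f:
        # Second field of statm is the number of resident pages.
        resident_pages = int(f.read().split()[1])
    return resident_pages * os.sysconf("SC_PAGE_SIZE") / 2**20


def measure(constructor, n=10**9):
    """Return RSS in MiB before creation, after creation, and after writing.

    `constructor` is expected to be np.empty, np.zeros or np.ones.
    """
    before = rss_mib()
    arr = constructor(n)      # create the array without touching it
    after_create = rss_mib()
    arr[:] = 0                # write to every element
    after_write = rss_mib()
    return before, after_create, after_write
```

Calling measure(np.empty), measure(np.zeros) and measure(np.ones), each in a fresh interpreter session, should give the three readings for each case. I have not included numbers here, since they will depend on the system.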
What is going on?