Quote:
How can it be explained at the implementation level why it's faster to iterate over contiguous values?
It comes down to a few things. Remember that memory in a modern system sits behind several levels of cache. When the processor reads a value from memory, it doesn't necessarily fetch just that single value; it usually fetches a whole cache line. Intel processors, for example, read a 64-byte line and cache it, so if the next request falls in the same line it can be served from inside the processor, where access is far faster.
So if your loop accesses sequential addresses, there is a good chance the data is already in the cache: with 4-byte ints in a 64-byte line, 15 out of 16 sequential int reads will hit.
(Then there is memory width: when reading a 32-bit integer on a 64-bit system, two sequential int accesses can be, and indeed are, satisfied by a single transfer, because the data bus always reads 64 bits at a time. You can ignore this for all practical purposes in your own code; the compiler won't!)
So if one loop misses the cache on every access and the other hits 15 times out of 16, you can guess which one will be quicker!