In this program there is a matrix of 128 x 128 integers. In the first step we set every element to 1; in the second step we set every element to 2. If we compare the times needed for these steps, we see that step 2 is much faster than step 1 once we increase the size of the matrix. The difference starts at 512 x 512 and becomes more significant as we go to 1024, 2000, ... How can this be explained?

Java
public class PagingTest 
{
    private static int size = 128;          // 128, 1024, 8192, ...
    private static long start, stop;
    private static int[][] data;

    public static void main (String[] args) 
    {
        if(args.length > 0)
        {
            size = Integer.parseInt(args[0]);
        }

        data = new int[size][size];
        System.out.println("Updating " + size + " x " + size + " array ... ");

        start = System.nanoTime();

        // Step 1: column-order traversal. The inner loop varies the first
        // index, so each access jumps to a different row array.
        for(int j = 0; j < size; j++)
        {  
            for(int i = 0; i < size; i++)
            {  
                data[i][j] = 1;
            }
        }
        stop = System.nanoTime();
        System.out.println("  step 1: " + (stop - start) + " nanoseconds");
     
        start = System.nanoTime();

        // Step 2: row-order traversal. The inner loop varies the second
        // index, walking along one contiguous row array.
        for(int i = 0; i < size; i++)
        {  
            for(int j = 0; j < size; j++)
            {  
                data[i][j] = 2;
            }
        }
        stop = System.nanoTime();
        System.out.println("  step 2: " + (stop - start) + " nanoseconds");
    }
}


What I have tried:

I have tried running the program for different matrix sizes to compare the output, doubling the row and column length each time. I know it has nothing to do with the cache. Maybe it has something to do with the TLB?
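To put rough numbers on the TLB idea, here is a back-of-the-envelope sketch (my own illustration, assuming a typical 4 KB page size and a 1024 x 1024 matrix; the class name and constants are mine, not from the program above). At that size, each row of ints fills exactly one page, so a column-order pass touches a new page on every single access:

```java
public class TlbReach {
    public static void main(String[] args) {
        int size = 1024;           // assumed matrix dimension for illustration
        int bytesPerInt = 4;
        int pageSize = 4096;       // typical x86-64 page size (assumption)

        // One row of int[size] occupies size * 4 bytes.
        int bytesPerRow = size * bytesPerInt;   // 4096: exactly one page

        // Column-order traversal hits a different row (hence a different
        // page) on every access, so one pass down a column needs 'size'
        // page translations live at once.
        int pagesTouchedPerColumn = bytesPerRow >= pageSize ? size : -1;

        // Data TLBs typically hold on the order of tens to a few thousand
        // entries, far fewer than 1024, so column order can thrash the TLB.
        System.out.println("bytes per row: " + bytesPerRow);
        System.out.println("pages touched per column pass: " + pagesTouchedPerColumn);
    }
}
```

This would explain why the gap only opens up at larger sizes: for a 128 x 128 matrix, a whole column's worth of rows still fits in a handful of pages.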
Updated 12-Dec-19 2:26am

It could be because you are not iterating the same way in both cases: in the first case, the outer loop runs over the second dimension and the inner loop over the first; in the second case, it is the other way around.
It is possible that the second case is faster because it iterates over contiguous values in memory.
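To see why only one order is contiguous in Java: a `new int[size][size]` is an array of independent row arrays, so `data[i][j]` and `data[i][j+1]` are neighbours in memory, while `data[i][j]` and `data[i+1][j]` live in different objects entirely. A small sketch of my own (class name and values are illustrative):

```java
public class RowLayout {
    public static void main(String[] args) {
        int size = 4;
        int[][] data = new int[size][size];

        // Row-order fill: the inner index j walks one contiguous int[] row.
        for (int i = 0; i < size; i++)
            for (int j = 0; j < size; j++)
                data[i][j] = i * size + j;

        // Each row is a separate object: swapping two row references is
        // legal and cheap, which is only possible because a 2D Java array
        // is a row of pointers to contiguous 1D arrays, not one flat block.
        int[] tmp = data[0];
        data[0] = data[1];
        data[1] = tmp;

        System.out.println(data[0][0]); // prints 4, the old start of row 1
    }
}
```

So the inner loop of step 2 streams through one such contiguous row, while the inner loop of step 1 hops between row objects on every iteration.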
Comments
Member 14654526 12-Dec-19 8:05am    
How can this be explained at the implementation level: why is it faster to iterate over contiguous values?
phil.o 12-Dec-19 8:18am    
Caching, maybe? Jumping back and forth across an array of in-memory values renders caching useless.
Member 14654526 12-Dec-19 8:51am    
Our professor has already stated it’s definitely not caching.
Quote:
How can this be explained at the implementation level: why is it faster to iterate over contiguous values?

It comes down to a number of things. Remember that memory in a modern system sits behind several levels of cache: when the processor reads a value from memory, it doesn't fetch a single value, it fetches a whole cache line. Generally speaking, Intel processors read 64-byte lines and cache them, so if the next request falls in the same line, it can be served from inside the processor, which is far faster.

So if your loop is accessing sequential addresses, there is a good chance the data is already in the cache (15 out of 16 int reads will hit).

(Then there is memory width: the data bus on a 64-bit system reads 64 bits at a time, so reading one 32-bit integer actually fetches two sequential ints at once, but you can ignore this for all practical purposes in your apps. The compiler won't!)

So if one loop misses the cache nearly every time, and the other hits it 15 times out of 16, I think you can guess which is going to be quicker!
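The 15-out-of-16 figure follows directly from the line size: 64 bytes per cache line divided by 4 bytes per int gives 16 ints per line, so one miss loads the line and the next 15 sequential reads hit it. A tiny sketch of that arithmetic (class name is mine):

```java
public class CacheLineHits {
    public static void main(String[] args) {
        int cacheLineBytes = 64;   // typical Intel line size, per the answer
        int intBytes = 4;

        // How many ints fit in one cache line.
        int intsPerLine = cacheLineBytes / intBytes;   // 16

        // Sequential access: the first read of a line misses and loads it,
        // the remaining reads of that line all hit.
        int hitsPerLine = intsPerLine - 1;             // 15

        System.out.println(intsPerLine + " ints per line, "
                + hitsPerLine + "/" + intsPerLine + " hits");
    }
}
```

By contrast, the column-order loop strides `size * 4` bytes between accesses, so for any size above 16 it lands on a different line every time and gets no reuse at all.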

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)
