De-referencing pointers in the fastest way

Question

Hi,
Need your valuable suggestions.

C++

struct node
{
   char name[20];
};

int main(void)
{
   struct node n1;
   struct node* ptr;

   strcpy(n1.name, "HelloWorld!");

   ptr = $n1;  

   printf("%s\n", ptr->name);  // is this the fastest //
   printf("%s\n", n1.name);    // is this the fastest //
   return 0;
}

Any other scenarios would be much appreciated.
Are there any good books on performance tuning in C, covering pointers and related topics?
:)

Posted 10-Oct-12 21:50pm

AshakiranBhatter

Updated 10-Oct-12 22:23pm

Jochen Arndt

v2

Comments

Jochen Arndt 11-Oct-12 4:25am

I added tags to the code for better readability.
You should also edit your question using the green 'Improve question' link to change the wrong line 'ptr = $n1' to 'ptr = &n1'.

pasztorpisti 11-Oct-12 4:33am

This is not pointer dereferencing, and without optimizing the code this is faster: n1.name. Pointer dereferencing often can't be optimized. However, such a thing is rarely the subject of optimization. The source of performance bottlenecks in the real world is always something else.

3 solutions

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Jochen Arndt · Answer 1 · 2012-10-10T22:35:00

Solution 1

To see the generated code for yourself, let the compiler produce assembly output files. With Visual C++, set the output option in the C/C++ project settings (command-line switches /FA and /Fa). With gcc on Linux, use the -S command-line option.
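
A minimal sketch of a file that could be fed to the compiler this way. The file name access.c and the two function names are made up for illustration; n1 is placed at file scope here (unlike in the question) so that the direct access has a fixed address to compare against.

```c
/* access.c (hypothetical file name): two tiny functions to compare.
 *
 * gcc on Linux:  gcc -S -O2 access.c    (writes access.s)
 * Visual C++:    cl /c /FA access.c     (writes access.asm)
 */

struct node {
    char name[20];
};

/* File-scope object, so the direct access refers to a fixed address. */
struct node n1;

/* Direct member access on a known object. */
const char *name_direct(void)
{
    return n1.name;
}

/* Access through a pointer supplied by the caller. */
const char *name_via_pointer(const struct node *ptr)
{
    return ptr->name;
}
```

Comparing the listings for the two functions, with and without optimization, shows what the compiler actually does with each form on a given platform.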

Posted 10-Oct-12 22:35pm

Jochen Arndt

enhzflep · Answer 2 · 2012-10-10T23:33:00

As pasztorpisti said, this is not an area you should bother looking at for optimizations - meaningful savings are (generally) made by changing the approach to a problem, _not_ by changing the syntax you use to access a particular piece of data.

Sometimes cache locality can play a part, which can dictate how far a loop can be unrolled before the benefit of doing so is lost.
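
As a rough illustration of cache locality only (not of unrolling), here is a sketch that sums the same array in two traversal orders. The array size and all names are assumptions chosen for the example; how large the timing gap is, if any, depends on the machine and compiler.

```c
#define ROWS 512
#define COLS 512

static int grid[ROWS][COLS];

/* Fill the grid with a simple pattern so both sums can be checked. */
void fill_grid(void)
{
    int r, c;
    for (r = 0; r < ROWS; r++)
        for (c = 0; c < COLS; c++)
            grid[r][c] = r + c;
}

/* Inner loop walks along a row: consecutive addresses in memory. */
long sum_row_major(void)
{
    long total = 0;
    int r, c;
    for (r = 0; r < ROWS; r++)
        for (c = 0; c < COLS; c++)
            total += grid[r][c];
    return total;
}

/* Inner loop walks down a column: each step skips a whole row. */
long sum_col_major(void)
{
    long total = 0;
    int r, c;
    for (c = 0; c < COLS; c++)
        for (r = 0; r < ROWS; r++)
            total += grid[r][c];
    return total;
}
```

Both functions return the same sum; only the order in which memory is touched differs, which is the kind of change that tends to matter more than the access syntax.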

That said, they're not quite the same, as can be shown with a small snippet of code.

First, the results:

Run 1:
Time elapsed to complete 500000000 iterations: 3.561095
Time elapsed to complete 500000000 iterations: 3.797974
Dif = 0.236879s

Run 2:
Time elapsed to complete 500000000 iterations: 3.546631
Time elapsed to complete 500000000 iterations: 4.148972
Dif = 0.602341s

Run 3:
Time elapsed to complete 500000000 iterations: 2.312600
Time elapsed to complete 500000000 iterations: 2.926228
Dif = 0.613628s

Run 4:
Time elapsed to complete 500000000 iterations: 3.227201
Time elapsed to complete 500000000 iterations: 3.392681
Dif = 0.16548s

Next, the code:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <windows.h>

LARGE_INTEGER clockFreq;
LARGE_INTEGER tStart, tEnd;

struct node
{
   char name[20];
};

LARGE_INTEGER getTicks()
{
    LARGE_INTEGER result;
    QueryPerformanceCounter(&result);
    return result;
}

double elapsedSecs(LARGE_INTEGER tStart, LARGE_INTEGER tEnd)
{
    /* QuadPart is 64 bits wide; long is only 32 bits on Windows */
    LONGLONG ticksElapsed = tEnd.QuadPart - tStart.QuadPart;
    double timePeriod = (double)ticksElapsed / (double)clockFreq.QuadPart;
    return timePeriod;
}


int main(void)
{
    struct node n1;
    struct node* ptr;
    long i, max = 500000000;
    char *tmp;  /* never read afterwards, so an optimizing build may remove both loops */

    QueryPerformanceFrequency(&clockFreq);

    strcpy(n1.name, "HelloWorld!");

    ptr = &n1;

    tStart = getTicks();
    for (i=0; i<max; i++)
    {
        tmp = ptr->name;
    }
    tEnd = getTicks();
    printf("Time elapsed to complete %ld iterations: %f\n", max, elapsedSecs(tStart, tEnd));


    tStart = getTicks();
    for (i=0; i<max; i++)
    {
        tmp = n1.name;
    }
    tEnd = getTicks();
    printf("Time elapsed to complete %ld iterations: %f\n", max, elapsedSecs(tStart, tEnd));

    printf("%s\n", ptr->name);  // is this the fastest - yup! //
    printf("%s\n", n1.name);    // is this the fastest //

   return 0;
}

Finally, the qualification and testing environment:

While we can see that the times taken vary considerably over the course of the 4 runs, in each and every case the ptr->name access is faster. But before you celebrate, do realize that in each case we're doing 500 million iterations, and the time difference is only on the order of 15% of the total time (roughly 5% to 27% of the first loop's time across the four runs). That works out to at most about 1.2 ns per iteration.

So, the importance of saving 15% of next to nothing is.... you guessed it - even closer to nothing!
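
To put "next to nothing" into numbers, the measured differences can be divided by the iteration count. A small sketch, assuming only the four Dif values and the 500000000 iterations reported above; the function names are made up for illustration.

```c
#include <stdio.h>

/* Convert a total time difference in seconds to nanoseconds per iteration. */
double per_iteration_ns(double diff_secs, long long iterations)
{
    return diff_secs / (double)iterations * 1e9;
}

/* Print the per-iteration difference for the four reported runs. */
void print_per_iteration(void)
{
    const double diffs[] = { 0.236879, 0.602341, 0.613628, 0.165480 };
    const long long iterations = 500000000LL;
    size_t i;

    for (i = 0; i < sizeof diffs / sizeof diffs[0]; i++) {
        printf("Run %u: %.2f ns per iteration\n",
               (unsigned)(i + 1), per_iteration_ns(diffs[i], iterations));
    }
}
```

Each run comes out at somewhere between a fraction of a nanosecond and a little over one nanosecond per access.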

Windows task-switching (which is beyond your control) has a far, far greater effect than the access method.
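
One common way to reduce that kind of scheduling noise is to repeat a measurement several times and keep the shortest time. The sketch below is not the code that produced the results above; it uses the standard clock() function instead of QueryPerformanceCounter, and best_of, loop_direct and sink are hypothetical names.

```c
#include <time.h>

struct node {
    char name[20];
};

static struct node n1;

/* volatile pointer, so the stores in the loop are not optimized away */
static char *volatile sink;

/* Example workload: repeated direct access to n1.name. */
void loop_direct(void)
{
    long i;
    for (i = 0; i < 1000000L; i++)
        sink = n1.name;
}

/* Run fn() `runs` times and return the shortest time reported by clock(),
 * in seconds. Returns -1.0 if runs is not positive. */
double best_of(int runs, void (*fn)(void))
{
    double best = -1.0;
    int r;

    for (r = 0; r < runs; r++) {
        clock_t start = clock();
        fn();
        clock_t end = clock();
        double secs = (double)(end - start) / CLOCKS_PER_SEC;

        if (best < 0.0 || secs < best)
            best = secs;
    }
    return best;
}
```

A call such as best_of(10, loop_direct) gives a figure that is less affected by whatever else the machine happened to be doing during any single run.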

Test machine: i3 M220 @ 2.13 GHz, 4GB, Windows 7 Home Premium

CPallini · Answer 3 · 2012-10-10T22:43:00

Solution 2

Of course there is no difference.
Remember: above all, your program should be correct (for instance, ptr = $n1; is a syntax error; it should be ptr = &n1;). Then you may look for a clever algorithm and let the compiler optimize for you. Only then, if you really need to, optimize by hand.
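
A small sketch of the corrected assignment, wrapped in a hypothetical helper named same_address. It only demonstrates that both expressions designate the same array; it makes no claim about what code a particular compiler generates for each.

```c
#include <string.h>

struct node {
    char name[20];
};

/* Returns 1 when ptr->name and n1.name refer to the same array. */
int same_address(void)
{
    struct node n1;
    struct node *ptr;

    strcpy(n1.name, "HelloWorld!");
    ptr = &n1;  /* '&' takes the address; '$' is not a C operator */

    return ptr->name == n1.name;
}
```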

Posted 10-Oct-12 22:43pm

CPallini

v2