I have noticed that when I call a function multiple times with the same arguments, all of them immutable and in the same scope, the compiler does not reuse the result of the first call for the subsequent calls. The function neither changes nor depends on any outside state. The context is completely “pure and immutable”, at least as much as F# allows; I’m aware that it permits impurity and almost everything that can be done with .NET. I created this simple code example to demonstrate what I mean:

F#
let rec pureFib n : int64 =
    if n = 0 then
        1L
    elif n = 1 then
        1L
    else pureFib(n - 2) + pureFib(n - 1)
 
let shouldBeCalledOnce n =
    let r1 = pureFib n

    let r2 = pureFib n
    //let r2 = r1  //uncomment this line and comment the previous one and you will get almost a 2x speed boost

    r1 + r2


On my computer, calling shouldBeCalledOnce with n > 42 takes long enough to notice the difference. You can also use this entry point to test it in a console application.

F#
[<EntryPoint>]
let main argv = 
    let crono = System.Diagnostics.Stopwatch.StartNew()
    let v = shouldBeCalledOnce 43
    printfn "%A" crono.ElapsedMilliseconds
    int v


The issue (if there really is one) is easy to reproduce, so I suppose there is no need for complex benchmarks. I compiled the code with optimize+ and without debug information, only the pdb.
If one of the key features of functional programming is that a function always returns the same value when called with the same arguments, that is something the compiler should be able to exploit to avoid multiple meaningless calls, saving time and, in some cases, improving the readability and understanding of the code.
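For clarity, here is a minimal sketch of the distinction I have in mind (these two functions are only illustrative and are not part of the example above): the first depends only on its argument, so repeated calls with the same argument could safely be collapsed into one; the second reads the system clock, so they could not.

F#
// Result depends only on the argument: repeated calls with the same n are redundant.
let square (n : int64) : int64 = n * n

// Result depends on outside state (the clock): repeated calls with the same
// argument may return different values, so reusing the first result would be wrong.
let secondsSinceMidnight (offset : float) : float =
    System.DateTime.Now.TimeOfDay.TotalSeconds + offset
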
In short, my questions are the following:
1. Does F# compiler have a way of optimizing multiple calls to pure functions?
2. For that first question to be true, it would imply that the compiler was able to know when a function is pure and when it is not. Does it?
Comments
Sergey Alexandrovich Kryukov 25-Aug-14 16:39pm    
By "pure function", do you mean a function with no side effect, only returning results? For your question to make sense, it should also required that the result should depend only on the input parameters, otherwise a call to, for example, System.DateTime.Now should not allow this optimization.
—SA
Enrique A. Casanovas Pedre 26-Aug-14 14:02pm    
Yes, that is what I mean when I say a pure function: a function whose result depends only on the input parameters. Sorry if I wasn't clear enough; I supposed I was.
Sergey Alexandrovich Kryukov 28-Aug-14 13:41pm    
Please, next time be more careful with commenting: if you had commented on any of my posts, in this case on my comment above, even indirectly (when your comment becomes an indirect child of the post you are commenting on), I would have received a notification of your post. But your comment above was posted as a comment on your own question, not on mine, so I did not get any notification and only noticed it because of your comment on my answer. It will be useful to take this into account for your future comments.

Thank you.
—SA

1 solution

Please confirm that I understand your question correctly; look at my comment to the question.

Here is my idea: at the level of the compiler, it's really hard to do this optimization. In some cases, the compiler simply does not have enough information. Suppose there were a way to figure out that a function is "pure". The compiler would have to check not only the body of the function itself, but also every function it calls, recursively. But not all of those functions come with source code. What would this optimizer do then? If all the called code is .NET code, it is possible to analyze it, but face it: ultimately, some code will always make unmanaged calls (unless the OS is something like Singularity, a purely managed OS). And the CLI standard does not have a slot in the metadata that carries information about the "purity" of a function, even if it is part of the .NET BCL.

So, those were my arguments for why I don't think such an optimization makes sense (again, please validate that I understood your idea correctly; see my comment to the question). In any case, you can easily figure out what is going on in reality: compile with optimizations, take the resulting assembly (IL code) and reverse-engineer it. This is easy to do with very good quality if you use .NET Reflector or the open-source ILSpy:
http://en.wikipedia.org/wiki/.NET_Reflector,
http://ilspy.net.

[EDIT]

Please see also my comment to the question, where I question the practical value of this optimization, given that functions are usually called with different arguments.

For those rare cases when it makes a lot of practical sense (the function is really slow to compute, and calls with the same parameters, or with no parameters, are quite likely), you can easily introduce such an optimization at the application level: create a dictionary of function results keyed by a compound key built from the function's input arguments.

Since it is not critical if you still recalculate the function result a few extra times, you could have one dictionary per function (you can hardly have many such functions) and avoid storing full information about the parameters in the key. It could be just a hash value computed from the parameter values. Say,
C#
using System.Collections.Generic;

//...

// FunctionType stands for whatever type the expensive function returns.
FunctionType MyFunction(params object[] parameters) { /* ... */ }  // parameters could be of any types

//...

Dictionary<int, FunctionType> myFunctionOptimizationDictionary =
   new Dictionary<int, FunctionType>();


FunctionType MyFunctionWrapper(params object[] parameters) {
    // Build a compound key from the argument values:
    int key = 0;
    foreach (object @object in parameters)
        key ^= @object.GetHashCode(); // for example
    FunctionType result;
    if (!myFunctionOptimizationDictionary.TryGetValue(key, out result)) {
        result = MyFunction(parameters); // only called on a cache miss; otherwise there is no call
        myFunctionOptimizationDictionary.Add(key, result);
    } //if
    return result;
}
Simple, isn't it? You can easily make this algorithm generic.
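
For example, in F# (the language of the question) the generic version could be sketched roughly as follows; this is just an illustration, assuming the argument type has sensible equality and hashing:

F#
open System.Collections.Generic

// A sketch of a generic memoizer: wraps any one-argument function in a
// dictionary lookup keyed by the argument value.
let memoize (f : 'a -> 'b) =
    let cache = Dictionary<'a, 'b>()
    fun x ->
        match cache.TryGetValue x with
        | true, value -> value
        | false, _ ->
            let value = f x
            cache.Add(x, value)
            value

// Usage with the pureFib function from the question:
// the second call with the same argument reuses the cached result.
let memoFib = memoize pureFib
let total = memoFib 43 + memoFib 43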

However, a side effect of such an optimization can be the extra memory spent on storing the "optimized" keys and values. It is always a trade-off. This fact, too, suggests that the practical value of this optimization at the CLR level would be quite questionable.

I hope we can close this issue now. What do you say?

—SA
 
Comments
Enrique A. Casanovas Pedre 26-Aug-14 14:24pm    
I understand your explanation perfectly, and that is what I thought, but I still think the compiler could tag a function as pure or not: at least being sure that if a function is tagged as pure then it really is, without any doubt (like the example), so the optimization can be done; if a function is not tagged, it is assumed not to be pure and therefore not optimized. Any function that calls an untagged function is also assumed not to be pure. I think you get what I mean. I don't know how the F# compiler is built, but that idea is anything but new and perhaps it could be implemented. Please forgive my persistence, but I just don't want to see any opportunity to improve the language wasted. Thanks for the answer, and I apologize for my late one.
Sergey Alexandrovich Kryukov 26-Aug-14 15:33pm    
Theoretically speaking, it can, but how would it deal with the existing CLR standard? This is exactly what I meant by mentioning "carry information on the 'purity' of the function". If the source code is not available, this "tag" would have to be stored in the assembly. In what form?

Also, I thought about it a bit more and concluded that this hypothetical optimization might not be as prudent as it may seem at first. This is because functions typically have arguments, and, if so, they are typically called with different sets of parameters (otherwise, why use those parameters at all?). That means you would need to recalculate the function anyway. You could still avoid some recalculations if you had a dictionary based on a compound key, say, a reference to the function combined with the parameter set; then you could skip the recalculation when the same function is called with the exact same set of parameters again. Now, think: how likely is that? How much performance would it gain, considering the relative expense of composing a key from the parameters and searching a dictionary (O(1), of course, but how fast?), compared with just recalculating? It makes the practical value of such an optimization quite questionable.

That brings me to another piece of advice: for those rare cases when it makes a lot of practical sense (the function is really slow to compute, and calls with the same parameters, or with no parameters, are quite likely), you can easily introduce such an optimization at the application level: create a dictionary of function results keyed by a compound key built from the input arguments.

I think we can wrap up our discussion: I provided some analysis and gave you two practical recipes, one for finding out how the code is actually optimized, another for performing the optimization at the application level. Will you now accept the answer formally (green "Accept" button)?

—SA
Enrique A. Casanovas Pedre 28-Aug-14 9:56am    
Sorry, I forgot to mark the question as solved from the start. My question was actually a yes/no question, and I now know for sure that the answer is no. That second analysis you gave me is overkill for my purposes; I didn't intend to make a global optimization of every function call. I have just run into cases similar to the example I showed during my life as a programmer, and thought that with a functional language it would be easier to handle them. I don't like having an extra line of code saving a result just to use it two lines below, or in the next iteration, as in this common case:
for(int i = 0; i < data.GetSize(); i++)
...
when data.GetSize() is "slow" and constant during the loop iterations, we use:
int n = data.GetSize();
for(int i = 0; i < n; i++)
...

Well, I prefer the former. I also understand that the CLI metadata wouldn't allow this kind of optimization, and maybe the effort to implement it is bigger than the gain it would produce. Thanks again; it seems I am the only one concerned/interested about this.
Sergey Alexandrovich Kryukov 28-Aug-14 13:49pm    
No problem at all. And thank you for this little code sample. Here is my conclusion:

Even though common programming wisdom teaches us not to do "manual optimization", it all depends on what is called "optimization". True, thorough optimization makes little sense and can even make things worse (at least readability, but sometimes the performance itself). At the same time, "optimization through design" is something worth understanding. As to the aspect you mentioned, there is a fundamental difference between an expression and a function call. An expression can usually be optimized; with a function call, the compiler often does not have enough information to be sure the result of the call would be the same. Therefore, manually saving the result of something like GetSize (on the same non-modified object representing the set, of course) really makes sense. Our discussion could be useful for many readers who wish to develop software in state-of-the-art style.

Again, thank you for a good interesting question, which could serve as a model for other inquirers.

—SA

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


