Overall, I must optimize my code for size because I'm targeting IoT devices, which have very little SRAM to execute in and small program flash, but there are some situations where I need trivial functions to be aggressively inlined.

Unfortunately, when optimizing for size, GCC seems to take over the inlining decision entirely, which causes major performance problems in certain code paths.

In order to get around this, rather than declaring an inline function, I use preprocessor macros with the function's implementation inside of them, and use those instead of a function call when I need to do something.

That leads to methods that look like this:

C++
virtual void address_window(
        uint16_t x1, 
        uint16_t y1, 
        uint16_t x2, 
        uint16_t y2) {
    HTCW_RS_C; HTCW_WRITE8(ca_set);
    HTCW_RS_D; HTCW_WRITE16(x1); HTCW_WRITE16(x2);
    HTCW_RS_C; HTCW_WRITE8(ra_set);
    HTCW_RS_D; HTCW_WRITE16(y1); HTCW_WRITE16(y2);
    HTCW_RS_C; HTCW_WRITE8(ram_wr);
    HTCW_RS_C;
}


All of the HTCW_XXXX identifiers are preprocessor macros.
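For readers unfamiliar with the pattern, macros of this kind might be defined roughly as follows. This is only a hypothetical sketch, not the real HTCW_ definitions: the `DEMO_` names, the pin numbers, and the `demo_gpio_out` variable standing in for a memory-mapped register are all assumptions made so the example compiles on a desktop.

```cpp
#include <cstdint>

// Hypothetical stand-in for a memory-mapped GPIO output register,
// so the sketch can be compiled and run on a desktop.
static volatile uint32_t demo_gpio_out = 0;
static unsigned demo_bytes_written = 0;

// Hypothetical pin assignments (not taken from the real driver).
#define DEMO_PIN_RS 8u  // register select: low = command, high = data
#define DEMO_PIN_WR 9u  // write strobe

// do { ... } while (0) lets each macro be used like a single statement.
#define DEMO_RS_C do { demo_gpio_out = demo_gpio_out & ~(1u << DEMO_PIN_RS); } while (0)
#define DEMO_RS_D do { demo_gpio_out = demo_gpio_out |  (1u << DEMO_PIN_RS); } while (0)
#define DEMO_WRITE8(v) do { \
        demo_gpio_out = (demo_gpio_out & ~0xFFu) | (static_cast<uint32_t>(v) & 0xFFu); \
        demo_gpio_out = demo_gpio_out & ~(1u << DEMO_PIN_WR); /* strobe low */ \
        demo_gpio_out = demo_gpio_out |  (1u << DEMO_PIN_WR); /* rising edge latches the byte */ \
        ++demo_bytes_written; \
    } while (0)
#define DEMO_WRITE16(v) do { DEMO_WRITE8((v) >> 8); DEMO_WRITE8(v); } while (0)

// The macro bodies expand in place at every use, so no call is ever emitted.
void demo_send(uint8_t command, uint16_t argument) {
    DEMO_RS_C; DEMO_WRITE8(command);
    DEMO_RS_D; DEMO_WRITE16(argument);
}
```

The expansion is guaranteed, which is the whole appeal, but the cost is the usual macro baggage: no type checking, arguments that may be evaluated more than once, and names that ignore scope.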

I'm one of those weirdos who thinks that in an ideal world, the preprocessor should be totally unnecessary in C++. You should be able to do anything you can do with the preprocessor using const/constexpr/inline/template. In theory, you can, but because "inline" is a suggestion at best, in practice I'm still stuck with the preprocessor ugliness.

Does anyone know of a better way?

What I have tried:

I've tried declaring my functions inline and even only putting them in the header, and GCC nevertheless will sometimes create a call for it.
Updated 7-Dec-21 2:51am

Have you tried the always_inline attribute? See the attribute specifier documentation at cppreference.com. The attribute syntax requires C++11, so it will hopefully be available in your G++ toolchain.
Also see Optimize Options (Using the GNU Compiler Collection (GCC)) for other G++ optimization flags that might affect this.
Alternatively, have you looked into clang? Does it do things any differently?
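As a minimal sketch of what that might look like, assuming GCC (or Clang, which accepts the same attribute): the `demo_` names and the `demo_port` variable standing in for a hardware register are hypothetical and not from the original driver.

```cpp
#include <cstdint>

// Hypothetical stand-in for a GPIO output register.
static volatile uint32_t demo_port = 0;

// GNU spelling: the attribute asks GCC to inline the function regardless
// of the optimization level, including -Os.
__attribute__((always_inline)) static inline void demo_write8(uint8_t value) {
    demo_port = (demo_port & ~0xFFu) | value;
}

// The same request written in C++11 attribute syntax.
[[gnu::always_inline]] static inline void demo_write16(uint16_t value) {
    demo_write8(static_cast<uint8_t>(value >> 8));
    demo_write8(static_cast<uint8_t>(value & 0xFFu));
}

// Both helpers should be expanded in place here, with no call emitted.
uint32_t demo_send_pixel(uint16_t value) {
    demo_write16(value);
    return demo_port & 0xFFu;
}
```

Unlike plain `inline`, GCC documents a failure to inline an always_inline function as an error, so a helper that cannot be inlined shows up at build time rather than as a silent slowdown.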

As an aside, it seems to me like optimize for size and "always inline" are opposing goals. If you're trying to reduce size, then it would seem that you don't want to inline, which will increase size.

Depending on the makeup of the project, perhaps you could designate compilation units as "Optimize for size" vs "Optimize for speed". Though I have to confess I have no idea how you might make that work. I might be able to do that with a Makefile, but since you have expressed a preference for Visual Studio elsewhere, you would have to figure that out for yourself.
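One possible way to mix the two goals inside a single file, rather than per compilation unit, is GCC's function-specific option pragmas. The following is a minimal sketch assuming GCC; the function names are hypothetical, and the pragmas are guarded so that other compilers simply skip them. Note that GCC's documentation cautions that the related `optimize` function attribute is intended for debugging rather than production code, so treat this as something to measure, not a guaranteed fix.

```cpp
#include <cstdint>

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC push_options
#pragma GCC optimize ("O2")  // applies to functions defined until pop_options
#endif

// Hypothetical hot path: compiled for speed even if the file is built with -Os.
uint32_t demo_hot_checksum(const uint8_t* data, uint32_t length) {
    uint32_t sum = 0;
    for (uint32_t i = 0; i < length; ++i) {
        sum = sum * 31u + data[i];
    }
    return sum;
}

#if defined(__GNUC__) && !defined(__clang__)
#pragma GCC pop_options
#endif

// Hypothetical cold path: compiled with whatever the command line says (e.g. -Os).
uint32_t demo_cold_setup(uint32_t seed) {
    return seed ^ 0xA5A5A5A5u;
}
```

The per-file alternative is to pass a different optimization flag when compiling the hot-path source files, which is a build-system setting rather than a source change.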
 
Comments
CPallini 6-Dec-21 16:31pm    
5.
k5054 6-Dec-21 17:32pm    
Further reading of Common Function Attributes (Using the GNU Compiler Collection (GCC)) suggests that the attribute "flatten" might get you closer to what you want. From the docs:
Quote: flatten
Generally, inlining into a function is limited. For a function marked with this attribute, every call inside this function is inlined, if possible. Functions declared with attribute noinline and similar are not inlined. Whether the function itself is considered for inlining depends on its size and the current inlining parameters.

I read that as the flatten attribute applying to all function calls within the annotated function, not to the function itself. So perhaps a combination of flatten for the caller and always_inline for the called functions gets you there. Or at least closer.
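A minimal sketch of that combination, assuming GCC (Clang also accepts both attributes): the `demo_` names and the `demo_bus` variable standing in for the hardware bus are hypothetical.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the hardware bus and a write counter.
static volatile uint32_t demo_bus = 0;
static unsigned demo_strobes = 0;

// Innermost helper: always_inline asks GCC to expand it at every call site.
__attribute__((always_inline)) static inline void demo_put8(uint8_t v) {
    demo_bus = v;
    ++demo_strobes;
}

// Ordinary inline helper: on its own, GCC may still emit a call to it under -Os.
static inline void demo_put16(uint16_t v) {
    demo_put8(static_cast<uint8_t>(v >> 8));
    demo_put8(static_cast<uint8_t>(v & 0xFFu));
}

// flatten asks GCC to inline every call made inside this function, if possible.
// It does not affect whether demo_address_window itself gets inlined elsewhere.
__attribute__((flatten))
void demo_address_window(uint16_t x1, uint16_t y1, uint16_t x2, uint16_t y2) {
    demo_put16(x1); demo_put16(x2);
    demo_put16(y1); demo_put16(y2);
}
```

With this arrangement the annotation lives on the one hot caller, so the helpers can stay ordinary functions everywhere else and only cost extra flash where the speed matters.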
honey the codewitch 6-Dec-21 18:13pm    
That could work, so I'm accepting it, since I wasn't familiar with the attribute prior. I'll give it a shot. I have to optimize for size. I'm not even sure how my toolchain would react otherwise, I just can't afford the extra space, particularly on STM32 boards. I can't use clang because my toolchain relies on GCC and I don't have any control over that. Actually all my IoT toolchains require GCC at the moment.

I do like to compile with both clang and GCC (when I'm not doing IoT-only) but GCC is my primary target.

I've never tried this myself for this purpose, but have you considered wrapping the code in question in a function template? If I'm not mistaken, that should force it to be inline. E.g.:
C++
template <typename T>
void address_window_internal(T x1, T x2, T y1, T y2) {
    // do stuff
}
or, if you prefer using the macro names you already introduced:
C++
template <typename T>
void HTCW_RS_C(T& x1, T& x2, T& y1, T& y2) {
    // do stuff
}


P.S.: a quick web search resulted in different opinions, but there appears to be a consensus that template member functions are probably always treated as inline. So maybe that could work.
 
Comments
honey the codewitch 7-Dec-21 9:10am    
Hmmm... It seems a bit sloppy to me because I don't like the idea of introducing arbitrary template arguments just to templatize a function. The code smell, you understand. If I come along in 6 months and every function is templatized with dummy arguments, I don't know what I'd do. I think I'll stick with the __attribute__((always_inline)) route suggested before, because I'm targeting GCC specifically.
Stefan_Lang 7-Dec-21 9:42am    
I totally understand. Personally I wouldn't do this myself unless I am really desperate to optimize my code, and have proven that this will fulfil the requirements (and then I'd make sure to document it very clearly!).

I only suggested this because some time ago I had to work with code from someone who was crazy about premature optimizations, typically using inline, but also extending templated code further than needed. (Not saying your optimization is premature; that was just the context that made me think of templates being inline.)
honey the codewitch 7-Dec-21 11:10am    
Totally. This code flips physical digital pins on a chip high and low, which it needs to do fast and precisely timed, because it's a driver for an 8-bit parallel bus attached to a (usually color) LCD or TFT display. The frame rates I can achieve with graphics draws are directly tied to this code.

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


