Posted 2 Mar 2015


Code Example of Power/Performance Optimization on Android* Using Intel® Intrinsics


This article is for our sponsors at CodeProject. These articles are intended to provide you with information on products and services that we consider useful and of value to developers.

Intel® Developer Zone offers tools and how-to information for cross-platform app development, platform and technology information, code samples, and peer expertise to help developers innovate and succeed. Join our communities for Android, Internet of Things, Intel® RealSense™ Technology and Windows to download tools, access dev kits, share ideas with like-minded developers, and participate in hackathons, contests, roadshows, and local events.


It goes without saying that battery life, especially of mobile devices, is critically important for users. We’ve all been in situations where we lose power right when we need it the most—navigating a new city, mid-conversation on an important call, and so on. It may not be completely intuitive, but by optimizing application performance, developers reduce power consumption and that helps users.

Analyzing Apps with a Combination of Intel® Graphics Performance Analyzers + VTune™ Amplifier

What is the first step to improve the power/performance of your application? You have to understand whether your app is CPU bound or GPU bound. You can find that out using a combination of Intel® tools:

Intel® Graphics Performance Analyzers or GPA is a tool for graphics analysis and optimization of Microsoft DirectX* applications and Android* OpenGL ES* applications. You can find more about it here:

For purposes of Android optimization I prefer the GPA console client. You can read about it here:

VTune™ Amplifier helps you analyze the algorithm choices and identify where and how your application can benefit from available hardware resources. Use VTune Amplifier to locate or determine the following:

  • The most time-consuming functions (hotspots) in your application and/or on the whole system
  • Sections of code that do not effectively utilize available processor time
  • The best sections of code to optimize for sequential performance and for threaded performance
  • Synchronization objects that affect the application performance
  • Whether, where, and how your application spends time on input/output operations
  • The performance impact of different synchronization methods, different numbers of threads, or different algorithms
  • Thread activity and transitions
  • Hardware-related bottlenecks in your code

Configure the data collection on the host system (Linux*, OS X*, or Windows*) and run the analysis on a remote system (Linux or Android). Remote analysis on Android and embedded Linux systems is supported only by VTune Amplifier for Systems.

You can read more here:

The figure below shows how to use a combination of GPA and VTune Amplifier to analyze and optimize your application.

What are Intel® Intrinsics?

Intel® intrinsics are assembly-coded functions that allow you to use C/C++ function calls and variables instead of assembly instructions. Intrinsics provide access to instructions that cannot be generated using the standard constructs of the C and C++ languages.

Intrinsics are expanded inline, eliminating function call overhead. Providing the same benefit as using inline assembly, intrinsics improve code readability, assist instruction scheduling, and help reduce debugging.
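As a small illustration of the idea, the sketch below adds two groups of four floats through SSE intrinsics instead of inline assembly. The helper name `add4` is hypothetical and does not come from the original article; the intrinsics it calls (`_mm_loadu_ps`, `_mm_add_ps`, `_mm_storeu_ps`) are declared in `<xmmintrin.h>`.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical helper (not from the article): out[i] = a[i] + b[i], i = 0..3.
// Each intrinsic typically compiles to one SSE instruction, with no call overhead.
inline void add4(const float* a, const float* b, float* out)
{
    assert(a && b && out);                    // all three must point to 4 floats
    const __m128 va = _mm_loadu_ps(a);        // unaligned load of 4 floats
    const __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_add_ps(va, vb));   // packed add, then unaligned store
}
```

The code reads like ordinary C++ function calls, yet the compiler remains free to schedule the resulting instructions and allocate registers itself.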

You can read more here:

How to find and connect Intel® C++ Compiler for Android* OS to your project?

Intel® C++ Compiler for Android* OS is included in the Intel® INDE suite. Intel® C++ Compiler for Android* integrates into the Android NDK and provides an optimized alternative for compiling x86 libraries.

Download and install Intel C++ Compiler for Android. Provide the path to the NDK directory during installation to integrate Intel C++ Compiler for Android into the Android NDK.

After the successful installation, the Intel® C++ Compiler for Android will be automatically integrated into the Android NDK toolchain and will compile optimized libraries for x86 architecture.


To demonstrate the usage of Intel intrinsics, let’s look at the C++ code:

float x = 1.0f / sqrtf( y );

This type of code often appears in hotspots, especially in physics algorithms.
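For context, here is a sketch of where this pattern typically shows up: normalizing a 3D vector inside a physics loop. The `Vec3` type and the `normalize` helper are hypothetical names chosen for illustration; they are not part of the original article.

```cpp
#include <cassert>
#include <math.h>  // sqrtf

// Hypothetical 3D vector type for illustration.
struct Vec3 { float x, y, z; };

// Hypothetical physics helper: scale v to unit length.
// The 1.0f / sqrtf(...) expression is the pattern discussed above.
inline Vec3 normalize(const Vec3& v)
{
    const float len2 = v.x * v.x + v.y * v.y + v.z * v.z;
    assert(len2 > 0.0f);                        // a zero vector has no direction
    const float inv_len = 1.0f / sqrtf(len2);   // reciprocal square root
    return Vec3{ v.x * inv_len, v.y * inv_len, v.z * inv_len };
}
```

When a helper like this runs for every particle or contact on every frame, the cost of the square root and the division adds up quickly.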

If you profile this line in VTune Amplifier, the results will show that the compiler generates a square root followed by a division (sqrt + div) instead of a single reciprocal square root instruction (rsqrt).

The way to fix it is using Intel intrinsics:

float x = rsqrt( y );

Where rsqrt is:



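#include <xmmintrin.h>  // SSE intrinsics
// Approximate 1/sqrt(x) via the SSE rsqrtss instruction.
inline float rsqrt(const float x)
{
    float r;
    _mm_store_ss(&r, _mm_rsqrt_ss( _mm_load_ss(&x)));
    return r;
}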
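Note that `_mm_rsqrt_ss` returns an approximation: Intel documents its maximum relative error as 1.5 × 2^-12, which is less precise than `1.0f / sqrtf(y)`. If that is not accurate enough for your algorithm, one common remedy is a single Newton-Raphson refinement step. The sketch below is an assumption about how you might do this, not part of the original article, and the name `rsqrt_refined` is hypothetical.

```cpp
#include <cassert>
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical variant of rsqrt with one Newton-Raphson refinement step:
//   r1 = r0 * (1.5 - 0.5 * x * r0 * r0)
inline float rsqrt_refined(const float x)
{
    assert(x > 0.0f);                                   // positive inputs only
    float r0;
    _mm_store_ss(&r0, _mm_rsqrt_ss(_mm_load_ss(&x)));   // hardware estimate
    return r0 * (1.5f - 0.5f * x * r0 * r0);            // one refinement step
}
```

The refinement costs a few extra multiplications, so it is worth re-profiling in VTune Amplifier to confirm that the refined version still beats sqrt + div in your hotspot.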

About the Author

Stanislav Pavlov works in the Software & Service Group at Intel Corporation. He has 10+ years of experience in technologies. His main interest is optimization of performance, power consumption, and parallel programming. In his current role as a Senior Application Engineer providing technical support for Intel®-based devices, Stanislav works closely with software developers and SoC architects to help them achieve the best possible performance on Intel® platforms. Stanislav holds a Master's degree in Mathematical Economics from the National Research University Higher School of Economics. He is currently pursuing an MBA in the Moscow Business School.


This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

