Profiling C++ & Python

Measuring Performance in C++ & Python

Zach Wolpe
4 min read · Jul 10, 2023

Profiling is essential to optimizing code and understanding runtime characteristics.

Here are the tools I use to profile my C++ and Python code.

These decorators can be attached to any C++ or Python script to generate and analyse runtime statistics.

Below is a guide to getting started.

C++ Profiler

Explanation

The C++ profiler contains an Instrumentor class, which is responsible for writing timing data to a .json file. This is leveraged by the CPUProfiler class to record runtime data: the CPUProfiler decorates a function that you wish to time.
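For context, here is a minimal sketch of what a profiler.cpp along these lines could look like. It is an assumption, not the exact implementation used in this article: the WriteProfile helper and member names are invented for illustration, but the Instrumentor singleton, the LaunchSession/EndSession calls, the CPUProfiler timer and the PROFILE_FUNCTION() macro match how the code is used in the examples below, and the output is the Trace Event Format JSON that chrome://tracing expects.

// Minimal sketch of a profiler.cpp — an assumption about one possible
// implementation, not the exact code referenced in this article.
#include <chrono>
#include <fstream>
#include <string>

class Instrumentor {
public:
    static Instrumentor& Get() {
        static Instrumentor instance;
        return instance;
    }

    void LaunchSession(const std::string& name, const std::string& filepath) {
        (void)name;                      // session name unused in this sketch
        m_Output.open(filepath);
        m_Output << "{\"traceEvents\":[";
        m_First = true;
    }

    void EndSession() {
        m_Output << "]}";
        m_Output.close();
    }

    // Assumed helper: appends one Trace Event Format record to the JSON file.
    void WriteProfile(const std::string& name, long long startUs, long long endUs) {
        if (!m_First) m_Output << ",";
        m_First = false;
        m_Output << "{\"cat\":\"function\",\"ph\":\"X\",\"pid\":0,\"tid\":0,"
                 << "\"name\":\"" << name << "\","
                 << "\"ts\":" << startUs << ",\"dur\":" << (endUs - startUs) << "}";
    }

private:
    std::ofstream m_Output;
    bool m_First = true;
};

// RAII timer: records the start time on construction and writes the
// elapsed duration to the Instrumentor when it goes out of scope.
class CPUProfiler {
public:
    CPUProfiler(const char* name)
        : m_Name(name), m_Start(std::chrono::high_resolution_clock::now()) {}

    ~CPUProfiler() {
        auto end = std::chrono::high_resolution_clock::now();
        long long startUs = std::chrono::time_point_cast<std::chrono::microseconds>(m_Start)
                                .time_since_epoch().count();
        long long endUs = std::chrono::time_point_cast<std::chrono::microseconds>(end)
                              .time_since_epoch().count();
        Instrumentor::Get().WriteProfile(m_Name, startUs, endUs);
    }

private:
    const char* m_Name;
    std::chrono::time_point<std::chrono::high_resolution_clock> m_Start;
};

// The macro attaches a scoped CPUProfiler named after the enclosing function.
#define PROFILE_FUNCTION() CPUProfiler profiler(__func__)

Because the CPUProfiler is a scoped (RAII) timer, the macro only needs to appear at the top of a function: timing stops automatically when the function returns.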

Usage

1. Include the class declaration in the required source file (preferably using a header file).
#include "[$PATH]/profiler.cpp"

2. Attach the macro (which calls the CPUProfiler decorator) to the desired code block.

void function() {
    PROFILE_FUNCTION();
    // some function...
}

3. Compile and run your program; you should find a new file, profile.json (or whatever filename you pass to LaunchSession), in your current directory.

4. Go to chrome://tracing (type chrome://tracing into your Chrome browser's address bar).

5. Upload the profile.json file.

Example

To illustrate how functions can wrap other functions, and how this is all interpreted by the profiler, consider four dummy functions (function1(), function2(), functionWrapper(), programWrapper()). They do nothing but run a loop (via a looper() helper), but note the order of operations.

The PROFILE_FUNCTION() macro decorator is attached to each of these functions, and programWrapper() launches and ends the profiling session.

#include "cppModules/profiler.cpp"

void looper() {
// nonsense tester operation.
for (int i=1; i<=1000000; i++) {};
};

void function1() {
PROFILE_FUNCTION();
std::cout << " - Running function1..." << std::endl;
looper();
};

void function2() {
PROFILE_FUNCTION();
std::cout << " - Running function2..." << std::endl;
looper();
};

void functionWrapper() {
PROFILE_FUNCTION();
std::cout << "functionWrapper Benchmarks..." << std::endl;
looper();
function1();
function2();
};

void programWrapper() {
PROFILE_FUNCTION();
std::cout << ">> Running programWrapper..." << std::endl;
Instrumentor::Get().LaunchSession("myprofiler", "profiler.json");
functionWrapper();
Instrumentor::Get().EndSession();
};

int main() {
programWrapper();
return 0;
};

Compile:

g++ -o run_profiler run_profiler.cpp

Execute:

./run_profiler

Navigate to chrome://tracing and upload the newly created profiler.json.

Example: build and run (screenshot).
Expected output on chrome://tracing (screenshot).

Python Profiler

My Python profiler is composed of two parts: a CPU monitor and a RAM monitor. A base class, Interface, is used to define each module's structure.

Explanation

Each child class implements a profile method. The memory profiler approximates RAM usage by sampling the change in virtual memory while the decorated code runs. The CPU profiler follows the same structure but produces a .prof file, which can be used to profile the code with external tools.
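As a rough illustration, modules/profiler.py could be structured as follows. This is a sketch of one possible implementation, not the author's exact code: the cProfile and psutil internals, the pstats.prof filename, and the simple before/after memory delta are assumptions (the article's decorators also accept keyword arguments such as sav_loc, which this sketch simply passes through). The Interface, CPUProfiler and MemoryProfiler names and the profile decorators match the usage shown below.

# Minimal sketch of modules/profiler.py — an illustration of one possible
# implementation, not the author's exact code.
import cProfile
from abc import ABC, abstractmethod
from functools import wraps

import psutil  # third-party: used to read the process's virtual memory size


class Interface(ABC):
    """Base class: each profiler module must expose a `profile` decorator."""

    @staticmethod
    @abstractmethod
    def profile(func):
        ...


class CPUProfiler(Interface):
    @staticmethod
    def profile(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            profiler = cProfile.Profile()
            profiler.enable()
            result = func(*args, **kwargs)
            profiler.disable()
            profiler.dump_stats('pstats.prof')  # the .prof file read by snakeviz
            return result
        return wrapper


class MemoryProfiler(Interface):
    @staticmethod
    def profile(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            # approximate RAM usage as the change in virtual memory
            # (a simple before/after delta rather than continuous sampling)
            before = psutil.Process().memory_info().vms
            result = func(*args, **kwargs)
            after = psutil.Process().memory_info().vms
            mem_profile = {'vms_change_bytes': after - before}
            return result, mem_profile
        return wrapper

With this sketch, stacking @CPUProfiler.profile on top of @MemoryProfiler.profile (as in the example further down) returns a (result, mem_profile) tuple while also dumping the .prof file.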

Usage

Simply attach the decorators @CPUProfiler.profile and @MemoryProfiler.profile to any code and you’ll generate the necessary data.

The following example script is called run_profiler.py.

1. Install snakeviz.
python -m pip install snakeviz

2. Execute the Python script containing the profiler decorators.

python [$SCRIPT].py

The memory data and statistics are now available for analysis. Additionally, the CPU profiler should have produced a .prof file.

3. Launch snakeviz from the command line.

snakeviz [$PATH]/pstats.prof
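
If you prefer to stay in the terminal, the same .prof file can also be read with the standard-library pstats module. A quick sketch, assuming the pstats.prof filename above:

import pstats

# load the .prof file produced by the CPU profiler and print the
# ten functions with the largest cumulative runtime
stats = pstats.Stats('pstats.prof')
stats.sort_stats('cumulative').print_stats(10)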

Example

Consider the following script, called run_profiler.py. Note that the profiler decorators are included in the script.

import time

from modules.profiler import *

if __name__ == '__main__':

    @CPUProfiler.profile
    @MemoryProfiler.profile
    def test_func(*args, **kwargs):
        a = []

        print('launching test_func()..')
        time.sleep(1)
        for i in range(100000):
            a.append(i)
        print('test_func() finished.')
        return True

    result, mem_profile = test_func(sav_loc='temp-store')
    print('\nresult: ', result)
    print('mem_profile: ', mem_profile)

Execution (screenshot).

Expected output: trace plots of the runtime characteristics (screenshot).
Expected output: a table of runtime characteristics (screenshot).

CAVEATS: Better tools exist to profile code extensively; I use macOS, so Xcode's Instruments is likely the best option. One should also be aware of what is actually being profiled: the C++ compiler optimizes your code and can change it radically (for example, collapsing loops and deterministic calculations), so it's best (at least initially) to run your profiler in debug mode, i.e. on an unoptimized build (see the build command below). Finally, be aware of any overhead incurred by the profiler itself (CPU/RAM usage).
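With g++, for example, you can build an unoptimized binary with debug symbols so that dummy loops like the ones above aren't optimized away:

g++ -O0 -g -o run_profiler run_profiler.cpp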
