Profiling C++ & Python
Measuring Performance in C++ & Python
Profiling is essential to optimizing code and understanding runtime characteristics.
Here are my tools for profiling my C++ and Python code. These decorators can be attached to any C++ or Python script to generate and analyse runtime statistics. Below I provide a guide to getting started.
C++ Profiler
Explanation
The C++ profiler contains an Instrumentor class, which is responsible for writing timing data to a .json file. This is then leveraged by the CPUProfiler class to write runtime data: the CPUProfiler decorates a function that you wish to time.
Usage
1. Include the class declaration in the required source file (preferably via a header file).
#include "[$PATH]/profiler.cpp"
2. Attach the macro (which calls the CPUProfiler decorator) to the desired code block.
void function() {
    PROFILE_FUNCTION();
    // some function...
}
3. Compile and run your program; you should find a new file, profile.json, in your current directory.
4. Go to chrome://tracing/ (type chrome://tracing into your Chrome browser).
5. Upload the profile.json file.
Example
To illustrate how functions can wrap other functions, and how this is all interpreted by the profiler, consider four dummy functions (function1(), function2(), functionWrapper(), programWrapper()). They do nothing but run a loop, but note the order of operations.
The macro decorator is attached to each function, and the outermost function, programWrapper(), launches and ends the profiling session.
#include <iostream>

#include "cppModules/profiler.cpp"

void looper() {
    // nonsense tester operation.
    for (int i = 1; i <= 1000000; i++) {};
}

void function1() {
    PROFILE_FUNCTION();
    std::cout << " - Running function1..." << std::endl;
    looper();
}

void function2() {
    PROFILE_FUNCTION();
    std::cout << " - Running function2..." << std::endl;
    looper();
}

void functionWrapper() {
    PROFILE_FUNCTION();
    std::cout << "functionWrapper Benchmarks..." << std::endl;
    looper();
    function1();
    function2();
}

void programWrapper() {
    PROFILE_FUNCTION();
    std::cout << ">> Running programWrapper..." << std::endl;
    Instrumentor::Get().LaunchSession("myprofiler", "profiler.json");
    functionWrapper();
    Instrumentor::Get().EndSession();
}

int main() {
    programWrapper();
    return 0;
}
Compile:
g++ -o run_profiler run_profiler.cpp
Execute:
./run_profiler
Navigate to chrome://tracing and upload the newly created profiler.json.
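For reference, chrome://tracing loads files in Chrome's Trace Event format, so the JSON written by the Instrumentor presumably looks something like the following (the names match the example above, but the timings are illustrative, not actual output):

```json
{
  "traceEvents": [
    { "name": "programWrapper",  "cat": "function", "ph": "X",
      "ts": 0,   "dur": 2100, "pid": 0, "tid": 0 },
    { "name": "functionWrapper", "cat": "function", "ph": "X",
      "ts": 50,  "dur": 1900, "pid": 0, "tid": 0 },
    { "name": "function1",       "cat": "function", "ph": "X",
      "ts": 700, "dur": 600,  "pid": 0, "tid": 0 }
  ]
}
```

Here "ph": "X" marks a complete event, and ts/dur are microseconds; nested durations render as stacked bars in the trace viewer, which is how the wrapper/wrapped structure becomes visible.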
Python Profiler
My Python profiler is composed of two parts: a CPU monitor and a RAM monitor. A base class, Interface, is used to define each module's structure.
Explanation
Each child class implements a profiler method. The memory profiler approximates RAM usage by sampling the change in virtual memory while the attached algorithm runs. The CPU profiler follows the same structure but produces a .prof file, which can be used to profile code with external tools.
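The profiler module itself isn't listed here, but the CPU half can be sketched with the standard library's cProfile, which is the usual way to produce a .prof file. The class and method names follow the article; the body is my assumption:

```python
# Hypothetical sketch of a decorator-based CPU profiler that dumps a
# snakeviz-readable .prof file. The real module may differ.
import cProfile
import functools

class CPUProfiler:
    @staticmethod
    def profile(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            prof = cProfile.Profile()
            prof.enable()
            try:
                return func(*args, **kwargs)
            finally:
                prof.disable()
                prof.dump_stats("pstats.prof")  # readable by snakeviz / pstats
        return wrapper

@CPUProfiler.profile
def demo():
    return sum(range(10_000))

print(demo())  # profiled call; writes pstats.prof as a side effect
```

The memory half is not sketched here; per the description above it would follow the same decorator structure but sample memory before and after the call instead of enabling a profiler.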
Usage
Simply attach the decorators @CPUProfiler.profile and @MemoryProfiler.profile to any code and you'll generate the necessary data.
The following example script is called run_profiler.py.
1. Install snakeviz:
python -m pip install snakeviz
2. Execute the Python script containing the profiler decorators.
python [$SCRIPT].py
The memory data and statistics are now available for analysis. Additionally, the CPU profiler should have produced a .prof file.
3. Launch snakeviz from the command line:
snakeviz [$PATH]/pstats.prof
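If you'd rather stay in the terminal, the standard library's pstats module can read the same .prof file that snakeviz does. A small self-contained illustration (the file name is illustrative; in practice point it at your profiler's output):

```python
# Inspect a .prof file without snakeviz, using the stdlib pstats module.
import cProfile
import pstats

# Generate a small stand-in .prof file to inspect.
cProfile.run("sum(range(100_000))", "pstats.prof")

stats = pstats.Stats("pstats.prof")
stats.sort_stats("cumulative").print_stats(5)  # top 5 entries by cumulative time
```

This is handy for quick checks or CI logs, where launching a browser-based viewer isn't practical.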
Example
Consider the following script, called run_profiler.py
. Note that the profiler decorators are included in the script.
from modules.profiler import *
import time

if __name__ == '__main__':

    @CPUProfiler.profile
    @MemoryProfiler.profile
    def test_func(*args, **kwargs):
        a = []
        print('launching test_func()..')
        time.sleep(1)
        for i in range(100000):
            a.append(i)
        print('test_func() finished.')
        return True

    result, mem_profile = test_func(sav_loc='temp-store')
    print('\nresult: ', result)
    print('mem_profile: ', mem_profile)
Execution:
python run_profiler.py
CAVEATS: Better tools exist to profile code extensively; I use macOS, so Xcode's Instruments is likely the best option. One should also be aware of what is actually being profiled. The C++ compiler optimizes and can radically change your code (for example, collapsing loops and pre-computing deterministic calculations), so it's best (at least initially) to run your profiler in debug mode. Finally, one should be aware of any overhead incurred by the profiler itself (CPU/RAM usage).