In this exercise you will speed up the filling of a histogram in Cartesian coordinates, starting from points in polar coordinates.
- start from the code in binning.cpp
- compile it and use 'perf record ./a.out' and 'perf report' to identify "hot spots"
- modify it to use sin and cos from simpleSinCos.h
- optimize it to run faster in scalar mode
- modify the code to enable auto-vectorization
- change the code to use gcc native vectors
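As a starting point for the last step, here is a minimal sketch of what a gcc native-vector version of the polar-to-Cartesian conversion could look like. It is not the code from binning.cpp or simpleSinCos.h: the names float4, load4, store4, approxSin, approxCos and toCartesianVec are hypothetical, and the two polynomials are plain truncated Taylor series that are only reasonable for small |phi|, standing in for whatever simpleSinCos.h provides.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// GCC/Clang vector extension: four packed floats (16 bytes).
typedef float float4 __attribute__((vector_size(16)));

// Unaligned load/store through memcpy, which compilers turn into vector moves.
inline float4 load4(const float* p) { float4 v; std::memcpy(&v, p, sizeof v); return v; }
inline void store4(float* p, float4 v) { std::memcpy(p, &v, sizeof v); }

// Hypothetical stand-ins for simpleSinCos.h: truncated Taylor series,
// accurate only for small |p|; no range reduction is done here.
inline float4 approxSin(float4 p) {
  float4 p2 = p * p;
  return p * (1.0f - p2 * (1.0f / 6.0f - p2 * (1.0f / 120.0f)));
}
inline float4 approxCos(float4 p) {
  float4 p2 = p * p;
  return 1.0f - p2 * (0.5f - p2 * (1.0f / 24.0f));
}

// Convert n points from (r, phi) to (x, y), four at a time.
inline void toCartesianVec(const float* r, const float* phi,
                           float* x, float* y, std::size_t n) {
  assert(n % 4 == 0);  // this sketch handles whole vectors only
  for (std::size_t i = 0; i < n; i += 4) {
    float4 rv = load4(r + i);
    float4 pv = load4(phi + i);
    store4(x + i, rv * approxCos(pv));
    store4(y + i, rv * approxSin(pv));
  }
}
```

The sketch assumes n is a multiple of 4; a real version needs a scalar loop for the remainder, a range reduction of phi before the polynomials, and possibly a wider vector type depending on what -march=native offers.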
For each version compile with
- c++ -g -O2 -Wall -fopt-info-vec -march=native binning.cpp
- c++ -g -O3 -Wall -fopt-info-vec -march=native binning.cpp
- c++ -g -Ofast -Wall -fopt-info-vec -march=native binning.cpp
- try -funroll-loops
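To give the -fopt-info-vec output something to report, it helps to separate the arithmetic from the histogram update. The following is a minimal sketch of that idea, not the actual layout of binning.cpp: the names toCartesian, binIndex and fillHistogram, the 64 x 64 grid and the range rMax are all assumptions, and std::sin and std::cos stand in for the functions from simpleSinCos.h. The first loop is branch-free over contiguous arrays and is a candidate for auto-vectorization; the second loop increments bins at data-dependent addresses and is expected to stay scalar.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical histogram layout: nBins x nBins cells covering [-rMax, rMax) in x and y.
constexpr int nBins = 64;
constexpr float rMax = 1.0f;

// Step 1: branch-free conversion over contiguous arrays.
// Whether this loop is actually vectorized depends on the compiler version,
// the flags, and whether the sin/cos implementation can be inlined or vectorized.
inline void toCartesian(const float* r, const float* phi,
                        float* x, float* y, std::size_t n) {
  for (std::size_t i = 0; i < n; ++i) {
    x[i] = r[i] * std::cos(phi[i]);
    y[i] = r[i] * std::sin(phi[i]);
  }
}

// Map a coordinate to a bin index, clamped to [0, nBins - 1].
inline int binIndex(float v) {
  int i = static_cast<int>((v + rMax) * (0.5f * nBins / rMax));
  return i < 0 ? 0 : (i >= nBins ? nBins - 1 : i);
}

// Step 2: scalar fill; the increments go to data-dependent addresses.
inline void fillHistogram(const float* x, const float* y, std::size_t n,
                          std::vector<int>& histo) {
  assert(histo.size() == static_cast<std::size_t>(nBins) * nBins);
  for (std::size_t i = 0; i < n; ++i) {
    ++histo[binIndex(y[i]) * nBins + binIndex(x[i])];
  }
}
```

Processing the points in fixed-size blocks, so that the temporary x and y arrays stay in cache, is one possible refinement of this split.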
In case you decide to further optimize simpleSinCos, make sure it still reproduces the results.
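One way to check this is sketched below, under the assumption that "reproduces the results" means both that the modified functions stay close to the originals and that the final histogram does not change. The helpers maxAbsDiff and countMismatchedBins are hypothetical names and not part of the exercise code; pass the original and the modified function (or the two histograms) as arguments.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Largest absolute difference between two scalar functions,
// sampled on a uniform grid of steps + 1 points over [lo, hi].
template <typename F, typename G>
float maxAbsDiff(F reference, G candidate, float lo, float hi, int steps) {
  assert(steps > 0 && hi > lo);
  float worst = 0.0f;
  for (int i = 0; i <= steps; ++i) {
    float phi = lo + (hi - lo) * static_cast<float>(i) / static_cast<float>(steps);
    worst = std::max(worst, std::fabs(reference(phi) - candidate(phi)));
  }
  return worst;
}

// Number of bins whose content differs between two histograms of equal size.
inline std::size_t countMismatchedBins(const std::vector<int>& a,
                                       const std::vector<int>& b) {
  assert(a.size() == b.size());
  std::size_t n = 0;
  for (std::size_t i = 0; i < a.size(); ++i) {
    if (a[i] != b[i]) ++n;
  }
  return n;
}
```

A small function difference can still move points that sit close to a bin edge, so a few mismatched bins do not necessarily indicate a bug; what tolerance is acceptable is for you to decide.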
For a full discussion of the optimization techniques that can be used in the solution, see the paper by Colfax.
Bonus: add thread parallelization with OpenMP
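A minimal sketch of one possible approach is shown below: each thread fills a private histogram and the private copies are merged at the end, so that threads never increment the same bin concurrently. The function name fillParallel and the use of precomputed bin indices are assumptions made for illustration. The code needs -fopenmp to run in parallel; without it the pragmas are ignored and the function runs serially with the same result.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fill `histo` from precomputed bin indices, one private histogram per thread.
inline void fillParallel(const std::vector<int>& bin, std::vector<int>& histo) {
  const std::size_t nBinsTotal = histo.size();
  const long n = static_cast<long>(bin.size());
  #pragma omp parallel
  {
    std::vector<int> local(nBinsTotal, 0);  // private to each thread
    #pragma omp for nowait
    for (long i = 0; i < n; ++i) {
      assert(bin[i] >= 0 && static_cast<std::size_t>(bin[i]) < nBinsTotal);
      ++local[bin[i]];
    }
    // Merge the private copies one thread at a time.
    #pragma omp critical
    for (std::size_t b = 0; b < nBinsTotal; ++b) {
      histo[b] += local[b];
    }
  }
}
```

The private copies cost memory and merge time proportional to the number of bins times the number of threads, so this pays off mainly when there are many more points than bins.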
More bonuses: newer compilers try harder, see https://godbolt.org/g/1D8U2V