P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\titulka.jpg PB173 - Tématický vývoj aplikací v C/C++ Skupina: Aplikovaná kryptografie a bezpečné programování •Petr Švenda svenda@fi.muni.cz P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg Optimization •PB173 | Optimization 2 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg Optimization steps 1.Do not optimize prematurely - write clean and correct code first! 2. 2.When code works, find performance bottleneck and remove it 3. 3.Document optimization and test it thoroughly •PB173 | Optimization 3 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg Performance measurement - manual •Manual speed measure 1.Measure time before target operation 2.Execute operation 3.Measure time after target operation 4.Compute and print difference • •PB173 | Optimization •clock_t elapsed = -clock(); •aes256_encrypt_ecb(&ctx, buf); •elapsed += clock(); 4 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg Manual measurement – possible problems •It is time consuming –additional code, manually inserted –less readable, error prone (use DEBUG macro) •Precision –some function returns time in seconds (e.g., time()) •short operations will take 0 –prefer functions returning result in ms or CPU ticks •e.g., clock() •check documentation for real precision –run operation multiple times (e.g., 1000x) •and divide the resulting time by that factor •PB173 | Optimization 5 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg Manual measurement – possible problems •Additional unintended overhead may screw the results –one-time initialization of objects –cache usage, disk swap –garbage collection (not in C/C++) •Need to know the probable bottleneck in advance –timing code is inserted manually –you are selecting what you like to measure –time consuming to localize bottleneck • •PB173 | Optimization 6 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg Automatic measurement - profiling •Automatic tool to measure time and memory used •“Time” spend in specific function •How often a function is called •Call tree –what function called actual one –based on real code execution (condition jumps) •Many other statistics, depend on the tools • • •PB173 | Optimization 7 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg Profiling methods •Profiling method: CPU Sampling –check periodically what is executed on CPU –accurate, low overhead •Profiling method: Instrumentation –automatically inserts special accounting code –will return exact function call counter –(may affect performance timings a bit) •additional code present •PB173 | Optimization 8 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg MS Visual Studio Profiler •Visual Studio 2013 or earlier –Analyze->Launch Performance Wizard •Visual Studio 2015 –Debug ® Profiler ® Start diagnostic tools •May require admin privileges (will ask) •PB173 | Optimization 9 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg MS VS Profiler – results (Summary) •Where to start the optimization work? – •PB173 | Optimization perf_summary 10 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg MS VS Profiler – results (Functions) •Result given in number of sampling hits –meaningful result is % of total time spend in function •Inclusive sampling –samples hit in function or its children –aggregate over call stack for given function •Exclusive sampling –samples hit in exclusively in given function –usually what you want •fraction of time spend in function code (not in subfunctions) – •PB173 | Optimization 11 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg MS VS Profiler – results (Functions) • •PB173 | Optimization perf_Functions •Doubleclick to move into Function Details view 12 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg 46 % of time spend in gf_alog function • • • • • • •How to speed up gf_alog function? •PB173 | Optimization 13 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg aestab.c • •PB173 | Optimization •AES_RETURN aes_init(void) •{ uint_32t i, w; • •#if defined(FF_TABLES) • • uint_8t pow[512], log[256]; • • if(init) • return EXIT_SUCCESS; • /* log and power tables for GF(2^8) finite field with • WPOLY as modular polynomial - the simplest primitive • root is 0x03, used here to generate the tables • */ • • i = 0; w = 1; • do • { • pow[i] = (uint_8t)w; • pow[i + 255] = (uint_8t)w; • log[w] = (uint_8t)i++; • w ^= (w << 1) ^ (w & 0x80 ? WPOLY : 0); • } • while (w != 1); •// ... 14 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg MS VS Profiler – save results •You can save results and compare later •To check the real impact of your optimization •Don’t forget to eventually stop the optimization J •PB173 | Optimization 15 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg GCC gcov tool •http://gcc.gnu.org/onlinedocs/gcc/Gcov.html#Gcov 1.Compile program by GCC with additional flags –gcc -Wall -fprofile-arcs -ftest-coverage main.c –gcc -Wall --coverage main.c –additional monitoring code is added to binary 2.Execute program –files with “.bb" ".bbg" and ".da" extension are created 3.Analyze resulting files with gcov –gcov main.c –annotated source code is created •Lcov - graphical front-end for gcov –http://ltp.sourceforge.net/coverage/lcov.php – – • 16 PB173 | Optimization P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg • 17 PB173 | Optimization D:\lcov.png Taken from http://ltp.sourceforge.net/coverage/lcov/output/example/methods/iterate.c.gcov.html P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg Memory consumption profiling •Not provided by MSVS Profiler for native apps –unfortunately, available only for managed code •Visual Studio is detecting memory leaks! –_CrtDumpMemoryLeaks() –run program in debug mode (possibly without any breakpoint) –let it finish and watch Output pane •Valgrind -v --leak-check=full •Write your own new and delete –and log the allocated/freed memory •PB173 | Optimization 18 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg • 19 PB173 | Optimization P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg Assignment – performance analysis •Produce detailed speed estimation for: –New user registration –User authentication to server –Obtain list of other users –Prepare protected message for another user –Unprotect message from another user •Which function(s) is consuming most of the CPU? –provide a list with %, discuss possible improvements •Improve performance of the most significant bottleneck twice (resulting time 50%, document) •PB173 | Optimization 20 P:\CRCS\2012_0178_Redesign_loga_a_JVS\PPT_prezentace\sablona\pracovni\normalni.jpg Submissions, deadlines •Upload application source codes as single zip file into IS Homework vault (Crypto - 6. homework (Performance)) –2xA4 with performance analysis •DEADLINE 10.4. 12:00 –0-10 points assigned 21 PB173 | Optimization