Profiling - Falmouth-Games-Academy/comp350-research-journal GitHub Wiki

There is no doubt that the grail of efficiency leads to abuse. Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

Yet we should not pass up our opportunities in that critical 3%. A good programmer will not be lulled into complacency by such reasoning, he will be wise to look carefully at the critical code; but only after that code has been identified. It is often a mistake to make a priori judgments about what parts of a program are really critical since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail. After working with such tools for seven years, I've become convinced that all compilers written from now on should be designed to provide all programmers with feedback indicating what parts of their programs are costing the most; indeed, this feedback should be supplied automatically unless it has been specifically turned off.

Donald E. Knuth [7]

This statement is often misquoted and misinterpreted. In particular, the clause "premature optimization is the root of all evil" has led many software engineers to believe that optimisation is unimportant. Software developers must reject this prevailing attitude that performance does not matter. Once they decide that writing efficient software is worthwhile, the next step is to learn how compilers and interpreters process high-level language constructs. Once an engineer understands this process, choosing appropriate high-level constructs becomes second nature and incurs no additional development cost [8].

Profiling

In computing, profiling is a form of analysis that measures different aspects of a program, such as the frequency and duration of function calls, the usage of particular instructions, time complexity, or space complexity [1]. Profiling a program to find its hotspots is the necessary first step in optimising the performance of the program [2].

The Pareto Principle [9] [10], also known as the 80/20 rule, suggests that the majority (80%) of the execution time will be taken by a small (20%) portion of the code. However, that 20% will most probably be spread throughout the source code rather than sitting in one place, and will not be easy to modify [8]. Only by profiling can these portions be made apparent. Profiling is achieved using a profiler, a tool that tracks information about a running program that is useful for improving the efficiency and performance of the code [3].
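As a rough illustration of how profiler output can be narrowed down to the small portion of code that dominates execution time, the sketch below sorts per-function timings and returns the slowest functions that together reach a chosen share of the total. The names FunctionSample and Hotspots are hypothetical and do not come from any particular profiler; real tools report this kind of breakdown for you.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical record of the time a profiler attributed to one function.
struct FunctionSample {
    std::string name;
    double milliseconds;
};

// Returns the names of the slowest functions, slowest first, that together
// account for at least `fraction` (e.g. 0.8) of the total measured time.
std::vector<std::string> Hotspots(std::vector<FunctionSample> samples, double fraction)
{
    std::sort(samples.begin(), samples.end(),
              [](const FunctionSample& a, const FunctionSample& b) {
                  return a.milliseconds > b.milliseconds;
              });

    double total = 0.0;
    for (const FunctionSample& sample : samples) {
        total += sample.milliseconds;
    }

    std::vector<std::string> result;
    double running = 0.0;
    for (const FunctionSample& sample : samples) {
        // Stop once the functions collected so far cover the requested share.
        if (total <= 0.0 || running >= fraction * total) {
            break;
        }
        result.push_back(sample.name);
        running += sample.milliseconds;
    }
    return result;
}
```

Calling Hotspots with a fraction of 0.8 on a frame where one function takes most of the time would return just that function, which is where optimisation effort is likely to pay off first.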

Instrumented Profilers

Instrumented profilers require the programmer to add instructions to the sections of the program they would like to measure. The profiler then measures the time it takes to execute the sections that have been marked.

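void Node::Update()
{
    // Starts timing here; the measurement ends when the function returns.
    FT_PROFILE_FN
    // Update every object owned by this node.
    for(Object* obj : mObjects)
    {
        obj->Update();
    }
}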

In this section of code, FT_PROFILE_FN creates an object that records the time when it is created and again when it falls out of scope, which gives the time spent inside the function [4]. Instrumenting a program can itself change the program's performance, which can lead to inaccurate data or heisenbugs [1].
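The source does not show how FT_PROFILE_FN is implemented, so the following is only a minimal sketch of the same idea: an object whose constructor notes the start time and whose destructor records the elapsed time. ScopedTimer, ProfileResults, ProfileEntry, SKETCH_PROFILE_FN and UpdateAll are all hypothetical names chosen for illustration.

```cpp
#include <chrono>
#include <map>
#include <string>

// Hypothetical accumulated result for one instrumented function.
struct ProfileEntry {
    long long totalNanoseconds = 0;
    int samples = 0;
};

// Hypothetical global store of results; a real profiler would use
// something cheaper than a map keyed by string.
inline std::map<std::string, ProfileEntry>& ProfileResults()
{
    static std::map<std::string, ProfileEntry> results;
    return results;
}

// Records the time between its construction and its destruction.
class ScopedTimer {
public:
    explicit ScopedTimer(const char* name)
        : mName(name), mStart(std::chrono::steady_clock::now()) {}

    ~ScopedTimer()
    {
        const auto elapsed = std::chrono::steady_clock::now() - mStart;
        ProfileEntry& entry = ProfileResults()[mName];
        entry.totalNanoseconds +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(elapsed).count();
        entry.samples += 1;
    }

    ScopedTimer(const ScopedTimer&) = delete;
    ScopedTimer& operator=(const ScopedTimer&) = delete;

private:
    const char* mName;
    std::chrono::steady_clock::time_point mStart;
};

// Hypothetical stand-in for FT_PROFILE_FN: times the enclosing function.
#define SKETCH_PROFILE_FN ScopedTimer sketchProfileTimer(__func__);

// Example of an instrumented function.
void UpdateAll()
{
    SKETCH_PROFILE_FN
    volatile int sink = 0;
    for (int i = 0; i < 1000; ++i) {
        sink = sink + i;
    }
}
```

Each call to UpdateAll adds one sample to the entry named "UpdateAll". Note that the map lookup in the destructor is itself extra work, which is one way instrumentation can distort the measurements it collects.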

Profiling in Unreal Engine

Profiling in Unreal Engine is done separately for the CPU and GPU.

The GPU profiler is useful for artists who are optimising their work. Typing ProfileGPU into the console shows what is causing a GPU bottleneck, e.g. static lights or shadows, which can then be optimised to remove the bottleneck [5].

The CPU profiler is useful for designers and artists as well as programmers. It reports the draw calls created by 3D objects, which is what designers and artists should keep an eye on, whereas programmers can concentrate on game stats such as Blueprint time and tick time [6].

It is also possible to use the Visual Studio profiler with Unreal Engine 4. To do this, open your project's Visual Studio solution and run the editor from there. Doing it this way also allows you to profile the editor itself. However, the profiler data will then include the editor's function calls as well, making it hard to view game-specific data. To profile just your game, change your solution configuration to DebugGame; profiling this should give you an accurate profile of your game.

Bottlenecks

In computing, a bottleneck occurs when the performance of a program is limited by a specific component. This often happens when the software relies too heavily on said component [11].

For example, an application could be memory intensive while barely using the CPU. To balance this, a trade-off can be made: the program can be changed so that more values are calculated on the fly rather than fetched from memory. Potential bottlenecks can be located using profiling techniques.
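A small sketch of this trade-off, using a sine lookup as an assumed example: one variant stores precomputed values in memory, the other recomputes the value on every call. The function names and the table size are hypothetical, and which variant is faster depends on the hardware, so both would need to be profiled.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

constexpr std::size_t kTableSize = 360;
constexpr double kPi = 3.14159265358979323846;

// Memory-heavy variant: a table of precomputed values, built once on first use.
inline const std::array<double, kTableSize>& SineTable()
{
    static const std::array<double, kTableSize> table = [] {
        std::array<double, kTableSize> values{};
        for (std::size_t degrees = 0; degrees < kTableSize; ++degrees) {
            values[degrees] = std::sin(degrees * kPi / 180.0);
        }
        return values;
    }();
    return table;
}

// Reads the result from memory.
double SineFromTable(std::size_t degrees)
{
    return SineTable()[degrees % kTableSize];
}

// CPU-heavy variant: nothing is stored, the value is recomputed on each call.
double SineOnTheFly(std::size_t degrees)
{
    return std::sin((degrees % kTableSize) * kPi / 180.0);
}
```

Both functions return the same values for whole-degree inputs, so a memory-bound program could swap SineFromTable for SineOnTheFly to shift load from memory to the CPU, and a CPU-bound one could do the reverse.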

When making software for specific hardware, it is important to understand the strengths and weaknesses of that hardware in order to avoid running into bottlenecks. For example, to make the most of a strong GPU, more work should be allocated to it.

If we look at video games, we find that some are more CPU intensive while others are more GPU intensive. According to WePC [12], the more CPU-heavy games are the ones that reach high FPS with low-resolution graphics, such as Cities: Skylines, Minecraft and Civilization V. On the other hand, GPU-heavy games show higher frame rates when running on a higher quality graphics card, such as The Witcher 3, Metro: Last Light and Borderlands 2. The CPU-intensive games given as examples require many more calculations for the gameplay, while the GPU-intensive games are more focused on shaders and impressive graphics.

Call Count

The call count measures the number of times a function has been called by another object or function. This metric can be used to identify the most commonly called functions in your code, which tells you where to focus your optimisation time. Optimising code that is rarely executed will see less of a return than optimising code that is called often [13].
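Profilers collect call counts automatically, but the idea can be sketched by hand. The CallCounter class below is a hypothetical illustration, not part of any engine or profiler: each instrumented function would call Record with its own name, and MostCalled then reports the most frequently called function.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical call counter keyed by function name.
class CallCounter {
public:
    // Call this at the top of every function you want to count.
    void Record(const std::string& name) { ++mCounts[name]; }

    // Number of recorded calls for `name`, or 0 if it was never recorded.
    int CountFor(const std::string& name) const
    {
        const auto it = mCounts.find(name);
        return it == mCounts.end() ? 0 : it->second;
    }

    // Name with the highest count, or an empty string if nothing was recorded.
    std::string MostCalled() const
    {
        std::string best;
        int bestCount = 0;
        for (const auto& entry : mCounts) {
            if (entry.second > bestCount) {
                best = entry.first;
                bestCount = entry.second;
            }
        }
        return best;
    }

    // Clears all counts, e.g. at the start of a new frame.
    void Reset() { mCounts.clear(); }

private:
    std::unordered_map<std::string, int> mCounts;
};
```

Resetting the counter once per frame and comparing the counts between frames is one simple way to notice a function whose call count is growing unexpectedly.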

The call count is shown in the profiler, and is most useful for bug detection when the profiler runs while the game is being played, because the developer can then see when something gets called. If the call count increases rapidly, something is probably wrong; this kind of bug is easy to miss and can have a big influence on frame rate.

References

[1] https://en.wikipedia.org/wiki/Profiling_(computer_programming)

[2] https://docs.unity3d.com/Manual/BestPracticeUnderstandingPerformanceInUnity1.html

[3] https://stackify.com/what-is-code-profiling/

[4] https://engineering.riotgames.com/news/profiling-measurement-and-analysis

[5] https://docs.unrealengine.com/en-US/Engine/Performance/GPU

[6] https://docs.unrealengine.com/en-us/Engine/Performance/CPU

[7] Knuth, Donald E. "Structured Programming with go to Statements." ACM Computing Surveys (CSUR) 6.4 (1974): 261-301. http://cowboyprogramming.com/files/p261-knuth.pdf

[8] Hyde, Randall. "The fallacy of premature optimization." Ubiquity 2009.February (2009): 1.

[9] Sanders, Robert. "The Pareto principle: its use and abuse." Journal of Services Marketing 1.2 (1987): 37-40. https://www.emeraldinsight.com/doi/abs/10.1108/eb024706

[10] Juran, Joseph, and A. Blanton Godfrey. "Quality handbook." Republished McGraw-Hill (1999): 173-178.

[11] http://blog.logicalincrements.com/2017/09/what-cpu-gpu-computer-bottlenecks-how-to-detect-them/

[12] https://www.wepc.com/tips/cpu-gpu-bottleneck/

[13] https://www.toptal.com/full-stack/code-optimization