Profile Mode removes many of gDEBugger's debugging features so that the measured performance is as close as possible to the application's real performance. The main performance measurement used for graphics applications is the number of frames rendered per second (fps).
Screenshot: gDEBugger GL's default layout for Profile Mode.
Work with performance counters: Click on any of the counters in the Performance Dashboard view, or double-click a counter name in the list next to the Performance Graph View. This will bring up the Performance Counters dialog. The available counters tree displays all available performance counters and the active counters list displays the currently selected counters. The available performance counters are divided into three groups:
To add a performance counter, double-click it in the available counters tree, or select it and press the Add button. To remove a counter, double-click it in the active counters list, or select it and press the Remove button. To clear all the performance counters from the active counters list, press the Remove All button. After making the required changes, press the OK button to apply them.
Press the Save Profiling Data button to save the currently collected performance data (from the Performance Graph View) to a .csv file. Saved counter data lets you compare performance metrics across runs, for example to run regression tests between different versions of your application, or to test your application on different graphics hardware and driver configurations.
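Two saved files can be compared with a short script. The following is a minimal Python sketch, not a description of gDEBugger's actual export format: it assumes a header row with one column per counter and one row per sample. The counter names, the sample values, and the helper names `counter_means` and `relative_change` are all hypothetical.

```python
import csv
import io
from statistics import mean

def counter_means(csv_text):
    """Return {counter name: mean value} for CSV text with a header row.

    Assumed layout (not confirmed for gDEBugger's export):
    one column per counter, one row per sample.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return {}
    means = {}
    for name in rows[0].keys():
        values = [float(r[name]) for r in rows if r[name] not in (None, "")]
        if values:
            means[name] = mean(values)
    return means

def relative_change(old, new):
    """Relative change of every counter that appears in both runs."""
    return {k: (new[k] - old[k]) / old[k]
            for k in old if k in new and old[k] != 0}

# Made-up sample data for two versions of an application.
old_run = "Frames/sec,CPU utilization\n60,40\n58,42\n"
new_run = "Frames/sec,CPU utilization\n50,55\n48,57\n"
changes = relative_change(counter_means(old_run), counter_means(new_run))
```

A negative change in the frames-per-second counter between two versions would flag a possible regression worth investigating.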
Locate Graphic Pipeline bottlenecks: The graphics system generates images through a pipelined sequence of operations. A pipeline runs only as fast as its slowest stage. The slowest stage is often called the pipeline bottleneck. A single graphics primitive (for example, a triangle) has a single graphic pipeline bottleneck. However, the bottleneck may change when rendering a graphics frame that contains multiple primitives. For example, if the application first renders a group of lines and afterwards a group of lit and shaded triangles, it is possible for the bottleneck to change.
OpenGL Graphic Pipeline (data advances from left to right)

| Stage | Application | OpenGL Driver | Vertex Shading | Geometry Shading | Primitive Assembly | Fragment Shading | Frame Buffers |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Runs on | CPU | CPU | GPU | GPU | GPU | GPU | GPU |

Additional operations shown in the diagram: Light Operations, Texture Operations.
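The statement that a pipeline runs only as fast as its slowest stage can be illustrated with an idealized model. The sketch below is in Python and assumes a fully pipelined sequence in which every stage works in parallel on different items. The per-stage times are made-up numbers, and `pipeline_rate` is a hypothetical helper, not part of gDEBugger.

```python
def pipeline_rate(stage_times_ms):
    """Items per second for an idealized pipeline: limited by the slowest stage."""
    return 1000.0 / max(stage_times_ms.values())

# Made-up per-item stage costs in milliseconds.
stages = {"Application": 4.0, "Vertex Shading": 2.0, "Fragment Shading": 8.0}

base = pipeline_rate(stages)
# Speeding up a stage that is not the bottleneck changes nothing.
faster_vertex = pipeline_rate({**stages, "Vertex Shading": 1.0})
# Speeding up the bottleneck stage raises the overall rate.
faster_fragment = pipeline_rate({**stages, "Fragment Shading": 5.0})
```

In this model only changes to the slowest stage affect the overall rate, which is why removing work from the bottleneck stage is the change that shows up in the frame rate.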
There are two approaches for locating graphics pipeline performance bottlenecks:
The first is to view the performance counters that report the utilization of specific pipeline stages. The performance bottleneck is usually the pipeline stage with the highest utilization rate.
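As a rough illustration of this first approach, the Python sketch below averages utilization samples per stage and reports the stage with the highest average. The readings are hypothetical; which utilization counters are actually available depends on the graphics hardware and driver, and `most_utilized_stage` is an illustrative helper name.

```python
from statistics import mean

def most_utilized_stage(samples):
    """samples: {stage name: [utilization percentages sampled over time]}."""
    averages = {stage: mean(values) for stage, values in samples.items() if values}
    stage = max(averages, key=averages.get)
    return stage, averages[stage]

# Hypothetical counter readings, in percent.
samples = {
    "Vertex Shading": [35.0, 40.0, 38.0],
    "Fragment Shading": [92.0, 95.0, 90.0],
    "Texture Operations": [60.0, 58.0, 65.0],
}
stage, utilization = most_utilized_stage(samples)
```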
The second approach is to use gDEBugger to disable various stages of the OpenGL graphic pipeline and observe the changes in performance. This approach requires performance counters that measure the application's overall performance; CPU and GPU utilization, together with the frames/sec counter, are usually the best metrics for this. Let the application run and observe the performance counters' values. Then, using the Performance Toolbar, apply different forced modes to disable various stages in the graphic pipeline. Performance will improve when you disable the OpenGL graphics pipeline stage that contains the bottleneck. Each forced mode disables different stages of the OpenGL pipeline, as shown in the following table:
OpenGL graphic pipeline stages disabled by each forced mode

| Forced mode | Application + Driver | Vertex Shading | Geometry Shading | Primitive Assembly | Fragment Shading | Lights | Textures | Frame Buffers |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ![]() | | X | X | X | X | X | X | X |
| ![]() | | | | | X | X | X | X |
| ![]() | | | | | | X | | |
| ![]() | | | | | | | X | |
| ![]() | | | X | | X | | | |
| ![]() | | | | | X | | | |
For example, if only the first three forced modes in the table above improve performance and the other forced modes don't, the bottleneck is in the light operations.
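Deciding whether a forced mode "improves performance" comes down to comparing frame-rate readings taken with and without it. The following Python sketch shows one way to do that. The 10% threshold and the sample values are assumptions chosen for illustration, and `mode_improves` is a hypothetical helper, not a gDEBugger feature.

```python
from statistics import mean

def mode_improves(baseline_fps, forced_fps, min_gain=0.10):
    """True if mean fps under a forced mode beats the baseline by at least min_gain.

    min_gain is a fraction of the baseline (0.10 = 10%); the value is an
    arbitrary assumption meant to filter out normal run-to-run noise.
    """
    base = mean(baseline_fps)
    forced = mean(forced_fps)
    return (forced - base) / base >= min_gain

# Made-up frames/sec readings.
baseline = [60.0, 61.0, 59.0]
forced_mode_a = [88.0, 90.0, 92.0]   # clear gain: this mode removed the bottleneck
forced_mode_b = [60.0, 62.0, 61.0]   # within noise: bottleneck is elsewhere

a_improves = mode_improves(baseline, forced_mode_a)
b_improves = mode_improves(baseline, forced_mode_b)
```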
The next table shows, for each possible graphics pipeline bottleneck location, which forced modes will improve performance ("Yes" means that applying the forced mode is expected to improve performance).
| Bottleneck | ![]() | ![]() | ![]() | ![]() | ![]() | ![]() |
| --- | --- | --- | --- | --- | --- | --- |
| Application | No | No | No | No | No | No |
| Driver | No | No | No | No | No | No |
| Vertex Shading | Yes | No | No | No | No | No |
| Geometry Shaders | Yes | No | No | No | Yes | No |
| Primitive Assembly | Yes | No | No | No | No | No |
| Fragment Shading | Yes | Yes | No | No | Yes | Yes |
| Light Operations | Yes | Yes | Yes | No | No | No |
| Texture Operations | Yes | Yes | No | Yes | No | No |
| Frame Buffers | Yes | Yes | No | No | No | No |
For example, if the bottleneck is in the Texture Operations stage, the first, second, and fourth forced modes in the table will improve performance, while the third, fifth, and sixth won't.
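The bottleneck table can also be read in reverse: given which forced modes improved performance, list the bottleneck locations whose row matches. The Python sketch below encodes the table's Yes/No rows as booleans, in the table's column order. The names `EXPECTED` and `candidate_bottlenecks` are illustrative only.

```python
# One row per bottleneck location; each tuple holds the Yes/No entries for the
# six forced modes, in the table's column order (True = "Yes").
EXPECTED = {
    "Application":        (False, False, False, False, False, False),
    "Driver":             (False, False, False, False, False, False),
    "Vertex Shading":     (True,  False, False, False, False, False),
    "Geometry Shaders":   (True,  False, False, False, True,  False),
    "Primitive Assembly": (True,  False, False, False, False, False),
    "Fragment Shading":   (True,  True,  False, False, True,  True),
    "Light Operations":   (True,  True,  True,  False, False, False),
    "Texture Operations": (True,  True,  False, True,  False, False),
    "Frame Buffers":      (True,  True,  False, False, False, False),
}

def candidate_bottlenecks(observed):
    """Return every bottleneck location whose table row matches the observations."""
    observed = tuple(observed)
    return [name for name, pattern in EXPECTED.items() if pattern == observed]

# Forced modes 1, 2 and 4 improved performance; 3, 5 and 6 did not.
texture_case = candidate_bottlenecks([True, True, False, True, False, False])
# Only forced mode 1 improved performance.
ambiguous_case = candidate_bottlenecks([True, False, False, False, False, False])
```

Some observation patterns match more than one row (for example, Vertex Shading and Primitive Assembly share the same pattern), so the forced modes alone cannot always separate those cases.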