Before installing GPU PerfStudio, uninstall all previous versions of GPU PerfStudio and PerfDash. GPU PerfStudio works best when run remotely. In that setup you must run the GPU PerfStudio installer on both machines: the “client” machine that runs the GPU PerfStudio application and the “server” (target) machine that runs the application being profiled. If you select a custom install, you can install only the server or only the client component on a given machine.
GPU PerfStudio provides statistics on every D3D call executed per frame, in addition to the hardware and driver-level counters described below. Real-time graphs and bar charts can be created for all numeric data. Not all counters are available on all cards.

| Hardware Counter | Description |
| --- | --- |
| % Hardware Utilization | Percent time GPU is busy |
| % Vertex Wait for Pixel | Percent time vertex processing is waiting for pixel processing to finish (can indicate slow pixel shaders) |
| % Pixel Wait for Vertex | Percent time pixel processing is waiting for vertex processing to finish |
| Pre-clip Primitives | Primitive count before clipping |
| Post-clip Primitives | Primitive count after clipping |
| % Blended Pixels | Percent of total pixels drawn with blending enabled |
| ALU to Texture Instruction Ratio | Ratio between pixel shader ALU and texture instructions |
| % Pixels Passed Z-test | Percent of pixels which passed the Z-test |
| Overdraw | Total number of pixels drawn divided by the Overdraw counter resolution. This counter can also be representative of the number of render targets in use. |
| Texture Cache Miss Rate | Texture cache miss rate in bytes per pixel |
| Post HiZ Sample Count | Number of samples after HyperZ |
| Post TopZ Pixel Count | Pixels after early Z culling has taken place |
| Post Shader Pixel Count | Pixels after shading and alpha test have taken place |
| TopZ Reject Rate | Rate of pixel rejection due to early Z test |
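
As a rough illustration of how some of these hardware counters might be read together, the sketch below computes overdraw as described in the table and makes a coarse guess at which stage is stalling from the two wait counters. The function names, the assumption that the counter resolution is `width * height`, and the 50% threshold are all hypothetical choices for this example, not part of GPU PerfStudio.

```python
def overdraw(total_pixels_drawn: int, width: int, height: int) -> float:
    """Pixels drawn divided by the counter resolution.

    Assumption: the resolution is width * height of the render area.
    """
    resolution = width * height
    if resolution <= 0:
        raise ValueError("resolution must be positive")
    return total_pixels_drawn / resolution


def likely_stall(vertex_wait_pct: float, pixel_wait_pct: float,
                 threshold_pct: float = 50.0) -> str:
    """Guess which stage holds the other up, based on the two wait counters.

    threshold_pct is an arbitrary illustrative cutoff, not a documented value.
    """
    if vertex_wait_pct >= threshold_pct and vertex_wait_pct > pixel_wait_pct:
        return "pixel-bound"   # vertex processing waits on pixel processing
    if pixel_wait_pct >= threshold_pct and pixel_wait_pct > vertex_wait_pct:
        return "vertex-bound"  # pixel processing waits on vertex processing
    return "inconclusive"
```

For example, a 1920x1080 frame in which three times the screen area is drawn gives an overdraw of 3.0, and a high % Vertex Wait for Pixel reading with a low % Pixel Wait for Vertex reading would be classified as pixel-bound under these assumptions.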

| Driver Data Counter | Description |
| --- | --- |
| Framerate | Frames per second |
| LocalTextureMem | Local texture memory used |
| AGPTextureMem | AGP texture memory used |
| LocalVBIBMem | Local vertex buffer and index buffer memory used |
| AGPVBIBMem | AGP vertex buffer and index buffer memory used |
| TextureUpload | Texture data uploaded |
| VBIBUpload | Vertex buffer and index buffer data uploaded |
| PrimsPerRSChange | Primitives rendered per render state change |
| PrimsPerTSChange | Primitives rendered per texture state change |
| PrimsPerVSChange | Primitives rendered per vertex shader change |
| PrimsPerPSChange | Primitives rendered per pixel shader change |
| PrimsPerVSCChange | Primitives rendered per vertex shader constant change |
| PrimsPerPSCChange | Primitives rendered per pixel shader constant change |
| FlipStall | Stalls on frame buffer flip |
| VBStall | Stalls on vertex buffer |
| GeometryBufferAllocatedDefault | Allocated memory for vertex and index buffers and stream output |
| GeometryBufferAllocatedImmutable | Allocated memory for vertex and index buffers and stream output |
| GeometryBufferAllocatedDynamic | Allocated memory for vertex and index buffers and stream output |
| GeometryBufferAllocatedStaging | Allocated memory for vertex and index buffers and stream output |
| GeometryBufferAllocated | Allocated memory for vertex and index buffers and stream output |
| GeometryBufferUsedPercentage | Percentage of allocated geometry buffer memory used |
| ConstantBufferAllocatedDefault | Allocated memory for constant buffers |
| ConstantBufferAllocatedImmutable | Allocated memory for constant buffers |
| ConstantBufferAllocatedDynamic | Allocated memory for constant buffers |
| ConstantBufferAllocatedStaging | Allocated memory for constant buffers |
| ConstantBufferAllocated | Allocated memory for constant buffers |
| ConstantBufferUsedPercentage | Percentage of allocated constant buffer memory used |
| RenderTargetAllocatedDefault | Allocated memory for render targets |
| RenderTargetAllocatedImmutable | Allocated memory for render targets |
| RenderTargetAllocatedDynamic | Allocated memory for render targets |
| RenderTargetAllocatedStaging | Allocated memory for render targets |
| RenderTargetAllocated | Allocated memory for render targets |
| RenderTargetUsedPercentage | Percentage of allocated render target memory used |
| TextureDepthStencilShaderAllocatedDefault | Allocated memory for ShaderResources, DepthStencil buffers and Textures |
| TextureDepthStencilShaderAllocatedImmutable | Allocated memory for ShaderResources, DepthStencil buffers and Textures |
| TextureDepthStencilShaderAllocatedDynamic | Allocated memory for ShaderResources, DepthStencil buffers and Textures |
| TextureDepthStencilShaderAllocatedStaging | Allocated memory for ShaderResources, DepthStencil buffers and Textures |
| TextureDepthStencilShaderAllocated | Allocated memory for ShaderResources, DepthStencil buffers and Textures |
| TextureDepthStencilShaderUsedPercentage | Percentage of allocated TextureDepthStencilShader memory used |
| PrimsPerDepthStencilStateChange | Primitives rendered per depth stencil state change |
| PrimsPerBlendStateChange | Primitives rendered per blend state change |
| PrimsPerGeometryShaderChange | Primitives rendered per geometry shader change |
| PrimsPerPSSamplerStateChange | Primitives rendered per pixel shader sampler state change |
| PrimsPerVSSamplerStateChange | Primitives rendered per vertex shader sampler state change |
| PrimsPerGSSamplerStateChange | Primitives rendered per geometry shader sampler state change |
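
The "PrimsPer...Change" and "...UsedPercentage" counters above are simple ratios. The following is a minimal sketch of that arithmetic, assuming the raw inputs (primitive counts, state-change counts, byte totals) are available from your own instrumentation. The function names and the handling of zero denominators are hypothetical, since the source does not say how GPU PerfStudio treats those cases.

```python
def prims_per_change(primitives: int, state_changes: int) -> float:
    """Primitives rendered per state change; low values suggest poor batching.

    Assumption: with zero changes, the whole frame counts as one batch.
    """
    if state_changes == 0:
        return float(primitives)
    return primitives / state_changes


def used_percentage(used_bytes: int, allocated_bytes: int) -> float:
    """Share of allocated memory actually used, as a percentage.

    Assumption: an empty allocation reports 0% rather than raising an error.
    """
    if allocated_bytes == 0:
        return 0.0
    return 100.0 * used_bytes / allocated_bytes
```

For instance, 12,000 primitives drawn across 40 pixel shader changes works out to 300 primitives per change, and 48 MB used out of 64 MB allocated is 75%.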

| Override | Description/Possible Bottleneck |
| --- | --- |
| Force 2x2 Textures | Is texture bandwidth (large textures) affecting performance? |
| Force Disable Texture Filtering | Are expensive texture filtering modes affecting performance? |
| Force 1x1 Scissor Region | Identifies vertex processing bottlenecks (by removing most pixel processing) |
| Force Simple Pixel Shaders | Identifies expensive pixel shaders |
| Force Skip Draw*Prim Calls | Identifies non-GPU bottlenecks (by removing most 3D graphics work) |
| Force Z Test Enable | Identifies z-order performance issues |
| Force Z Write Enable | Identifies z-order performance issues |
| Force Alpha Blend Enable | Identifies alpha-blending performance issues |
| Force Alpha Test Enable | Can identify problems related to early Z test |
| Force Cull Mode | Can show culling efficiency |
| Force Fill Mode | Used for debugging and identifying vertex density |
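
One way to use the overrides above is to compare the frame rate with and without each override applied: a large gain suggests that the work the override removes was limiting performance. The sketch below ranks overrides by that gain. It assumes you record the frame rates yourself (for example, by reading the Framerate counter). The function names and the 10% cutoff are hypothetical and not part of GPU PerfStudio.

```python
from typing import Dict, List, Tuple


def speedup_pct(baseline_fps: float, override_fps: float) -> float:
    """Frame rate gain, as a percentage of the baseline frame rate."""
    if baseline_fps <= 0:
        raise ValueError("baseline_fps must be positive")
    return 100.0 * (override_fps - baseline_fps) / baseline_fps


def rank_overrides(baseline_fps: float,
                   results: Dict[str, float],
                   min_gain_pct: float = 10.0) -> List[Tuple[str, float]]:
    """Return overrides whose gain meets the cutoff, largest gain first.

    min_gain_pct is an arbitrary illustrative cutoff to filter out noise.
    """
    gains = [(name, speedup_pct(baseline_fps, fps))
             for name, fps in results.items()]
    significant = [g for g in gains if g[1] >= min_gain_pct]
    return sorted(significant, key=lambda g: g[1], reverse=True)
```

With a 60 fps baseline, measurements of 63 fps under Force 2x2 Textures, 90 fps under Force Simple Pixel Shaders, and 120 fps under Force 1x1 Scissor Region would rank the scissor override first, pointing toward pixel processing rather than texture bandwidth as the likely bottleneck.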