DX12 Rendering Backend
Gameface comes with several example rendering backends. One of them is the DX12 backend, which uses the DirectX 12 rendering API. This page explains the caveats of the DX12 backend and how to use it properly in a client application. For more general information on the rendering flow of Gameface, see Rendering Architecture.
Using and driving the DX12 backend
When using the DirectX 12 rendering API, application code usually has several frames in flight. This means that multiple frames are recorded in command lists. In addition, command lists submitted to a command queue are not executed immediately, and the GPU might continue to use them for some time. The application code therefore has to be careful not to reuse or destroy a GPU resource that might still be in use by the GPU.
For those reasons, the DX12 backend has the concept of a rendering frame. The backend respects that there might be multiple frames in flight (4 by default; see the comments near Dx12Backend.h::Dx12Backend::MAX_FRAMES_IN_FLIGHT). Internally, the backend rotates and tracks resources so that it does not use the same ones every frame. It also does not delete resources immediately when handling BRC_Destroy* commands, but only when it is safe to do so.
The safe use of GPU resources across frames in flight is achieved through the use of “fence values”. The correct flow for driving the DX12 backend is:
- At the beginning of a frame on the rendering thread, before any cohtml::View instances are painted, call Dx12Backend::BeginFrame(list) with the command list object that will be used for this frame.
- At the end of the frame, when all cohtml::View instances have been painted, call Dx12Backend::EndFrame(). The returned uint64_t value is the “fence value” for this frame.
- When the GPU has finished its work with the command list for this frame, call Dx12Backend::ListComplete(frameFenceValue) with the corresponding fence value.
The call to Dx12Backend::ListComplete signals to the backend that all resources used in the frame with the given fence value are no longer used by the GPU and are free to be destroyed if needed. Knowing when the GPU has finished work with a command list is usually achieved by using DX12 fences and calls to ID3D12CommandQueue::Signal.
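The deferred destruction described above can be pictured with a small, self-contained model. This is only a sketch and not the backend's actual code: DeferredDeleter, its members, and the integer resource ids are hypothetical names invented for illustration. It also assumes that fence values start at 1 and grow by one per frame, which the documentation does not state.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <vector>

// Hypothetical model of fence-based deferred destruction.
// Integer ids stand in for real GPU objects.
class DeferredDeleter {
public:
    // Ends the frame being recorded and returns its fence value
    // (assumption: values start at 1 and grow by one per frame).
    std::uint64_t EndFrame() { return ++m_LastFence; }

    // Requests destruction of a resource. The resource may still be
    // referenced by the frame being recorded, so it is tagged with
    // the fence value that frame will receive.
    void Destroy(int resourceId) {
        m_Pending.push_back({m_LastFence + 1, resourceId});
    }

    // Reports that the GPU finished the frame with 'completedFence'.
    // Returns the ids that are now safe to release.
    std::vector<int> ListComplete(std::uint64_t completedFence) {
        assert(completedFence <= m_LastFence);
        std::vector<int> released;
        while (!m_Pending.empty() &&
               m_Pending.front().fence <= completedFence) {
            released.push_back(m_Pending.front().resourceId);
            m_Pending.pop_front();
        }
        return released;
    }

    std::size_t PendingCount() const { return m_Pending.size(); }

private:
    struct Entry {
        std::uint64_t fence;
        int resourceId;
    };
    std::deque<Entry> m_Pending;  // Ordered by ascending fence value.
    std::uint64_t m_LastFence = 0;
};
```

In this model a resource destroyed while frame N is being recorded is released only once ListComplete is called with a value of at least N, which mirrors the contract described for Dx12Backend::ListComplete.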
A typical flow for using the DX12 backend is as follows:
// Signal the completion of some frame that has been
// previously recorded. The value of 'BackendFence' will
// be the last frame that was finished by the GPU
previousFrameFence = BackendFence->GetCompletedValue();
dx12Backend->ListComplete(previousFrameFence);
list = AcquireCommandListForCurrentFrame();
// Begin the frame for the backend
// list is a ID3D12GraphicsCommandList*
dx12Backend->BeginFrame(list);
// Paint all views. This records rendering commands in 'list'
cohtmlView_1->Paint(...);
...
cohtmlView_N->Paint(...);
// End the frame and get a fence value for this frame
frameFence = dx12Backend->EndFrame(list);
// Execute the command list that was recorded for this frame
ID3D12CommandList* lists[]{ list };
CommandQueue->ExecuteCommandLists(1, lists);
// Tell the GPU to update the 'BackendFence' when
// it reaches this place in the command queue
CommandQueue->Signal(BackendFence, frameFence);
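The flow above only polls the fence. An application that records frames faster than the GPU consumes them usually also has to stall the CPU so that it never exceeds the number of frames in flight the backend was built for. The helpers below are a minimal sketch of that arithmetic. FenceToWaitFor, MustWait, and kMaxFramesInFlight are hypothetical names, and the sketch assumes that frame fence values start at 1 and increase by one per frame.

```cpp
#include <cassert>
#include <cstdint>

// Assumed limit; the text above states the backend default is 4.
constexpr std::uint64_t kMaxFramesInFlight = 4;

// Returns the fence value that must be completed before the frame
// that will receive 'nextFrameFence' may be recorded, or 0 when no
// wait is needed yet.
constexpr std::uint64_t FenceToWaitFor(
    std::uint64_t nextFrameFence,
    std::uint64_t maxInFlight = kMaxFramesInFlight) {
    return nextFrameFence > maxInFlight ? nextFrameFence - maxInFlight : 0;
}

// True when the CPU has to block before recording the next frame,
// given the value last reported by the fence.
constexpr bool MustWait(
    std::uint64_t completedFence,
    std::uint64_t nextFrameFence,
    std::uint64_t maxInFlight = kMaxFramesInFlight) {
    return completedFence < FenceToWaitFor(nextFrameFence, maxInFlight);
}
```

With a limit of 4, recording the fifth frame requires the first one to be complete. In a real application the blocking wait itself is typically implemented with ID3D12Fence::SetEventOnCompletion followed by WaitForSingleObject on the event handle.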
Descriptor Heaps in the DX12 Backend
A big part of all rendering backends is the management of GPU resources. Resources are created when handling BRC_Create* backend commands and are at some point destroyed when handling BRC_Destroy* commands. The managed resources are:
- Textures - see BRC_CreateTexture, BRC_CreateTextureWithDataPtr, and BRC_DestroyTexture
- Depth-Stencil Textures - see BRC_CreateDSTexture and BRC_DestroyDSTexture
- Vertex buffers - see BRC_CreateVertexBuffer and BRC_DestroyVertexBuffer
- Index buffers - see BRC_CreateIndexBuffer and BRC_DestroyIndexBuffer
- Constant buffers - see BRC_CreateConstantBuffer and BRC_DestroyConstantBuffer
- Pipeline state objects - see BRC_CreatePSO and BRC_DestroyPSO
However, DX12 has a couple of other resource types that have to be managed: the GPU visible and the CPU staging descriptor heaps. For more information, see the official Descriptor Heaps Overview.
With regard to descriptor heap usage, the DX12 backend implements the following scheme:
- Views of resources (shader resource views, depth-stencil views, render target views) are placed on CPU staging heaps when created. See Dx12Backend::m_SRVStagingDesc, Dx12Backend::m_RTVStagingDesc, Dx12Backend::m_DSVStagingDesc, and Dx12Backend::m_SamplersStagingDesc.
- When views need to be bound, they are first copied to GPU visible heaps. See the logic in Dx12Backend::SetPSTextures.
GPU visible descriptor heaps are kept in a ring buffer in Dx12Backend::m_FrameDescriptors, and every frame the DX12 backend uses the next entry in the ring buffer. Each ring buffer entry might hold multiple descriptor heaps, as one heap might not be enough for a single frame. When a frame begins (see Dx12Backend::BeginFrame), the DX12 backend can destroy some of the descriptor heaps in the current ring buffer entry. By default, the backend keeps up to 4 descriptor heaps per ring buffer entry. This number can be adjusted; see Dx12Backend::DESCRIPTOR_HEAPS_TO_KEEP_ALIVE.
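The ring buffer behaviour can be modelled without any DirectX types. The sketch below is an illustration under stated assumptions, not the backend's implementation: FakeHeap, FrameDescriptorRing, and all of their members are hypothetical, heaps are plain counters, and trimming is assumed to happen every time a frame begins.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a GPU visible descriptor heap.
struct FakeHeap {
    std::size_t capacity = 0;
    std::size_t used = 0;
};

// Hypothetical model of a per-frame ring of descriptor heaps.
class FrameDescriptorRing {
public:
    FrameDescriptorRing(std::size_t entries, std::size_t heapSize,
                        std::size_t heapsToKeepAlive)
        : m_Entries(entries), m_HeapSize(heapSize),
          m_KeepAlive(heapsToKeepAlive), m_Current(entries - 1) {
        assert(entries > 0 && heapSize > 0);
    }

    // Moves to the next ring entry, drops heaps beyond the keep-alive
    // limit, and resets the remaining heaps for reuse.
    void BeginFrame() {
        m_Current = (m_Current + 1) % m_Entries.size();
        std::vector<FakeHeap>& heaps = m_Entries[m_Current];
        if (heaps.size() > m_KeepAlive) heaps.resize(m_KeepAlive);
        for (FakeHeap& heap : heaps) heap.used = 0;
        m_ActiveHeap = 0;
    }

    // Reserves 'count' contiguous descriptors in the current entry and
    // returns the index of the heap used. A new heap is created only
    // when the existing ones cannot satisfy the request.
    std::size_t Allocate(std::size_t count) {
        assert(count <= m_HeapSize);
        std::vector<FakeHeap>& heaps = m_Entries[m_Current];
        for (;;) {
            if (m_ActiveHeap == heaps.size()) {
                FakeHeap heap;
                heap.capacity = m_HeapSize;
                heaps.push_back(heap);
                ++m_HeapsCreated;
            }
            FakeHeap& heap = heaps[m_ActiveHeap];
            if (heap.used + count <= heap.capacity) {
                heap.used += count;
                return m_ActiveHeap;
            }
            ++m_ActiveHeap;  // Current heap is full; try the next one.
        }
    }

    std::size_t HeapsInCurrentEntry() const {
        return m_Entries[m_Current].size();
    }
    std::size_t HeapsCreated() const { return m_HeapsCreated; }

private:
    std::vector<std::vector<FakeHeap>> m_Entries;
    std::size_t m_HeapSize;
    std::size_t m_KeepAlive;
    std::size_t m_Current;
    std::size_t m_ActiveHeap = 0;
    std::size_t m_HeapsCreated = 0;
};
```

The model makes the trade-off visible: a low keep-alive limit frees memory after a heavy frame but forces new heaps to be created when the load returns, while a high limit avoids creation cost at the price of idle heaps.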
The GPU visible descriptor heaps can be thought of as regular GPU resources that take up GPU memory. Depending on the application use case, the descriptor heaps might therefore create GPU memory pressure. The things to consider:
- Creating GPU descriptor heaps is an expensive operation and avoiding it is important
- Having unused GPU descriptor heaps wastes GPU memory
For these reasons, choosing an appropriate value for DESCRIPTOR_HEAPS_TO_KEEP_ALIVE is important for achieving a good trade-off between performance and memory.
Performance tips
Achieving good performance with low-level graphics APIs like DirectX 12 requires knowledge of the concrete problem being solved. A general-purpose rendering backend like the DX12 backend goes against this philosophy. For this reason, fine-tuning the DX12 backend might be necessary when performance and GPU memory utilization are concerns. Here is a list of things to consider when using the DX12 backend:
- Adjust the value of Dx12Backend::DESCRIPTOR_HEAPS_TO_KEEP_ALIVE. See the previous section for explanations.
- Adjust the values of Dx12Backend::CBVSRVUAV_DESCRIPTOR_HEAP_SIZE, Dx12Backend::RTV_DESCRIPTOR_HEAP_SIZE, and Dx12Backend::DSV_DESCRIPTOR_HEAP_SIZE. Those control how big the allocated GPU descriptor heaps are.
- Adjust the values of Dx12Backend::CBVSRVUAV_STAGING_DESCRIPTOR_HEAP_SIZE, Dx12Backend::RTV_STAGING_DESCRIPTOR_HEAP_SIZE, and Dx12Backend::DSV_STAGING_DESCRIPTOR_HEAP_SIZE. Those control how big the allocated staging CPU descriptor heaps are.
- Adjust the value of PersistentConstantBuffersCount in Dx12Backend::FillCaps. This tells the rendering library how many constant buffers to keep alive per frame per ring buffer entry per cohtml::View. Allocating buffers in DX12 is expensive, and a complex UI might require more constant buffers.
- Adjust Dx12Backend::MAX_FRAMES_IN_FLIGHT if the application has a different number of frames in flight. More frames in flight generally imply keeping more copies of some resources. See the comments near Dx12Backend::MAX_FRAMES_IN_FLIGHT for more information.
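When tuning these constants it can help to estimate how much descriptor memory the retained GPU visible heaps may occupy. The helper below is a rough sketch. HeapBudget and RetainedDescriptorBytes are hypothetical names, the number of ring buffer entries is assumed to equal the number of frames in flight, and the per-descriptor size is hardware dependent (at run time it can be queried with ID3D12Device::GetDescriptorHandleIncrementSize).

```cpp
#include <cassert>
#include <cstdint>

// All fields are illustrative assumptions. Real values come from the
// Dx12Backend constants and from the device at run time.
struct HeapBudget {
    std::uint64_t framesInFlight;       // ring buffer entries (assumed)
    std::uint64_t heapsKeptAlive;       // heaps retained per entry
    std::uint64_t descriptorsPerHeap;   // descriptors in each heap
    std::uint64_t descriptorSizeBytes;  // device-specific increment size
};

// Upper bound on the bytes held by retained GPU visible heaps of one
// heap type, assuming every ring entry keeps its maximum heap count.
constexpr std::uint64_t RetainedDescriptorBytes(const HeapBudget& budget) {
    return budget.framesInFlight * budget.heapsKeptAlive *
           budget.descriptorsPerHeap * budget.descriptorSizeBytes;
}
```

For example, with the purely illustrative values of 4 frames in flight, 4 retained heaps, 1024 descriptors per heap, and 32 bytes per descriptor, the bound is 524288 bytes (512 KiB) for that heap type.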