Overview of the changes introduced in the atomic effects fork of obs studio - HoneyHazard/obs-studio-atomic-effects GitHub Wiki
- Introduce results; each result is tied to its respective param
- Results are passed in much like other params before the draw, and are fetched back as results after the draw
- Only one type of result is introduced: `atomic_uint`. This type is used with `atomicCounterIncrement(...)` effect statements to allow atomic increments of a counter, which enabled us to build the Pixel Match Switcher plugin.
- D3D11: change from the `ps_4_0` to the `ps_5_0` shader model to support UAV variables
- OpenGL: change `#version 330` to `#version 460` to support atomic counters
- Introducing the atomic counter results
- Effects
- Parsing the intermediate shaders
- Graphics API
- libobs-opengl
- libobs-d3d11
- Atomic counters are special variables that allow safe increments (or decrements) of a counter in the parallel shader environment. The Pixel Match Switcher plugin for OBS uses this feature of the graphics systems to count matching pixels as video data passes through the plugin's filter. Because the counting is done in the shader, the atomic counters perform well on any reasonably modern video card (it must support `ps_5_0` for D3D11 or `version 460` for OpenGL).
- Atomic counters require special treatment by both D3D11 and OpenGL. They need to reside in specially configured blocks of graphics memory, and require a highly specialized API to be initialized and used.
- We have selected the `atomic_uint` keyword to represent the atomic counters being introduced into the OBS effect system. This keyword is also how the counters are represented in GLSL. Like other global variables of the effect, it must be preceded by `uniform`.
- We have selected the `atomicCounterIncrement(...)` function, also borrowed from GLSL, to perform an atomic increment on a counter.
Effects provide cross-platform wrappers for the GLSL and HLSL shaders used by OBS. In the OBS effect system, you specify both vertex and pixel shader behavior in the same effect file.
Roughly speaking, the effect initialization is as follows:
- The effect file is parsed by the `effect-parser` module, generating data structures for all building blocks of the shaders it needs to generate.
- These building blocks are then used to write strings that represent intermediate versions of the vertex and pixel shaders for the effect.
- The intermediate versions of the shaders are then parsed by the `shader-parser` module to generate data structures representing the shader functionality.
- Finally, the data structures generated by the `shader-parser` are passed down to either the `gl-shaderparser` or the `d3d11-shaderprocessor` module to write the actual shaders that will do the effect's work.
We introduce two new syntax elements to support use of the atomic counters in the OBS effect language. The two are borrowed from GLSL, which, arguably, has a more user-friendly syntax for implementing atomic counters than HLSL.
- `atomic_uint` is the variable type used for the atomic counters. Like other global variables of the effect, it must be preceded by the `uniform` keyword.
  - Example: `uniform atomic_uint myCounter;`
- The `atomicCounterIncrement(...)` statement increments an atomic counter.
  - Example: `atomicCounterIncrement(myCounter);`
- Note: unlike actual GLSL, the effect code does not require and does not support the `layout(...)` qualifier; the OpenGL graphics subsystem has to generate the qualifier whenever an atomic counter is used.
The data structures built for each effect are used to dynamically generate intermediate shader code, and provide mapping of high-level abstractions for effect variables to lower-level abstractions of shader variables. While modifying the effect data structures, we mirror existing abstractions for effect parameters and their pathways to the Graphics API, as we introduce new abstractions and new pathways for interfacing with program and shader results.
This struct provides a high-level interface for working with an effect's parameters.
It has an array of parameter data structures called `params`, of type `gs_effect_param`. So we also add a new member, `results`, which is an array of `gs_effect_result`.
High-level interface for working with a result. Members are:
- `name`: result name
- `type`: variable type; currently only `GS_SHADER_PARAM_ATOMIC_UINT` is supported
- `cur_val`: value that was last retrieved from the result
Analogous to `gs_effect_param`.
Serves as a mapping from the higher-level effect result interface `gs_effect_result` to the lower-level shader result handle `gs_sresult_t`.
Analogous to `pass_shaderparam`.
Has handle pointers for the vertex and pixel shaders, and an array for mapping effect params to shader params. These are used later to make passing in parameter values possible.
So we add `program_results`, an array of `pass_shaderresult`, to map effect results to shader results in the same way and make fetching the results possible.
The `effect-parser` module converts the effect code into data structures and uses them to generate intermediate shader code. The input effect string is tokenized by whitespace and then processed, token by token, to build data structures representing individual behavior units of the effect. These are then used to generate intermediate versions of the vertex and pixel shaders, so they can be passed down to the `shader-parser`.
Here the goal was to introduce support for results and the `atomic_uint` type.
This data structure maintains information about individual effect parameters as they are parsed from the effect code. Since some parameters are now also results, we add a new field to be used with those results:
- `is_result` gets set to `true` for any parameter that is also a result. This later leads to a `gs_effect_result` being instantiated for every param marked with this flag.
Assigns fields to `ep_param`. Modified to receive and assign the new field `is_result`.
Calls `ep_param_init(...)` with values received from `ep_parse_other(...)` and makes sure no erroneous symbols follow the param declaration.
Our modifications here are limited to propagating the `is_result` value to `ep_param_init(...)`.
This function parses anything in the effect code that is not whitespace, `struct`, `technique`, or `sampler_state`. It has variables for reacting to the `property`, `const`, and `type` keywords, and calls the functions that parse functions and params. Our modifications add special handling of the `atomic_uint` results.
- An `is_result` variable is added, and is set to `true` whenever an `atomic_uint` type token is encountered.
- The call to `ep_parse_param(...)` also passes the `is_result` variable to the function.
After the effect code is parsed into data structures, these data structures are used to generate the intermediate shader code (so it can be processed later by the `shader-parser`). While the intermediate shader code for a vertex/pixel shader is built, a mapping is constructed so that higher-level effect params can be linked with the corresponding lower-level shader params.
The additions here construct an analogous mapping between effect results and shader results.
This function receives a pointer to an `ep_param` as input and writes the param's declaration into the intermediate shader code. It also appends the name of every used param to the array `used_params`.
We teach the function to also append the name of a param that is also a result to the list of results. It now receives a pointer `used_results`, an array of strings holding the names of the results used in the effect. If the `ep_param` passed into the function is marked with the `is_result` flag, its name is appended to `used_results`.
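The bookkeeping described above can be sketched in C as follows. This is a minimal illustration under stated assumptions, not the fork's code: `name_list`, `param_sketch`, `name_list_push`, and `record_used_param` are hypothetical names, and a fixed-size array stands in for the dynamic string arrays the parser uses.

```c
#include <stdbool.h>
#include <stddef.h>

#define MAX_NAMES 16

/* Hypothetical stand-in for the parser's dynamic array of strings. */
struct name_list {
	const char *names[MAX_NAMES];
	size_t num;
};

/* Hypothetical, reduced version of ep_param: only the fields needed here. */
struct param_sketch {
	const char *name;
	bool is_result; /* set when the param was declared as atomic_uint */
};

void name_list_push(struct name_list *list, const char *name)
{
	if (list->num < MAX_NAMES)
		list->names[list->num++] = name;
}

/* Records a used param; a param flagged is_result lands in both lists. */
void record_used_param(const struct param_sketch *param,
		       struct name_list *used_params,
		       struct name_list *used_results)
{
	name_list_push(used_params, param->name);
	if (param->is_result)
		name_list_push(used_results, param->name);
}
```

Calling `record_used_param` for an ordinary param and for a counter param leaves both names in `used_params` and only the counter's name in `used_results`.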
ep_write_func_param_deps(...), ep_write_func_func_deps(...), ep_write_func(...), ep_makeshaderstring(...) [modified functions]
These functions call each other and other lower-level functions as the intermediate shader is built from the data structures representing it. They maintain `used_params`, an array of strings representing the params that were encountered, which is eventually passed down to `ep_write_param(...)`, where a name is appended to the array for every used param.
They are all modified so that `used_results`, an array of strings holding the names of the results used in the effect, can also make its way to the `ep_write_param(...)` function, where it is updated for every used result.
- Takes as input a name for a result and a pointer to an allocated `gs_effect_result`, so its fields can be initialized.
- Finds the corresponding `gs_effect_param` of the same name in the effect's array of params, and retains the pointer in the `gs_effect_result` being initialized. Because this is called (and has to be called) after the params are "compiled", the array of params is stable and a pointer into that array is safe.
- The param/result name string is also duplicated into the `gs_effect_result`.
This function builds the mapping between the higher-level handles of effect results and the lower-level handles of shader results, and also triggers the mapping between effect params and effect results.
For every result name in the input array of strings `used_results`:
- Fetches a pointer to the `gs_effect_result` from the `effect_parser` that corresponds to the result name.
- Calls `ep_compile_result(...)` so the `gs_effect_result` receives a pointer to the corresponding `gs_effect_param`.
- Finds a pointer/handle of type `gs_sresult_t` from the `gs_shader_t` that corresponds to the result name.
- Appends the mapped pair of result handles, represented by `pass_shaderresult`, to the `pass_results` array, which is passed in by pointer.
Analogous to `ep_compile_pass_shaderparams(...)`.
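The name-based pairing described above can be sketched in C as follows. This is a minimal illustration, not the fork's implementation: every type and function name here is hypothetical, plain arrays replace the real dynamic arrays, and skipping a name that has no match on either side is an assumption about error handling.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced stand-ins for the effect-level and shader-level
 * results; only the name used for matching is kept. */
struct eresult_sketch {
	const char *name;
};

struct sresult_sketch {
	const char *name;
};

/* Plays the role of pass_shaderresult: one effect/shader handle pair. */
struct result_pair_sketch {
	struct eresult_sketch *eresult;
	struct sresult_sketch *sresult;
};

/* Pairs every used result name with its effect-level and shader-level
 * handle. Names without a match on both sides are skipped (assumption).
 * Returns the number of pairs written. */
size_t map_results_by_name(const char *const *used_results, size_t num_used,
			   struct eresult_sketch *eresults, size_t num_eresults,
			   struct sresult_sketch *sresults, size_t num_sresults,
			   struct result_pair_sketch *pairs, size_t max_pairs)
{
	size_t num_pairs = 0;

	for (size_t i = 0; i < num_used && num_pairs < max_pairs; i++) {
		struct eresult_sketch *er = NULL;
		struct sresult_sketch *sr = NULL;

		for (size_t j = 0; j < num_eresults; j++)
			if (strcmp(eresults[j].name, used_results[i]) == 0)
				er = &eresults[j];
		for (size_t j = 0; j < num_sresults; j++)
			if (strcmp(sresults[j].name, used_results[i]) == 0)
				sr = &sresults[j];

		if (er && sr) {
			pairs[num_pairs].eresult = er;
			pairs[num_pairs].sresult = sr;
			num_pairs++;
		}
	}
	return num_pairs;
}
```

The resulting pair array is what a later download step would walk to copy data from each shader-level handle into its effect-level counterpart.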
This function invokes generation of the shader string for either the vertex or the pixel shader. While doing so, it also generates the mapping of effect parameters to shader parameters, which is used later to make setting params possible.
We need to provide a mapping of effect results to shader results, very similar to how it is done for params, so that result retrieval works. The mapping between effect results and the corresponding effect params is also set up indirectly.
- Add the variable `used_results`, an array of strings holding the names of the results used in the shader. `used_results` is updated during the calls to `ep_makeshaderstring(...)`.
  - This is analogous to how `used_params` is handled.
- Obtain a pointer variable `pass_results` of type `pass_shaderresult`. It points to the data member `program_results` of the `ep_pass` pointer, and is modified in place to retain the results mapping in the `ep_pass` data structure.
  - Similar to the `vertshader_params` and `pixelshader_params` members of `ep_pass`.
- Add a call to `ep_compilepass_shaderresults(...)`, passing in the `used_results` and `pass_results` variables; `pass_results` is updated by the call.
  - Analogous to how `ep_compile_pass_shaderparams(...)` is called.
  - This also results in effect results receiving pointers to the corresponding effect params.
In a typical scenario, the effect user obtains handles to an effect's parameters and uses those handles to pass parameter values to the effect, so the values can be passed down through the lower-level layers and eventually end up in the actual graphics system and shader machinery.
We have to expand this usage to include results. As with params, the user can look up result handles by name, and can then retrieve result values from the result handles.
The result variables are also connected to param variables of the same name. `atomic_uint` requires a new param type that is not `int`, and also requires some special care by either graphics system when it is used as a parameter.
Obtains a `gs_eresult_t` pointer by name. This handle can then be used by `gs_effect_get_atomic_uint_result(...)` to obtain an effect result.
Analogous to `gs_effect_get_param_by_name(...)`.
This lets you pass a value to an unsigned integer atomic counter in the shader, just as you would set any other param. Nothing is new here except the parameter type, which is used with the `atomic_uint` variable in the effect code. All lower-level details are in other functions.
Analogous to `gs_effect_set_int(...)`.
Given a `gs_eresult_t` pointer/handle, retrieves the value of the `atomic_uint` result after drawing with the effect has finished.
Again, nothing very low-level here; just some memory copies. Very similar to the `gs_effect_set_xyz(...)` functions (where xyz is a data type), except that we get instead of set, since the results come back after the draw instead of being passed in before it.
This is a wrapper for updating effect parameters with new values. In addition to some error checking, the function avoids updating `uniform` data in the shaders when the given value is no different from the previously assigned value (so no update should be necessary).
We customize the behavior to always force an update whenever the parameter type is `GS_SHADER_PARAM_ATOMIC_UINT`. Our atomic counters are both a param and a result, and are treated a little differently from other `uniform`s by the graphics systems. So we must ensure the value is passed in to the shaders before every draw, even if we keep sending the same value.
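The forced-update rule can be sketched in C as follows. This is a simplified illustration under stated assumptions: `eparam_sketch`, `param_type_sketch`, and `set_val_sketch` are hypothetical names, and a fixed 16-byte buffer with a `changed` flag stands in for however the real wrapper stores the current value and marks it for upload.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical, reduced param types; only what the sketch needs. */
enum param_type_sketch { PARAM_INT, PARAM_ATOMIC_UINT };

struct eparam_sketch {
	enum param_type_sketch type;
	unsigned char cur_val[16]; /* last value passed in */
	size_t cur_size;
	bool changed; /* true when the value must be uploaded before the draw */
};

/* Stores a new value. An unchanged value is skipped for ordinary params,
 * but an atomic counter is always marked for upload. */
void set_val_sketch(struct eparam_sketch *param, const void *data, size_t size)
{
	if (size > sizeof(param->cur_val))
		return;

	bool same = param->cur_size == size &&
		    memcmp(param->cur_val, data, size) == 0;
	if (same && param->type != PARAM_ATOMIC_UINT)
		return;

	memcpy(param->cur_val, data, size);
	param->cur_size = size;
	param->changed = true;
}
```

Resetting a counter to `0` before every draw is the typical case this covers: the value never differs between calls, yet it still has to reach the shader each time.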
This is a cleanup function for `gs_effect_pass`. We modify it to also clean up the newly introduced `program_results` array.
This is a cleanup function for `gs_effect`. We modify it to also clean up the newly introduced `results` array.
The `shader-parser` processes the intermediate vertex and pixel shader code generated by the `effect-parser`. Data structures are generated to represent the shader behavior. This allows reformatting the intermediate shader into GLSL or HLSL code, depending on which graphics subsystem is used.
Our changes here are for supporting results, and for assigning a unique index for each atomic counter variable, so either graphics system will be ready to allocate resources for its specialized handling of the counters.
This structure holds an instance of a `cf_parser`, which is a C-style parser, and has arrays of data structures for params, structs, samplers, and funcs.
We need an incrementing counter to assign a unique, increasing index to each atomic counter we encounter in the intermediate shader code. This struct is a suitable place to keep the next index to be assigned, so we add an `atomic_counter_next_index` integer field.
This struct represents a variable in the shader code. We want results to be a special kind of variable, so we add an `is_result` flag to the fields. We also add `atomic_counter_index` so that each `atomic_uint` variable can be uniquely identified in preparation for handling by either graphics system. Indices are assigned in increasing order, following the order of the counters' declarations in the intermediate shader.
This initializes members of `shader_var`. We modify it to also initialize `atomic_counter_next_index` to `0`, so the counter enumeration index begins at `0`.
This function initializes a `shader_var` that was previously allocated, assigning its fields based on the function parameters.
New function parameters are added to support results and atomic counters; specifically:
- `is_result` is set to `true` when the variable is also a result.
- `atomic_counter_next_index` is an integer that is passed by pointer. Whenever the variable is of type `atomic_uint`, this integer is copied to the `atomic_counter_index` field of `shader_var`, and then the next index to be assigned is incremented.
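The index assignment can be sketched in C as follows. This is a minimal illustration, not the fork's code: `shader_var_sketch` and `var_init_sketch` are hypothetical names, the type is compared as a plain string, and using `-1` for variables that are not counters is an assumption made for the sketch.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical, reduced version of shader_var. */
struct shader_var_sketch {
	const char *type;
	const char *name;
	bool is_result;
	int atomic_counter_index; /* -1 when not a counter (assumption) */
};

/* Initializes a variable. An atomic_uint variable takes the next free
 * counter index, and the shared "next index" is advanced through the
 * pointer so the following counter gets a different value. */
void var_init_sketch(struct shader_var_sketch *var, const char *type,
		     const char *name, bool is_result,
		     int *atomic_counter_next_index)
{
	var->type = type;
	var->name = name;
	var->is_result = is_result;
	var->atomic_counter_index = -1;

	if (strcmp(type, "atomic_uint") == 0) {
		var->atomic_counter_index = *atomic_counter_next_index;
		(*atomic_counter_next_index)++;
	}
}
```

Passing the next index by pointer keeps the numbering in one place, so counters are numbered in declaration order no matter how many other variables are declared between them.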
Much like `ep_parse_param(...)`, this calls `shader_var_init_param(...)` to initialize a `shader_var` and do some error checking.
We modify the call to `shader_var_init_param(...)` so that the `is_result` flag is passed down to it, and also pass the pointer to `atomic_counter_next_index` of `gs_shader` as the other new argument.
Very similar to `ep_parse_other(...)`; it is responsible for parsing anything in the intermediate shader code that is not whitespace, `struct`, or `sampler_state`. We add special handling for variables of type `atomic_uint`.
- A local boolean `is_result` is added, and is set to `true` when a variable of type `atomic_uint` is encountered.
- `is_result` is also passed down to `sp_parse_param(...)`.
`graphics` is "an API-independent graphics subsystem wrapper". Many data types are defined. Among other things, it has function pointers that need to be assigned to functions specific to either OpenGL or Direct3D11 operation.
Here we integrate some new functionality needed to make results work.
This is actually defined by either the D3D11 or the OpenGL subsystem. It contains the data that either system needs to interact with the actual shader variable associated with the result.
However, the `graphics` code passes this data around using the [`gs_eresult_t`](#gs_eresult_t-new typedef) typedef wrapper.
See the OpenGL and D3D11 implementations of `gs_shader_result`.
This is a typedef for `gs_effect_result`, so a pointer handle to an effect result can be used by modules that don't need knowledge of the graphics internals.
Analogous to `gs_eparam_t`.
This is a typedef for `gs_shader_result`, so a pointer handle to a shader result can be used by modules that don't need knowledge of the shader internals.
Analogous to `gs_sparam_t`.
This enum for shader param data types is extended to include `GS_SHADER_PARAM_ATOMIC_UINT`.
The following new function signatures are declared so they can be defined by either the OpenGL or the D3D11 subsystem. This allows the platform-agnostic effect abstractions to interact with the actual shader machinery of either subsystem.
`gs_sresult_t *(*gs_shader_get_result_by_name)(gs_shader_t *program, const char *name);`
Fetches a `gs_sresult_t` pointer/handle by name from a shader program.
See the OpenGL and D3D11 implementations.
Analogous to `gs_shader_get_param_by_name(...)`.
`void (*gs_shader_set_atomic_uint)(gs_sparam_t *param, unsigned int val);`
Passes an unsigned integer value to the atomic counter variable in the shader represented by the `gs_sparam_t` pointer/handle.
Expands on the existing `gs_shader_set_xyz(...)` function declarations, where xyz is a data type.
See the OpenGL and D3D11 implementations.
`void (*gs_shader_get_result)(gs_sresult_t *result, struct darray *dst);`
Copies new result data from the shader into `dst`, given a `gs_sresult_t` pointer/handle.
See the OpenGL and D3D11 implementations.
Naming is analogous to the `gs_shader_set_val(...)` declaration.
This uses the array of `pass_shaderresult` mapping pairs to transfer data from the `gs_sresult_t` shader result handles (where new data is available after a technique draw has finished) into the associated `gs_effect_result`, which makes the data available to effect users. It calls `gs_shader_get_result(...)`.
Naming is analogous to `upload_parameters(...)`.
This function does some cleanup after a technique draw has finished, which made it a convenient place for us to insert a call to `download_results(...)`.
This module provides an implementation of the Graphics API for the OpenGL graphics system.
Roughly speaking, most of the changes fall into two categories: writing low-level code to implement the atomic counters with features available in OpenGL, and adding the structure and logic necessary to make the results work.
`gl-subsystem.h/.cpp` contains many GL-specific implementations of the Graphics API, including GL-specific implementations of the data structures for shader and program params.
We modify and add data structures here to support results, with some additions specific to `atomic_uint` variables.
This is the GL-specific implementation of the shader param, which has some low-level details for interacting with the params. In platform-agnostic sections of the code, pointers to `gs_shader_param` are passed around using `gs_sparam_t` pointer handles.
Our concept of a result implies being linked with a param of the same name, and using an `atomic_uint` variable as a shader param requires some special handling too. So we choose this struct to contain the low-level data necessary for interacting with the atomic counter params AND results, as well as flags that were similarly added to other structures to support results.
New members are:
- `is_result`: when `true`, indicates this param also has an associated result.
- `buffer_id`: the ID of the OpenGL Buffer Object used to interact with the atomic counter. It is obtained by calling `init_atomic_buffer(...)`.
- `layout_binding`: in the GL subsystem, the index into the indexed target `GL_ATOMIC_COUNTER_BUFFER` that represents the graphics data for our counters.
- `layout_offset`: in the GL subsystem, multiple atomic counters can share the same layout binding but have different offsets.
- 🚧 TODO: Currently, every counter is assigned a unique binding and the offset is always `0`. A good optimization would be to reuse the same binding for several counter variables with different offsets into the shared memory block of that binding.
This is our implementation of the shader result for the GL subsystem.
Members are:
- `name`: name of the result
- `param`: pointer to the `gs_shader_param`, which has fields with low-level details for interacting with atomic counter params/results
- `cur_value`: data array where retrieved result data is stored
This is the GL-specific definition representing an active shader. One of its members is `params`, an array of `gs_shader_param`.
We add `results`, an array of `gs_shader_result`.
This just has a pointer to a `gs_shader_result`, so a program result can be associated with a shader result.
Analogous to `program_param`.
This represents an OpenGL shader program. It has `gs_shader` pointers to a vertex and a pixel shader, as well as an array of `program_param`s.
We add `results`, an array of `program_result`s.
After the `shader-parser` has parsed the intermediate shader code into data structures, the `gl-shaderparser` code generates the final GLSL shader code from these data structures.
Our changes here generate code that uses the atomic counters feature of OpenGL/GLSL. The `atomic_uint` type is native to GLSL, but we need to generate the `layout(...)` qualifier block that is required for `atomic_uint` variable declarations, which we have omitted from the effect language.
Notably, the `atomicCounterIncrement(...)` (or decrement) effect statements that we aim to support are borrowed from GLSL, and require no special translation when regenerated from the intermediate shader code into GLSL.
This function writes variable declarations into GLSL, taking into account various qualifiers.
To support `atomic_uint`s in GLSL, the declaration must be preceded by a `layout(...)` qualifier block. So our declarations for atomic counters need to look like this:
`layout(binding = 1, offset = 0) uniform atomic_uint myCounter;`
We modify the function to insert the layout qualifiers for variable declarations of type `atomic_uint`. We use the `atomic_counter_index` of `shader_var` as the source for the binding index.
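Generating that declaration can be sketched in C as follows. This is a minimal illustration, not the code in `gl-shaderparser`: `write_atomic_counter_decl` is a hypothetical helper, it formats into a caller-provided buffer instead of the module's string type, and the offset is fixed at `0` to match the current behavior described in this document.

```c
#include <stdio.h>

/* Hypothetical helper: formats the GLSL declaration of one atomic counter,
 * using the counter's index as the layout binding and a fixed offset of 0.
 * Returns what snprintf returns: the length the full text needs. */
int write_atomic_counter_decl(char *dst, size_t dst_size, const char *name,
			      int atomic_counter_index)
{
	return snprintf(dst, dst_size,
			"layout(binding = %d, offset = 0) uniform atomic_uint %s;",
			atomic_counter_index, name);
}
```

For a counter named `myCounter` with index `1`, the helper writes the same declaration as the example line above.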
Dispatches lower-level functions for putting together the final GLSL code.
We change the `#version 330` preprocessor declaration to `#version 460` to support atomic counters.
🚧 TODO: An alternative to bumping the shader version all the way to 460 is activating the `GL_ARB_shader_atomic_counters` extension. Unfortunately, this seems to break effect compilation for some of the effects that don't use atomic counters. This alternative can still be explored, but the cause of the breakage needs further investigation, or a system for activating the extension only for the effects that require it may need to be introduced.
Defines many functions for activating and interacting with a GLSL shader and an OpenGL shader program.
Our changes integrate the flow of data for results, and introduce the low-level code for activation and interaction specific to atomic counters (`atomic_uint`).
Instantiates and stores a new result in the `results` array of `gs_shader`. The new instance of `gs_shader_result` gets a copy of the param/result name, so the corresponding param and result can be linked once the intermediate shader parsing has finished and the array of params is stable.
Initializes a `gs_shader_param` given a `shader_var` as input. Textures get some special treatment here. Once initialized, the new instance of `gs_shader_param` is pushed back to the `params` array of `gs_shader`.
We modify the function as follows:
- When the param is of type `GS_SHADER_PARAM_ATOMIC_UINT`, its `layout_binding` is set to the value of the `atomic_counter_index` of `shader_var`, and its `layout_offset` is `0`.
- The `is_result` field of `shader_var` propagates to the new `gs_shader_param`.
- If `is_result` is true, `gl_add_result(...)` is called to instantiate and store a `gs_shader_result` corresponding to the param's name.
Generates a new OpenGL Buffer Object of type `GL_ATOMIC_COUNTER_BUFFER`, which is used for writing to and reading from an atomic counter variable. The buffer ID is retained to be stored in `gs_shader_param`.
We use `glGenBuffers(...)`, `glBindBuffer(...)`, `glBufferData(...)`, `glBindBuffer(...)`, and `glBindBufferBase(...)` to get an atomic counter buffer initialized.
Implements the `gs_shader_set_atomic_uint(...)` function signature for the OpenGL subsystem.
Very similar to most of the other `gs_shader_set_xyz(...)` implementations (where xyz is a data type), which just copy data into the `cur_value` data array of `gs_shader_param`.
Implements the `gs_shader_get_result_by_name(...)` function signature for the OpenGL subsystem.
Analogous to `gs_shader_get_param_by_name(...)`; it just searches for the result whose `name` field matches.
Implements the `gs_shader_get_result(...)` function signature for the OpenGL subsystem.
Just copies data into the destination pointer from the `cur_value` data array of `gs_shader_result`.
This is a cleanup function for `gs_shader`; we modify it to also clean up the `results` array of `gs_shader_result`.
This function gets param values/data into the `uniform` variables of an active GLSL shader. For most supported data types this means calling a `glUniformXyz(...)` function, with some specialized work needed for textures.
The atomic counter variables are also special: we cannot use `glUniformXyz(...)`-style functions to set their values before the draw. When the param type is `GS_SHADER_PARAM_ATOMIC_UINT`, we add another specialization that calls the `glBindBuffer(...)`, `glBindBufferBase(...)`, and `glBufferSubData(...)` functions so the value can make its way to the `atomic_uint` variable of interest in the shader before the draw.
This function finds the uniform location of a given `gs_shader_param` by calling `glGetUniformLocation(...)` with the param's name, and then pushes the given shader param to the `params` array of `gs_program`.
Because the mechanisms used for initializing and using the atomic counters are different from those of regular `uniform`s, we modify the function to skip finding and assigning a uniform location whenever a param is of type `GS_SHADER_PARAM_ATOMIC_UINT`.
This function connects each constructed `gs_shader_result` with the respective `gs_shader_param` of the same name. It also constructs a new `program_result` linked with the shader result, and pushes it to the `results` array of `gs_program`.
Naming is analogous to `assign_program_shader_params(...)`.
This function calls `assign_program_shader_results(...)` for both the vertex and pixel shaders of `gs_program`.
Analogous to `assign_program_params(...)`.
After the GLSL shaders have been compiled, this function creates a new shader program, attaches the shaders to the program, and links it. After linking, it calls `assign_program_params(...)` and `assign_program_attribs(...)`, so we also add a call to `assign_program_results(...)`.
This is a cleanup function for `gs_program`; we modify it to also destroy the `results` array containing `program_result`s.
This module provides an implementation of the Graphics API for the Direct3D11 graphics system.
Roughly speaking, most of the changes fall into two categories: implementing atomic counters using Direct3D's UAV variables system, and adding the structure and logic necessary to make the results work.
`d3d11-subsystem.h/.cpp` contains many D3D11-specific implementations of the Graphics API, including D3D11-specific implementations of the data structures for shader and program params.
We modify and add data structures here to support results, with some additions specific to implementing `atomic_uint` variables using the UAV system.
This is the D3D11-specific implementation of the shader param, which has some low-level details for interacting with the params. In platform-agnostic sections of the code, pointers to `gs_shader_param` are passed around using the `gs_sparam_t` handle.
Our concept of a result implies being linked with a param of the same name, and using a UAV variable as a shader param requires some special handling too. So we choose this struct to contain the low-level data necessary for interacting with the atomic counter params AND results, as well as flags that were similarly added to other structures to support results.
New members are:
- `is_result`: when `true`, indicates this param also has an associated result.
- `atomicCounterIndex`: in the D3D11 subsystem, the index into the buffer of UAV memory of unsigned integers where the variable resides. The value is copied from the `atomic_counter_index` of `shader_var`.
Situational repurposing of an existing member:
- The `pos` member is used to store the byte address of the variable in the memory chunk used to set const/uniform variables. For UAV counter variables we reuse the same member as the byte address into the UAV memory chunks that we send to and receive from the shader. Since each counter variable is 4 bytes, anything that is a counter has its `pos` assigned to `atomicCounterIndex * 4`.
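The byte addressing described above can be sketched in C as follows. This is an illustration only: `counter_byte_pos`, `write_counter`, and `read_counter` are hypothetical helpers operating on a CPU-side copy of the UAV data, and they do not touch any Direct3D API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Byte position of a counter inside the UAV data: 4 bytes per counter. */
size_t counter_byte_pos(uint32_t atomic_counter_index)
{
	return (size_t)atomic_counter_index * sizeof(uint32_t);
}

/* Hypothetical helper: stores a counter value into a CPU-side copy of the
 * UAV data, as would be done before sending it to the shader. */
void write_counter(uint8_t *uav_data, uint32_t index, uint32_t value)
{
	memcpy(uav_data + counter_byte_pos(index), &value, sizeof(value));
}

/* Hypothetical helper: reads a counter value back from a CPU-side copy of
 * the UAV data, as would be done after the draw. */
uint32_t read_counter(const uint8_t *uav_data, uint32_t index)
{
	uint32_t value;

	memcpy(&value, uav_data + counter_byte_pos(index), sizeof(value));
	return value;
}
```

With three counters the data block is 12 bytes, and the counter with index `2` lives at byte position `8`.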
This is our implementation of the shader result for the D3D11 subsystem.
Members are:
- `name`: name of the result
- `param`: pointer to the `gs_shader_param`, which has fields with low-level details for interacting with UAV counter params/results
- `curValue`: data array where retrieved result data is stored
This is the D3D11-specific definition representing an active shader. Among other things, it has a `params` member, a vector of `gs_shader_param`, which takes part in passing the param values to the shader before the draw. So we add `results`, a vector of `gs_shader_result`, which takes part in fetching the results after the draw.
The class also holds the const data size and the descriptor used for initializing the const/uniform buffer. So we introduce the UAV data size and several descriptor variables used to initialize the UAV buffer and its use. The descriptors are initialized in `gs_shader::BuildUavBuffer(...)` (called from the vertex and pixel shader constructors) and are reused in `gs_vertex/pixel_shader::Rebuild(...)`.
- `uavBd` is a `D3D11_BUFFER_DESC` structure describing the UAV buffer. It is passed to `ID3D11Device::CreateBuffer(...)` to create the UAV buffer in graphics memory.
- `uavTxfrBd` is a `D3D11_BUFFER_DESC` structure describing the UAV transfer buffer. It is passed to `ID3D11Device::CreateBuffer(...)` to create a transfer buffer for moving data back and forth between system memory and the UAV graphics memory.
  - Analogous to `bd`, the `D3D11_BUFFER_DESC` used in initializing the constants buffer that transfers the uniforms' data to graphics memory.
- `uavViewDesc` is a `D3D11_UNORDERED_ACCESS_VIEW_DESC` structure describing the UAV view. It is passed to `ID3D11Device::CreateUnorderedAccessView(...)` to create the UAV view for sending the counter data into the shader.
Finally, the class holds a pointer to a D3D11 interface for the buffer used in setting const/uniform data. So we add pointers to the two buffers and the UAV view used in setting and receiving UAV data. These are created in `gs_shader::BuildUavBuffer(...)` (called from the vertex and pixel shader constructors) and have to be reinitialized in the event of `gs_vertex/pixel_shader::Rebuild(...)`.
- `uavBuffer` is a pointer to the `ID3D11Buffer` representing the UAV buffer in graphics memory. It is obtained by calling `ID3D11Device::CreateBuffer(...)` with `uavBd` as one of the parameters.
- `uavTxfrBuffer` is a pointer to the `ID3D11Buffer` used for sending and receiving the UAV data. It is obtained by calling `ID3D11Device::CreateBuffer(...)` with `uavTxfrBd` as one of the parameters.
  - Analogous to the `constants` buffer pointer used for setting uniforms.
- `uavView` is a pointer to an `ID3D11UnorderedAccessView` and is used for sending data to the UAV region in graphics memory. It is obtained by calling `ID3D11Device::CreateUnorderedAccessView(...)` with `uavBuffer` and `uavViewDesc` as arguments.
This function encapsulates much of a typical draw activity, including loading vertex buffer, updating blend, raster, Z-stencil states, view+proj matrix, invoking UploadParams(...)
on shader parameters to vertex and pixel shaders, and finally drawing primitives. It is called from gs_draw(...)
.
As we are introducing the concept of results that become available after the draw, we add a call to gs_shader::DownloadResults(...)
on the vertex and pixel shaders, after the primitive draw has finished.
After shader-parser
has parsed the intermediate shader code into data structures, d3d11-shaderprocessor
code will generate the final HLSL shader code from these data structures.
Our changes here are for generating code that parses the added effect syntax for atomic counters, and utilizes the UAV variables of D3D to implement atomic counters. Unfortunately, in HLSL there is no direct analogue to atomic_uint
type and no atomicCounterIncrement(...)
function, so we translate these constructs of the intermediate shader code into their HLSL counterparts:
- `RWStructuredBuffer<uint> __uavBuffer : register(u1);` is used to declare a UAV memory chunk in the shader that we will use for storing the atomic counter variables. Such a buffer will be added when UAV counters are encountered in the effect code, and the `uavBuffer` handle of `gs_shader` will be connected to this buffer when [gs_shader::BuildUavBuffer(...)](gs_shaderBuildUavBuffer-new-function) or `gs_vertex/pixel_shader::Rebuild(...)` is called.
- We cannot assign and access variables inside our `__uavBuffer` by name, but we can index into it like an array of unsigned 32-bit integers. We will use the `atomicCounterIndex` member of `gs_shader_param` as an index into the buffer (which was inherited from `atomic_counter_index` of `shader_var`).
- `InterlockedAdd(...)` will be used to replace `atomicCounterIncrement(...)` added to the effect language.
- So, an effect statement `atomicCounterIncrement(varName)` will need to be translated into `InterlockedAdd(__uavBuffer[0], 1)`, where `0` happens to be the atomic counter index for `varName`, and `1` is because "increment" is equivalent to "add 1 and assign".
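The translation described in the list above can be sketched roughly as follows. This is only an illustration, not the fork's actual code: `TranslateAtomicIncrement` and the name-to-index map are hypothetical stand-ins for the lookup that the real processor performs against the parsed shader variables.

```cpp
#include <map>
#include <string>

// Hypothetical helper (not part of the fork): given a map from effect-level
// counter names to their atomic counter indices, build the HLSL text that
// replaces atomicCounterIncrement(varName).
// Returns an empty string when varName is not a known atomic counter.
std::string TranslateAtomicIncrement(
    const std::map<std::string, unsigned> &counterIndices,
    const std::string &varName)
{
    auto it = counterIndices.find(varName);
    if (it == counterIndices.end())
        return std::string();

    // "increment" becomes "add 1" on the indexed element of the UAV buffer
    return "InterlockedAdd(__uavBuffer[" + std::to_string(it->second) +
           "], 1)";
}
```

With a counter named `matchCount` (an invented name) at index 0, the helper yields `InterlockedAdd(__uavBuffer[0], 1)`.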
This protected utility function is added for iterating through cf_token
s of ShaderParser
until a token matching a string is found. Used by ShaderProcessor::PeekAndSkipAtomicUint(...)
and ShaderProcessor::ReplaceAtomicIncrement(...)
functions.
This protected utility function is added for iterating through cf_token
s of ShaderParser
for as long as tokens match a string. Used by ShaderProcessor::PeekAndSkipAtomicUint(...)
and ShaderProcessor::ReplaceAtomicIncrement(...)
functions.
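The two token-skipping helpers described above might look roughly like the sketch below. It assumes a simplified token stream (a vector of strings with a cursor) in place of the real `cf_token` iteration, and the names `SkipUntil` and `SkipWhile` are hypothetical.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Advance pos until the token at pos equals str.
// Returns true if such a token was found; pos then points at it.
bool SkipUntil(const std::vector<std::string> &tokens, std::size_t &pos,
               const std::string &str)
{
    while (pos < tokens.size()) {
        if (tokens[pos] == str)
            return true;
        ++pos;
    }
    return false;
}

// Advance pos for as long as the token at pos equals str.
// Returns true if at least one token was skipped.
bool SkipWhile(const std::vector<std::string> &tokens, std::size_t &pos,
               const std::string &str)
{
    bool skipped = false;
    while (pos < tokens.size() && tokens[pos] == str) {
        ++pos;
        skipped = true;
    }
    return skipped;
}
```

A caller could, for example, skip forward to `atomic_uint`, step past it, and then skip the whitespace tokens that follow before reading the variable name.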
This function is added to eat up all tokens that are part of uniform atomic_uint myVar;
declarations. As mentioned before, we will be using numeric index into the __uavBuffer
of the shader code instead of counter variable names, so all tokens that are part of declarations for atomic_uint
s will be completely ignored - no output will be produced. Returns true
when one such declaration was encountered and swallowed up.
This works on translating the atomicCounterIncrement(...)
statement added to the effect language into InterlockedAdd(...)
statements of HLSL. In order to translate the variable name from the intermediate shader code into a numeric index that can be used with __uavBuffer
, the params
array of shader_parser
is scanned for shader_var
with the matching name and the atomic_counter_index
of that shader variable is used.
This one obtains tokens from the parser of the intermediate shader code, and replaces keywords of the effect language with constructs that actually exist in HLSL, so the final HLSL string is constructed. Our additions of the atomic counter syntax are no exception to these needs, and even require some additional handling. Changes are:
- `ShaderProcessor::ReplaceAtomicIncrement(...)` is called whenever the `atomicCounterIncrement` keyword is encountered, and will navigate tokens to replace the entire increment statement.
- We also add a call to `ShaderProcessor::PeekAndSkipAtomicUint(...)`, which is called after all other keywords are checked for a need of conversion. If the function returns `true`, an `atomic_uint` declaration was encountered, and all tokens of the declaration statement will be skipped in the HLSL.
- As mentioned before, we need to add `RWStructuredBuffer<uint> __uavBuffer : register(u1);` into the output to declare the UAV memory block for the counters, but only for the shaders that have atomic counter variables. So, instead of the `stringstream output` function variable there are now string streams `tempOutput` and `finalOutput`. `tempOutput` is written to as tokens are being processed, just like `output` was before. During processing of the tokens, we learn whether the UAV block will be needed. If it is needed, in `finalOutput` we insert the `RWStructuredBuffer<uint> __uavBuffer : register(u1);` statement after `static const bool obs_glsl_compile = false` but before the rest of the code, which has been constructed in `tempOutput`. Now the `finalOutput` stream has the final HLSL code to be copied into `outputString`, which is the function argument that is passed by reference. Phew.
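The two-stream arrangement can be illustrated with a minimal sketch. It is a simplification under stated assumptions: `AssembleHlsl` is a hypothetical function, the already-translated body is passed in as chunks, and the "counters were seen" decision arrives as a boolean rather than being discovered during token processing.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical sketch of the tempOutput/finalOutput idea: the body is built
// first, and the UAV declaration is placed ahead of it only when needed.
std::string AssembleHlsl(const std::vector<std::string> &bodyChunks,
                         bool usesAtomicCounters)
{
    // Body is accumulated first, the way tokens are processed one by one
    std::stringstream tempOutput;
    for (const std::string &chunk : bodyChunks)
        tempOutput << chunk;

    // Header first, then the optional UAV declaration, then the body
    std::stringstream finalOutput;
    finalOutput << "static const bool obs_glsl_compile = false;\n\n";
    if (usesAtomicCounters)
        finalOutput
            << "RWStructuredBuffer<uint> __uavBuffer : register(u1);\n\n";
    finalOutput << tempOutput.str();

    return finalOutput.str();
}
```

Deferring the header until the body is complete avoids a second pass over the tokens just to find out whether any counter exists.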
Defines much of the functionality for initializing and interacting with an active HLSL shader.
Changes here will be for adding the needed structure and logic for results, initializing all the D3D11 device and context handles needed for interacting with the UAV buffer containing atomic counter variables, and using them.
Implements gs_shader_get_result_by_name(...)
function signature for the D3D11 subsystem.
Analogous to gs_shader_get_param_by_name(...)
and just searches for the right result with name
field that matches.
Implements gs_shader_set_atomic_uint(...)
function signature for the D3D11 subsystem.
Copies data from a const void* data
pointer into curValue
vector of gs_shader_param
, resizing the vector when necessary.
Implements gs_shader_get_result(...)
function signature for the D3D11 subsystem.
Just copies data into destination pointer from the curValue
data array of gs_shader_result
.
Iterates through the vector of gs_shader_param
. For each param that is also a result, uses pos
member to determine the largest value of the mapping index, so the required size of the UAV block is known. Once this size, uavSize
, is known, and is not zero, initializes the uavBd
, uavTxfrBd
, and uavViewDesc
descriptors, and uses them to create the uavBuffer
and uavTxfrBuffer
, and uavView
of the gs_shader
, which are used for sending data to and from the UAV graphics memory.
Analogous to the behavior of gs_shader::BuildConstantBuffer(...)
, except there is some additional complexity due to UAV use and support for bi-directional data flow.
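The size computation at the start of this function could be sketched as below. This is an assumption-laden illustration: `ParamSketch` and `ComputeUavSize` are hypothetical names, `pos` is assumed to be a byte offset into the UAV block, and each counter is assumed to occupy one 32-bit unsigned integer. None of this is confirmed by the description above.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the fields of gs_shader_param that matter here.
struct ParamSketch {
    bool isAtomicCounter; // true when the param is also a result
    std::size_t pos;      // ASSUMPTION: byte offset into the UAV block
};

// Returns the number of bytes needed to hold every atomic counter,
// or 0 when the shader declares none (so no UAV buffer is required).
std::size_t ComputeUavSize(const std::vector<ParamSketch> &params)
{
    std::size_t uavSize = 0;
    for (const ParamSketch &param : params) {
        if (!param.isAtomicCounter)
            continue;
        uavSize = std::max(uavSize, param.pos + sizeof(std::uint32_t));
    }
    return uavSize;
}
```

A result of zero corresponds to the case where the descriptors and buffers are left uninitialized.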
These functions reconstruct a vertex or pixel shader, maintaining the descriptors for const/uniform buffer but reinstantiating the const buffer and making sure all params work in the new instance of the shader.
Similarly to how const data is handled, we maintain the uavBd
, uavTxfrBd
, and uavViewDesc
descriptors, but we create new instances of the uavBuffer
and uavTxfrBuffer
, and uavView
of gs_shader
, which are required for sending data to and from the UAV graphics memory.
This function adjusts constData
data array, passed by reference, in response to an input gs_shader_param
. The pos
member of each param dictates where in the const data (uniform) memory chunk the param's data will go. When new or modified param data is due to be inserted into the chunk of const data, a pass-by-reference boolean uploadConst
is set to true
, to flag that the const data will need to be uploaded to refresh the uniform variable(s).
We mirror this arrangement as we add support for sending UAV memory chunks to the shaders, triggered by the counter params+results that need it:
- We add a `uavData` data array that is passed into the function by reference. It will be adjusted in response to the input param when it is of type `GS_SHADER_PARAM_ATOMIC_UINT`. Once again, the `pos` member of each param is used as a mapping index, except this time it's indexing into the UAV data instead of the const/uniform data.
- We add `uploadUav`, a boolean passed by reference, that will need to be set to `true` whenever an input `gs_shader_param` is of type `GS_SHADER_PARAM_ATOMIC_UINT`. We always force a UAV data refresh anytime a UAV variable is encountered, even if the UAV memory chunk appears unchanged. We want UAV counters to be set to predictable values as we begin drawing with the effect.
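A minimal sketch of the UAV branch described in the list above follows. `CounterParam` and `UpdateUavParam` are hypothetical names, and `pos` is again assumed to be a byte offset with one 32-bit value per counter.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for an atomic counter param.
struct CounterParam {
    std::size_t pos;                    // ASSUMPTION: byte offset in uavData
    std::vector<std::uint8_t> curValue; // value set before the draw
};

// Writes the param's value into uavData at pos and always requests an
// upload, so counters start each draw from a predictable value.
void UpdateUavParam(const CounterParam &param,
                    std::vector<std::uint8_t> &uavData, bool &uploadUav)
{
    const std::size_t end = param.pos + sizeof(std::uint32_t);
    if (uavData.size() < end)
        uavData.resize(end, 0);

    if (param.curValue.size() >= sizeof(std::uint32_t))
        std::memcpy(uavData.data() + param.pos, param.curValue.data(),
                    sizeof(std::uint32_t));

    // Forced refresh: set even when the bytes did not change
    uploadUav = true;
}
```

Unlike the const data path, the flag here does not depend on whether the value changed, which mirrors the "always refresh" behavior described above.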
This function initializes a local variable constData
, which is a data vector, and then calls gs_shader::UpdateParam(...)
, which is data vector, and then calls gs_shader::UpdateParam(...)
on every param of gs_shader
until constData
contains all the values to be assigned to the uniforms. If new or modified param data was encountered during the updates - the uploadConst
flag gets set, and the function uses ID3D11DeviceContext::Map(...)
to map the constants
buffer, so the constructed data block in constData
is copied and ends up in this const/uniform buffer.
We mirror this structure as we introduce UAV counter results. We add uavData
data array and uploadUav
boolean, and we also process these by gs_shader::UpdateParam(...)
. If any of the params were atomic counter results, the uploadUav
flag is set and:
- We call `ID3D11DeviceContext::OMSetRenderTargetsAndUnorderedAccessViews` to activate our UAV view.
- Call `ID3D11DeviceContext::Map(...)` on the `uavTxfrBuffer` to load the transfer buffer.
- Call `ID3D11DeviceContext::CopyResource(...)` to deliver the data from `uavTxfrBuffer` to the `uavBuffer` in graphics memory.
After the draw we need to download the UAV data back:
- Add a local variable, the data vector `resultsData`, to receive the UAV memory chunk.
- As we work in the opposite direction, we call `ID3D11DeviceContext::CopyResource(...)` first to copy data from `uavBuffer` in the graphics memory to the `uavTxfrBuffer` in the system memory.
- Then use `ID3D11DeviceContext::Map(...)` to copy data from `uavTxfrBuffer` to the `resultsData` vector.
- Finally, copy data from the `resultsData` vector to each individual `gs_shader_result`, using the `pos` member of each result as a mapping index.
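The last step of the download, distributing the fetched block to individual results, might look like the sketch below. The D3D11 calls are left out on purpose; `ResultSketch` and `ScatterResults` are hypothetical names, and `pos` is assumed to be a byte offset with one 32-bit value per counter.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical stand-in for the fields of gs_shader_result used here.
struct ResultSketch {
    std::size_t pos;                    // ASSUMPTION: byte offset in the block
    std::vector<std::uint8_t> curValue; // filled after the draw
};

// Copies each counter out of the downloaded block into its own result.
// Results whose offset falls outside the block are left untouched.
void ScatterResults(const std::vector<std::uint8_t> &resultsData,
                    std::vector<ResultSketch> &results)
{
    for (ResultSketch &result : results) {
        if (result.pos + sizeof(std::uint32_t) > resultsData.size())
            continue;
        result.curValue.resize(sizeof(std::uint32_t));
        std::memcpy(result.curValue.data(), resultsData.data() + result.pos,
                    sizeof(std::uint32_t));
    }
}
```

After this step, a getter such as the one behind `gs_shader_get_result(...)` only needs to copy out of `curValue`.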
This is a constructor for a class that represents an active pixel shader, and is derived from gs_shader
base class. It instantiates and makes calls to d3d11-shaderprocessor, so the previously generated intermediate shader code can be parsed again into useful data structures for interacting with the shader. The final HLSL code is assembled and ID3D11Device::CreatePixelShader(...)
is called to create the shader and keep a handle to it inside gs_pixel_shader
.
Here, the only modification was changing shader model from ps_4_0
to ps_5_0
in order to support UAV buffer.
TODO: Should gs_vertex_shader
constructor also be changed to use vs_5_0
so atomic counters can be supported in vertex shaders? Besides consistency, are there benefits to doing so? Is there a potential useful application for vertex shaders? Are there benefits to not doing it? Should we change it anyway since we are using higher pixel shader model and it's unlikely one will be supported by the hardware/drivers and not the other?