
VirtualDub Plugin SDK 1.2

Fragment programs

The primary source of versatility in the graphics pipeline is the fragment program, which describes how textures and constant data are combined to form the output image. VDXA exposes fragment programs with the feature set of Direct3D's pixel shader 2.0 profile, which permits a high degree of programmability while avoiding the kinds of bottlenecks that hamper fast execution on the GPU.

General form

A fragment program works in terms of vectors with four components. Although each component is an arbitrary floating point number, the four components typically correspond to the red, green, blue, and alpha values of a pixel, where each value has the range 0-1. Many operations performed in a fragment program function identically on all four values, resulting in parallel vector processing. While a CPU routine might scale a pixel as follows:

r *= 0.5f;
g *= 0.5f;
b *= 0.5f;

...a fragment program simply operates on the vector:

color *= 0.5f;

It is also possible to operate on partial vectors or single values (scalars) when necessary for flexibility.
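The vector, partial-vector, and scalar forms described above can be mimicked on the CPU. The following is a minimal sketch only: `Float4`, `scaleRgb`, `sumRgb`, and `usageExample` are hypothetical names invented for illustration and are not part of VDXA or the plugin SDK.

```cpp
#include <cassert>

// Hypothetical 4-component vector standing in for a fragment program value.
struct Float4 {
    float r, g, b, a;

    // Full-vector operation: the CPU-side equivalent of `color *= 0.5f;`.
    Float4& operator*=(float s) {
        r *= s; g *= s; b *= s; a *= s;
        return *this;
    }
};

// Partial-vector operation: scale only the color channels and leave alpha alone.
inline Float4 scaleRgb(Float4 c, float s) {
    c.r *= s; c.g *= s; c.b *= s;
    return c;
}

// Scalar operation: collapse the three color channels to a single value.
inline float sumRgb(const Float4& c) {
    return c.r + c.g + c.b;
}

// Small usage example exercising all three forms.
inline void usageExample() {
    Float4 color{1.0f, 0.5f, 0.25f, 1.0f};
    color *= 0.5f;                              // all four components halved
    assert(color.a == 0.5f);
    Float4 tinted = scaleRgb(color, 2.0f);      // alpha is untouched
    assert(tinted.a == 0.5f && tinted.r == 1.0f);
    assert(sumRgb(tinted) == 1.75f);
}
```

In a real fragment program, each of these functions would typically collapse to a single instruction or expression operating on the whole register.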

A major difference between a fragment program and a CPU-based routine is that there are no globals or shared values: the program is run independently for each pixel in a drawing operation. This means that some techniques used in CPU-based processing cannot be used in VDXA-based processing. However, the functional nature of a fragment program also means that it can be run in parallel on a massive scale on a GPU, resulting in very high speeds.
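To make the execution model concrete, the sketch below emulates a drawing operation on the CPU. It is an illustration under stated assumptions, not how VDXA is implemented: `Float4`, `Constants`, `shade`, `draw`, and `usageExample` are hypothetical names. The key property is that `shade` is a pure function of its inputs, so the pixels could be processed in any order or all at once.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Float4 { float r, g, b, a; };

// Hypothetical per-draw constant data: the same value for every pixel.
struct Constants { float gain; };

// Stand-in for a fragment program. It sees only its own inputs and returns one
// output pixel; it cannot read other output pixels or keep state between calls.
inline Float4 shade(const Float4& src, const Constants& k) {
    return Float4{src.r * k.gain, src.g * k.gain, src.b * k.gain, src.a};
}

// Stand-in for a drawing operation: run the program once per pixel.
// Each iteration is independent, which is what allows a GPU to run them in parallel.
inline std::vector<Float4> draw(const std::vector<Float4>& src, const Constants& k) {
    std::vector<Float4> dst(src.size());
    for (std::size_t i = 0; i < src.size(); ++i)
        dst[i] = shade(src[i], k);
    return dst;
}

// Small usage example: double the color channels of a two-pixel image.
inline void usageExample() {
    std::vector<Float4> image{{0.25f, 0.5f, 1.0f, 1.0f}, {0.0f, 0.125f, 0.5f, 0.5f}};
    std::vector<Float4> out = draw(image, Constants{2.0f});
    assert(out.size() == image.size());
    assert(out[0].r == 0.5f && out[1].a == 0.5f);
}
```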

Inputs

Fragment programs receive several types of inputs:

Constant data
Up to 32 4-vectors of floats can be supplied to the fragment program by the CPU. These values can be changed for each drawing operation, but have the exact same value for each pixel processed.

Textures
Up to 32 pixels can be read from a set of 16 textures for each output pixel processed. The textures are read through texture sampling units, which can apply bilinear interpolation and wrapping during the texture read.

Texture coordinate interpolators
Texture coordinate interpolators compute the locations for texel fetches in source textures. They can also be used to evaluate arbitrary linear quantities, up to four floats per interpolator. Eight interpolators are available.
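As a rough picture of what a texture sampling unit does with bilinear interpolation and wrapping enabled, here is a CPU sketch. It rests on assumptions that are not taken from the text above: a single-channel texture, normalized coordinates in which texel centers sit at (i + 0.5) / size, and the hypothetical names `Texture` and `sampleBilinearWrap`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical single-channel texture stored row by row.
struct Texture {
    int width;
    int height;
    std::vector<float> texels;  // width * height values

    // Fetch one texel, wrapping out-of-range coordinates around the edges.
    float at(int x, int y) const {
        assert(width > 0 && height > 0);
        x = ((x % width) + width) % width;
        y = ((y % height) + height) % height;
        return texels[static_cast<std::size_t>(y) * width + x];
    }
};

// Bilinear sample at normalized coordinates (u, v).
// Assumption: texel centers are located at (i + 0.5) / size.
inline float sampleBilinearWrap(const Texture& t, float u, float v) {
    float x = u * t.width - 0.5f;
    float y = v * t.height - 0.5f;
    float fx = std::floor(x);
    float fy = std::floor(y);
    int x0 = static_cast<int>(fx);
    int y0 = static_cast<int>(fy);
    float ax = x - fx;  // horizontal blend weight
    float ay = y - fy;  // vertical blend weight

    // Blend the four nearest texels: first along x, then along y.
    float top = t.at(x0, y0) * (1.0f - ax) + t.at(x0 + 1, y0) * ax;
    float bottom = t.at(x0, y0 + 1) * (1.0f - ax) + t.at(x0 + 1, y0 + 1) * ax;
    return top * (1.0f - ay) + bottom * ay;
}
```

In this sketch the (u, v) pair plays the role of a value produced by a texture coordinate interpolator: it varies linearly across the drawn area, while the texture contents and any constant data stay fixed for the whole drawing operation.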

Outputs

The only output from a fragment program is a single 4-vector: the pixel that is written into the render target. Anything else that is written is a temporary that is discarded.

This means that certain algorithms require more ingenuity to implement. For instance, a histogram is not easily done in a fragment program, as it requires writing into a shared array. Using multiple passes and temporary render targets is a way to bypass this restriction.
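The multi-pass idea can be sketched with a simpler relative of the histogram problem: summing all pixels of an image. This is an illustrative CPU emulation, not VDXA code, and `reducePass` and `reduceSum` are hypothetical names. Each pass writes a half-size temporary "render target" in which every output pixel gathers two source pixels, so no pass ever needs to write to a shared location.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One pass: each output pixel reads two neighboring source pixels and adds them.
// Every output pixel is computed independently of the others.
inline std::vector<float> reducePass(const std::vector<float>& src) {
    std::vector<float> dst((src.size() + 1) / 2);
    for (std::size_t i = 0; i < dst.size(); ++i) {
        float a = src[2 * i];
        // The last output pixel of an odd-sized source has only one input.
        float b = (2 * i + 1 < src.size()) ? src[2 * i + 1] : 0.0f;
        dst[i] = a + b;
    }
    return dst;
}

// Repeat the pass through temporary targets until a single pixel remains.
inline float reduceSum(std::vector<float> target) {
    assert(!target.empty());
    while (target.size() > 1)
        target = reducePass(target);
    return target[0];
}
```

For example, `reduceSum({1, 2, 3, 4, 5})` runs three passes (`{3, 7, 5}`, `{10, 5}`, `{15}`). One possible route to a histogram bin count along these lines would be a first pass that writes 1 or 0 per pixel depending on bin membership, followed by this kind of reduction; whether that is practical depends on the render target's range and precision.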

Program form

Currently, VDXA only accepts one form of fragment program: raw Direct3D shader byte code. This was chosen for simplicity of implementation and strictness of specification: the host doesn't need to include a compiler, and a fragment program cannot stop working because the compiler changed. It does mean, however, that fragment programs have to be pre-compiled before they can be used.

Since writing fragment programs directly in byte code is undesirable, there are a few ways to get to that point:

Microsoft Direct3D pixel shader assembler (psa.exe)
This component of the DirectX SDK takes Direct3D pixel shader instructions and translates them directly into byte code. This is as direct as you can get, and it absolutely ensures you get what you want, but it's a bit of a pain for any long programs.

Microsoft Direct3D effect compiler (fxc.exe)
Also a component of the DirectX SDK, this is a better way of generating fragment programs, because you can use High Level Shader Language (HLSL) instead. Effects are not supported, so you need to compile a single function instead, using the /T switch to select the ps_2_0 profile and the /E switch to name the entry point function.

Microsoft D3DX utility library (d3dx9_*.dll)
The D3DX library is the programmatic interface to the assembler and compiler modules used by psa/fxc. This is useful if you want to build a custom tool for building your fragment programs. You can also use D3DX directly at runtime, but this is not recommended as redistribution of D3DX is a hassle.

NVIDIA Cg Compiler
This is an alternate compiler you can use if the Microsoft HLSL compiler is not usable. Cg and HLSL are similar enough that many programs will port between the two without modification.

The recommended way of including fragment programs in your plugin is to compile them as part of your build process and then embed them directly into the plugin module. One way to do this is to generate a C include file with an array that holds the byte code for the program.
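A build step that produces such an include file could look roughly like the sketch below. It covers only the formatting: reading the compiled byte code from disk and writing the result to a header are left out, and `makeCArray` and the array name passed to it are hypothetical choices, not SDK conventions.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Formats compiled shader byte code as a C array definition, 12 bytes per line.
// `name` becomes the identifier of the generated array.
inline std::string makeCArray(const std::string& name,
                              const std::vector<std::uint8_t>& bytes) {
    assert(!name.empty());
    assert(!bytes.empty());  // an empty initializer list would not be valid C

    static const char kHex[] = "0123456789abcdef";
    std::ostringstream out;
    out << "static const unsigned char " << name << "[] = {\n";
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        if (i % 12 == 0)
            out << "    ";  // indent the start of each line
        out << "0x" << kHex[bytes[i] >> 4] << kHex[bytes[i] & 15] << ',';
        // End the line after 12 bytes or after the final byte.
        out << ((i % 12 == 11 || i + 1 == bytes.size()) ? '\n' : ' ');
    }
    out << "};\n";
    return out.str();
}
```

The plugin source can then `#include` the generated file and pass the array and its `sizeof` to whatever call creates the fragment program, so no shader files need to ship alongside the plugin module.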


Copyright (C) 2007-2012 Avery Lee.
