SamplingWithComputeShader - HexagramNM/NM_WindowCaptureVirtualCamera GitHub Wiki
Using a compute shader to obtain video samples from the shared texture
When the DirectShow virtual camera sends the pixel data of a video frame, the image to display already exists in the shared texture, so in principle it is enough to hand over that texture's pixel data. The formats differ, however, so the following conversions are required.
- Remove the alpha component from the pixel data
  - Shared texture: 4 bytes (BGRA) -> video sample: 3 bytes (BGR)
- Reorder the rows so that the data of the bottom row of pixels comes first in memory
  - The shared texture and the video sample are vertically flipped relative to each other
DirectShow cannot take a DirectX texture directly, so the pixel data has to be copied to the CPU at some point. The conversions above could be done on the CPU after the copy, but with a DirectX compute shader they can be done on the GPU before the pixel data is copied to the CPU. The GPU processes the pixels in parallel, so the conversion is efficient.
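For comparison, the CPU-side alternative mentioned above can be sketched as follows. This is only a reference sketch, not code from the repository: the function name convertBgraToBgrFlipped is hypothetical, and it assumes a tightly packed BGRA source with the top row first.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// CPU reference for the two conversions: drop alpha (BGRA -> BGR) and flip the rows.
// src: width * height pixels, 4 bytes each (B, G, R, A), top row first, tightly packed (assumption).
// Returns width * height pixels, 3 bytes each (B, G, R), bottom row first.
std::vector<std::uint8_t> convertBgraToBgrFlipped(const std::uint8_t* src,
                                                  std::size_t width, std::size_t height)
{
    assert(src != nullptr);
    std::vector<std::uint8_t> dst(width * height * 3);
    for (std::size_t y = 0; y < height; ++y) {
        const std::size_t dstRow = height - 1 - y;  // vertically flipped row
        for (std::size_t x = 0; x < width; ++x) {
            const std::uint8_t* in = src + (y * width + x) * 4;
            std::uint8_t* out = dst.data() + (dstRow * width + x) * 3;
            out[0] = in[0];  // B
            out[1] = in[1];  // G
            out[2] = in[2];  // R (in[3], the alpha byte, is dropped)
        }
    }
    return dst;
}
```

A loop like this touches every pixel sequentially on the CPU, which is the work the compute shader spreads across GPU threads instead.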
This compute shader uses the DirectX device and device context. For details, see the linked page. The code below assumes that the device is stored in the variable _dxDevice and the device context in the variable _dxDeviceContext.
For details of the code (variable types and so on), see NM_WCVCam_DS/NMVCamFilter.h and NM_WCVCam_DS/NMVCamPin.cpp.
In summary, the compute shader needs the following.
- The shared texture used as input
- A shader resource view
- The compute shader
- A buffer on the GPU as the output destination
- An Unordered Access View
In addition, the CPU cannot access the data inside the GPU output buffer, so a separate CPU-accessible buffer must be created and the data copied into it.
The relationships among these are illustrated in the figure below. (It is drawn as a rough image, so it is not strictly accurate in places.)
The texture that NM_WindowCapture creates, which contains the captured window image, is used as input, so it has to be obtained first. It can be obtained with OpenSharedResourceByName on the DirectX device. See the linked page for details.
Because the shared texture can be accessed from other processes, a mutex must be acquired beforehand for exclusive access when obtaining a video sample. See the linked page for details.
The shader resource view associates the input shared texture with the compute shader.
Creating the shader resource view and setting it on the device context are both done every time the compute shader is executed. When the shader resource view was not set on the device context on every execution, updates to the texture were not reflected and the video froze. (Creating the shader resource view once, when setting up the DirectX device, appears to be sufficient.)
- Creating the shader resource view
CD3D11_SHADER_RESOURCE_VIEW_DESC shaderResourceViewDesc(D3D11_SRV_DIMENSION_TEXTURE2D, DXGI_FORMAT_B8G8R8A8_UNORM);
// _sharedCaptureWindowTexture: shared texture
// _formatterSRV: shader resource view
_dxDevice->CreateShaderResourceView(_sharedCaptureWindowTexture.get(),
&shaderResourceViewDesc, _formatterSRV.put());
- Setting it on the device context
ID3D11ShaderResourceView* tempShaderResourceViewPtr[] = { _formatterSRV.get() };
_dxDeviceContext->CSSetShaderResources(0, 1, tempShaderResourceViewPtr);
The compute shader is the shader code that takes the texture as input and performs the computation in parallel. It is compiled in the same way as an ordinary vertex or pixel shader. Compiling it and setting it on the device context need to be done once, when setting up the DirectX device.
// hlslFormatterCode: string containing the compute shader code
size_t hlslSize = std::strlen(hlslFormatterCode);
std::string csThreadsStr = std::to_string(CS_THREADS_NUM);
std::string windowWidthStr = std::to_string(VCAM_VIDEO_WIDTH);
std::string windowHeightStr = std::to_string(VCAM_VIDEO_HEIGHT);
com_ptr<ID3DBlob> compiledCS;
// Macros in the shader code can be replaced by defining them as follows.
D3D_SHADER_MACRO csMacro[] = {
"CS_THREADS_NUM_IN_CS", csThreadsStr.c_str(),
"VCAM_VIDEO_WIDTH_IN_CS", windowWidthStr.c_str(),
"VCAM_VIDEO_HEIGHT_IN_CS", windowHeightStr.c_str(),
NULL, NULL
};
// Compile and create the compute shader (same as for vertex and pixel shaders)
D3DCompile(hlslFormatterCode, hlslSize, nullptr, csMacro, nullptr,
"formatterMain", "cs_5_0", 0, 0, compiledCS.put(), nullptr);
_dxDevice->CreateComputeShader(compiledCS->GetBufferPointer(),
compiledCS->GetBufferSize(), nullptr, _formatterCS.put());
// Set it on the device context
_dxDeviceContext->CSSetShader(_formatterCS.get(), 0, 0);
The shader code string is embedded directly with an include directive, following the article below. Also, SampleFormatter.hlsl must not be compiled at build time, so in Visual Studio, open the file's properties and change "Item Type" to "Does not participate in build".
Embedding shader source as a string in C++ source (C++ソース内にシェーダソースを文字列として埋め込む)
#define HLSL_EXTERNAL_INCLUDE(...) #__VA_ARGS__
const char* hlslFormatterCode =
#include "SampleFormatter.hlsl"
;
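To see what the macro does without the include, here is a small standalone sketch. The shader text and the names demoShaderCode and demoShaderString are hypothetical stand-ins, not the contents of SampleFormatter.hlsl. The preprocessor stringizes the whole argument list, collapsing each run of whitespace (including newlines) into a single space.

```cpp
#include <cassert>
#include <string>

// Same stringizing macro as above: turns its entire argument list into one string literal.
#define HLSL_EXTERNAL_INCLUDE(...) #__VA_ARGS__

// Hypothetical stand-in for the file contents, written inline instead of via #include.
const char* const demoShaderCode =
HLSL_EXTERNAL_INCLUDE(
RWByteAddressBuffer outputBuffer : register(u0);
[numthreads(1, 1, 1)]
void demoMain(uint3 id : SV_DispatchThreadID)
{
    outputBuffer.Store(0, id.x);
}
)
;

// Returns the embedded shader source as a std::string for inspection.
std::string demoShaderString()
{
    assert(demoShaderCode != nullptr);
    return std::string(demoShaderCode);
}
```

Because the text passes through the C++ preprocessor, the embedded shader should avoid preprocessor directives of its own and unbalanced parentheses. That is presumably why the constants here are injected through D3D_SHADER_MACRO rather than written as #define lines in the shader.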
The GPU buffer stores the result of the compute shader. The following flags in particular need attention when configuring it.
- BindFlags: set to D3D11_BIND_UNORDERED_ACCESS
  - This associates the buffer with the Unordered Access View described later, so that the compute shader can access it.
- MiscFlags: set to D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS
  - This lets the compute shader access the buffer as a RWByteAddressBuffer and write data at byte-addressed offsets.
The GPU buffer needs to be created once, when setting up the DirectX device.
UINT bufferByteSize = VCAM_VIDEO_WIDTH * VCAM_VIDEO_HEIGHT * PIXEL_BYTE;
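D3D11_BUFFER_DESC bufferDesc = {}; // zero-initialize so that unset fields such as StructureByteStride are not left indeterminate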
bufferDesc.ByteWidth = bufferByteSize;
bufferDesc.Usage = D3D11_USAGE_DEFAULT;
bufferDesc.BindFlags = D3D11_BIND_UNORDERED_ACCESS;
bufferDesc.CPUAccessFlags = 0;
bufferDesc.MiscFlags = D3D11_RESOURCE_MISC_BUFFER_ALLOW_RAW_VIEWS;
// _gpuFormatterBuffer: output buffer on the GPU
_dxDevice->CreateBuffer(&bufferDesc, nullptr, _gpuFormatterBuffer.put());
While the compute shader runs, multiple GPU threads read from and write to the GPU buffer. The Unordered Access View is what lets those threads access the GPU buffer concurrently (see the documentation). "Unordered access" here appears to mean that multiple threads may read and write in no particular order.
Creating the Unordered Access View and setting it on the device context need to be done once, when setting up the DirectX device.
// Create the Unordered Access View
D3D11_UNORDERED_ACCESS_VIEW_DESC uavDesc = {}; // zero-initialize before filling in the fields
UINT bufferByteSize = VCAM_VIDEO_WIDTH * VCAM_VIDEO_HEIGHT * PIXEL_BYTE;
uavDesc.ViewDimension = D3D11_UAV_DIMENSION_BUFFER;
uavDesc.Format = DXGI_FORMAT_R32_TYPELESS;
uavDesc.Buffer.FirstElement = 0;
uavDesc.Buffer.NumElements = bufferByteSize / 4;
uavDesc.Buffer.Flags = D3D11_BUFFER_UAV_FLAG_RAW;
_dxDevice->CreateUnorderedAccessView(_gpuFormatterBuffer.get(), &uavDesc, _formatterUAV.put());
// Set the Unordered Access View on the device context
ID3D11UnorderedAccessView* uavs[] = { _formatterUAV.get() };
UINT initialCounts[] = { 0 };
_dxDeviceContext->CSSetUnorderedAccessViews(0, 1, uavs, initialCounts);
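The size arithmetic behind ByteWidth and NumElements can be checked with a small sketch. The struct and function names are hypothetical. The usage values (1920 × 1080 video and 3 bytes per pixel for PIXEL_BYTE) are assumptions inferred from the BGR format and the numbers used elsewhere on this page.

```cpp
#include <cassert>
#include <cstdint>

// Sizes needed for the raw output buffer and its Unordered Access View.
struct RawBufferLayout {
    std::uint32_t byteSize;     // value for D3D11_BUFFER_DESC::ByteWidth
    std::uint32_t numElements;  // value for uavDesc.Buffer.NumElements (32-bit elements)
};

// A raw view (DXGI_FORMAT_R32_TYPELESS) counts the buffer in 4-byte elements,
// so the byte size has to be a multiple of 4.
RawBufferLayout computeRawBufferLayout(std::uint32_t width, std::uint32_t height,
                                       std::uint32_t bytesPerPixel)
{
    const std::uint32_t byteSize = width * height * bytesPerPixel;
    assert(byteSize % 4 == 0);
    return RawBufferLayout{ byteSize, byteSize / 4 };
}
```

With the assumed 1920 × 1080 frame at 3 bytes per pixel, this gives 6,220,800 bytes and 1,555,200 elements.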
The GPU buffer cannot be accessed from the CPU. Conversely, a CPU-accessible buffer cannot be set as a shader input or output. For that reason, both a GPU buffer and a CPU-accessible buffer are created, and after the compute shader has run, the data in the GPU buffer is copied to the CPU-accessible buffer.
For the CPU-accessible buffer, CPUAccessFlags must be set to D3D11_CPU_ACCESS_READ. The CPU-accessible buffer needs to be created once, when setting up the DirectX device.
UINT bufferByteSize = VCAM_VIDEO_WIDTH * VCAM_VIDEO_HEIGHT * PIXEL_BYTE;
D3D11_BUFFER_DESC bufferDesc = {}; // zero-initialize so that unset fields are not left indeterminate
bufferDesc.ByteWidth = bufferByteSize;
bufferDesc.Usage = D3D11_USAGE_STAGING;
bufferDesc.BindFlags = 0;
bufferDesc.CPUAccessFlags = D3D11_CPU_ACCESS_READ;
bufferDesc.MiscFlags = 0;
// _cpuSampleBuffer: CPU-accessible buffer
_dxDevice->CreateBuffer(&bufferDesc, nullptr, _cpuSampleBuffer.put());
The compute shader is executed by calling the Dispatch method of the device context. As described in more detail later, the number of threads used for processing is specified inside the shader code. The arguments of Dispatch specify how many of those thread groups to execute. Dispatch(gx, gy, gz) executes gx × gy × gz thread groups. (Each thread group in turn has the number of threads specified in the shader code.)
_dxDeviceContext->Dispatch(VCAM_VIDEO_WIDTH / (CS_THREADS_NUM * 4), VCAM_VIDEO_HEIGHT / CS_THREADS_NUM, 1);
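The arguments passed to Dispatch above can be derived as in the following sketch. The helper name and struct are hypothetical. The usage values (1920 × 1080 video and CS_THREADS_NUM = 20) are assumptions that match the example table at the end of this page. Each thread handles 4 horizontal pixels, so the x direction needs a quarter as many threads as there are pixels.

```cpp
#include <cassert>
#include <cstdint>

// Thread group counts to pass to Dispatch(x, y, z).
struct GroupCounts {
    std::uint32_t x;
    std::uint32_t y;
    std::uint32_t z;
};

// threadsPerAxis corresponds to CS_THREADS_NUM; pixelsPerThreadX is 4 in this shader.
// The plain divisions only cover the whole frame when they leave no remainder,
// so that condition is checked explicitly.
GroupCounts computeGroupCounts(std::uint32_t width, std::uint32_t height,
                               std::uint32_t threadsPerAxis, std::uint32_t pixelsPerThreadX)
{
    assert(threadsPerAxis > 0 && pixelsPerThreadX > 0);
    assert(width % (threadsPerAxis * pixelsPerThreadX) == 0);
    assert(height % threadsPerAxis == 0);
    return GroupCounts{ width / (threadsPerAxis * pixelsPerThreadX), height / threadsPerAxis, 1 };
}
```

For other resolutions or thread counts the divisions may leave a remainder. In that case some pixels would not be processed unless the group count is rounded up and the shader is given a bounds check.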
The CopyResource method of the device context copies data between buffers.
// _cpuSampleBuffer: CPU-accessible buffer
// _gpuFormatterBuffer: buffer on the GPU
_dxDeviceContext->CopyResource(_cpuSampleBuffer.get(), _gpuFormatterBuffer.get());
The contents of the CPU-accessible buffer can then be accessed through the DXGI_MAPPED_RECT structure by calling the Map method of IDXGISurface. Once access is finished, be sure to call the Unmap method of IDXGISurface.
com_ptr<IDXGISurface> dxgiSurface;
_cpuSampleBuffer->QueryInterface(IID_PPV_ARGS(dxgiSurface.put()));
DXGI_MAPPED_RECT mapFromCpuSampleBuffer;
dxgiSurface->Map(&mapFromCpuSampleBuffer, DXGI_MAP_READ);
// Copy the pixel data in the CPU-accessible buffer to the memory that is sent to the DirectShow virtual camera
// sampleData: pixel data for one video frame sent to the DirectShow virtual camera (LPBYTE)
CopyMemory((PVOID)sampleData, (PVOID)mapFromCpuSampleBuffer.pBits,
VCAM_VIDEO_WIDTH * VCAM_VIDEO_HEIGHT * PIXEL_BYTE);
dxgiSurface->Unmap();
The shader code below converts from the BGRA format of the texture to the BGR format of the video sample. offscreenTexture gives access to the shared texture supplied as input, and outputBuffer gives access to the GPU buffer that is the output destination.
In the final index calculation, the part corresponding to the y coordinate is (VCAM_VIDEO_HEIGHT_IN_CS - dispatchThreadId.y - 1) instead of dispatchThreadId.y, which flips the image vertically to match the video sample.
HLSL_EXTERNAL_INCLUDE(
Texture2D<float4> offscreenTexture : register(t0);
RWByteAddressBuffer outputBuffer: register(u0);
[numthreads(CS_THREADS_NUM_IN_CS, CS_THREADS_NUM_IN_CS, 1)]
void formatterMain(uint3 dispatchThreadId: SV_DispatchThreadID)
{
float4 pixel0 = offscreenTexture.Load(int3(4 * dispatchThreadId.x, dispatchThreadId.y, 0));
float4 pixel1 = offscreenTexture.Load(int3(4 * dispatchThreadId.x + 1, dispatchThreadId.y, 0));
float4 pixel2 = offscreenTexture.Load(int3(4 * dispatchThreadId.x + 2, dispatchThreadId.y, 0));
float4 pixel3 = offscreenTexture.Load(int3(4 * dispatchThreadId.x + 3, dispatchThreadId.y, 0));
uint3 bgr24_3;
bgr24_3.x = (uint(pixel0.b * 255.0) & 0xFF) | ((uint(pixel0.g * 255.0) & 0xFF) << 8)
| ((uint(pixel0.r * 255.0) & 0xFF) << 16) | ((uint(pixel1.b * 255.0) & 0xFF) << 24);
bgr24_3.y = (uint(pixel1.g * 255.0) & 0xFF) | ((uint(pixel1.r * 255.0) & 0xFF) << 8)
| ((uint(pixel2.b * 255.0) & 0xFF) << 16) | ((uint(pixel2.g * 255.0) & 0xFF) << 24);
bgr24_3.z = (uint(pixel2.r * 255.0) & 0xFF) | ((uint(pixel3.b * 255.0) & 0xFF) << 8)
| ((uint(pixel3.g * 255.0) & 0xFF) << 16) | ((uint(pixel3.r * 255.0) & 0xFF) << 24);
uint index = ((VCAM_VIDEO_HEIGHT_IN_CS - dispatchThreadId.y - 1) * VCAM_VIDEO_WIDTH_IN_CS
+ 4 * dispatchThreadId.x) * 3;
outputBuffer.Store3(index, bgr24_3);
}
)
One point to note is that, because of memory alignment, RWByteAddressBuffer can only be accessed in 4-byte units. For that reason, each invocation processes 4 horizontally adjacent pixels together, so that 4 pixels × 3 bytes = 12 bytes are written to the GPU buffer in a single operation. The width and height of video in pixels are generally multiples of 4, so there is no remainder to handle.
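The packing of 4 BGR pixels into three 32-bit words, and the byte offset used for Store3, can be mirrored on the CPU as a sanity check. This is a sketch with hypothetical names (Bgr, packFourBgrPixels, outputByteOffset). It takes 8-bit channel values directly rather than the normalized floats the shader reads.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

struct Bgr {
    std::uint8_t b;
    std::uint8_t g;
    std::uint8_t r;
};

// Mirrors bgr24_3 in the shader: 4 pixels x 3 bytes packed into three 32-bit words.
std::array<std::uint32_t, 3> packFourBgrPixels(const Bgr& p0, const Bgr& p1,
                                               const Bgr& p2, const Bgr& p3)
{
    // Widen to 32 bits before shifting so that << 24 cannot overflow a signed int.
    auto u = [](std::uint8_t v) { return static_cast<std::uint32_t>(v); };
    const std::uint32_t w0 = u(p0.b) | (u(p0.g) << 8) | (u(p0.r) << 16) | (u(p1.b) << 24);
    const std::uint32_t w1 = u(p1.g) | (u(p1.r) << 8) | (u(p2.b) << 16) | (u(p2.g) << 24);
    const std::uint32_t w2 = u(p2.r) | (u(p3.b) << 8) | (u(p3.g) << 16) | (u(p3.r) << 24);
    return {{ w0, w1, w2 }};
}

// Mirrors the index calculation: threadX and threadY play the role of dispatchThreadId.x and .y.
std::uint32_t outputByteOffset(std::uint32_t threadX, std::uint32_t threadY,
                               std::uint32_t width, std::uint32_t height)
{
    const std::uint32_t offset = ((height - threadY - 1) * width + 4 * threadX) * 3;
    assert(offset % 4 == 0);  // holds whenever width is a multiple of 4
    return offset;
}
```

Stored in little-endian memory, the three words read back as the bytes B0 G0 R0 B1 G1 R1 B2 G2 R2 B3 G3 R3, which is the 24-bit BGR layout of the video sample. When the width is a multiple of 4, every offset is a multiple of 12, so the 4-byte alignment requirement is always met.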
In a compute shader, [numthreads(tx, ty, tz)] specifies the number of threads per thread group: tx × ty × tz threads are created in each thread group. As the documentation notes, however, there is an upper limit on the number of threads in a thread group; for compute shader version cs_5_0 the limit is 1024.
dispatchThreadId is the ID of each thread and consists of three integer values corresponding to x, y, and z. Put simply, x, y, and z take integer values in the ranges below, determined by the arguments of numthreads and of the device context's Dispatch, and the compute shader runs once for every combination.
| | [numthreads(tx, ty, tz)], Dispatch(gx, gy, gz) | [numthreads(20, 20, 1)], Dispatch(24, 54, 1) |
|---|---|---|
| x component | 0 ~ (tx × gx - 1) | 0 ~ 479 |
| y component | 0 ~ (ty × gy - 1) | 0 ~ 1079 |
| z component | 0 ~ (tz × gz - 1) | 0 ~ 0 |
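The ranges in the table can be reproduced with a short calculation. The names below are hypothetical. The sketch also checks the cs_5_0 limit of 1024 threads per group mentioned above.

```cpp
#include <cassert>
#include <cstdint>

// Largest dispatchThreadId value per axis and the total number of shader invocations.
struct ThreadIdRange {
    std::uint32_t maxX;
    std::uint32_t maxY;
    std::uint32_t maxZ;
    std::uint64_t invocationCount;
};

// tx, ty, tz: arguments of numthreads. gx, gy, gz: arguments of Dispatch.
ThreadIdRange computeThreadIdRange(std::uint32_t tx, std::uint32_t ty, std::uint32_t tz,
                                   std::uint32_t gx, std::uint32_t gy, std::uint32_t gz)
{
    assert(tx > 0 && ty > 0 && tz > 0 && gx > 0 && gy > 0 && gz > 0);
    assert(tx * ty * tz <= 1024);  // cs_5_0 limit on threads per thread group
    const std::uint64_t total =
        static_cast<std::uint64_t>(tx) * gx * ty * gy * tz * gz;
    return ThreadIdRange{ tx * gx - 1, ty * gy - 1, tz * gz - 1, total };
}
```

For [numthreads(20, 20, 1)] and Dispatch(24, 54, 1) this yields x up to 479, y up to 1079, z fixed at 0, and 518,400 invocations in total. Each invocation handles 4 pixels, so that covers the 2,073,600 pixels of a 1920 × 1080 frame.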