VirtualDub Plugin SDK 1.2
Advice and gotchas
Here are some things to watch out for when writing your video filter:
The fa->dst.offset field may inadvertently introduce a bug in your filter if you are writing it as an in-place filter. By default, this field is initialized to zero, placing the destination bitmap at the start of the buffer. This is appropriate for two-buffer filters, but for an in-place filter it results in the destination bitmap being misplaced relative to the source bitmap. In most cases this isn't a problem, except when cropping is enabled. When the source bitmap is cropped, its offset is nonzero, and the mismatch can cause the cropping to malfunction.
To avoid this problem, simply copy over the offset from the source bitmap to the output bitmap:
fa->dst.offset = fa->src.offset;
When a scanline is 319 pixels (1276 bytes) in length, you must read or write no more than exactly 1276 bytes. You may not round this value up, write 1280 bytes, and hope there is additional unused memory at the end of the scanline. Doing so may cause a number of issues including corrupting adjacent buffers, destabilizing the application, or causing a crash. The same goes for alignment — you cannot assume that a scanline is aligned to 16 bytes just because it would be convenient. This may require some inconvenient fixup code for the odd bytes along the side.
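The fixup code mentioned above usually takes the form of a main loop over whole 16-byte groups followed by a short tail loop for the leftover bytes. The following is a minimal sketch of that shape, not code from the SDK: the function name add_saturate_row and the operation it performs are made up for illustration, and the inner loop over each group stands in for whatever vectorized body a real filter would use.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of the SDK): adds `delta` to every byte of a
// scanline with saturation at 255, touching exactly `bytes` bytes.
void add_saturate_row(std::uint8_t* row, std::size_t bytes, std::uint8_t delta) {
    const std::size_t block = 16;
    std::size_t i = 0;

    // Main loop: only whole 16-byte groups that lie entirely inside the
    // scanline. A real filter would put its SIMD body here.
    for (; i + block <= bytes; i += block) {
        for (std::size_t j = 0; j < block; ++j) {
            unsigned v = static_cast<unsigned>(row[i + j]) + delta;
            row[i + j] = static_cast<std::uint8_t>(v > 255u ? 255u : v);
        }
    }

    // Fixup loop: the remaining 0..15 bytes, one at a time, so nothing past
    // row[bytes - 1] is ever read or written.
    for (; i < bytes; ++i) {
        unsigned v = static_cast<unsigned>(row[i]) + delta;
        row[i] = static_cast<std::uint8_t>(v > 255u ? 255u : v);
    }
}
```

For the 319-pixel example, the main loop covers 79 groups (1264 bytes) and the tail loop handles the last 12 bytes.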
There are a couple of exceptions. The first is that if you are running on a host that supports API V14 or higher, you can request aligned scanlines to simplify the situation (see CPU dependent optimization). The second is that you can bend the rules slightly if you are doing an advanced trick for optimizing unaligned accesses.
The trick is that you can safely read an unaligned word by reading the two aligned words that contain it. In particular, you may read all bytes in any 16 byte region aligned to a 16 byte boundary, or write all of those bytes as long as you do not actually change the values of bytes not belonging to the scanline. In other words, if a scanline spans 0136F004 to 0136F503, you can issue an aligned 16-byte read at 0136F000 without risking an access violation. Optimized algorithms that require aligned scanlines can often be used on unaligned scanlines simply by applying appropriate masks on the edges.
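The edge-masking idea can be sketched in portable C++ as follows. This is an illustration under assumptions, not SDK code: the function name invert_row_aligned_blocks is hypothetical, memcpy of a 16-byte block stands in for an aligned SIMD load and store, and the per-byte range check stands in for a precomputed mask. Bytes outside the scanline are read and written back with their original values, which is the condition the text above places on such writes.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: inverts every byte of an arbitrarily aligned scanline
// by working on whole 16-byte-aligned blocks and masking the edges.
void invert_row_aligned_blocks(std::uint8_t* row, std::size_t len) {
    const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(row);
    const std::uintptr_t end = start + len;

    // Round the start address down to the enclosing 16-byte boundary.
    for (std::uintptr_t block = start & ~static_cast<std::uintptr_t>(15);
         block < end; block += 16) {
        std::uint8_t* p = reinterpret_cast<std::uint8_t*>(block);
        std::uint8_t tmp[16];

        std::memcpy(tmp, p, 16);  // stands in for an aligned 16-byte load

        for (int i = 0; i < 16; ++i) {
            const std::uintptr_t addr = block + static_cast<std::uintptr_t>(i);
            // Mask: only bytes that belong to the scanline are modified.
            if (addr >= start && addr < end)
                tmp[i] = static_cast<std::uint8_t>(~tmp[i]);
        }

        std::memcpy(p, tmp, 16);  // stands in for an aligned 16-byte store
    }
}
```

Note that the write-back of untouched bytes is still a write, so this pattern is only appropriate where no other thread may be modifying the neighboring bytes at the same time.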
V14+ only: If FILTERPARAM_ALIGN_SCANLINES is set in paramProc(), you can safely write an integral number of 16 byte xmmwords even if this runs beyond the end of the scanline. For instance, if you are writing a 319 pixel wide, 32-bit scanline, you would normally only be able to write 319 * 4 = 1276 bytes, but instead you can write 80 * 16 = 1280 bytes. The contents of the bytes beyond the end of the scanline are ignored.
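The arithmetic in that example is a round-up to the next multiple of 16. As a small sketch (the helper name padded_row_bytes is an assumption for illustration and is not an SDK function):

```cpp
#include <cstddef>

// Hypothetical helper: number of bytes covered when a scanline is written
// as whole 16-byte xmmwords. Only meaningful when aligned scanlines have
// been requested as described above.
constexpr std::size_t padded_row_bytes(std::size_t width_pixels,
                                       std::size_t bytes_per_pixel) {
    return (width_pixels * bytes_per_pixel + 15) / 16 * 16;
}

// 319 pixels * 4 bytes = 1276 bytes, which rounds up to 80 * 16 = 1280.
static_assert(padded_row_bytes(319, 4) == 1280, "319-pixel row pads to 1280");
```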
When handling the pitch for a bitmap/pixmap, you should always treat it as signed and use the standard C/C++ ptrdiff_t type. In particular, pixmaps can and often do have negative pitches, and using an unsigned type will cause crashes. In most cases you can get away with using a regular signed int, but it's simpler and actually faster just to use the correct type instead.
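A minimal sketch of row addressing with a signed pitch follows. The RowView struct and fill_rows_with_index function are hypothetical stand-ins written for this example; they are not the SDK's bitmap or pixmap structures.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a 32-bit bitmap description.
struct RowView {
    std::uint8_t* data;    // address of scanline 0
    std::ptrdiff_t pitch;  // bytes from one scanline to the next; may be negative
    int width;             // in pixels
    int height;            // in scanlines
};

// Fills every byte of scanline y with the value y, to show the addressing.
void fill_rows_with_index(const RowView& v) {
    for (int y = 0; y < v.height; ++y) {
        // Signed arithmetic: with a negative pitch, later rows sit at lower
        // addresses. An unsigned pitch would wrap around here instead.
        std::uint8_t* row = v.data + static_cast<std::ptrdiff_t>(y) * v.pitch;
        for (int x = 0; x < v.width * 4; ++x)
            row[x] = static_cast<std::uint8_t>(y);
    }
}
```

With a bottom-up image, data points at the last scanline in memory and pitch is the negated row size, so the same loop works unchanged for both orientations.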
The upper byte of each 32-bit pixel, the alpha channel, is unused. Its value is completely arbitrary on entry to the video filter and ignored on output. If you are porting image processing code from other sources, make sure it does not rely on the alpha byte to be set in any particular manner.
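One way to keep ported code from depending on the alpha byte is to mask it off wherever pixels are compared or hashed. A small sketch, with hypothetical helper names that are not part of the SDK:

```cpp
#include <cstdint>

// Low 24 bits carry the color; the upper byte is the unused alpha channel.
constexpr std::uint32_t kColorMask = 0x00FFFFFFu;

// Hypothetical helper: returns the pixel with the alpha byte cleared.
constexpr std::uint32_t clear_alpha(std::uint32_t px) {
    return px & kColorMask;
}

// Hypothetical helper: compares two pixels while ignoring whatever happens
// to be in the alpha byte.
constexpr bool same_color(std::uint32_t a, std::uint32_t b) {
    return ((a ^ b) & kColorMask) == 0;
}
```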
Make sure your filter doesn't depend on any runtime libraries that you don't ship with the filter binary. With Microsoft Visual C++, make sure you either statically link your filter to the C runtime library (CRT) or distribute the CRT DLL. This is particularly important for versions of VC++ beyond 6.0, which no longer use MSVCRT.DLL when the DLL version of the CRT is enabled. In VC++, the CRT linking mode is controlled in the Code Generation page of the compiler options in the project's settings. Statically linking filter DLLs to the CRT is the safest and most convenient model for distribution, and is recommended unless you are experienced with CRT distribution issues.
VirtualDub specific: Older versions of VirtualDub could run up against the operating system's limit for thread local storage (TLS) slots, of which one was required for each instance of the CRT loaded. One slot is consumed for each filter loaded when filters are statically linked to the CRT, and other DLLs that needed to load, such as video and audio codecs, also consumed TLS slots. Because there are only 64 TLS slots available per process in Windows 95/NT4 and 80 in Windows 98, it was possible to have enough filters and codecs installed that DLLs would fail to load. This is mostly a non-issue now because the TLS slot limit was raised to 1088 in Windows 2000 and VirtualDub started dynamically loading and unloading filter DLLs starting with 1.5.0.
You aren't guaranteed that the thread that startProc or runProc is called on is a UI thread. COM may not be initialized (particularly important for the shell APIs), and the thread may not even have a message pump. In fact, you aren't even guaranteed that two consecutive calls to runProc occur on the same thread, only that two threads won't run that function at the same time. Therefore, one thing that you should never do is attempt to create a window or dialog from those functions. Attempting to do so will likely destabilize the host process. UI should be opened only in configProc.
If you really must create UI in normally non-interactive entry points, the only safe way to do so is to launch a separate thread and perform all UI operations there. When doing so, you must make sure that this UI never attempts to block on a host thread, such as by calling SendMessage() on a host window, as that may cause a deadlock. If this UI can persist longer than filter instances, you must also hold a reference on the filter DLL so that it isn't unloaded by the OS when the host attempts to unload it.
The precision mode bits in the x87 floating point control word (FPUCW) belong to the application. So do the exception mask bits. And the SSE flush-to-zero (FTZ) and denormals-are-zero (DAZ) bits. Do you see a pattern here? Leave 'em alone and don't attempt to flip them in your filter unless you change them back before returning to the host. In fact, changing some of these bits and calling external code is also a violation of the Win32 calling interface. Don't do it, or at least, do it on a thread that you control.
The most common way that this rule is violated is via the compiler runtime. Microsoft Visual C++ shouldn't cause a problem here, even with /arch:SSE or /arch:SSE2, but older Borland C/C++ and Delphi runtimes attempted to change FPU exception and precision settings on initialization, and some versions of the Intel C/C++ runtime may also attempt to do so depending on compile settings. Make sure these mis-features are disabled in your filter's compilation settings.
The other way to violate this rule by accident is to initialize Direct3D, which by default changes the x87 FPU precision to 32-bit floating point for the current thread. You can avoid this by initializing Direct3D in a worker thread, which you will probably need to do anyway because of the threading issues noted above, even with D3DCREATE_MULTITHREADED. If you are initializing and shutting down Direct3D within a single call, another way to avoid this problem is to set D3DCREATE_FPU_PRESERVE. Typically the vertex processing load is so low in a video filter that this shouldn't introduce any noticeable performance penalty.
VirtualDub specific: VirtualDub aggressively checks for and corrects any detected violations in the above rules. In many cases, if your filter leaves FPU settings in an incorrect state, it will force them back to the correct values. For certain egregious violations, most notably leaving MMX active, it will also display a warning to the user that the filter is broken.
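If a filter does need to change floating-point settings temporarily, the "change them back before returning" rule is easiest to honor with a scope guard. The sketch below uses only the standard C++ <cfenv> facilities; FpEnvGuard and sum_rounded_up are hypothetical names for this example. Standard C++ guarantees that the saved environment covers the rounding mode and exception status flags. Whether it also covers x87 precision control or the SSE FTZ/DAZ bits depends on the implementation, so treat that as an assumption to verify for your compiler, or use its dedicated control-word functions instead.

```cpp
#include <cfenv>

// Hypothetical RAII guard: captures the floating-point environment on entry
// and restores it when the scope exits, including on early returns.
class FpEnvGuard {
public:
    FpEnvGuard() { std::fegetenv(&saved_); }
    ~FpEnvGuard() { std::fesetenv(&saved_); }

    FpEnvGuard(const FpEnvGuard&) = delete;
    FpEnvGuard& operator=(const FpEnvGuard&) = delete;

private:
    std::fenv_t saved_;
};

// Hypothetical filter routine that wants a non-default rounding mode.
float sum_rounded_up(const float* values, int count) {
    FpEnvGuard guard;            // saves the caller's settings
    std::fesetround(FE_UPWARD);  // temporary change, local to this scope

    float sum = 0.0f;
    for (int i = 0; i < count; ++i)
        sum += values[i];
    return sum;                  // guard restores the saved environment here
}
```

Restoring the environment also restores the exception status flags as they were on entry, so any flags raised inside the guarded scope are discarded.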
Source frames beyond the first only have a subset of valid fields. None of the base bitmap fields are valid on secondary frames; only the pixmap and pixmap layout can be used to access the image data. The frame number, frame timestamps, and cookie are valid.
Copyright (C) 2007-2012 Avery Lee.