Schema Self Printing - softworkz/ffmpeg_output_apis GitHub Wiki

Motivation

When data is output in a structured, machine-readable format, the intention is typically to process it further with other applications, tools or data systems. Developing any of those requires precise knowledge of the data formats and structures involved. Formats intended for data exchange, such as XML, JSON or YAML, allow the data content to be described via schema definitions. A lot of tooling already exists that makes consuming data easy by auto-generating program code from those schema definitions, which is why schemas have not only an informational but also a very practical and valuable purpose.

Current Situation

At the time of writing there exists only one such schema, which is for XML output generated by ffprobe and available from this URL:

http://www.ffmpeg.org/schema/ffprobe.xsd

Currently, the xsd file is manually updated by developers on each change to the ffprobe output. While this procedure has been working well for more than a decade, it also has a few disadvantages:

Developers need to update the schema file manually
There are no schema definitions for JSON or YAML
The URL is not easy to discover
The published schema doesn't necessarily correspond to the ffprobe version one is using

For ffprobe output alone, these points might not make a strong enough case. But in light of the intended direction of making AVTextFormat a public API within FFmpeg, allowing it to be used for more than just FFprobe output, and considering the shift in common practice towards formats like JSON and YAML, it would surely be beneficial to have an integrated way of providing data schemas.

Proposal

The idea is simple and straightforward: FFmpeg/FFprobe should be able to provide those schema definitions directly, in the same way the data is output already.

For a hand-curated schema, providing definitions for multiple formats would be a tedious effort. If they were machine-generated, however, it would be easy to provide them for formats like JSON or YAML as well, and for all outputs from the AVTextFormatter APIs.
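To make "machine-generated" a bit more concrete, here is a minimal sketch of emitting a JSON Schema fragment from a table of field names and types. It assumes such a table is already available; `SchemaField` and `write_json_schema` are hypothetical names for illustration and are not part of FFmpeg. Field names are assumed to be plain identifiers, so no JSON escaping is done.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of one output field (not an FFmpeg type). */
typedef struct {
    const char *name;       /* field name as printed, assumed to need no escaping */
    const char *json_type;  /* JSON Schema type, e.g. "integer" or "string" */
} SchemaField;

/* Writes a compact JSON Schema object for one section into buf.
 * Returns 0 on success, -1 if buf is too small. */
static int write_json_schema(char *buf, size_t size,
                             const SchemaField *fields, int nb_fields)
{
    size_t pos;
    int n;

    assert(buf && size > 0);

    n = snprintf(buf, size, "{\"type\":\"object\",\"properties\":{");
    if (n < 0 || (size_t)n >= size)
        return -1;
    pos = (size_t)n;

    for (int i = 0; i < nb_fields; i++) {
        /* Separate properties with a comma, except before the first one. */
        n = snprintf(buf + pos, size - pos, "%s\"%s\":{\"type\":\"%s\"}",
                     i ? "," : "", fields[i].name, fields[i].json_type);
        if (n < 0 || (size_t)n >= size - pos)
            return -1;
        pos += (size_t)n;
    }

    n = snprintf(buf + pos, size - pos, "}}");
    if (n < 0 || (size_t)n >= size - pos)
        return -1;
    return 0;
}
```

The same table could feed an XSD or YAML-oriented emitter; only the final formatting step would differ per format.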

How?

The easy part is about elements/types and the hierarchy structure: pretty much all of this can be inferred from the AVTextFormatSection definitions. What's missing there are the individual members that are output for each element and their types, and that's crucial information. So how can we find out which fields are printed? The C language doesn't have any kind of reflection capabilities that would make this easy... Of course we could add all fields and their types to the section definitions, but that would be an awful duplication with a high chance of getting out of sync with the fields actually printed. What remains is the only other way I could think of:

Capturing the fields from the output!

SchemaProxy

To give this a try, I have created a new AVTextFormatter, avtextformatter_schemaproxy.

Recording Phase
- The schemaproxy formatter does nothing in print_section_header and, apart from the top-level element, in print_section_footer; the work happens in the other print_ functions
- It maintains one dictionary for each section type. Whenever a value is printed, it adds/sets the name of the field as key and the data type as value in the dictionary for that section
- It doesn't print any actual output until print_section_footer is called for the top-level element

Output Phase

Now the formatter becomes active itself:
- Creates the actual AVTextFormatter specified by the user
- (TODO) Calls an API of the (schema-capable) formatter, making it print the schema for its specific format, by supplying:
  - The section definitions
  - The dictionary for each section with the recorded fields

The last bullet point isn't implemented yet, but what is already done is the "Schema Diagram Output" (see below).
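The recording phase can be sketched roughly as follows. This is not the actual avtextformatter_schemaproxy code: `SchemaRecorder`, `record_field` and the `proxy_print_*` functions are hypothetical stand-ins, the fixed-size tables replace a real dictionary, and field names are assumed to be string literals that outlive the recorder.

```c
#include <assert.h>
#include <string.h>

#define MAX_SECTIONS 8   /* arbitrary limits for this sketch */
#define MAX_FIELDS   16

/* Hypothetical field types; a real formatter may distinguish more. */
typedef enum { FIELD_INT, FIELD_STRING } FieldType;

typedef struct {
    const char *name;    /* not copied: assumed to be a string literal */
    FieldType   type;
} Field;

/* One small "dictionary" per section type. */
typedef struct {
    Field fields[MAX_FIELDS];
    int   nb_fields;
} SectionFields;

typedef struct {
    SectionFields sections[MAX_SECTIONS];
} SchemaRecorder;

/* Adds the field, or overwrites its type if the name was seen before.
 * Returns 0 on success, -1 if the section's table is full. */
static int record_field(SchemaRecorder *rec, int section_id,
                        const char *name, FieldType type)
{
    SectionFields *s;

    assert(rec && name && section_id >= 0 && section_id < MAX_SECTIONS);
    s = &rec->sections[section_id];

    for (int i = 0; i < s->nb_fields; i++) {
        if (!strcmp(s->fields[i].name, name)) {
            s->fields[i].type = type;
            return 0;
        }
    }
    if (s->nb_fields == MAX_FIELDS)
        return -1;
    s->fields[s->nb_fields++] = (Field){ name, type };
    return 0;
}

/* Stand-ins for the value-printing callbacks: the value itself is
 * ignored, only the field name and its type are recorded. */
static int proxy_print_integer(SchemaRecorder *rec, int section_id,
                               const char *key, long long value)
{
    (void)value;
    return record_field(rec, section_id, key, FIELD_INT);
}

static int proxy_print_string(SchemaRecorder *rec, int section_id,
                              const char *key, const char *value)
{
    (void)value;
    return record_field(rec, section_id, key, FIELD_STRING);
}
```

Printing the same section several times leaves its dictionary unchanged after the first pass, which is why one occurrence of each section is enough for recording.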

Something is Missing...

...in case you are wondering already: yes, that's right. How does the whole procedure even get initiated? With a regular command line for probing, specified by the user? In that case, how can we make sure that it outputs a complete schema? If we operate on a command line provided by the user, it's very unlikely that it would cover all of the defined sections, and in turn there would be missing fields in the generated schema.

Probably the only way to achieve a consistent and complete schema through this approach would be to build a full mock-up of all the objects/structures, simulating a situation where all sections are printed.

When I got to this point, I quickly decided to give up on the whole idea. I really didn't want to write all the code needed for creating the required elements through their APIs, dealing with tons of issues that need to be worked around because it's not a regular case (no real input to work with, etc.), and afterwards freeing all the memory in the right order while again dealing with issues caused by it not being a normal case.

To the Rescue

Thinking about it later again, I realized a number of things:

- It is not really required to mock up a real-world situation:
  - It doesn't matter which values the fields have; we only need to avoid NULL values where something is printed or needed as part of the printing procedures
  - It is sufficient to include just enough parts that each section is printed once
- Maybe I can get away with doing it all just via stack allocation...

30 LOC

And I could. I managed to mock up everything that is needed for a complete execution graph printout with only 30 lines of code. No manual allocations, no allocation checking, no freeing:

AVClass default_class = { .class_name = "default_class", .item_name = av_default_item_name, .version = LIBAVUTIL_VERSION_INT };
AVInputFormat ifo = { .name = "ifo" };
AVOutputFormat ofo = { .name = "ofo" };
AVFormatContext fc = { .iformat = &ifo, .oformat = &ofo, .url = "url.mkv", };
AVCodecParameters ipar = { .codec_id = AV_CODEC_ID_MPEG2VIDEO, .codec_type = AVMEDIA_TYPE_VIDEO };
AVCodec icodec =  { .name = "icodec" };
Decoder idecoder =  { 0 };
InputStream is = { .index = 0, .par = &ipar, .dec = &icodec, .decoder = &idecoder };
InputFile ifi = { .class = NULL, .index = 0, .streams = (InputStream*[]) { &is }, .nb_streams = 1, .ctx = &fc };
AVCodecContext enc_ctx = { .av_class = &default_class };
Encoder enc = { .enc_ctx = &enc_ctx };
AVStream av_os = { .index = 0, .codecpar = &ipar };
OutputStream os = { .index = 0, .st = &av_os, .type = AVMEDIA_TYPE_VIDEO, .enc = &enc, .ist = &is };
Muxer ofi = { .of.class = NULL, .of.index = 0, .of.url = "output.mkv", .of.streams = (OutputStream*[]) { &os }, .of.nb_streams = 1,.fc = &fc };
InputFilterPriv ifilt = { .ifilter.name = (uint8_t*)av_strdup("ifilt"), .index = 0, };
OutputFilterPriv ofilt = { .ofilter.name = (uint8_t*)av_strdup("ofilt"), .name = "filt", .index = 0 };
FilterGraphPriv fg = { .graph_desc = "desc", .fg.index = 0, .fg.inputs = (InputFilter*[]) { &ifilt.ifilter }, .fg.nb_inputs = 1, .fg.outputs =  (OutputFilter*[]) { &ofilt.ofilter }, .fg.nb_outputs = 1 };
AVHWDeviceContext hw_ctx = { .av_class = &default_class, .type = AV_HWDEVICE_TYPE_QSV, };
AVBufferRef hw_ctx_buf = { .data = (uint8_t*)&hw_ctx };
AVHWFramesContext frames_ctx = { .av_class = &default_class, .device_ctx = &hw_ctx, .format = AV_PIX_FMT_ARGB, .height = 1, .width = 1, .initial_pool_size = 1, .sw_format = AV_PIX_FMT_ABGR };
AVBufferRef frames_ctx_buf = { .data = (uint8_t*)&frames_ctx };
AVFilterLink link = { .type = AVMEDIA_TYPE_VIDEO, .color_range = AVCOL_RANGE_JPEG, .format = 1, .w = 1, .h = 1, .sample_aspect_ratio = av_make_q(1, 1), .colorspace = AVCOL_SPC_BT709, .sample_rate = 44000, .ch_layout = AV_CHANNEL_LAYOUT_STEREO, .time_base = av_make_q(1, 1) };
AVFilter filter = { .name = "filter", .nb_inputs = 1, .nb_outputs = 1, .description = "desc" };
AVFilterContext filter_ctx = { .av_class = &default_class, .name = av_strdup("filter_ctx"), .nb_inputs = 1, .inputs =  (AVFilterLink*[]) { &link }, .nb_outputs = 1, .outputs =  (AVFilterLink*[]) { &link }, .filter = &filter, .hw_device_ctx = &hw_ctx_buf, .extra_hw_frames = 1 };
AVFilterGraph graph = { .av_class = &default_class, .aresample_swr_opts = av_strdup(" "), .scale_sws_opts = av_strdup(" "), .nb_filters = 1, .filters = (AVFilterContext*[]) { &filter_ctx } };
is.file = &ifi;
os.file = &ofi.of;
link.src = &filter_ctx;
link.dst = &filter_ctx;
filter_ctx.graph = &graph;

It might not be good practice in general, but the above will be executed only in the special case of schema printing, where the application runs for the sole purpose of doing so and exits right after.
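The idiom used above, shown in isolation: designated initializers plus compound literals such as `(Stream*[]) { &st }` create the pointer arrays on the stack, and back-pointers are assigned once both objects exist. The `Stream` and `File` types and both functions below are hypothetical miniatures for illustration, not the fftools structures.

```c
#include <assert.h>
#include <stddef.h>

struct File;  /* forward declaration for the back-pointer */

typedef struct Stream {
    int          index;
    const char  *codec_name;
    struct File *file;       /* back-pointer, set after construction */
} Stream;

typedef struct File {
    const char *url;
    Stream    **streams;
    int         nb_streams;
} File;

/* Walks the objects the way a printer would. Returns the number of
 * sections visited, or -1 if a pointer the printer needs is missing. */
static int count_sections(const File *f)
{
    int n;

    if (!f || !f->url || !f->streams)
        return -1;
    n = 1;  /* the file section itself */
    for (int i = 0; i < f->nb_streams; i++) {
        const Stream *s = f->streams[i];
        if (!s || !s->codec_name || s->file != f)
            return -1;
        n++;
    }
    return n;
}

/* Builds the whole mock on the stack: no malloc, no free. */
static int mock_and_walk(void)
{
    Stream st = { .index = 0, .codec_name = "mock" };
    File   fi = { .url = "url.mkv",
                  .streams = (Stream*[]) { &st },  /* array lives in this scope */
                  .nb_streams = 1 };

    st.file = &fi;  /* cyclic reference assigned after both exist */
    return count_sections(&fi);
}
```

Everything is released automatically when the function returns, so the mock must be consumed before leaving the scope that created it.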

Schema Diagrams

This is nothing more than a cherry on top; I wanted to try out anyway how the mermaid formatter would work and how it would need to be changed to handle a different type of diagram. For the execution graph it uses a "FlowChart" diagram (probably the most common one in Mermaid), and what appears to make sense for the data schema case is an "erDiagram" (Entity Relationship).
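As an illustration of what an erDiagram emitter could look like, here is a minimal sketch that turns entities with typed fields into Mermaid text. It is not the actual mermaid formatter: `ErEntity`, `ErField`, `appendf` and `write_er_diagram` are hypothetical names, and the fixed "one parent contains zero or more children" relationship (`||--o{`) is an assumption made for the example.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical input types (not FFmpeg structures). */
typedef struct { const char *type; const char *name; } ErField;

typedef struct {
    const char    *name;
    const char    *parent;   /* NULL for the root entity */
    const ErField *fields;
    int            nb_fields;
} ErEntity;

/* Appends formatted text at *pos; returns 0, or -1 if buf is too small. */
static int appendf(char *buf, size_t size, size_t *pos, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vsnprintf(buf + *pos, size - *pos, fmt, ap);
    va_end(ap);
    if (n < 0 || (size_t)n >= size - *pos)
        return -1;
    *pos += (size_t)n;
    return 0;
}

/* Writes a Mermaid erDiagram: one attribute block per entity that has
 * fields, plus one relationship line per parent/child pair. */
static int write_er_diagram(char *buf, size_t size,
                            const ErEntity *ents, int nb_ents)
{
    size_t pos = 0;

    assert(buf && size > 0 && ents);
    if (appendf(buf, size, &pos, "erDiagram\n"))
        return -1;

    for (int i = 0; i < nb_ents; i++) {
        const ErEntity *e = &ents[i];

        if (e->nb_fields > 0) {
            if (appendf(buf, size, &pos, "    %s {\n", e->name))
                return -1;
            for (int j = 0; j < e->nb_fields; j++)
                if (appendf(buf, size, &pos, "        %s %s\n",
                            e->fields[j].type, e->fields[j].name))
                    return -1;
            if (appendf(buf, size, &pos, "    }\n"))
                return -1;
        } else if (appendf(buf, size, &pos, "    %s\n", e->name)) {
            return -1;  /* entity without recorded fields: name only */
        }

        if (e->parent &&
            appendf(buf, size, &pos, "    %s ||--o{ %s : contains\n",
                    e->parent, e->name))
            return -1;
    }
    return 0;
}
```

Entities without any recorded fields are emitted by name only, so they still appear in the diagram as boxes without attributes.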

Examples

Here are two examples. For the first one, all fields are included, but the mocking of the data for FFprobe isn't done yet, which is why the fields are available only for some but not all sections/elements: