Binary meshes
We want support for "binary blobs" to define per-vertex data. Exactly as glTF does.
Reasons:
- Binary format means excellent performance of reading and writing per-vertex data, because it avoids string <-> float conversions during X3D classic/XML read/write.

  Right now, string <-> float conversions are clearly (confirmed by my profiling sessions) the bottleneck when reading/writing the classic/XML X3D encodings. Real-life models contain a very large number of floats, so the X3D reader/writer spends a lot of time on these conversions.

  Note: This advantage is also provided by the existing X3D binary encoding. But the X3D binary encoding doesn't provide some other advantages listed below.
- Ability to save to an X3D file without losing any precision.

  Right now, when we read binary data from e.g. glTF or STL and convert it back to X3D, the numbers have a lot of (possibly) significant digits. You can use the `--float-precision=DIGITS` option of https://castle-engine.io/castle-model-converter to round the numbers, but you have to do this explicitly -- we don't know how many digits matter for you; for some people it may be 4, for some it may be 6.

  To compound this issue, when we read X3D and write it back to X3D, we also produce extra digits in the output. That's because internally the numbers are converted to `float` (`Single` in Pascal) -- we don't store the information "how many significant digits the input contained", as that would be a big burden (to implement, and to carry this information through all processing).

  If we could save `float` (`Single` in Pascal) as binary, 4 bytes, it would solve this issue (see the sketch after this list). Note: This advantage is also provided by the existing X3D binary encoding. But the X3D binary encoding doesn't provide some other advantages listed below.
- Possible excellent performance of uploading this data from RAM to the GPU, along with GPU-friendly data packing (like interleaving). The X3D browser could "just take the binary blob and pass it to the GPU" (though this is optional; the X3D browser could also interpret the data as it does right now).
- Better file sizes -- binary data is practically always smaller (even before gzipping) than the equivalent text representation. A regular `float` is always just 4 bytes in binary, no matter how large the value, no matter how complicated its fractional part etc.
- Ideal alignment with glTF. The glTF importer could just express glTF meshes using the new nodes for binary data. Then we get "optimal glTF read / write / rendering", along with "optimal X3D read / write / rendering". Everybody wins when standards are aligned :)
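To make the precision and file-size points above concrete, here is a minimal sketch (TypeScript, chosen only because it is easy to run in a browser console; none of this is X3D or glTF API): a 32-bit float stored as 4 binary bytes round-trips exactly, while its exact decimal representation is long, and rounding it to fewer digits changes the value.

```typescript
// Round-tripping a 32-bit float through binary vs. through decimal text.
// Purely illustrative; not part of any existing or proposed X3D API.

const original = Math.fround(1 / 3); // nearest 32-bit float to 1/3

// Binary round-trip: always exactly 4 bytes, always lossless.
const buffer = new ArrayBuffer(4);
new DataView(buffer).setFloat32(0, original, /* littleEndian */ true);
const fromBinary = new DataView(buffer).getFloat32(0, true);
console.log(fromBinary === original); // true -- bit-for-bit identical

// Text round-trip: lossless only if we print all the digits...
const exactText = original.toString();
console.log(exactText);        // "0.3333333432674408"
console.log(exactText.length); // 18 characters in the file, vs. 4 bytes in binary

// ...while rounding to a "nice" number of digits silently changes the value:
const rounded = Math.fround(Number(original.toFixed(4)));
console.log(rounded === original); // false
```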
This is something I consider important for X3D, and I think it should be very important for X3D spec as well (compared to glTF, this is a significant drawback in X3D right now, when it comes to efficiency of reading / writing really large 3D data; it's more efficient to use glTF for this, for now; we can do better :) ).
Reusing existing glTF 2.0 standard ideas is the right course of action here, IMHO:
- binary buffers, that can be read fast from a file,
- and are delivered in a straightforward fashion to the GPU.
- No need to interpret the binary data on the browser/player side. It's just a binary blob that we read from a file and send to the GPU (as sketched below).
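For a WebGL-based browser like X3DOM, "send the blob to the GPU" can look roughly like the following sketch (TypeScript; the helper function and its parameters are made-up illustrations, only the WebGL calls are real API):

```typescript
// Hypothetical helper: upload a glTF-style binary blob straight to the GPU
// and describe one interleaved attribute inside it. Only the WebGL calls
// (bufferData, vertexAttribPointer, ...) are real API; everything else is
// an illustrative assumption.
function uploadInterleavedBlob(
  gl: WebGLRenderingContext,
  blob: ArrayBuffer,          // bytes read verbatim from the file
  positionLocation: number,   // attribute location from the shader
  byteOffset: number,         // accessor byteOffset within the blob
  byteStride: number          // accessor byteStride (0 = tightly packed)
): WebGLBuffer {
  const buffer = gl.createBuffer()!;
  gl.bindBuffer(gl.ARRAY_BUFFER, buffer);

  // The whole blob goes to the GPU untouched -- no parsing, no conversion.
  gl.bufferData(gl.ARRAY_BUFFER, blob, gl.STATIC_DRAW);

  // The accessor/view metadata only tells the GPU how to walk the bytes.
  gl.enableVertexAttribArray(positionLocation);
  gl.vertexAttribPointer(
    positionLocation,
    3,            // components, e.g. a POSITION accessor of type VEC3
    gl.FLOAT,     // componentType 5126
    false,        // not normalized
    byteStride,
    byteOffset
  );
  return buffer;
}
```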
Older X3D binary encoding is not a full solution to this:
- X3D binary encoding gives us fast parsing and better compression. But that's not our only desire.
- The goal is fast delivery to GPU, with interleaved data. As far as I know, X3D binary encoding doesn't provide this.
- As a practical note: the X3D binary encoding is unfortunately not popular enough. E.g. the Blender exporter cannot generate such models; I don't think anyone ever made a Blender exporter that supports the X3D binary encoding. In contrast, glTF export (with binary buffers) is provided by the Blender exporters from Khronos (both for Blender <= 2.79 and Blender >= 2.80).
X3DOM: For users, they present `ExternalGeometry` and Shape Resource Containers, see https://doc.x3dom.org/author/Geometry3D/ExternalGeometry.html . Internally, they present a new node, `BufferView` (previously: `BufferGeometryView`).
The text below comes from a mail by Andreas Plesch (thousand thanks for this valuable info!) on x3d-public (2018-11-24, thread "Re: [x3d-public] glTF Importing [was: X3D working group meeting minutes 16 NOV 2018]").
It may be instructive to see an example of how currently x3dom is adding glTF to the X3D scene graph. Here is how a glTF of a single triangle becomes an X3D child of the inline node:
<inline url="https://raw.githubusercontent.com/cx20/gltf-test/master/tutorialModels/SimpleMaterial/glTF/SimpleMaterial.gltf"
namespacename="gltf" mapdeftoid="true">
<transform id="gltf__NODE0" def="NODE0">
<shape>
<appearance>
<physicalmaterial basecolorfactor="1 0.766 0.336 1"
metallicfactor="0.5" roughnessfactor="0.1" emissivefactor="0 0 0"
alphamode="OPAQUE" alphacutoff="0.5" model="roughnessMetallic"
diffusefactor="1,1,1,1" specularfactor="1,1,1" glossinessfactor="1"
normalspace="TANGENT" normalbias="-1,-1,1"
normalscale="1"></physicalmaterial>
</appearance>
<buffergeometry buffer="simpleTriangle.bin" position="0.5 0.5 0"
size="1 1 0" vertexcount="3" primtype="TRIANGLES">
<buffergeometryaccessor buffertype="INDEX" view="0"
byteoffset="0" bytestride="0" components="1" componenttype="5123"
count="3"></buffergeometryaccessor>
<buffergeometryaccessor buffertype="POSITION" view="1"
byteoffset="0" bytestride="0" components="3" componenttype="5126"
count="3"></buffergeometryaccessor>
<buffergeometryview target="34963" byteoffset="0"
bytelength="6" id="gltf__0"></buffergeometryview>
<buffergeometryview target="34962" byteoffset="8"
bytelength="36" id="gltf__1"></buffergeometryview>
</buffergeometry>
</shape>
</transform>
</inline>
(buffergeometryaccessor/view became just bufferaccessor/view: https://github.com/x3dom/x3dom/issues/898)
and here is the glTF JSON:
{ "scenes" : [
{ "nodes" : [ 0 ] }
],
"nodes" : [
{ "mesh" : 0 }
],
"meshes" : [
{
"primitives" : [ {
"attributes" : { "POSITION" : 1 },
"indices" : 0,
"material" : 0
} ]
}
],
"buffers" : [
{
"uri" : "simpleTriangle.bin",
"byteLength" : 44
}
],
"bufferViews" : [
{
"buffer" : 0,
"byteOffset" : 0,
"byteLength" : 6,
"target" : 34963
},
{
"buffer" : 0,
"byteOffset" : 8,
"byteLength" : 36,
"target" : 34962
}
],
"accessors" : [
{
"bufferView" : 0,
"byteOffset" : 0,
"componentType" : 5123,
"count" : 3,
"type" : "SCALAR",
"max" : [ 2 ],
"min" : [ 0 ]
},
{
"bufferView" : 1,
"byteOffset" : 0,
"componentType" : 5126,
"count" : 3,
"type" : "VEC3",
"max" : [ 1.0, 1.0, 0.0 ],
"min" : [ 0.0, 0.0, 0.0 ]
} ],
"materials" : [
{
"pbrMetallicRoughness": {
"baseColorFactor": [ 1.000, 0.766, 0.336, 1.0 ],
"metallicFactor": 0.5,
"roughnessFactor": 0.1
}
} ],
"asset" : { "version" : "2.0" }
}
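As a sketch of how little work is left for the reader, here is one way to decode simpleTriangle.bin purely from the accessor/bufferView values above (TypeScript; the function and its return shape are illustrative assumptions, not x3dom code):

```typescript
// Decode simpleTriangle.bin exactly as described by the glTF accessors above.
// Note: typed-array views assume a little-endian host (true on virtually all
// current platforms); a DataView would make the byte order explicit.
function decodeSimpleTriangle(bin: ArrayBuffer) {
  // accessor 0: bufferView 0 -> byteOffset 0, componentType 5123
  // (UNSIGNED_SHORT), count 3, type SCALAR -> the triangle indices.
  const indices = new Uint16Array(bin, 0, 3);

  // accessor 1: bufferView 1 -> byteOffset 8, componentType 5126
  // (FLOAT), count 3, type VEC3 -> 9 floats of vertex positions.
  // The 2-byte gap at offsets 6..8 exists because positions must be 4-byte aligned.
  const positions = new Float32Array(bin, 8, 3 * 3);

  return { indices, positions };
}

// Usage sketch, e.g. in a browser:
// const bin = await (await fetch("simpleTriangle.bin")).arrayBuffer();
// const { indices, positions } = decodeSimpleTriangle(bin);
```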
`BufferGeometry` references the binary data (file or dataurl or objecturl). It was most natural to also introduce the accessor/view nodes which define how the referenced data is interpreted. It is then up to the implementation how the binary data is used for rendering etc.
The BoxAnimated glTF sample (note the buffer BoxAnimated0.bin below) has a simple animation of translation. It gets translated to this:
<transform id="gltf__NODE3" def="NODE3">
...
</transform>
<transform id="gltf__NODE0" def="NODE0" translation="0,0,0">
<transform id="gltf__NODE1" def="NODE1">
<transform rotation="0 0 0 6.283185307179586" id="gltf__NODE2" def="NODE2" >
...
</transform>
</transform>
</transform>
<timesensor loop="true" cycleinterval="3.708329916000366"
def="clockANI0" enabled="true" first="true"
id="gltf__clockANI0"></timesensor>
<orientationinterpolator def="interANI0CH0" key="sampler.input.array"
keyvalue="sampler.output.array" buffer="BoxAnimated0.bin"
id="gltf__interANI0CH0">
<buffergeometryaccessor buffertype="SAMPLER_INPUT" view="2"
byteoffset="0" bytestride="0" components="1" componenttype="5126"
count="2"></buffergeometryaccessor>
<buffergeometryaccessor buffertype="SAMPLER_OUTPUT" view="3"
byteoffset="0" bytestride="0" components="4" componenttype="5126"
count="2"></buffergeometryaccessor>
<buffergeometryview target="undefined" byteoffset="7760"
bytelength="24" id="gltf__2"></buffergeometryview>
<buffergeometryview target="undefined" byteoffset="0"
bytelength="32" id="gltf__3"></buffergeometryview>
</orientationinterpolator>
<route fromfield="fraction_changed" fromnode="clockANI0"
tofield="set_fraction" tonode="interANI0CH0"></route>
<route fromfield="value_changed" fromnode="interANI0CH0"
tofield="set_rotation" tonode="NODE2"></route>
<positioninterpolator def="interANI0CH1" key="sampler.input.array"
keyvalue="sampler.output.array" buffer="BoxAnimated0.bin"
id="gltf__interANI0CH1">
<buffergeometryaccessor buffertype="SAMPLER_INPUT" view="2"
byteoffset="8" bytestride="0" components="1" componenttype="5126"
count="4"></buffergeometryaccessor>
<buffergeometryaccessor buffertype="SAMPLER_OUTPUT" view="4"
byteoffset="0" bytestride="0" components="3" componenttype="5126"
count="4"></buffergeometryaccessor>
<buffergeometryview target="undefined" byteoffset="7760"
bytelength="24" id="gltf__2"></buffergeometryview>
<buffergeometryview target="undefined" byteoffset="32"
bytelength="48" id="gltf__4"></buffergeometryview>
</positioninterpolator>
<route fromfield="fraction_changed" fromnode="clockANI0"
tofield="set_fraction" tonode="interANI0CH1"></route>
<route fromfield="value_changed" fromnode="interANI0CH1"
tofield="set_translation" tonode="NODE0"></route>
Here, the glTF animation can be faithfully translated to TimeSensor/Interpolator/ROUTE combos, with the addition that the interpolators take their key and keyValue field values from the binary buffer accessor nodes as well, if there is a buffer field value (see the sketch below).
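For illustration, filling the PositionInterpolator above from the buffer could look like this (TypeScript; an assumed sketch based on the accessor values in the XML, not x3dom's actual implementation; note that glTF stores absolute times while X3D keys are 0..1 fractions):

```typescript
// Fill an X3D interpolator's key / keyValue from the binary buffer,
// following the accessor values of the PositionInterpolator above.
// Illustrative assumption, not x3dom's actual implementation.
function samplerToKeyKeyValue(bin: ArrayBuffer) {
  // SAMPLER_INPUT: view 2 (byteOffset 7760) + accessor byteOffset 8,
  // componentType 5126 (FLOAT), count 4 -> animation times in seconds.
  const times = new Float32Array(bin, 7760 + 8, 4);

  // SAMPLER_OUTPUT: view 4 (byteOffset 32) + accessor byteOffset 0,
  // 4 VEC3 values -> 12 floats of translations (the keyValue array).
  const keyValue = new Float32Array(bin, 32 + 0, 4 * 3);

  // glTF stores absolute times; X3D keys are fractions in 0..1,
  // so normalize by the animation duration (the TimeSensor cycleInterval).
  const duration = times[times.length - 1] - times[0];
  const key = Array.from(times, t => (t - times[0]) / duration);

  return { key, keyValue };
}
```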
For x3dom and perhaps other implementations it seemed most natural to introduce these binary access nodes for glTF loading and perhaps other use.
From a specification perspective, however, introducing new nodes is something which needs to be carefully considered and is hard to do well. For example, it may be more appropriate to introduce new interpolator nodes rather than extending the existing ones.
So from a specification perspective it is attractive to not expose how the glTF scene is actually represented in the X3D scene graph, and essentially keep it a black box. The spec language could then even boil down to allowing glTF content and only deferring to glTF for how the content should be rendered and animated. Implementations are then of course still free to use internal nodes, and potentially expose them, but in a non-standardized way.
But most glTF features do not require new X3D nodes to be represented in the scene graph, and an ability to import those is very useful, probably expected by authors, and not too hard to implement since all the pieces are already in place. From a specification perspective this ability would then need to be very well defined, again requiring careful selection of what can be imported and how. Other formats such as jpeg or nrrd, though simpler, are already referenced in the spec, so I think it may be possible for the spec to be precise about how certain glTF features get absorbed into an X3D scene.
For example, glTF "nodes" can be directly represented by named X3D Transform nodes, and glTF cameras by X3D Viewpoint nodes. The exported DEF names could be the glTF names (perhaps with a prefix) where provided, or a unique constructed name using the index from the glTF ('gltfTRANSFORM_0') where not provided.
Perhaps that is all which could be required in a spec. in terms of interplay between glTF and X3D.
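For illustration only, such a naming rule could be sketched like this (TypeScript; the function and prefix handling are assumptions based on the description above, not a spec proposal):

```typescript
// Derive the DEF name of the X3D Transform created for a glTF node,
// following the rule sketched above: use the glTF node name when present,
// otherwise construct a unique name from the node's index.
// Illustrative assumption only.
function defNameForGltfNode(nodeIndex: number, nodeName?: string, prefix = "gltf_"): string {
  return nodeName !== undefined && nodeName !== ""
    ? prefix + nodeName
    : `gltfTRANSFORM_${nodeIndex}`;
}

// Examples:
// defNameForGltfNode(0)          -> "gltfTRANSFORM_0"
// defNameForGltfNode(2, "Wheel") -> "gltf_Wheel"
```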
It is less clear how glTF animations should be required to be absorbed into X3D from a spec perspective. One approach is as shown, which requires a named TimeSensor for each animation (which can target different channels in glTF). The DEF name would be named after the index in the glTF animations array ('gltfCLOCK_0'). But implementations may choose not to use a TimeSensor/Interpolator/ROUTE combo and use a more direct approach; I suppose a virtual TimeSensor could then still be required to be exported, to provide an interface for the animation.
If PhysicalMaterial gets picked up by X3D, it would become the node to represent the glTF material, also named after the index ('gltfMaterial_0').
I hope to find the time and strength at some point to compare glTF skinned skeleton animation with HAnim animation. I think they are largely compatible.
Note that we have to specify the endianness of the binary data.
Just like glTF ( https://registry.khronos.org/glTF/specs/2.0/glTF-2.0.html ):
All buffer data defined in this specification (i.e., geometry attributes, geometry indices, sparse accessor data, animation inputs and outputs, inverse bind matrices) MUST use little endian byte order.
Binary glTF is little endian.
Choosing little-endian makes sense -- it matches almost all existing CPUs nowadays. And for X3D, choosing the same as glTF has an additional benefit: converting X3D <-> glTF is easier when we make the same decisions.
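In a browser/JavaScript context this requirement is easy to honor explicitly, because DataView lets the reader state the byte order of the file rather than inherit the host CPU's. A minimal sketch (TypeScript; the function name is made up for illustration):

```typescript
// Read little-endian 32-bit floats from a binary blob, regardless of the
// endianness of the machine running this code. The explicit `true` flag
// is the whole point: the file format, not the host CPU, decides byte order.
function readLittleEndianFloats(bin: ArrayBuffer, byteOffset: number, count: number): number[] {
  const view = new DataView(bin, byteOffset);
  const result: number[] = [];
  for (let i = 0; i < count; i++) {
    result.push(view.getFloat32(i * 4, /* littleEndian */ true));
  }
  return result;
}
```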