Converting glTF to X3D - michaliskambi/x3d-tests GitHub Wiki
Table of Contents:
- What is this?
- Who is this document for?
- Sample glTF files
- Meshes
- Transformations and their animations
- Extras (metadata)
- Cameras
- Materials
- Gamma Correction
- alphaMode
- alphaCutoff
- Per-vertex colors
- Texture coordinates
- Skinned mesh animation
- Punctual lights
- Image-based lighting
What is this?
This document reflects my (Michalis) experience of implementing glTF in Castle Game Engine. We load glTF files in Castle Game Engine, and convert them internally to X3D nodes before doing anything substantial (like rendering or animating). So I needed to express every idea from glTF (that I want to support in CGE) as some X3D node/field construction.
And I wanted to have lots of glTF features: Physically-Based Rendering, animations (skinned and not skinned), lights, cameras etc.
The unit that implements the conversion is here: X3DLoadInternalGltf source code. If anything in this document is unclear, just go there and read the actual source.
I welcome feedback from other browsers about how they implement glTF. I have collected some information in Binary meshes about some X3DOM parts. You can write e.g. on the x3d-public mailing list and I will try to incorporate it here.
The two most important accompanying documents:
- glTF 2.0 specification: https://github.com/KhronosGroup/glTF/tree/master/specification/2.0
- X3D 4.0 specification (description of nodes): https://www.web3d.org/specifications/X3Dv4Draft/ISO-IEC19775-1v4-CD1/
Who is this document for?
- For people using glTF in conjunction with X3D. It may be helpful to you to know what happens under the hood when you do Inline { url "model.gltf" } in X3D. E.g. it is useful to know that each glTF animation is a TimeSensor in X3D, and you can control it from X3D. We demonstrate this case in skinned_anim_run_animations_from_x3d.x3dv (to test it, download demo models and open blender/skinned_animation/skinned_anim_run_animations_from_x3d.x3dv with view3dscene).
- For browser implementors that already support X3D, and want to add glTF. This means you probably want to implement something similar to what I'm doing.
Sample glTF files
- Khronos: https://github.com/KhronosGroup/glTF-Sample-Models (free, good coverage of glTF features; some models are impressive, some are just boring feature tests)
- Sketchfab: https://sketchfab.com/features/gltf (a lot of graphically impressive models)
- Google Poly: https://poly.google.com/search/gltf (not worth much, many duplicates, many copies from Khronos)
- Castle Game Engine tests: https://github.com/castle-engine/demo-models/tree/master/blender , https://github.com/castle-engine/demo-models/tree/master/gltf .
Meshes
glTF "mesh" is a collection of glTF "primitives". In X3D this is just a Group of Shape nodes.
Most glTF primitive modes translate naturally to X3D:
- glTF Triangles -> X3D [Indexed]TriangleSet (indexed or not, depending on whether indices are provided in glTF)
- glTF TriangleStrip -> X3D [Indexed]TriangleStripSet
- glTF TriangleFan -> X3D [Indexed]TriangleFanSet
- glTF LineStrip -> X3D [Indexed]LineSet
- glTF Points -> X3D PointSet (note that we don't have IndexedPointSet in X3D; it could make sense for consistency, but its usefulness would probably be very low; for now in CGE, we just ignore the indexes of a glTF Points primitive, so we possibly display more points)
Careful: glTF Lines do not naturally map to X3D [Indexed]LineSet. glTF Lines are like the OpenGL GL_LINES primitive, i.e. 2 vertexes for each line. X3D [Indexed]LineSet is more like a number of OpenGL GL_LINE_STRIP primitives.
In CGE/view3dscene, we have introduced the [Indexed]LineSet.mode field (see docs: https://castle-engine.io/apidoc/html/X3DNodes.html#TLineMode ):
- STRIP (default): results in X3D spec behavior, like GL_LINE_STRIP.
- LOOP: similar to STRIP, but each polyline is automatically closed, so it's like GL_LINE_LOOP.
- PAIR: results in "each 2 vertexes form a line", like GL_LINES.
In effect we can also handle:
- glTF Lines -> X3D [Indexed]LineSet with lineMode = PAIR
- glTF LineLoop -> X3D [Indexed]LineSet with lineMode = LOOP
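The primitive mapping above can be summarized as a lookup table. This is only a minimal Python sketch, not the CGE code: the function name x3d_geometry_for and the table layout are made up for illustration, and the primitive is assumed to be a plain dictionary as parsed from the glTF JSON. The numeric mode values are the ones defined by glTF 2.0 (same as OpenGL).

```python
# glTF 2.0 primitive "mode" values (same numbers as the OpenGL constants).
GLTF_MODES = {
    0: "Points",
    1: "Lines",
    2: "LineLoop",
    3: "LineStrip",
    4: "Triangles",
    5: "TriangleStrip",
    6: "TriangleFan",
}

# glTF mode name -> (X3D node name, has an Indexed variant?, CGE line mode or None).
# The line modes PAIR / LOOP rely on the CGE extension described above.
X3D_GEOMETRY = {
    "Points": ("PointSet", False, None),
    "Lines": ("LineSet", True, "PAIR"),
    "LineLoop": ("LineSet", True, "LOOP"),
    "LineStrip": ("LineSet", True, "STRIP"),
    "Triangles": ("TriangleSet", True, None),
    "TriangleStrip": ("TriangleStripSet", True, None),
    "TriangleFan": ("TriangleFanSet", True, None),
}


def x3d_geometry_for(primitive):
    """Return (X3D geometry node name, line mode or None) for a glTF primitive dict."""
    mode = primitive.get("mode", 4)  # glTF default mode is 4 (Triangles)
    node, indexable, line_mode = X3D_GEOMETRY[GLTF_MODES[mode]]
    # Use the Indexed* variant only when glTF provides indices.
    # For Points the indices are ignored, as there is no IndexedPointSet.
    if indexable and "indices" in primitive:
        node = "Indexed" + node
    return node, line_mode
```

For example, a primitive with indices and no explicit mode maps to IndexedTriangleSet, while a non-indexed primitive with mode 1 maps to LineSet with the PAIR line mode.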
Most vertex attributes and texture parameters have a straightforward translation.
For explicit tangent information, CGE has an extension: the Tangent node.
Transformations and their animations
glTF "node" is X3D Transform.
glTF samplers that animate transformations with Linear interpolation can be expressed in X3D perfectly using TimeSensor + PositionInterpolator (to animate translation/scale) or OrientationInterpolator (to animate rotation).
glTF samplers that animate transformations with Step or CubicSpline could be expressed in X3D:
- By simulating them using existing X3D linear interpolation.
  - For Step, you can just duplicate appropriate keys and values. It's not efficient (you'll have 2x more time points), but it is correct.
  - For CubicSpline, you can calculate a number of values in-between to approximate the curve e.g. by 10 points. This means you'll have more time points, and it is not fully precise (you'll approximate the curve by a number of points), but in practice it works well for typical models.
- Or add a field to X3D interpolators like mode to specify the mode as [LINEAR, STEP, CUBIC_SPLINE]. This is more efficient, and CGE is going in this direction.
For now, Castle Game Engine
- approximates CubicSpline by a Linear with more points (to simulate a curve).
- as an extension, adds STEP mode for interpolators.
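The two "simulate with linear interpolation" approaches could look roughly like this. It is a Python sketch under several assumptions, not the CGE implementation: values are scalars (real translations or rotations need per-component or quaternion handling), keys stay in glTF seconds (a TimeSensor would need them normalized to 0..1), and the function names are invented here. The cubic formula is the cubic Hermite spline from the glTF 2.0 specification, where tangents are scaled by the segment duration.

```python
def step_to_linear(keys, values):
    """Emulate STEP using a linear interpolator by duplicating keys."""
    out_keys, out_values = [], []
    for i, (key, value) in enumerate(zip(keys, values)):
        if i > 0:
            # Hold the previous value right up to this key...
            out_keys.append(key)
            out_values.append(values[i - 1])
        # ...then jump to the new value at the same key.
        out_keys.append(key)
        out_values.append(value)
    return out_keys, out_values


def cubic_spline_to_linear(keys, in_tangents, values, out_tangents, samples=10):
    """Approximate a glTF CUBICSPLINE sampler by `samples` linear pieces per segment."""
    if not keys:
        return [], []
    out_keys, out_values = [], []
    for k in range(len(keys) - 1):
        dt = keys[k + 1] - keys[k]
        for j in range(samples):
            t = j / samples
            t2, t3 = t * t, t * t * t
            # Cubic Hermite spline, as defined by glTF 2.0.
            value = ((2 * t3 - 3 * t2 + 1) * values[k]
                     + dt * (t3 - 2 * t2 + t) * out_tangents[k]
                     + (-2 * t3 + 3 * t2) * values[k + 1]
                     + dt * (t3 - t2) * in_tangents[k + 1])
            out_keys.append(keys[k] + t * dt)
            out_values.append(value)
    # Close the curve with the exact last keyframe.
    out_keys.append(keys[-1])
    out_values.append(values[-1])
    return out_keys, out_values
```

Note that step_to_linear produces 2n-1 keys for n input keys, which matches the "2x more time points" cost mentioned above.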
Extras (metadata)
glTF "extras" is a key->value dictionary to express "any additional data" at various places of the model. The idea is identical to X3D "metadata".
We convert a glTF object with extras by adding a MetadataSet into the relevant X3D node. The name of this MetadataSet is "ContainerForAllMetadataValues" (we have to invent something for a MetadataSet that merely acts as a container). Then as value, we place a number of MetadataString, MetadataBoolean, MetadataDouble that correspond to the glTF extras.
For example this glTF:
"extras" : {
    "MyObjectProperty" : "object prop value",
    "FloatProperty" : 456.789
},
-> gets converted to this X3D:
metadata MetadataSet {
    name "ContainerForAllMetadataValues"
    value [
        MetadataString {
            name "MyObjectProperty"
            value "object prop value"
        }
        MetadataDouble {
            name "FloatProperty"
            value 456.78899999999999
        }
    ]
}
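A converter for the simple value types might be sketched like this in Python. The helper names (x3d_string, extras_to_metadata) are hypothetical, the output is X3D classic encoding text rather than real node objects, and treating integers as MetadataDouble and skipping nested arrays/objects are assumptions of this sketch, not documented CGE behavior.

```python
def x3d_string(text):
    """Quote a string for X3D classic encoding (escape backslash and double quote)."""
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def extras_to_metadata(extras):
    """Turn a glTF extras dict into X3D classic encoding text for a MetadataSet."""
    lines = [
        "metadata MetadataSet {",
        '  name "ContainerForAllMetadataValues"',
        "  value [",
    ]
    for key, value in extras.items():
        # bool must be tested before int/float: in Python, bool is a subclass of int.
        if isinstance(value, bool):
            node, text = "MetadataBoolean", "TRUE" if value else "FALSE"
        elif isinstance(value, (int, float)):
            node, text = "MetadataDouble", repr(float(value))
        elif isinstance(value, str):
            node, text = "MetadataString", x3d_string(value)
        else:
            continue  # nested objects/arrays are skipped in this sketch
        lines.append(f"    {node} {{ name {x3d_string(key)} value {text} }}")
    lines += ["  ]", "}"]
    return "\n".join(lines)
```

Calling extras_to_metadata on the "extras" object from the example above yields one MetadataString and one MetadataDouble inside the container set.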
Cameras
glTF node with a camera translates into X3D OrthoViewpoint or Viewpoint node wrapped in Transform. This is mostly straightforward.
X3D Viewpoint.FieldOfView is equal to glTF Camera.Perspective.YFov.
Remember when converting that the default X3D Viewpoint.position is 0 0 10, for glTF camera you want to set it to 0 0 0.
glTF says that the "+Y is up", which means that e.g. gravity should work in -Y direction, regardless of the camera node transformation. But in X3D, the transformation of X3DViewpointNode changes the gravity vector. This can be solved by
- (complicated) an extra Transform node that "cancels out" the transformation around viewpoint, and then you specify it using only the orientation field of viewpoint. This requires calculating accumulated rotation during conversion.
- (simpler) in CGE, we just use the X3DViewpointNode.gravityTransform extension. Setting it to false means that the gravity vector is not transformed by the viewpoint transformation.
Materials
- The standard glTF pbrMetallicRoughness material should be converted to the X3D 4.0 PhysicalMaterial node. The names and interpretation of the fields base*, emissive*, metallicRoughness*, normalTexture are deliberately consistent between PhysicalMaterial and the glTF standard material, to make this a straightforward conversion. All texture data is treated the same way (same channels are used for the same purpose, same channels are ignored).
- glTF materials specified with the KHR_materials_unlit extension should be converted to X3D 4.0 UnlitMaterial. Note that baseColor/baseTexture are converted to X3D emissiveColor/emissiveTexture (we are inconsistent in naming here with glTF (base->emissive), because this is better: this color is really used like "emissive" and it allows X3DOneSidedMaterialNode to have emissive* fields that are inherited by all materials).
- glTF materials specified with the KHR_materials_pbrSpecularGlossiness extension cannot for now be reliably converted to a standard X3D node. You can handle them by converting to pbrMetallicRoughness coefficients, but this is far from perfect. My X3DLoadInternalGltf source code has some code to convert them, but only on CPU at loading (so textures with SpecularGlossiness coefficients are ignored, which breaks the look of some models).
  X3DOM has a slightly different variant of PhysicalMaterial that seems to account for specular/glossiness, judging from field names in the example on Binary meshes.
  glTF is backing off from KHR_materials_pbrSpecularGlossiness, recommending instead KHR_materials_specular.
- We plan to introduce in X3D PhysicalMaterial additional fields to support features consistent with glTF PBR extensions. There are 6 PBR extensions relevant now, see
Gamma Correction
While gamma correction is not something to take into account at conversion moment (you don't need to convert nodes/colors differently), it's something to take into account when rendering.
Gamma correction is necessary to get the same rendering results as glTF. X3D does not specify whether to do gamma correction.
- Analysis of what various implementations/specs do about gamma correction: https://github.com/michaliskambi/x3d-tests/wiki/Gamma-correction-in-X3D-and-glTF
- Documentation of gamma correction in Castle Game Engine: https://castle-engine.io/manual_gamma_correction.php
CGE does gamma correction by default only on PBR materials. This gives a good look for PBR materials from glTF, while keeping a backward-compatible look for Phong and unlit. To be precise:
- by default gamma correction is enabled for PhysicalMaterial (regardless of whether it comes from glTF or explicit X3D),
- by default it is disabled for Material and UnlitMaterial (again, regardless of whether it comes from glTF or explicit X3D).
This is not perfect (for 100% glTF compatibility one should enable it always, so also on UnlitMaterial -- yes it has an effect on how the emissive color field is processed). But this default seems best to do "what authors expect" while simultaneously "not break a lot of existing models" (we use UnlitMaterial a lot internally in CGE already, and the Phong Material is used in almost all existing VRML/X3D models).
For 100% glTF correctness, you should apply gamma correction to all possible glTF materials (so PBR and unlit). In CGE, the user can set GammaCorrection := gcAlways to achieve this.
X3DOM does gamma correction always, by default.
Future X3D specification may address this.
alphaMode
glTF allows specifying alphaMode, which forces the author to explicitly choose alpha treatment: opaque, blend, mask (alpha-test).
X3D 4 now includes the Appearance.alphaMode field to express it too.
I strongly encourage all browsers to implement it.
This is a great feature IMHO, because auto-detecting this unavoidably fails in some complicated situations. X3D 3 specification didn't say how to decide whether you use blending or alpha-testing. While some cases are easy to auto-detect (if Material.transparency is 0.75 then you probably want blending), other cases are harder (you need to analyze the texture contents to differentiate between a yes/no alpha channel and a smooth alpha channel; and what do you do when multiple textures (using MultiTexture or various texture slots) indicate different blending?).
In Castle Game Engine we also had an older solution to this, Appearance.alphaChannel. It is now deprecated in favor of the X3D 4 standard Appearance.alphaMode.
alphaCutoff
X3D 4 now includes the Appearance.alphaCutoff field to express it.
Note: X3DOM Appearance.alphaClipThreshold seems to provide a straightforward translation of this. (TODO: Not tested in X3DOM. Do linked X3DOM docs show good default (0.1)? CGE and glTF alphaCutoff is by default 0.5.)
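To make the interplay of alphaMode and alphaCutoff concrete, here is a small Python sketch of the per-fragment alpha treatment as glTF 2.0 describes it. The function name effective_alpha is invented for this example, and a real renderer would do this in a shader (discarding the fragment in the MASK case) rather than on the CPU.

```python
def effective_alpha(alpha, alpha_mode="OPAQUE", alpha_cutoff=0.5):
    """Return the alpha actually used for a fragment with the given source alpha.

    OPAQUE ignores alpha, MASK makes it binary using alpha_cutoff,
    BLEND keeps it for regular blending.
    """
    if alpha_mode == "OPAQUE":
        return 1.0
    if alpha_mode == "MASK":
        # Alpha-test: fully opaque at or above the cutoff, otherwise fully transparent.
        return 1.0 if alpha >= alpha_cutoff else 0.0
    if alpha_mode == "BLEND":
        return alpha
    raise ValueError("unknown alphaMode: " + str(alpha_mode))
```

With the default cutoff of 0.5, an alpha of 0.3 is fully transparent in MASK mode, while the same value passes unchanged in BLEND mode.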
Per-vertex colors
glTF mesh can contain a COLOR_0 attribute. This can be translated to X3D Color or ColorRGBA node (depending on whether accessor type is vec3 or vec4), with a caveat: the X3D Color or ColorRGBA nodes replace the color by default, while glTF attributes multiply it.
In Castle Game Engine we introduced mode to X3DColorNode to address this.
SFString [] mode "REPLACE" # allowed values: ["REPLACE","MODULATE"]
- "REPLACE" is the default, and is compatible with X3D 3.
- "MODULATE" means to multiply per-vertex colors (with the same value as was replaced by "REPLACE", like Material.diffuseColor or PhysicalMaterial.baseColor or UnlitMaterial.emissiveColor, with alpha added from XxxMaterial.transparency).
See https://castle-engine.io/x3d_implementation_rendering_extensions.php#section_ext_color_mode .
So when you import glTF, simply set mode to "MODULATE" on the Color / ColorRGBA node, to get the behavior required by glTF.
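The difference between the two modes comes down to simple arithmetic. The Python sketch below is a simplification with invented names: it treats every color as RGBA (so it corresponds to ColorRGBA; with a plain Color node only RGB would be affected) and derives the material alpha as 1 - transparency, as described above.

```python
def material_rgba(color, transparency=0.0):
    """Material color with alpha derived from the X3D transparency field."""
    return (color[0], color[1], color[2], 1.0 - transparency)


def shaded_base_color(vertex_rgba, mat_rgba, mode="REPLACE"):
    """Combine a per-vertex color with the material color."""
    if mode == "REPLACE":
        # X3D 3 behavior: the per-vertex color wins.
        return tuple(vertex_rgba)
    if mode == "MODULATE":
        # glTF behavior: component-wise multiplication.
        return tuple(v * m for v, m in zip(vertex_rgba, mat_rgba))
    raise ValueError("unknown mode: " + str(mode))
```

For a half-red vertex color over a half-intensity material, MODULATE darkens the result while REPLACE ignores the material color entirely.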
Texture coordinates
glTF says that vertical texture coordinates 0..1 go from top to bottom.
X3D, like OpenGL, says that vertical texture coordinates 0..1 go from bottom to top.
There are various possible ways to reconcile this.
- In Castle Game Engine I introduced the flipVertically field for this purpose. It is set to TRUE for all texture nodes created when importing glTF. This allows me to forget about this problem later (in shader code), and I don't need to process texture coordinates. I only need to flip the image vertically at loading, which can be done in zero time (because many graphic formats, like PNG, actually already store the data from bottom to top).
- X3DOM just flips the Y texture coordinate in the shader for PhysicalMaterial. This is simple, but it also assumes that PhysicalMaterial always comes from a glTF model. Which is not true for X3D 4, PhysicalMaterial "stands on its own" -- X3D authors may use it, independently from glTF.
- You could also use TextureTransform to achieve this, i.e. flip texture coordinates.
    appearance Appearance { textureTransform TextureTransform { translation 0 -1 scale 1 -1 } ...
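All three approaches amount to the mapping v' = 1 - v. The short Python sketch below checks that the TextureTransform values shown above produce exactly this flip. The function names are made up, and rotation and center are left out of the simplified texture_transform; it only relies on X3D applying the translation before the scale.

```python
def flip_v(uv):
    """Convert between glTF (top to bottom) and X3D (bottom to top) vertical coordinate."""
    u, v = uv
    return (u, 1.0 - v)


def texture_transform(uv, translation=(0.0, 0.0), scale=(1.0, 1.0)):
    """Simplified X3D TextureTransform: translation first, then scale."""
    u, v = uv
    return ((u + translation[0]) * scale[0], (v + translation[1]) * scale[1])


# With translation 0 -1 and scale 1 -1: v' = (v - 1) * -1 = 1 - v.
flipped = texture_transform((0.25, 0.25), translation=(0.0, -1.0), scale=(1.0, -1.0))
```

Here flipped equals flip_v((0.25, 0.25)), which is why the TextureTransform approach is equivalent to rewriting the coordinates at load time.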
Skinned mesh animation
TODO. Work in-progress.
- In Castle Game Engine we simply read the glTF skinned animation data, and "unpack" it at loading time, using CPU, into TimeSensor + CoordinateInterpolator. This means that at runtime, we just do CoordinateInterpolator animation, not skinned mesh animation anymore. This is not the final solution. Although in practice it works very nicely:
  - It is very efficient, even on large models (since CoordinateInterpolator is so simple, it's very nicely optimized, even though it means we update GPU vertex object every frame).
  - The loading time (when we calculate CoordinateInterpolator) isn't a practical problem.
  - It cooperates nicely with animation blending.
  - The bones can still be animated, to attach additional objects to bones, e.g. attach a weapon to the animated hand.
  Still, there are some big drawbacks:
  - You can no longer transform bones (just Transform nodes) to modify skin at runtime. I mean, you can move bones (translate, rotate) at runtime, but it has no effect on the skinned mesh, since its animation is now expressed as CoordinateInterpolator, and it's already calculated. So you cannot do procedural animation, e.g. you cannot do inverse kinematics at runtime. You can only play the animation that was designed.
  - The memory use of long-running animation is significant. As we precalculate positions, normal vectors, and (in case of bump mapping) tangent vectors for all keyframes, the memory usage is non-trivial when the animation is long and the model is high-poly. We have a log message when it is more than 10 MB.
  In CGE we have also implemented H-Anim, which is the X3D way of doing skinned mesh animation. However our implementation of H-Anim is not optimized. It moves bones at runtime, but on CPU (not GPU), and this is slow at runtime for non-trivial models. Contrary to glTF, it is not obvious how to implement H-Anim on GPU. Likely we should be able to calculate "inverse bind matrices" from H-Anim nodes (thus essentially converting H-Anim -> glTF animation) and then follow the glTF skinned animation approach to make it suitable for GPU.
  The future:
  - We must convert glTF skinned mesh animation into some X3D nodes. Either H-Anim nodes (with additional information to preserve "inverse bind matrices", not calculate them again, when not needed), or some new node like SkinnedAnimation (that would be designed to match the glTF approach easily). These nodes should allow straightforward conversion from glTF, and efficient playback of animation on GPU.
    glTF animation data already leans extremely nicely toward GPU calculation, we definitely want to use it. I.e. the pipeline "glTF -> X3D nodes -> rendering" must preserve the "inverse bind matrices" information and nice GPU-friendly layout. It would be bad if in the middle of this pipeline we had to lose and then recalculate the data to make it efficient.
  - See Efficient skinned animation on GPU using new SkinnedAnimation node roadmap point.
  - Eventually we want to also speed up the existing H-Anim implementation. Whether this happens depends on user needs.
    If any major 3D authoring software becomes capable of exporting to H-Anim, then it will have higher priority. Otherwise it may remain at lower priority, I'm afraid. By major 3D authoring software I mean this trio: Blender, 3ds Max, Maya (following Unity and Unreal and CGE and Babylon docs). Right now none of them has any support for exporting to H-Anim, as far as I know.
  As an optimization, in CGE we also use Shape.collision to make the animated shape collide as a bbox, and animated Shape.bbox to make it reflect the current animation. This makes animation faster (no need to recalculate bounding boxes when shape is changing).
- In X3DOM, Andreas Plesch started investigating how to convert glTF skinned animation into H-Anim. It isn't finished (and so is not yet actually implemented in X3DOM), but should be a great starting point to resume. Thank you for documenting it! The links:
Punctual lights
glTF punctual lights mostly map nicely to X3D PointLight, SpotLight, DirectionalLight.
The lighting equations follow X3D, basically "sum material.emissive + contribution for each light".
- radius in glTF has a different falloff than in X3D. TODO: we should have a way to express this in X3D. For now, just copying radius from glTF to X3D is good enough.
  In glTF radius = 0 means "no radius (infinity)", while in X3D radius = 0 means literally "zero radius (does not affect anything)". In Castle Game Engine I added an extension to interpret radius = -1 as infinity, so when reading glTF I just do: if glTF radius = 0 then X3D radius := -1 else X3D radius := glTF radius.
- ambientIntensity of the light can stay 0. glTF has nothing equivalent. And ambientIntensity has no effect on PhysicalMaterial now.
- attenuation: glTF attenuation is not like X3D, although it can be approximated. See https://github.com/KhronosGroup/glTF/blob/master/extensions/2.0/Khronos/KHR_lights_punctual/README.md#range-property , see getRangeAttenuation implementation in https://github.com/KhronosGroup/glTF-Sample-Viewer/blob/master/src/shaders/punctual.glsl .
  For now, I found it simplest to just set X3D attenuation to 0 0 1 to achieve a realistic light similar to glTF.
- spot: glTF innerConeAngle and outerConeAngle are equivalent to X3D beamWidth and cutOffAngle.
  However glTF specifies a different (non-linear) falloff. X3D has linear falloff, glTF has a more complicated definition https://github.com/KhronosGroup/glTF/blob/master/extensions/2.0/Khronos/KHR_lights_punctual/README.md#inner-and-outer-cone-angles , the sample implementation just implements it with GLSL smoothstep.
  TODO: Possibly we should upgrade X3D SpotLight to also do this by default?
- glTF intensity and color translate to the same properties in X3D. Yeah, they are just multiplied later.
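Two of the points above, sketched in Python. The function names are made up for this example. x3d_radius mirrors the "radius = -1 means infinity" rule, which is a CGE extension and not standard X3D. gltf_spot_factor follows the smoothstep-based cone falloff used by the glTF sample viewer (angles in radians, measured from the spot direction); it is an illustration of the glTF behavior, not the X3D linear falloff.

```python
import math


def x3d_radius(gltf_radius):
    """Map glTF radius to X3D radius, using the CGE convention -1 = infinity."""
    return -1.0 if gltf_radius == 0 else gltf_radius


def smoothstep(edge0, edge1, x):
    """Same definition as the GLSL smoothstep function."""
    t = min(max((x - edge0) / (edge1 - edge0), 0.0), 1.0)
    return t * t * (3.0 - 2.0 * t)


def gltf_spot_factor(angle, inner_cone_angle, outer_cone_angle):
    """Spot light intensity factor (0..1) for a point at `angle` from the spot axis."""
    actual_cos = math.cos(angle)
    outer_cos = math.cos(outer_cone_angle)
    inner_cos = math.cos(inner_cone_angle)
    if actual_cos <= outer_cos:
        return 0.0  # outside the outer cone (X3D cutOffAngle)
    if actual_cos >= inner_cos:
        return 1.0  # inside the inner cone (X3D beamWidth)
    return smoothstep(outer_cos, inner_cos, actual_cos)
```

Points inside the inner cone get full intensity, points outside the outer cone get none, and the smoothstep gives a soft, non-linear transition in between.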
Image-based lighting
I made an initial implementation and a sketch of the specification of the X3D EnvironmentLight node to express this. It is not 100% ready yet (neither the spec nor the implementation in CGE). Hopefully we will add it in a future X3D version :)