Accents, DIN 91379, non Latin scripts - LibrePDF/OpenPDF GitHub Wiki
To process text containing letters composed of multiple Unicode code points, e.g. letters with combining accents, it is necessary to compute the correct positions of the glyphs and encode these positions in the resulting PDF file. For complex scripts, glyph substitution and reordering are also necessary.
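The following minimal sketch uses only the JDK (no OpenPDF classes) to illustrate why positioning matters: some accented letters have a precomposed Unicode character, while others exist only as a base letter followed by a combining mark. The class name `CombiningSequences` and its methods are hypothetical names chosen for this illustration.

```java
import java.text.Normalizer;

final class CombiningSequences {

    /** Number of Unicode code points in the text (not UTF-16 chars). */
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    /** Number of code points that remain after canonical composition (NFC). */
    static int codePointsAfterNfc(String text) {
        return codePoints(Normalizer.normalize(text, Normalizer.Form.NFC));
    }

    public static void main(String[] args) {
        String eAcute = "e\u0301";       // e + COMBINING ACUTE ACCENT
        String aDoubleAcute = "A\u030B"; // A + COMBINING DOUBLE ACUTE ACCENT

        // NFC can replace e + acute by the precomposed U+00E9.
        System.out.println(codePointsAfterNfc(eAcute));
        // Unicode has no precomposed A with double acute, so the sequence stays
        // and the accent must be positioned over the base glyph by the layout engine.
        System.out.println(codePointsAfterNfc(aDoubleAcute));
    }
}
```

Sequences of the second kind cannot be fixed by normalization alone, which is why the font's positioning information is needed.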
OpenPDF can process such texts starting with release 1.3.24.
This page describes the usage for release 3.0.4 or newer with GlyphLayoutManager.
For release 3.0.3 using (the now deprecated) LayoutProcessor see Accents, DIN 91379, non Latin scripts (2026-04-02),
for older releases see Accents, DIN 91379, non Latin scripts (2025-06-06).
Internally, OpenPDF uses the built-in Java2D routines for glyph layout, reordering and substitution.
Since Java 9 these routines rely on the HarfBuzz shaping library.
We tested this approach with letters conforming to "DIN 91379: Characters and defined character sequences in Unicode for the electronic processing of names and data exchange in Europe, with CD-ROM" (and the predecessor DIN SPEC 91379) which describes a subset of Unicode consisting mainly of Latin letters and diacritic signs. This standard is mandatory for the data exchange of the German administration with citizens and businesses since Nov. 2024.
Processing text in other languages, including bidirectional and complex scripts, is possible with this approach. You are invited to try it and share the results.
GlyphLayoutManager is enabled per Document and has no static state, so it is designed to work in a
multithreaded environment that processes multiple documents in separate threads.
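As a rough sketch of that usage pattern, the code below runs one task per document on a thread pool, and each task creates its own state object. `LayoutState` is a hypothetical stand-in written for this illustration, not an OpenPDF class. In real code, each task would create its own GlyphLayoutManager and Document instead.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class PerDocumentThreads {

    /** Hypothetical stand-in for per-document layout state; not an OpenPDF class. */
    static final class LayoutState {
        private final List<String> loadedFonts = new ArrayList<>();

        void loadFont(String name) {
            loadedFonts.add(name);
        }

        int fontCount() {
            return loadedFonts.size();
        }
    }

    /** Builds one "document" with a fresh state object; nothing is shared between calls. */
    static int buildDocument(String fontName) {
        LayoutState state = new LayoutState();
        state.loadFont(fontName);
        return state.fontCount();
    }

    /** Runs one task per document on a small thread pool and collects the results in order. */
    static List<Integer> buildAll(List<String> fontNames) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> futures = new ArrayList<>();
            for (String name : fontNames) {
                Callable<Integer> task = () -> buildDocument(name);
                futures.add(pool.submit(task));
            }
            List<Integer> results = new ArrayList<>();
            for (Future<Integer> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("Document task failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because every task owns its state, each result depends only on that task's input, regardless of how the threads are scheduled.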
Provide an OpenType font containing the necessary characters and positioning information, see below for some open source fonts. If no OpenType font is provided, GlyphLayoutManager will throw an exception.
import org.openpdf.text.pdf.GlyphLayoutManager;
...
float fontSize = 12.0f;
GlyphLayoutManager glyphLayoutManager = new GlyphLayoutManager();
// The OpenType fonts loaded with glyphLayoutManager.loadFont() are
// available for glyph layout. Only these fonts can be used.
String fontDir = "org/openpdf/examples/fonts/";
Font sans = glyphLayoutManager.loadFont(fontDir + "noto/NotoSans-Regular.ttf", fontSize);
You can also load the font from an input source. You have to supply a name for loading the font that ends with ".ttf" or ".otf".
import org.openpdf.text.pdf.GlyphLayoutManager;
...
float fontSize = 12.0f;
GlyphLayoutManager glyphLayoutManager = new GlyphLayoutManager();
// Provide the input source
inputSource = ... ;
Font sans = glyphLayoutManager.loadFont("NotoSans-Regular.ttf", inputSource, fontSize);
If an error occurs while loading a font, a GlyphLayoutFontManager.FontLoadException is thrown.
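Since the name passed along with an input source has to end with ".ttf" or ".otf", it can be convenient to check it before calling loadFont. The helper below is a hypothetical sketch, not part of OpenPDF. It matches the suffix case-sensitively, following the wording above. Whether the library also accepts upper-case suffixes is not stated here.

```java
final class FontNameCheck {

    /** Returns true if the name ends with ".ttf" or ".otf" (case-sensitive). */
    static boolean hasOpenTypeSuffix(String fontName) {
        return fontName != null
                && (fontName.endsWith(".ttf") || fontName.endsWith(".otf"));
    }

    public static void main(String[] args) {
        System.out.println(hasOpenTypeSuffix("NotoSans-Regular.ttf"));
        System.out.println(hasOpenTypeSuffix("NotoSans-Regular"));
    }
}
```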
You enable advanced glyph layout by registering the glyphLayoutManager with the Document.
try (Document document = new Document().setGlyphLayoutManager(glyphLayoutManager)) {
    // proceed as usual
}
You can also use the following form:
try (Document document = new Document()) {
    document.setGlyphLayoutManager(glyphLayoutManager);
    PdfWriter writer = PdfWriter.getInstance(document, Files.newOutputStream(Paths.get(fileName)));
    document.open();
    document.add(new Chunk("A̋ C̀ C̄ C̆ C̈ C̕ C̣ C̦ C̨̆ D̂ F̀ F̄ G̀ H̄ H̦ H̱ J́ J̌ K̀ K̂ K̄ K̇ K̕ K̛ K̦ K͟H K͟h", serif));
    // ...
}
Optionally you can set the default GlyphLayoutManager font options before loading the fonts.
These options are used for all fonts loaded with this GlyphLayoutManager.
GlyphLayoutManager glyphLayoutManager =
        new GlyphLayoutManager().setDefaultFontOptions(new FontOptions().setKerningOn().setLigaturesOn());
If you want to use different options, you can set the font options per font while loading the font.
Font serifKerning = glyphLayoutManager.loadFont(fontDir + "noto/NotoSerif-Regular.ttf",
        fontSize, new FontOptions().setKerningOn());
Font serifLigatures = glyphLayoutManager.loadFont(fontDir + "noto/NotoSerif-Regular.ttf",
        fontSize, new FontOptions().setLigaturesOn());
Font serifKerningLigatures = glyphLayoutManager.loadFont(fontDir + "noto/NotoSerif-Regular.ttf",
        fontSize, new FontOptions().setKerningOn().setLigaturesOn());
GlyphLayoutManager.loadFont throws GlyphLayoutFontManager.FontLoadException if the font cannot be loaded
or is not an OpenType font.
The constructor of GlyphLayoutManager throws an IllegalStateException
if LayoutProcessor is enabled. Don't use the deprecated LayoutProcessor!
If a font is used that has not been loaded with GlyphLayoutManager.loadFont, an UnsupportedOperationException
is thrown. All fonts have to be loaded with GlyphLayoutManager.loadFont.
LayoutProcessor is the predecessor of GlyphLayoutManager and is now deprecated. Use GlyphLayoutManager instead.
GlyphLayoutManager and FopGlyphProcessor can't be used together; you have to choose one of them.
If you use GlyphLayoutManager, FopGlyphProcessor is switched off using document.setGlyphSubstitutionEnabled(false).
This call disables FopGlyphProcessor; its functionality, such as glyph substitution, is then provided by GlyphLayoutManager.
In addition to the default process for OpenPDF-html, you have to create a GlyphLayoutManager,
load the fonts with GlyphLayoutManager, and register the GlyphLayoutManager with the ITextRenderer.
public void test() throws Exception {
    var htmlFilename = "GlyphLayoutHtmlExample.html";
    var inputStream = this.getClass().getResourceAsStream(htmlFilename);
    var documentBuilder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
    var document = documentBuilder.parse(inputStream);
    var glyphLayoutManager = new GlyphLayoutManager();
    var fontResolver = new ITextFontResolver();
    loadFont(glyphLayoutManager, fontResolver, "Arimo-Regular.ttf", 12.0f,
            "fonts/arimo/Arimo-Regular.ttf");
    loadFont(glyphLayoutManager, fontResolver, "Arimo-Bold.ttf", 12.0f,
            "fonts/arimo/Arimo-Bold.ttf");
    var pdfFilename = "GlyphLayoutHtmlExample.pdf";
    try (var outputStream = new FileOutputStream(pdfFilename)) {
        var renderer = new ITextRenderer(fontResolver);
        renderer.setDocument(document);
        renderer.setGlyphLayoutManager(glyphLayoutManager);
        renderer.layout();
        renderer.createPDF(outputStream);
    }
    System.out.println("PDF created: " + pdfFilename);
}
private void loadFont(GlyphLayoutManager glyphLayoutManager, ITextFontResolver fontResolver, String fontName,
        float fontSize, String fontResourcePath) throws IOException, GlyphLayoutFontManager.FontLoadException {
    var fontUrl = this.getClass().getResource(fontResourcePath);
    Objects.requireNonNull(fontUrl, "Font not found: " + fontResourcePath);
    try (var fontStream = fontUrl.openStream()) {
        var font = glyphLayoutManager.loadFont(fontName, fontStream, fontSize);
        fontResolver.addFont(font.getBaseFont(), fontUrl.getFile(), null);
    }
}
This example shows the correct rendering for all letters from DIN 91379.
Code: GlyphLayoutDin91379.java
Result: GlyphLayoutDin91379.pdf
Java's Bidi class is used to deduce the text direction for each chunk of text.
Setting the direction per font is possible, but it should not be necessary.
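To give an idea of what such direction detection looks like, here is a small JDK-only sketch. It assumes the class meant above is `java.text.Bidi`. It does not show how OpenPDF calls it internally, and `TextDirection` is a hypothetical name for this illustration.

```java
import java.text.Bidi;

final class TextDirection {

    /** Classifies text as "LTR", "RTL" or "MIXED" using java.text.Bidi. */
    static String classify(String text) {
        // The base direction is taken from the first strongly directional character.
        Bidi bidi = new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT);
        if (bidi.isLeftToRight()) {
            return "LTR";
        }
        if (bidi.isRightToLeft()) {
            return "RTL";
        }
        return "MIXED";
    }

    /** Number of directional runs found by the bidirectional algorithm. */
    static int runCount(String text) {
        return new Bidi(text, Bidi.DIRECTION_DEFAULT_LEFT_TO_RIGHT).getRunCount();
    }

    public static void main(String[] args) {
        String hebrew = "\u05E9\u05DC\u05D5\u05DD"; // Hebrew word "shalom"
        System.out.println(classify("Hello"));
        System.out.println(classify(hebrew));
        System.out.println(classify("Hello " + hebrew));
    }
}
```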
You can load the font from an input stream.
Optionally you can specify kerning and ligatures per document.
Optionally you can specify kerning and ligatures per font.
GlyphLayoutKernLigaPerFont.java
GlyphLayoutKernLigaPerFont.pdf
Show letters and symbols from the Unicode Supplementary Multilingual Plane,
- DIN 91379 (English Wikipedia)
- DIN 91379 (German Wikipedia)
- DIN 91379 Characters and Sequences (GitHub)
- DIN 91379:2022-08: Characters and defined character sequences in Unicode for the electronic processing of names and data exchange in Europe, with CD-ROM (access chargeable)
- Decision of IT Planungsrat 2022/51 (in German)
- HarfBuzz text shaping library
- HarfRust, HarfBuzz port to Rust