Sanitizing - novoid/lazyblorg GitHub Wiki

The following table shows how Org-mode elements are sanitized during HTMLization, as implemented in /lib/htmlizer.py: sanitize_and_htmlize_blog_content(…).

HTML chars
HTML characters
  • see sanitize_html_characters(…)
  • replaces <, >, &, and m-dash with their HTML representations
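The replacement described above can be sketched roughly as follows. This is a minimal illustration, not the code from htmlizer.py: the function name carries a `_sketch` suffix to mark it as hypothetical, and it assumes that "m-dash" means the Unicode character U+2014.

```python
def sanitize_html_characters_sketch(text):
    """Escape &, <, > and the em dash so the text is safe inside HTML."""
    # Ampersands go first, otherwise the "&" of "&lt;" would be escaped again.
    text = text.replace("&", "&amp;")
    text = text.replace("<", "&lt;")
    text = text.replace(">", "&gt;")
    # Assumption: the m-dash is the Unicode em dash, U+2014.
    text = text.replace("\u2014", "&mdash;")
    return text


print(sanitize_html_characters_sketch("a < b & c > d"))
# prints: a &lt; b &amp; c &gt; d
```

Note that a blanket replacement like this also touches ampersands inside URLs, which is what the "URL Ampersands" step further down has to repair.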
Int. L.
Internal Links
  • see sanitize_internal_links(…)
  • Replaces all internal Org-mode links of type id:foo or [[id:foo][bar baz]] with the relative paths to the corresponding blog articles
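One possible way to picture the internal link rewriting is sketched below. Everything here is an assumption for illustration: the function name, the `id_to_path` dictionary that maps Org-mode IDs to relative paths, the example paths, and the choice to use the ID itself as link text for bare id:foo links. The real method may resolve paths and link texts differently.

```python
import re

# [[id:foo][bar baz]] -> group 1 is the ID, group 2 the description
DESCRIBED_ID_LINK = re.compile(r"\[\[id:([^\]]+)\]\[([^\]]+)\]\]")
# bare id:foo, not preceded by "[" or a word character
BARE_ID_LINK = re.compile(r"(?<![\[\w])id:([\w-]+)")


def sanitize_internal_links_sketch(text, id_to_path):
    """Replace Org-mode id: links with HTML links to relative paths.

    id_to_path is a hypothetical dict {org_id: relative_path};
    unknown IDs raise a KeyError in this sketch.
    """
    def described(match):
        return '<a href="{}">{}</a>'.format(id_to_path[match.group(1)], match.group(2))

    def bare(match):
        # Assumption: a bare link shows its ID as the link text.
        return '<a href="{}">{}</a>'.format(id_to_path[match.group(1)], match.group(1))

    # Described links first, so the bare pattern never sees their "id:" part.
    text = DESCRIBED_ID_LINK.sub(described, text)
    return BARE_ID_LINK.sub(bare, text)


paths = {"2014-01-27-foo": "../2014/01/27/foo", "bar": "../bar"}
print(sanitize_internal_links_sketch("See [[id:2014-01-27-foo][my post]] and id:bar.", paths))
```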
Ext. L.
External Links
  • see sanitize_external_links(…)
  • Replaces all external Org-mode links of type [[foo][bar]] with <a href="foo">bar</a> and rewrites plain URLs as HTML links as well.
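The bracket-link part of this step could look roughly like the sketch below. It is an illustration under assumptions, not the lazyblorg implementation: the function name is hypothetical, the link target is assumed to be an http(s) URL, and the rewriting of plain URLs is left out to keep the example short.

```python
import re

# Assumption: external links have the form [[http(s)://target][description]].
EXTERNAL_BRACKET_LINK = re.compile(r"\[\[(https?://[^\]]+)\]\[([^\]]+)\]\]")


def sanitize_external_links_sketch(text):
    """Turn [[url][description]] into <a href="url">description</a>."""
    return EXTERNAL_BRACKET_LINK.sub(r'<a href="\1">\2</a>', text)


print(sanitize_external_links_sketch("See [[http://example.com][my site]]."))
# prints: See <a href="http://example.com">my site</a>.
```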
Text Format
Transforms simple text formatting syntax into HTML tags
URL Ampersands
fixing something I broke above
  • see fix_ampersands_in_url(…)
  • sanitize_html_characters(…) (mentioned above) is really dumb and replaces ampersands in URLs as well. This method finds those broken URLs and fixes them.
  • If this approach of repairing something that should have been done correctly in the first place smells funny, you are right. However, it seemed to be the more efficient way to implement it. Fix it, if you like :-)
  • NOTE: Does not replace multiple ampersands within the same URL. However, this use case of several ampersands in one URL is very rare.
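A repair step of this kind might be sketched as follows. This is not the existing fix_ampersands_in_url(…): the function name and the URL pattern are assumptions, and unlike the method described above, this variant restores every escaped ampersand inside a URL rather than only one.

```python
import re

# Assumption: a URL starts with http(s):// and runs until whitespace,
# a quote character, or an angle bracket.
URL_PATTERN = re.compile(r"https?://[^\s\"'<>]+")


def fix_ampersands_in_url_sketch(text):
    """Undo the &amp; escaping inside URLs, leaving other text untouched."""
    return URL_PATTERN.sub(lambda m: m.group(0).replace("&amp;", "&"), text)


escaped = "Tom &amp; Jerry: http://example.com/?a=1&amp;b=2&amp;c=3"
print(fix_ampersands_in_url_sketch(escaped))
# prints: Tom &amp; Jerry: http://example.com/?a=1&b=2&c=3
```

Using a replacement function per matched URL is what allows several ampersands in one URL to be handled in a single pass.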
Pandoc
Org-mode to HTML conversion using pandoc
  • I introduced pandoc as a fall-back for converting Org-mode elements that are not yet supported. This turned out very well: great performance, great results. I might even think about moving the self-implemented HTMLization to pandoc.
  • I am using the Python package pypandoc.
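For orientation, a conversion with pypandoc can be as short as the sketch below. The wrapper function name is hypothetical and this is not necessarily how lazyblorg calls the package. Running it requires both the pypandoc package and a pandoc binary to be installed.

```python
def org_to_html_via_pandoc_sketch(org_text):
    """Convert an Org-mode snippet to HTML5 using pandoc.

    pypandoc is imported inside the function so that this module can be
    loaded even on systems where pypandoc is not installed.
    """
    import pypandoc

    # convert_text(source, to, format): "org" is pandoc's Org-mode reader,
    # "html5" its HTML5 writer.
    return pypandoc.convert_text(org_text, "html5", format="org")


if __name__ == "__main__":
    print(org_to_html_via_pandoc_sketch("| a | b |\n|---+---|\n| 1 | 2 |\n"))
```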
Templates involved
lazyblorg templates that are involved in the HTMLization process
Element HTML chars Int. L. Ext. L. Text Format URL Ampersands Pandoc Templates involved
Paragraph x x x x x #PAR-CONTENT#
Horizontal ruler
Heading x x x x x #SECTION-TITLE#, #SECTION-LEVEL#
List items x x x x x #CONTENT#
HTML block x #NAME#
Verse block x x x x #NAME#
Example block x #NAME#
Colon block x #NAME#
Quote block x x x x #NAME#
Src block x #NAME#
Table x x
LaTeX block x x
Others x x