Comparison - sathishks/pdf2htmlEX GitHub Wiki
This page compares common approaches to presenting PDF files online.

| | Convert to HTML 5 | Parse by JS | Convert to image | Convert to HTML 4 | Adobe PDF plugin | Other plugins |
|---|---|---|---|---|---|---|
| Example | pdf2htmlEX | PDF.js | pdftoppm (Poppler), Google Docs | pdftohtml (Poppler) | Adobe PDF Plugin | N/A |
| Briefing | PDF elements are converted into the corresponding or closest HTML elements | The PDF file is loaded, parsed and rendered by JavaScript | PDF pages are converted into images and shown in web pages | Similar to “Convert to HTML 5”, but with far fewer features | Official plugin | Unofficial PDF plugins, Flash-based plugins or others |
| Open source | Yes | Some (PDF.js) | Poppler is open source. Google Docs may be based on Poppler as well, since both show the same errors. | Some (pdftohtml) | No | Maybe |
| Free | Yes | Some | Some | Some | Yes | Some |

Note: Free and/or open source tools exist for every approach except the Adobe PDF plugin.
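
The tools named in the Example row are command-line programs. As a rough sketch of how the three converters might be driven from a script, the Python below only builds the argument lists and runs whichever tools happen to be installed. The file names and the helpers `build_commands` and `run_available` are hypothetical, and the options shown (`-r` and `-png` for pdftoppm) should be checked against each tool's own help output before relying on them.

```python
import shutil
import subprocess


def build_commands(pdf_path, out_prefix, dpi=150):
    """Return one argument list per converter (hypothetical helper)."""
    return {
        # HTML 5 output: input PDF followed by the output file name
        "pdf2htmlEX": ["pdf2htmlEX", pdf_path, f"{out_prefix}.html"],
        # Image output: one PNG per page, rendered at the given resolution
        "pdftoppm": ["pdftoppm", "-r", str(dpi), "-png", pdf_path, out_prefix],
        # HTML 4 output: input PDF followed by the output base name
        "pdftohtml": ["pdftohtml", pdf_path, f"{out_prefix}-html4"],
    }


def run_available(commands):
    """Run only the tools found on PATH and return the names that ran."""
    ran = []
    for name, argv in commands.items():
        if shutil.which(argv[0]) is None:
            continue  # tool not installed, skip it
        subprocess.run(argv, check=True)
        ran.append(name)
    return ran


if __name__ == "__main__":
    # "input.pdf" and "output" are placeholder names
    print(run_available(build_commands("input.pdf", "output")))
```

Each conversion is a one-time server-side step; the resulting files can then be served statically.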

| | Convert to HTML 5 (pdf2htmlEX) | Parse by JS | Convert to image | Convert to HTML 4 | Adobe PDF plugin | Other plugins |
|---|---|---|---|---|---|---|
| Processing (server-side) | Moderate, one-time | None | Slow, one-time | Fast, one-time | None | None, usually |
| Loading (client-side) | Fast | Fast | Slow | Fast | Fast | Fast |
| Rendering (client-side) | Fast | Slow | Fast | Fast | Fast | Fast, usually |
| Network cost | Small <sup>1</sup> | Small | Large <sup>2</sup> | Small | Small | Small |

1: HTTP compression must be enabled

2: Can be huge if a higher resolution is needed
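
Both footnotes can be illustrated with some quick arithmetic. The sketch below, under stated assumptions, computes the uncompressed size of one page rendered as an RGB bitmap at several resolutions, and measures how well gzip compresses repetitive markup. The helpers `raw_page_bytes` and `gzip_ratio` are hypothetical, and `sample_html` is a synthetic stand-in, not real pdf2htmlEX output.

```python
import gzip


def raw_page_bytes(width_in, height_in, dpi, bytes_per_pixel=3):
    """Uncompressed size of one page rendered as an RGB bitmap."""
    return round(width_in * dpi) * round(height_in * dpi) * bytes_per_pixel


def gzip_ratio(data):
    """Compressed size divided by original size (lower is better)."""
    return len(gzip.compress(data)) / len(data)


# Synthetic stand-in for converter output (an assumption, not real pdf2htmlEX HTML)
sample_html = "".join(
    f'<div class="t m0 x{i % 7} h2 y{i} ff1 fs0">word {i}</div>\n'
    for i in range(2000)
).encode("utf-8")


if __name__ == "__main__":
    # US Letter page: 8.5 x 11 inches
    for dpi in (72, 150, 300):
        print(dpi, "dpi:", raw_page_bytes(8.5, 11, dpi), "bytes per page")
    print("gzip ratio:", round(gzip_ratio(sample_html), 3))
```

Doubling the resolution quadruples the pixel count. Real image formats such as PNG or JPEG compress the bitmap, so transferred sizes are smaller than these raw figures, but they grow with resolution in the same way.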

| | Convert to HTML 5 (pdf2htmlEX) | Parse by JS | Convert to image | Convert to HTML 4 | Adobe PDF plugin | Other plugins |
|---|---|---|---|---|---|---|
| HTML 5 | Yes | Yes, usually | No | No | No | No |
| CSS | Yes | Yes | No | Yes | No | No |
| JavaScript | No | Yes | No | No | No | No |
| Third-party plugin | No | No | No | No | Yes | Yes |

| | Convert to HTML 5 (pdf2htmlEX) | Parse by JS | Convert to image | Convert to HTML 4 | Adobe PDF plugin | Other plugins |
|---|---|---|---|---|---|---|
| Full PDF features? | No, but usually enough | Maybe | Yes | No | Yes | Maybe |
| Text extraction (select/copy/search) | Yes | Yes, with text layer | No, usually <sup>1</sup> | Yes | Yes | Maybe |
| Embedded fonts | Yes | Yes | Yes | No | Yes | Yes, usually |
| Links | Yes | Yes | No, usually <sup>2</sup> | Yes | Yes | Maybe |
| Accurate rendering (layout/spacing) | Yes, usually <sup>3</sup> | Yes | Yes | No | Yes | Yes, usually |
| Read while loading | Yes | Yes | Yes | Yes | No | Maybe |

1: Text extraction can be supported by adding a text layer

2: Links can be handled with JavaScript

3: Some PDF elements cannot be converted into HTML losslessly
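
To make footnotes 1 and 2 concrete, here is a minimal sketch of how an image-based page could regain selectable text and clickable links. It generates HTML that stacks transparent, absolutely positioned text spans over the page image. Note that links are done here with plain overlaid `<a>` elements rather than the JavaScript mentioned in footnote 2, which is a swapped-in simplification. The function `page_with_overlays` and its input tuples are hypothetical; word and link coordinates are assumed to come from some earlier extraction step.

```python
from html import escape


def page_with_overlays(image_url, width, height, words, links=()):
    """Build HTML for one page image with a text layer and link rectangles.

    words: iterable of (text, x, y, font_px)
    links: iterable of (href, x, y, w, h)
    All coordinates are in CSS pixels relative to the page's top-left corner.
    """
    parts = [
        f'<div class="page" style="position:relative;'
        f'width:{width}px;height:{height}px">',
        f'<img src="{escape(image_url, quote=True)}" '
        f'width="{width}" height="{height}" alt="">',
    ]
    # Invisible but selectable text positioned over the rendered glyphs
    for text, x, y, size in words:
        parts.append(
            f'<span style="position:absolute;left:{x}px;top:{y}px;'
            f'font-size:{size}px;color:transparent">{escape(text)}</span>'
        )
    # Empty anchors covering the link areas of the page image
    for href, x, y, w, h in links:
        parts.append(
            f'<a href="{escape(href, quote=True)}" style="position:absolute;'
            f'left:{x}px;top:{y}px;width:{w}px;height:{h}px"></a>'
        )
    parts.append("</div>")
    return "\n".join(parts)


if __name__ == "__main__":
    print(page_with_overlays(
        "page1.png", 612, 792,
        words=[("Hello & goodbye", 72, 90, 12)],
        links=[("https://example.com/", 72, 120, 140, 14)],
    ))
```

The text layer only lines up with the image when the font size and positions match the rendered glyphs closely, which is why this approach is usually an approximation.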

| | Convert to HTML 5 (pdf2htmlEX) | Parse by JS | Convert to image | Convert to HTML 4 | Adobe PDF plugin | Other plugins |
|---|---|---|---|---|---|---|
| Customizable UI/Theme | Yes | Yes | Yes | Yes | No | No, usually <sup>1</sup> |
| Extensible | Yes | Yes | Yes | Yes | No | Maybe <sup>2</sup> |

1: Some plugins have commercially licensed versions with a customizable UI

2: Some plugins provide an API