XWPFConverterXHTML - opensagres/xdocreport GitHub Wiki
org.apache.poi.xwpf.converter.xhtml provides the DOCX to XHTML converter based on Apache POI XWPF.
You can test this converter with the REST Converter service http://xdocreport-converter.opensagres.cloudbees.net/
Download this converter with:
- Maven:
 
<dependency>
  <groupId>fr.opensagres.xdocreport</groupId>
  <artifactId>org.apache.poi.xwpf.converter.xhtml</artifactId>
  <version>XDOCREPORT_VERSION</version>
</dependency>
where XDOCREPORT_VERSION is the XDocReport version (e.g. 1.0.0).
- download the docx.converters-xxx-sample.zip
 
Here is a sample that converts an org.apache.poi.xwpf.usermodel.XWPFDocument to XHTML:
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import org.apache.poi.xwpf.converter.xhtml.XHTMLConverter;
import org.apache.poi.xwpf.converter.xhtml.XHTMLOptions;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
...
// 1) Load DOCX into XWPFDocument
InputStream in = new FileInputStream(new File("HelloWord.docx"));
XWPFDocument document = new XWPFDocument(in);
// 2) Prepare XHTML options (here we set the `ImageManager` to store images and resolve the image src;
//    root is the path of the base output directory)
XHTMLOptions options = XHTMLOptions.create().setImageManager( new ImageManager( new File(root), "images" ) );
// 3) Convert XWPFDocument to XHTML
OutputStream out = new FileOutputStream(new File("HelloWord.htm"));
XHTMLConverter.getInstance().convert(document, out, options);
// 4) Release the file handles
out.close();
in.close();
If your DOCX contains images and you want to display them in the HTML, configure an ImageManager on the XHTMLOptions:
options.setImageManager( new ImageManager( new File(baseDir), "images" ) );
By default, the ImageManager will:
- extract images under baseDir/imageSubDir/
- resolve the image src attribute in the HTML
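The two default behaviors listed above can be pictured with a small sketch that uses only the Java standard library. It is not the XDocReport ImageManager: the class ImageStoreSketch, its extract and resolve methods, and the file name image1.png are hypothetical names chosen for illustration, and the real class may behave differently in its details.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

// Hypothetical stand-in for illustration only, NOT the XDocReport ImageManager.
class ImageStoreSketch {
    private final File baseDir;
    private final String imageSubDir;

    ImageStoreSketch(File baseDir, String imageSubDir) {
        this.baseDir = baseDir;
        this.imageSubDir = imageSubDir;
    }

    // Writes the image bytes to baseDir/imageSubDir/fileName and returns that file.
    File extract(String fileName, byte[] data) throws IOException {
        File dir = new File(baseDir, imageSubDir);
        Files.createDirectories(dir.toPath());
        File target = new File(dir, fileName);
        Files.write(target.toPath(), data);
        return target;
    }

    // Returns the value for the img src attribute, relative to baseDir.
    String resolve(String fileName) {
        return imageSubDir + "/" + fileName;
    }

    public static void main(String[] args) throws IOException {
        File baseDir = Files.createTempDirectory("xhtml-out").toFile();
        ImageStoreSketch store = new ImageStoreSketch(baseDir, "images");
        File written = store.extract("image1.png", new byte[] {1, 2, 3});
        System.out.println("Stored at: " + written.getPath());
        System.out.println("<img src=\"" + store.resolve("image1.png") + "\" />");
    }
}
```

Because the src is relative to baseDir, the generated HTML file has to be written into baseDir for the image links to resolve in a browser.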
For a sample, see our JUnit test XHTMLConverterTestCase.
If you want to embed images into the HTML using base64, you can use Base64EmbedImgManager:
XHTMLOptions options = XHTMLOptions.create().indent( 4 ).setImageManager(new Base64EmbedImgManager());
You can find a full example here: XHTMLConverterEmbedImgTest
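
To show what embedding an image as base64 means in the resulting HTML, here is a minimal sketch using only the Java standard library. It makes no claim about how Base64EmbedImgManager is implemented: DataUriSketch, toDataUri and imgTag are hypothetical names, and the sketch only illustrates the general data URI form (data:<mime type>;base64,<encoded bytes>) that lets an img tag carry its own content instead of pointing at a file.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper for illustration only, NOT part of XDocReport.
class DataUriSketch {

    // Builds a data URI such as data:image/png;base64,.... from raw bytes.
    static String toDataUri(String mimeType, byte[] data) {
        return "data:" + mimeType + ";base64," + Base64.getEncoder().encodeToString(data);
    }

    // Wraps the data URI in an img element, so no external image file is needed.
    static String imgTag(String mimeType, byte[] data) {
        return "<img src=\"" + toDataUri(mimeType, data) + "\" />";
    }

    public static void main(String[] args) {
        // Placeholder bytes; a real image would be read from the DOCX package.
        byte[] placeholder = "hello".getBytes(StandardCharsets.US_ASCII);
        // prints <img src="data:image/png;base64,aGVsbG8=" />
        System.out.println(imgTag("image/png", placeholder));
    }
}
```

The trade-off is that the HTML becomes a single self-contained file, at the cost of a larger document, since base64 encoding grows the image data by roughly one third.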