A JUMBOConverter converts a document type ("file format") to CML without semantic loss.
(Some converters also work in the reverse direction.) JUMBOConverters understand about 20-30 formats,
including CIF, ChemDraw (CDX), MDL/SDF, and many of the computational
logfiles.
Note that we do not duplicate Openbabel. OB is used for molecular and some spectral files; JUMBOConverters
are used for complex documents where the structure of the document is important.
**What is JUMBOParser?**
JUMBOParser is a new template-based approach to parsing legacy formatted ASCII files (e.g. traditional logfile output). It is line-oriented,
since much output is written as FORTRAN records. It is declarative, which means that:
* there is no need to write procedural code (such as Java, Python, etc.)
* all parsing apparatus is XML and can be written, read and processed externally to the JUMBOParser program
* unit tests and documentation are combined; by adding examples you create documentation and test the program
It is well suited to community development and can be adapted very rapidly (within hours) to changes in output format.
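The declarative, line-oriented idea can be sketched in a few lines of Python. This is only an illustration of the principle, not the JUMBOParser implementation: the real templates are XML, and the `TEMPLATE` layout, the `ex:dipole` reference, the sample logfile line and the function names `parse_lines` and `check_template` are all assumptions made for this example. The point is that the template is pure data, and the example line stored with it serves as both documentation and a unit test.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical template expressed as plain data (NOT the real JUMBOParser syntax).
# It declares what one logfile record looks like and what it should become.
TEMPLATE = {
    "id": "dipole",
    "pattern": r"^\s*DIPOLE MOMENT\s*=\s*(?P<value>-?\d+\.\d+)\s+(?P<units>\w+)\s*$",
    "dict_ref": "ex:dipole",                        # assumed dictionary reference
    "example": " DIPOLE MOMENT =    1.8546 DEBYE",  # invented sample record
    "expected": {"value": "1.8546", "units": "DEBYE"},
}


def parse_lines(lines, templates):
    """Apply each template to each line; emit one XML element per matching record."""
    root = ET.Element("module")
    compiled = [(t, re.compile(t["pattern"])) for t in templates]
    for line in lines:
        for template, regex in compiled:
            match = regex.match(line)
            if match:
                element = ET.SubElement(
                    root,
                    "scalar",
                    {"dictRef": template["dict_ref"], "units": match.group("units")},
                )
                element.text = match.group("value")
                break  # first matching template wins for this line
    return root


def check_template(template):
    """The example stored in the template doubles as its unit test."""
    match = re.match(template["pattern"], template["example"])
    return match is not None and match.groupdict() == template["expected"]
```

Adapting to a changed output format then means editing the `pattern` and `example` entries of the template rather than touching `parse_lines`.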
**What is a template?**
It is a declaration of what you expect the parsed XML output to look like. Templates are easy to read and create as they
map onto the record structure. Templates also define the dictionary references that will be required.
**What is a dictionary?**
It is a collection of **entry** items, each of which describes a concept in the parsed documents. Often these are basic chemical quantities
such as dipole moment and ionization potential. The dictionary defines their unitType (e.g. energy, length) and also
adds human-readable descriptions.
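As a rough mental model, a dictionary can be thought of as a lookup table from a reference to an entry. The Python sketch below is an assumption-laden illustration, not the actual dictionary format: the `Entry` fields, the `ex` prefix, the entry ids, the `dipole` unit type name and the `resolve` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One concept in the dictionary (field names are illustrative only)."""
    id: str
    term: str
    unit_type: str    # e.g. "energy" or "length"
    description: str  # human-readable explanation


# Hypothetical dictionary keyed by entry id.
DICTIONARY = {
    entry.id: entry
    for entry in [
        Entry("dipole", "Dipole moment", "dipole",
              "Magnitude of the molecular electric dipole moment."),
        Entry("ip", "Ionization potential", "energy",
              "Energy required to remove an electron from the molecule."),
    ]
}


def resolve(dict_ref, dictionary, prefix="ex"):
    """Resolve a 'prefix:id' reference (as a template might declare) to its entry."""
    found_prefix, _, local_id = dict_ref.partition(":")
    if found_prefix != prefix:
        raise KeyError(f"unknown dictionary prefix in {dict_ref!r}")
    return dictionary[local_id]
```

For instance, `resolve("ex:ip", DICTIONARY).unit_type` yields `"energy"`, which is the kind of information a parser needs to attach meaningful units to a parsed value.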