Adding and Encoding Page Images - LiteratureInContext/LiC-data GitHub Wiki
Facsimile page images are images of the actual pages from, for instance, the first edition of the published text. We include them to give a clearer picture of the material object, the book itself. You might photograph the pages of a book in a library collection with your phone, or you might find existing page images online, often on a library website. In either case, you’ll need to rename the page images and then connect them to the XML of your digital edition.
XML Tag
We'll be using this tag to explain the naming convention. This is the tag in the XML that is used to display page images:
<pb n="100" facs="pageImages/100.jpg"/>
The <pb/> element is the TEI element for a “page break”.
The n attribute (or @n) in the pb tag gives the page number of the page shown in the image that @facs points to. It also determines the alt text that the application auto-generates. For example, page 57 will be
Title pages will show n="[title page]", and other unnumbered pages may similarly use a different value for @n. Whenever the value departs from the numbering convention (that is, any page number that is not actually the number printed on the page itself), use [square brackets] to indicate that you’ve added information.
For simplicity, we name the images (which will be uploaded to the pageImages folder of the text on AWS) after the page number they represent. So, if the picture shows page 100 of the text, it will simply be called 100.png, 100.jpg, etc.
Pages with roman numerals: If the page number is a roman numeral, it is named with that roman numeral. Ex: vii.jpg.
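Because the file name and the @n value are normally the same, the tag can be built mechanically from the page number. The following is a minimal sketch, not part of the LiC tooling: the helper name pb_tag is hypothetical, and it assumes every image uses one extension and sits in a pageImages folder, as in the example tag above.

```python
from xml.sax.saxutils import quoteattr

def pb_tag(page, ext="jpg", folder="pageImages"):
    """Build a TEI <pb/> tag for a page whose image is named after its page number.

    `page` may be an arabic number (100) or a roman numeral string ("vii").
    """
    page = str(page)
    facs = f"{folder}/{page}.{ext}"
    # quoteattr adds the surrounding quotes and escapes any special characters
    return f"<pb n={quoteattr(page)} facs={quoteattr(facs)}/>"

print(pb_tag(100))    # <pb n="100" facs="pageImages/100.jpg"/>
print(pb_tag("vii"))  # <pb n="vii" facs="pageImages/vii.jpg"/>
```

Unnumbered pages, whose @n value is bracketed and differs from the file name, would still need to be encoded by hand.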
Note that there is room for some variation, as long as consistency within each text is maintained. For instance, some contributors may choose to batch-renumber page images; in that case, you may need to use leading zeroes to make sure the images stay in the right order in your folder.
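To see why leading zeroes matter, note that most file browsers sort names character by character, so 10.jpg lands before 2.jpg. The sketch below illustrates the effect; the function name padded_name and the width of three digits are assumptions for illustration, not a project requirement.

```python
from pathlib import Path

def padded_name(filename, width=3):
    """Zero-pad a purely numeric file stem; leave other names (e.g. vii.jpg) unchanged."""
    p = Path(filename)
    if p.stem.isdigit():
        return p.stem.zfill(width) + p.suffix
    return filename

names = ["2.jpg", "10.jpg", "100.jpg"]
print(sorted(names))                          # ['10.jpg', '100.jpg', '2.jpg']
print(sorted(padded_name(n) for n in names))  # ['002.jpg', '010.jpg', '100.jpg']
```

If you do pad the file names, the @facs value has to match the padded name, while @n presumably keeps the page number as printed.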
Unnumbered Pages: Title Pages, Blank Pages and More
Note: Most of these pages do not actually appear in the XML, but they must still be named this way because they will be uploaded to the AWS storage. Pages that aren’t numbered in the text have to be named differently, so that it’s easy for us to encode the elements in the larger XML file. We name them using decimal numbers relative to the first numbered page.
Reference Picture:
Saving Your Page Images
Save all page images to a folder on your computer. These images will later be uploaded to our AWS server, so you can give that folder a title that matches the storage naming conventions.
- If your XML document is identified in its xml:id as “franklin-autobiography,” then that means all page images will be stored in the “franklin-autobiography” folder on AWS.
- Inside “franklin-autobiography” will be two folders, one called “pageImages,” and one called “notes.”
- If you are also adding images to your notes, you may want to create a folder on your computer called, for instance, “franklin-autobiography,” with two folders inside of that—“pageImages” for page images, and “notes” for images you’ll include in annotations.
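The local folder layout described above can be created by hand, or with a few lines of Python. This is only a sketch under stated assumptions: the function name make_image_folders is hypothetical, and the folder is created on your own computer (nothing is uploaded to AWS).

```python
from pathlib import Path

def make_image_folders(xml_id, base="."):
    """Create <base>/<xml_id>/pageImages and <base>/<xml_id>/notes if they are missing."""
    root = Path(base) / xml_id
    for sub in ("pageImages", "notes"):
        # parents=True also creates the xml_id folder; exist_ok=True makes reruns harmless
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root
```

For example, calling make_image_folders("franklin-autobiography") in the directory where you keep your work would create a “franklin-autobiography” folder containing “pageImages” and “notes”.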