[Kobo] Detect Fixed layout epubs #1219

Closed
shavitmichael opened this issue Feb 24, 2020 · 14 comments

@shavitmichael
Contributor

Kobo devices don't detect FixedLayout epubs when they are imported over the sync protocol.
CalibreWeb likely needs to parse epubs (and kepubs) in order to correctly set the epub's Format for the device.

Some research notes below

Why this happens:

When an epub/kepub is imported, it goes through the EpubParser::ParseMetadata function, where an "EpubType" enum is set by parsing the .opf in the following fashion:

if (volume.epubType() == -1) {
  if (opf["rendition:layout"] == "pre-paginated") {
    volume.setEpubType(ENUM_EPUB3FL);
  }
  if (opf.has("omf:version")) {
    volume.setEpubType(ENUM_EPUB3_OMF);
  }
}

A book's EpubType subsequently controls whether it will be displayed as a Fixed Layout Document.
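
For the server side, the same check could be sketched in Python. This is only an illustration, assuming the OPF follows the EPUB 3 convention of a `<meta property="rendition:layout">` element; the function names `is_fixed_layout` and `kobo_format_for` are hypothetical and are not taken from calibre-web, and the omf:version branch is not modeled.

```python
import xml.etree.ElementTree as ET

# Namespace used by elements of an OPF package document.
OPF_NS = "{http://www.idpf.org/2007/opf}"


def is_fixed_layout(opf_xml: str) -> bool:
    """Return True if the OPF declares rendition:layout as pre-paginated."""
    root = ET.fromstring(opf_xml)
    for meta in root.iter(OPF_NS + "meta"):
        if meta.get("property") == "rendition:layout":
            return (meta.text or "").strip() == "pre-paginated"
    return False


def kobo_format_for(opf_xml: str) -> str:
    """Hypothetical helper: choose the format string to advertise to the device."""
    return "EPUB3FL" if is_fixed_layout(opf_xml) else "EPUB3"


FIXED_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata>
    <meta property="rendition:layout">pre-paginated</meta>
  </metadata>
</package>"""

REFLOWABLE_OPF = """<package xmlns="http://www.idpf.org/2007/opf" version="3.0">
  <metadata/>
</package>"""

fixed_format = kobo_format_for(FIXED_OPF)
reflowable_format = kobo_format_for(REFLOWABLE_OPF)
```

A real implementation would first have to locate the .opf inside the epub archive (via META-INF/container.xml), which is left out here.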

When a book is imported over USB, the volume's epubType defaults to -1 and its epubType is therefore set using the above logic.
When a book is imported over the sync protocol, however, the volume's epubType has already been initialized from the DownloadUrl's format string, in the following fashion:

switch ContentUrl.formatString():
  case "EPUB3FL":
    SetEpubType(ENUM_EPUB3FL)
  case "EPUB_KEPUB":
    SetEpubType(ENUM_KEPUB)
  case "EPUB3_OMF":
    SetEpubType(ENUM_EPUB3_OMF)
  case etc...
  ...
  default:
    SetEpubType(-1)
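
The interaction between the two code paths can be modeled with a small Python sketch. It is a toy model of the pseudocode in these notes, not the device's code: the enum values other than -1 are placeholders, the OPF is represented as a plain dict, and the elided switch cases are simply left out (so in this model they would wrongly fall through to the default).

```python
from enum import IntEnum


class EpubType(IntEnum):
    # Only -1 ("unset") comes from the notes; the other values are placeholders.
    UNSET = -1
    EPUB3FL = 1
    KEPUB = 2
    EPUB3_OMF = 3


# Only the cases spelled out in the notes; the elided ones are not guessed.
FORMAT_TO_TYPE = {
    "EPUB3FL": EpubType.EPUB3FL,
    "EPUB_KEPUB": EpubType.KEPUB,
    "EPUB3_OMF": EpubType.EPUB3_OMF,
}


def initial_type_for_sync(format_string: str) -> EpubType:
    """Model of the switch on the download URL's format string."""
    return FORMAT_TO_TYPE.get(format_string, EpubType.UNSET)


def parse_metadata(epub_type: EpubType, opf: dict) -> EpubType:
    """Model of the detection step, which only runs when the type is unset."""
    if epub_type != EpubType.UNSET:
        return epub_type
    if opf.get("rendition:layout") == "pre-paginated":
        epub_type = EpubType.EPUB3FL
    if "omf:version" in opf:
        epub_type = EpubType.EPUB3_OMF
    return epub_type


fixed_layout_opf = {"rendition:layout": "pre-paginated"}

# USB import: the type starts unset, so the OPF decides.
usb_result = parse_metadata(EpubType.UNSET, fixed_layout_opf)

# Sync import: the format string has already fixed the type, so the OPF is ignored.
sync_result = parse_metadata(initial_type_for_sync("EPUB_KEPUB"), fixed_layout_opf)
```

In this model the same fixed-layout book ends up as EPUB3FL over USB but stays KEPUB over sync, which matches the behavior described in this issue.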

Can we trick the Kobo device into figuring out the EpubType for us?

In order to rely on EpubParser::ParseMetadata's detection of fixed layouts, we'd have to hit the default case in the format switch statement above. Unfortunately, the Kobo device will only consider a ContentUrl (a.k.a. DownloadUrl) for download if its format matches an entry in one of the following format lists:

List<String> format_list;
if content_type == CONTENT_TYPE_SAMPLE:
  format_list = ["EPUB3_SAMPLE", "EPUB3FL_SAMPLE", ...]
else if content_type == CONTENT_TYPE_FULL:
  format_list = ["EPUB3", "EPUB3FL", ...]
else if content_type == CONTENT_TYPE_INTERNET_ARCHIVE:
  format_list = ["EPUB"]

All of these have a matching case in the earlier switch statement except for "EPUB". And indeed we can import a book with an epubType of -1 by setting the "IsInternetArchive" field in the entitlement's BookMetadata json and setting the "format" field to "EPUB" in the downloadUrl.
This gets us really close, but we're out of luck: it turns out the EpubParser::ParseMetadata call in question is skipped precisely when IsInternetArchive is true :-( .
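
The format-list matching described above can be written out as a short Python sketch. It only restates the pseudocode: the lists are kept as truncated as they are in the notes, and the function name `is_considered_for_download` is made up for illustration.

```python
# Format lists as given in the notes. The first two are truncated there
# ("..."), so only the entries that were actually listed appear here.
FORMAT_LISTS = {
    "CONTENT_TYPE_SAMPLE": ["EPUB3_SAMPLE", "EPUB3FL_SAMPLE"],
    "CONTENT_TYPE_FULL": ["EPUB3", "EPUB3FL"],
    "CONTENT_TYPE_INTERNET_ARCHIVE": ["EPUB"],
}


def is_considered_for_download(content_type: str, format_string: str) -> bool:
    """Return True if a URL with this format would be considered at all."""
    return format_string in FORMAT_LISTS.get(content_type, [])


# Plain "EPUB" is only accepted on the internet-archive path.
epub_as_full = is_considered_for_download("CONTENT_TYPE_FULL", "EPUB")
epub_as_archive = is_considered_for_download("CONTENT_TYPE_INTERNET_ARCHIVE", "EPUB")
```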

Other hacks

In the meantime I've been using kobopatch to remove the if (volume.epubType() == -1) check in the EpubParser::ParseMetadata function.
The following patch worked on version 4.18.13737 for me:

Detect EpubType:
  - Enabled: yes
  - ReplaceBytes: {Offset: 0x796630, FindH: 01 30, ReplaceH: 01 28} # add R0, #0x1 -> cmp R0, #0x1

(To be clear, this is not the recommended solution for this issue, just a stopgap I've been using myself.)
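
To make the intent of the ReplaceBytes entry concrete, here is a minimal Python sketch of a checked byte replacement at a fixed offset. It is not kobopatch's implementation, only an illustration of the idea under the assumption that the bytes at the offset must match the FindH value before they are overwritten; the function name `replace_bytes` and the toy buffer are made up.

```python
def replace_bytes(data: bytes, offset: int, find_hex: str, replace_hex: str) -> bytes:
    """Return a copy of data with the bytes at offset swapped, after verifying them."""
    find = bytes.fromhex(find_hex)        # fromhex ignores spaces between byte pairs
    replace = bytes.fromhex(replace_hex)
    if len(find) != len(replace):
        raise ValueError("find and replace must have the same length")
    if data[offset:offset + len(find)] != find:
        raise ValueError(f"unexpected bytes at offset {offset:#x}")
    return data[:offset] + replace + data[offset + len(find):]


# Toy buffer standing in for the binary; the real patch targets offset 0x796630
# of libnickel on firmware 4.18.13737.
toy_binary = bytes([0x00, 0x00, 0x01, 0x30, 0xFF])
patched = replace_bytes(toy_binary, 2, "01 30", "01 28")
```

The verification step is what makes a patch like this tied to one firmware version: on any other build the bytes at that offset will most likely differ and the replacement is refused.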

@shermp

shermp commented Apr 14, 2020

I have a query in the same ballpark as this issue. Feel free to ask me to create a new one if more discussion on it is needed.

The question is, is it possible to get the Kobo to sync epub files that can be opened in RMSDK (like sideloaded epubs), rather than ACCESS? I tried setting IsInternetArchive to True as an experiment. It didn't go well... (the book opened, but there was no content displayed.)

I'm happy to patch nickel, if that's what it takes.

@OzzieIsaacs
Collaborator

I'm out, as I don't even understand the question.

@shermp

shermp commented Apr 14, 2020

Was more for @shavitmichael

But basically, as it stands, when syncing with calibre-web, epub books are synced such that they open with the kepub renderer (e.g. the format in the book list shows KOBO EPUB).

I was wondering if it was possible to get syncing working such that epubs open with the epub renderer instead (eg: format shows EPUB)

@OzzieIsaacs
Collaborator

Okay, this is something I understand. I tried to present the reader with several formats upon syncing, but it looks like it always requests the kepub one, so I don't know a way around it. Looks like we have to wait for shavitmichael.

@shermp

shermp commented Apr 14, 2020

Yeah, I tried removing "EPUB3" from "EPUB": ["EPUB", "EPUB3"]. My Kobo completely ignored books when I did that.

@shavitmichael
Contributor Author

It's actually quite late for me right now, and I have no idea off the top of my head if this is possible without a patch, but I might as well drop in the full list of acceptable formats :)

For the CONTENT_TYPE_FULL scenario, the device should accept:
["EPUB3", "EPUB3FL", "EPUB3OMF", "KEPUB", "FLEPUB"]
I suppose there's a small chance one of these other values might give you the desired behavior....

(FWIW, if you provide multiple downloadUrls with different formats, the device picks the URL based on the order of the list above. I'd probably only try one of these at a time if you want to experiment.)
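
The selection rule described here can be sketched in Python as follows. This is a model of the described behavior, not device code: the dict keys `format` and `url` and the function name `pick_download_url` are assumptions for illustration, and the URLs are placeholders.

```python
# Priority order taken from the CONTENT_TYPE_FULL list in this comment.
FULL_FORMAT_PRIORITY = ["EPUB3", "EPUB3FL", "EPUB3OMF", "KEPUB", "FLEPUB"]


def pick_download_url(download_urls):
    """Return the entry expected to win, or None if no format is accepted."""
    accepted = [u for u in download_urls if u["format"] in FULL_FORMAT_PRIORITY]
    if not accepted:
        return None
    # The earliest position in the priority list wins, regardless of input order.
    return min(accepted, key=lambda u: FULL_FORMAT_PRIORITY.index(u["format"]))


urls = [
    {"format": "KEPUB", "url": "https://example.invalid/book.kepub"},
    {"format": "EPUB", "url": "https://example.invalid/book.epub"},
    {"format": "EPUB3FL", "url": "https://example.invalid/book.fl.epub"},
]
chosen = pick_download_url(urls)
```

Here the EPUB3FL entry is chosen: plain EPUB is not in the list at all, and EPUB3FL comes before KEPUB.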

@shavitmichael
Contributor Author

> Yeah, I tried removing "EPUB3" from "EPUB": ["EPUB", "EPUB3"]. My Kobo completely ignored books when I did that.

Yeah, we should probably get rid of "EPUB" from the code... it doesn't do anything.
The only way to get the device to accept "EPUB" without a patch is by setting that internet_archive field (but as you found, that bit may be broken, so...).

@shermp

shermp commented Apr 14, 2020

Thanks for the answer @shavitmichael

I'm in no particular hurry, so don't worry about rushing out a full answer :)

Interesting to see that plain EPUB isn't in that list, even though you've added it in the calibre-web code. That explains why removing EPUB3 from the list didn't work.

@shermp

shermp commented Apr 14, 2020

Not liking the chances without a patch. That list of formats basically reads to me like "stuff RMSDK doesn't support", so I'm pretty sure they're all going to be considered KOBO EPUBs. For giggles, I tried FLEPUB, and got what I expected.

@shavitmichael
Contributor Author

Well, I've run into something weird...

But first, here's what I picked up from reversing libnickel:

  1. When a book is imported over USB, it goes through the following logic:

     if file_extension matches ".kepub.epub":
       parse the book with EpubParser
     else:
       plugin = PluginLoader::forExtension(file_extension)
       parse the book with the given plugin

  2. When a book is imported over the sync protocol:

     if not IsInternetArchive:
       parse the book with EpubParser
     else:
       plugin = PluginLoader::forMimeType(MIME_TYPE_EPUB)
       parse the book with the given plugin

I haven't been able to confirm this, but I would expect the plugins returned from PluginLoader::forMimeType(MIME_TYPE_EPUB) and PluginLoader::forExtension(".epub") to be the same. It's therefore not unreasonable to expect the IsInternetArchive bit to give us what we want.
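
The two dispatch paths above can be put side by side in a small Python model. The returned strings are just labels for which parser would be chosen; nothing here calls into libnickel, and the function names are made up for illustration.

```python
import os


def parser_for_usb_import(filename: str) -> str:
    """Model of the USB path: kepubs go to EpubParser, the rest to a plugin."""
    if filename.endswith(".kepub.epub"):
        return "EpubParser"
    extension = os.path.splitext(filename)[1]
    return f"PluginLoader::forExtension({extension})"


def parser_for_sync_import(is_internet_archive: bool) -> str:
    """Model of the sync path: only internet-archive books go to a plugin."""
    if not is_internet_archive:
        return "EpubParser"
    return "PluginLoader::forMimeType(MIME_TYPE_EPUB)"


sideloaded_epub = parser_for_usb_import("book.epub")
sideloaded_kepub = parser_for_usb_import("book.kepub.epub")
synced_regular = parser_for_sync_import(False)
synced_archive = parser_for_sync_import(True)
```

In this model a sideloaded .epub and a synced internet-archive book both end up with a plugin rather than EpubParser, which is the basis for the expectation stated above; whether the two plugins are actually the same is unconfirmed.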


Now here's the weird part: when I tried adding an EPUB using IsInternetArchive yesterday, I ran into the same problem you did: I would get the table of contents, but the rest of the book would be blank.
I was messing around with debug logging today, added a few books over usb, probably had a few reboots, and when I went back to test another EPUB import from CalibreWeb with IsInternetArchive it ended up working. Not only that, the book that was bugged the previous day now also opened correctly; without having re-downloaded it.
I've since tried syncing a few more epubs from gutenberg.org and they've all successfully opened.

I really have no clue why it didn't work the first time around, but here's the commit I was using if you'd like to test it out again: c66536d .

@shermp

shermp commented Apr 16, 2020

Ok, that really is weird. I thought maybe the epub OPF wasn't being parsed or something.

What I might try is sending some epubs, then rebooting the Kobo to see how it behaves.

Thanks for investigating this!

@shermp

shermp commented Apr 16, 2020

Nope, still nothing after a reset.

Even more baffling, images show up, as does the TOC. It almost looks as though it's in two-column mode, as the images (including the cover) take up half the screen width.

But no text...

Time to look at the database to see what it's put in the content table.

@shermp

shermp commented Apr 16, 2020

The content database seems to be a bit of a weird mish-mash of kepub and epub style content for synced ebooks.

Things like the __userid field not being set to adobe_user, NumShortCovers being zero, accessibility being set to 1, etc.

The covers always seem to be regenerated on the fly in the library view as well.

Nickel appears to have parsed the opf correctly, as the appropriate toc entries (type 9) are in the content table.

Conclusion: IsInternetArchive is weird.

@OzzieIsaacs
Collaborator

It's merged in 0.6.20
