HN-format page data - HinTak/caj2pdf GitHub Wiki

HN-format page data

CAJFILE_FindAllTextExW1, CAJFILE_GetPageTextW, and CAJFILE_SelectTextExW in cajviewer eventually call WITS_21_S72::GetPageTextW and WITS_21_S72::FindStringEx, which in turn call WITS_21_S72::GetFirstCChar and WITS_21_S72::GetNextCChar.

CAJPage::LoadPage is the only place that uses CT_TAG (=COMPRESSTEXT), and it calls WITS_21_S72::ParsePage, optionally after calling UnCompress(). (CAJDocEditor::DistillB uses CT_TAG when compressing.)
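The "decompress only if tagged, then parse" step can be illustrated with a minimal Python sketch. Only the COMPRESSTEXT tag comes from the notes above. The layout after the tag (a 4-byte little-endian expanded size followed by a zlib stream) and the name `load_page_bytes` are assumptions made for illustration, not confirmed behaviour of UnCompress().

```python
import struct
import zlib

CT_TAG = b"COMPRESSTEXT"  # tag named in the notes above


def load_page_bytes(raw: bytes) -> bytes:
    """Return page data ready for parsing, decompressing only if tagged."""
    if not raw.startswith(CT_TAG):
        return raw  # untagged data is passed through unchanged

    # ASSUMPTION: a 4-byte little-endian expanded size follows the tag,
    # and the rest of the buffer is a zlib stream.
    (expanded_size,) = struct.unpack_from("<I", raw, len(CT_TAG))
    data = zlib.decompress(raw[len(CT_TAG) + 4:])
    if len(data) != expanded_size:
        raise ValueError("expanded size does not match header")
    return data
```

The result would then be handed to whatever plays the role of ParsePage. If real page data does not decompress this way, the assumed layout is the first thing to revisit.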

WITS_21_S72::ParsePage is split into two parts: WITS_21_S72::ParseWits21() and WITS_21_S72::ParseSBS2().

Also of interest is WITS_21_S72::GetRectTextW.

WITS_21_S72::ParseWits21(), WITS_21_S72::ParseSBS2(), WITS_21_S72::GetFirstCChar and WITS_21_S72::GetNextCChar share a few things in common: each seems to be a giant switch statement that selects among different *CmdObj structures, each followed by data. These *CmdObj can be Pic, TextColor, Square, PreSquare, Flower, SplitLine, ChemLine, ArrawLine, Ellipse, String, FlowerSide, Bracket, PicRec, Line, S72FlowerSide, and positioning commands.

Most likely, a String command object yields text, and a Pic command object yields picture positioning.
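The "switch over command objects, each followed by data" pattern can be sketched in Python as a dispatch table. This is a toy model, not the real format: the opcode values, the field layouts (a 2-byte byte length plus UTF-16LE text for String, four 2-byte fields for Pic) and all function names are hypothetical. Only the command names String and Pic come from the notes above.

```python
import struct
from typing import Any, Callable, Dict, List, Tuple

# HYPOTHETICAL opcodes: the real values are not known from the notes.
OP_STRING = 0x01
OP_PIC = 0x02


def parse_string(buf: bytes, pos: int) -> Tuple[str, int]:
    # ASSUMED layout: 2-byte little-endian byte length, then UTF-16LE text.
    (nbytes,) = struct.unpack_from("<H", buf, pos)
    start = pos + 2
    end = start + nbytes
    return buf[start:end].decode("utf-16-le"), end


def parse_pic(buf: bytes, pos: int) -> Tuple[Tuple[int, ...], int]:
    # ASSUMED layout: x, y, width, height as four 2-byte fields.
    return struct.unpack_from("<4H", buf, pos), pos + 8


Handler = Callable[[bytes, int], Tuple[Any, int]]
HANDLERS: Dict[int, Tuple[str, Handler]] = {
    OP_STRING: ("String", parse_string),
    OP_PIC: ("Pic", parse_pic),
}


def walk_commands(buf: bytes) -> List[Tuple[str, Any]]:
    """Read one opcode byte, dispatch to its handler, repeat until the end."""
    out: List[Tuple[str, Any]] = []
    pos = 0
    while pos < len(buf):
        opcode = buf[pos]
        pos += 1
        if opcode not in HANDLERS:
            raise ValueError(f"unknown opcode {opcode:#x} at offset {pos - 1}")
        name, handler = HANDLERS[opcode]
        value, pos = handler(buf, pos)  # each handler returns the new offset
        out.append((name, value))
    return out


def page_text(buf: bytes) -> str:
    """Keep only String payloads, mirroring the guess that String carries text."""
    return "".join(v for name, v in walk_commands(buf) if name == "String")
```

Failing loudly on an unknown opcode is a deliberate choice for reverse-engineering work: the offset in the error shows where the assumed layout stops matching the data.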