FiVaTech: Page Level Web Data Extraction from Template Pages

Goal: deduce the page template automatically, without manual labeling, from a set of pages generated by the same template.
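As a rough illustration of what deducing a template means, the sketch below aligns the token sequences of two pages and treats whatever they share as template and whatever differs as a data slot. This is not the FiVaTech algorithm, which merges the DOM trees of the input pages. It is a simplified token-alignment stand-in built on Python's standard `difflib`. The names `tokenize`, `deduce_template`, and the `#DATA` marker are hypothetical and chosen only for this example.

```python
import re
from difflib import SequenceMatcher


def tokenize(html):
    """Split HTML into a flat list of tags and text runs (assumed tokenization)."""
    parts = re.split(r"(<[^>]+>)", html)
    # Drop empty and whitespace-only pieces left between adjacent tags.
    return [p.strip() for p in parts if p.strip()]


def deduce_template(page_a, page_b, slot="#DATA"):
    """Return the tokens shared by both pages, with differing spans marked as slots."""
    a, b = tokenize(page_a), tokenize(page_b)
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    template = []
    for op, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if op == "equal":
            # Tokens present in both pages are assumed to belong to the template.
            template.extend(a[i1:i2])
        elif not template or template[-1] != slot:
            # Any differing span (replace, insert, delete) collapses into one slot.
            template.append(slot)
    return template


page_a = "<html><body><h1>Book A</h1><p>10 USD</p></body></html>"
page_b = "<html><body><h1>Book B</h1><p>25 USD</p></body></html>"
print(" ".join(deduce_template(page_a, page_b)))
```

With only two pages, any text that happens to be identical on both is treated as template, so a real system would need more input pages and a tree-level comparison to separate template from data reliably.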

webpage segmentation, webpage structure labeling, and webpage text segmentation and labeling