(Puduppully) Data-to-text Generation with Entity Modeling - sogang-nlp-paper/WNGT-2019-DGT-NLG-Track GitHub Wiki

Idea

The paper adds entity modeling to the existing encoder-decoder + conditional copy approach (Wiseman et al., 2017), and introduces the MLB dataset, which has longer summaries and more input records than the RotoWire dataset.

The contributions of the paper are as follows.

propose an entity-aware model for data-to-text generation
introduce a new dataset for data-to-text generation (MLB dataset)

Model

encoder-decoder + conditional copy + entity memory + hierarchical attention

Entity Memory

A representation is computed for each entity and dynamically updated during decoding.

u: representation of entity k at time t (the output of the network that computes the u vector)
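As a rough illustration of what "dynamically updated" can mean, here is a minimal sketch of a gated memory update in plain Python. The exact network is not given on this page, so the gate below is an assumption: it is computed elementwise from the decoder state and the previous entity vector, with no learned projection matrices. The names `update_entity_memory`, `gate_bias`, `entity_a`, and `entity_b` are hypothetical.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def update_entity_memory(u_prev, d_t, gate_bias=0.0):
    """Return the updated vector u for one entity, given decoder state d_t.

    Assumption (not from the wiki): the gate is an elementwise sigmoid of
    d_t + u_prev, and the candidate value is d_t itself. A real model
    would use learned projections for both.
    """
    updated = []
    for u_i, d_i in zip(u_prev, d_t):
        delta = sigmoid(d_i + u_i + gate_bias)  # how much of u to overwrite
        updated.append((1.0 - delta) * u_i + delta * d_i)
    return updated


# One decoding step: every entity's vector is refreshed from the same state.
memory = {"entity_a": [0.0, 1.0], "entity_b": [1.0, 0.0]}
decoder_state = [0.5, -0.5]
memory = {k: update_entity_memory(u, decoder_state) for k, u in memory.items()}
```

Because the gate lies strictly between 0 and 1, each updated component is an interpolation between the old memory value and the decoder state, which is the property a gated memory relies on.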

Hierarchical Attention

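Hierarchical attention is computed over the g matrix, a two-dimensional representation of the records, and over the u vectors in the entity memory. The result is the q vector, which is used to predict the next word to generate.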
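The two-level attention described above can be sketched as follows. This is a simplified illustration, not the paper's exact formulation: it assumes g is indexed first by entity and then by record, and it uses plain dot-product scores where a trained model would use learned weights. The function name `hierarchical_attention` and the helper names are hypothetical.

```python
import math


def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * v[i] for w, v in zip(weights, vectors)) for i in range(dim)]


def hierarchical_attention(d_t, g, u):
    """g[k] holds the record vectors of entity k; u[k] is its memory vector.

    Assumption: dot-product scoring at both levels.
    """
    # Level 1: attend over the records of each entity to get one context each.
    entity_contexts = []
    for records in g:
        alpha = softmax([dot(d_t, r) for r in records])
        entity_contexts.append(weighted_sum(alpha, records))
    # Level 2: attend over entities, scoring them by their memory vectors u.
    psi = softmax([dot(d_t, u_k) for u_k in u])
    q = weighted_sum(psi, entity_contexts)
    return q, psi


g = [[[1.0, 0.0], [0.0, 1.0]], [[2.0, 2.0]]]  # two entities, 2 and 1 records
u = [[1.0, 0.0], [0.0, 1.0]]
d_t = [1.0, 0.0]
q, psi = hierarchical_attention(d_t, g, u)
```

In this sketch the entity whose memory vector aligns best with the decoder state receives the larger weight in `psi`, so its record context dominates q.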

Experiment

Performance is still lower than that of the template baseline.

Compared with the previous model (Puduppully et al., 2019), the scores are not higher across the board.