CSAN: Contextual Self Attention Network for User Sequential Recommendation

1. Overview

Summary

A personalized recommendation model that uses all of (image, text, item_id, behavior_type).
The paper proposes a way to combine these sources of information.
- raw input
  - behavior_type : 1-hot
  - everything else : embedding
- semantic embedding vector
  - concat the raw inputs & pass them through an FC layer
  - the final representation is a single 128d vector
The remaining sections of the paper are still to be read.

2. Details

3. OUR PROPOSED METHOD: CSAN

3.2 Heterogeneous Behavior Embedding

3.2.1 Multi-type actions representation

Each action (behavior) type is represented as a 1-hot encoding vector.
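
As a minimal sketch of the 1-hot representation, the snippet below encodes a behavior type against a fixed vocabulary. The vocabulary `BEHAVIOR_TYPES` and the helper `one_hot` are hypothetical names chosen for illustration; the note does not list the actual behavior types used in the paper.

```python
# Hypothetical behavior vocabulary; the actual types are not listed in this note.
BEHAVIOR_TYPES = ["click", "cart", "purchase"]


def one_hot(behavior_type, vocabulary=BEHAVIOR_TYPES):
    """Return a list with 1.0 at the index of behavior_type and 0.0 elsewhere."""
    index = vocabulary.index(behavior_type)  # raises ValueError for unknown types
    return [1.0 if i == index else 0.0 for i in range(len(vocabulary))]


print(one_hot("cart"))  # [0.0, 1.0, 0.0]
```

The vector length equals the number of behavior types, so it stays small compared with the 64d content embeddings.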

3.2.2 Multi-modal content representation

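text information
- uses a doc2vec model
- the entire text is represented as a 64d vector
image information
- ImageNet-pretrained model + PCA
- if an item has multiple images, take the elementwise mean vector
- the images as a whole are represented as a single 64d vector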
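
The multi-image case can be sketched as a plain elementwise mean. This is only an illustration: `mean_image_vector` is a hypothetical helper, the toy vectors are 4d stand-ins for the 64d vectors, and the note does not say whether the mean is taken before or after the PCA step.

```python
def mean_image_vector(image_vectors):
    """Average several equal-length image feature vectors elementwise."""
    if not image_vectors:
        raise ValueError("at least one image vector is required")
    dim = len(image_vectors[0])
    if any(len(v) != dim for v in image_vectors):
        raise ValueError("all image vectors must have the same length")
    count = len(image_vectors)
    # zip(*...) groups the i-th component of every vector together.
    return [sum(column) / count for column in zip(*image_vectors)]


# Toy 4d vectors stand in for the 64d image vectors.
print(mean_image_vector([[1.0, 2.0, 3.0, 4.0], [3.0, 4.0, 5.0, 6.0]]))
# [2.0, 3.0, 4.0, 5.0]
```

Averaging keeps the output dimension fixed regardless of how many images an item has, which is what lets every item feed the same downstream layer.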

3.2.3 Semantic space embedding

raw input features
1. action type : 1-hot
2. item embedding :
3. text embedding : 64d
4. image embedding : 64d
output : semantic embedding vector
- concat the raw input features
- pass the result through an FC layer to produce a 128d vector
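
The concat + FC step can be sketched in plain Python as follows. Several values here are assumptions, not taken from the paper: three action types, a 64d item embedding (the note leaves that dimension blank), randomly initialized weights, and no activation function. `concat`, `make_fc_layer`, and `fc_forward` are hypothetical helper names; in practice a deep learning framework's linear layer would play this role.

```python
import random


def concat(*vectors):
    """Concatenate several vectors into one flat list."""
    out = []
    for v in vectors:
        out.extend(v)
    return out


def make_fc_layer(in_dim, out_dim, seed=0):
    """Create random weights (out_dim x in_dim) and a zero bias."""
    rng = random.Random(seed)
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)]
               for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def fc_forward(x, weights, bias):
    """Compute W x + b; any activation is omitted because the note names none."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


action = [0.0, 1.0, 0.0]  # 1-hot over 3 assumed action types
item = [0.05] * 64        # assumed 64d item embedding
text = [0.1] * 64         # 64d text embedding
image = [0.2] * 64        # 64d image embedding

x = concat(action, item, text, image)
weights, bias = make_fc_layer(len(x), 128)
semantic = fc_forward(x, weights, bias)
print(len(x), len(semantic))  # 195 128
```

Whatever the input dimensions turn out to be, the FC layer is what fixes the semantic embedding at 128d.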