[function] create_document_vectors() - P3chys/textmining GitHub Wiki

Function: create_document_vectors()

Purpose

Aggregates the Word2Vec embeddings of each document's tokens into a single fixed-length vector per document.

Syntax

create_document_vectors(df, w2v_model, token_column='processed_text',
                       aggregation='mean')

Parameters

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| df | pandas.DataFrame | Required | DataFrame with tokenized text |
| w2v_model | Word2Vec | Required | Trained Word2Vec model |
| token_column | str | 'processed_text' | Column containing token lists |
| aggregation | str | 'mean' | Aggregation method ('mean', 'max', 'sum') |
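
To illustrate what the three aggregation options compute, here is a minimal NumPy sketch on made-up numbers. It assumes each method is applied element-wise across a document's token vectors, which is the usual reading of 'mean', 'max', and 'sum' but is not confirmed by this page. The array `token_vectors` and its values are purely illustrative.

```python
import numpy as np

# Toy 3-dimensional embeddings for a two-token document (values are made up).
token_vectors = np.array([
    [1.0, 4.0, -2.0],
    [3.0, 0.0, 2.0],
])

# Each reduction collapses the token axis (axis=0), leaving one vector
# with the same dimensionality as the embeddings.
mean_vec = token_vectors.mean(axis=0)  # element-wise average
max_vec = token_vectors.max(axis=0)    # element-wise maximum
sum_vec = token_vectors.sum(axis=0)    # element-wise sum

print(mean_vec)  # [2. 2. 0.]
print(max_vec)   # [3. 4. 2.]
print(sum_vec)   # [4. 4. 0.]
```

Under this reading, 'mean' is insensitive to document length, while 'sum' grows with the number of tokens and 'max' keeps only the strongest value in each dimension.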

Returns

  • numpy.ndarray: Document vectors of shape (n_documents, vector_size)
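
The following sketch shows one way a function with this contract could produce the (n_documents, vector_size) array. It is not the repository's implementation: `document_vectors_sketch` is a hypothetical stand-in that takes plain token lists and a token-to-vector mapping instead of a DataFrame and a Word2Vec model. Two behaviours are assumptions not stated on this page: tokens missing from the vocabulary are skipped, and a document with no known tokens gets a zero vector.

```python
import numpy as np

def document_vectors_sketch(token_lists, vectors, vector_size, aggregation="mean"):
    """Hypothetical stand-in for create_document_vectors().

    token_lists: iterable of token lists, one per document
    vectors:     mapping from token to 1-D array (a plain dict here)
    vector_size: dimensionality of the embeddings
    """
    reducers = {"mean": np.mean, "max": np.max, "sum": np.sum}
    if aggregation not in reducers:
        raise ValueError(f"Unknown aggregation: {aggregation!r}")
    reduce = reducers[aggregation]

    rows = []
    for tokens in token_lists:
        # Assumption: out-of-vocabulary tokens are silently skipped.
        known = [vectors[t] for t in tokens if t in vectors]
        if known:
            rows.append(reduce(np.stack(known), axis=0))
        else:
            # Assumption: documents with no known tokens map to a zero vector.
            rows.append(np.zeros(vector_size))

    # Stack per-document vectors into shape (n_documents, vector_size).
    return np.vstack(rows) if rows else np.empty((0, vector_size))

# Toy embeddings and documents (made up for illustration).
toy_vectors = {"good": np.array([1.0, 0.0]), "film": np.array([0.0, 1.0])}
docs = [["good", "film"], ["unknown"], []]

doc_vecs = document_vectors_sketch(docs, toy_vectors, vector_size=2)
print(doc_vecs.shape)  # (3, 2)
```

With a trained gensim model, the same sketch would presumably be called with `df[token_column]` as the token lists, `w2v_model.wv` as the mapping, and `w2v_model.wv.vector_size` as the dimensionality.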