[function] create_document_vectors() - P3chys/textmining GitHub Wiki
create_document_vectors()
Purpose
Aggregates word embeddings to create document-level vector representations.
Syntax
create_document_vectors(df, w2v_model, token_column='processed_text',
aggregation='mean')
Parameters
Parameter | Type | Default | Description |
---|---|---|---|
df | pandas.DataFrame | Required | DataFrame with tokenized text |
w2v_model | Word2Vec | Required | Trained Word2Vec model |
token_column | str | 'processed_text' | Column containing token lists |
aggregation | str | 'mean' | Aggregation method ('mean', 'max', 'sum') |
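To make the three `aggregation` values concrete, the sketch below applies each reduction to a toy matrix of word vectors with NumPy. The vectors are made up for illustration, and reading 'max' as an element-wise maximum over the token axis is an assumption; the function itself may handle it differently.

```python
import numpy as np

# Toy word vectors for one document: 3 tokens, vector_size = 2 (made-up values).
token_vectors = np.array([
    [1.0, 4.0],
    [3.0, 0.0],
    [2.0, 2.0],
])

# Each method collapses the token axis (axis=0) into one document vector.
doc_mean = token_vectors.mean(axis=0)  # element-wise average
doc_max = token_vectors.max(axis=0)    # element-wise maximum
doc_sum = token_vectors.sum(axis=0)    # element-wise sum

print(doc_mean)  # [2. 2.]
print(doc_max)   # [3. 4.]
print(doc_sum)   # [6. 6.]
```

Note that 'sum' grows with document length while 'mean' does not, which can matter when documents vary widely in token count.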
Returns
- numpy.ndarray: Document vectors of shape (n_documents, vector_size)
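
As a rough illustration of how a function with this signature could produce the documented return shape, here is a minimal sketch. It is not the project's actual implementation: the name `create_document_vectors_sketch`, the choice to skip out-of-vocabulary tokens, and the zero vector for documents with no known tokens are all assumptions. The `ToyVectors` and `ToyModel` classes are hypothetical stand-ins that mimic the `.wv` lookup interface of a trained gensim `Word2Vec` model so the example runs without training one.

```python
import numpy as np
import pandas as pd


def create_document_vectors_sketch(df, w2v_model, token_column='processed_text',
                                   aggregation='mean'):
    """Hypothetical sketch; not the wiki project's actual code."""
    reducers = {'mean': np.mean, 'max': np.max, 'sum': np.sum}
    if aggregation not in reducers:
        raise ValueError(f"Unknown aggregation: {aggregation!r}")
    reducer = reducers[aggregation]

    wv = w2v_model.wv  # gensim exposes the trained vectors on the .wv attribute
    doc_vectors = []
    for tokens in df[token_column]:
        # Assumption: tokens missing from the vocabulary are skipped.
        vectors = [wv[t] for t in tokens if t in wv]
        if vectors:
            doc_vectors.append(reducer(np.asarray(vectors), axis=0))
        else:
            # Assumption: documents with no known tokens become a zero vector.
            doc_vectors.append(np.zeros(wv.vector_size))
    # Reshape so that an empty DataFrame still yields shape (0, vector_size).
    return np.array(doc_vectors, dtype=float).reshape(len(doc_vectors), wv.vector_size)


class ToyVectors:
    """Stand-in for the model's word vectors (hypothetical, for this demo only)."""
    vector_size = 2

    def __init__(self):
        self._table = {'good': np.array([1.0, 0.0]),
                       'movie': np.array([0.0, 1.0])}

    def __contains__(self, token):
        return token in self._table

    def __getitem__(self, token):
        return self._table[token]


class ToyModel:
    """Stand-in for a trained Word2Vec model (hypothetical, for this demo only)."""
    def __init__(self):
        self.wv = ToyVectors()


df = pd.DataFrame({'processed_text': [['good', 'movie'], ['good', 'unknown'], []]})
vectors = create_document_vectors_sketch(df, ToyModel(), aggregation='mean')
print(vectors.shape)  # (3, 2)
```

With a real model, `ToyModel()` would be replaced by the trained `Word2Vec` instance. The returned array has one row per DataFrame row, so it can be passed directly to a downstream classifier or clustering step.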