[function] create_tfidf_features() - P3chys/textmining GitHub Wiki

Function: create_tfidf_features()

Purpose

Creates a TF-IDF (Term Frequency-Inverse Document Frequency) feature representation from a text column of a DataFrame.
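To make the weighting concrete, here is a minimal pure-Python sketch of one common textbook TF-IDF variant: term frequency (count divided by document length) times log(N / document frequency). This is an illustration only; the helper name `tfidf` and the sample documents are made up for this example. Libraries such as scikit-learn use a smoothed and normalized variant by default, so the numbers will not match what create_tfidf_features() returns.

```python
import math
from collections import Counter


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Illustrative textbook variant: tf = count / len(doc), idf = ln(N / df).
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical, already-tokenized documents.
docs = [["good", "movie"], ["bad", "movie"], ["good", "good", "plot"]]
scores = tfidf(docs)
```

In the third document, "plot" gets a higher weight than "good" even though "good" occurs twice, because "plot" appears in only one document while "good" appears in two.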

Syntax

create_tfidf_features(df, text_column='processed_text_string', max_features=5000,
                     ngram_range=(1, 1), min_df=2)

Parameters

The parameters (df, text_column, max_features, ngram_range, min_df) are the same as for create_bow_features(); only the binary parameter is omitted.

Returns

  • Tuple: (feature_matrix, feature_names, vectorizer)
  • feature_matrix: Sparse matrix with TF-IDF values
  • feature_names: Names of the features (terms), one per column of feature_matrix
  • vectorizer: Fitted TfidfVectorizer object