invert_hash Output - VowpalWabbit/vowpal_wabbit GitHub Wiki
When running with `--invert_hash <file>`, vw will output an extra, human-readable index of the features.
Note that `--invert_hash` is output-only and must be run with an input file so that vw can map the string features to their internal hashes; otherwise it will output only the internal hash, without the corresponding string name.
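As a minimal sketch of such an invocation, the snippet below assembles a vw command line that pairs an input file with `--invert_hash`. The file names `train.txt` and `model.inv` and the helper `build_invert_hash_command` are hypothetical and used only for illustration; `-d` is vw's option for naming the input data file.

```python
import shlex


def build_invert_hash_command(data_path, output_path):
    """Build an argument list that runs vw over a data file and writes the index.

    Both paths are placeholders chosen for this sketch.
    """
    return ["vw", "-d", data_path, "--invert_hash", output_path]


command = build_invert_hash_command("train.txt", "model.inv")

# Print the command rather than executing it, so the sketch runs without vw installed.
print(shlex.join(command))  # prints: vw -d train.txt --invert_hash model.inv
```

To actually run it, the list can be passed to `subprocess.run(command, check=True)` on a machine where `vw` is on the `PATH`.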
The file is divided into header, metadata, and feature-index sections. The header consists of the VW version used to create the model and the optional model id, which can be embedded into the model using `--id <id>`. The metadata entries take the form `<identifier>:<value>` and cover a number of configuration, statistical, and learning settings. A few are described below:
Name | Description |
---|---|
(Min/Max) label | The minimum/maximum values observed by the learner at the scorer level. |
bits | The bitness of the feature indices. A larger value here increases the model size, but may be useful in high |
options | The model-defining arguments |
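Because each metadata entry is an `<identifier>:<value>` pair, it can be split on the first colon. The sketch below assumes that reading; the helper `parse_metadata_line` and the sample lines (including the `options` value) are hypothetical and are not copied from a real output file.

```python
def parse_metadata_line(line):
    """Split one '<identifier>:<value>' line on its first colon."""
    identifier, sep, value = line.partition(":")
    if not sep:
        raise ValueError(f"not a metadata line: {line!r}")
    return identifier.strip(), value.strip()


# Hypothetical sample lines, invented for illustration.
sample_lines = [
    "Min label:0",
    "Max label:1",
    "bits:18",
    "options: --quadratic AB",
]

# Values are kept as strings; convert them as needed for each identifier.
metadata = dict(parse_metadata_line(line) for line in sample_lines)
print(metadata["bits"])
```

Splitting on the first colon only keeps any later colons inside the value intact.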
The feature-index section consists of entries of the form `<feature_name>:<feature_index>:<weight>[offset]`. For example, a feature of the form:
ANamespace^cat_feature=category*BNamespace^num_feature:28:0.19[0]
Can be parsed as:
- The quadratic (`-q`) interaction between two features (there is a single `*`).
- The first feature is in the `ANamespace`, with name `cat_feature=category`.
- The second feature is in the `BNamespace`, with name `num_feature`.
- The index of the generated quadratic feature is 28.
- The weight corresponding to this feature is 0.19.
- The reduction being used activates interleaved models and the brackets signify which model offset this weight belongs to.
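The breakdown above can be sketched as a small parser. This is an illustration under stated assumptions, not vw's own code: the names `FeatureEntry` and `parse_feature_entry` are hypothetical, `*` is assumed to separate interacting terms, `^` is assumed to separate a namespace from a feature name, and the `[offset]` suffix is treated as optional.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# A weight, optionally followed by a bracketed model offset, e.g. "0.19[0]".
_WEIGHT_RE = re.compile(r"^(?P<weight>[^\[\]]+)(?:\[(?P<offset>\d+)\])?$")


@dataclass
class FeatureEntry:
    terms: List[Tuple[str, str]]  # (namespace, feature name) per interacting term
    index: int
    weight: float
    offset: Optional[int]


def parse_feature_entry(line: str) -> FeatureEntry:
    """Parse one '<feature_name>:<feature_index>:<weight>[offset]' entry."""
    # Split from the right so the index and weight are always the last two fields.
    name, index, tail = line.strip().rsplit(":", 2)
    match = _WEIGHT_RE.match(tail)
    if match is None:
        raise ValueError(f"unrecognised weight field: {tail!r}")

    terms = []
    for term in name.split("*"):  # assumption: '*' joins interacting features
        namespace, sep, feature = term.partition("^")
        if not sep:  # assumption: no '^' means no namespace prefix
            namespace, feature = "", term
        terms.append((namespace, feature))

    offset = match.group("offset")
    return FeatureEntry(
        terms=terms,
        index=int(index),
        weight=float(match.group("weight")),
        offset=int(offset) if offset is not None else None,
    )


entry = parse_feature_entry(
    "ANamespace^cat_feature=category*BNamespace^num_feature:28:0.19[0]"
)
print(entry)
```

For the example entry, `entry.terms` holds the two namespace and feature pairs, `entry.index` is 28, `entry.weight` is 0.19, and `entry.offset` is 0.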