Building word2vec
The instructions below specify the steps to build word2vec version 0.1c on Linux on IBM Z for the following distributions:
- RHEL (8.8, 8.10, 9.2, 9.4, 9.5)
- SLES 15 SP6
- Ubuntu (20.04, 22.04, 24.04, 24.10)
General notes:
- When following the steps below, please use a standard permission user unless otherwise specified.
- A directory /<source_root>/ will be referred to in these instructions; this is a temporary writable directory anywhere you'd like to place it.
Build word2vec
1. Install standard utilities, packages, and platform-specific dependencies
- RHEL (8.8, 8.10, 9.2, 9.4, 9.5)
sudo yum install -y gcc make wget tar unzip
- SLES 15 SP6
sudo zypper install -y gcc make wget tar unzip
- Ubuntu (20.04, 22.04, 24.04, 24.10)
sudo apt-get update
sudo apt-get install -y gcc make wget tar unzip
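Before moving on, it can help to confirm that the packages installed above are actually on the PATH. The following is a minimal sketch, not part of the original instructions; the helper name `check_tools` is hypothetical.

```shell
# Hypothetical helper: print the names of any tools that are not on PATH.
check_tools() {
    missing=""
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            missing="$missing $tool"
        fi
    done
    # Strip the leading space before printing.
    echo "${missing# }"
}

# The tool list mirrors the packages installed in step 1.
missing_tools=$(check_tools gcc make wget tar unzip)
if [ -z "$missing_tools" ]; then
    echo "All build tools found"
else
    echo "Missing tools: $missing_tools"
fi
```

If anything is reported as missing, rerun the install command for your distribution before continuing.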
2. Create a working directory and download word2vec source code
export SOURCE_ROOT=/<source_root>/
cd $SOURCE_ROOT
wget https://storage.googleapis.com/google-code-archive-source/v2/code.google.com/word2vec/source-archive.zip
unzip source-archive.zip
3. Build word2vec
cd word2vec/trunk
make CFLAGS="-lm -pthread -O3 -Wall -funroll-loops"
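To check that the build produced its binaries, something like the sketch below could be run from word2vec/trunk. The helper name `missing_binaries` is hypothetical, and the binary names are assumed to match the targets in the word2vec makefile; adjust the list if your source tree differs.

```shell
# Hypothetical helper: print each expected binary that is missing
# (or not executable) in the given directory.
missing_binaries() {
    dir=$1
    shift
    for bin in "$@"; do
        [ -x "$dir/$bin" ] || echo "$bin"
    done
}

# Assumption: run from word2vec/trunk after make has finished.
# No output means every listed binary was found.
missing_binaries . word2vec word2phrase distance word-analogy compute-accuracy
```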
4. Set environment variables
export PATH=$PATH:$SOURCE_ROOT/word2vec/trunk
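A quick way to confirm the export took effect is to test whether the directory appears in PATH. This is a sketch only; the function name `path_contains` is hypothetical.

```shell
# Hypothetical helper: succeed if the given directory is an entry in PATH.
path_contains() {
    case ":$PATH:" in
        *":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# SOURCE_ROOT is expected to be set as in the earlier steps;
# it defaults to empty here so the check still runs if it is not.
if path_contains "${SOURCE_ROOT:-}/word2vec/trunk"; then
    echo "word2vec directory is on PATH"
else
    echo "word2vec directory is not on PATH"
fi
```

Note that the export only lasts for the current shell session; add it to your shell profile if you want it to persist.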
5. Test word2vec using demo scripts
./demo-word.sh
./demo-phrases.sh
_**Note:**_ Each demo script trains on a test corpus and then prompts for input; enter a word (e.g. france) to see the closest words from the resulting word vectors.
6. Run word2vec binary
word2vec
_**Note:**_ The word2vec tool takes a text corpus as input and produces the word vectors as output.
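As an illustration of the corpus-in, vectors-out flow, the sketch below trains on a tiny throwaway corpus. The corpus text, the temporary directory, and the parameter values are all assumptions chosen to keep the run short; real training uses a much larger corpus and settings closer to those in the demo scripts. The training step is skipped if word2vec is not on the PATH.

```shell
# Create a tiny throwaway corpus (illustrative only).
workdir=$(mktemp -d)
corpus="$workdir/corpus.txt"
printf '%s\n' "the king rules the land" "the queen rules the land" > "$corpus"

# Train only if the word2vec binary built above is on PATH.
if command -v word2vec >/dev/null 2>&1; then
    # -min-count 1 keeps every word, since this corpus is far too small
    # for the default cutoff; -binary 0 writes vectors as plain text.
    word2vec -train "$corpus" -output "$workdir/vectors.txt" \
        -size 10 -window 2 -min-count 1 -threads 1 -binary 0
    echo "Vectors written to $workdir/vectors.txt"
else
    echo "word2vec not found on PATH; skipping training"
fi
```

Vectors trained on a corpus this small are not meaningful; the point is only to show the shape of an invocation.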