FIt SNE - hasselmonians/knowledge-base GitHub Wiki
t-distributed stochastic neighborhood embedding is a powerful technique for dimensionality reduction and visualization of high-dimensional data sets. It generally performs much better than PCA; however the computational complexity of the algorithm limits it to small data sets (n < 3,000).
A popular implementation of t-SNE uses the Barnes-Hut algorithm to approximate the gradient at each iteration of gradient descent. This implementation, FFT-Accelerated Interpolation-based t-SNE (FIt-SNE) uses the fast Fourier transform to compute the convolution required in the Barnes-Hut algorithm across an interpolated equispaced grid. This dramatically speeds up computation time.
GNU/Linux & MacOS
Clone the repository or download the zip file. The only prerequisite the the fastest Fourier transform in the west (FFTW). It is accessible as a package on Debian- and Arch-based systems and can also be downloaded off the website.
The code needs to be compiled once. In the terminal, run:
g++ -std=c++11 -O3 src/sptree.cpp src/tsne.cpp src/nbodyfft.cpp -o bin/fast_tsne -pthread -lfftw3 -lm
while in the repository's directory.
Windows
On Windows machines, the researchers have provided a binary. Download and extract to the bin\
folder in the repository. All FFTW dependencies have been included. All downloaded DLL files need to be added to the bin\
directory.
Running FIt-SNE
The code is written in C++. There are wrappers in R, Python, and MATLAB each called fast_tsne
. For example, in MATLAB run
embedded_X = fast_tsne(X)
Options are passed as a secondary argument in a struct.