Home - baisc/tesseract GitHub Wiki

Installation for Windows

Installers for Windows for Tesseract 3.05.02 and Tesseract 4.00-beta are available from Tesseract at Mannheim University Library (UB Mannheim). These include the training tools. Both 32-bit and 64-bit installers are available.

The latest installers can be downloaded here: tesseract-ocr-setup-3.05.02-20180621.exe, tesseract-ocr-w32-setup-v4.0.0.20181030.exe (32-bit) and tesseract-ocr-w64-setup-v4.0.0.20181030.exe (64-bit). Older versions are also available.

An installer for the OLD version 3.02 is available for Windows from our download page. This includes the English training data. If you want to use another language, download the appropriate training data, unpack it using 7-zip, and copy the .traineddata file into the 'tessdata' directory, probably C:\Program Files\Tesseract-OCR\tessdata.
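The copy step for an extra language can be sketched in Python. This is only an illustration: the helper name `install_traineddata` is hypothetical, and the default path is the "probable" location mentioned above, so adjust it if Tesseract is installed elsewhere. The archive is assumed to be unpacked already.

```python
import shutil
from pathlib import Path

# Assumed default, taken from the text above; your install may differ.
DEFAULT_TESSDATA = Path(r"C:\Program Files\Tesseract-OCR\tessdata")


def install_traineddata(src, tessdata_dir=DEFAULT_TESSDATA):
    """Copy one unpacked .traineddata file into the tessdata directory.

    Returns the path of the copied file.
    """
    src = Path(src)
    if src.suffix != ".traineddata":
        raise ValueError(f"not a .traineddata file: {src}")
    tessdata_dir = Path(tessdata_dir)
    if not tessdata_dir.is_dir():
        raise FileNotFoundError(f"tessdata directory not found: {tessdata_dir}")
    # shutil.copy2 returns the path of the newly created file.
    return Path(shutil.copy2(src, tessdata_dir))
```

Writing below C:\Program Files usually requires an elevated prompt, so the script may need to be run as administrator.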

To run Tesseract from any location, you may have to add the directory containing the Tesseract-OCR binaries (probably C:\Program Files\Tesseract-OCR) to the Path environment variable.
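A quick way to check whether the Path setting is in effect is to look the executable up from a script. The sketch below uses only the Python standard library; the function name `find_tesseract` is hypothetical, and the fallback directory is the probable install location named above.

```python
import shutil

# Assumed default install directory, taken from the text above.
DEFAULT_INSTALL_DIR = r"C:\Program Files\Tesseract-OCR"


def find_tesseract(search_path=None, fallback_dirs=(DEFAULT_INSTALL_DIR,)):
    """Return the full path of the tesseract executable, or None.

    search_path=None means "use the current PATH environment variable".
    """
    found = shutil.which("tesseract", path=search_path)
    if found:
        return found
    # Not on PATH: try the known install directories one by one.
    for directory in fallback_dirs:
        found = shutil.which("tesseract", path=directory)
        if found:
            return found
    return None
```

If the first lookup fails but a fallback directory succeeds, Tesseract is installed and only the Path entry is missing.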

Experts can also get binaries built with Visual Studio from the build artifacts of the Appveyor Continuous Integration.

Compilation guide for Windows

Using Tesseract

!!! IMPORTANT !!! To use Tesseract in your application (to include Tesseract or to link it into your app), see this very simple example: https://github.com/tesseract-ocr/tesseract/wiki/User-App-Example.

Build the latest library

  1. Download the latest CPPAN (C++ Archive Network https://cppan.org/) client from https://cppan.org/client/.
  2. Run cppan --build pvt.cppan.demo.google.tesseract.tesseract-master.

For a Visual Studio project using Tesseract

  1. Set up Vcpkg, the Visual C++ Package Manager.
  2. Run vcpkg install tesseract:x64-windows for 64-bit. Use --head for the master branch.

Static linking

To build a self-contained tesseract.exe executable (without any DLLs or runtime dependencies), use Vcpkg as above with the following command:

  • vcpkg install tesseract:x64-windows-static for 64-bit
  • vcpkg install tesseract:x86-windows-static for 32-bit

Use --head for the master branch. The resulting executable may still require one DLL for the OpenMP runtime, vcomp140.dll (which you can find in the Visual C++ Redistributable 2015).
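The vcpkg commands above differ only in the triplet and the optional --head flag, so a build script can compose them. This is a minimal sketch: the helper name `vcpkg_install_command` is hypothetical, and the `x86-windows` (dynamic, 32-bit) triplet is a standard vcpkg triplet that the text above does not list explicitly.

```python
def vcpkg_install_command(bits=64, static=False, head=False):
    """Build the argument list for installing Tesseract with vcpkg."""
    if bits not in (32, 64):
        raise ValueError("bits must be 32 or 64")
    arch = "x64" if bits == 64 else "x86"
    # The -static suffix selects the self-contained build.
    triplet = f"{arch}-windows" + ("-static" if static else "")
    command = ["vcpkg", "install", f"tesseract:{triplet}"]
    if head:
        # --head builds from the master branch instead of the released port.
        command.append("--head")
    return command
```

The returned list can be passed to `subprocess.run(..., check=True)` from a directory where vcpkg is available.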

Build training tools

It is now possible to build the full set of Tesseract training tools on Windows with Visual Studio. The latest versions (Win10, VS2015/VS2017) are preferable.

To do this:

  1. Download the latest CPPAN (C++ Archive Network https://cppan.org/) client from https://cppan.org/client/.
  2. Run cppan --build pvt.cppan.demo.google.tesseract-master.

Develop Tesseract

To develop Tesseract itself, follow these steps:

  1. Download and install Git and CMake, and add them to PATH.
  2. Download the latest CPPAN (C++ Archive Network https://cppan.org/) client from https://cppan.org/client/. CPPAN is a source package distribution system. Add the CPPAN client to PATH too. (The VS2015 redistributable is required.)
  3. If you have a release archive, unpack it to the tesseract directory.

If you're using the master branch (4.0), run

git clone https://github.com/tesseract-ocr/tesseract tesseract

  4. Run

    cd tesseract
    cppan
    mkdir build && cd build
    cmake ..

  5. Build the solution (tesseract.sln) in your Visual Studio version. If you want to build and install from the command line (e.g. a Release build), you can use this command:

cmake --build . --config Release --target install

If you want to install to a directory other than C:\Program Files (installing there requires administrator rights), specify the install path during configuration:

cmake .. -G "Visual Studio 15 2017 Win64" -DCMAKE_INSTALL_PREFIX=inst
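For scripted builds, the configure and install commands above can be assembled in one place. The sketch below only builds the argument lists and runs nothing; the function name `cmake_commands` is hypothetical, and the flags are the ones shown in this section.

```python
def cmake_commands(generator=None, install_prefix=None, config="Release"):
    """Return [configure_argv, build_argv] for a build directory below the source root."""
    configure = ["cmake", ".."]
    if generator:
        configure += ["-G", generator]
    if install_prefix:
        # Without this, the install step targets the default location.
        configure.append(f"-DCMAKE_INSTALL_PREFIX={install_prefix}")
    build = ["cmake", "--build", ".", "--config", config, "--target", "install"]
    return [configure, build]
```

Each list can be passed to `subprocess.run(argv, check=True, cwd="build")` after `cppan` has been run in the source root.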

To develop the training tools, clone the repository as described above, then run

cppan --build .

A solution link will appear in the root directory of Tesseract.

Building for x64 platform

If you're building with cppan+cmake, run cmake as follows:

mkdir win64 && cd win64
cppan ..
cmake .. -G "Visual Studio 14 2015 Win64"

If you're building with cppan alone, edit cppan.yml and uncomment this line:

#generator: Visual Studio 14 2015 Win64 -> generator: Visual Studio 14 2015 Win64

Then run cppan --generate . (the trailing dot is the directory argument); it will create a solution link for you.

(For VS2017, use '15 2017' instead of '14 2015'.)
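The generator naming and the cppan.yml edit described above can be sketched as two small helpers. Both function names are hypothetical. The 32-bit form (no `Win64` suffix) is standard CMake behaviour for these generators rather than something stated on this page.

```python
import re

# Visual Studio release year -> version part of the CMake generator name.
_VS_VERSIONS = {2015: "14 2015", 2017: "15 2017"}


def generator_name(vs_year, x64=True):
    """Return the CMake generator string for VS2015 or VS2017."""
    if vs_year not in _VS_VERSIONS:
        raise ValueError(f"unsupported Visual Studio version: {vs_year}")
    suffix = " Win64" if x64 else ""
    return f"Visual Studio {_VS_VERSIONS[vs_year]}{suffix}"


def uncomment_generator(yml_text):
    """Remove the leading '#' from a commented-out 'generator:' line."""
    return re.sub(r"^([ \t]*)#[ \t]*(generator:)", r"\1\2", yml_text, flags=re.MULTILINE)
```

For example, `generator_name(2017)` gives the string used with `cmake .. -G` for a 64-bit VS2017 build.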

3.05

If you have Visual Studio 2015, check out the https://github.com/peirick/VS2015_Tesseract repository, which provides Visual Studio 2015 projects for Tesseract and its dependencies, and click on build_tesseract.bat. After that, you still need to download the language packs.

3.03rc-1

Have a look at the blog post How to build Tesseract 3.03 with Visual Studio 2013.

3.02

For tesseract-ocr 3.02, please follow the instructions in Visual Studio 2008 Developer Notes for Tesseract-OCR.

3.01

Download these packages from the Downloads Archive page on SourceForge:

  • tesseract-3.01.tar.gz - Tesseract source
  • tesseract-3.01-win_vs.zip - Visual Studio (2008 & 2010) solution with necessary libraries
  • tesseract-ocr-3.01.eng.tar.gz - English language file for Tesseract (or download other language training file)

Unpack them to one directory (e.g. tesseract-3.01). Note that tesseract-ocr-3.01.eng.tar.gz names the root directory 'tesseract-ocr' instead of 'tesseract-3.01'.

Windows-relevant files are located in the vs2008 directory (e.g. 'tesseract-3.01\vs2008'). The usual build process applies: open tesseract.sln with VC++ Express 2008 and build all (or just Tesseract). It should compile (at least in release mode) without having to install anything further. The DLL dependencies and Leptonica are included. Output will be in tesseract-3.01\vs2008\bin (or tesseract-3.01\vs2008\bin.rd or tesseract-3.01\vs2008\bin.dbg, depending on the build configuration).