Run on Windows - MaastrichtU-IDS/data2services-pipeline GitHub Wiki
Disclaimer: the pipeline has not been tested on Windows as extensively as on Linux, and Windows is not as stable, so you might encounter some issues. Please report them promptly in the issues, especially if you have found a solution.
We recommend using Git Bash to clone the repository and Windows PowerShell as the terminal (it is easier to use than the basic terminal).
All Windows scripts are in the resources/windows_scripts folder and are designed to be run from that directory.
cd resources/windows_scripts
Install and fix Docker
- Install Docker for Windows. You will need to create an account on Docker Hub.
- Virtualization and Hyper-V must be activated.
  - If they are not enabled, Docker will offer to enable them automatically after the Docker installation.
  - Note that Docker with Hyper-V is not available for Windows 10 Home edition (you will need the Pro or Enterprise edition).
  - If you still have issues with activating virtualization, check here.
- Share a drive: in Docker > Settings > Shared Drives, share drive C (or the one available). You will need to work in this drive, because Docker can only access data in shared drives.
- Firewall detected issue: this is common. Check with your IT department or deactivate your firewall.
- If Docker can't access the internet when building (e.g. wget: unable to resolve host address), you might want to change the DNS to use Google's: go to Docker Settings > Network > DNS Server > Fixed: 8.8.8.8
Clone
Open the Git Bash application and execute the following commands to download the code required to run the pipeline:
# IMPORTANT: disable automatic newline conversion on Windows. Converted newlines cause Apache Drill execution to fail with:
# Standard_init_linux.go:175 exec user process caused no such file
git config --global core.autocrlf false
git clone --recursive https://github.com/MaastrichtU-IDS/data2services-pipeline.git
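To check whether an existing checkout still contains Windows line endings, the following is a minimal sketch, assuming it is run from Git Bash (or any shell with a grep that supports -r, -l, -I and --exclude-dir). The function names has_crlf and list_crlf_files are illustrative and not part of the repository.

```shell
# Hypothetical helpers, not part of the repository.

# Succeed if the file in $1 has at least one line ending in CR LF.
has_crlf() {
  cr=$(printf '\r')          # a literal carriage return
  grep -q "${cr}\$" "$1"     # CR immediately before the end of a line
}

# Print the files under directory $1 (default: .) that contain CRLF endings.
list_crlf_files() {
  cr=$(printf '\r')
  # -r recurse, -l print file names only, -I skip binary files
  grep -rlI --exclude-dir=.git "${cr}\$" "${1:-.}"
}

# Usage, from the root of the cloned repository:
#   git config --get core.autocrlf   # shows the value set above
#   list_crlf_files .                # prints nothing if no file has CRLF endings
```

If files are listed, the clone was probably made before core.autocrlf was set to false. Cloning again after changing the setting is the simplest way to get clean line endings.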
Build
- Download the Apache Drill installation bundle and the GraphDB standalone zip.
  - For GraphDB, register to get an email with the download URL: request the Free version, then choose Download as standalone server (a zip file).
- Put the Apache Drill and GraphDB files in their own folders in the data2services-pipeline git repository (leave them unzipped).
- Build the images:
cd resources/windows_scripts
./build.bat
# Create graphdb and graphdb-import directories in /data
mkdir /data/graphdb
mkdir /data/graphdb-import
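The two mkdir commands above can also be written as a single step that is safe to repeat. This is a minimal sketch, assuming it is run from Git Bash, where C:\data is spelled /c/data. The function name create_data_dirs and its default root are illustrative and not part of the repository.

```shell
# Hypothetical helper, not part of the repository: create the graphdb and
# graphdb-import directories under a data root (default: /c/data, which is
# how Git Bash spells C:\data).
create_data_dirs() {
  # -p creates missing parent directories and does not fail
  # if a directory already exists.
  mkdir -p "${1:-/c/data}/graphdb" "${1:-/c/data}/graphdb-import"
}

# Usage:
#   create_data_dirs            # creates /c/data/graphdb and /c/data/graphdb-import
#   create_data_dirs /d/data    # same, on another shared drive
```

The data root should be on a drive that is shared with Docker, since Docker can only access data in shared drives.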
Run Drill and GraphDB services
In a production environment, both the Apache Drill and GraphDB services are assumed to be present. Use docker
to start them. Other RDF stores should also work, but have not been tested yet.
Be careful: you might need to change the volumes to use the disk location required on Windows, e.g. c:/data:/data:ro
# Start Apache Drill
docker run -dit --rm -p 8047:8047 -p 31010:31010 --name drill -v c:/data:/data:ro apache-drill
# Start GraphDB
docker run -d --rm --name graphdb -p 7200:7200 -v c:/data/graphdb:/opt/graphdb/home -v c:/data/graphdb-import:/root/graphdb-import graphdb
Create a repository named "test" by accessing http://localhost:7200/repository
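Each -v value in the docker run commands above has the form host path, container path and an optional mode (ro means read-only), separated by colons. The following is a minimal sketch of building such a value from a Windows-style path. The function name docker_volume is illustrative and not part of the repository, and the result is only a string to paste into the docker run command.

```shell
# Hypothetical helper, not part of the repository: print a docker -v value
# of the form host:container[:mode], with backslashes in the host path
# turned into forward slashes (C:\data becomes C:/data).
docker_volume() {
  host=$(printf '%s' "$1" | tr '\\' '/')
  if [ -n "${3:-}" ]; then
    printf '%s:%s:%s\n' "$host" "$2" "$3"
  else
    printf '%s:%s\n' "$host" "$2"
  fi
}

# Usage (single quotes keep the backslashes literal):
#   docker_volume 'c:\data' /data ro
#   docker_volume 'c:\data\graphdb' /opt/graphdb/home
```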
Run using Docker command
Check the Run using Docker commands part of the main documentation to run the different parts of the pipeline.
Be careful: you will need to edit the folder paths to point to the path you are using (c:/data by default), and put each command on one line (remove the newlines and the trailing \, as PowerShell doesn't handle them).
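Joining the lines by hand is error-prone for long commands. The following is a minimal sketch, assuming the multi-line command is copied from the main documentation and piped through Git Bash. The function name join_continuations is illustrative and not part of the repository. It only joins lines, so folder paths still need to be edited separately.

```shell
# Hypothetical helper, not part of the repository: read a multi-line command
# on stdin and print it on one line, dropping each trailing backslash and the
# newline that follows it. Leading whitespace on each line is also removed.
join_continuations() {
  awk '
    {
      line = $0
      sub(/\r$/, "", line)                 # tolerate CRLF input
      if (line ~ /\\[ \t]*$/) {            # line ends with a backslash
        sub(/[ \t]*\\[ \t]*$/, "", line)   # drop the backslash and spaces around it
        sub(/^[ \t]+/, "", line)
        buf = buf line " "                 # keep accumulating
      } else {
        sub(/^[ \t]+/, "", line)
        printf "%s%s\n", buf, line         # last line of the command
        buf = ""
      }
    }
    END { if (buf != "") print buf }       # input ended on a continuation
  '
}

# Usage: save the multi-line command in a file, then
#   join_continuations < command.txt
# and paste the single-line result into PowerShell.
```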