NewInstallation - ufal/clarin-dspace GitHub Wiki
There are two ways to install DSpace. The first is in Docker, which is easier and preferred. The second consists of downloading all the necessary software, matching versions, then configuring, compiling, installing and running it.
- Install Docker Desktop
- Docker Compose v2 is required. On Linux it is not always installed by default, so if necessary, install it following the official guide.
All necessary files are in the frontend repository, so first clone the repository from GitHub:
git clone https://github.com/ufal/dspace-angular
cd dspace-angular
- In order to run DSpace in Docker, a .env file with environment variables is necessary in the front-end root folder (dspace-angular/). There are two basic scenarios that require slightly different configurations (localhost and public); an example .env file for each is given below.
- After setting up the .env file, run Docker and create users.
- After the Docker containers are started, don't forget to set up Nginx as specified below, so that DSpace can be accessed from remote hosts.
When running on localhost, the frontend MUST run in development mode. The .env file example is here:
INSTANCE=0
DSPACE_HOST=localhost
DSPACE_VER=dspace-7_x
DSPACE_SSL=false
FE_CMD=yarn start:dev
#please do not edit the following variables unless you know what you are doing
DOCKER_OWNER=ufal
DSPACE_UI_IMAGE=${DOCKER_OWNER}/dspace-angular:$DSPACE_VER
DSPACE_REST_IMAGE=${DOCKER_OWNER}/dspace:$DSPACE_VER
DSPACE_REST_PORT=808${INSTANCE}
UI_PORT=400${INSTANCE}
DSPACE_REST_NAMESPACE=/server
DSPACE_UI_NAMESPACE=/
REST_URL=http://${DSPACE_HOST}:${DSPACE_REST_PORT}${DSPACE_REST_NAMESPACE}
UI_URL=http://${DSPACE_HOST}:${UI_PORT}${DSPACE_UI_NAMESPACE}
Example of .env in the frontend for a public host:
INSTANCE=0
DSPACE_HOST=example.com
DSPACE_VER=dspace-7_x
DSPACE_SSL=true
# If you want to run the front-end in development mode, uncomment the next line
# FE_CMD=yarn start:dev
# NOTE!: The line above is NECESSARY for localhost.
#please do not edit the following variables unless you know what you are doing
DOCKER_OWNER=ufal
DSPACE_UI_IMAGE=${DOCKER_OWNER}/dspace-angular:$DSPACE_VER
DSPACE_REST_IMAGE=${DOCKER_OWNER}/dspace:$DSPACE_VER
DSPACE_REST_PORT=8${INSTANCE}
UI_PORT=8${INSTANCE}
DSPACE_REST_NAMESPACE=/server
DSPACE_UI_NAMESPACE=/
REST_URL=http://${DSPACE_HOST}:${DSPACE_REST_PORT}${DSPACE_REST_NAMESPACE}
UI_URL=http://${DSPACE_HOST}:${UI_PORT}${DSPACE_UI_NAMESPACE}
# If you want to set up JAVA_OPTS
# Server memory limit (4GB)
# JAVA_OPTS=-Xmx4g
You may need to change DSPACE_REST_PORT to something else, e.g. 443. Feel free to leave out the ${INSTANCE} part and just use the port number.
In both versions, it is possible to modify the first section of values. INSTANCE is an arbitrary number that enables several DSpace instances to run on the same machine. Be sure to use different project names (the -p parameter of Docker Compose)! Also check that your machine has sufficient resources (CPU, RAM) for that.
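As a small illustration of how INSTANCE keeps several instances apart, the sketch below prints the ports that the localhost example would derive (808${INSTANCE} and 400${INSTANCE}). The function name instance_ports is hypothetical and only used for this example.

```shell
# instance_ports N: print the REST and UI ports that the localhost
# example .env derives for INSTANCE=N (808N and 400N).
instance_ports() {
    echo "REST=808$1 UI=400$1"
}

# Show the ports for three instances side by side.
for i in 0 1 2; do
    echo "instance $i: $(instance_ports "$i")"
done
```

Each instance still needs its own .env and its own project name (-p), as noted above.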
DSPACE_VER refers to the image tag; most tags are listed in Docker Tags.
If your reverse proxy is on a different machine, add HOST_IP=a.b.c.d to your .env, where a.b.c.d is the IP of the interface that your reverse proxy can reach.
After setting up the .env file, run the commands for starting Docker (you can replace dspace-project-name with something suitable for you):
docker compose --env-file .env -f docker/docker-compose.yml -f docker/docker-compose-rest.yml pull
docker compose --env-file .env -p dspace-project-name -f docker/docker-compose.yml -f docker/docker-compose-rest.yml up -d --no-build
Now you should be able to open $UI_URL (http://localhost:4000/ if you haven't changed it) in your browser. It takes a while before everything starts.
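Because startup takes a while, it can help to poll the UI instead of reloading the browser. The following is a minimal sketch of a generic retry helper; the name retry, the attempt count and the delay are assumptions, and the curl line is only an example for the default localhost URL.

```shell
# retry ATTEMPTS DELAY COMMAND...: run COMMAND until it succeeds,
# at most ATTEMPTS times, sleeping DELAY seconds between tries.
retry() {
    attempts="$1"; delay="$2"; shift 2
    n=1
    while ! "$@"; do
        if [ "$n" -ge "$attempts" ]; then
            return 1   # give up after the last allowed attempt
        fi
        n=$((n + 1))
        sleep "$delay"
    done
    return 0
}

# Example (not run here): poll the UI every 10 s, up to 30 times.
# retry 30 10 curl --silent --fail --output /dev/null http://localhost:4000/
```

The helper returns a non-zero status if the command never succeeds, so it can also be used in scripts that create users only after the UI responds.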
To add an administrator and other users, use the following commands, with exactly the same Docker Compose files and .env as above.
docker compose --env-file .env -p dspace-project-name -f docker/docker-compose.yml -f docker/docker-compose-rest.yml -f docker/cli.yml run --rm dspace-cli create-administrator -e <YOUR_EMAIL> -f firstname -l lastname -p password -c en -o organization
docker compose --env-file .env -p dspace-project-name -f docker/docker-compose.yml -f docker/docker-compose-rest.yml -f docker/cli.yml run --rm dspace-cli user --add -m <YOUR_EMAIL> -g givenname -s surname -l en -p password -o organization
The parameter values can be changed. For create-administrator, -e is the email, -f the first name, -l the last name, -p the password, -c the language and -o the organization. For user --add, -m is the email, -g the given name, -s the surname, -l the language, -p the password and -o the organization. Only use the arguments specified above for each command, and just modify the values if needed.
In the folder with the Docker Compose files (the docker folder in the commands above) it is also possible to have a config.prod.yml file for the front-end and a local.cfg file for the back-end.
https://github.com/dataquest-dev/DSpace/wiki/Custom-namespace
The main rule is just to be careful.
When a volume is mounted on another disk, Docker does not allow the volume to be removed. Instead, an error is displayed: Error response from daemon: remove <volume-name>: Unable to remove a directory outside of the local volume root /var/lib/docker: /<path-to-docker-storage>/volumes/test/_data. This can be used as another layer of protection for volumes by placing them on another disk (which is sometimes necessary anyway because of data size). It can be done simply by symlinking /var/lib/docker/volumes to a chosen place on another disk. Be sure to test it before relying on it.
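The symlink approach could be sketched as follows. This is an illustration only: the function name relocate_with_symlink and the target path in the example are hypothetical, and it assumes Docker is stopped while the directory is moved.

```shell
# relocate_with_symlink SRC DST: move directory SRC to DST (an absolute
# path) and leave a symbolic link at SRC pointing to the new location.
relocate_with_symlink() {
    src="$1"; dst="$2"
    # Refuse to act on anything that is not a real directory.
    [ -d "$src" ] && [ ! -L "$src" ] || { echo "not a plain directory: $src" >&2; return 1; }
    # Refuse to overwrite an existing target.
    [ ! -e "$dst" ] || { echo "target exists: $dst" >&2; return 1; }
    mv "$src" "$dst" && ln -s "$dst" "$src"
}

# Example (not run here; stop Docker first, the target path is an assumption):
# relocate_with_symlink /var/lib/docker/volumes /mnt/data/docker-volumes
```

Trying the function on a scratch directory first is a cheap way to follow the advice above about testing before relying on it.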
There are original installation instructions from vanilla DSpace. However, they are quite long and extensive and some parts are not necessary. They also list several possible versions, so here is a shortened list. Consult the original instructions if anything is unclear.
Make sure you know and are able to access the installed/extracted software.
- Java JDK, version 17.0.7
- maven, version 3.8.1
- ant, version 1.10.15
- postgres, version 13.11
- solr, version 8.11.4
- tomcat, https://dlcdn.apache.org/tomcat/tomcat-9/
- git
- Node.js, version 16.20.0
- yarn, install after node with command
npm install --global yarn
- We had issues when the user's home folder contained accented characters (e.g. C:/Users/Tomáš Marný), so we recommend having a home folder that does not contain accents.
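Before starting the manual installation, a quick check that the listed tools are reachable can save time. The sketch below only tests whether commands are on PATH, not their versions; the command names (java, mvn, ant, psql, solr, git, node, yarn) are assumptions about how the tools are exposed on your system.

```shell
# missing_tools NAME...: print each NAME that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Report which of the prerequisites cannot be found.
missing=$(missing_tools java mvn ant psql solr git node yarn)
if [ -n "$missing" ]; then
    echo "Not found on PATH:" $missing
else
    echo "All listed tools were found on PATH."
fi
```

Tools that were only extracted to a folder (for example Solr or Tomcat) may legitimately be reported as missing; in that case make sure you know their location instead.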
- create a database
  - go to the database installation folder and run:
createuser --username=postgres --no-superuser --pwprompt dspace
createdb --username=postgres --owner=dspace --encoding=UNICODE dspace
psql --username=postgres -c "CREATE EXTENSION pgcrypto;" dspace
- download DSpace sources (this repo)
- edit configuration in dspace/config/clarin-dspace.cfg (and other configs)
- run mvn clean install in the repo root
- (go to <dspace-repo>/dspace/target/dspace-installer)
- run ant fresh_install in <dspace-repo>/dspace/target/dspace-installer
  - the above command creates the DSpace installation in a new folder. By default, it is C:/dspace or /dspace.
  - locate it and make sure this command created it.
  - from now on, we will refer to it as <dspace-installation-folder>
- (go to the DSpace installation folder)
- run bin/dspace database migrate force in <dspace-installation-folder>
- create an admin with bin/dspace create-administrator in <dspace-installation-folder>
- copy everything from webapps/* to <tomcat>\webapps
- copy solr cores: cp -R [dspace]/solr/* [solr]/server/solr/configsets
- download frontend sources
- run yarn install
- make sure your database is running (it should start automatically)
- (go to frontend sources)
- run yarn start in <frontend-source>
- start solr with solr start
- start tomcat with catalina run
The .env file can contain the following additional variables to configure S3:
S3_STORAGE=1
S3_ENABLED=true
S3_RELATIVE_PATH=false
S3_BUCKET=docker-dummy-bucket
S3_SUBFOLDER=
S3_ACCESS=myaccestoken
S3_SECRET=mysecretpasswordtoken
S3_REGION_NAME=us-east-1
This should be valid since version 7.5. The first two must remain as is, in order to enable S3. The rest can (should) be modified.
The whole server block should look like this:
server {
    listen 80;
    server_name dspace.url;
    location / {
        proxy_pass http://localhost:4000;
    }
    location /server/ {
        proxy_set_header Host $http_host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_pass http://localhost:8080;
    }
}
This assumes the following:
- DSpace is run in Docker
- The back-end runs on port 8080
- The front-end runs on port 4000
- The settings in .env or the config state the following addresses:
  - DSPACE_UI_URL: dspace.url
  - DSPACE_REST_URL: dspace.url/server/
Of course, if some ports are different, change them in the configuration.
TODO: document necessary headers (such as X-Forwarded-Proto and X-Forwarded-Port) and ref https://github.com/dataquest-dev/DSpace/issues/536
Returning just the cmdi metadata must be ensured in Clarin installations. Add this to the location / block from above.
# placed in location block of DSpace frontend
# redirect .../handle/123456/123456?format=cmdi to .../cmdi/oai-metadata... which returns just XML file with metadata
# ? at the end of the redirect stops nginx from appending original parameters
if ($query_string ~* "format=cmdi"){
rewrite ^/(.*)handle/(.*)$ http://$http_host/server/cmdi/oai-metadata?metadataPrefix=cmdi&handle=$2? redirect;
}
# if HTTP request to .../handle/123456/123456 contains header "Accept: application/x-cmdi+xml" or similar, redirect
# to the same as above.
# http_*name*of*header* returns any header, in this case Accept:
if ($http_accept ~ "(.*cmdi.xml*)"){
rewrite ^/(.*)handle/(.*)$ http://$http_host/server/cmdi/oai-metadata?metadataPrefix=cmdi&handle=$2? redirect;
}
Alternatively, assuming the front-end and back-end are behind the same host (proxy), you can use:
# CMDI content - # replace repository-ng with your path prefix, or tweak the regexp as above
if ($arg_format ~* "cmdi"){
rewrite ^/repository-ng/handle/(.*)$ /repository-ng/server/cmdi/oai-metadata?metadataPrefix=cmdi&handle=$1? last;
}
if ($http_accept = "application/x-cmdi+xml"){
rewrite ^/repository-ng/handle/(.*)$ /repository-ng/server/cmdi/oai-metadata?metadataPrefix=cmdi&handle=$1? last;
}
# /CMDI content
To check the first part, use a command like
curl -k -L "https://dspacehost.com/handle/1234/56789?format=cmdi"
To check the second part, use a command like
curl -k -L -H "Accept: application/x-cmdi+xml" https://dspacehost.com/handle/1234/56789
start from https://github.com/ufal/clarin-dspace/issues/1032#issuecomment-2066469795
Create a wrapper script run-cli-command-88.sh that contains: sudo docker exec -w /dspace/bin dspace8 ./dspace "$@"
chmod +x /path/to/run-cli-command-88.sh
0 23 * * * cd /app && ./run-cli-command-88.sh oai import
20 0 * * * cd /app && ./run-cli-command-88.sh index-discovery
1 3 * * * cd /app && ./run-cli-command-88.sh subscription-send -f D
2 3 * * 0 cd /app && ./run-cli-command-88.sh subscription-send -f W
3 3 1 * * cd /app && ./run-cli-command-88.sh subscription-send -f M
0 4 1 * * cd /app && ./run-cli-command-88.sh cleanup
30 0 * * * cd /app && ./run-cli-command-88.sh health-report -e <YOUR_EMAIL>
or
cat /etc/cron.d/lindatrepo
MAILTO=root
RUNCMD="docker compose -p lindatrepo exec dspace /dspace/bin/dspace"
0 23 * * * root $RUNCMD oai import
20 0 * * * root $RUNCMD index-discovery
1 3 * * * root $RUNCMD subscription-send -f D
2 3 * * 0 root $RUNCMD subscription-send -f W
3 3 1 * * root $RUNCMD subscription-send -f M
0 4 1 * * root $RUNCMD cleanup -v
30 0 * * * root $RUNCMD health-report -e <YOUR_EMAIL>
Avoid running any /dspace/bin/dspace commands around midnight. That is when log rotation happens, and we have seen logs lost (probably due to multiple log rotations).
There is a concern that OAI might be unstable. After adding (or harvesting) many items, please check that OAI shows them; they should be visible in the OAI interface. If something is wrong, either an empty page or an error number with a short description will be shown. In that case, check the apache-tomcat logs in the folders /dspace/log and tomcat/logs (in Docker it is usually /usr/local/tomcat/logs).
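The OAI check above can also be done from the command line. The sketch below builds an OAI-PMH request URL; it assumes the stock DSpace 7 location of the OAI endpoint (/oai/request under the REST URL), which may differ in your installation, and the function name oai_request_url is hypothetical.

```shell
# oai_request_url REST_URL VERB [EXTRA]: build an OAI-PMH request URL,
# assuming the endpoint lives at REST_URL/oai/request.
oai_request_url() {
    url="${1%/}/oai/request?verb=$2"   # strip one trailing slash from REST_URL
    if [ -n "${3:-}" ]; then
        url="$url&$3"                  # append extra query parameters, if any
    fi
    echo "$url"
}

# Example (not run here): list identifiers and look at the response.
# curl --silent "$(oai_request_url http://localhost:8080/server ListIdentifiers metadataPrefix=oai_dc)"
```

An empty or error response from such a request is the command-line counterpart of the empty page described above.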