Set up the Virtual Storage Element

Here you can find instructions on how to set up the VSE on a CentOS 6 server. Details may vary slightly on other Linux distributions. Please note that this is a preliminary version of the VSE, with only basic features implemented.

Introduction

The idea of a VSE was introduced to solve the problem of data replication in the context of local analysis facilities in HEP. In particular, we refer to the dynamically expanding Virtual Analysis Facility (VAF) for the ALICE experiment. In the current set-up, fast data access for interactive analysis is achieved by copying the desired data-sets to a dedicated local storage. This solution is not optimal for two reasons:

  • data might already be present on the local storage element (SE). However, in the current ALICE data model files are stored on the SE under their physical file names (PFNs), which are not suited for direct access: a query to the AliEn catalogue is needed to resolve logical file names (LFNs) into PFNs, and this look-up is the most time-consuming step during data access. It can be avoided by caching the catalogue queries: bypassing the AliEn catalogue reduces the analysis duration by about a factor of 3.
  • storage resources are dedicated exclusively to the analysis facility. By registering a third replica of the data-set in the AliEn catalogue, instead of merely copying the files, the data could also be exploited by grid jobs running at the same site.

The VSE is a filesystem of links between LFNs and PFNs, which caches the responses of the queries to the AliEn catalogue. The contents of the local SE should be made available on the VSE as POSIX paths: this is possible by mounting the SE distributed filesystem directly on the VSE (other solutions might be envisaged in the case of a native xrootd SE). The interactive analysis accesses data through the VSE, for example via an xrootd or HTTP interface. The VSE should also run a web server for browsing the link filesystem.

In this scenario the workflow would be:

  • user needs data
  • user browses the VSE web interface to check available data (authentication with grid certificate, authorisation with ALICE LDAP)
  • if the needed data-set is not present, the user makes a request to the VSE administrator
  • the administrator stages the data using ROOT and the afdsmgrd stager with a custom copy script:
    • executes alien_mirror (registers a 3rd replica in the catalogue) → AliEn aliprod privileges needed
    • creates symbolic links, e.g. /vse/alice/cern.ch/.../AliAOD.root → /glfs/alice-xrd/xrootd//09/43427/a35d75fe-e02e-11e3-bde3-9bcb3904c079 (see the sketch after this list)
  • the syncVSEd daemon periodically checks the consistency of the VSE with the SE:
    • uses inotify to detect created/removed links and checks their consistency with the AliEn catalogue (i.e. re-creates links if they are accidentally removed)
    • checks for dangling links and removes them if the file has been removed from the catalogue
    • no automatic download/removal of data-sets (it acts only on links); it only emits error messages
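
For illustration, the link-creation step of the staging procedure boils down to something like the sketch below. The LFN/PFN values are placeholders taken from the example above; the real logic lives in the custom copy script:

# sketch of the link-creation step; the LFN/PFN values are placeholders
PFN="/glfs/alice-xrd/xrootd//09/43427/a35d75fe-e02e-11e3-bde3-9bcb3904c079"  # file on the mounted SE filesystem
LFN="/vse/alice/cern.ch/<...>/AliAOD.root"                                   # logical path on the link filesystem
mkdir -p "$(dirname "$LFN")"   # create the LFN directory tree
ln -sf "$PFN" "$LFN"           # link the logical name to the physical file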

Below is the VSE set-up at the INFN Torino computing centre:

[Image: schema of the VSE set-up at the INFN Torino computing centre]

Install CVMFS

As a first step we install cvmfs in order to access the needed software (ROOT and afdsmgrd) without having to install it locally. Detailed instructions are available elsewhere, e.g. in Install CVMFS (grid certificate required). Briefly, the procedure is the following:

  • install the CAs
cd /etc/yum.repos.d 
wget http://repo-pd.italiangrid.it/mrepo/repos/egi-trustanchors.repo
yum -y install ca-policy-egi-core
  • download cvmfs and cvmfs-keys packages
wget --no-check-certificate -O /tmp/cvmfs.rpm https://ecsft.cern.ch/dist/cvmfs/cvmfs-2.1.19/cvmfs-2.1.19-1.el6.x86_64.rpm
wget --no-check-certificate -O /tmp/cvmfs-keys.rpm https://ecsft.cern.ch/dist/cvmfs/cvmfs-keys/cvmfs-keys-1.4-1.noarch.rpm
  • install and set up cvmfs
yum --nogpgcheck -y localinstall /tmp/cvmfs.rpm /tmp/cvmfs-keys.rpm
cvmfs_config setup
service autofs start
chkconfig autofs on
cvmfs_config chksetup
  • write the configuration in /etc/cvmfs/default.local (example below)
CVMFS_HTTP_PROXY=http://t2-squid-01.to.infn.it:3128
CVMFS_CACHE_BASE=/var/lib/cvmfs
CVMFS_STRICT_MOUNT=yes
CVMFS_REPOSITORIES=alice.cern.ch
CVMFS_QUOTA_LIMIT=18000
  • load configuration and probe service
cvmfs_config reload
cvmfs_config probe

The ALICE software should now be available in /cvmfs/alice.cern.ch.
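
A quick sanity check (the repository is mounted on first access via autofs):

# list the repository content and show its mount/cache status
ls /cvmfs/alice.cern.ch
cvmfs_config stat alice.cern.ch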

Set up the ALICE environment

To set up the ALICE environment you can use an env.sh script like this:

#!/bin/bash

# load the ALICE environment and ROOT from CVMFS
source /cvmfs/alice.cern.ch/etc/login.sh
eval $( alienv printenv VO_ALICE@ROOT::v5-34-08-6 )

# recreate the PoD user configuration: point the PROOF dataset source to the VSE
rm -rf $HOME/.PoD
mkdir -p $HOME/.PoD
cat > $HOME/.PoD/user_xpd.cf0 <<_EOF_
xpd.datasetsrc alien cache:/opt/vse/var/proof/datasets-cache urltemplate:/vse cacheexpiresecs:604800
xpd.stagereqrepo dir:/opt/vse/var/proof/datasets
_EOF_

# PoD (PROOF on Demand) and the Boost version it requires, taken from the SFT CVMFS repository
export LD_LIBRARY_PATH="/cvmfs/sft.cern.ch/lcg/external/Boost/1.53.0_python2.4/x86_64-slc5-gcc41-opt/lib:$LD_LIBRARY_PATH"
source /cvmfs/sft.cern.ch/lcg/external/PoD/3.12/x86_64-slc5-gcc41-python24-boost1.53/PoD_env.sh

# get an AliEn token (replace the username with your own)
alien-token-init svallero

pod-server stop
pod-server start

# open a master-only PROOF session via PoD
T="/tmp/stage-pod-$UID.C"
cat > $T <<_EOF_
{
  TProof::Open("pod://", "masteronly");
}
_EOF_

root -l -b "$T"

rm -f $T
pod-server stop

The script should be executed, not sourced. Remember to insert your own username after the alien-token-init command. Moreover, the private key of the user certificate must not be encrypted (see openssl rsa --help, and make sure that the decrypted file is suitably protected); this is needed when running the syncVSE daemon.
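
If your key is encrypted, a decrypted copy can be produced along these lines; the file names assume the usual ~/.globus layout and are only an example:

# write a decrypted copy of the private key and restrict its permissions
openssl rsa -in ~/.globus/userkey.pem -out ~/.globus/userkey_plain.pem
chmod 400 ~/.globus/userkey_plain.pem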

Run the VAF data-stager (afdsmgrd)

The VAF data stager afdsmgrd can be found here. We use the version available in cvmfs. First create the directory structure:

mkdir -p /opt/vse/var/proof/datasets
mkdir /vse

Then clone this repository and move to the afdsmgrd directory:


git clone https://github.com/PRIN-STOA-LHC/VirtualStorageElement.git
cd VirtualStorageElement/afdsmgrd

The files you might want to modify are:

  • config/afdsmgrd.conf: this is the main configuration file (an illustrative excerpt is given after this list). You have to modify:
    • PMASTER (the FQDN of your VSE)
    • xpd.stagereqrepo (where staging requests are stored, you might want to change the default value)
    • dsmgrd.urlregex (/vse is the default value for the link filesystem base path, you might want to change it)
    • dsmgrd.stagecmd (path to the copy command, use the script af-xrddm-verify.sh to copy files without registering a third replica)
  • etc/vserc: sets some variables for the copy script. This file is needed by the af-mirror-verify.sh script. If you use the af-xrddm-verify.sh script instead, have a look at the aafrc file.
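
For orientation, a minimal excerpt of afdsmgrd.conf might look like the following. The regex/substitution syntax and the script path here are assumptions, to be checked against the config/afdsmgrd.conf shipped with the repository:

# illustrative excerpt -- adapt the values from the file in the repository
# where staging requests are stored
xpd.stagereqrepo dir:/opt/vse/var/proof/datasets
# rewrite alien:// URLs into paths under the /vse link filesystem (substitution syntax is an assumption)
dsmgrd.urlregex alien://(.*)$ /vse$1
# copy/verify command run for every file to stage
dsmgrd.stagecmd /path/to/af-mirror-verify.sh "$URLTOSTAGE" "$TREENAME"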

To run the daemon, launch the script launch_afdsmgrd.sh; you may wish to adjust some paths according to your set-up (take a look at the script). The run and log directories are self-explanatory ;)

Data-sets can now be staged with the standard procedure (using the env.sh script above); see Staging data on the VAF.
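
As a quick illustration, once env.sh has opened the master-only PROOF session you can issue a staging request directly from the ROOT prompt. This is only a sketch: the dataset query string below is an example of the ALICE "Find;..." syntax and must be adapted to the data you actually need, provided the dataset staging calls are available in your ROOT version:

// issue a staging request and check its status from the ROOT prompt
gProof->RequestStagingDataSet("Find;BasePath=/alice/data/2010/LHC10h/000139510/ESDs/pass2/AOD160;FileName=AliAOD.root;Tree=/aodTree");
gProof->ShowStagingStatusDataSet("Find;BasePath=/alice/data/2010/LHC10h/000139510/ESDs/pass2/AOD160;FileName=AliAOD.root;Tree=/aodTree");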

Run the syncVSE daemon

To run the syncVSE daemon use the syncVSEd init script in this repository (you should have cloned it already). Change the configuration file syncVSEd.cfg according to your set-up. Before running, you must install some additional Python modules: you will certainly need python-daemon and pyinotify; check the header of syncVSEd.py to see which modules are imported. To install a new Python module do, for example:

yum -y install python-pip
pip install python-daemon pyinotify
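
A quick check that the required modules are importable (the module names are assumed from the two packages above; check the imports in syncVSEd.py):

# should exit silently if both modules are installed
python -c "import daemon, pyinotify"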

Finally start the daemon:

./syncVSEd start

and check its status:

./syncVSEd status
./syncVSEd log

Remember that the private key of the user certificate must not be encrypted!

Configure the web server

The benefits of a web server are twofold:

  • data access: data can be accessed via the HTTP protocol. In this case the /vse directory should be exported and protected so that connections are allowed only from the worker-nodes subnet. In your PoD configuration you should specify the protocol accordingly; for instance, on the VAF master you should set:
export VafDataSetStorage="http://alice-vse.to.infn.it//vse"

either globally in:

/etc/vaf/alice/remote.before

or per user in:

$HOME/.PoD/user_xpd.cf0

Mind that the latter file is overwritten each time the vaf-enter command is executed.

  • browsing the VSE: in this case the /vse folder should be linked to another exported directory, accessible from the outside world and with some authentication/authorization module enabled.

For simplicity we installed the Apache httpd server:

yum -y install httpd
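
On CentOS 6 the SSL module providing /etc/httpd/conf.d/ssl.conf is packaged separately, so you will most likely also need:

yum -y install mod_ssl
# start the web server and enable it at boot
service httpd start
chkconfig httpd on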

Concerning authentication, we use the standard SSL module (mod_ssl), configured in the following way in /etc/httpd/conf.d/ssl.conf:

SSLCACertificatePath /etc/grid-security/certificates
<Directory /var/www/html/vse>

  # Require SSL auth
  SSLVerifyClient require

  # Set envvars for CGI scripts to some small data
  # plus the whole encoded certificate
  SSLOptions +StdEnvVars +ExportCertData +FakeBasicAuth

  # Allow custom .htaccess
  AllowOverride all

</Directory>

We would like to authorize all active ALICE users based on a look-up of the certificate subject on the ALICE LDAP. Due to the way the ALICE data-base is formatted, it is not straightforward to use existing Apache authorization modules for this. Work is in progress.
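
In the meantime, since +FakeBasicAuth passes the certificate subject to the authorization layer as a Basic-Auth user name (with the fixed password "password"), a per-user, file-based authorization can be sketched in a .htaccess under the exported directory. The file path below is only an example:

# hypothetical /var/www/html/vse/.htaccess -- interim per-user authorization via FakeBasicAuth
AuthType Basic
AuthName "ALICE VSE"
AuthUserFile /etc/httpd/vse-users.htpasswd
Require valid-user

Each line of the AuthUserFile then contains a full certificate subject followed by the encrypted form of the word "password" (xxj31ZMTZzkVA), as described in the mod_ssl documentation for FakeBasicAuth.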

Caveats

  • in order to execute the alien_mirror command, the user must have aliprod rights. This is a privileged user that can write the data-set metadata, but can also remove complete data-sets... so it is unlikely that the VSE administrators will be given such rights. Therefore, at present the script af-mirror-verify.sh cannot be used. Instead, one could write another custom script that links on the VSE the files already present in the local SE and copies the missing files without registering them (TODO).
  • xrootd is the standard data access protocol for ALICE jobs. In order to use it, one should install an xrootd client on the VSE (instructions are not given here) and change the PoD directive accordingly:
export VafDataSetStorage="root://alice-vse.to.infn.it//vse"