Setting up Declad - fermitools/declad GitHub Wiki

Setting up Declad

Overview

To run a declad instance, you will need:

  • an account to run the service under
  • the software and dependencies
  • a suitable credential (like an x509 service/host certificate, and/or a refreshed SciToken via the Managed Token Service) for authentication
  • a working rucio installation with a suitable client configuration file
  • a working metacat installation with suitable client configuration
  • a cron job or similar to refresh metacat tokens
  • a declad config file

This document describes each of these in more detail.

An account

You will probably run this service under a dedicated account, and keep its configuration in a directory under that account's home directory.

Here at Fermilab, for an experiment named "hypot", we tend to use either the "hypotpro" or the "hypotraw" account.

From an account that can sudo, run: sudo useradd -u userid accountname

and set up suitable login permissions (.k5login here at Fermilab, .ssh/authorized_keys as appropriate elsewhere).

[You can find the userid onsite at Fermilab by using wget or curl to fetch http://www-giduid.fnal.gov/cd/FUE/uidgid/uid.lis and looking for the account name (listed in upper case), or by looking on any of the central gpvm cluster machines for that experiment.]

A credential

Currently Declad is set up to use a host/service x509 certificate for all its authentication needs, and it assumes that some external script refreshes the metacat/data-dispatcher token from that credential. We hope to also allow SciTokens in future. If you have a service that refreshes SciTokens for other services (e.g. our Managed Token Service), you could use it here; SciTokens are the easiest way to authenticate the file transfers, as the fileservers need no extra configuration to accept that credential.
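Since everything hangs off that certificate, it is worth ruling out an expired (or nearly expired) certificate before debugging anything else. A minimal sketch, assuming openssl is installed; the function name cert_valid_for and the one-week threshold are illustrative only, and the certificate path is the placeholder used in the rucio config on this page:

```shell
#!/bin/sh
# cert_valid_for CERT SECONDS
# Succeeds if the PEM certificate CERT will still be valid SECONDS from now.
# Fails if it will have expired by then, or if CERT cannot be read.
cert_valid_for() {
    openssl x509 -checkend "$2" -noout -in "$1" >/dev/null 2>&1
}

# Hypothetical usage (604800 seconds = one week):
#   cert_valid_for "$HOME/certs/xyz-declad-cert.pem" 604800 ||
#       echo "declad certificate expires within a week" >&2
```

Something like this could sit at the top of the refresh script described further down, so that an expiring certificate shows up in the logs before the transfers start failing.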

Installing the software

At the moment, the preferred way to install declad is with Spack and the recipes in the fermi SCD_Recipes repository. However, the dependencies are all Python, so one could instead install them all in a virtualenv, or with pip install --user.

Installing with Spack

wget https://github.com/FNALssi/fermi-spack-tools/raw/v2_21_0/bin/bootstrap
sh bootstrap $HOME/packages
. $HOME/packages/setup-env.sh
spack buildcache list -al declad
spack install --cache-only declad/hash-from-above

If there is an error with the gcc compiler version, edit the $SPACK_ROOT/etc/spack/compilers.yaml file, duplicate the gcc-11.4.1 section, and change one of the sections to version 11.3.1.

If there is an error with the gpg key, do spack buildcache keys --install --trust --force.

For other installation options, see the bottom of this article.

Create a "custom" directory

Make a "custom" directory wherever you keep such things (these examples use $HOME), and copy custom/dune.py from the spack package into it, like:

  mkdir -p $HOME/custom
  cp $(spack location -i declad)/custom/dune.py $HOME/custom/dune.py
  ln -s $HOME/custom/dune.py $HOME/custom/__init__.py

Create a rucio config

mkdir -p $HOME/rucio_config/etc
vi $HOME/rucio_config/etc/rucio.cfg
[client]
rucio_host = https://xyz-rucio.fnal.gov
auth_host = https://xyz-rucio.fnal.gov

ca_cert = /etc/grid-security/certificates
account = xyzpro
auth_type = x509
client_cert = /home/xyzpro/certs/xyz-declad-cert.pem 
client_key  = /home/xyzpro/certs/xyz-declad-key.pem

This of course needs to be edited for your experiment's rucio service and home directory.
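A mistyped certificate path in rucio.cfg tends to surface later as a confusing authentication error, so it can help to check the file right after editing it. The following is a rough sketch, not part of declad or rucio; cfg_value and check_rucio_cfg are hypothetical helper names, and the parsing only handles simple "key = value" lines like the ones above:

```shell
#!/bin/sh
# cfg_value FILE KEY: print the value of the first "KEY = value" line in an
# ini-style FILE, with surrounding whitespace removed.
cfg_value() {
    sed -n "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*//p" "$1" |
        sed 's/[[:space:]]*$//' | head -n 1
}

# check_rucio_cfg FILE: fail if the client cert or key named in FILE
# is not set or is not readable by the current user.
check_rucio_cfg() {
    status=0
    for key in client_cert client_key; do
        path=$(cfg_value "$1" "$key")
        if [ -z "$path" ] || [ ! -r "$path" ]; then
            echo "rucio.cfg: $key is missing or unreadable: $path" >&2
            status=1
        fi
    done
    return $status
}

# Hypothetical usage:
#   check_rucio_cfg "$HOME/rucio_config/etc/rucio.cfg" && echo "rucio.cfg looks sane"
```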

Create a start.sh

Now you want a start script to start your service; it needs to access your spack area or virtualenv to find the software, and set the environment variables to access the rucio and metacat instances, something like:

#!/bin/sh

# find the software
# . /path/to/virtualenv/activate
#
# or
. $HOME/packages/setup-env.sh
dver=2.0.4
spack load declad@$dver

# config for MetaCat and Rucio
export METACAT_SERVER_URL=https://metacat.server:port/instance/app
export RUCIO_HOME=$HOME/rucio_config

# find our $HOME/bin executables and $HOME/custom python files first
export PYTHONPATH=$HOME:$PYTHONPATH
export PATH=$HOME/bin:$PATH

# run from $HOME so the relative config and log paths resolve
cd "$HOME" || exit 1
nohup declad.py -dc declad_config.yaml < /dev/null > logs/nohup.out 2>&1 &
echo $! > logs/declad.pid

Of course, for "dver" above, use the version number for declad that you installed earlier.

Also create the log directory:

mkdir $HOME/logs

so we have a place for all this output. You may prefer to make it a symlink to some scratch partition where you have more room for logs.

For symmetry, you may also want a "stop.sh", like:

#!/bin/sh

if [ -r logs/declad.pid ]
then
    kill "$(cat logs/declad.pid)"
    rm logs/declad.pid
fi

But it isn't strictly necessary.
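The pid file written by start.sh also makes it easy to ask whether the daemon is still alive. A minimal sketch, assuming the logs/declad.pid layout used above; declad_running is a hypothetical helper name, not something declad ships:

```shell
#!/bin/sh
# declad_running PIDFILE: succeed if PIDFILE exists and names a live process
# that the current user is allowed to signal.
declad_running() {
    [ -r "$1" ] || return 1
    pid=$(cat "$1")
    # kill -0 sends no signal; it only tests that the process exists
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Hypothetical usage from $HOME, e.g. in a periodic cron entry:
#   declad_running logs/declad.pid || ./start.sh
```

Note that this only checks that some process has that pid; after a reboot or a long outage the number could have been reused by an unrelated process.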

Cron job to refresh metacat authentication

Create a script, metacat_refresh.sh:

#!/bin/sh

# if you're using a proxy from that cert to authenticate file transfers, refresh it:
grid-proxy-init -cert $HOME/certs/xyz-declad-cert.pem -key $HOME/certs/xyz-declad-key.pem

# if you're using a managed token to authenticate file transfers, refresh that
export HTGETTOKENOPTS="--credkey=xyzpro/managedtokens/fifeutilgpvm01.fnal.gov"
htgettoken -i xyz -r production -a htvaultprod.fnal.gov

# now refresh your metacat login token, either using your cert, or your managed token.

export METACAT_SERVER_URL=https://metacat.host:port/xyz_meta_prod/app
export METACAT_AUTH_SERVER_URL=https://metacat.host:authport/auth/xyz
metacat auth login -m x509 -c $HOME/certs/xyz-declad-cert.pem -k $HOME/certs/xyz-declad-key.pem xyzpro

Then add a cron entry for the refresh, and probably an entry to start the service at boot:

crontab -e
0 * * * * /path/to/metacat_refresh.sh > logs/refresh.out 2>&1
@reboot /path/to/metacat_refresh.sh > logs/refresh.out 2>&1 ; /path/to/start.sh > logs/start.out 2>&1

And finally, if you're using token authentication with older versions of xrdcp for file transfers, you'll need some wrappers in $HOME/bin to set BEARER_TOKEN to the current token. They look like:

bin/xrdcp:

#!/bin/sh

# run xrdcp, but with token authentication...

uid=$(id -u)
export BEARER_TOKEN=$(cat "${BEARER_TOKEN_FILE:-/var/run/user/$uid/bt_u$uid}")

/usr/bin/xrdcp "$@"

and similarly for xrdfs. With $HOME/bin first in your PATH (as set in the start script), declad will run these wrappers, which set BEARER_TOKEN to the latest token contents each time they are run.
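If you would rather not keep two nearly identical files, one wrapper can serve both tools by looking at the name it was invoked as. This is only a sketch: it assumes the real binaries are /usr/bin/xrdcp and /usr/bin/xrdfs (the second path is a guess by analogy with the wrapper above), and token_file is a hypothetical helper name:

```shell
#!/bin/sh
# Shared token wrapper: save as $HOME/bin/xrdcp, then
#   ln -s xrdcp $HOME/bin/xrdfs

# token_file: print the path of the current bearer token file.
token_file() {
    uid=$(id -u)
    echo "${BEARER_TOKEN_FILE:-/var/run/user/$uid/bt_u$uid}"
}

# Dispatch on the name we were invoked as, so one file serves both tools.
tool=${0##*/}
case "$tool" in
    xrdcp|xrdfs)
        # re-read the token on every call so a refreshed token is picked up
        BEARER_TOKEN=$(cat "$(token_file)") || exit 1
        export BEARER_TOKEN
        exec "/usr/bin/$tool" "$@"
        ;;
esac
```

When the file is run under any other name, the case statement does nothing, which also makes the helper easy to try out on its own.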

Test authentication setup

Now test the authentication setup:

  • run the refresh script
  • use the METACAT_SERVER_URL and RUCIO_HOME values in your start script to test:
 
. $HOME/packages/setup-env.sh
spack load declad@2.0.4
METACAT_SERVER_URL=value_from_start.sh  metacat auth whoami
RUCIO_HOME=value_from_start.sh rucio whoami

(In the "spack load", use the version of declad you installed.) Both commands should give back your experiment's production account name.
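If you want this check to be scriptable rather than eyeballed, the two whoami commands can be wrapped in a small helper that looks for the account name in their output. A sketch only; expect_account is a hypothetical name, xyzpro is the placeholder account used on this page, and the exact output format of the two commands is not assumed beyond "it mentions the account name":

```shell
#!/bin/sh
# expect_account NAME COMMAND...: run COMMAND and succeed only if its
# output (stdout or stderr) contains NAME as a whole word.
expect_account() {
    name=$1
    shift
    "$@" 2>&1 | grep -qw -- "$name"
}

# Hypothetical usage, after loading declad and exporting METACAT_SERVER_URL
# and RUCIO_HOME with the values from start.sh:
#   expect_account xyzpro metacat auth whoami || echo "metacat login failed" >&2
#   expect_account xyzpro rucio whoami        || echo "rucio login failed" >&2
```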

Configure declad_config.yaml

We will use vi (or your favorite editor) to create a declad_config.yaml in the $HOME directory.

Note that templates like "rel_path_pattern" in the config take metadata field names from the converted MetaCat metadata, not the SAM metadata field names.

Example contents below:

For an xrootd remote dropbox

# Format of Declad config file as concluded from perusing sources.

debug_enabled: true              # debug messages in the log?
default_category: "migrated"     # default metadata category for unexpected uncategorized metadata attrs
destination_root_path: "/a/b/c"  # path part of $dst_url for templates, below also $path in same
destination_server: "host:port"  # host part of $dst_url for templates, below
source_root_path: "/x/y"         # path part of $src_url for templates, below
source_server: "host:port"       # host part of $src_url for templates, below, use localhost for local dropbox

copy_command_template:  "xrdcp $src_url $dst_url"                     # copy command 
download_command_template: "xrdcp xrootd:$server$src_path $dst_path"  # metadata file download with $server $src_path, $dst_path
delete_command_template: "xrdfs $server rm $path"                     # clean files out of Dropbox, with $server and $path
quarrantine_location: /tmp/quarantine                                 # location for files / metadata that don't match, etc.

create_dirs_command_template:   "xrdfs $server mkdir -p $path"

# uncomment these and change if you don't want these default values
#history_db: history.sqlite                         # file to keep history
#interval: 30                                       # scan interval
#timeout: 30                                        # timeout for scans
#lowercase_meta_names: False                        # convert metadata fields to lowercase
#max_movers: 10                                     # max parallel copies
#keep_interval: 24*3600                             # how long to keep files
#low_water_mark: 5                                  # size of work queue to trigger new scans
#stagger: 0.2                                       # time interval in seconds between consecutive transfer task starts
meta_suffix: .json                                 # suffix for metadata file (i.e. metadata for Fred.xx is in Fred.xx.json)

metacat_dataset:                                     # dataset to put files in (for custom/dune.py)
metacat_url:                                         # url to metacat service 

rel_path_function:                                   # relative path function ("hash" or "template") for placing files
# rel_path_pattern:                                  # template (using metadata fields) if you picked "template" above


#logging
log:                                                 # name of logfile
error:                                               # separate error log file, defaults to log: value.

rucio:
  dataset_did_template:                              # template for rucio dataset namespace:name
  declare_to_rucio: (True)                           # whether to declare to rucio
  target_rses: [list]                                # rse's to ask Rucio to forward dataset, above, towards
  drop_rse:                                          # dropoff rse that has the dst_url we're copying to, above.
  
samweb:
  user:                                              # Sam credentials: username
  url:                                               # url for samweb instance
  cert:                                              # user x509 cert for authentication
  key:                                               # private key for above

scanner:
  type: xrootd                                       # scanner type: local or xrootd
  replace_location:                                  # replace Dropbox location with this in path for local scanner
  ls_command_template:    "xrdfs $server ls -l $location"               # directory listings: with $server and $location  
  parse_re:               "^(?P<type>[a-z-])\\S+\\s+\\d{4}-\\d{2}-\\d{2}\\s+\\d{2}:\\d{2}:\\d{2}\\s+(?P<size>\\d+)\\s+(?P<path>\\S+)$"
                                                     # regexp for "ls" / "xrdfs" ls  output                                          
  filename_patterns:                                 # filename pattern(s) to watch for
  filename_pattern: "*.hdf5"

web_gui:
  prefix:                                            # website url prefix for web monitor
  site_title: (Declaration Daemon)                   # website title for web monitor
  port: (8080)                                       # website port for monitor

#graphite:                                           # for forwarding data to graphite service
#        host: filer-carbon.cern.ch
#        port: 2004
#        namespace: fts.protodune.np04-srv-024-hd5 
#        interval: 10
#        bin: 60

For a local directory dropbox

# Format of Declad config file as concluded from perusing sources.

debug_enabled: true              # debug messages in the log?
default_category: "migrated"     # default metadata category for unexpected uncategorized metadata attrs
destination_root_path: "/a/b/c"  # path part of $dst_url for templates, below also $path in same
destination_server: "host:port"  # host part of $dst_url for templates, below
source_root_path: "/x/y"         # path part of $src_url for templates, below
source_server: "localhost"       # host part of $src_url for templates, below, use localhost for local dropbox

create_dirs_command_template: ":" # command template using $server and  $path, use ":" if destination auto-creates directories
copy_command_template:  "xrdcp $src_url $dst_url"       # copy command 
download_command_template: "cp $src_path $dst_path"     # metadata file download command, with $server $src_path, $dst_path
delete_command_template: "rm $path"                     # clean files out of Dropbox, with $server and $path 

quarrantine_location: /tmp/quarantine                   # location for files / metadata that don't match, etc.

# uncomment these and change if you don't want these default values
#history_db: history.sqlite                         # file to keep history
#interval: 30                                       # scan interval
#timeout: 30                                        # timeout for scans
#lowercase_meta_names: False                        # convert metadata fields to lowercase
#max_movers: 10                                     # max parallel copies
#keep_interval: 24*3600                             # how long to keep files
#low_water_mark: 5                                  # size of work queue to trigger new scans
#stagger: 0.2                                       # time interval in seconds between consecutive transfer task starts
meta_suffix: .json                                 # suffix for metadata file (i.e. metadata for Fred.xx is in Fred.xx.json)

metacat_dataset:                                     # dataset to put files in
metacat_url:                                         # url to metacat service 

rel_path_function:                                   # relative path function ("hash" or "template") for placing files
# rel_path_pattern:                                  # template (using metadata fields) if you picked "template" above


#logging
log:                                                 # name of logfile
error:                                               # separate error log file, defaults to log: value.

rucio:
  dataset_did_template:                              # template for rucio dataset namespace:name
  declare_to_rucio: (True)                           # whether to declare to rucio
  target_rses: [list]                                # rse's to ask Rucio to forward dataset, above, towards
  drop_rse:                                          # dropoff rse that has the dst_url we're copying to, above.
  
samweb:
  user:                                              # Sam credentials: username
  url:                                               # url for samweb instance
  cert:                                              # user x509 cert for authentication
  key:                                               # private key for above

scanner:
  type: local                                        # scanner type: local or xrootd
  replace_location:                                  # replace Dropbox location with this in path for local scanner
  ls_command_template: "ls -ln $location"                 # with $server and $location 
  parse_re:              "^(?P<type>[a-z-])\\S+\\s+\\d+\\s+\\d+\\s+\\d+\\s+(?P<size>\\d+)\\s+\\S.{11}\\s(?P<path>\\S+)$"
                                                     # regexp for "ls" / "xrdfs" ls  output                                          
  filename_patterns:                                 # filename pattern(s) to watch for
  filename_pattern:

web_gui:
  prefix:                                            # website url prefix for web monitor
  site_title: (Declaration Daemon)                   # website title for web monitor
  port: (8080)                                       # website port for monitor

#graphite:                                           # for forwarding data to graphite service
#        host: filer-carbon.cern.ch
#        port: 2004
#        namespace: fts.protodune.np04-srv-024-hd5 
#        interval: 10
#        bin: 60
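Both examples leave several values blank for you to fill in. As a rough pre-flight check before starting the daemon, the following sketch lists keys that still have no value. It is plain text matching rather than a YAML parser; blank_keys is a hypothetical helper name, and some of the keys it reports may be perfectly fine to leave empty:

```shell
#!/bin/sh
# blank_keys FILE: print every uncommented "key:" in FILE that has no value.
# A blank key followed by more deeply indented lines is treated as a section
# header (rucio:, scanner:, ...) and is not reported.
blank_keys() {
    awk '
        # strip trailing comments and trailing whitespace
        { sub(/[ \t]+#.*$/, ""); sub(/[ \t]+$/, "") }
        # skip comment-only and empty lines
        /^#/ || $0 == "" { next }
        {
            indent = match($0, /[^ ]/) - 1
            # the previous blank key had no nested lines, so report it
            if (pending != "" && indent <= pending_indent) print pending
            pending = ""
            if ($0 ~ /^ *[A-Za-z_]+:$/) {
                pending = $0
                sub(/^ */, "", pending)
                sub(/:$/, "", pending)
                pending_indent = indent
            }
        }
        END { if (pending != "") print pending }
    ' "$1"
}

# Hypothetical usage:
#   blank_keys "$HOME/declad_config.yaml"
```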

Alternate installation

Installing with a virtualenv or with pip install --user

If using a virtualenv, first create it:

python -m venv $HOME/venvs/declad

and activate the environment:

. $HOME/venvs/declad/bin/activate

If not using a virtualenv, do pip install --user instead of just pip install, below.

Then pip install the dependencies:

  • webpie
  • metacat
  • rucio-clients
  • jinja2 (the package Spack calls py-jinja2)

Declad itself isn't currently pip-installable; you can clone it from https://github.com/fermitools/declad.git and symlink the "declad" subdirectory into your virtualenv's or pip's local "site-packages" directory.
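Put together, the virtualenv route might look like the sketch below. The helper names site_packages and install_declad are hypothetical, the dependency names follow the list above (jinja2 is the PyPI name for the package that Spack calls py-jinja2), and the clone location is whatever you pass in:

```shell
#!/bin/sh
# site_packages PYTHON: print the site-packages directory used by PYTHON.
site_packages() {
    "$1" -c 'import sysconfig; print(sysconfig.get_paths()["purelib"])'
}

# install_declad VENV SRC: create the virtualenv VENV, pip install the
# dependencies into it, clone declad into SRC, and symlink the "declad"
# package into the virtualenv's site-packages directory.
install_declad() {
    python3 -m venv "$1" &&
    "$1/bin/pip" install webpie metacat rucio-clients jinja2 &&
    git clone https://github.com/fermitools/declad.git "$2" &&
    ln -s "$2/declad" "$(site_packages "$1/bin/python")/declad"
}

# Hypothetical usage:
#   install_declad "$HOME/venvs/declad" "$HOME/src/declad"
```

Asking the interpreter for its own site-packages path avoids hard-coding the Python version into the symlink target.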
