LA Quick Start Guide - AtlasOfLivingAustralia/documentation GitHub Wiki

Introduction

This page is a work in progress. Pages marked with ⚒ need some extra effort.

This document explains a quick way to start deploying and using a new Living Atlas (LA) platform.

Basic introduction to LA

The LA is basically a great big data cube.

The LA uses names to index this cube, so that users can structure data according to taxonomy (e.g. "I only want records from the tribe Cassidini"). This is done with the BIE module and name matching indexes built from your own taxonomies.

The biocache holds occurrence records – this animal/plant was seen here, at this time, by this person, what, where, when and who. The information in the biocache is indexed by a solr index, which allows people to search the biocache for things they are interested in.

The name matching index contains a taxonomy suitable for processing. The supplied information in every occurrence record in the biocache is matched against the name matching index and the occurrence record is annotated with things like the matched name, higher taxonomy, quality of match etc. The link between the name matching index and the biocache is the taxonID or guid, which gives a unique identifier for each species, suitable for indexing in solr.

The BIE holds organising information – this species, this dataset, this locality, this region, this webpage. A person can search the BIE as a first entry point into the LA and get back a number of references to things that might be of interest. In particular, the BIE holds species and taxonomy information. It also holds references to, more or less, anything that can be used to search the biocache.

The collectory holds metadata about the datasets, data providers, collections and institutions that provide data to the biocache. The metadata particularly holds a description, URLs, contact information and licensing and copyright information. As well as datasets, it can also hold metadata about things like webpages, lists of species, etc.

The basic use of an LA platform is that a user goes to the BIE and types in a name. The BIE will search for the name and give the user a set of options. The user can then click on the link that is closest to what they want. If that link is a species (or genus etc.) page then the user can ask to be shown all the records in the biocache which match the taxonID.

A user can also search occurrences by institution, collection, sub-collection, region, species list, spatial query, etc.
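
The flow described above (look up a name in the BIE, then fetch the occurrences matched to that taxon) can be sketched with two small shell helpers that only build the request URLs. This is a minimal sketch, not the reference API: the host names follow the example subdomains used in the URL list of this guide, and the /search and /occurrences/search paths and the lsid: query prefix are assumptions modelled on the public ALA web services, so check them against your own node.

```shell
# Hypothetical base URLs, following the subdomain layout used in this guide.
SPECIES_WS="${SPECIES_WS:-https://species-ws.l-a.site}"
BIOCACHE_WS="${BIOCACHE_WS:-https://biocache-ws.l-a.site}"

# Step 1: search the BIE for a name. The response is expected to list
# candidate taxa together with their guid (assumed endpoint: /search).
bie_search_url() {
  # Encode spaces only, which is enough for simple scientific names.
  printf '%s/search?q=%s\n' "$SPECIES_WS" "$(printf '%s' "$1" | sed 's/ /%20/g')"
}

# Step 2: ask the biocache for the occurrences matched to that taxon
# (assumed endpoint: /occurrences/search, assumed query prefix: lsid:).
occurrences_url() {
  printf '%s/occurrences/search?q=lsid:%s\n' "$BIOCACHE_WS" "$1"
}

# Usage (needs curl and a running node; the guid below is made up):
#   curl -s "$(bie_search_url 'Cassida viridis')"
#   curl -s "$(occurrences_url 'urn:lsid:example:taxon:1')"
```

Keeping the URL construction separate from the curl call makes it easy to print and review a request before sending it.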

Before install

Before you start, take into account the LA infrastructure requirements.

It is also worth studying the infrastructure of other portals when planning yours. See this wiki's sidebar for information about other portals.

To make the whole process straightforward, complete some basic recommended tasks before starting your LA installation: choose which domain your LA platform will use, configure your DNS (or not), decide how you will access your servers, etc.

Need some extra initial help?

Deploying an LA node is not an easy task. So, in addition to our yearly in-person workshops, we have a procedure based on remote sessions to bootstrap and/or launch a new LA instance within an institution with our help.

These sessions give newcomers an overview of the LA modules.

Install

The recommended way to install LA is using our ansible ala-install repository.

You will need some custom inventories with the information about your portal. The Living Atlas Ansible Generator command line tool, or its easier web interface at https://generator.l-a.site/, creates them for you and simplifies the use of the ala-install repository. After answering some basic questions, you can install all the recommended LA modules by running one simple command: ./ansiblew --alainstall=../ala-install all --nodryrun

If you just need to test some basic LA modules on a virtual machine, the ala-demo playbook installation described in the ala-install README is also a good start. That repository contains some fairly up-to-date inventories from our last workshop.

We are reviewing the use of Docker and Debian packages to simplify this process even more.

Additional recommendations

  • If you rerun your ansible installation several times, use --skip-tags nameindex (--skip if you are using the generator) or you will run out of disk space (and patience) fast.
  • If you rerun the ansible install on a production server, see our FAQs for more details (e.g. it is recommended to skip the nameindex, solr7_create_cores and cassandra tags).
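
The install command and the rerun recommendations above can be wrapped in two small shell functions. This is only a sketch under stated assumptions: it expects to run from the inventories directory created by the generator with ala-install cloned next to it, and the --skip=tag1,tag2 spelling is an assumption (the text above only says that the generator uses --skip), so check the wrapper's usage text before relying on it.

```shell
# Assumed layout: current directory holds ./ansiblew, ala-install sits beside it.
ANSIBLEW="${ANSIBLEW:-./ansiblew}"
ALA_INSTALL="${ALA_INSTALL:-../ala-install}"

# First deployment of every module, exactly as in the command quoted above.
# Extra arguments are appended unchanged.
la_install_all() {
  "$ANSIBLEW" --alainstall="$ALA_INSTALL" all --nodryrun "$@"
}

# Later reruns: skip the tags named in the recommendations above so the name
# index, the solr cores and cassandra are not rebuilt each time.
# The comma-separated --skip value is an assumption.
la_rerun_all() {
  la_install_all --skip=nameindex,solr7_create_cores,cassandra "$@"
}

# Usage:
#   la_install_all     # first install
#   la_rerun_all       # subsequent runs
```

Setting ANSIBLEW=echo prints the command line instead of running it, which is a cheap way to review what a rerun would execute.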

Post-install

Secure your LA platform

You can find some basic recommendations for securing your LA infrastructure in this wiki. Pay special attention to your solr admin interface.

Take into account that if you don't enable CAS, your admin web areas are probably open, so you should secure your admin areas. You may need to take some additional security steps while you initially deploy your node.

Verify all basic services are up and running

After a complete and successful ansible deployment of your modules, verify that they are running correctly.

Also verify that cassandra and solr are running correctly by doing some basic low-level queries. Take into account the security considerations mentioned above when accessing solr from the outside.

If these basic services are not running correctly, modules that depend on them, like biocache-service, will not work or start correctly.
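
As a sketch of the "basic low-level queries" mentioned above, the helpers below use Solr's CoreAdmin STATUS action and the standard Cassandra tools nodetool and cqlsh. They assume you run them on the servers themselves (so localhost works and solr does not need to be exposed) and that solr listens on port 8983 as in the URL list of this guide; adjust SOLR_URL otherwise.

```shell
# Assumption: solr is reachable locally on its default port.
SOLR_URL="${SOLR_URL:-http://localhost:8983/solr}"

# Ask solr which cores are loaded. Prints a JSON document and fails
# (non-zero exit code) when solr is down or returns an HTTP error.
check_solr() {
  curl -fsS "$SOLR_URL/admin/cores?action=STATUS&wt=json"
}

# Show the state of the cassandra node(s) and run a trivial CQL statement
# to confirm that the node accepts client connections.
check_cassandra() {
  nodetool status && cqlsh -e 'DESCRIBE KEYSPACES'
}

# Usage:
#   check_solr      && echo "solr OK"
#   check_cassandra && echo "cassandra OK"
```

Run these before debugging biocache-service or other dependent modules, since a failure here usually explains failures there.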

Styling the LA modules

Probably one of the first things you will want to do is style your LA modules with your branding.

The ala-demo playbook and the LA ansible generator install some basic branding for you in /srv/yourdomain.org/www. You can get more information in the previous link.

The https://github.com/living-atlases/base-branding repository is a good starting point for a more advanced branding that you can personalize.

Logger basic configuration

You should allowlist your IPs in the logger admin interface to allow correct log collection from those IPs. This is done in "Admin > Remote Addresses", that is, the /admin/remoteAddress URL. More info here.

Enable CAS and https

Not using the CAS Auth system in your LA modules is like having a WordPress site without user management, roles, etc., and all the related functionality. Because of that, we recommend installing and using CAS in your LA platform. We also recommend enabling https as soon as possible.

If you don't want to enable CAS initially, take some additional steps to secure your admin areas.

The LA ansible generator simplifies these tasks a lot. For instance, you can install CAS with a simple command, ./ansiblew --alainstall=../ala-install cas --nodryrun, plus some additional steps.

Configure API Keys

With CAS, userdetails and apikeys running, you should create and configure your API keys.

Upload image-service licenses

To finish the image-service installation, use the https://images.your-l-a.site/admin screen to load the licence information from the ala-install files ansible/roles/image-service/files/licence_mappings.csv and ansible/roles/image-service/files/licences.csv.

Configure mail sending properly

To send CAS account activation notifications, alert notifications or download email notifications, you need to set up email sending for your domain correctly via postfix, and also configure some LA modules to use that server. The most common setup is to configure your auth server, alerts and biocache services to relay emails to your email provider, as described in the previous link.

On each VM that sends emails (CAS, biocache, alerts, etc.) you will need to install and configure postfix to send mail as above. Those services then invoke postfix via localhost, as described below.

Then set ansible variables like these:

email_sender = [email protected]
mail.smtp.host = localhost
mail.smtp.port = 25
mail_host = localhost
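
Before pointing the LA modules at localhost, it can help to confirm that the local postfix really relays mail. The sketch below builds a minimal test message and hands it to the sendmail command that postfix provides; the /usr/sbin/sendmail path is the usual location but an assumption here, and the addresses in the usage line are placeholders to replace with your own.

```shell
# Build a minimal RFC 5322 style message: headers, blank line, body.
#   $1 = sender address, $2 = recipient address
build_test_mail() {
  printf 'From: %s\nTo: %s\nSubject: %s\n\n%s\n' \
    "$1" "$2" "LA mail relay test" "Test message sent through the local postfix."
}

# Hand the message to postfix. -t reads the recipient from the To: header,
# -f sets the envelope sender.
send_test_mail() {
  build_test_mail "$1" "$2" | /usr/sbin/sendmail -t -f "$1"
}

# Usage (placeholder addresses):
#   send_test_mail noreply@l-a.site you@example.org
# Then check your inbox and the postfix log on that VM.
```

If the message arrives from every VM that sends mail, the remaining work is only the per-module configuration shown above.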

Backup of your platform and version control of your configurations

It's important to start to back up⚒ your servers ASAP and to keep track of changes to your configuration, inventories, branding, etc. See the Version-control-of-your-configurations page for more details.

Localize the software to your local language/s

The LA software is translated into almost twenty languages. We use crowdin to translate⚒ it. If you need to improve or add languages, just ask for an invitation for you or your team and join us in crowdin. You can find additional i18n development information in the ALA-Internationalization-(i18n)⚒ page.

You will probably need your regional variant (for instance es-ar, es-ec, es-mx, etc.) of some language (Spanish in this example). See the full list of supported languages and codes at https://support.crowdin.com/api/language-codes/ and ask us to activate the one you need in crowdin.

For instance, in case of German we have:

This is also important because the translations contain some references to the national node that you should adapt. We are trying to remove these references from the translations and move them to the configurations.

Adapt some extra configurations

Charts

You can edit the charts configuration to adapt it to your needs (for example, translate it to your main language or add/remove values). Find the configurations in:

  • /data/ala-collectory/config/charts.json (or /data/collectory/config/charts.json)
  • /data/ala-hub/config/charts.json
  • /data/ala-bie/config/charts.json

More info.
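
A hand-edited charts.json with a stray comma or quote is easy to produce, so a quick syntax check after each edit can save some debugging. The sketch below is one way to do it, assuming python3 is available on the server (its standard json.tool module is used only as a parser); it checks syntax, not whether the chart definitions make sense to the module.

```shell
# Return success when the file parses as JSON.
validate_json() {
  python3 -m json.tool "$1" >/dev/null 2>&1
}

# Report OK/FAIL for each existing file; paths that are not present on this
# server (e.g. modules installed elsewhere) are silently skipped.
check_charts() {
  for f in "$@"; do
    [ -f "$f" ] || continue
    if validate_json "$f"; then
      echo "OK   $f"
    else
      echo "FAIL $f"
    fi
  done
}

# Usage, with the paths listed above:
#   check_charts /data/ala-collectory/config/charts.json \
#                /data/collectory/config/charts.json \
#                /data/ala-hub/config/charts.json \
#                /data/ala-bie/config/charts.json
```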

GBIF Account in collectory

To use the GBIF API in the collectory (useful for the repatriation tools, etc.), configure a user and password in your inventories and reconfigure your collectory.

[collectory:vars]
gbif_api_user = your-gbif-user
gbif_api_password = your-gbif-password
gbif_registration_enabled = true
gbif_use_doi = true
gbif_registration_dry_run = false
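
Before rerunning the collectory deployment, you may want to confirm that the GBIF credentials themselves are valid. The sketch below sends them with HTTP basic authentication to the GBIF API; the /user/login endpoint is an assumption based on the public GBIF registry API, so verify it against the current GBIF documentation.

```shell
# Assumed GBIF API base URL and login endpoint.
GBIF_API="${GBIF_API:-https://api.gbif.org/v1}"

# Succeeds (exit code 0, prints the account JSON) when the credentials are
# accepted; fails on an HTTP error such as 401.
#   $1 = GBIF user name, $2 = GBIF password
gbif_login_check() {
  curl -fsS -u "$1:$2" "$GBIF_API/user/login"
}

# Usage (same values as gbif_api_user / gbif_api_password above):
#   gbif_login_check your-gbif-user your-gbif-password && echo "credentials OK"
```

Note that a password given on the command line is visible in the process list and shell history, so run this only on a machine you trust.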

Usage of your LA modules

URLs of your LA node

Imagine that your domain is something like https://l-a.site and you are using a subdomain of l-a.site for each LA module. Depending on the modules you configured, these are the main URLs for those modules:

  • Collections: https://collections.l-a.site
  • Collections administration: https://collections.l-a.site/admin
  • Biocache (occurrences): https://biocache.l-a.site
  • Biocache administration: https://biocache.l-a.site/admin
  • Biocache webservice: https://biocache-ws.l-a.site
  • Species: https://species.l-a.site
  • Species webservice: https://species-ws.l-a.site
  • Species webservice administration: https://species-ws.l-a.site/admin
  • SOLR non-public web interface: http://index.l-a.site:8983 (see the solr admin interface page for tips on accessing it)
  • CAS Auth system: https://auth.l-a.site/cas
  • User details: https://auth.l-a.site/userdetails
  • User details administration: https://auth.l-a.site/userdetails/admin
  • Apikey management: https://auth.l-a.site/apikey/
  • CAS management administration: https://auth.l-a.site/cas-management/
  • Logger: https://logger.l-a.site/
  • Logger administration: https://logger.l-a.site/admin
  • Species list: https://lists.l-a.site
  • Species list administration: https://lists.l-a.site/admin
  • Regions: https://regions.l-a.site
  • Regions administration: https://regions.l-a.site/alaAdmin
  • Spatial: https://spatial.l-a.site
  • Spatial Webservice: https://spatial.l-a.site/ws
  • Spatial Geoserver: https://spatial.l-a.site/geoserver/
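
With that many endpoints, a small smoke test that prints the HTTP status of each public entry point is a quick way to see which modules answer. The sketch below assumes the subdomain layout listed above and that curl is installed; LA_DOMAIN and the selection of hosts are illustrative, so trim the list to the modules you actually deployed.

```shell
# Assumption: same example domain as in the list above.
LA_DOMAIN="${LA_DOMAIN:-l-a.site}"

# Print one public URL per module.
la_urls() {
  for host in collections biocache biocache-ws species species-ws logger lists regions spatial; do
    printf 'https://%s.%s\n' "$host" "$LA_DOMAIN"
  done
  printf 'https://auth.%s/cas\n' "$LA_DOMAIN"
}

# Print only the HTTP status code for a URL (000 when the host is unreachable).
http_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Print "<status> <url>" for every module URL.
check_urls() {
  la_urls | while read -r url; do
    printf '%s %s\n' "$(http_status "$url")" "$url"
  done
}

# Usage:
#   LA_DOMAIN=your-domain.org check_urls
```

Redirects (301/302) are reported as such rather than followed, which is often what you want when checking that http is redirected to https or to the CAS login.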

Layers of your Spatial service

We'll add some basic layers so you can use them later with your data and other modules (like regions); later you can configure your spatial portal further.

Data management

You can sample and index data against the layers created in the previous step, and map your dataResource occurrences to their collectory institutions and collections.

Configure your Biocache UI

You can later tune your Biocache UI to add some new menus, charts, etc.

Configure your Regions

Next, you can expose the sampled polygon layers in a menu of the regions service.

Configure the Spatial Portal

In a similar way to how you configured the Biocache UI, you can configure the Spatial Portal with your menus, skin, etc.

Build a name index

The Guide to Getting Names into the ALA gives detailed information about name indexing. And here is a simpler example.

You should probably also configure your species subgroups.

Configure some Species pages and species level traits

As a last step for a basic LA usage you can configure some Species pages and species level traits.

Configure your SDS service

If you have installed the Sensitive Data Service, configure it.

Any problem?

Have a look at our troubleshooting page and our FAQ and, if that is not enough, try to search for support.

Other useful links

Our sidebar is a good collection of highlighted pages; have a look at them.