GBIF Portugal production environment - AtlasOfLivingAustralia/documentation GitHub Wiki

Introduction

GBIF Portugal first came into contact with the Atlas of Living Australia (ALA) in 2014, through the activities of the GBIF CESP mentoring project France-Portugal-Spain. The platform attracted the interest of GBIF Portugal, which did not yet have a national data portal. Since then, Portugal has participated in several workshops with hands-on sessions, the first of them in Paris in 2015 (as a co-located event of the European Nodes Meeting). After a second hands-on workshop in Madrid at the beginning of 2016, the production portal was launched on 19 October 2016, with the support of Santiago Martinez de la Riva (GBIF Spain) and David Martin (ALA). For more information, see the GBIF Portugal 2016 report.

The platform was implemented on a cloud service provided by INCD, the National Distributed Computing Infrastructure. INCD is a research infrastructure created to provide digital services and support to the other infrastructures of the Portuguese Roadmap of Research Infrastructures. One of these is the Portuguese E-Infrastructure for Information and Research on Biodiversity (PORBIOTA), of which GBIF Portugal is a member. This framework gives GBIF Portugal access to the resources of the grid computing community in Portugal.

The name of the Portuguese data portal is Portal de Dados de Biodiversidade de Portugal, accessible through http://dados.gbif.pt. It is also accessible from the communication portal of GBIF Portugal, at www.gbif.pt/dados (PT).

Infrastructure

The cloud environment provided by INCD is managed with OpenStack. The resources available to the project created in OpenStack are:

  • up to 30 instances
  • up to 80 VCPUs
  • up to 130 GB RAM
  • up to 10 volumes
  • up to 2 TB of storage
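
As an illustration of how these limits constrain a deployment, the following minimal sketch checks a planned set of virtual machines against the quotas listed above. The VM names and sizes are hypothetical and do not describe the actual GBIF Portugal deployment; the conversion of 2 TB to 2000 GB is also an assumption.

```python
from dataclasses import dataclass

# Project quotas as listed above (2 TB taken as 2000 GB, an assumption).
QUOTA = {"instances": 30, "vcpus": 80, "ram_gb": 130, "volumes": 10, "storage_gb": 2000}


@dataclass
class PlannedVM:
    name: str            # hypothetical instance name
    vcpus: int
    ram_gb: int
    volumes: int = 0     # number of attached volumes
    storage_gb: int = 0  # total size of the attached volumes


def quota_usage(vms):
    """Sum the resources requested by a list of planned VMs."""
    return {
        "instances": len(vms),
        "vcpus": sum(vm.vcpus for vm in vms),
        "ram_gb": sum(vm.ram_gb for vm in vms),
        "volumes": sum(vm.volumes for vm in vms),
        "storage_gb": sum(vm.storage_gb for vm in vms),
    }


def over_quota(vms, quota=QUOTA):
    """Return {resource: (planned, limit)} for every resource that exceeds its quota."""
    usage = quota_usage(vms)
    return {k: (usage[k], quota[k]) for k in quota if usage[k] > quota[k]}


# Hypothetical plan; the sizes are illustrative only.
plan = [
    PlannedVM("occurrence-search", vcpus=8, ram_gb=32, volumes=1, storage_gb=500),
    PlannedVM("metadata", vcpus=2, ram_gb=4),
    PlannedVM("image-store", vcpus=4, ram_gb=8, volumes=1, storage_gb=1000),
]
print(over_quota(plan))  # an empty dict means the plan fits within the quotas
```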

Architecture

The Portal de Dados de Biodiversidade de Portugal implements the following ALA modules:

  • Biocache: for searching occurrence records;
  • Collectory: for managing metadata about institutions and their datasets;
  • Images: for managing images associated with records.

Diagram

The architecture is as follows:

[Figure: architecture diagram of the Portal de Dados de Biodiversidade de Portugal]

Data and data management

The portal includes the following datasets:

  • all datasets published by a Portuguese institution, whether or not they are endorsed by GBIF Portugal;

  • the repatriated dataset of Portugal.

Datasets are ingested into the platform after being downloaded from GBIF.org. No datasets are ingested directly from the source IPT. This data flow aims to ensure that records are identical whether they are accessed at GBIF.org or at the national portal.
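
The source does not state how the downloads from GBIF.org are requested. As one possible approach, the sketch below builds request bodies for the GBIF occurrence download API covering the two groups of data described above. The use of this API, the choice of predicates, and the user name and e-mail address are all assumptions made for illustration; the code only builds the JSON and does not contact GBIF.

```python
import json


def build_download_request(creator, email, predicate, fmt="DWCA"):
    """Build the JSON body of a GBIF occurrence download request."""
    return {
        "creator": creator,                # GBIF.org user name (hypothetical here)
        "notificationAddresses": [email],  # address notified when the download is ready
        "format": fmt,                     # "DWCA" requests a Darwin Core Archive
        "predicate": predicate,
    }


# Assumption: records published by institutions located in Portugal.
published_by_pt = {"type": "equals", "key": "PUBLISHING_COUNTRY", "value": "PT"}

# Assumption: records located in Portugal but published from another country.
repatriated_to_pt = {
    "type": "and",
    "predicates": [
        {"type": "equals", "key": "COUNTRY", "value": "PT"},
        {"type": "equals", "key": "REPATRIATED", "value": "true"},
    ],
}

body = build_download_request("example_user", "user@example.org", published_by_pt)
print(json.dumps(body, indent=2))
```

A body like this would be sent as an authenticated POST to https://api.gbif.org/v1/occurrence/download/request, and the resulting archive would then be loaded into the portal instead of reading from each source IPT.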