Configure your data repository - DLR-SC/DataFinder GitHub Wiki
DataFinder's primary use case is the management and archival of the data of a designated community. That is, DataFinder can be considered a data management framework that provides common functionality for data management and archival tasks but always has to be customized to reflect a community's needs. DataFinder thus helps the members of such a community organize their data in a common, standardized way and makes it easier to identify relevant data. This is particularly important in scientific communities, where staff turnover is often high.
DataFinder allows you to map your community's requirements by adapting the data repository configuration. The configuration consists of the following parts:
- The data model reflecting the relations and properties of your data.
- The storage configurations determining the storage resources used to persist data files.
- Script extensions customizing the DataFinder client to map specific working processes.
To determine the different configuration parts, we suggest following the process described below.
- Requirements Analysis: In the first step, analyze the designated community and its specific needs. In particular, try to answer the following questions:
  - What is the community's valuable data? What data has to be kept?
  - How is data currently structured? Are there any specific relations?
  - How are the different data objects connected with each other?
  - What properties do the different data objects have?
  - What are the community's common working processes and best practices?
  - What tools are used to produce data? How are they connected?
  - What IT infrastructure is available (file servers already in use, etc.)?

  Document the answers well, as they form the basis of the next step.
- Configuration
  - First, create the data model in the administrator client (FYI: xsd to datamodel.xml):
    - Create data types to determine the required and optional metadata.
    - Define relations that reflect the logical connections between the data types.
  - Configure the storage resources.
- Migration
- Customization
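
Before entering a data model in the administrator client, it can help to write down the results of the requirements analysis in a small, checkable form. The following is a minimal sketch in plain Python and does not use any DataFinder API: the classes `DataType` and `Relation`, the helper functions, and the example types `Project` and `Simulation` with their properties are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataType:
    """Hypothetical description of a data type and its metadata properties."""
    name: str
    required: frozenset = frozenset()
    optional: frozenset = frozenset()


@dataclass(frozen=True)
class Relation:
    """Hypothetical logical connection from one data type to others."""
    name: str
    source: str
    targets: tuple


def missing_properties(data_type, metadata):
    """Return the required properties that are absent from `metadata`, sorted."""
    return sorted(data_type.required - set(metadata))


def is_allowed(relations, source, target):
    """Check whether any relation connects `source` to `target`."""
    return any(r.source == source and target in r.targets for r in relations)


# Example draft of a data model (all names are assumptions, not DataFinder defaults).
project = DataType(
    "Project",
    required=frozenset({"owner", "start_date"}),
    optional=frozenset({"description"}),
)
simulation = DataType("Simulation", required=frozenset({"solver", "mesh_size"}))
contains = Relation("contains", source="Project", targets=("Simulation",))

# A sample data object that lacks one required property.
print(missing_properties(simulation, {"solver": "example-solver"}))
print(is_allowed([contains], "Project", "Simulation"))
```

A draft like this is only a planning aid: checking a few real data objects from the community against it can reveal missing properties or relations before the actual data types and relations are created in the administrator client.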